Belief Propagation API

Status: Generated from current Python docstrings and type hints.

Inference backend surface for factor graphs, lowering, exact inference, junction tree inference, TRW-BP, Mean Field VI, and engine results.

gaia.engine.bp

BP v2 — belief propagation aligned with theory and Gaia IR.

Theory: docs/foundations/theory/06-factor-graphs.md, 07-belief-propagation.md

IR lowering: docs/foundations/gaia-ir/07-lowering.md

The main CLI path uses InferenceEngine.run() for automatic dispatch:

- junction_tree → treewidth ≤ 20, exact
- trw_bp → n ≤ 2000 and treewidth > 20, bounded approximation
- mean_field → n > 2000, fast approximation for large graphs

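The infer() function further down in this module is a legacy convenience function. It keeps the forced loopy_bp mode and the large-graph loopy-BP fallback for compatibility with older callers. New code that needs to behave the same as gaia run infer should use InferenceEngine directly.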

BeliefPropagation

BeliefPropagation(damping: float = 0.5, max_iterations: int = 100, convergence_threshold: float = 1e-06)

Sum-product loopy Belief Propagation on a FactorGraph (v2).

Implements bp.md §3 exactly, with the following design principles:

- All messages are 2-vectors [P(x=0), P(x=1)], always normalized.
- Synchronous schedule: all new messages computed from old, then swapped.
- Damping per bp.md §4 prevents oscillation in loopy graphs.
- Relation variables (CONTRADICTION/EQUIVALENCE) participate fully.
- BPDiagnostics always collected (full belief history).

- damping: α in bp.md §4. Default 0.5. Range (0, 1]. 1.0 = fully replace old message (fast, may oscillate). 0.5 = half-step (default, balanced stability). Lower values increase stability but slow convergence.
- max_iterations: Upper bound on sweep iterations.
- convergence_threshold: Stop early when max|Δbelief| < threshold across all variables.
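
The effect of the damping parameter can be pictured as a convex combination of the previous and the freshly computed message. The sketch below is a minimal illustration, assuming that bp.md §4 uses the linear rule `α·new + (1 − α)·old` followed by renormalization (consistent with "1.0 = fully replace" and "0.5 = half-step" above). `damp_message` is a hypothetical helper, not a function from gaia.engine.bp.

```python
def damp_message(old: list[float], new: list[float], alpha: float) -> list[float]:
    """Blend a freshly computed message with the previous one, then renormalize.

    Assumed rule: alpha * new + (1 - alpha) * old (hypothetical, see lead-in).
    """
    if not (0.0 < alpha <= 1.0):
        raise ValueError(f"damping must be in (0, 1], got {alpha}")
    mixed = [alpha * n + (1.0 - alpha) * o for o, n in zip(old, new)]
    total = sum(mixed)
    return [m / total for m in mixed]


old_msg = [0.5, 0.5]  # [P(x=0), P(x=1)] from the previous sweep
new_msg = [0.9, 0.1]  # undamped message from the current sweep

full_step = damp_message(old_msg, new_msg, alpha=1.0)  # replaces the old message
half_step = damp_message(old_msg, new_msg, alpha=0.5)  # moves halfway toward the new one
```

With `alpha=0.5` the message moves only half the distance per sweep, which is why lower values trade speed for stability.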

Initialize loopy BP with damping and convergence controls.

Parameters:

Name	Type	Description	Default
`damping`	`float`	Message damping factor in `(0, 1]`.	`0.5`
`max_iterations`	`int`	Maximum number of synchronous BP sweeps.	`100`
`convergence_threshold`	`float`	Stop when the maximum belief change falls below this value.	`1e-06`

Raises:

Type	Description
`ValueError`	If `damping` is outside `(0, 1]`.

Source code in gaia/engine/bp/bp.py

def __init__(
    self,
    damping: float = 0.5,
    max_iterations: int = 100,
    convergence_threshold: float = 1e-6,
) -> None:
    """Initialize loopy BP with damping and convergence controls.

    Args:
        damping: Message damping factor in ``(0, 1]``.
        max_iterations: Maximum number of synchronous BP sweeps.
        convergence_threshold: Stop when the maximum belief change falls below this value.

    Raises:
        ValueError: If ``damping`` is outside ``(0, 1]``.
    """
    if not (0.0 < damping <= 1.0):
        raise ValueError(f"damping must be in (0, 1], got {damping}")
    self._damping = damping
    self._max_iter = max_iterations
    self._threshold = convergence_threshold

run

run(graph: FactorGraph) -> BPResult

Run loopy BP on graph and return beliefs + diagnostics.

Always returns a BPResult with full diagnostics (never None).

graph: A validated FactorGraph. Variables referenced by factors must be registered. Cromwell clamping is enforced at graph construction.

Returns:

Type	Description
`BPResult`	A BPResult containing posterior `P(x=1)` beliefs and full run diagnostics.
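
The graph description above mentions Cromwell clamping. Cromwell's rule keeps probabilities strictly inside (0, 1) so that no outcome is ruled out before evidence arrives. The following is only a sketch of what such clamping could look like: the `cromwell_clamp` name and the `eps` value are assumptions for illustration, since this page does not state the bound that FactorGraph applies.

```python
def cromwell_clamp(p: float, eps: float = 1e-3) -> float:
    """Clamp a probability into [eps, 1 - eps]; eps is a hypothetical bound."""
    return min(max(p, eps), 1.0 - eps)


priors = {"A": 0.0, "B": 0.42, "C": 1.0}
# Hard 0 and 1 are pulled inward; interior values pass through unchanged.
clamped = {name: cromwell_clamp(p) for name, p in priors.items()}
```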

Source code in gaia/engine/bp/bp.py

def run(self, graph: FactorGraph) -> BPResult:
    """Run loopy BP on *graph* and return beliefs + diagnostics.

    Always returns a BPResult with full diagnostics (never None).

    Args:
        graph: A validated FactorGraph. Variables referenced by factors must
            be registered. Cromwell clamping is enforced at graph construction.

    Returns:
        A BPResult containing posterior ``P(x=1)`` beliefs and full run diagnostics.
    """
    diag = BPDiagnostics()

    # --- Edge case: empty graph ---
    if not graph.variables:
        diag.converged = True
        return BPResult(beliefs={}, diagnostics=diag)

    # --- Edge case: no factors — beliefs = unary factors or neutral measure ---
    if not graph.factors:
        diag.converged = True
        initial_beliefs = _unfactored_beliefs(graph)
        for vid, p in initial_beliefs.items():
            diag.belief_history[vid] = [p]
        return BPResult(beliefs=initial_beliefs, diagnostics=diag)

    # --- Build reverse index: var -> list of factor indices ---
    var_to_factors = graph.get_var_to_factors()

    # --- Initialize unary factors as 2-vectors ---
    priors = _graph_prior_messages(graph)

    # --- Initialize all messages to uniform [0.5, 0.5] ---
    # f2v_msgs[(fi, vid)] = message from factor fi to variable vid
    # v2f_msgs[(vid, fi)] = message from variable vid to factor fi
    f2v_msgs, v2f_msgs = _initial_message_maps(graph)

    # --- Compute initial beliefs from unary factors only ---
    prev_beliefs = _initialize_belief_history(graph, diag)

    max_change = 0.0

    # --- Main BP loop ---
    for iteration in range(self._max_iter):
        # Step 1: Compute all variable→factor messages (synchronous)
        new_v2f = _compute_all_v2f(v2f_msgs, priors, var_to_factors, f2v_msgs)

        # Step 2: Compute all factor→variable messages (synchronous)
        new_f2v = _compute_all_f2v(graph, f2v_msgs, new_v2f)

        # Step 3: Damp and normalize both sets of messages
        _damp_f2v_messages(f2v_msgs, new_f2v, self._damping)
        _damp_v2f_messages(v2f_msgs, new_v2f, self._damping)

        # Step 4: Compute beliefs
        beliefs = _compute_beliefs(graph, priors, var_to_factors, f2v_msgs, diag)

        # Step 5: Check convergence
        max_change = max(abs(beliefs[vid] - prev_beliefs[vid]) for vid in beliefs)
        prev_beliefs = beliefs

        if max_change < self._threshold:
            _complete_diagnostics(
                diag, converged=True, iterations_run=iteration + 1, max_change=max_change
            )
            return BPResult(beliefs=beliefs, diagnostics=diag)

    # Did not converge within max_iterations
    _complete_diagnostics(
        diag, converged=False, iterations_run=self._max_iter, max_change=max_change
    )
    return BPResult(beliefs=prev_beliefs, diagnostics=diag)
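
The five steps of the loop above can be reproduced on a toy problem without any gaia imports. The sketch below is a simplified re-implementation for illustration only, under these assumptions: variables are binary, every factor is pairwise and given as a 2×2 table, only factor-to-variable messages are damped, and at least one variable is present. `toy_bp` and `normalize` are hypothetical names, not helpers from gaia/engine/bp/bp.py.

```python
def normalize(vec):
    total = sum(vec)
    return [x / total for x in vec]


def toy_bp(priors, factors, damping=0.5, max_iterations=100, threshold=1e-6):
    """Synchronous sum-product BP on binary variables with pairwise factors.

    priors:  {var: P(x=1)}
    factors: list of (var_a, var_b, table) with table[xa][xb] > 0
    Returns (beliefs, converged, iterations_run).
    """
    prior_msgs = {v: [1.0 - p, p] for v, p in priors.items()}
    # Factor-to-variable messages start uniform.
    f2v = {(fi, v): [0.5, 0.5] for fi, (a, b, _) in enumerate(factors) for v in (a, b)}
    prev = dict(priors)  # initial beliefs come from the priors alone

    for iteration in range(max_iterations):
        # Step 1: variable -> factor messages, built from the OLD factor messages.
        v2f = {}
        for fi, v in f2v:
            msg = list(prior_msgs[v])
            for (fj, u), m in f2v.items():
                if u == v and fj != fi:
                    msg = [msg[0] * m[0], msg[1] * m[1]]
            v2f[(v, fi)] = normalize(msg)

        # Step 2: factor -> variable messages, summing out the other variable.
        new_f2v = {}
        for fi, (a, b, t) in enumerate(factors):
            to_a = [sum(t[xa][xb] * v2f[(b, fi)][xb] for xb in (0, 1)) for xa in (0, 1)]
            to_b = [sum(t[xa][xb] * v2f[(a, fi)][xa] for xa in (0, 1)) for xb in (0, 1)]
            new_f2v[(fi, a)] = normalize(to_a)
            new_f2v[(fi, b)] = normalize(to_b)

        # Step 3: damp toward the new messages.
        for key, new in new_f2v.items():
            old = f2v[key]
            f2v[key] = normalize([damping * n + (1 - damping) * o for o, n in zip(old, new)])

        # Step 4: belief = prior times all incoming factor messages.
        beliefs = {}
        for v, pm in prior_msgs.items():
            b = list(pm)
            for (fi, u), m in f2v.items():
                if u == v:
                    b = [b[0] * m[0], b[1] * m[1]]
            beliefs[v] = normalize(b)[1]

        # Step 5: stop once no belief moved by more than the threshold.
        max_change = max(abs(beliefs[v] - prev[v]) for v in beliefs)
        prev = beliefs
        if max_change < threshold:
            return beliefs, True, iteration + 1

    return prev, False, max_iterations


# One factor linking A and B: when A=1 it favors B=1 (0.9 vs 0.1); when A=0 it is indifferent.
beliefs, converged, sweeps = toy_bp(
    priors={"A": 0.8, "B": 0.5},
    factors=[("A", "B", [[1.0, 1.0], [0.1, 0.9]])],
)
```

Because this example graph is a tree, the fixed point coincides with the exact marginals (P(A=1) = 2/3, P(B=1) = 23/30); on loopy graphs the same procedure yields only an approximation.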

EngineConfig `dataclass`

EngineConfig(jt_max_treewidth: int = JT_MAX_TREEWIDTH, mf_node_limit: int = MF_NODE_LIMIT, trw_damping: float = 0.5, trw_max_iter: int = 200, trw_threshold: float = 1e-08, mf_max_iter: int = 500, exact_max_vars: int = EXACT_MAX_VARS)

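Configuration parameters for InferenceEngine.

- jt_max_treewidth: Use JT (exact) when treewidth ≤ this value.
- mf_node_limit: Use Mean Field VI when the number of nodes > this value.
- trw_damping: TRW-BP damping coefficient.
- trw_max_iter: Maximum number of TRW-BP iterations.
- trw_threshold: TRW-BP convergence threshold.
- mf_max_iter: Maximum number of Mean Field iterations.
- exact_max_vars: Maximum number of variables for brute-force enumeration.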

InferenceEngine

InferenceEngine(config: EngineConfig | None = None)

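Unified inference engine that automatically selects the most suitable algorithm.

Automatic routing strategy (method='auto'):

1. n > mf_node_limit → Mean Field VI (fast approximation for large graphs)
2. treewidth ≤ jt_max_treewidth → JT (exact)
3. otherwise → TRW-BP (bounded approximation)

config: An EngineConfig that controls the routing thresholds and algorithm parameters.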
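
The routing order matters: the node count is checked first, so the treewidth is never estimated for very large graphs. The sketch below illustrates that order in isolation. `ToyConfig` and `choose_method` are hypothetical stand-ins, and the defaults 20 and 2000 are assumed from the thresholds quoted in the module description; the actual values of JT_MAX_TREEWIDTH and MF_NODE_LIMIT are not shown on this page.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyConfig:
    # Assumed defaults (hypothetical mirror of EngineConfig's routing fields).
    jt_max_treewidth: int = 20
    mf_node_limit: int = 2000


def choose_method(n_vars: int, treewidth_fn: Callable[[], int], cfg: ToyConfig) -> str:
    """Return 'mean_field', 'jt', or 'trw_bp' following the auto-routing order."""
    # Rule 1: very large graphs skip the treewidth estimate entirely.
    if n_vars > cfg.mf_node_limit:
        return "mean_field"
    # Rules 2 and 3: treewidth decides between exact JT and TRW-BP.
    return "jt" if treewidth_fn() <= cfg.jt_max_treewidth else "trw_bp"


cfg = ToyConfig()
small_tree = choose_method(50, lambda: 3, cfg)      # low treewidth -> exact JT
dense_mid = choose_method(500, lambda: 35, cfg)     # high treewidth -> TRW-BP
huge = choose_method(10_000, lambda: 35, cfg)       # too many nodes -> Mean Field
```

Passing the treewidth as a callable mirrors the lazy evaluation in the engine: the estimate is computed only when the node-count rule does not already decide the route.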

Initialize the inference engine with optional configuration.

Source code in gaia/engine/bp/engine.py

def __init__(self, config: EngineConfig | None = None) -> None:
    """Initialize the inference engine with optional configuration."""
    self._config = config or EngineConfig()
    cfg = self._config
    self._jt = JunctionTreeInference()
    self._trw = TRWBeliefPropagation(
        damping=cfg.trw_damping,
        max_iterations=cfg.trw_max_iter,
        convergence_threshold=cfg.trw_threshold,
    )
    self._mf = MeanFieldVI(max_iterations=cfg.mf_max_iter)

run

run(graph: FactorGraph, method: MethodChoice = 'auto') -> InferenceResult

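Run inference on graph.

graph: A FactorGraph that has already been lowered.

method:

- 'auto' (default): choose automatically based on n and treewidth.
- 'jt': force JT (exact, treewidth ≤ 20).
- 'trw_bp': force TRW-BP.
- 'mean_field': force Mean Field VI.
- 'exact': force brute-force enumeration (small graphs only).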

Returns:

Type	Description
`InferenceResult`	An InferenceResult containing marginal probabilities, algorithm metadata, and elapsed time.

Source code in gaia/engine/bp/engine.py

def run(
    self,
    graph: FactorGraph,
    method: MethodChoice = "auto",
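) -> InferenceResult:
    """Run inference on graph.

    Args:
        graph: A FactorGraph that has already been lowered.
        method:
            'auto' (default): choose automatically based on n and treewidth.
            'jt': force JT (exact, treewidth ≤ 20).
            'trw_bp': force TRW-BP.
            'mean_field': force Mean Field VI.
            'exact': force brute-force enumeration (small graphs only).

    Returns:
        An InferenceResult containing marginal probabilities, algorithm
        metadata, and elapsed time.
    """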
    cfg = self._config
    t0 = time.perf_counter()
    result: TRWResult | MFResult

    if method == "exact":
        n = len(graph.variables)
        if n > cfg.exact_max_vars:
            raise ValueError(
                f"Graph has {n} variables, exceeding the brute-force limit {cfg.exact_max_vars}. "
                "Use method='jt' for exact inference."
            )
        beliefs, _Z = exact_inference(graph)
        diag = TRWDiagnostics()
        diag.converged = True
        for v, b in beliefs.items():
            diag.belief_history[v] = [b]
        result = TRWResult(beliefs=beliefs, diagnostics=diag)
        elapsed = (time.perf_counter() - t0) * 1000
        logger.info("InferenceEngine: exact, %d vars, %.1fms", n, elapsed)
        return InferenceResult(
            result=result,
            method_used="exact",
            treewidth=-1,
            elapsed_ms=elapsed,
            is_exact=True,
        )

    if method == "auto":
        n = len(graph.variables)
        if n > cfg.mf_node_limit:
            warnings.warn(
                "Mean Field VI fallback "
                f"(n > {cfg.mf_node_limit}) for {n} variables. "
                "This large-graph path is approximate and not production-grade; "
                "use method='trw_bp' when belief values need higher accuracy.",
                UserWarning,
                stacklevel=2,
            )
            method = "mean_field"
        else:
            tw = jt_treewidth(graph)
            method = "jt" if tw <= cfg.jt_max_treewidth else "trw_bp"

    if method == "jt":
        tw = jt_treewidth(graph)
        result = self._jt.run(graph)
        elapsed = (time.perf_counter() - t0) * 1000
        logger.info("InferenceEngine: JT (exact), treewidth=%d, %.1fms", tw, elapsed)
        return InferenceResult(
            result=result,
            method_used="jt",
            treewidth=tw,
            elapsed_ms=elapsed,
            is_exact=True,
        )

    if method == "trw_bp":
        tw = jt_treewidth(graph) if len(graph.variables) <= cfg.mf_node_limit else -1
        result = self._trw.run(graph)
        elapsed = (time.perf_counter() - t0) * 1000
        logger.info("InferenceEngine: TRW-BP, treewidth=%d, %.1fms", tw, elapsed)
        return InferenceResult(
            result=result,
            method_used="trw_bp",
            treewidth=tw,
            elapsed_ms=elapsed,
            is_exact=False,
        )

    if method == "mean_field":
        result = self._mf.run(graph)
        elapsed = (time.perf_counter() - t0) * 1000
        logger.info(
            "InferenceEngine: Mean Field, %d vars, %.1fms", len(graph.variables), elapsed
        )
        return InferenceResult(
            result=result,
            method_used="mean_field",
            treewidth=-1,
            elapsed_ms=elapsed,
            is_exact=False,
        )

    raise ValueError(
        f"method must be 'auto', 'jt', 'trw_bp', 'mean_field', or 'exact'; got {method!r}"
    )
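
The 'exact' branch above rejects graphs with more than exact_max_vars variables because brute-force enumeration visits every joint assignment, which is 2^n for n binary variables. A minimal sketch of that idea follows. `enumerate_marginals` is a hypothetical helper written for this illustration; it is not gaia's `exact_inference`, whose internals are not shown on this page, and it represents factors as plain Python callables rather than Factor objects.

```python
from itertools import product
from typing import Callable

Assignment = dict[str, int]


def enumerate_marginals(
    variables: list[str],
    potentials: list[Callable[[Assignment], float]],
) -> tuple[dict[str, float], float]:
    """Return ({var: P(var=1)}, Z) by summing over all 2**n assignments."""
    mass_on_one = {v: 0.0 for v in variables}
    z = 0.0
    for values in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, values))
        weight = 1.0
        for phi in potentials:
            weight *= phi(x)  # unnormalized joint weight of this assignment
        z += weight
        for v in variables:
            if x[v] == 1:
                mass_on_one[v] += weight
    return {v: m / z for v, m in mass_on_one.items()}, z


potentials = [
    lambda x: 0.8 if x["A"] else 0.2,                     # unary prior on A
    lambda x: 0.5,                                         # uniform prior on B
    lambda x: [[1.0, 1.0], [0.1, 0.9]][x["A"]][x["B"]],   # pairwise factor on (A, B)
]
exact_beliefs, partition = enumerate_marginals(["A", "B"], potentials)
```

The loop body runs once per assignment, so each extra variable doubles the work; this is the reason a variable cap is needed before taking this path.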

benchmark

benchmark(graph: FactorGraph) -> dict[str, dict[str, object]]

Run all feasible algorithms and return a comparison of the results.

Source code in gaia/engine/bp/engine.py

def benchmark(self, graph: FactorGraph) -> dict[str, dict[str, object]]:
    """Run all feasible algorithms and return a comparison of the results."""
    results: dict[str, dict[str, object]] = {}
    for m in ("jt", "trw_bp", "mean_field"):
        r = self.run(graph, method=m)
        results[m] = {
            "beliefs": r.beliefs,
            "elapsed_ms": r.elapsed_ms,
            "is_exact": r.is_exact,
            "treewidth": r.treewidth,
        }
    if len(graph.variables) <= self._config.exact_max_vars:
        r = self.run(graph, method="exact")
        results["exact"] = {
            "beliefs": r.beliefs,
            "elapsed_ms": r.elapsed_ms,
            "is_exact": True,
            "treewidth": -1,
        }
    return results
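
One way to read a benchmark() result is to measure how far each method's beliefs are from a trusted reference such as 'exact' or 'jt'. The sketch below assumes only the dictionary shape built in the source above (`beliefs`, `elapsed_ms`, `is_exact`, `treewidth` per method). `max_belief_gap` is a hypothetical helper, and the sample numbers are made up for illustration rather than produced by a real run.

```python
def max_belief_gap(results: dict, reference: str = "exact") -> dict[str, float]:
    """Largest |belief - reference belief| per method, over shared variables."""
    ref = results[reference]["beliefs"]
    gaps = {}
    for method, entry in results.items():
        if method == reference:
            continue
        shared = ref.keys() & entry["beliefs"].keys()
        gaps[method] = max(
            (abs(entry["beliefs"][v] - ref[v]) for v in shared), default=0.0
        )
    return gaps


# Hand-written stand-in for a benchmark() return value; the numbers are invented.
sample = {
    "jt": {"beliefs": {"A": 0.70, "B": 0.20}, "elapsed_ms": 1.2, "is_exact": True, "treewidth": 1},
    "mean_field": {"beliefs": {"A": 0.75, "B": 0.10}, "elapsed_ms": 0.4, "is_exact": False, "treewidth": -1},
    "exact": {"beliefs": {"A": 0.70, "B": 0.20}, "elapsed_ms": 0.9, "is_exact": True, "treewidth": -1},
}
gaps = max_belief_gap(sample)
```

Note that the 'exact' entry exists only when the graph has at most exact_max_vars variables, so `reference="jt"` may be the safer choice on larger graphs.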

InferenceResult `dataclass`

InferenceResult(result: TRWResult | MFResult, method_used: str = 'unknown', treewidth: int = -1, elapsed_ms: float = 0.0, is_exact: bool = False)

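Return value of InferenceEngine, containing the inference result and algorithm metadata.

- result: Result of the underlying algorithm (TRWResult or MFResult).
- method_used: Algorithm actually used: 'jt', 'trw_bp', 'mean_field', or 'exact'.
- treewidth: Estimated treewidth of the factor graph (-1 when not computed).
- elapsed_ms: Inference time in milliseconds.
- is_exact: True means the algorithm is guaranteed to return exact marginal probabilities.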

beliefs `property`

beliefs: dict[str, float]

Shortcut for accessing the beliefs dictionary.

diagnostics `property`

diagnostics: TRWDiagnostics | MFDiagnostics

Shortcut for accessing diagnostics.

Factor `dataclass`

Factor(factor_id: str, factor_type: FactorType, variables: list[str], conclusion: str, p1: float | None = None, p2: float | None = None, cpt: tuple[float, ...] | None = None)

Factor in a factor graph with variables and potential function.

all_vars `property`

all_vars: list[str]

Return all variables involved in this factor.