Inference Pipeline
Overview
gaia run infer runs local (or cross-package joint) inference on a compiled
knowledge package. External priors come from claim metadata set by
priors.py during compilation, plus dependency beliefs injected through
node_priors. Legacy reason+prior DSL pairing is still recognized for
compatibility, but new v0.5 packages should not use it as the primary prior
assignment path.
Command signature:
gaia run infer [PATH] [--depth N]
| Argument / Option | Default | Description |
|---|---|---|
| PATH | . | Path to the knowledge package directory |
| --depth N | 0 | Dependency depth for joint inference. 0 = flat prior injection, 1 = direct deps, 2+ = recursive transitive deps, -1 = all transitive deps |
Pipeline:
ensure_package_env() # uv sync --quiet
-> load_gaia_package() # import package module, collect declarations
-> apply_package_priors() # resolve register_prior records into metadata["prior"]
-> compile # produce LocalCanonicalGraph
-> staleness check # verify ir_hash matches .gaia/ir_hash
-> validate IR structure
-> lower to factor graph # lower_local_graph(), optionally merge deps
-> validate factor graph
-> InferenceEngine.run() # auto-select JT / TRW-BP / Mean Field VI
-> write .gaia/beliefs.json
Source: gaia/cli/commands/infer.py
gaia run infer intentionally does not consult .gaia/review_manifest.json.
It is the authoring-time numerical preview of the compiled graph. Review
manifests remain qualitative gating artifacts for gaia build check --gate,
gaia inquiry review, trace review, and publish/register workflows; they do
not supply priors and they do not suppress preview beliefs.
Prior Sources And MaxEnt Contract
External priors are embedded in claim metadata at compile time and read by
the lowering layer (lower_local_graph()) from metadata["prior"]. There is no
separate parameterization or review sidecar step.
The v0.5 contract is:
- external priors belong only on independent probabilistic inputs to exported goals;
- zero-premise observe(...) claims get a default pin to 1 - CROMWELL_EPS; a resolved metadata["prior"] from the priors layer can still override that value in local preview;
- derived claims and helper claims do not receive manual priors;
- claims without an explicit unary prior do not get a synthetic 0.5 factor. They remain unconstrained variables, and the exact-inference layer applies maximum entropy over the remaining independent degrees of freedom subject to declared hard constraints.
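A toy example (not Gaia's engine) shows why a free variable differs from a 0.5 prior. Take two binary claims A and B linked only by a hard implication A -> B. With hard constraints and no priors, the maximum-entropy distribution is uniform over the satisfying assignments, so the marginals are not 0.5. The function name maxent_marginals is hypothetical.

```python
from itertools import product

def maxent_marginals(n_vars, constraint):
    """Marginals of the uniform distribution over allowed assignments.

    With only hard constraints and no priors, that uniform distribution
    is the maximum-entropy solution.
    """
    allowed = [s for s in product((0, 1), repeat=n_vars) if constraint(s)]
    return [sum(s[i] for s in allowed) / len(allowed) for i in range(n_vars)]

# Hard implication A -> B forbids only the state (A=1, B=0).
p_a, p_b = maxent_marginals(2, lambda s: not (s[0] == 1 and s[1] == 0))
print(round(p_a, 3), round(p_b, 3))  # 0.333 0.667
```

Multiplying in an independent 0.5 factor per claim would leave these marginals unchanged here, but it would no longer be equivalent once soft factors or priors on other claims enter the graph, which is why the contract leaves such variables free.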
Priority order
The lowering layer (gaia/engine/bp/lowering.py) resolves each claim's prior with
two branches depending on whether the claim is a relation conclusion
(conclusion of EQUIVALENCE, CONTRADICTION, COMPLEMENT, or IMPLICATION):
Relation conclusions:
structural relation assertion (add_evidence(1))
The structural default applies unconditionally — neither node_priors nor
metadata["prior"] is consulted for graph relation conclusions. Relation
conclusions are asserted true by construction.
Regular claims:
node_priors > metadata["prior"] > no unary prior
- node_priors: explicit overrides passed into lower_local_graph(), used for foreign node flat prior injection from dep_beliefs/ (see below).
- metadata["prior"]: winning resolved value from register_prior(...) records, inline claim(prior=...) compatibility shortcuts, generated continuous-inference records, or legacy reason+prior DSL pairing.
- No unary prior: the variable is left free; MaxEnt is applied at the joint-distribution level, not by multiplying every unassigned claim by an independent 0.5 prior factor.
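The two branches can be sketched as a small function. This is an illustration of the documented order only, not the code in lowering.py; the name resolve_unary and the tagged-tuple return shape are assumptions.

```python
def resolve_unary(claim_id, metadata, node_priors, is_relation_conclusion):
    """Return (kind, value) describing the unary term for one claim.

    kind is "evidence" for relation conclusions, "prior" when a prior
    source applies, and "free" when the variable gets no unary factor.
    """
    if is_relation_conclusion:
        # Structural default: asserted true, no prior source is consulted.
        return ("evidence", 1.0)
    if claim_id in node_priors:
        return ("prior", node_priors[claim_id])
    if "prior" in metadata:
        return ("prior", metadata["prior"])
    return ("free", None)

print(resolve_unary("github:dep::x", {"prior": 0.7}, {"github:dep::x": 0.9}, False))
# ('prior', 0.9)
```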
priors.py
Each package may contain a priors.py module that imports existing package
claims and calls register_prior(...) on independent probabilistic inputs.
Importing priors.py runs those calls, appending records under
Claim.metadata["prior_records"]. apply_package_priors() then applies the
package RESOLUTION_POLICY (or the default policy) and writes the winning value
to Knowledge.metadata["prior"] and
Knowledge.metadata["prior_justification"] before compilation.
# my_package/priors.py
from gaia.engine.lang import register_prior
from my_package import claim_A, claim_B
register_prior(
    claim_A,
    0.7,
    justification="Widely reproduced experimental result",
)
register_prior(
    claim_B,
    0.4,
    justification="Preliminary evidence, single study",
)
Rules:
- priors.py must NOT declare new Knowledge objects; it may only reference claims already declared by the package.
- priors.py must NOT export the legacy PRIORS = {...} dict. v0.5+ rejects it with a migration error because it cannot preserve source provenance.
- Prior values must satisfy Cromwell's rule: they must lie in [CROMWELL_EPS, 1 - CROMWELL_EPS], where CROMWELL_EPS = 1e-3.
- justification= must be a non-empty string.
- source_id= defaults to user_priors; engines, reviewers, calibration jobs, and agents should pass explicit source IDs so the resolution policy can rank them.
- The priors step is a no-op when the package has no priors.py.
Source: gaia/engine/packaging.py :: apply_package_priors() and
gaia/engine/lang/dsl/register_prior.py :: resolve_priors_to_metadata().
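The value and justification rules are easy to check before calling register_prior(...). The helper below is a hypothetical pre-check, not part of Gaia; treating a whitespace-only justification as empty is an assumption.

```python
CROMWELL_EPS = 1e-3  # value stated in the rules above

def check_prior_record(value, justification):
    """Raise ValueError if a prior value or justification breaks the rules."""
    low, high = CROMWELL_EPS, 1 - CROMWELL_EPS
    if not (low <= value <= high):
        raise ValueError(f"prior {value} is outside [{low}, {high}]")
    # Assumption: whitespace-only text counts as an empty justification.
    if not isinstance(justification, str) or not justification.strip():
        raise ValueError("justification must be a non-empty string")

check_prior_record(0.7, "Widely reproduced experimental result")  # passes
```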
Legacy reason+prior DSL pairing
Older strategy and operator DSL functions accept paired reason + prior
keyword arguments. This path is retained for compatibility; new packages should
prefer priors.py for independent input priors and action/relation verbs for
warrants.
from gaia.engine.lang.compat import support, equivalence
# Legacy soft support with warrant prior
support([A, B], C, reason="Evidence converges", prior=0.85)
# Legacy operator with helper claim prior
equivalence(X, Y, reason="Same underlying mechanism", prior=0.99)
The pairing is enforced: providing reason without prior (or vice versa) is
an error. In new v0.5 authoring, do not assign external priors to derived,
structural, relation-helper, or generated-helper claims.
Flat Prior Injection (--depth 0)
With the default --depth 0, foreign knowledge nodes (nodes whose QID does not
start with the local {namespace}:{package}:: prefix) receive flat upstream
beliefs from .gaia/dep_beliefs/:
.gaia/dep_beliefs/
dep_package_1.json # beliefs.json downloaded from upstream
dep_package_2.json
collect_foreign_node_priors() scans these files and builds a
{knowledge_id: belief} dict. For each foreign node in the compiled graph,
if a matching upstream belief exists, it is passed as node_priors to
lower_local_graph(), overriding any other prior source.
This is the lightweight mode: the local package uses upstream conclusions as fixed priors without loading the upstream reasoning structure.
Source: gaia/engine/packaging.py :: collect_foreign_node_priors()
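A minimal sketch of the scan, assuming each file in dep_beliefs/ follows the beliefs.json layout documented in Output Format. The function name collect_beliefs and the prefix filter are illustrative; the real collect_foreign_node_priors() may match nodes differently.

```python
import json
import tempfile
from pathlib import Path

def collect_beliefs(dep_beliefs_dir, local_prefix):
    """Build {knowledge_id: belief} for nodes the local package does not own."""
    priors = {}
    for path in sorted(dep_beliefs_dir.glob("*.json")):
        payload = json.loads(path.read_text())
        for entry in payload.get("beliefs", []):
            kid = entry["knowledge_id"]
            if not kid.startswith(local_prefix):  # keep foreign nodes only
                priors[kid] = entry["belief"]
    return priors

# Demo with a throwaway directory standing in for .gaia/dep_beliefs/.
dep_dir = Path(tempfile.mkdtemp())
(dep_dir / "dep_package_1.json").write_text(json.dumps({
    "beliefs": [
        {"knowledge_id": "github:dep_pkg::claim_x", "label": "claim_x", "belief": 0.9},
        {"knowledge_id": "github:my_pkg::my_claim", "label": "my_claim", "belief": 0.2},
    ]
}))
node_priors = collect_beliefs(dep_dir, local_prefix="github:my_pkg::")
print(node_priors)  # {'github:dep_pkg::claim_x': 0.9}
```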
Joint Cross-Package Inference (--depth > 0)
With --depth N where N > 0 (or N = -1 for unlimited depth), the compiled factor
graphs of dependency packages are merged with the local graph for joint
inference instead of using flat prior injection.
Dependency discovery
load_dependency_compiled_graphs() scans [project].dependencies in
pyproject.toml for entries ending in -gaia, locates each dependency's
.gaia/ir.json, and deserializes to LocalCanonicalGraph.
- --depth 1: direct dependencies only.
- --depth 2+: recursive transitive dependencies, decrementing depth at each level.
- --depth -1: unlimited, all transitive dependencies.
- Dependencies are deduplicated by {namespace}:{package_name} prefix.
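The depth semantics can be sketched over a plain dependency map. This is a simplified stand-in for load_dependency_compiled_graphs(): it walks level by level rather than recursively, deduplicates by package name instead of QID prefix, and never touches pyproject.toml or ir.json. All names are hypothetical.

```python
def discover_dependencies(direct_deps, root, depth):
    """Return -gaia dependencies reachable from root within `depth` levels.

    direct_deps maps a package name to its declared dependency names.
    depth=0 loads nothing, depth=-1 is unlimited.
    """
    found, frontier, level = [], [root], 0
    while frontier and (depth < 0 or level < depth):
        next_frontier = []
        for pkg in frontier:
            for dep in direct_deps.get(pkg, []):
                # Only -gaia entries count; skip anything already seen.
                if dep.endswith("-gaia") and dep not in found and dep != root:
                    found.append(dep)
                    next_frontier.append(dep)
        frontier, level = next_frontier, level + 1
    return found

deps = {
    "local-gaia": ["a-gaia", "b-gaia", "numpy"],
    "a-gaia": ["b-gaia", "c-gaia"],
    "c-gaia": ["d-gaia"],
}
print(discover_dependencies(deps, "local-gaia", 1))   # ['a-gaia', 'b-gaia']
print(discover_dependencies(deps, "local-gaia", -1))  # ['a-gaia', 'b-gaia', 'c-gaia', 'd-gaia']
```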
Graph merging
Each dependency graph is lowered to a FactorGraph independently, then
merge_factor_graphs() combines them with the local factor graph:
- Dep variables first: a dependency graph is authoritative for variables it owns (those starting with its QID prefix). Foreign references in the dep graph may carry neutral placeholder priors.
- Local variables second: the local graph overwrites only locally-owned nodes (those starting with local_prefix).
- Dep factors are copied with prefixed IDs (dep_{import_name}_{fid}) to avoid collision.
- Local factors are copied with local_prefix.
The merged graph is then run through the inference engine as a single factor graph, allowing beliefs to propagate across package boundaries.
Source: gaia/engine/bp/lowering.py :: merge_factor_graphs()
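The ownership rule can be illustrated with plain dictionaries in place of FactorGraph objects. The data layout, the function name merge_graphs, and the local_{fid} ID scheme for local factors are assumptions made for this sketch; only the dep_{import_name}_{fid} pattern comes from the description above.

```python
def merge_graphs(deps, local, local_prefix):
    """Merge variable priors and factors from dependency and local graphs.

    deps:  {import_name: {"prefix": str, "variables": {id: prior}, "factors": {fid: f}}}
    local: {"variables": {id: prior}, "factors": {fid: f}}
    """
    variables, factors = {}, {}
    for import_name, graph in deps.items():
        for var_id, prior in graph["variables"].items():
            if var_id.startswith(graph["prefix"]):
                variables[var_id] = prior            # owner is authoritative
            else:
                variables.setdefault(var_id, prior)  # placeholder, never overwrites
        for fid, factor in graph["factors"].items():
            factors[f"dep_{import_name}_{fid}"] = factor
    for var_id, prior in local["variables"].items():
        if var_id.startswith(local_prefix):
            variables[var_id] = prior
        else:
            variables.setdefault(var_id, prior)
    for fid, factor in local["factors"].items():
        factors[f"local_{fid}"] = factor  # assumed ID scheme
    return variables, factors

deps = {"dep_pkg": {"prefix": "github:dep_pkg::",
                    "variables": {"github:dep_pkg::x": 0.8},
                    "factors": {"f0": "IMPLICATION"}}}
local = {"variables": {"github:dep_pkg::x": None, "github:my_pkg::y": 0.6},
         "factors": {"f0": "CONJUNCTION"}}
merged_vars, merged_factors = merge_graphs(deps, local, "github:my_pkg::")
print(merged_vars)  # {'github:dep_pkg::x': 0.8, 'github:my_pkg::y': 0.6}
```

The local graph's placeholder for github:dep_pkg::x does not replace the dependency's own value, and the two f0 factors no longer collide.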
Output Format
Output is written to .gaia/beliefs.json under the package directory.
beliefs.json
{
  "ir_hash": "sha256:...",
  "gaia_lang_version": "0.3.0",
  "beliefs": [
    {
      "knowledge_id": "github:my_pkg::my_claim",
      "label": "my_claim",
      "belief": 0.683
    }
  ],
  "diagnostics": {
    "converged": true,
    "iterations_run": 12,
    "max_change_at_stop": 3.2e-7,
    "treewidth": -1,
    "belief_history": {
      "github:my_pkg::my_claim": [0.7, 0.691, 0.685, "..."]
    },
    "direction_changes": {
      "github:my_pkg::my_claim": 0
    }
  }
}
| Field | Purpose |
|---|---|
| ir_hash | Content hash of the compiled IR. Must match .gaia/ir_hash when downstream commands (gaia run render, gaia pkg register) verify freshness. |
| gaia_lang_version | Which gaia-lang version produced these beliefs. Useful for detecting numerical drift across patch releases. |
| beliefs | Array sorted by knowledge_id. Includes only knowledge nodes present in the compiled graph (internal auxiliary variables are excluded). Each entry has knowledge_id, label, and posterior belief (P(claim=1)). |
| diagnostics | BP convergence information; see the Inference Engine section below. |
Staleness Detection
Before lowering, gaia run infer performs a three-part staleness check:
- .gaia/ir_hash must exist; otherwise gaia build compile has not been run.
- .gaia/ir.json must exist and be valid JSON.
- The stored ir_hash must match the freshly recompiled compiled.graph.ir_hash, AND the stored IR JSON must match the fresh compiled JSON byte-for-byte.
If any check fails, the command exits with an error directing the user to run
gaia build compile again. This ensures inference always runs against the latest
compiled artifacts.
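The three checks can be sketched as one function that returns an error message or None. The function name, the message strings, and the whitespace stripping on the stored hash are assumptions; the real command works from its own compiled objects.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def staleness_error(gaia_dir: Path, fresh_hash: str, fresh_ir_json: str) -> Optional[str]:
    """Return a reason the artifacts are stale, or None if they are fresh."""
    hash_path, ir_path = gaia_dir / "ir_hash", gaia_dir / "ir.json"
    if not hash_path.exists():
        return "missing ir_hash: run gaia build compile"
    if not ir_path.exists():
        return "missing ir.json: run gaia build compile"
    stored_ir = ir_path.read_text()
    try:
        json.loads(stored_ir)
    except json.JSONDecodeError:
        return "ir.json is not valid JSON"
    # Assumption: the stored hash may carry a trailing newline.
    if hash_path.read_text().strip() != fresh_hash or stored_ir != fresh_ir_json:
        return "stale artifacts: run gaia build compile"
    return None

# Demo with a throwaway directory standing in for .gaia/.
gaia_dir = Path(tempfile.mkdtemp())
print(staleness_error(gaia_dir, "sha256:abc", "{}"))  # missing ir_hash: run gaia build compile
(gaia_dir / "ir_hash").write_text("sha256:abc\n")
(gaia_dir / "ir.json").write_text("{}")
print(staleness_error(gaia_dir, "sha256:abc", "{}"))  # None
```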
Inference Engine
InferenceEngine (in gaia/engine/bp/engine.py) automatically selects an
algorithm based on the factor graph's variable count and treewidth:
| Method | Condition (auto mode) | Exactness | Typical use |
|---|---|---|---|
| Mean Field VI | n > 2000 variables | Approximate | Very large graphs where treewidth computation and exact inference are too costly |
| JT (Junction Tree) | n <= 2000 and treewidth <= 20 | Exact | Small and medium graphs with modest treewidth |
| TRW-BP | n <= 2000 and treewidth > 20 | Bounded approximate | Dense or cyclic graphs that are too wide for JT |
For Gaia's typical factor graphs with modest treewidth, JT is usually selected, giving exact results in milliseconds. Wide or very large graphs are routed to TRW-BP or Mean Field VI instead of failing just because exact inference is too expensive.
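The selection rule in the table reduces to two threshold checks. The sketch below restates it with the documented limits of 2000 variables and treewidth 20; the function name and string labels are illustrative, not the engine's API.

```python
def select_method(n_vars, treewidth, jt_max_treewidth=20, mf_node_limit=2000):
    """Pick an inference method from variable count and treewidth."""
    if n_vars > mf_node_limit:
        # Treewidth is not needed (and may be None) for very large graphs.
        return "Mean Field VI"
    if treewidth <= jt_max_treewidth:
        return "JT"
    return "TRW-BP"

print(select_method(50, 4))       # JT
print(select_method(500, 35))     # TRW-BP
print(select_method(5000, None))  # Mean Field VI
```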
Default parameters
| Parameter | Value | Description |
|---|---|---|
| bp_damping | 0.5 | Blending coefficient alpha. 1.0 = full replacement, 0.5 = half-step. |
| bp_max_iter | 200 | Upper bound on sweep iterations |
| bp_threshold | 1e-8 | Convergence threshold |
| jt_max_treewidth | 20 | JT selected when treewidth <= this |
| mf_node_limit | 2000 | Mean Field VI selected when variable count exceeds this |
Convergence
BP stops early when the maximum belief change across all variables falls below
bp_threshold. If bp_max_iter is exhausted without convergence, the result
is returned with diagnostics.converged = False.
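Damping and the stopping rule can be shown on a generic fixed-point iteration. This toy blends belief values directly with new = alpha * proposed + (1 - alpha) * old; whether Gaia blends messages or beliefs is not stated here, so read it only as an illustration of how bp_damping, bp_max_iter, and bp_threshold interact. The function name run_damped is hypothetical.

```python
def run_damped(update, beliefs, damping=0.5, max_iter=200, threshold=1e-8):
    """Iterate toward a fixed point of `update`, returning (beliefs, diagnostics)."""
    max_change, iterations = float("inf"), 0
    for iterations in range(1, max_iter + 1):
        proposed = update(beliefs)
        new = {v: damping * proposed[v] + (1 - damping) * beliefs[v] for v in beliefs}
        max_change = max(abs(new[v] - beliefs[v]) for v in beliefs)
        beliefs = new
        if max_change < threshold:
            break  # early stop once the largest change is below threshold
    diagnostics = {
        "converged": max_change < threshold,
        "iterations_run": iterations,
        "max_change_at_stop": max_change,
    }
    return beliefs, diagnostics

# Toy update with fixed point x = 0.4.
step = lambda b: {"x": 0.5 * b["x"] + 0.2}
final, diag = run_damped(step, {"x": 0.9})
short, short_diag = run_damped(step, {"x": 0.9}, max_iter=3)
print(diag["converged"], short_diag["converged"])  # True False
```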
Diagnostics
The diagnostics object in beliefs.json records:
- converged: whether inference reached the convergence threshold
- iterations_run: number of complete iterations or exact-pass count (JT exact records 2, for collect + distribute)
- max_change_at_stop: maximum belief change in the final iteration
- treewidth: estimated treewidth of the factor graph (-1 if not computed)
- belief_history: {var_id: [belief_at_iter_0, ...]} per variable
- direction_changes: {var_id: count} of sign reversals in belief deltas (high counts indicate oscillation)
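As an illustration of the direction_changes field, sign reversals can be counted from a belief_history list as follows. Skipping zero deltas is an assumption of this sketch, and the function name is hypothetical.

```python
def count_direction_changes(history):
    """Count sign reversals between consecutive belief deltas."""
    deltas = [b - a for a, b in zip(history, history[1:])]
    # Assumption: a zero delta carries no direction and is skipped.
    signs = [1 if d > 0 else -1 for d in deltas if d != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

print(count_direction_changes([0.7, 0.691, 0.685]))           # 0
print(count_direction_changes([0.5, 0.6, 0.55, 0.58, 0.56]))  # 3
```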
Console output
After inference completes, gaia run infer prints:
Inferred 42 beliefs
Method: JT (exact), 3ms
Output: /path/to/package/.gaia/beliefs.json
Lowering to Factor Graph
lower_local_graph() in gaia/engine/bp/lowering.py converts a
LocalCanonicalGraph into a FactorGraph suitable for inference.
Variable nodes
Each type=claim Knowledge becomes a variable node. Prior resolution follows
the priority order described in the Prior Sources section above.
Helper claims (labels starting with __) are excluded from user-supplied
node_priors — their priors are determined by operator semantics (relation
operators are asserted through add_evidence(1), compositional operators get
0.5).
Factor types
The FactorType enum defines 10 factor types:
| FactorType | Parameters | Arity constraint |
|---|---|---|
| IMPLICATION | none (deterministic) | exactly 2 variables: antecedent and consequent, plus helper conclusion |
| NEGATION | none (deterministic) | exactly 1 premise |
| CONJUNCTION | none (deterministic) | 2+ premises |
| DISJUNCTION | none (deterministic) | 2+ premises |
| EQUIVALENCE | none (deterministic) | exactly 2 premises |
| CONTRADICTION | none (deterministic) | exactly 2 premises |
| COMPLEMENT | none (deterministic) | exactly 2 premises |
| SOFT_ENTAILMENT | p1, p2 (require p1 + p2 > 1) | exactly 1 premise |
| CONDITIONAL | cpt (length 2^k) | 1+ premises |
| PAIRWISE_POTENTIAL | cpt (length 4: joint weights) | exactly 1 variable plus the paired conclusion variable |
Deterministic factors use strict {0, 1} delta potentials. Cromwell clamping
applies to unary evidence/priors and soft probability parameters, not to the
deterministic truth-table potential itself.
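The distinction can be made concrete with a CONJUNCTION table and a clamp helper. This sketch builds a strict {0, 1} potential over (premises..., conclusion) and clamps only a unary value; the dictionary representation and function names are assumptions, not Gaia's internal layout.

```python
from itertools import product

CROMWELL_EPS = 1e-3  # value stated for Cromwell's rule in this document

def conjunction_potential(n_premises):
    """Delta potential: 1.0 where conclusion == AND(premises), else 0.0."""
    return {
        (*premises, conclusion): 1.0 if conclusion == int(all(premises)) else 0.0
        for premises in product((0, 1), repeat=n_premises)
        for conclusion in (0, 1)
    }

def cromwell_clamp(p):
    """Clamp a unary prior or soft parameter into [eps, 1 - eps]."""
    return min(max(p, CROMWELL_EPS), 1 - CROMWELL_EPS)

table = conjunction_potential(2)
print(table[(1, 1, 1)], table[(1, 1, 0)])  # 1.0 0.0
print(cromwell_clamp(0.0))                 # 0.001
```

The table entries stay exactly 0.0 and 1.0, while a unary value of 0.0 is pulled up to CROMWELL_EPS.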
Strategy lowering
Strategies are lowered by type. In v0.5 the canonical authoring path is through Action verbs (derive / observe / compute / infer / associate / equal / contradict / exclusive / decompose) plus the lifted bayes.model(...) / bayes.compare(...) helpers for predictive distributions; the entries below describe how each underlying strategy type lowers, regardless of whether it came from an Action verb or a legacy v5 strategy verb.
- infer: CONDITIONAL factor with full CPT. With given=G on the action, the CPT gates on G so that the relation collapses to MaxEnt (0.5) when any of G is false (the infer-with-given gating introduced in v0.5). If the author omits p_e_given_not_h, Gaia uses the neutral 0.5 background likelihood and warns that an explicit background/false-positive likelihood is preferable when known. When infer_use_degraded_noisy_and=True, falls back to CONJUNCTION + SOFT_ENTAILMENT.
- deduction (lowering target of derive and the deprecated deduction strategy): CONJUNCTION for multiple premises, then deterministic IMPLICATION. Review gates publication quality; it does not supply a numeric prior and does not suppress gaia run infer local preview output.
- support (deprecated v5 strategy; lowering preserved for compatibility): soft implication via SOFT_ENTAILMENT; legacy prior= folds into its effective p1.
- noisy_and (deprecated): CONJUNCTION + SOFT_ENTAILMENT. Single premise omits conjunction.
- associate: PAIRWISE_POTENTIAL factor over two Claims with joint weights derived from p_a_given_b, p_b_given_a, and declared endpoint marginals when present. If neither endpoint has a declared marginal, lowering closes the local 2x2 table by Jaynes MaxEnt and warns that authors should prefer register_prior(...) on at least one endpoint when they know the marginal. No helper conclusion variable.
- Bayes comparison (bayes.compare(...)): emits one infer strategy per hypothesis with [0.5, clamp(exp(logL_i - logL_max))], plus rigid relation operators driven by the exclusivity setting.
- Other named formal types (legacy v5: elimination, case_analysis, analogy, extrapolation, ...): auto-formalized via formalize_named_strategy(), then expanded to deterministic factors.
- FormalStrategy: each embedded operator maps to a deterministic factor via _OPERATOR_MAP.
- CompositeStrategy: recursively lowers each sub-strategy.
- Compose: not a strategy. It is preserved as a first-class IR node (composes: list[Compose]) and does not produce a BP factor directly; its child actions retain their own lowerings.
Reference: Gaia IR lowering, BP factor potentials, BP formal-strategy lowering.
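For the Bayes comparison entry, the clamped relative likelihood exp(logL_i - logL_max) can be computed as below. The sketch assumes that clamp means the Cromwell bounds [CROMWELL_EPS, 1 - CROMWELL_EPS]; it does not cover how the resulting value and the 0.5 entry are placed in the CPT. The function name is hypothetical.

```python
import math

CROMWELL_EPS = 1e-3  # assumption: clamp uses the Cromwell bounds

def relative_likelihoods(log_likelihoods):
    """Scale each hypothesis likelihood by the best one, then clamp."""
    log_max = max(log_likelihoods)
    return [
        min(max(math.exp(log_l - log_max), CROMWELL_EPS), 1 - CROMWELL_EPS)
        for log_l in log_likelihoods
    ]

weights = relative_likelihoods([-10.0, -12.0, -30.0])
print([round(w, 4) for w in weights])  # [0.999, 0.1353, 0.001]
```

Subtracting logL_max before exponentiating keeps the best hypothesis at 1 before clamping and avoids underflow for very negative log-likelihoods.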
Package Environment Setup
Before loading the package, ensure_package_env() runs uv sync --quiet in
the package directory. This ensures all dependencies (including other -gaia
packages) are installed and importable. Skipped when pyproject.toml is absent
or uv is not on $PATH. Failures are non-fatal.
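The skip and non-fatal behavior described above can be sketched with the standard library. This is not the body of ensure_package_env(); the function name sync_package_env and the boolean return value are assumptions for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

def sync_package_env(pkg_dir: Path) -> bool:
    """Run `uv sync --quiet` in pkg_dir; return True only if it ran and succeeded."""
    if not (pkg_dir / "pyproject.toml").exists() or shutil.which("uv") is None:
        return False  # skipped: nothing to sync, or uv is unavailable
    try:
        result = subprocess.run(["uv", "sync", "--quiet"], cwd=pkg_dir, check=False)
    except OSError:
        return False  # non-fatal: the caller proceeds without syncing
    return result.returncode == 0

# An empty directory has no pyproject.toml, so the sync is skipped.
skipped = sync_package_env(Path(tempfile.mkdtemp()))
print(skipped)  # False
```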