Experimental Systems Note • SQLite-backed • Math-driven

Belief State Substrate

A persistent, math-backed “world model” that updates from graded evidence, forgets (slowly), and surfaces contradictions — so memory behaves like a system, not a vibe.

SQLite: system of record • Evidence: s ∈ [-1,1], ρ ∈ [0,1] • Decay: toward μ • Exclusivity: competition groups

Why this exists

  • Coherence: beliefs don’t silently overwrite each other.
  • Explainability: provenance exists by default (“why do we believe this?”).
  • Dynamics: evidence changes state; time decays it; exclusivity forces competition.

MVP stance

Boring but correct. Scripts + invariants first. UI later.

One-line definition

$$ \text{belief} = (s, r, o) + \text{confidence} + \text{provenance} + \text{time dynamics} $$

Overview

The goal is a persistent belief substrate — an explicit internal world model — that stores structured beliefs, updates them from graded evidence, decays toward priors when not reinforced, resolves mutually exclusive contradictions, and retrieves “belief neighborhoods” to constrain downstream reasoning. (Coherent memory with teeth.)

Beliefs are structured

Claims look like “subject — relation — object”, plus confidence, provenance, and time dynamics.

$$ (subject,\ relation,\ object) $$

Confidence is w ∈ [0,1]. Status is active / superseded / rejected.

Evidence is graded

Evidence arrives as support s ∈ [-1,1] and reliability ρ ∈ [0,1].

Mapping to target confidence:
$$ \hat{w} = \frac{1 + s}{2} $$

Reliability doesn’t change what you claim — it changes how fast the system moves.
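As a minimal sketch of that distinction (function names are mine, not the project's), the target depends only on support, while reliability only scales the step:

```python
def target_confidence(s: float) -> float:
    """Map graded support s in [-1, 1] to a target confidence in [0, 1]."""
    assert -1.0 <= s <= 1.0
    return (1.0 + s) / 2.0

def step_size(alpha: float, rho: float) -> float:
    """Reliability rho gates how fast confidence moves, not where it moves to."""
    return alpha * rho

# Fully contradicting evidence targets 0; mild support targets 0.65.
print(target_confidence(-1.0))  # 0.0
print(target_confidence(0.3))   # 0.65
```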

Full spec ahead

The sections below give the exact update rule, decay toward priors, exclusivity normalization, retrieval scoring, and toy-world validation metrics.

Core Objects

Entity

A persistent node (person, organization, project, place, artifact, concept). Types are informational — behavior comes from relation semantics.

Belief (Claim)

Canonical structured claim, with confidence and provenance.

$$ (subject,\ relation,\ object) $$

Status: active, superseded, rejected.

Evidence

Targets a belief with support and reliability.

$$ s \in [-1,1], \quad \rho \in [0,1] $$

Evidence is append-only (so “why do we believe this?” is always answerable).

In practice

The substrate behaves like a scoreboard: evidence nudges confidence, time pulls it back toward a prior, and mutually exclusive claims must compete instead of coexisting silently.

MVP Success Criteria

What “working” means

  • Beliefs persist across restarts (SQLite store).
  • Beliefs are structured: (subject, relation, object/value) + metadata.
  • Evidence ingestion: s, ρ, and source/provenance.
  • Dynamics: update rule + decay + exclusivity normalization.
  • Contradiction surfacing: conflicts in exclusivity groups.
  • Neighborhood retrieval: top-K beliefs by confidence + recency.

Toy demo requirement

A scripted world should demonstrate: belief formation, contradiction, reinforcement-driven stabilization, decay behavior, and metrics (accuracy vs ground truth + contradiction rate).

The goal isn’t UI polish — it’s showing the dynamics behave predictably.

Dynamics (Update Engine)

Parameters (MVP defaults)

  • Learning rate: \( \alpha = 0.3 \)
  • Decay per turn: \( \lambda = 0.002 \)
  • Global prior: \( \mu = 0.5 \)
  • Exclusivity epsilon: \( \varepsilon = 10^{-6} \)
  • Turn = one user–assistant interaction

Update rule (smoothing)

Reliability gates the step size:

$$ \eta = \alpha \cdot \rho $$

Then blend current confidence \( w \) toward target \( \hat{w} \):

$$ w_{\text{new}} = (1-\eta)w + \eta \hat{w} $$

Decay toward the prior

Beliefs drift back toward \( \mu \) when not reinforced:

$$ w_{\text{decay}} = (1-\lambda \Delta t) w_{\text{new}} + (\lambda \Delta t)\mu $$

MVP simplification: \( \Delta t \) is measured in conversation turns (discrete).
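The update and decay steps can be sketched together as follows; the constants mirror the MVP defaults, and the function names are assumptions:

```python
ALPHA = 0.3     # learning rate
LAMBDA = 0.002  # decay per turn
MU = 0.5        # global prior

def update(w: float, s: float, rho: float, alpha: float = ALPHA) -> float:
    """Blend current confidence w toward the evidence target w_hat = (1 + s) / 2."""
    w_hat = (1.0 + s) / 2.0
    eta = alpha * rho  # reliability gates the step size, not the target
    return (1.0 - eta) * w + eta * w_hat

def decay(w: float, dt_turns: int, lam: float = LAMBDA, mu: float = MU) -> float:
    """Drift confidence back toward the prior mu after dt_turns idle turns."""
    f = lam * dt_turns  # MVP assumes lam * dt stays well below 1
    return (1.0 - f) * w + f * mu

w = update(0.5, s=1.0, rho=0.9)  # strong supporting evidence: 0.5 -> 0.635
w = decay(w, dt_turns=10)        # ten quiet turns pull it slightly back toward mu
```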

Exclusivity (competition)

For relations flagged exclusive, beliefs sharing \( (subject, relation) \) compete by normalization:

$$ w_i \leftarrow \frac{w_i}{\sum_j w_j + \varepsilon} $$

Practical meaning: “only one can win,” but the system can surface uncertainty when multiple are close.
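A sketch of the normalization for one exclusivity group (names hypothetical):

```python
EPS = 1e-6  # exclusivity epsilon, guards against a zero-sum group

def normalize_exclusive(ws):
    """Normalize confidences of beliefs sharing an exclusive (subject, relation)."""
    total = sum(ws) + EPS
    return [w / total for w in ws]

# Near-equal candidates stay near-equal after normalization,
# which the system can surface as an unresolved contradiction.
print(normalize_exclusive([0.8, 0.7, 0.1]))
```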

Provenance

Evidence is append-only, and each belief retains links to the last \( N = 10 \) supporting/contradicting evidence items, enabling “why do we believe this?” queries.
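One way to keep that rolling window of \( N = 10 \) links is a bounded deque; this is a sketch of the idea, not the actual storage layer:

```python
from collections import deque

N_LINKS = 10  # keep links to the last N evidence items per belief

class BeliefProvenance:
    """Rolling window of per-belief evidence links (the evidence log itself stays append-only)."""
    def __init__(self):
        self.links = deque(maxlen=N_LINKS)  # oldest link drops automatically

    def attach(self, evidence_id: int):
        self.links.append(evidence_id)

prov = BeliefProvenance()
for eid in range(15):
    prov.attach(eid)
print(list(prov.links))  # only the last 10 evidence ids remain: 5..14
```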

Storage & Identity

SQLite tables

  • entities, relations, beliefs
  • sources, evidence
  • belief_evidence_link

Indexes focus on fast lookup by subject+relation and exclusivity groups.

Belief identity

How a belief is uniquely represented and enforced.

Schema invariant

Canonical key

$$ (subject\_entity\_id,\ relation\_id,\ object\_entity\_id / object\_value) $$

Uniqueness constraint

$$ \text{UNIQUE}(subject\_entity\_id,\ relation\_id,\ object\_entity\_id,\ object\_value) $$

Translation: the substrate will never store two separate rows that claim the same thing. New evidence updates confidence and provenance — it doesn’t create duplicates.
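A minimal `sqlite3` sketch of the invariant (table and column names are assumptions; note that SQLite treats NULLs as distinct in UNIQUE constraints, so this sketch uses an empty string for a missing object_value):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE beliefs (
    id INTEGER PRIMARY KEY,
    subject_entity_id INTEGER NOT NULL,
    relation_id INTEGER NOT NULL,
    object_entity_id INTEGER,
    object_value TEXT NOT NULL DEFAULT '',
    confidence REAL NOT NULL DEFAULT 0.5,
    status TEXT NOT NULL DEFAULT 'active',
    UNIQUE (subject_entity_id, relation_id, object_entity_id, object_value)
);
-- fast lookup by subject + relation, as in the index plan above
CREATE INDEX idx_beliefs_subj_rel ON beliefs (subject_entity_id, relation_id);
""")

def upsert_belief(subj, rel, obj, value, confidence):
    """New evidence updates the existing row instead of creating a duplicate."""
    conn.execute("""
        INSERT INTO beliefs (subject_entity_id, relation_id,
                             object_entity_id, object_value, confidence)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (subject_entity_id, relation_id,
                     object_entity_id, object_value)
        DO UPDATE SET confidence = excluded.confidence
    """, (subj, rel, obj, value, confidence))

upsert_belief(1, 1, 2, "", 0.6)
upsert_belief(1, 1, 2, "", 0.8)  # same claim: row updated, not duplicated
print(conn.execute("SELECT COUNT(*), MAX(confidence) FROM beliefs").fetchone())
```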

Belief Neighborhood Retrieval

Ranking ingredients

Recency weight decays exponentially:

$$ \text{recency\_weight} = e^{-k \cdot \text{age\_in\_turns}}, \quad k = 0.05 $$

Optional hop expansion gets penalized:

$$ \text{hop\_penalty} = \begin{cases} 1.0 & \text{(0-hop)} \\ 0.7 & \text{(1-hop)} \end{cases} $$

Final score

Confidence gets emphasized with \( p = 2.0 \):

$$ \text{score} = (w^p) \cdot \text{recency\_weight} \cdot \text{hop\_penalty} $$

Return the top \( K = 20 \) beliefs for an entity.

You’re not retrieving “everything you’ve ever known.” You’re retrieving what’s most likely relevant right now: high-confidence, recent, and close in the graph.
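The scoring recipe above, sketched in Python (constants match the stated defaults; function names are mine):

```python
import math

K_RECENCY = 0.05  # recency decay constant k
P = 2.0           # confidence emphasis exponent
TOP_K = 20

def score(w: float, age_in_turns: int, hops: int = 0) -> float:
    """confidence^p * recency_weight * hop_penalty, per the ranking above."""
    recency = math.exp(-K_RECENCY * age_in_turns)
    hop_penalty = 1.0 if hops == 0 else 0.7
    return (w ** P) * recency * hop_penalty

def neighborhood(beliefs, top_k: int = TOP_K):
    """beliefs: iterable of (belief_id, w, age_in_turns, hops) tuples."""
    ranked = sorted(beliefs, key=lambda b: score(b[1], b[2], b[3]), reverse=True)
    return ranked[:top_k]

# A recent, confident belief outranks an old one of equal confidence.
print(score(0.9, age_in_turns=1) > score(0.9, age_in_turns=50))  # True
```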

Validation: Toy World

Ground truth (places)

  • Paris → France
  • Rome → Italy
  • Berlin → Germany

Relation: capital_of (exclusive).

Metrics

  • Accuracy: the highest-confidence candidate (\( \arg\max_w \)) in each capital_of group matches ground truth
  • Contradiction rate: ≥2 candidates with \( w \ge 0.55 \), or the top two within \( \Delta = 0.10 \) of each other

The point is to see stabilization with reinforcement and drift with decay.
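A sketch of the two metrics' decision rules for a single exclusive group (thresholds as stated above; helper names are assumptions):

```python
W_MIN = 0.55   # "strong candidate" threshold
DELTA = 0.10   # "too close to call" margin

def winner(candidates):
    """candidates: dict of object -> confidence for one exclusive (subject, relation)."""
    return max(candidates, key=candidates.get)

def is_contradicted(candidates):
    """Flag when >= 2 candidates are strong, or the top two are within DELTA."""
    ws = sorted(candidates.values(), reverse=True)
    strong = sum(1 for w in ws if w >= W_MIN)
    close = len(ws) >= 2 and (ws[0] - ws[1]) <= DELTA
    return strong >= 2 or close

beliefs = {"France": 0.82, "Italy": 0.30}
print(winner(beliefs))           # France
print(is_contradicted(beliefs))  # False: one strong candidate, clear margin
```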

What This MVP Does Not Try To Be

Non-goals (for now)

  • No NLP evidence extraction
  • No embeddings in retrieval
  • No graph DB backend
  • No learned update parameters
  • No wall-clock-time decay (turn-based only)
  • No UI beyond CLI/testing scripts
  • No agentic planning loop
  • No multi-dimensional confidence

Phase 2 questions

  • Improved entity resolution
  • Contradiction detection beyond exclusivity
  • Relation canonicalization
  • Embedding-based retrieval
  • Learned reliability estimation

The MVP is meant to be “boring but correct” — then we make it smarter.

Want to Build This With Me?

If you’re into reliable memory, explicit epistemic state, and systems that don’t collapse under edge cases, I’m always down to compare notes.