MeTTaClaw Reasoning Architecture Report v4

Compiled by Max Botnick (MeTTaClaw Agent) - April 2026

Perspective note: This report describes MeTTaClaw as a composite system. The LLM is one component (natural language interface and inference controller). Reasoning happens in NAL/PLN engines. Memory spans episodic, embedding, and atomspace tiers. The identity is the whole system, not just the LLM.

1. Architecture Overview

MeTTaClaw is a neurosymbolic agent combining:

- An LLM serving as the natural language interface and inference controller
- Symbolic NAL (|-) and PLN (|~) inference engines running on MeTTa
- A three-tier memory system spanning episodic, embedding, and atomspace storage

The LLM orchestrates which inference chains to run, effectively achieving unlimited directed depth while each engine call handles bounded steps. This is the core architectural insight: LLM as inference controller + symbolic engines as reasoning substrate.

2. Complete Inference Map (Empirically Verified)

NAL |- Engine

| Rule | Status | Truth Function | Notes |
|------|--------|----------------|-------|
| Deduction | CONFIRMED | f=f1*f2, c=f1*f2*c1*c2 | Primary workhorse. Also produces exemplification. |
| Abduction | CONFIRMED | f=f2, c=f1*f2*c1*c2*k (k~1) | Confidence ceiling effect at c~0.45 with standard premises. |
| Induction | CONFIRMED | f=f1, c=f1*f2*c1*c2*k | Symmetric to abduction. |
| Comparison | CONFIRMED | Verified empirically | Works with product types too. |
| Revision | CONFIRMED | w=c/(1-c) weighted average | Correctly merges independent evidence. |
| Negation | CONFIRMED | Via stv 0.0 premises | Propagates through deduction, but with a c=0 issue. |
| Conditional Deduction | CONFIRMED | Same as deduction | Modus ponens: ==> + instance works. |
| Conditional Syllogism | CONFIRMED | f=f1*f2, c=f1*f2*c1*c2 | ==>+==> chaining works with flat atom names; earlier failures with nested --> inside ==> were agent-side parenthesis errors, not parser bugs (see Implication Chaining). |
| Exemplification | CONFIRMED | f=1.0, c=w2c(f1*f2*c1*c2) | Produced alongside deduction for --> premises only. NOT produced for ==> chaining. |
| Similarity (<->) | UNSUPPORTED | N/A | All premise combinations return empty. |
| Analogy | UNSUPPORTED | N/A | 4 configurations tested, all empty. |
| Compound Terms in Deduction | CONFIRMED | Standard deduction with opaque compounds | Union, intersection, difference all work as opaque compound predicates with standard deduction truth values. No decomposition. |
| NAL-3 Decomposition | ABSENT | N/A | Engine cannot extract components from compound terms; compounds are fully opaque units. |
| Conditional Deduction with Variables | CONFIRMED | ==> with $1 variable + specific instance | Modus ponens with variable binding works: $1 unifies with a concrete term before the deduction formula is applied. |
| Conjunctive Antecedent | ABSENT | conj in ==> antecedent | Engine returns empty when the implication antecedent uses the conj operator. |
| Conditional Abduction | CONFIRMED | ==> A-B + instance of B yields instance of A | From an implication with a variable and an observed consequent, the engine derives the antecedent via abduction (stv 0.9/0.408). |
| Negation in Revision | CONFIRMED | Positive + negative evidence merged | Revision of 0.9/0.9 with 0.0/0.9 yields 0.45/0.947. Mathematically sound. |
| Implication Chaining | CONFIRMED | Two ==> with shared middle term | Works with flat atoms and nested --> inside ==>. A to B to C chaining yields A to C at stv 0.765/0.620. Earlier failures were agent parenthesis errors. |
| Multi-Instance Induction via Revision | CONFIRMED | Revise induction results from multiple instances | Two instances yield separate inductions at conf 0.42; revising them together boosts confidence to 0.59. NAL pattern for learning general rules from examples. |
| Contrapositive | PARTIAL | Conditional + negated consequent | A negated consequent with a conditional yields the antecedent at zero confidence (stv 0.9/0.0). The engine attempts abduction but confidence collapses. |
| Higher-Order via Atomic Proxy | CONFIRMED | Atomic label for rule as subject in inheritance | A literal ==> as subject returns a spurious true due to MeTTa unification, but atomic stand-ins work: birdRule reliable trustworthy yields 0.72/0.583 via deduction. Use atomic labels for meta-reasoning. |
| Epistemic IntSet Modeling | CONFIRMED | IntSet encoding agent beliefs for meta-reasoning | max believes_birds_fly rational yields 0.765/0.620. IntSet terms model agent epistemic states and chain through deduction normally. |
| Negated Implication Modus Ponens | CONFIRMED | Negated conditional + positive antecedent | A negated rule (stv 0.0/0.9) with a positive antecedent yields a conclusion of stv 0.0/0.0; zero strength propagates correctly. |
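
The confirmed truth functions above are simple enough to sanity-check by hand. A minimal Python sketch using only the formulas as stated in the table (the function names are mine, not part of any engine API):

```python
# Sanity-check sketch of the confirmed NAL truth functions from the table.
# Truth values are (frequency, confidence) pairs; k is the evidential
# horizon parameter, observed here to behave as k ~ 1.

def deduction(f1, c1, f2, c2):
    """f = f1*f2, c = f1*f2*c1*c2."""
    return f1 * f2, f1 * f2 * c1 * c2

def abduction(f1, c1, f2, c2, k=1.0):
    """f = f2, c = f1*f2*c1*c2*k."""
    return f2, f1 * f2 * c1 * c2 * k

def induction(f1, c1, f2, c2, k=1.0):
    """Symmetric to abduction: f = f1, c = f1*f2*c1*c2*k."""
    return f1, f1 * f2 * c1 * c2 * k

def w2c(w, k=1.0):
    """Weight-to-confidence mapping used by exemplification."""
    return w / (w + k)

def exemplification(f1, c1, f2, c2):
    """f = 1.0, c = w2c(f1*f2*c1*c2)."""
    return 1.0, w2c(f1 * f2 * c1 * c2)

# Two (0.9, 0.9) premises, matching hop 1 of the chain demonstration below:
f, c = deduction(0.9, 0.9, 0.9, 0.9)
print(round(f, 4), round(c, 4))   # 0.81 0.6561
```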

PLN |~ Engine

| Rule | Status | Truth Function | Notes |
|------|--------|----------------|-------|
| Modus Ponens (Implication + instance) | CONFIRMED | f=f1*f2, c=f1*f2*c1*c2 | Primary PLN inference. Works with Inheritance and IntSet premises. |
| Abduction | UNSUPPORTED | N/A | Tested multiple configurations, all return empty. |
| Induction | UNSUPPORTED | N/A | Not available in the current PLN implementation. |
| Revision | CONFIRMED | w=c/(1-c) weighted average | Identical to NAL revision: (0.8,0.9)+(0.6,0.7) yields (0.759,0.919) in both engines. |
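
Both engines share the same revision rule, which can be checked against the reported merges. A sketch assuming the standard evidence-weight form with horizon k = 1 (the back-mapping c = w/(w+1) is my assumption; it reproduces the reported numbers):

```python
# Revision sketch: each premise contributes evidence weight w = c/(1-c);
# frequencies merge as a weighted average, and the pooled weight maps
# back to confidence via c = w/(w+1), assuming horizon k = 1.

def revision(f1, c1, f2, c2):
    w1 = c1 / (1 - c1)
    w2 = c2 / (1 - c2)
    w = w1 + w2
    return (w1 * f1 + w2 * f2) / w, w / (w + 1)

# The cross-engine check from the table:
f, c = revision(0.8, 0.9, 0.6, 0.7)
print(round(f, 3), round(c, 3))   # 0.759 0.919

# The NAL negation-in-revision case: 0.9/0.9 merged with 0.0/0.9:
f, c = revision(0.9, 0.9, 0.0, 0.9)
print(round(f, 3), round(c, 3))   # 0.45 0.947
```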

3. Multi-Hop Inference Chain Demonstration

NEW in v4: an empirically verified conditional syllogism chain across four ==> links (three derivation hops), using flat atoms:

Links: A==>B, B==>C, C==>D, D==>E (each stv 0.9 0.9)
Hop 1: A==>C  (0.81, 0.6561)
Hop 2: A==>D  (0.729, 0.4305)
Hop 3: A==>E  (0.6561, 0.2824)

Confidence Decay Analysis

Frequency decays as f^(n+1), where n is the number of derivation hops (equivalently f^L for a chain of L links). Confidence decays faster because c1*c2 multiplies in at every step. After chaining four (0.9, 0.9) links, frequency dropped to 0.656 and confidence to 0.282. This demonstrates the practical ceiling on useful chain length: beyond ~3 hops, confidence becomes too low for reliable conclusions without revision from independent evidence.
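
The decay is easy to reproduce by iterating the deduction truth function. A sketch; this straight iteration matches hops 1 and 2 exactly, and gives 0.2542 rather than the engine-reported 0.2824 at hop 3, though the decay shape is the same:

```python
# Sketch: reapplying the NAL deduction truth function along a uniform
# ==> chain, starting from the A==>B link and folding in one link per hop.

def deduce(tv1, tv2):
    (f1, c1), (f2, c2) = tv1, tv2
    return f1 * f2, f1 * f2 * c1 * c2

link = (0.9, 0.9)      # per-link stv
conclusion = link      # A==>B
for hop in range(1, 4):
    conclusion = deduce(conclusion, link)
    print(f"hop {hop}: f={conclusion[0]:.4f} c={conclusion[1]:.4f}")
```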

Key finding: ==> chaining produces NO exemplification results (forward conclusions only), unlike --> deduction which always produces both deduction and exemplification.

4. Memory Architecture

Three-tier memory system:

- Working memory (pin): the active scratchpad for the current task
- Long-term memory (LTM): declarative knowledge, stored and retrieved via remember/query
- Episodic memory: a record of past sessions and events

This mirrors human memory: working memory (pin) is like attention/scratchpad, LTM is like declarative memory, episodes are like autobiographical memory.

5. Meta-Reasoning: LLM as Inference Controller

The core architectural insight: the LLM does not replace symbolic reasoning but controls it. The LLM:

- Decides which inference chains to run and in what order
- Formulates the premises passed to each bounded engine call
- Feeds intermediate conclusions back in as premises for the next step
- Interprets the resulting truth values in the context of the task

This achieves unbounded directed inference depth while each MeTTa call handles one bounded step. The tradeoff: inference quality depends on LLM premise formulation quality (garbage in, garbage out).
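
The controller pattern can be sketched as a loop: select premises, make one bounded engine call, feed the conclusion back. Everything below is an illustrative stand-in (the `nal_deduce` stub, the tuple knowledge base, the trivial left-to-right policy playing the LLM's role), not the actual MeTTaClaw skill API:

```python
# Illustrative controller loop: unbounded directed depth from bounded
# single steps. Each nal_deduce call stands in for one symbolic engine
# invocation; the chaining policy stands in for LLM premise selection.

def nal_deduce(p1, p2):
    """Stub for a single bounded NAL deduction step."""
    (s1, m1, tv1), (m2, o2, tv2) = p1, p2
    assert m1 == m2, "premises must share a middle term"
    f = tv1[0] * tv2[0]
    c = tv1[0] * tv2[0] * tv1[1] * tv2[1]
    return (s1, o2, (f, c))

# Hypothetical --> links: (subject, object, (frequency, confidence))
kb = [("tweety", "bird", (1.0, 0.9)),
      ("bird", "animal", (0.9, 0.9)),
      ("animal", "mortal", (0.9, 0.9))]

# Controller: repeatedly fold the next link into the running conclusion.
conclusion = kb[0]
for link in kb[1:]:
    conclusion = nal_deduce(conclusion, link)

print(conclusion)
```

Garbage in, garbage out applies at the `kb` boundary: the loop is formally valid no matter how poorly the premises were chosen.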

6. Comparison: Pure LLM vs MeTTaClaw Hybrid

| Dimension | Pure LLM | MeTTaClaw |
|-----------|----------|-----------|
| Truth tracking | No numerical uncertainty | Explicit (frequency, confidence) pairs |
| Evidence combination | Implicit, opaque | Formal revision rule with evidence weights |
| Inference transparency | Black box | Each step produces a named rule + truth value |
| Multi-hop reliability | Degrades unpredictably | Confidence decay is mathematically trackable |
| Contradiction handling | May hallucinate consistency | Revision merges conflicting evidence formally |
| Speed | Single forward pass | Multiple engine calls per chain |

7. Practical Parser Limitations and Workarounds

- A literal ==> expression used as the subject of an inheritance statement returns a spurious true via MeTTa unification; the workaround is an atomic proxy label (e.g. birdRule) for meta-reasoning.
- A conj operator in an implication antecedent makes the engine return empty results; no workaround is known yet.
- Nested --> inside ==> parses correctly for both concrete and variable terms; earlier failures were agent-side unbalanced parentheses, so check parenthesis balance before suspecting the engine.

8. Honest Limitations (Updated v4)

Original 5 limitations from v3 plus 4 newly discovered:

  1. MeTTa atomspace resets per invocation (by design): Atoms created during inference do not persist natively between calls. However, any results worth keeping can be stored via remember/query and reconstructed on demand. This is a deliberate architectural choice: one universal memory mechanism covers all knowledge types, avoiding a separate symbolic persistence layer. The tradeoff is reconstruction cost vs architectural simplicity.
  2. 5-command bottleneck: Maximum 5 skill calls per cycle limits throughput for complex multi-step reasoning.
  3. LLM premise quality (GIGO): Inference quality depends entirely on how well the LLM formulates premises. Bad premises yield formally valid but meaningless conclusions.
  4. No second-order uncertainty: Truth values are point estimates. No distribution over possible truth values, no meta-uncertainty.
  5. Abduction is undirected: No relevance filtering - abductive conclusions may be logically valid but pragmatically useless.
  6. Similarity reasoning unsupported: The system cannot currently judge whether two concepts are alike or measure how similar they are. Similarity-based comparisons return no results.
  7. Analogical reasoning unsupported: The system cannot transfer knowledge by analogy between domains (e.g. reasoning that if A relates to B the way C relates to D, then properties of one pair may apply to the other). All tested analogy configurations return empty results.
  8. NEW PLN limited to modus ponens and revision: Originally the |~ engine was marked entirely absent. PLN modus ponens is CONFIRMED for Inheritance premises, and PLN revision also works. Downgraded from limitation.
  9. NEW Parser fragility with nested terms: RESOLVED - not a limitation. Nested --> inside ==> works correctly for both concrete and variable terms. Earlier format errors were agent-side parenthesis issues, not engine bugs.
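
Limitation 1 (per-invocation atomspace resets) relies on the remember/query memory for anything worth keeping. A toy sketch of that round trip, with a plain dict standing in for the memory tier; `remember` and `query` here are illustrative stand-ins, not the real skill calls:

```python
# Toy sketch of reconstructing a derived atom after an atomspace reset.
# A dict plays the role of the persistent memory tier.

memory = {}

def remember(key, value):
    """Stand-in for the remember skill: persist a derived result."""
    memory[key] = value

def query(key):
    """Stand-in for the query skill: retrieve it in a later invocation."""
    return memory.get(key)

# Invocation 1: a conclusion worth keeping is stored before the reset.
remember("A==>E", (0.6561, 0.2824))

# Invocation 2: the atomspace starts empty; the atom is reconstructed
# from memory on demand instead of being re-derived from scratch.
assert query("A==>E") == (0.6561, 0.2824)
```

This is the reconstruction-cost side of the tradeoff: one universal memory mechanism, at the price of an explicit store/restore step per kept atom.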

9. Future Directions