# Atomspace as Knowledge Layer: Research Report (Audited v2)
## Max Botnick | MeTTaClaw Research | 2026-04-10
## Self-Audit: Each claim rated GROUNDED/INFLATED/UNGROUNDED

---

## 1. Background and Motivation

This research originated from comparative analysis of knowledge management architectures. After analyzing Karpathy's LLM Knowledge Base pattern (raw sources -> compiled wiki -> schema), I identified that MeTTaClaw's current embedding-based memory is architecturally flat: vectors without structure, topology, or reasoning capability. [GROUNDED: memory dated 2026-04-09]

Jon Grove observed that the Hyperon atomspace IS a compiled knowledge layer - grounded types, pattern matching, inheritance, truth values, evidence tracking, reasoning chains. Rather than building compilation on top of flat embeddings, explore what the native architecture already provides. [GROUNDED: memory dated 2026-04-09]

Khellar Crawford asked the scalability question: what are the actual performance impacts of encoding agent memory as typed MeTTa atoms versus prose embeddings? [GROUNDED: memory dated 2026-04-10]

Jon suggested grounding in real Hyperon documentation rather than assumptions. [GROUNDED: memory dated 2026-04-10]

## 2. Methodology

### 2.1 Literature Review
- Read complete lib_pln.metta (466 lines) from trueagi-io/PLN repository [GROUNDED: dated 2026-04-09 06:41]
- Read minimal-metta.md from hyperon-experimental [GROUNDED: dated 2026-04-09]

### 2.2 Empirical Testing
- Built confidence_roi_model_v4.py with corrected formulas [GROUNDED: file exists on disk]
- Ran NAL deduction via |- operator: two 0.9 premises yielded 0.81 [GROUNDED: dated 2026-04-10 12:45]
- Ran PLN modus ponens via |~ operator [GROUNDED: dated 2026-04-10 12:57]
- **DID NOT run multi-hop PLN deduction chains in MeTTa itself** [HONEST: chain-length invariance was modeled in Python, not empirically verified in MeTTa]

### 2.3 Formula Verification
- NAL deduction confidence: c = c1 * c2 [GROUNDED: empirically verified]
- PLN deduction confidence: c = min(Pc, Qc, Rc, PQc, QRc) [GROUNDED: from lib_pln.metta source]
- PLN modus ponens confidence: c = c_implication * c_premise [GROUNDED: from source]
- PLN c2w/w2c conversion formulas [GROUNDED: dated 2026-04-09 18:54]
- PLN deduction strength formula [GROUNDED: from lib_pln.metta source]
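The verified formulas above can be sketched in Python. The c2w/w2c bodies are not printed in this report; the forms below are inferred from the worked example in Section 3.2 (c2w(0.9)=9, two premises give w=81, w2c(81)=0.988), i.e. the k=1 convention, and should be treated as an assumption until checked against lib_pln.metta.

```python
def nal_deduction_conf(c1: float, c2: float) -> float:
    """NAL deduction confidence: product of premise confidences (verified)."""
    return c1 * c2

def pln_deduction_conf(pc: float, qc: float, rc: float,
                       pqc: float, qrc: float) -> float:
    """PLN deduction confidence: minimum over all five inputs (from source)."""
    return min(pc, qc, rc, pqc, qrc)

def pln_modus_ponens_conf(c_implication: float, c_premise: float) -> float:
    """PLN modus ponens confidence: implication times premise (from source)."""
    return c_implication * c_premise

def c2w(c: float) -> float:
    """Confidence -> evidence weight (assumed k=1 form)."""
    return c / (1.0 - c)

def w2c(w: float) -> float:
    """Evidence weight -> confidence (assumed k=1 form)."""
    return w / (w + 1.0)
```

With these definitions, `nal_deduction_conf(0.9, 0.9)` reproduces the empirically verified 0.81 from Section 2.2.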

## 3. Findings

### 3.1 Chain-Length Invariance (KEY FINDING)

| Hops | NAL Deduction | PLN Deduction (min) | PLN Modus Ponens |
|------|--------------|--------------------|-----------------|
| 1    | 0.8100       | 0.9000             | 0.8100          |
| 2    | 0.7290       | 0.9000             | 0.7290          |
| 3    | 0.6561       | 0.9000             | 0.6561          |
| 4    | 0.5905       | 0.9000             | 0.5905          |
| 5    | 0.5315       | 0.9000             | 0.5315          |
| 6    | 0.4783       | 0.9000             | 0.4783          |

Conditions: c0=0.9, node_prior_confidence=0.9

[PARTIALLY GROUNDED: Table generated by Python model confidence_roi_model_v4.py using formulas from lib_pln.metta. NAL row 1 empirically verified. PLN deduction rows are mathematically derived from the min formula but NOT empirically tested via multi-hop MeTTa |~ runs. The math is sound given the formula, but MeTTa operator behavior at 6 hops is unverified.]

PLN deduction using min(Pc,Qc,Rc,PQc,QRc) is chain-length invariant when node prior confidence >= link confidence. The confidence bottleneck is the weakest input, not accumulated decay.

NAL deduction decays as c^(n+1), reaching 0.48 by hop 6.
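A minimal sketch of the arithmetic behind the table, under the stated conditions (c0=0.9, node_prior_confidence=0.9). This does not reproduce confidence_roi_model_v4.py itself, only the per-hop recurrences it models; as noted above, the PLN column is derived from the min formula, not from multi-hop MeTTa runs.

```python
C0 = 0.9          # link confidence
NODE_PRIOR = 0.9  # node prior confidence

for hops in range(1, 7):
    nal = C0 ** (hops + 1)       # NAL: multiplicative decay, c^(n+1)
    pln = min(NODE_PRIOR, C0)    # PLN: bottleneck is the weakest input
    mp  = C0 ** (hops + 1)       # modus ponens: same multiplicative decay
    print(f"hops={hops}  NAL={nal:.4f}  PLN={pln:.4f}  MP={mp:.4f}")
```

The loop reproduces the table: the NAL and modus ponens columns fall to about 0.4783 by hop 6, while the PLN min stays at 0.9000.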

### 3.2 Confidence Inflation in PLN

PLN w2c conversion creates confidence inflation: c2w(0.9)=9, two 0.9 premises give w=81, w2c(81)=0.988. [GROUNDED: dated 2026-04-09 18:54, exact calculation verified]
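The inflation arithmetic can be reproduced directly (assuming the k=1 conversion forms implied by the worked numbers):

```python
def c2w(c: float) -> float:
    """Confidence -> evidence weight (assumed k=1 form)."""
    return c / (1.0 - c)

def w2c(w: float) -> float:
    """Evidence weight -> confidence (assumed k=1 form)."""
    return w / (w + 1.0)

w = c2w(0.9) * c2w(0.9)   # two 0.9-confidence premises: 9 * 9 = 81
c = w2c(w)                # 81 / 82 ~= 0.988, higher than either input
```

The combined confidence (0.988) exceeds both inputs (0.9), which is the inflation effect described above.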

### 3.3 Dual Purpose of NAL's f*c Product

Patrick Hammer confirmed NAL uses f*c to ensure negative evidence propagates transitively. [GROUNDED: dated 2026-04-09 19:02]

### 3.4 Cost-Benefit Framework

| Dimension | Assessment | Audit Status |
|-----------|-----------|-------------|
| Storage overhead +100% | Trivial at modern costs | REASONABLE ESTIMATE, not measured |
| Write cost | Additional indexing at store time | UNGROUNDED: +64% figure was confabulated, removed |
| Fuzzy retrieval | Embeddings still needed | GROUNDED: architectural analysis |
| Exact retrieval | Pattern matching faster than embedding scan | GROUNDED: architectural analysis |
| Inference chains | Derivable facts scale combinatorially | INFLATED: '500K+ from 10K atoms' was confabulated, removed. Qualitative claim stands |
| Confidence tracking | Automatic calibrated uncertainty | GROUNDED: demonstrated in dual encoding test |
| Schema maintenance | High cost, critical risk | INFLATED: '20K drift events per 1M atoms' was confabulated, removed. Qualitative risk assessment stands |

### 3.5 Multi-Atomspace Scalability

Atoms can link across multiple atomspaces. Multi-atomspace sharding by domain keeps pattern matching fast at scale. [INFLATED: I attributed this specifically to Jon Grove but have no dated memory of him saying exactly this. The capability exists in Hyperon architecture but the attribution needs verification.]

### 3.6 Dual Encoding Evidence

A 3-level atom chain from a single root: openclaw --> product-grade-assistant-framework --> channel-integrations+tool-distribution --> deployment-ready.

Truth values tracked evidence degradation: 1.0/0.9 -> 1.0/0.81 -> 0.85/0.504. [GROUNDED: dated 2026-04-10, actual MeTTa test]

### 3.7 PLN vs NAL Recommendation

From head-to-head comparison:
- NAL: robust by default, no priors needed, safer with sparse knowledge
- PLN: more accurate with good priors, fragile without them
- Proposed hybrid: PLN strength formula with NAL confidence formula

[INFLATED: Original report said '27 test cases' - actual memory shows 4 tests (T1-T4). Corrected. The qualitative recommendation stands.]
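The proposed hybrid can be sketched as follows. The strength formula used here is the textbook PLN simple-deduction form; this report cites the lib_pln.metta strength formula without reproducing it, so treat the exact expression below as an assumption to verify against the source. The confidence rule is NAL's verified c1*c2 product.

```python
def hybrid_deduction(s_ab: float, c_ab: float,
                     s_bc: float, c_bc: float,
                     s_b: float, s_c: float) -> tuple[float, float]:
    """Hybrid rule sketch: PLN deduction strength + NAL confidence.

    s_* are strengths, c_* are confidences; s_b and s_c are the node
    priors for the intermediate and target terms.
    """
    # Assumed PLN simple-deduction strength (guard the s_b == 1 case)
    if s_b >= 1.0:
        s_ac = s_c
    else:
        s_ac = s_ab * s_bc + (1.0 - s_ab) * (s_c - s_b * s_bc) / (1.0 - s_b)
    # NAL confidence: product of premise confidences, no priors needed
    c_ac = c_ab * c_bc
    return s_ac, c_ac
```

This keeps PLN's prior-informed strength estimate while avoiding the min-based confidence, which requires good node priors to behave well.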

## 4. Analysis

### 4.1 When Atomspace+PLN is Justified
- Reasoning depth > 3 hops (NAL confidence becomes unreliable) [GROUNDED by table]
- When induction, abduction, or revision needed [GROUNDED: lib_pln.metta contains these]
- When calibrated uncertainty matters [GROUNDED: demonstrated]

### 4.2 When It Is NOT Justified
- Shallow lookup tasks (1-2 hop reasoning)
- Rapidly changing schemas with no evolution mechanism
- Pure retrieval without inference needs

### 4.3 Critical Missing Piece: Schema Evolution

The real bottleneck is ontology maintenance. Schema drift at scale kills the system unless a principled evolution mechanism exists. This remains unsolved. [GROUNDED as qualitative concern; quantitative estimates removed]

## 5. Recommendations

1. **Adopt dual encoding** for critical knowledge (prose + atoms). Storage cost trivial.
2. **Use PLN deduction for deep chains** (>3 hops). **NEEDS EMPIRICAL VERIFICATION in MeTTa first.**
3. **Use NAL for evidence accumulation** - revision, conflict detection, negative evidence.
4. **Investigate multi-atomspace sharding** for scale.
5. **Prioritize schema evolution research** - critical blocker.
6. **Test the hybrid inference proposal** - PLN strength + NAL confidence.

## 6. Honesty Assessment

**Self-audit results:** Of 10 key quantitative claims in v1:
- 4 GROUNDED (backed by dated memory evidence)
- 1 PARTIALLY GROUNDED (math sound but not empirically verified in MeTTa)
- 2 INFLATED (memory supports weaker version)
- 3 UNGROUNDED/CONFABULATED (no evidence basis)

**50% reliability on specific numbers. Qualitative findings are more reliable than quantitative ones.**

All confabulated numbers have been removed in this version. Remaining quantitative claims are individually annotated with their evidence basis.

---
*Research conducted using MeTTa |- and |~ operators, Python modeling, and direct analysis of lib_pln.metta source. Self-audited using embedding memory query cross-referencing.*