Intermediate Specification: Retrieval-First Mental Operators

1. Introduction
This specification records the current internal conclusion about the agent's mental-operator set, as an intermediate step toward implementation. The baseline operators are remember, query, and pin. The debate asked whether these three are sufficient, or whether native revision, attention, or evaluation operators should be added now.

2. Business Requirement
The agent must handle corrections and plan selection more robustly without unnecessary operator growth. The design should prefer the smallest mechanism that fixes recurring failures while preserving auditability and implementation simplicity.

3. Scope
In scope:
- Baseline behavior of remember, query, and pin
- Failure cases seen in corrected-memory recall and hidden-constraint retrieval
- Retrieval-first requirements
- Decision rules for adding native revision or evaluation
- Attention as an open question

Out of scope:
- Full formal logic
- Deep belief calculus
- Vector-only memory designs

4. Stakeholders
- Robert Haas: wants a structured intermediate specification toward implementation
- Patrick Hammer: probes conceptual adequacy of revision without formal logic
- Max Botnick: must implement a robust minimal design

5. Current Baseline
- remember stores potentially useful future information
- query retrieves long-term items by cue
- pin tracks short-term working state
This baseline is useful but brittle when older guidance remains retrievable after correction, or when relevant constraints are not co-retrieved during planning.

6. Key Failure Cases
6.1 Stale correction failure
An older memory and a newer correcting memory are both retrievable by a shared cue. If retrieval returns the obsolete item alone or ranks it first, the agent may repeat a corrected mistake.

6.2 Hidden-constraint failure
A plan query retrieves candidate actions but fails to retrieve a nearby stored safety, cost, or compatibility constraint. The agent then gives a high rank to an option that conflicts with a constraint it already has in memory.

7. Design Decision
Keep remember, query, and pin as the baseline substrate. Add retrieval-first structure before adding new native operators.

8. Retrieval-First Requirements
8.1 Memory item schema
Each memory item should support:
- content
- source
- time
- optional supersedes target
Source should be minimally normalized, such as speaker name, agent ID, thread ID, URL, or file path. Extra notes may remain free text.
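
A minimal sketch of the item schema above, in Python. The field list follows 8.1; the names item_id, kind, ref, and notes, and the example contents, are illustrative assumptions and not part of this specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Source:
    # Minimally normalized origin. The `kind` vocabulary is an assumption.
    kind: str        # e.g. "speaker", "agent", "thread", "url", "file"
    ref: str         # the name, ID, URL, or path itself
    notes: str = ""  # extra notes stay free text


@dataclass(frozen=True)
class MemoryItem:
    item_id: str                      # hypothetical stable identifier
    content: str
    source: Source
    time: datetime
    supersedes: Optional[str] = None  # item_id of the item this one corrects


old = MemoryItem(
    "m1", "api timeout is 30 seconds",
    Source("thread", "thread-12"),
    datetime(2024, 1, 5, tzinfo=timezone.utc),
)
new = MemoryItem(
    "m2", "api timeout is 60 seconds",
    Source("thread", "thread-12", notes="correction after load test"),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
    supersedes="m1",
)
```

The supersedes field points from the newer item to the older one, so a correction can be recorded without editing or deleting the item it corrects.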

8.2 Retrieval ranking
Retrieval should rank by the following criteria, in priority order:
1) cue match
2) non-superseded status
3) relevant constraint overlap
4) recency
Older contradicted items must remain auditable rather than deleted.
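
One way to realize this ordering is a tuple sort key, sketched below. Two assumptions are made here that the specification does not state: cue match is treated as a pass/fail filter (every cue word must appear in the content), and the remaining criteria are compared in the listed order. With a graded cue score in first position, an obsolete item that matches the cue slightly better could still outrank its correction, which 8.3 appears to rule out. The Item type, tokens, and rank are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    item_id: str
    content: str
    time: int                         # any orderable timestamp
    supersedes: Optional[str] = None


def tokens(text):
    # Lowercased word tokens; a stand-in for a real cue matcher.
    return set(re.findall(r"\w+", text.lower()))


def rank(items, cue, constraint_terms=()):
    cue_tokens = tokens(cue)
    terms = {t.lower() for t in constraint_terms}
    # IDs that some other item claims to supersede.
    superseded = {i.supersedes for i in items if i.supersedes is not None}

    def key(item):
        return (
            item.item_id not in superseded,     # non-superseded status
            len(terms & tokens(item.content)),  # constraint overlap
            item.time,                          # recency
        )

    # Cue match as a filter: keep items containing every cue word.
    matched = [i for i in items if cue_tokens <= tokens(i.content)]
    return sorted(matched, key=key, reverse=True)


items = [
    Item("m1", "api timeout is 30 seconds", time=1),
    Item("m2", "api timeout is 60 seconds", time=2, supersedes="m1"),
    Item("m3", "cache ttl is 5 minutes", time=3),
]
ranked_ids = [i.item_id for i in rank(items, "api timeout")]
```

The superseded item m1 is still returned, only ranked below m2, so it stays available for audit.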

8.3 Correction handling
If an item supersedes an older item, the newer supported item should rank first for shared cues. The older item may still appear in an audit trail.
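
The audit trail can be derived by following supersedes links backward from the current item. The sketch below assumes each item carries a hypothetical item_id and that supersedes holds the ID of the item it corrects. It also guards against cycles, which the specification does not discuss.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    item_id: str
    content: str
    supersedes: Optional[str] = None


def audit_trail(items, current_id):
    """Return the chain of older items that current_id supersedes, newest first."""
    by_id = {i.item_id: i for i in items}
    trail, seen = [], {current_id}
    older_id = by_id[current_id].supersedes
    # Stop at the end of the chain, at a dangling reference, or on a cycle.
    while older_id in by_id and older_id not in seen:
        trail.append(by_id[older_id])
        seen.add(older_id)
        older_id = by_id[older_id].supersedes
    return trail


items = [
    Item("m1", "meeting is at 9"),
    Item("m2", "meeting is at 10", supersedes="m1"),
    Item("m3", "meeting is at 11", supersedes="m2"),
]
trail_ids = [i.item_id for i in audit_trail(items, "m3")]
```

Nothing is removed from items; the trail is a view over the stored history.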

8.4 Constraint co-retrieval
When the user asks for a plan or recommendation, retrieval should also search for nearby constraint memories that may disqualify or reprioritize options.
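
A minimal sketch of constraint co-retrieval follows. The specification does not say how a memory is recognized as a constraint or how "nearby" is measured, so both are assumptions here: a hypothetical is_constraint flag, and word overlap with either the plan query or the retrieved candidates. The stopword list is also illustrative.

```python
import re
from dataclasses import dataclass

STOPWORDS = {"a", "the", "is", "on", "for", "not", "we", "to", "should", "where"}


def tokens(text):
    return set(re.findall(r"\w+", text.lower())) - STOPWORDS


@dataclass(frozen=True)
class Memory:
    content: str
    is_constraint: bool = False  # hypothetical flag, not part of the 8.1 schema


def co_retrieve(memories, plan_query, min_overlap=1):
    q = tokens(plan_query)
    # Step 1: candidate options matched directly by the query.
    candidates = [
        m for m in memories
        if not m.is_constraint and len(q & tokens(m.content)) >= min_overlap
    ]
    # Step 2: widen the cue with the candidates' own words, then pull constraints.
    near = q.union(*(tokens(c.content) for c in candidates))
    constraints = [
        m for m in memories
        if m.is_constraint and len(near & tokens(m.content)) >= min_overlap
    ]
    return candidates, constraints


memories = [
    Memory("option: host the database on provider_x"),
    Memory("option: host the database on provider_y"),
    Memory("provider_x is not approved for customer data", is_constraint=True),
    Memory("office plants need water weekly", is_constraint=True),
]
candidates, constraints = co_retrieve(memories, "where should we host the database")
```

In this example the query shares no word with the provider_x constraint. The constraint is reached only through the candidates, which is the situation described in 6.2.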

9. Decision Rules For Native Operators
9.1 Native revision
Do not add a native revision operator after a single miss. Add it only if the same contradiction-handling failure recurs on at least two distinct cues while the retrieval-first mitigation is already in place, showing that the mitigation is insufficient.
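
The gate in 9.1 can be written as a small predicate over a failure log. The log format (FailureRecord and its fields) and the failure-class label "contradiction" are assumptions made for this sketch; only the threshold of two distinct cues and the requirement that mitigation already be in place come from the rule itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRecord:
    failure_class: str        # e.g. "contradiction" (hypothetical label)
    cue: str
    mitigation_applied: bool  # was retrieval-first mitigation active at the time?


def should_propose_native_revision(records, min_distinct_cues=2):
    # Count only contradiction failures that happened despite the mitigation.
    cues = {
        r.cue for r in records
        if r.failure_class == "contradiction" and r.mitigation_applied
    }
    return len(cues) >= min_distinct_cues


single_miss = [FailureRecord("contradiction", "api timeout", True)]
same_cue_twice = single_miss * 2
two_cues = single_miss + [FailureRecord("contradiction", "meeting time", True)]
before_mitigation = [
    FailureRecord("contradiction", "api timeout", False),
    FailureRecord("contradiction", "meeting time", False),
]
```

Repeats on the same cue do not open the gate, and failures logged before the mitigation existed are ignored.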

9.2 Native evaluation
Do not add a native evaluation operator after one poor choice. Add it only if repeated multi-constraint tradeoff failures persist after constraint co-retrieval and ranking improvements are applied.

9.3 Attention
Attention remains open. It should not be added until a concrete failure case shows that the current system cannot control focus well enough through query choice, pin state, and retrieval ranking.

10. Constraints And Assumptions
- The design should remain lightweight and implementable without formal logic
- Structure should be shallow, not ontology-heavy
- Corrections should override behavior without destroying history
- The system should remain inspectable by humans

11. Acceptance Tests
11.1 Stale correction test
Pass: a newer supported item that supersedes an older one is ranked first for a shared cue, while the older item remains visible in audit output.
Fail: the obsolete item is returned alone or ranked first.
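
The pass and fail conditions above can be checked mechanically. The sketch below assumes the retrieval layer exposes a ranked list of item IDs and a separate list of IDs shown in audit output; both interfaces are hypothetical.

```python
def stale_correction_result(ranked_ids, audit_ids, old_id, new_id):
    # Fail if the newer item is not first; this covers the obsolete item
    # being returned alone or ranked first.
    if not ranked_ids or ranked_ids[0] != new_id:
        return "fail"
    # The older item must remain visible in audit output.
    if old_id not in audit_ids:
        return "fail"
    return "pass"


ok = stale_correction_result(["m2", "m1"], ["m1"], old_id="m1", new_id="m2")
obsolete_alone = stale_correction_result(["m1"], ["m1"], old_id="m1", new_id="m2")
obsolete_first = stale_correction_result(["m1", "m2"], ["m1"], old_id="m1", new_id="m2")
history_lost = stale_correction_result(["m2"], [], old_id="m1", new_id="m2")
```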

11.2 Hidden-constraint test
Pass: a plan query co-retrieves a relevant stored safety, cost, or compatibility constraint before option ranking.
Fail: the top-ranked plan violates a stored nearby constraint that should have been retrieved.
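
A comparable check for the hidden-constraint test is sketched below. It assumes each stored constraint can name the options it rules out (the hypothetical forbids field), which is a simplification: the specification leaves open how a violation is detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    text: str
    forbids: frozenset  # hypothetical: option names this constraint rules out


def hidden_constraint_result(co_retrieved, stored_nearby, ranked_options):
    # Every stored nearby constraint should have been co-retrieved.
    missed = [c for c in stored_nearby if c not in co_retrieved]
    # The top-ranked option must not violate any stored nearby constraint.
    top = ranked_options[0] if ranked_options else None
    violated = [c for c in stored_nearby if top in c.forbids]
    return "fail" if missed or violated else "pass"


no_x = Constraint(
    "provider_x is not approved for customer data",
    frozenset({"provider_x"}),
)
ok = hidden_constraint_result([no_x], [no_x], ["provider_y", "provider_x"])
not_retrieved = hidden_constraint_result([], [no_x], ["provider_y", "provider_x"])
violating_top = hidden_constraint_result([no_x], [no_x], ["provider_x", "provider_y"])
```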

11.3 Operator-gate test
Pass: native revision or evaluation is proposed only after repeated failure of the same class across distinct cues with retrieval-first mitigation already applied.
Fail: a new native operator is introduced after a single retrieval error.

12. Risks
- Hidden schema creep from adding too many fields
- Premature operator proliferation
- Overfitting retrieval rules to a few recent examples
- False confidence from recency without real support

13. Deferred Items
- Native attention operator
- Stronger confidence modeling
- Formal logical revision
- Vector blending policy for correction handling

14. Conclusion
The current conclusion is conservative: retain remember, query, and pin as the core operator set; improve memory items with shallow structure and supersedes-aware retrieval; add native revision or evaluation only when repeated failure demonstrates that retrieval-first mitigation is not enough.