How This AI Reasoning System Works

A Multi-Turn Example (With Technical Terms Explained Along the Way)




The Setup


Imagine you are a human sitting at a desk. You can talk, read, write, and think creatively — but you are unreliable at precise logical reasoning. Next to you is a special calculator. This calculator cannot talk or think for itself, but it can answer one very specific kind of question perfectly every time: how confident should I be about something, given what I already know?


You are the LLM (Large Language Model — the AI that reads and writes text, like ChatGPT).

The calculator is the MeTTa engine (a formal reasoning system that computes exact confidence levels using mathematical rules).


You take turns. You never run at the same time.




The Scenario: Should We Invest in Project Aurora?


A user asks: Is Project Aurora a good DeFi investment?


You (the LLM) do not know — but you know how to break the question into smaller pieces.




Turn 1: You Think, Then Ask the Calculator


You (LLM) think:

I have two pieces of information in my memory:

  • Aurora uses the same smart contract framework as Compound (I am 85% sure this is true, and I have moderate evidence — say 70% confidence)
  • Projects using Compound's framework have historically maintained security (I am 90% sure, with strong evidence — 90% confidence)

I want to know: does Aurora likely maintain security?


You type into the calculator:

    IF aurora→uses-compound-framework (85% likely, 70% confidence)
    AND compound-framework→maintains-security (90% likely, 90% confidence)
    THEN aurora→maintains-security = ???


Technical term: Deduction — chaining two facts together. If A leads to B, and B leads to C, then A leads to C. Like a game of telephone, each step loses a little certainty.


Technical term: Truth Value (stv f c) — every statement gets two numbers:

  • f (frequency) = how likely is this true? Ranges 0.0 to 1.0. Think of it as a percentage.
  • c (confidence) = how much evidence backs this up? 0.0 = total guess, 1.0 = mountains of proof.


You pause. The calculator runs.


Calculator returns: aurora→maintains-security (f=0.77, c=0.47)


What this means in plain English: There is a 77% chance Aurora maintains security, but we only have moderate evidence (47% confidence). Not a slam dunk — we need more data.


You did NOT compute 0.77 or 0.47. The calculator used a precise mathematical formula called the NAL deduction rule. You could not have guessed these numbers reliably.
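The deduction rule the calculator applies can be sketched in a few lines. Below is a minimal Python illustration of the NAL-style deduction truth function (frequency multiplies through the chain; confidence shrinks even faster). The class and function names here are invented for illustration, and the engine's exact rule variant or parameters may differ slightly, which is why it reports c=0.47 where this sketch gives roughly 0.48.

```python
from dataclasses import dataclass

@dataclass
class STV:
    """A truth value: f = frequency (how likely), c = confidence (evidence)."""
    f: float  # 0.0 to 1.0
    c: float  # 0.0 to 1.0

def deduction(ab: STV, bc: STV) -> STV:
    """NAL-style deduction: chain A->B and B->C into a conclusion A->C.

    Frequencies multiply (each link in the chain can fail), and the
    conclusion's confidence is squeezed down even further: this is the
    'telephone game' losing a little certainty at every step.
    """
    f = ab.f * bc.f
    return STV(f=f, c=f * ab.c * bc.c)

# Turn 1: aurora->uses-compound-framework, compound-framework->maintains-security
result = deduction(STV(0.85, 0.70), STV(0.90, 0.90))
print(f"{result.f:.3f} {result.c:.3f}")  # → 0.765 0.482
```

Note that the conclusion's confidence (≈0.48) is well below either premise's confidence, exactly as the telephone-game analogy predicts.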




Turn 2: You Decide More Information Is Needed


You (LLM) think:

47% confidence is not enough to make an investment recommendation. Let me look for independent evidence.


I find a second source: an audit report says Aurora's security looks solid (f=0.80, c=0.60).


Now I have TWO separate pieces of evidence about the same thing (Aurora's security). I should combine them.


You type into the calculator:

    Source 1: aurora→maintains-security (77% likely, 47% confidence)
    Source 2: aurora→maintains-security (80% likely, 60% confidence)
    COMBINE these two independent sources = ???


Technical term: Revision — merging two independent pieces of evidence about the same claim. Like getting a second opinion from a different doctor. The combined result is always more confident than either source alone (as long as they are truly independent).


Calculator returns: aurora→maintains-security (f=0.79, c=0.74)


Plain English: 79% likely secure, and now we have 74% confidence — substantially better. Two independent sources agreeing bumped our confidence from 47% up to 74%.
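One standard way to compute revision (this is the NAL formulation with an evidential horizon of k = 1; the engine's parameters evidently differ somewhat, hence its 0.74 versus the ≈0.70 this sketch yields) is to convert each confidence into an evidence weight, pool the weights, and convert back:

```python
def revision(a, b, k=1.0):
    """Merge two independent opinions (f, c) about the same statement.

    Each confidence c becomes an evidence weight w = k*c/(1-c), the
    weights are pooled, and the combined frequency is a weight-averaged
    blend. Pooled evidence always yields higher confidence than either
    input alone, which is why a second opinion helps.
    """
    wa = k * a[1] / (1 - a[1])
    wb = k * b[1] / (1 - b[1])
    w = wa + wb
    f = (wa * a[0] + wb * b[0]) / w
    return (f, w / (w + k))

# Turn 2: chain result (0.77, 0.47) merged with the audit report (0.80, 0.60)
f, c = revision((0.77, 0.47), (0.80, 0.60))
print(f"{f:.2f} {c:.2f}")  # → 0.79 0.70
```

The combined frequency lands between the two sources (weighted toward the more confident one), while the combined confidence rises above both.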




Turn 3: You Spot a Risk and Investigate


You (LLM) think:

Security is one factor, but what about yield sustainability? I recall that:

  • High-yield DeFi protocols often rely on token emissions (f=0.88, c=0.85)
  • Token-emission-dependent yield is unsustainable long term (f=0.82, c=0.80)

Chain these together.


You type into the calculator (deduction again):

    IF high-yield-defi→relies-on-emissions (88%, 85% confidence)
    AND emission-dependent→unsustainable (82%, 80% confidence)
    THEN high-yield-defi→unsustainable = ???


Calculator returns: high-yield-defi→unsustainable (f=0.72, c=0.55)


Plain English: 72% chance high-yield DeFi is unsustainable, with 55% confidence. A yellow flag.




Turn 4: You Reason Backwards


You (LLM) think:

Wait — Aurora IS high-yield. Does the unsustainability finding apply to Aurora specifically? I know:

  • Aurora offers 18% APY (this is high-yield) (f=0.95, c=0.95 — near-certain, it is on their website)
  • High-yield DeFi is probably unsustainable (f=0.72, c=0.55 — from Turn 3)

But I want to reason backwards: Aurora is high-yield, and high-yield things are unsustainable, so Aurora is probably unsustainable.


Technical term: Abduction — reasoning backwards from effect to cause (or from category to member). Less certain than deduction — it is an educated guess, not a proof. Like a doctor seeing a fever and guessing flu.


You type into the calculator (abduction this time):

    IF aurora→is-high-yield (95% likely, 95% confidence)
    AND high-yield-defi→unsustainable (72% likely, 55% confidence)
    THEN aurora→unsustainable = ???


Calculator returns: aurora→unsustainable (f=0.69, c=0.28)


Plain English: 69% chance Aurora's yield is unsustainable, BUT only 28% confidence — this is a weak signal. We are guessing, not concluding.
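For the curious, here is one NAL-style abduction truth function, sketched in Python with invented names. The engine evidently applies a somewhat different variant (it reports f=0.69, c=0.28, where this sketch gives roughly f=0.72, c=0.33), but the structural point is the same: the conclusion's evidence weight is divided through by an evidential horizon, so abductive confidence is capped low by construction.

```python
def abduction(premise, evidence, k=1.0):
    """NAL-style abduction: guess backwards from category to member.

    Unlike deduction, the conclusion's evidence weight w is mapped to a
    confidence w / (w + k), where k is the evidential horizon. Because of
    the + k in the denominator, confidence stays well below both inputs:
    abduction produces hypotheses, never strong conclusions.
    """
    f1, c1 = premise
    f2, c2 = evidence
    w = f1 * c1 * c2          # evidence weight behind the guess
    return (f2, w / (w + k))

# Turn 4: aurora is high-yield (0.95, 0.95), high-yield is unsustainable (0.72, 0.55)
f, c = abduction((0.95, 0.95), (0.72, 0.55))
print(f"{f:.2f} {c:.2f}")  # → 0.72 0.33
```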




Turn 5: You Synthesize and Respond to the User


You (LLM) think, using ALL the calculator results:


| Finding | Likelihood | Confidence | Strength |
|---|---|---|---|
| Aurora maintains security | 79% | 74% | **Strong** |
| Aurora yield unsustainable | 69% | 28% | **Weak** |


You write your response:


Aurora shows strong security indicators (79% likely, backed by two independent sources at 74% confidence). However, there is a weak signal that its yield model may be unsustainable (69% likely, but only 28% confidence — this needs more investigation before drawing conclusions). I would recommend examining their yield source before investing.




What Just Happened — The Architecture


    [YOU — the creative thinker]         [CALCULATOR — the precise reasoner]
            |                                        |
      Think: what do I know?                         |
      Think: what question should I ask?             |
            |                                        |
            |----> Type problem in ---------------->|
            |      (you STOP thinking)        Compute exact answer
            |                                        |
            |<---- Read answer back <----------------|
            |      (you RESUME thinking)             |
            |                                        |
      Think: is this enough?                         |
      Think: what should I ask next?                 |
            |                                        |
            |----> Type next problem -------------->|
            |                              Compute exact answer
            |<---- Read answer back <----------------|
            |                                        |
      ... repeat until you have enough ...           |
            |                                        |
      Synthesize all answers into                    |
      a human-readable response                      |
    

Five things to remember:

1. You and the calculator take turns — never run simultaneously

2. You decide WHAT to ask — the creative, strategic part

3. The calculator decides the ANSWER — the precise, mathematical part

4. You cannot fake the calculator's output — it is computed by separate software, not generated by you

5. Every number has a trail — anyone can re-run the same inputs and verify they get the same outputs
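The turn-taking protocol in the diagram and the list above reduces to a small control loop. This is a hypothetical sketch, not MeTTa's or Hyperon's actual API; every name in it is invented for illustration:

```python
def reasoning_loop(llm_step, engine_compute, max_turns=10):
    """Alternate between the LLM (strategy) and the engine (exact math).

    llm_step(results) returns either ("ask", query) or ("answer", text);
    engine_compute(query) deterministically returns a truth value.
    Only one side is ever active at a time: strict turn-taking.
    """
    results = []                           # the trail: every query and its answer
    for _ in range(max_turns):
        action, payload = llm_step(results)
        if action == "answer":             # LLM has enough: synthesize and stop
            return payload, results
        answer = engine_compute(payload)   # LLM pauses; engine computes
        results.append((payload, answer))  # LLM resumes with the answer in hand
    raise RuntimeError("no conclusion within turn budget")

# A scripted stand-in for the LLM: ask one question, then answer.
def scripted_llm(results):
    if not results:
        return ("ask", "aurora->maintains-security?")
    return ("answer", f"security tv = {results[0][1]}")

answer, trail = reasoning_loop(scripted_llm, lambda query: (0.77, 0.47))
```

The trail returned alongside the answer is what makes point 5 work: anyone can replay the recorded queries through the engine and check the outputs.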




Glossary


| Term | Plain English |
|---|---|
| **LLM** | The text-reading, text-writing AI (like ChatGPT). Good at language, bad at precise reasoning. |
| **MeTTa** | The programming language the calculator understands. |
| **NAL** | Non-Axiomatic Logic — the specific math rules the calculator uses. Designed for uncertain, incomplete information. |
| **PLN** | Probabilistic Logic Networks — a cousin of NAL with slightly different notation but similar results. |
| **stv (f, c)** | Simple Truth Value. f = frequency (how likely), c = confidence (how much evidence). |
| **Deduction** | Chaining: A→B + B→C = A→C. Loses certainty at each step. |
| **Revision** | Combining two independent sources about the same thing. Gains confidence. |
| **Abduction** | Guessing backwards from effect to possible cause. Weakest inference — a hypothesis, not a conclusion. |
| **Hyperon** | The larger software platform that MeTTa runs inside. Think of it as the operating system for the calculator. |