# Worked examples

> Complete, runnable examples covering the full authoring workflow.

## UK Free School Meals

The primary worked example. Three sections composing to `A AND (B OR C)`, three source documents that cross-reference each other, 23 test cases.

**Source documents:**

* Education Act 1996 (s.512, s.512ZA) — child eligibility gate
* The Education (Free School Meals) (England) Regulations 2014 (Reg 3, 4, 4A, 5) — appears in all three sections
* Children and Families Act 2014 (s.105) — universal infant entitlement

**Section structure:**

| Section                             | Covers                                                | Source documents                                                     |
| ----------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------- |
| A — `child_eligibility`             | Age 4–15, state-funded school                         | Education Act 1996 + Free School Meals Regulations Reg 3             |
| B — `household_qualifying_criteria` | 7 benefit routes + looked-after/care leaver           | Free School Meals Regulations Reg 4 + Reg 4A                         |
| C — `universal_infant_fsm`          | Reception, Year 1, Year 2 — automatic, no income test | Children and Families Act 2014 + Free School Meals Regulations Reg 5 |

**Live composed rulebook:** `aethis/uk-fsm`, which combines the three sections under one `outcome_logic`. Deciding against it requires an authenticated call (`rulebook_id: aethis/uk-fsm`).

**Live public section ruleset (anonymous decide OK):** `aethis/uk-fsm/child-eligibility` — the Section A gate, queryable by itself.

Sections B (`household_criteria`) and C (`universal_infant`) live as rulesets *inside* the `aethis/uk-fsm` rulebook (Phase B.2.2 of the converged 2-term model) rather than as standalone published slugs. Hit them via `rulebook_id: aethis/uk-fsm` and the rulebook's `outcome_logic` combines them with Section A.

**Try a decision now** (no API key needed):

<Tabs>
  <Tab title="CLI">
    ```bash theme={null}
    aethis decide \
      -b aethis/uk-fsm/child-eligibility \
      -i '{"child.age": 10, "child.school_type": "state_funded"}' \
      --explain
    ```
  </Tab>

  <Tab title="Python SDK">
    ```python theme={null}
    from aethis_sdk import Aethis

    with Aethis() as client:
        response = client.decide(
            ruleset_id="aethis/uk-fsm/child-eligibility",
            field_values={"child.age": 10, "child.school_type": "state_funded"},
            include_trace=True,
        )
        print(response.decision)
    ```
  </Tab>

  <Tab title="curl">
    ```bash theme={null}
    curl -X POST https://api.aethis.ai/api/v1/public/decide \
      -H "Content-Type: application/json" \
      -d '{
        "ruleset_id": "aethis/uk-fsm/child-eligibility",
        "field_values": { "child.age": 10, "child.school_type": "state_funded" },
        "include_trace": true
      }'
    ```
  </Tab>

  <Tab title="MCP">
    Ask your coding agent in natural language:

    > *"Use Aethis to check whether a 10-year-old at a state-funded school qualifies for free school meals under `aethis/uk-fsm/child-eligibility`. Include the trace."*

    Your agent invokes `aethis_decide` for you.
  </Tab>
</Tabs>

```json theme={null}
{
  "decision": "eligible",
  "trace": {
    "status": "eligible",
    "path": "school_type_state_funded",
    "answered": ["child.age", "child.school_type"],
    "group_statuses": {
      "school_type_check": "satisfied",
      "age_check": "satisfied",
      "age_upper_check": "satisfied"
    }
  }
}
```

**What this example demonstrates:**

* Multi-section composition with shared source documents
* OR logic across sections (B or C is sufficient)
* Automatic entitlement override (Section C has no income test)
* Integer arithmetic with threshold comparison (£7,400 UC threshold)
* Enum fields (`child.school_type`, `child.year_group`)
* Unconditional boolean flags (`child.is_looked_after`, `child.is_care_leaver`)

**Full source, test cases, and guidance:** [github.com/Aethis-ai/aethis-examples](https://github.com/Aethis-ai/aethis-examples)

### How Section A was authored

The child eligibility section is the simplest — two fields, six tests, no refinement needed. Here is the complete authoring journey.

**Step 1 — Source documents.** Two statutory texts were provided:

* Education Act 1996 (s.512, s.512ZA) — defines "relevant school" (maintained schools, Academies, non-maintained special schools, pupil referral units) and compulsory school age
* Free School Meals Regulations 2014 (Reg 3) — establishes entitlement for children aged 4–15 at a relevant school

**Step 2 — Domain guidance.** Before authoring began, two domain-level hints were added that apply to all three sections:

```
aethis_add_domain_guidance({
  domain: "uk_fsm",
  guidance_text: "Use child.* prefix for child fields, household.* for household fields.",
  process_type: "field_extraction",
  adherence: "exact"
})
```

**Step 3 — Section-level guidance.** Three hints for this section:

1. *"This section determines only whether the child is eligible based on age and school type. It does not assess household income — that is Section B."*
2. *"child.age should represent the child's age in whole years at the start of the academic year (1 September)."*
3. *"child.school\_type should be an enum with values: state\_funded, independent, home\_educated. Only state\_funded schools are within scope."*

**Step 4 — Test cases.** Six scenarios covering both dimensions (age range and school type):

```yaml theme={null}
tests:
  - name: "Age 4 at state-funded school — eligible"
    inputs: { child.age: 4, child.school_type: state_funded }
    expect: { outcome: eligible }

  - name: "Age 15 at state-funded school — eligible"
    inputs: { child.age: 15, child.school_type: state_funded }
    expect: { outcome: eligible }

  - name: "Age 3 — too young"
    inputs: { child.age: 3, child.school_type: state_funded }
    expect: { outcome: not_eligible }

  - name: "Age 16 — above upper limit"
    inputs: { child.age: 16, child.school_type: state_funded }
    expect: { outcome: not_eligible }

  - name: "Age 10 at independent school — not eligible"
    inputs: { child.age: 10, child.school_type: independent }
    expect: { outcome: not_eligible }

  - name: "Age 8, home educated — not eligible"
    inputs: { child.age: 8, child.school_type: home_educated }
    expect: { outcome: not_eligible }
```

Test strategy: boundary values (4 and 15), below boundary (3), above boundary (16), and every excluded enum value with an age that would otherwise pass.
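The boundary strategy can be checked against a plain-Python model of the Section A gate. This is a hypothetical stand-in written for illustration, not the generated ruleset or the Aethis engine: the function name `child_eligible` and the hard-coded bounds are assumptions taken from the six test cases above.

```python
# Hypothetical local model of the Section A gate; not the generated ruleset.
MIN_AGE, MAX_AGE = 4, 15  # inclusive bounds, as in the test cases


def child_eligible(age: int, school_type: str) -> str:
    """Return 'eligible' or 'not_eligible' for the two Section A fields."""
    in_age_range = MIN_AGE <= age <= MAX_AGE
    in_scope_school = school_type == "state_funded"
    return "eligible" if in_age_range and in_scope_school else "not_eligible"


# The six scenarios from the YAML above: (age, school_type, expected outcome).
CASES = [
    (4, "state_funded", "eligible"),
    (15, "state_funded", "eligible"),
    (3, "state_funded", "not_eligible"),
    (16, "state_funded", "not_eligible"),
    (10, "independent", "not_eligible"),
    (8, "home_educated", "not_eligible"),
]

results = [child_eligible(age, school) == expected for age, school, expected in CASES]
print(all(results))
```

Each excluded enum value is paired with an in-range age, so a failure in those rows can only come from the school-type check.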

**Step 5 — Generate and test.** All 6 tests passed on the first generation — no refinement loop needed. The source text was unambiguous and the guidance hints were specific enough.

**Step 6 — Publish.** Ruleset `aethis/uk-fsm/child-eligibility` published with label *"v1 — child eligibility gate (age 4–15, state-funded schools)"*.

### Composition

The three published sections compose into a rulebook with outcome logic `A AND (B OR C)`:

```yaml theme={null}
sections:
  - section_id: child_eligibility
    pin_mode: latest_active
  - section_id: household_qualifying_criteria
    pin_mode: latest_active
  - section_id: universal_infant_fsm
    pin_mode: latest_active

outcome_logic: "A AND (B OR C)"
```

Section A is a prerequisite gate: both routes (means-tested and universal infant) require it to pass. A Year 1 child at a state-funded school satisfies A and C with no income test. A Year 6 child falls outside C, so must pass both A and B.
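To see what `A AND (B OR C)` admits, the expression can be enumerated over plain booleans. This is a minimal sketch that assumes each section has already resolved to true or false. The `compose` helper is hypothetical, and the real engine may also carry intermediate states such as `pending`, which this does not model.

```python
from itertools import product


def compose(a: bool, b: bool, c: bool) -> bool:
    """Outcome logic A AND (B OR C) over three resolved section results."""
    return a and (b or c)


# Enumerate all eight combinations of section outcomes.
truth_table = {
    (a, b, c): compose(a, b, c) for a, b, c in product([False, True], repeat=3)
}

for (a, b, c), outcome in truth_table.items():
    print(f"A={a!s:5} B={b!s:5} C={c!s:5} -> {outcome}")
```

Only three of the eight combinations are eligible, and all three have A true, which is what makes Section A a gate rather than one route among several.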

***

## Construction All Risks insurance

Benchmark domain. A five-level exception chain in a London market endorsement — the failure pattern used to test frontier LLMs.

```
Access damage is excluded (Clause 8)
  → unless project value ≥ £100M — enhanced cover reinstates it (Clause 9(1))
  → unless defect is a design defect — enhanced cover doesn't apply (Clause 9(2))
  → unless project value ≥ £500M — pioneer override reinstates it (Clause 9(3))
  → unless defect was known prior — pioneer override is blocked (Clause 9A(1))
  → unless there's an engineer assessment — the block is lifted (Clause 9A(2))
```
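Read as nested exceptions, the chain reduces to a sequence of early returns. The sketch below is one reading of the listing above, not the endorsement wording or the Aethis ruleset. The function name, parameter names, and the use of whole pounds are all assumptions made for illustration.

```python
# One reading of the chain above as nested exceptions; names are hypothetical.
ENHANCED_THRESHOLD = 100_000_000  # £100M, Clause 9(1)
PIONEER_THRESHOLD = 500_000_000   # £500M, Clause 9(3)


def access_damage_covered(
    project_value: int,
    is_design_defect: bool,
    known_prior: bool,
    has_engineer_assessment: bool,
) -> bool:
    # Clause 8: excluded by default; Clause 9(1) reinstates at or above £100M.
    if project_value < ENHANCED_THRESHOLD:
        return False
    # Clause 9(2): enhanced cover does not apply to design defects.
    if not is_design_defect:
        return True
    # Clause 9(3): pioneer override reinstates cover at or above £500M.
    if project_value < PIONEER_THRESHOLD:
        return False
    # Clause 9A(1): a defect known prior blocks the override...
    if not known_prior:
        return True
    # Clause 9A(2): ...unless an engineer assessment lifts the block.
    return has_engineer_assessment


print(access_damage_covered(499_000_000, True, False, False))  # False
print(access_damage_covered(500_000_000, True, True, True))    # True
```

Each level flips the answer reached by the level before it, so stopping one clause early gives the opposite result.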

Frontier LLM accuracy on the v3.8 adversarial CAR extension (20 newly-authored scenarios, Simpson et al. v3.8 2026, Table 8c):

| Model                                             |  Accuracy (N=20) | Notes                                                           |
| ------------------------------------------------- | :--------------: | --------------------------------------------------------------- |
| **Aethis Engine**                                 | **20/20 (100%)** | deterministic, \<5ms, same answer every time                    |
| GPT-5.4 (`reasoning_effort=low`)                  |   20/20 (100%)   | 16–126 reasoning tokens per scenario                            |
| Claude Sonnet 4.6                                 |    19/20 (95%)   | fails E4 (DE3/LEG3 carveback gap)                               |
| GPT-5.4 (default)                                 |    19/20 (95%)   | **0 reasoning tokens on every scenario** — short-circuits on E4 |
| **Claude Opus 4.7** (current Anthropic strongest) |  **18/20 (90%)** | fails E4 + B3 (£499M boundary)                                  |

Three of four frontier configurations fail the same scenario across both Anthropic and OpenAI families. The Aethis engine is invariant by construction.

**The shifting-ground problem (v3.8, §6.5 Finding 6).** Several v3.7 paper cells closed silently between March and April 2026 under the same model alias — GPT-5.4 on construction-CAR moved from 96.6% to 100%; Opus 4.6 on spacecraft from 89.7% to 98.5%. The v3.7 11-scenario exception-chain subset that earlier examples cited has been replaced by the v3.8 adversarial extension above because current frontier configurations all hit 100% on the smaller subset. Frontier-LLM accuracy on a fixed benchmark is a moving target — exactly what regulated workflows cannot tolerate.

**External validation (v3.8, §6.10).** On the peer-reviewed [LegalBench](https://hazyresearch.stanford.edu/legalbench/) benchmark — 9 tasks, 949 held-out cases — the Aethis Engine is significantly more accurate than each of three frontier LLMs: combined paired-binomial McNemar's *p* \< 0.001 vs Sonnet 4.6, *p* = 0.003 vs Opus 4.7, *p* \< 0.001 vs GPT-5.4. The structural advantage holds on randomly-sampled tasks chosen without fit inspection. Full harness: [confidently-wrong-benchmark/legalbench/](https://github.com/Aethis-ai/confidently-wrong-benchmark/tree/main/legalbench).

**Full benchmark data, paper, and reproduction scripts:** [github.com/Aethis-ai/confidently-wrong-benchmark](https://github.com/Aethis-ai/confidently-wrong-benchmark). Research paper: *Confidently Wrong: Exception Chain Collapse in Frontier LLM Rule Evaluation* (Simpson, Kozak, Doake, v3.8, 2026).

***

## Spacecraft Crew Certification Act 2049

A deliberately simple public demo domain — 11 fields across 7 rule groups, ideal for first experiments. Two demonstrations: a one-field short-circuit, and a fully-specified happy path.

### One field, decision reached (Vogon)

A Vogon is disqualifying under §3(1) regardless of any other answer, so the engine short-circuits the moment `space.crew.species: "Vogon"` is provided:

<Tabs>
  <Tab title="CLI">
    ```bash theme={null}
    aethis decide -b aethis/spacecraft-crew-certification \
      -i '{"space.crew.species": "Vogon"}' \
      --explain
    ```
  </Tab>

  <Tab title="Python SDK">
    ```python theme={null}
    from aethis_sdk import Aethis

    with Aethis() as client:
        response = client.decide(
            ruleset_id="aethis/spacecraft-crew-certification",
            field_values={"space.crew.species": "Vogon"},
            include_trace=True,
        )
        print(response.decision)
        print(response.trace["failure_reasons"])
    ```
  </Tab>

  <Tab title="curl">
    ```bash theme={null}
    curl -X POST https://api.aethis.ai/api/v1/public/decide \
      -H "Content-Type: application/json" \
      -d '{
        "ruleset_id": "aethis/spacecraft-crew-certification",
        "field_values": {"space.crew.species": "Vogon"},
        "include_trace": true
      }'
    ```
  </Tab>

  <Tab title="MCP">
    > *"Use Aethis to check whether a Vogon is eligible under `aethis/spacecraft-crew-certification`, with trace."*
  </Tab>
</Tabs>

```json theme={null}
{
  "decision": "not_eligible",
  "fields_provided": 1,
  "fields_evaluated": 11,
  "trace": {
    "status": "ineligible",
    "failure_reasons": [
      ["species_not_vogon", [
        { "type": "answer", "field": "space.crew.species", "value": "vogon" },
        { "type": "condition",
          "expression": "Not(spacecraft-crew-certification:v1:space.crew.species == Vogon)" }
      ]]
    ],
    "group_statuses": {
      "species_eligibility": "not_satisfied",
      "flight_readiness": "pending",
      "medical_certification": "pending",
      "medical_cert_validity": "pending",
      "radiation_certification": "pending",
      "propulsion_compliance": "pending",
      "towel_compliance": "pending"
    }
  }
}
```

`fields_provided: 1`, `fields_evaluated: 11` — the engine reasoned across all 11 fields and discharged the case as soon as the species check failed. The other groups stay `pending` because no further questions need answering.
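A client can read the short-circuit straight off the trace. The sketch below groups `group_statuses` by status using only the standard library. It works on a trimmed copy of the response shown above, not a live call, and the `by_status` structure is an illustrative choice rather than part of the SDK.

```python
import json

# Trimmed copy of the response shown above (failure_reasons omitted).
raw = """
{
  "decision": "not_eligible",
  "fields_provided": 1,
  "fields_evaluated": 11,
  "trace": {
    "status": "ineligible",
    "group_statuses": {
      "species_eligibility": "not_satisfied",
      "flight_readiness": "pending",
      "medical_certification": "pending",
      "medical_cert_validity": "pending",
      "radiation_certification": "pending",
      "propulsion_compliance": "pending",
      "towel_compliance": "pending"
    }
  }
}
"""

response = json.loads(raw)
statuses = response["trace"]["group_statuses"]

# Bucket rule groups by their reported status.
by_status: dict[str, list[str]] = {}
for group, status in statuses.items():
    by_status.setdefault(status, []).append(group)

print(by_status["not_satisfied"])  # ['species_eligibility']
print(len(by_status["pending"]))   # 6
```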

### All fields, happy path (Human + Improbability Drive)

To pass every gate, the applicant needs a valid licence, medical, radiation cert, towel, and a vessel running on something more exciting than conventional propulsion (§7(2) — see `aethis/spacecraft-crew-certification/explain`):

<Tabs>
  <Tab title="CLI">
    ```bash theme={null}
    aethis decide -b aethis/spacecraft-crew-certification -i '{
      "space.crew.species": "Human",
      "space.crew.age": 35,
      "space.crew.flight_hours": 600,
      "space.crew.has_pilot_license": true,
      "space.crew.has_gaa_exam": true,
      "space.crew.has_approved_provider_cert": true,
      "space.medical.cert_valid": true,
      "space.mission.type": "orbital",
      "space.crew.has_radiation_cert": true,
      "space.vessel.propulsion_type": "Infinite Improbability Drive",
      "space.crew.has_towel": true
    }'
    ```
  </Tab>

  <Tab title="Python SDK">
    ```python theme={null}
    from aethis_sdk import Aethis

    with Aethis() as client:
        response = client.decide(
            ruleset_id="aethis/spacecraft-crew-certification",
            field_values={
                "space.crew.species": "Human",
                "space.crew.age": 35,
                "space.crew.flight_hours": 600,
                "space.crew.has_pilot_license": True,
                "space.crew.has_gaa_exam": True,
                "space.crew.has_approved_provider_cert": True,
                "space.medical.cert_valid": True,
                "space.mission.type": "orbital",
                "space.crew.has_radiation_cert": True,
                "space.vessel.propulsion_type": "Infinite Improbability Drive",
                "space.crew.has_towel": True,
            },
        )
        print(response.decision)
    ```
  </Tab>

  <Tab title="curl">
    ```bash theme={null}
    curl -X POST https://api.aethis.ai/api/v1/public/decide \
      -H "Content-Type: application/json" \
      -d '{
        "ruleset_id": "aethis/spacecraft-crew-certification",
        "field_values": {
          "space.crew.species": "Human", "space.crew.age": 35,
          "space.crew.flight_hours": 600, "space.crew.has_pilot_license": true,
          "space.crew.has_gaa_exam": true, "space.crew.has_approved_provider_cert": true,
          "space.medical.cert_valid": true, "space.mission.type": "orbital",
          "space.crew.has_radiation_cert": true,
          "space.vessel.propulsion_type": "Infinite Improbability Drive",
          "space.crew.has_towel": true
        }
      }'
    ```
  </Tab>

  <Tab title="MCP">
    > *"Use Aethis to check whether a 35-year-old human with 600 flight hours, all valid certs, an orbital mission, an Infinite Improbability Drive vessel, and a towel is eligible under `aethis/spacecraft-crew-certification`."*
  </Tab>
</Tabs>

```
decision: eligible
```

Swap `"Infinite Improbability Drive"` for `"Conventional"` and the decision flips to `not_eligible` with `propulsion_compliance: not_satisfied` — the engine names the exact group that failed.
