# Worked examples

> Complete, runnable examples covering the full authoring workflow.

## UK Free School Meals

The primary worked example. Three sections composing to `A AND (B OR C)`, three source documents that cross-reference each other, 23 test cases.

**Source documents:**

* Education Act 1996 (s.512, s.512ZA): child eligibility gate
* The Education (Free School Meals) (England) Regulations 2014 (Reg 3, 4, 4A, 5): appears in all three sections
* Children and Families Act 2014 (s.105): universal infant entitlement

**Section structure:**

| Section | Covers | Source documents |
| --- | --- | --- |
| A: `child_eligibility` | Age 4–15, state-funded school | Education Act 1996 + Free School Meals Regulations Reg 3 |
| B: `household_qualifying_criteria` | 7 benefit routes + looked-after/care leaver | Free School Meals Regulations Reg 4 + Reg 4A |
| C: `universal_infant_fsm` | Reception, Year 1, Year 2: automatic, no income test | Children and Families Act 2014 + Free School Meals Regulations Reg 5 |

**Live composed rulebook:** `aethis/uk-fsm` combines the three sections under one `outcome_logic`. Authenticated decide (`rulebook_id: aethis/uk-fsm`).

**Live public section ruleset (anonymous decide OK):** `aethis/uk-fsm/child-eligibility` is the Section A gate, queryable by itself.

Sections B (`household_criteria`) and C (`universal_infant`) live as rulesets *inside* the `aethis/uk-fsm` rulebook (Phase B.2.2 of the converged 2-term model) rather than as standalone published slugs. Hit them via `rulebook_id: aethis/uk-fsm` and the rulebook's `outcome_logic` combines them with Section A.

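The composition above can be sketched in plain Python. This is an illustration of the `A AND (B OR C)` logic only, not the Aethis engine or its API: the `Child` class, the function names, and the `year_group` values are hypothetical, and Section B is reduced to a boolean because its benefit routes are not spelled out on this page.

```python
from dataclasses import dataclass

# Hypothetical enum values for child.year_group; the real ruleset may name them differently.
INFANT_YEAR_GROUPS = {"reception", "year_1", "year_2"}


@dataclass
class Child:
    age: int
    school_type: str  # "state_funded", "independent", or "home_educated"
    year_group: str   # hypothetical values, e.g. "reception", "year_1", "year_6"


def section_a(child: Child) -> bool:
    """Child eligibility gate: age 4 to 15 at a state-funded school."""
    return 4 <= child.age <= 15 and child.school_type == "state_funded"


def section_c(child: Child) -> bool:
    """Universal infant entitlement: Reception, Year 1, Year 2, no income test."""
    return child.year_group in INFANT_YEAR_GROUPS


def fsm_outcome(child: Child, household_qualifies: bool) -> bool:
    """Combine the sections as A AND (B OR C).

    Section B is passed in as a boolean (assumption: it was evaluated elsewhere).
    """
    return section_a(child) and (household_qualifies or section_c(child))


year_1 = Child(age=5, school_type="state_funded", year_group="year_1")
year_6 = Child(age=10, school_type="state_funded", year_group="year_6")

print(fsm_outcome(year_1, household_qualifies=False))  # infant route: A and C
print(fsm_outcome(year_6, household_qualifies=False))  # neither B nor C holds
print(fsm_outcome(year_6, household_qualifies=True))   # means-tested route: A and B
```

Section A sits outside the `OR`, so a child at an independent school fails regardless of household circumstances or year group.
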
**Try a decision now** (no API key needed):

```bash
aethis decide \
  -b aethis/uk-fsm/child-eligibility \
  -i '{"child.age": 10, "child.school_type": "state_funded"}' \
  --explain
```

```python
from aethis_sdk import Aethis

with Aethis() as client:
    response = client.decide(
        ruleset_id="aethis/uk-fsm/child-eligibility",
        field_values={"child.age": 10, "child.school_type": "state_funded"},
        include_trace=True,
    )
    print(response.decision)
```

```bash
curl -X POST https://api.aethis.ai/api/v1/public/decide \
  -H "Content-Type: application/json" \
  -d '{
    "ruleset_id": "aethis/uk-fsm/child-eligibility",
    "field_values": {
      "child.age": 10,
      "child.school_type": "state_funded"
    },
    "include_trace": true
  }'
```

Ask your coding agent in natural language:

> *"Use Aethis to check whether a 10-year-old at a state-funded school qualifies for free school meals under `aethis/uk-fsm/child-eligibility`. Include the trace."*

Your agent invokes `aethis_decide` for you.

Example response:

```json
{
  "decision": "eligible",
  "trace": {
    "status": "eligible",
    "path": "school_type_state_funded",
    "answered": ["child.age", "child.school_type"],
    "group_statuses": {
      "school_type_check": "satisfied",
      "age_check": "satisfied",
      "age_upper_check": "satisfied"
    }
  }
}
```

**What this example demonstrates:**

* Multi-section composition with shared source documents
* OR logic across sections (B or C is sufficient)
* Automatic entitlement override (Section C has no income test)
* Integer arithmetic with threshold comparison (£7,400 UC threshold)
* Enum fields (`child.school_type`, `child.year_group`)
* Unconditional boolean flags (`child.is_looked_after`, `child.is_care_leaver`)

**Full source, test cases, and guidance:** [github.com/Aethis-ai/aethis-examples](https://github.com/Aethis-ai/aethis-examples)

### How Section A was authored

The child eligibility section is the simplest: two fields, six tests, no refinement needed. Here is the complete authoring journey.

**Step 1: Source documents.** Two statutory texts were provided:

* Education Act 1996 (s.512, s.512ZA): defines "relevant school" (maintained schools, Academies, non-maintained special schools, pupil referral units) and compulsory school age
* Free School Meals Regulations 2014 (Reg 3): establishes entitlement for children aged 4–15 at a relevant school

**Step 2: Domain guidance.** Before authoring began, two domain-level hints were added that apply to all three sections. One of them:

```
aethis_add_domain_guidance({
  domain: "uk_fsm",
  guidance_text: "Use child.* prefix for child fields, household.* for household fields.",
  process_type: "field_extraction",
  adherence: "exact"
})
```

**Step 3: Section-level guidance.** Three hints for this section:

1. *"This section determines only whether the child is eligible based on age and school type. It does not assess household income — that is Section B."*
2. *"child.age should represent the child's age in whole years at the start of the academic year (1 September)."*
3. *"child.school\_type should be an enum with values: state\_funded, independent, home\_educated.
Only state\_funded schools are within scope."*

**Step 4: Test cases.** Six scenarios covering both dimensions (age range and school type):

```yaml
tests:
  - name: "Age 4 at state-funded school — eligible"
    inputs: { child.age: 4, child.school_type: state_funded }
    expect: { outcome: eligible }

  - name: "Age 15 at state-funded school — eligible"
    inputs: { child.age: 15, child.school_type: state_funded }
    expect: { outcome: eligible }

  - name: "Age 3 — too young"
    inputs: { child.age: 3, child.school_type: state_funded }
    expect: { outcome: not_eligible }

  - name: "Age 16 — above upper limit"
    inputs: { child.age: 16, child.school_type: state_funded }
    expect: { outcome: not_eligible }

  - name: "Age 10 at independent school — not eligible"
    inputs: { child.age: 10, child.school_type: independent }
    expect: { outcome: not_eligible }

  - name: "Age 8, home educated — not eligible"
    inputs: { child.age: 8, child.school_type: home_educated }
    expect: { outcome: not_eligible }
```

Test strategy: boundary values (4 and 15), below boundary (3), above boundary (16), and every excluded enum value with an age that would otherwise pass.

**Step 5: Generate and test.** All 6 tests passed on the first generation, so no refinement loop was needed. The source text was unambiguous and the guidance hints were specific enough.

**Step 6: Publish.** Ruleset `aethis/uk-fsm/child-eligibility` published with label *"v1 — child eligibility gate (age 4–15, state-funded schools)"*.

### Composition

The three published sections compose into a rulebook with outcome logic `A AND (B OR C)`:

```yaml
sections:
  - section_id: child_eligibility
    pin_mode: latest_active
  - section_id: household_qualifying_criteria
    pin_mode: latest_active
  - section_id: universal_infant_fsm
    pin_mode: latest_active
outcome_logic: "A AND (B OR C)"
```

Section A is a prerequisite gate: both routes (means-tested and universal infant) require it to pass. A Year 1 child passes both A and C automatically.
A Year 6 child must pass both A and B.

***

## Construction All Risks insurance

Benchmark domain. A five-level exception chain in a London market endorsement: the failure pattern used to test frontier LLMs.

```
Access damage is excluded (Clause 8)
  → unless project value ≥ £100M — enhanced cover reinstates it (Clause 9(1))
    → unless defect is a design defect — enhanced cover doesn't apply (Clause 9(2))
      → unless project value ≥ £500M — pioneer override reinstates it (Clause 9(3))
        → unless defect was known prior — pioneer override is blocked (Clause 9A(1))
          → unless there's an engineer assessment — the block is lifted (Clause 9A(2))
```

Frontier LLM accuracy on the v3.8 adversarial CAR extension (20 newly-authored scenarios, Simpson et al. v3.8 2026, Table 8c):

| Model | Accuracy (N=20) | Notes |
| --- | :---: | --- |
| **Aethis Engine** | **20/20 (100%)** | deterministic, \<5ms, same answer every time |
| GPT-5.4 (`reasoning_effort=low`) | 20/20 (100%) | 16–126 reasoning tokens per scenario |
| Claude Sonnet 4.6 | 19/20 (95%) | fails E4 (DE3/LEG3 carveback gap) |
| GPT-5.4 (default) | 19/20 (95%) | **0 reasoning tokens on every scenario**; short-circuits on E4 |
| **Claude Opus 4.7** (current Anthropic strongest) | **18/20 (90%)** | fails E4 + B3 (£499M boundary) |

Three of four frontier configurations fail the same scenario (E4) across both Anthropic and OpenAI families. The Aethis engine is invariant by construction.

**The shifting-ground problem (v3.8, §6.5 Finding 6).** Several v3.7 paper cells closed silently between March and April 2026 under the same model alias: GPT-5.4 on construction-CAR moved from 96.6% to 100%; Opus 4.6 on spacecraft from 89.7% to 98.5%.
The v3.7 11-scenario exception-chain subset that earlier examples cited has been replaced by the v3.8 adversarial extension above because current frontier configurations all hit 100% on the smaller subset. Frontier-LLM accuracy on a fixed benchmark is a moving target, which is exactly what regulated workflows cannot tolerate.

**External validation (v3.8, §6.10).** On the peer-reviewed [LegalBench](https://hazyresearch.stanford.edu/legalbench/) benchmark (9 tasks, 949 held-out cases), the Aethis Engine is significantly more accurate than each of three frontier LLMs: combined paired-binomial McNemar's *p* \< 0.001 vs Sonnet 4.6, *p* = 0.003 vs Opus 4.7, *p* \< 0.001 vs GPT-5.4. The structural advantage holds on randomly-sampled tasks chosen without fit inspection. Full harness: [confidently-wrong-benchmark/legalbench/](https://github.com/Aethis-ai/confidently-wrong-benchmark/tree/main/legalbench).

**Full benchmark data, paper, and reproduction scripts:** [github.com/Aethis-ai/confidently-wrong-benchmark](https://github.com/Aethis-ai/confidently-wrong-benchmark). Research paper: *Confidently Wrong: Exception Chain Collapse in Frontier LLM Rule Evaluation* (Simpson, Kozak, Doake, v3.8, 2026).

***

## Spacecraft Crew Certification Act 2049

A deliberately simple public demo domain: 11 fields across 7 rule groups, ideal for first experiments. Two demonstrations: a one-field short-circuit, and a fully-specified happy path.

### One field, decision reached (Vogon)

A Vogon is disqualifying under §3(1) regardless of any other answer, so the engine short-circuits the moment `space.crew.species: "Vogon"` is provided:

```bash
aethis decide -b aethis/spacecraft-crew-certification \
  -i '{"space.crew.species": "Vogon"}' \
  --explain
```

```python
from aethis_sdk import Aethis

with Aethis() as client:
    response = client.decide(
        ruleset_id="aethis/spacecraft-crew-certification",
        field_values={"space.crew.species": "Vogon"},
        include_trace=True,
    )
    print(response.decision)
    print(response.trace["failure_reasons"])
```

```bash
curl -X POST https://api.aethis.ai/api/v1/public/decide \
  -H "Content-Type: application/json" \
  -d '{
    "ruleset_id": "aethis/spacecraft-crew-certification",
    "field_values": {"space.crew.species": "Vogon"},
    "include_trace": true
  }'
```

> *"Use Aethis to check whether a Vogon is eligible under `aethis/spacecraft-crew-certification`, with trace."*

```json
{
  "decision": "not_eligible",
  "fields_provided": 1,
  "fields_evaluated": 11,
  "trace": {
    "status": "ineligible",
    "failure_reasons": [
      ["species_not_vogon", [
        { "type": "answer", "field": "space.crew.species", "value": "vogon" },
        { "type": "condition", "expression": "Not(spacecraft-crew-certification:v1:space.crew.species == Vogon)" }
      ]]
    ],
    "group_statuses": {
      "species_eligibility": "not_satisfied",
      "flight_readiness": "pending",
      "medical_certification": "pending",
      "medical_cert_validity": "pending",
      "radiation_certification": "pending",
      "propulsion_compliance": "pending",
      "towel_compliance": "pending"
    }
  }
}
```

With `fields_provided: 1` and `fields_evaluated: 11`, the engine reasoned across all 11 fields and discharged the case as soon as the species check failed. The other groups stay `pending` because no further questions need answering.

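The short-circuit behaviour can be illustrated with a small three-valued evaluation in plain Python. This is a sketch under stated assumptions, not the engine's implementation: only the species rule is taken from the trace above, the towel rule is a guess at what `towel_compliance` checks, and the `"undetermined"` label for an incomplete case is hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Inputs = Dict[str, object]
# A group check returns True (satisfied), False (not satisfied),
# or None when the inputs it needs have not been provided yet (pending).
GroupCheck = Callable[[Inputs], Optional[bool]]


def species_eligibility(inputs: Inputs) -> Optional[bool]:
    """Rule from the trace above: the species must not be Vogon."""
    species = inputs.get("space.crew.species")
    if species is None:
        return None
    return species != "Vogon"


def towel_compliance(inputs: Inputs) -> Optional[bool]:
    """Hypothetical rule: the applicant must carry a towel."""
    has_towel = inputs.get("space.crew.has_towel")
    if has_towel is None:
        return None
    return bool(has_towel)


GROUPS: Dict[str, GroupCheck] = {
    "species_eligibility": species_eligibility,
    "towel_compliance": towel_compliance,
}

STATUS_LABELS = {True: "satisfied", False: "not_satisfied", None: "pending"}


def decide(inputs: Inputs) -> Tuple[str, Dict[str, str]]:
    """One failed group settles the case, whatever is still pending."""
    results = {name: check(inputs) for name, check in GROUPS.items()}
    if any(result is False for result in results.values()):
        decision = "not_eligible"
    elif all(result is True for result in results.values()):
        decision = "eligible"
    else:
        decision = "undetermined"  # hypothetical label: more answers are needed
    statuses = {name: STATUS_LABELS[result] for name, result in results.items()}
    return decision, statuses


decision, statuses = decide({"space.crew.species": "Vogon"})
print(decision)
print(statuses)
```

Because every group must be satisfied, a single `not_satisfied` group is enough to fix the outcome, which is why the remaining groups can stay `pending`.
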
### All fields, happy path (Human + Improbability Drive)

To pass every gate, the applicant needs a valid licence, medical, radiation cert, towel, and a vessel running on something more exciting than conventional propulsion (§7(2); see `aethis/spacecraft-crew-certification/explain`):

```bash
aethis decide -b aethis/spacecraft-crew-certification -i '{
  "space.crew.species": "Human",
  "space.crew.age": 35,
  "space.crew.flight_hours": 600,
  "space.crew.has_pilot_license": true,
  "space.crew.has_gaa_exam": true,
  "space.crew.has_approved_provider_cert": true,
  "space.medical.cert_valid": true,
  "space.mission.type": "orbital",
  "space.crew.has_radiation_cert": true,
  "space.vessel.propulsion_type": "Infinite Improbability Drive",
  "space.crew.has_towel": true
}'
```

```python
from aethis_sdk import Aethis

with Aethis() as client:
    response = client.decide(
        ruleset_id="aethis/spacecraft-crew-certification",
        field_values={
            "space.crew.species": "Human",
            "space.crew.age": 35,
            "space.crew.flight_hours": 600,
            "space.crew.has_pilot_license": True,
            "space.crew.has_gaa_exam": True,
            "space.crew.has_approved_provider_cert": True,
            "space.medical.cert_valid": True,
            "space.mission.type": "orbital",
            "space.crew.has_radiation_cert": True,
            "space.vessel.propulsion_type": "Infinite Improbability Drive",
            "space.crew.has_towel": True,
        },
    )
    print(response.decision)
```

```bash
curl -X POST https://api.aethis.ai/api/v1/public/decide \
  -H "Content-Type: application/json" \
  -d '{
    "ruleset_id": "aethis/spacecraft-crew-certification",
    "field_values": {
      "space.crew.species": "Human",
      "space.crew.age": 35,
      "space.crew.flight_hours": 600,
      "space.crew.has_pilot_license": true,
      "space.crew.has_gaa_exam": true,
      "space.crew.has_approved_provider_cert": true,
      "space.medical.cert_valid": true,
      "space.mission.type": "orbital",
      "space.crew.has_radiation_cert": true,
      "space.vessel.propulsion_type": "Infinite Improbability Drive",
      "space.crew.has_towel": true
    }
  }'
```

> *"Use Aethis to check whether a 35-year-old human with 600
flight hours, all valid certs, an orbital mission, an Infinite Improbability Drive vessel, and a towel is eligible under `aethis/spacecraft-crew-certification`."*

```
decision: eligible
```

Swap `"Infinite Improbability Drive"` for `"Conventional"` and the decision flips to `not_eligible` with `propulsion_compliance: not_satisfied`: the engine names the exact group that failed.
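
A swap like this is easy to script as a one-field "what if" comparison. The helper below is a hypothetical sketch, not part of the SDK: `what_if` and `fake_decide` are illustrative names, and `fake_decide` is an offline stand-in that hard-codes the propulsion behaviour described above so the example runs without network access.

```python
from typing import Any, Callable, Dict, Iterable


def what_if(
    decide: Callable[[Dict[str, Any]], str],
    base: Dict[str, Any],
    field: str,
    values: Iterable[Any],
) -> Dict[Any, str]:
    """Re-run a decision once per candidate value of a single field."""
    outcomes: Dict[Any, str] = {}
    for value in values:
        inputs = {**base, field: value}  # copy, so the base inputs are never mutated
        outcomes[value] = decide(inputs)
    return outcomes


def fake_decide(inputs: Dict[str, Any]) -> str:
    """Offline stand-in for the engine (assumption: only propulsion matters here)."""
    if inputs["space.vessel.propulsion_type"] == "Conventional":
        return "not_eligible"
    return "eligible"


base = {
    "space.crew.species": "Human",
    "space.crew.has_towel": True,
    "space.vessel.propulsion_type": "Infinite Improbability Drive",
}

outcomes = what_if(
    fake_decide,
    base,
    "space.vessel.propulsion_type",
    ["Infinite Improbability Drive", "Conventional"],
)
print(outcomes)
```

Against the live ruleset, `decide` could instead wrap the SDK call shown earlier, for example `lambda fv: client.decide(ruleset_id="aethis/spacecraft-crew-certification", field_values=fv).decision`, with `base` holding all 11 fields.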