Generate rules from source text, run tests, iterate with guidance until all tests pass, then publish.
Authoring is in invite-only private beta; request access to use it. Generation uses an LLM to compile your sources into constraint logic. You supply an Anthropic API key with each request, and it is never stored.
Rule generation is test-driven. The engine generates rules from your source text. Your test suite validates the output. Failing tests become the basis for targeted guidance — feedback that points to the specific source clause the engine missed. The loop repeats until all tests pass.
Rules are compiled from your source text and guidance — not reverse-engineered from your tests. Tests validate the output. Better tests catch more edge cases; better guidance converges faster.
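The loop described above can be sketched in a few lines. This is a minimal illustration, not the product's client: `AuthoringClient`, `generateAndTest`, `refine`, and `authorUntilGreen` are hypothetical names, and the result shape is assumed to carry `tests_passing`, `tests_total`, and `failures`.

```typescript
// Hypothetical shapes for a generate-and-test result.
interface TestFailure { name: string; expected: string; got: string; hint?: string }
interface GenerationResult { tests_passing: number; tests_total: number; failures: TestFailure[] }

// Hypothetical client: thin wrappers around the generation and refine calls.
interface AuthoringClient {
  generateAndTest(projectId: string): Promise<GenerationResult>;
  refine(projectId: string, feedback: string): Promise<GenerationResult>;
}

// Runs generate -> test -> refine until every test passes or the round limit is hit.
// `writeFeedback` is where a person turns failures into clause-specific guidance;
// the loop itself never invents guidance.
async function authorUntilGreen(
  client: AuthoringClient,
  projectId: string,
  writeFeedback: (failures: TestFailure[]) => string,
  maxRounds = 5,
): Promise<GenerationResult> {
  let result = await client.generateAndTest(projectId);
  for (let round = 0; round < maxRounds && result.failures.length > 0; round++) {
    result = await client.refine(projectId, writeFeedback(result.failures));
  }
  return result;
}
```

The round limit is a design choice: if guidance is not converging after a few rounds, the feedback is probably too vague and needs rewriting rather than repeating.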
If you completed Phase 2 (field vocabulary), you already have confirmed field names and a project_id. Write your test cases using those names and skip to Step 2.
If starting fresh on a single-section domain, you need test cases before creating the ruleset. Two approaches:
Inspect an existing ruleset in the same domain with aethis_schema to learn the field naming convention, then write tests using those names.
Make your best guess at field names, create the ruleset with preliminary tests, then run aethis_discover_fields to confirm. If names don’t match, add corrected tests via aethis_create_ruleset with a new project — the cost is low.
Either way, the field names in your test cases must exactly match what the engine discovers. Phase 2 exists to prevent this mismatch.
aethis_create_ruleset({
  name: "Child eligibility gate",
  section_id: "child_eligibility",
  domain: "uk_fsm",
  source_text: "...", // text of the source documents for this section
  test_cases: [
    {
      name: "Age 3 — too young, not eligible",
      field_values: { "child.age": 3, "child.school_type": "state_funded" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 4 — minimum age, eligible",
      field_values: { "child.age": 4, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    },
    {
      name: "Age 15 — maximum age, state school, eligible",
      field_values: { "child.age": 15, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    },
    {
      name: "Age 16 — over maximum age, not eligible",
      field_values: { "child.age": 16, "child.school_type": "state_funded" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 10, independent school — not eligible",
      field_values: { "child.age": 10, "child.school_type": "independent" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 10, state school — eligible",
      field_values: { "child.age": 10, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    }
  ]
})
Returns:
{ "project_id": "proj_8CzLVwyx53rTGEJv" }
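The test cases in the call above probe both sides of each inclusive bound (3 and 4, 15 and 16). As a rough sketch, a small helper can produce those four cases for any integer range. `boundaryCases` and its parameters are hypothetical; only the test-case shape (`name`, `field_values`, `expected_outcome`) follows the call above.

```typescript
type Outcome = "eligible" | "not_eligible";

interface TestCase {
  name: string;
  field_values: Record<string, number | string | boolean>;
  expected_outcome: Outcome;
}

// Builds four cases around an inclusive integer range [min, max]:
// just below, at the lower bound, at the upper bound, and just above.
// `fixed` holds the other field values needed to isolate the range check.
function boundaryCases(
  field: string,
  min: number,
  max: number,
  fixed: Record<string, number | string | boolean>,
): TestCase[] {
  const probes: Array<[number, Outcome]> = [
    [min - 1, "not_eligible"],
    [min, "eligible"],
    [max, "eligible"],
    [max + 1, "not_eligible"],
  ];
  return probes.map(([value, expected]) => ({
    name: `${field} = ${value}: expected ${expected}`,
    field_values: { ...fixed, [field]: value },
    expected_outcome: expected,
  }));
}

// Ages 3, 4, 15 and 16 for a child at a state-funded school.
const ageCases = boundaryCases("child.age", 4, 15, { "child.school_type": "state_funded" });
```

Generated cases cover only the range itself; cases that vary a second field (such as school type) still need to be written by hand.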
Write test cases after running aethis_discover_fields. Test field names must exactly match the engine’s discovered field names. A mismatched name causes the engine to treat that field as absent — the test may silently pass or fail for the wrong reason.
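Because a mismatched name is treated as an absent field, it can help to check test cases against the discovered names before generating. A minimal sketch follows; `findUnknownFields` is a hypothetical helper, and it assumes the output of aethis_discover_fields has already been reduced to a plain list of field names.

```typescript
// Returns, per test case name, the field names that the engine did not discover.
// An empty result means every test field matches a discovered field exactly.
function findUnknownFields(
  testCases: Array<{ name: string; field_values: Record<string, unknown> }>,
  discoveredFields: string[],
): Record<string, string[]> {
  const known = new Set(discoveredFields);
  const problems: Record<string, string[]> = {};
  for (const testCase of testCases) {
    const unknown = Object.keys(testCase.field_values).filter((f) => !known.has(f));
    if (unknown.length > 0) problems[testCase.name] = unknown;
  }
  return problems;
}

// Example: "child.schoolType" is a near miss for "child.school_type".
const mismatches = findUnknownFields(
  [{ name: "Age 10, state school", field_values: { "child.age": 10, "child.schoolType": "state_funded" } }],
  ["child.age", "child.school_type"], // assumed list of discovered field names
);
```

The comparison is deliberately exact (case and punctuation included), since that is the matching behaviour the warning above describes.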
Takes 60–120 seconds for most sections. Returns (first attempt — partial failure):
{
  "ruleset_id": "aethis/uk-fsm/child-eligibility",
  "tests_passing": 4,
  "tests_total": 6,
  "failures": [
    {
      "name": "Age 4 — minimum age, eligible",
      "expected": "eligible",
      "got": "not_eligible",
      "hint": "The lower age bound may be using strict > rather than ≥ 4"
    },
    {
      "name": "Age 10, independent school — not eligible",
      "expected": "not_eligible",
      "got": "eligible",
      "hint": "The school type restriction may not be captured. Check Regulation 3(2)(b)."
    }
  ]
}
Two failures: the boundary condition at age 4 is wrong (strict > instead of ≥), and the school type restriction isn’t compiled. Each failure includes a hint pointing to the likely cause.
Add guidance that references the specific source clause:
aethis_refine({
  project_id: "proj_8CzLVwyx53rTGEJv",
  feedback: "FSM Regulations 2014 Regulation 3(1): eligibility applies to children aged 4 to 15 inclusive. The lower bound is ≥ 4, not > 4. Regulation 3(2)(b): eligibility is restricted to children attending state-funded schools. The child.school_type field must be checked — only 'state_funded' qualifies. Independent and home-educated do not."
})
aethis_refine adds the guidance, then makes the minimal edit to fix the failing tests — seeded from the section’s active ruleset and keeping the passing tests green, rather than re-authoring the whole section. Returns (second attempt — all passing):
Useful when you want to accumulate several pieces of guidance before triggering a generation run:
aethis_add_guidance({
  project_id: "proj_8CzLVwyx53rTGEJv",
  guidance_text: "Regulation 3(2)(b): school_type must be state_funded. Independent and home_educated do not qualify.",
  process_type: "rule_generation"
})
{
  "guidance": [
    {
      "id": "hint_001",
      "process_type": "rule_generation",
      "guidance_text": "Regulation 3(2)(b): school_type must be state_funded. Independent and home_educated do not qualify.",
      "created_at": "2026-04-16T10:22:00Z"
    }
  ]
}
Add once, applies to all projects in the domain — no need to repeat cross-section principles on each ruleset:
aethis_add_domain_guidance({
  domain: "uk_fsm",
  guidance_text: "Discretionary clauses (where the authority 'may' act) must produce 'undetermined', not 'not_eligible'. The system flags for human review — it never exercises discretion on behalf of the decision-maker.",
  process_type: "rule_generation"
})
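To make the intent of that domain guidance concrete, here is an illustrative sketch of the three-valued outcome it asks for. The function and parameter names are hypothetical and the two-input model is a simplification; compiled rules are constraint logic, not TypeScript.

```typescript
type Outcome = "eligible" | "not_eligible" | "undetermined";

// Illustrative only: how a section with one mandatory route and one
// discretionary clause should resolve under the domain guidance above.
function resolveWithDiscretion(
  mandatoryRouteMet: boolean,
  discretionaryClauseApplies: boolean,
): Outcome {
  if (mandatoryRouteMet) return "eligible";
  // The authority "may" act: flag for human review instead of deciding.
  if (discretionaryClauseApplies) return "undetermined";
  return "not_eligible";
}
```

A test case for such a section would set expected_outcome to "undetermined" for inputs that reach only the discretionary clause.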
Real guidance from the UK Free School Meals household criteria section (11 tests, 8 qualifying routes). Each hint addresses a specific compilation gap.
OR logic across routes
Problem: The engine treats all qualifying criteria as AND conditions, so a household must meet all routes to qualify, when it should be any one.
Guidance:
This section uses OR logic across multiple qualifying routes. A household qualifies if it meets ANY ONE of the following: Universal Credit with net earnings at or below £7,400/year; Income Support; income-based JSA; income-related ESA; Child Tax Credit only (no Working Tax Credit) with income at or below £16,190/year; NASS support; the child is looked-after; or the child is a care leaver.
Compound AND within an OR route
Problem: The Universal Credit route passes when receives_universal_credit is true, ignoring the income cap.
Guidance:
The Universal Credit route requires TWO conditions simultaneously: household.receives_universal_credit must be true AND household.annual_net_earnings must be less than or equal to 7400 (pounds sterling, annual, after tax and NI). These are AND conditions within the UC route, which is then OR'd with other routes.
Unconditional boolean overrides
Problem: Looked-after children are being checked against benefit criteria when they should bypass them entirely.
Guidance:
Looked-after children (child.is_looked_after) and care leavers (child.is_care_leaver) qualify automatically with no income or benefit requirement. These are unconditional boolean fields: if either is true, Section B passes regardless of all other fields.
Negative conditions (absence of a benefit)
Problem: The Child Tax Credit route qualifies anyone receiving CTC, missing the requirement that they must not also receive Working Tax Credit.
Guidance:
The Child Tax Credit route (household.receives_child_tax_credit_only) requires that the household receives CTC but does NOT receive Working Tax Credit. Use a single boolean field that encodes "CTC without WTC".
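Taken together, the four pieces of guidance above describe logic that can be written out as ordinary code for review purposes. This is a reading aid under stated assumptions, not the compiled output: field names marked (assumed) do not appear in the guidance, and `householdSectionPasses` is a hypothetical name.

```typescript
// Field names marked (assumed) do not appear in the guidance above.
interface HouseholdFacts {
  receives_universal_credit: boolean;
  annual_net_earnings: number;
  receives_income_support: boolean;        // (assumed)
  receives_income_based_jsa: boolean;      // (assumed)
  receives_income_related_esa: boolean;    // (assumed)
  receives_child_tax_credit_only: boolean; // encodes "CTC without WTC"
  annual_income: number;                   // (assumed)
  receives_nass_support: boolean;          // (assumed)
}

interface ChildFacts {
  is_looked_after: boolean;
  is_care_leaver: boolean;
}

function householdSectionPasses(h: HouseholdFacts, c: ChildFacts): boolean {
  // Unconditional overrides: no income or benefit test applies.
  if (c.is_looked_after || c.is_care_leaver) return true;

  const routes: boolean[] = [
    h.receives_universal_credit && h.annual_net_earnings <= 7400, // AND inside the UC route
    h.receives_income_support,
    h.receives_income_based_jsa,
    h.receives_income_related_esa,
    h.receives_child_tax_credit_only && h.annual_income <= 16190, // CTC-only plus income cap
    h.receives_nass_support,
  ];
  return routes.some((met) => met); // OR across routes
}
```

Writing the expected logic out this way is also a quick source of test cases: each route and each cap suggests one passing and one failing input.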
Pattern: Every effective hint references a specific regulatory clause, names the fields involved, and states the logical relationship (AND, OR, NOT, unconditional). Vague feedback (“fix the UC check”) doesn’t converge.
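The three properties in that pattern can be checked mechanically before guidance is submitted. The sketch below is a rough heuristic: `checkGuidance` is a hypothetical helper, and its regular expressions are assumptions tuned to the examples in this guide, not rules the engine enforces.

```typescript
interface GuidanceCheck {
  citesClause: boolean;
  namesField: boolean;
  statesLogic: boolean;
}

// Heuristic lint for a guidance string.
// - citesClause: mentions something like "Regulation 3" or "Section 2"
// - namesField: contains at least one known field name verbatim
// - statesLogic: contains an upper-case AND / OR / NOT, "ANY ONE", or "unconditional(ly)"
function checkGuidance(text: string, knownFields: string[]): GuidanceCheck {
  return {
    citesClause: /\b(Regulation|Section|Article|Paragraph)\s+\d/i.test(text),
    namesField: knownFields.some((field) => text.includes(field)),
    statesLogic: /\b(AND|OR|NOT|ANY ONE|unconditional(ly)?)\b/.test(text),
  };
}
```

A result with any `false` entry is a prompt to revise the wording, not proof that the guidance will fail.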
If a test is failing but the hint isn’t clear enough, use aethis_explain_failure to get a deeper diagnosis:
aethis_explain_failure({
  ruleset_id: "aethis/uk-fsm/child-eligibility",
  field_values: { "child.age": 10, "child.school_type": "independent" },
  expected_outcome: "not_eligible",
  test_name: "Age 10, independent school — not eligible"
})
Returns:
{
  "test_name": "Age 10, independent school — not eligible",
  "expected": "not_eligible",
  "got": "eligible",
  "failing_criterion": "school_type_check",
  "compiled_form": "child.age IN [4..15]",
  "diagnosis": "The compiled rule checks age only. The school_type constraint is missing — Regulation 3(2)(b) restricts eligibility to state-funded schools but is not yet reflected in the compiled rules.",
  "suggested_guidance": "Add guidance referencing Regulation 3(2)(b): eligibility requires child.school_type = state_funded."
}
The ruleset_id is now ready for aethis_decide. Published rulesets are locked and versioned.
To update rules (e.g. for a legislative change), generate a new ruleset in the same project and publish again. The previous ruleset remains available by its specific version ID.
Generation builds its context from every active source on the project, so source hygiene directly affects rule quality.
Token budget. Every upload response includes estimated_tokens per file, a project_estimated_tokens running total, and the generation_token_budget it must fit (the model’s context window minus headroom reserved for the generation loop). Before any generation starts, the engine counts the exact prompt. If it exceeds the budget, the request is rejected with 422 token_budget_exceeded (including the count, the budget, and the largest sources) before any model cost is incurred.
Duplicates. Re-uploading identical content (or a same-named file) is flagged in the response under possible_duplicates; it is surfaced, never silently merged. Conflicting versions of the same guidance are a correctness hazard: resolve them by superseding the stale copy.
Lifecycle. GET /projects/{project_id}/sources lists every source with its status; PATCH /projects/{project_id}/sources/{source_id} sets it:
active — included in generation (the default)
superseded — replaced by a newer upload (optionally point at it with superseded_by); kept for provenance, excluded from generation
reference_only — kept on the project, excluded from generation
DELETE removes a source outright; prefer superseded when generated rulesets already cite it.
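A client-side pre-check against the token budget can be sketched as follows. It assumes each listed source carries an `id`, a `status`, and its `estimated_tokens`; the `id` field and both function names are hypothetical. The estimate is only indicative, since the engine counts the exact prompt before generation.

```typescript
type SourceStatus = "active" | "superseded" | "reference_only";

interface ProjectSource {
  id: string; // assumed identifier field
  status: SourceStatus;
  estimated_tokens: number;
}

// Sums estimated tokens over active sources only, since superseded and
// reference_only sources are excluded from generation.
function activeTokenEstimate(sources: ProjectSource[]): number {
  return sources
    .filter((s) => s.status === "active")
    .reduce((total, s) => total + s.estimated_tokens, 0);
}

// Returns the active sources largest-first when the estimate exceeds the budget,
// or an empty list when generation should fit.
function sourcesToReview(sources: ProjectSource[], generationTokenBudget: number): ProjectSource[] {
  if (activeTokenEstimate(sources) <= generationTokenBudget) return [];
  return sources
    .filter((s) => s.status === "active")
    .sort((a, b) => b.estimated_tokens - a.estimated_tokens);
}
```

The largest-first ordering mirrors the rejection response, which reports the largest sources; those are the first candidates for superseded or reference_only.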
Every generated rule cites the source passages it is grounded in. After generation, the job (and the generate-and-test response) carries a provenance_report:
totals — how many citations resolved against the uploaded sources (verified), cited passages that don’t exist (flagged), and rules or fields with no citation (uncited)
coverage — per source: how many passages are cited by at least one rule, plus a sample of passages nothing cites
The coverage list is the review signal golden tests can’t give you: a statutory exception that no rule cites — and no test exercises — stays green in testing but shows up here. The report never blocks generation; treat it as the reviewer’s punch list.
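One way to turn the report into a reviewer's list is sketched below. The guide names `totals` (verified, flagged, uncited) and per-source `coverage`; the exact field names inside `coverage` here (`source_id`, `passages_total`, `passages_cited`, `uncited_sample`) are assumptions for illustration, as is the `reviewPunchList` helper.

```typescript
// Assumed shape; only `totals` and `coverage` are named in the text above.
interface ProvenanceReport {
  totals: { verified: number; flagged: number; uncited: number };
  coverage: Array<{
    source_id: string;
    passages_total: number;
    passages_cited: number;
    uncited_sample: string[];
  }>;
}

// Flattens the report into human-readable review items.
function reviewPunchList(report: ProvenanceReport): string[] {
  const items: string[] = [];
  if (report.totals.flagged > 0) {
    items.push(`${report.totals.flagged} citation(s) point at passages that do not exist`);
  }
  if (report.totals.uncited > 0) {
    items.push(`${report.totals.uncited} rule(s) or field(s) have no citation`);
  }
  for (const source of report.coverage) {
    for (const passage of source.uncited_sample) {
      items.push(`${source.source_id}: no rule cites "${passage}"`);
    }
  }
  return items;
}
```

An empty list does not prove full coverage, because the report only samples uncited passages.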
Rule generation takes 5–15 minutes for complex sections. If the client times out, the server continues generating. Do not re-trigger generation — it creates a duplicate run.
Wait 10–15 minutes
Call aethis_list_rulesets({ project_id: "proj_8CzLVwyx53rTGEJv" }) to check if a new ruleset appeared
If a ruleset is present, run your test suite and publish
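The recovery steps above can also be automated as a periodic check, a variation on waiting once. In this sketch `ListRulesets` is a hypothetical wrapper around aethis_list_rulesets that is assumed to return ruleset IDs, and the interval and attempt defaults are illustrative. The function only reads; it never triggers generation.

```typescript
// Hypothetical wrapper around aethis_list_rulesets; assumed to return ruleset IDs.
type ListRulesets = (projectId: string) => Promise<string[]>;

const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls for a ruleset that was not present before generation started.
// Returns the new ID, or undefined if nothing appears within maxAttempts checks.
async function waitForNewRuleset(
  listRulesets: ListRulesets,
  projectId: string,
  knownIds: string[],
  { intervalMs = 60000, maxAttempts = 20, sleep = realSleep } = {},
): Promise<string | undefined> {
  const known = new Set(knownIds);
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const ids = await listRulesets(projectId);
    const fresh = ids.find((id) => !known.has(id));
    if (fresh !== undefined) return fresh;
    await sleep(intervalMs);
  }
  return undefined;
}
```

Recording the known ruleset IDs before starting generation is what lets the check tell a new ruleset apart from an older one on the same project.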
Defines how section outcomes compose. Each section is assigned a letter (A, B, C…) in the order listed in ruleset_refs.
Supported operators: AND, OR, NOT, parentheses for grouping.
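To show how such an expression combines section results, here is a small evaluator sketch. It assumes the expression is a string such as "A AND (B OR NOT C)", that each section reduces to pass or fail, and that precedence is NOT, then AND, then OR; none of these details are confirmed by this guide, and `evaluateComposition` is a hypothetical name.

```typescript
// Evaluates a composition expression against per-section pass/fail results.
// Characters other than letters and parentheses are ignored by the tokenizer.
function evaluateComposition(expression: string, sections: Record<string, boolean>): boolean {
  const tokens: string[] = expression.match(/\(|\)|[A-Za-z]+/g) ?? [];
  let pos = 0;
  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  function parseOr(): boolean {
    let value = parseAnd();
    while (peek() === "OR") {
      next();
      const right = parseAnd(); // always parse, so short-circuiting cannot skip tokens
      value = value || right;
    }
    return value;
  }

  function parseAnd(): boolean {
    let value = parseNot();
    while (peek() === "AND") {
      next();
      const right = parseNot();
      value = value && right;
    }
    return value;
  }

  function parseNot(): boolean {
    if (peek() === "NOT") {
      next();
      return !parseNot();
    }
    return parseAtom();
  }

  function parseAtom(): boolean {
    const token = next();
    if (token === "(") {
      const value = parseOr();
      if (next() !== ")") throw new Error("Expected ')'");
      return value;
    }
    if (token === undefined || !Object.prototype.hasOwnProperty.call(sections, token)) {
      throw new Error(`Unknown section: ${token ?? "end of input"}`);
    }
    return sections[token] === true;
  }

  const result = parseOr();
  if (pos < tokens.length) throw new Error(`Unexpected token: ${tokens[pos]}`);
  return result;
}
```

For example, with A passing and B and C failing, "A AND (B OR NOT C)" evaluates to true because NOT C is true.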
# Activate (validates all ruleset references exist)
curl -X POST .../rulebooks/{rulebook_id}/activate -H "x-api-key: ak_live_..."

# Evaluate against the composed rulebook (requires an API key with `decide` scope;
# anonymous /decide only accepts public `ruleset_id`/slug, not `rulebook_id`)
curl -X POST .../decide \
  -H "x-api-key: $AETHIS_API_KEY" \
  -d '{ "rulebook_id": "{rulebook_id}", "field_values": { ... } }'

# Get combined schema from all sections
curl .../rulebooks/{rulebook_id}/schema

# Get combined explanations
curl .../rulebooks/{rulebook_id}/explain
The /decide endpoint accepts either a ruleset_id (single section) or a rulebook_id (composed multi-section). When evaluating against a rulebook, the response includes section_results showing each section’s individual decision.
See the UK Free School Meals worked example for a complete multi-section rulebook with live rulesets.