Phase 3 — Rule generation

Private beta. Rule authoring is invite-only private beta — approval required. Decision tools (decide, explain, schema, next question) are available now with no sign-up. Join the authoring waitlist →

Bring your own key. Rule generation uses an LLM to compile your source documents into constraint logic. You provide your own Anthropic API key — it is used for that request only and is never stored.

The TDD loop

Rule generation is test-driven. The engine generates rules from your source text. Your test suite validates the output. Failing tests become the basis for targeted guidance — feedback that points to the specific source clause the engine missed. The loop repeats until all tests pass.

aethis_generate_and_test → tests failing → aethis_refine (with guidance) → repeat → aethis_publish

Rules are compiled from your source text and guidance — not reverse-engineered from your tests. Tests validate the output. Better tests catch more edge cases; better guidance converges faster.

Step 1 — Create a ruleset

If you completed Phase 2 (field vocabulary), you already have confirmed field names and a project_id — write your test cases using those names and skip to Step 2. If starting fresh on a single-section domain, you need test cases before creating the ruleset. Two approaches:

Inspect an existing ruleset in the same domain with aethis_schema to learn the field naming convention, then write tests using those names.
Make your best guess at field names, create the ruleset with preliminary tests, then run aethis_discover_fields to confirm. If names don’t match, add corrected tests via aethis_create_ruleset with a new project — the cost is low.

Either way, the field names in your test cases must exactly match what the engine discovers. Phase 2 exists to prevent this mismatch.

aethis_create_ruleset({
  name: "Child eligibility gate",
  section_id: "child_eligibility",
  domain: "uk_fsm",
  source_text: "...",  // text of the source documents for this section
  test_cases: [
    {
      name: "Age 3 — too young, not eligible",
      field_values: { "child.age": 3, "child.school_type": "state_funded" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 4 — minimum age, eligible",
      field_values: { "child.age": 4, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    },
    {
      name: "Age 15 — maximum age, state school, eligible",
      field_values: { "child.age": 15, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    },
    {
      name: "Age 16 — over maximum age, not eligible",
      field_values: { "child.age": 16, "child.school_type": "state_funded" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 10, independent school — not eligible",
      field_values: { "child.age": 10, "child.school_type": "independent" },
      expected_outcome: "not_eligible"
    },
    {
      name: "Age 10, state school — eligible",
      field_values: { "child.age": 10, "child.school_type": "state_funded" },
      expected_outcome: "eligible"
    }
  ]
})

Returns:

{ "project_id": "proj_8CzLVwyx53rTGEJv" }

Write test cases after running aethis_discover_fields. Test field names must exactly match the engine’s discovered field names. A mismatched name causes the engine to treat that field as absent — the test may silently pass or fail for the wrong reason.

Step 2 — Generate and test

aethis_generate_and_test({ project_id: "proj_8CzLVwyx53rTGEJv" })

Takes 60–120 seconds for most sections. Returns (first attempt — partial failure):

{
  "ruleset_id": "aethis/uk-fsm/child-eligibility",
  "tests_passing": 4,
  "tests_total": 6,
  "failures": [
    {
      "name": "Age 4 — minimum age, eligible",
      "expected": "eligible",
      "got": "not_eligible",
      "hint": "The lower age bound may be using strict > rather than ≥ 4"
    },
    {
      "name": "Age 10, independent school — not eligible",
      "expected": "not_eligible",
      "got": "eligible",
      "hint": "The school type restriction may not be captured. Check Regulation 3(2)(b)."
    }
  ]
}

Two failures: the boundary condition at age 4 is wrong (strict > instead of ≥), and the school type restriction isn’t compiled. Each failure includes a hint pointing to the likely cause.

Step 3 — Refine with guidance

Add guidance that references the specific source clause:

aethis_refine({
  project_id: "proj_8CzLVwyx53rTGEJv",
  feedback: "FSM Regulations 2014 Regulation 3(1): eligibility applies to children aged 4 to 15 inclusive. The lower bound is ≥ 4, not > 4. Regulation 3(2)(b): eligibility is restricted to children attending state-funded schools. The child.school_type field must be checked — only 'state_funded' qualifies. Independent and home-educated do not."
})

aethis_refine adds the guidance and immediately re-generates. Returns (second attempt — all passing):

{
  "ruleset_id": "aethis/uk-fsm/child-eligibility",
  "tests_passing": 6,
  "tests_total": 6,
  "failures": []
}

All tests pass after one refinement. Move to publish.

Guidance variants

Add guidance without regenerating

Useful when you want to accumulate several pieces of guidance before triggering a generation run:

aethis_add_guidance({
  project_id: "proj_8CzLVwyx53rTGEJv",
  guidance_text: "Regulation 3(2)(b): school_type must be state_funded. Independent and home_educated do not qualify.",
  process_type: "rule_generation"
})

Then trigger generation separately:

aethis_generate_and_test({ project_id: "proj_8CzLVwyx53rTGEJv" })

Check accumulated guidance

aethis_list_guidance({ project_id: "proj_8CzLVwyx53rTGEJv" })

Returns:

{
  "guidance": [
    {
      "id": "hint_001",
      "process_type": "rule_generation",
      "guidance_text": "Regulation 3(2)(b): school_type must be state_funded. Independent and home_educated do not qualify.",
      "created_at": "2026-04-16T10:22:00Z"
    }
  ]
}

Domain-level guidance

Add once, applies to all projects in the domain — no need to repeat cross-section principles on each ruleset:

aethis_add_domain_guidance({
  domain: "uk_fsm",
  guidance_text: "Discretionary clauses (where the authority 'may' act) must produce 'undetermined', not 'not_eligible'. The system flags for human review — it never exercises discretion on behalf of the decision-maker.",
  process_type: "rule_generation"
})

Guidance examples

Real guidance from the UK Free School Meals household criteria section (11 tests, 8 qualifying routes). Each hint addresses a specific compilation gap.

OR logic across routes

Problem: The engine treats all qualifying criteria as AND conditions — a household must meet all routes to qualify, when it should be any one.Guidance:

This section uses OR logic across multiple qualifying routes. A household
qualifies if it meets ANY ONE of the following: Universal Credit with net
earnings at or below £7,400/year; Income Support; income-based JSA;
income-related ESA; Child Tax Credit only (no Working Tax Credit) with
income at or below £16,190/year; NASS support; the child is looked-after;
or the child is a care leaver.

Compound AND within an OR route

Problem: The Universal Credit route passes when receives_universal_credit is true, ignoring the income cap.Guidance:

The Universal Credit route requires TWO conditions simultaneously:
household.receives_universal_credit must be true AND
household.annual_net_earnings must be less than or equal to 7400
(pounds sterling, annual, after tax and NI). These are AND conditions
within the UC route, which is then OR'd with other routes.

Unconditional boolean overrides

Problem: Looked-after children are being checked against benefit criteria when they should bypass them entirely.Guidance:

Looked-after children (child.is_looked_after) and care leavers
(child.is_care_leaver) qualify automatically with no income or benefit
requirement. These are unconditional boolean fields — if either is true,
Section B passes regardless of all other fields.

Negative conditions (absence of a benefit)

Problem: The Child Tax Credit route qualifies anyone receiving CTC, missing the requirement that they must not also receive Working Tax Credit.Guidance:

The Child Tax Credit route (household.receives_child_tax_credit_only)
requires that the household receives CTC but does NOT receive Working
Tax Credit. Use a single boolean field that encodes "CTC without WTC".

Pattern: Every effective hint references a specific regulatory clause, names the fields involved, and states the logical relationship (AND, OR, NOT, unconditional). Vague feedback (“fix the UC check”) doesn’t converge.

Diagnosing a specific failure

If a test is failing but the hint isn’t clear enough, use aethis_explain_failure to get a deeper diagnosis:

aethis_explain_failure({
  ruleset_id: "aethis/uk-fsm/child-eligibility",
  field_values: { "child.age": 10, "child.school_type": "independent" },
  expected_outcome: "not_eligible",
  test_name: "Age 10, independent school — not eligible"
})

Returns:

{
  "test_name": "Age 10, independent school — not eligible",
  "expected": "not_eligible",
  "got": "eligible",
  "failing_criterion": "school_type_check",
  "compiled_form": "child.age IN [4..15]",
  "diagnosis": "The compiled rule checks age only. The school_type constraint is missing — Regulation 3(2)(b) restricts eligibility to state-funded schools but is not yet reflected in the compiled rules.",
  "suggested_guidance": "Add guidance referencing Regulation 3(2)(b): eligibility requires child.school_type = state_funded."
}

Test coverage strategy

Good coverage catches failures before production:

Boundary values — test just below, at, and above every threshold (age 3, 4, 15, 16)
Every Enum value — one test per enum value for each Enum field
Every code path — combinations of fields that exercise a distinct branch
Five tests minimum per section — more is better; sparse test suites let edge-case bugs through

Step 4 — Publish

Once all tests pass:

aethis_publish({ project_id: "proj_8CzLVwyx53rTGEJv" })

Returns:

{
  "ruleset_id": "aethis/uk-fsm/child-eligibility",
  "version": "v1",
  "tests_passing": 6,
  "tests_total": 6
}

The ruleset_id is now ready for aethis_decide. Published rulesets are locked and versioned.

To update rules (e.g. for a legislative change), generate a new ruleset in the same project and publish again. The previous ruleset remains available by its specific version ID.

Handling generation timeouts

Rule generation takes 5–15 minutes for complex sections. If the client times out, the server continues generating. Do not re-trigger generation — it creates a duplicate run.

Wait 10–15 minutes
Call aethis_list_rulesets({ project_id: "proj_8CzLVwyx53rTGEJv" }) to check if a new ruleset appeared
If a ruleset is present, run your test suite and publish
If not, wait and check again

DATE fields

DATE fields use integer ordinals (days since year 1), not ISO strings:

# Python
python3 -c "from datetime import date; print(date(2025, 4, 13).toordinal())"
# → 739354

// JavaScript
Math.floor(new Date('2025-04-13').getTime() / 86400000) + 719163
// → 739354

Passing "2025-04-13" as a string fails with a 422 type error.

After all sections are published

If your domain has multiple sections, compose them into a rulebook — a container that references published rulesets and defines outcome logic.

Create a rulebook

Via the REST API:

curl -X POST https://api.aethis.ai/api/v1/public/rulebooks/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: ak_live_..." \
  -d '{
    "name": "UK Free School Meals Eligibility",
    "domain": "uk_fsm",
    "ruleset_refs": [
      { "section_id": "child_eligibility", "pin_mode": "latest_active" },
      { "section_id": "household_qualifying_criteria", "pin_mode": "latest_active" },
      { "section_id": "universal_infant_fsm", "pin_mode": "latest_active" }
    ],
    "outcome_logic": { "expr": "A AND (B OR C)" }
  }'

Ruleset references

Each ruleset_ref has:

Field	Type	Required	Description
`section_id`	string	yes	Section identifier matching the published ruleset
`ruleset_id`	string	no	Pin to a specific ruleset version. Omit to use `pin_mode`.
`pin_mode`	string	no	`"pinned"` (specific version, default) or `"latest_active"` (auto-updates when new version published)

Outcome logic

Defines how section outcomes compose. Each section is assigned a letter (A, B, C…) in the order listed in ruleset_refs. Supported operators: AND, OR, NOT, parentheses for grouping.

Example	Meaning
`A AND B`	Both sections must be eligible
`A AND (B OR C)`	A required; either B or C sufficient
`A AND B AND NOT C`	A and B required; C must not be eligible

Activate and evaluate

# Activate (validates all ruleset references exist)
curl -X POST .../rulebooks/{rulebook_id}/activate -H "x-api-key: ak_live_..."

# Evaluate against the composed rulebook (requires an API key with `decide` scope —
# anonymous /decide only accepts public `ruleset_id`/slug, not `rulebook_id`)
curl -X POST .../decide \
  -H "x-api-key: $AETHIS_API_KEY" \
  -d '{ "rulebook_id": "{rulebook_id}", "field_values": { ... } }'

# Get combined schema from all sections
curl .../rulebooks/{rulebook_id}/schema

# Get combined explanations
curl .../rulebooks/{rulebook_id}/explain

The /decide endpoint accepts either a ruleset_id (single section) or a rulebook_id (composed multi-section). When evaluating against a rulebook, the response includes section_results showing each section’s individual decision. See the UK Free School Meals worked example for a complete multi-section rulebook with live rulesets.

Getting started

Interfaces

Recipes

Authoring (Beta)

Reference

Phase 3 — Rule generation

The TDD loop

Step 1 — Create a ruleset

Step 2 — Generate and test

Step 3 — Refine with guidance

Guidance variants

Add guidance without regenerating

Check accumulated guidance

Domain-level guidance

Guidance examples

Diagnosing a specific failure

Test coverage strategy

Step 4 — Publish

Handling generation timeouts

DATE fields

After all sections are published

Create a rulebook

Ruleset references

Outcome logic

Activate and evaluate

Getting started

Interfaces

Recipes

Authoring (Beta)

Reference

Documentation Index

​The TDD loop

​Step 1 — Create a ruleset

​Step 2 — Generate and test

​Step 3 — Refine with guidance

​Guidance variants

​Add guidance without regenerating

​Check accumulated guidance

​Domain-level guidance

​Guidance examples

​Diagnosing a specific failure

​Test coverage strategy

​Step 4 — Publish

​Handling generation timeouts

​DATE fields

​After all sections are published

​Create a rulebook

​Ruleset references

​Outcome logic

​Activate and evaluate

The TDD loop

Step 1 — Create a ruleset

Step 2 — Generate and test

Step 3 — Refine with guidance

Guidance variants

Add guidance without regenerating

Check accumulated guidance

Domain-level guidance

Guidance examples

Diagnosing a specific failure

Test coverage strategy

Step 4 — Publish

Handling generation timeouts

DATE fields

After all sections are published

Create a rulebook

Ruleset references

Outcome logic

Activate and evaluate