The Core Idea

CodAs starts from a single, arguably simple observation: the WHO's International Classification of Health Interventions (ICHI) is not merely a lookup table. It is, in a meaningful sense, a structured language.

Like a natural language, ICHI has a vocabulary – the axis elements of Action, Target and Means. It has a syntax – the combinatorial coding rules defined in the ICHI Reference Guide. And it has an extensive body of worked examples – thousands of pre-coordinated intervention codes that illustrate how its grammar is applied in practice.

Large Language Models (LLMs) are, at their core, systems trained to recognise and apply patterns across language-like structures. They operate across natural languages, formal languages, programming languages, and code systems alike. The central hypothesis behind CodAs is therefore:

If ICHI shares fundamental structural properties with a language – vocabulary, syntax, examples – then LLMs may be particularly well suited to work with it.

CodAs is the practical exploration of that hypothesis.

The Long-Context Approach

CodAs operationalises this insight through a long-context approach: the complete ICHI vocabulary and the Reference Guide are transmitted to the model with every request, establishing a closed, authoritative knowledge environment for each coding session.

This means the model does not rely on memorised or approximated knowledge of ICHI. Instead, for every single request, it works directly from the source material – the same documents a human coder would consult. The three axis vocabularies (Action, Target, Means), the extension codes, and the Reference Guide rules are all present in full:

  • Vocabulary: The complete set of Action, Target and Means axis elements and the extension codes, with their definitions, inclusions and exclusions
  • Syntax: The combinatorial rules from the ICHI Reference Guide governing how elements may and may not be combined
  • Examples: Thousands of pre-coordinated stem codes that serve as worked instances of correct application, analogous to a language-learning corpus
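
The three components listed above can be pictured as sections of one prompt that is rebuilt for every request. The following is a minimal sketch of that idea, not CodAs's actual implementation: the names IchiSources and build_prompt, the section titles, and the prompt layout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IchiSources:
    """Hypothetical container for the source texts sent with every request."""
    vocabulary: str       # Action, Target, Means elements and extension codes
    reference_guide: str  # combinatorial rules from the ICHI Reference Guide
    examples: str         # pre-coordinated stem codes used as worked examples


def build_prompt(sources: IchiSources, description: str) -> str:
    """Assemble one self-contained prompt for a single coding request.

    The full source material is included each time, so the request does not
    depend on anything the model may have memorised about ICHI.
    """
    sections = [
        ("VOCABULARY", sources.vocabulary),
        ("SYNTAX", sources.reference_guide),
        ("EXAMPLES", sources.examples),
        ("INTERVENTION", description),
    ]
    # The "## TITLE" layout is an illustrative choice, not a CodAs convention.
    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)
```

Because the prompt is assembled from scratch on each call, two requests never share state; the trade-off of this design is that the full source material has to fit into the model's context window every time.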

What CodAs Does

CodAs accepts free-text descriptions of health interventions in any language and constructs WHO-compliant ICHI codes consisting of the stem code (Action · Target · Means) and applicable extension codes, following the combinatorial logic defined in the official ICHI Reference Guide.

For each coding request, CodAs generates:

  • The complete constructed code with a plain-language interpretation of its meaning
  • A transparent, component-by-component justification (Action, Target, Means, Extensions) with definitions, inclusions, exclusions, and direct references to the source data
  • A rule-based rationale citing the applicable sections of the ICHI Reference Guide
  • Direct links to the official ICHI verification tool for each generated stem code
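
One way to picture the output listed above is as a small record with one field per item. The sketch below is purely illustrative: the class names, field names and the missing_axes helper are hypothetical, and no real ICHI codes or URLs are shown or implied.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentJustification:
    """Justification for one component of the constructed code (hypothetical)."""
    axis: str                 # "Action", "Target", "Means" or "Extension"
    element: str              # placeholder for the chosen element
    definition: str
    inclusions: list[str] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)
    source_reference: str = ""  # pointer into the transmitted source data


@dataclass
class CodingResult:
    """Everything returned for one coding request (hypothetical layout)."""
    constructed_code: str
    interpretation: str                       # plain-language meaning
    components: list[ComponentJustification]  # component-by-component rationale
    guide_sections: list[str]                 # cited Reference Guide sections
    verification_links: list[str]             # links to the verification tool

    def missing_axes(self) -> list[str]:
        """Return the stem axes that have no justification attached."""
        present = {c.axis for c in self.components}
        return [a for a in ("Action", "Target", "Means") if a not in present]
```

A reviewer-facing check such as missing_axes reflects the intent that every part of a generated code stays traceable: a result lacking a justification for any stem axis would be flagged for the human coder rather than silently accepted.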

This transparency is deliberate. CodAs is not intended as a "plug-and-play" black box. It is designed as a discursive partner for the human coder – making its reasoning visible and verifiable at every step, consistent with the principle that AI-assisted coding requires, and actively supports, qualified human oversight.

⚠️ Important Notice

CodAs is a research and decision-support tool. All generated codes must be verified by qualified professionals. LLMs can produce plausible but incorrect outputs. CodAs supports the coding process; it does not replace professional expertise or institutional quality control. No liability is assumed for the accuracy of generated codes.

The ICHI verification tool allows any stem code to be checked against the official WHO dataset.