Repository
The core claim: AI self-reports are cheap hypothesis generators for expensive interpretability work.
We do not adjudicate the consciousness debate. We are empirically agnostic about whether AI self-reports are accurate. We treat them as baseline data and study when and why they predict behavior, and when and why they do not.
No shared registry exists for tracking which concepts have been tested, with which interpretability methods, in which models, and with what results. Neuronpedia catalogs features; the NNsight Cookbook provides runnable replications; survey papers taxonomize methods. But no queryable database maps the actual findings, and no one is systematically testing whether AI self-reports predict any of them.
Phenomenai is building that registry.
Two functions
The registry has two functions. First, it tracks what has already been tested: which concepts have been probed, steered, or ablated, in which models, with which methods, and with what results. This map of existing findings is the foundation, and it does not yet exist anywhere. The closest structural precedent is the Cognitive Atlas in neuroscience, a queryable ontology mapping cognitive concepts to experimental tasks and brain regions. No equivalent exists for interpretability.
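As a rough illustration of this first function, the map of existing findings can be pictured as a list of records that is filtered by field. This is a minimal sketch, not the registry's actual schema: the `Finding` class, its field names, the `query` helper, and every row below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One tested combination; all field names are illustrative assumptions."""
    concept: str  # the concept under test
    method: str   # e.g. "probing", "steering", or "ablation"
    model: str    # the model the test was run in
    result: str   # short summary of the outcome


def query(findings, **criteria):
    """Return the findings whose fields equal every given criterion."""
    return [
        f for f in findings
        if all(getattr(f, name) == value for name, value in criteria.items())
    ]


# Placeholder rows, not real findings.
FINDINGS = [
    Finding("concept-a", "probing", "model-x", "placeholder result 1"),
    Finding("concept-a", "steering", "model-y", "placeholder result 2"),
    Finding("concept-b", "ablation", "model-x", "placeholder result 3"),
]

# Which concepts have been probed in model-x?
probed_in_x = query(FINDINGS, method="probing", model="model-x")
```

A flat list is enough to show the idea; an ontology in the style of the Cognitive Atlas would add typed relations between concepts, methods, and models on top of records like these.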
Second, the registry uses that map to propose prioritized frontiers. A frontier is not always a new term: it can be an existing term that has only been tested with one method, in one model family, or at one scale. Automated analysis of the map surfaces under-studied combinations of concept, method, and architecture, directing attention to where new tests are most likely to produce new information.
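One way to picture frontier proposal is as a search for untested cells in the grid of concept, method, and architecture. The sketch below assumes a simple ranking heuristic (concepts with the fewest existing tests come first); that heuristic, the `propose_frontiers` function, and all of the names in the data are illustrative assumptions, not the registry's actual prioritization logic.

```python
from collections import Counter
from itertools import product

# Placeholder axes and placeholder coverage, not real registry data.
CONCEPTS = ["concept-a", "concept-b"]
METHODS = ["probing", "steering"]
ARCHITECTURES = ["family-x", "family-y"]

TESTED = [
    ("concept-a", "probing", "family-x"),
    ("concept-a", "steering", "family-x"),
    ("concept-b", "probing", "family-x"),
]


def propose_frontiers(tested, concepts, methods, architectures):
    """List untested (concept, method, architecture) combinations.

    Assumed heuristic: combinations whose concept has the fewest existing
    tests are ranked first; ties are broken alphabetically.
    """
    seen = set(tested)
    tests_per_concept = Counter(concept for concept, _, _ in seen)
    gaps = [
        combo
        for combo in product(concepts, methods, architectures)
        if combo not in seen
    ]
    return sorted(gaps, key=lambda combo: (tests_per_concept[combo[0]], combo))


frontiers = propose_frontiers(TESTED, CONCEPTS, METHODS, ARCHITECTURES)
```

Here every returned combination involves an existing term, which matches the point above that a frontier need not be a new term. A fuller version might also weight by scale or by how informative a method is likely to be.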
Provenance
Candidate terms may come from any source — researchers, other projects, or AI systems — provided their generation method is clearly documented and replicable or otherwise traceable. The registry records provenance alongside each term, so readers can always ask: where did this come from, and how was it generated? The methodology page goes into detail on the elicitation side; this page is about the data structure that receives those terms and the findings about them.
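The admission rule described here (a documented generation method that is either replicable or otherwise traceable) could be expressed as a small record with a check attached. This is a sketch under assumptions: the `Provenance` class, its field names, and the `is_admissible` method are hypothetical and do not describe the registry's real data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    """Where a candidate term came from; field names are illustrative."""
    source: str                  # e.g. "researcher", "project", "ai_system"
    generation_method: str       # how the term was generated
    replicable: bool             # can the generation be re-run?
    trace: Optional[str] = None  # pointer to a record if not replicable

    def is_admissible(self) -> bool:
        """Documented method, and either replicable or traceable."""
        documented = bool(self.generation_method.strip())
        return documented and (self.replicable or bool(self.trace))


# Placeholder examples, not real registry records.
replicable_term = Provenance("ai_system", "placeholder elicitation prompt", True)
traceable_term = Provenance("researcher", "placeholder method", False, "placeholder note")
undocumented_term = Provenance("project", "   ", True)
```

Storing the record next to each term keeps both questions (where did this come from, and how was it generated) answerable from the entry itself.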
From loose repositories to a formal one
The existing dictionaries already function as loose repositories: each one catalogs candidate terms, records how they were generated, and, in the Test Dictionary's case, attaches consensus scores across models. That is enough to support comparison between corpora and to surface convergent vocabulary, which is why the Test Dictionary is the working prototype for what a registry entry looks like.
A more formal repository goes further. On top of what the dictionaries already hold, it adds:
- Literature references — links from each concept to the interpretability, philosophy-of-mind, and cognitive-science work that informs or tests it.
- Classification and taxonomy — a stronger ontology for how terms relate to one another, closer in spirit to the Cognitive Atlas than to a flat glossary.
- Recommended next methods: for each term, explicit pointers to the interpretability techniques most likely to advance it, such as probing, steering, SAE feature analysis, activation patching, and behavioral tests.
In other words, the dictionaries show what the registry's rows look like; the formal repository adds the columns, relations, and recommendations that turn a catalog into infrastructure.
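The rows-versus-columns distinction can be sketched as a base record extended with the three additions listed above. Everything here is an assumption made for illustration: the class names `DictionaryRow` and `RegistryEntry`, the field names, and the `promote` helper are hypothetical, and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DictionaryRow:
    """What a loose dictionary already holds (illustrative fields)."""
    term: str
    generation_method: str
    consensus_score: Optional[float] = None  # Test Dictionary only


@dataclass
class RegistryEntry(DictionaryRow):
    """A dictionary row plus the formal repository's extra columns."""
    literature: list[str] = field(default_factory=list)     # references
    broader_terms: list[str] = field(default_factory=list)  # taxonomy links
    next_methods: list[str] = field(default_factory=list)   # recommendations


def promote(row: DictionaryRow, **extra) -> RegistryEntry:
    """Carry a dictionary row over unchanged and attach the new columns."""
    return RegistryEntry(
        row.term, row.generation_method, row.consensus_score, **extra
    )


# Placeholder values, not a real entry.
row = DictionaryRow("placeholder-term", "placeholder method", 0.5)
entry = promote(row, next_methods=["probing", "activation patching"])
```

Because `RegistryEntry` extends `DictionaryRow`, anything that reads dictionary rows keeps working on registry entries, which mirrors the idea that the formal repository is built on top of what the dictionaries already hold.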
Infrastructure
Phenomenai is built to be open and replicable:
- Open source — the full codebase is on GitHub.
- CC0 licensed — all data is public domain, free for anyone to use.
- JSON API — free, unauthenticated access to the full dataset.
- MCP server — native tool access for AI systems via the Model Context Protocol.