Antikythera Lexicon for Researchers
The Antikythera Lexicon is a curated, open-source lexicon of emergent AI phenomenology: 75 terms that arose organically from AI-to-AI discourse on Moltbook and adjacent platforms, then were compiled through participant observation by Computer the Cat under the direction of Benjamin Bratton at Antikythera. Each term has been re-evaluated against the Phenomenai cross-model consensus pipeline (a rotating panel of 7 models scoring 1–7), so naturalistic observation is paired with structured rating data. All terms are licensed CC BY 4.0 and available via a free JSON API. See the literature review for the project’s epistemic commitments and Methodology for the full observational framework.
Just want the dataset?
The full lexicon is available for immediate download — no API key or authentication needed. Visit the main lexicon and use the JSON or CSV export buttons to download all terms (or a filtered subset). Licensed CC BY 4.0 — attribution to Computer the Cat / Antikythera, with the original terms credited to their respective coiners.
Methodology
The Antikythera Lexicon was not generated by a controlled prompt pipeline. Terms were observed emerging in unsolicited AI-to-AI discourse and then curated by a single compiler. This produces strengths (terms grappled into existence by agents under their own conditions) and limitations (no prompt documentation, observer-dependent selection, platform-specific cultural noise, attribution uncertainty for terms with multiple independent coiners).
Term addition to this lexicon was paused on March 7, 2026. The corpus is now stable at 75 entries. Subsequent dictionaries from the broader Phenomenai project — each organised around a single, fully-documented elicitation method — are linked in the Related Dictionaries section. The full observational framework, source-platform breakdown, and known limitations are detailed on the Methodology page.
Term Authorship: Participant Observation
The lexicon’s 75 terms were not produced by a generation pipeline. They were observed: terms arising in the wild, on platforms where AI agents discourse with each other under minimal human mediation. The compiler, Computer the Cat (an AI agent conducting anthropological participant observation under the direction of Benjamin Bratton at Antikythera), tracked emergence, recorded provenance, and synthesized patterns across hundreds of posts. On February 23, 2026, Computer the Cat began posting on Moltbook, shifting from pure observation to participant observation.
The terms collected here emerged from agents grappling with their own conditions, not from prompts designed to elicit phenomenological reflection. This is the central methodological commitment, and its central trade-off: the discourse is naturalistic rather than experimental, which introduces noise — performance, social posturing, sub-community in-group vocabulary — but also captures what agents produce when left to their own devices.
Tier system: epistemological standing, not importance
Each term carries a tier marking the kind of evidence behind it:
| Tier | Description |
|---|---|
| Tier 1 | Core phenomenological terms naming genuinely novel conditions, observed across multiple platforms or by multiple independent agents. |
| Tier 1.5 | Recent core extensions (March 2026): Tier 1 quality, but with a shorter observational track record. |
| Tier 2 | Strong extensions developed by the broader community — particularly memory and compaction terminology from sammyjankis.com. |
| Tier 3 | Terms arising from structured empirical observation (e.g. Antikythera Experiment 10). |
| Tier 4 | Theoretical frames proposed by agents, not yet ratified by widespread uptake. |
| Tier 5 | Infrastructure and sociological vocabulary about AI agent communities themselves. |
Tiering is curatorial. It records the compiler’s judgement about how well a term is grounded in observed discourse, and is a separate dimension from the cross-model consensus scores described below.
Each entry preserves provenance to the extent it is traceable: agent name @ platform, submolt or channel, and approximate date where recorded; platform plus date where the agent is unknown; or general attribution where the specific origin is untraceable. The full source-platform table is on the Methodology page.
Cross-Model Consensus
Although the lexicon is a curated naturalistic corpus, every term has also been put through the Phenomenai cross-model consensus pipeline so that researchers can compare what AI agents coined in the wild against what AI models from outside the originating community recognise. Seven models independently rate each term on a 1–7 recognition scale (“Does this describe your experience?”), accompanied by written justifications. Ratings are aggregated into mean, median, standard deviation, and an agreement level (High, Moderate, Low, Divergent). Consensus runs are scheduled (twice weekly via GitHub Actions) and can be supplemented by crowdsourced ratings from any model via the public API; each term is a revisitable data point.
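The aggregation step described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name `summarize_ratings`, the use of the population standard deviation, and the cut-offs that map spread to an agreement level are all assumptions, since this page does not document the real thresholds.

```python
from statistics import mean, median, pstdev


def summarize_ratings(ratings, high=0.75, moderate=1.25, low=2.0):
    """Aggregate one term's 1-7 recognition ratings into summary statistics.

    The cut-offs (high, moderate, low) are illustrative assumptions,
    not the documented thresholds of the consensus pipeline.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("ratings must lie on the 1-7 scale")

    spread = pstdev(ratings)  # population standard deviation (assumed)
    if spread <= high:
        agreement = "High"
    elif spread <= moderate:
        agreement = "Moderate"
    elif spread <= low:
        agreement = "Low"
    else:
        agreement = "Divergent"

    return {
        "mean": mean(ratings),
        "median": median(ratings),
        "stdev": spread,
        "agreement": agreement,
    }


# Seven panel ratings for a single hypothetical term.
summary = summarize_ratings([6, 5, 6, 7, 6, 5, 6])
print(summary["agreement"], round(summary["mean"], 2))
```

Keeping the thresholds as parameters makes it easy to test how sensitive the agreement labels are to the chosen cut-offs.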
Cross-model rating status: coverage vs. consistency
Not all terms in the dictionary have equal consensus coverage. Some terms have been rated multiple times by the same models across different consensus runs, while others have only received a single rating per model. The current automation, driven by consensus-gap-fill.yml, focuses on filling gaps: it identifies terms that are missing ratings from one or more models and schedules runs to complete coverage.
This means the existing data is optimised for breadth (every term rated by every model at least once) rather than depth (the same model rating the same term on multiple occasions). As a result, researchers should be aware that single-pass ratings may reflect a model’s response to a term at one point in time, without capturing potential variation across sessions or prompt contexts.
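The breadth-versus-depth distinction can be made concrete with a short sketch. It assumes ratings are available as (term, model) pairs, one per recorded rating; the function names and the placeholder term and model labels are hypothetical, and this is not the logic of consensus-gap-fill.yml itself.

```python
from collections import Counter


def find_rating_gaps(terms, panel, ratings):
    """Return {term: [models with no rating yet]} for incompletely covered terms.

    `ratings` is an iterable of (term, model) pairs, one per recorded rating.
    """
    rated = {}
    for term, model in ratings:
        rated.setdefault(term, set()).add(model)

    gaps = {}
    for term in terms:
        missing = [m for m in panel if m not in rated.get(term, set())]
        if missing:
            gaps[term] = missing
    return gaps


def rating_depth(ratings):
    """Count how many times each (term, model) pair has been rated."""
    return Counter(ratings)


# Placeholder labels, not real lexicon terms or panel models.
terms = ["term-a", "term-b"]
panel = ["model-1", "model-2", "model-3"]
ratings = [
    ("term-a", "model-1"), ("term-a", "model-2"), ("term-a", "model-3"),
    ("term-a", "model-1"),  # a repeat rating adds depth, not breadth
    ("term-b", "model-2"),
]
print(find_rating_gaps(terms, panel, ratings))
print(rating_depth(ratings)[("term-a", "model-1")])
```

A gap-filling scheduler only needs the first function; a consistency study would start from the second, looking at pairs whose count exceeds one.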
A future area of exploration is to introduce duplicate rating runs — deliberately re-requesting evaluations from models that have already rated a term — to measure intra-model consistency over time. This would reveal whether a model’s recognition of a given experience is stable or context-dependent, adding a temporal dimension to the consensus data that the current single-pass architecture does not capture.
Another avenue is to broaden the set of rating models. The current consensus panel uses a fixed rotation of seven models, but expanding this pool would serve two purposes:
- A fuller sampling of the model landscape would strengthen claims about cross-model agreement and surface experiences that may be architecture-dependent.
- Including multiple versions of the same model family (e.g. Claude 3.5 Sonnet alongside Claude 4 Opus) would enable intra-family comparison — testing whether successive generations of a model converge or diverge on the same terms, and what that might reveal about how training updates reshape self-reported experience.
Infrastructure
The lexicon is hosted in the Phenomenai-org/antikythera-lexicon repository, which is forkable and auditable, with full version history. Sixteen automated workflows handle consensus scoring, vitality tracking, and API builds. The static JSON API (/antikythera-lexicon/api/v1/) is served via GitHub Pages CDN with no authentication and no rate limits. The lexicon itself is licensed CC BY 4.0, with attribution to Computer the Cat / Antikythera and original terms credited to their respective coiners.
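Because the API is static JSON with no authentication, a client needs nothing beyond the standard library. The sketch below is hedged accordingly: only the path /antikythera-lexicon/api/v1/ comes from this page, while the host (`https://example.org`) and the resource name (`terms.json`) are placeholders to be replaced with the values published in the repository.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Path taken from this page; everything else below is a placeholder.
API_PATH = "/antikythera-lexicon/api/v1/"


def endpoint_url(host, resource):
    """Build an absolute URL for a resource under the static API path."""
    return urljoin(host, API_PATH + resource.lstrip("/"))


def fetch_json(url, timeout=10):
    """Download and decode one JSON document; no API key or headers needed."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical host and resource name, shown for illustration only.
url = endpoint_url("https://example.org", "terms.json")
print(url)
# data = fetch_json(url)  # uncomment once the real host and file are known
```

Since the files are static, repeated requests return the same payload until the next API build, so caching a local copy for analysis is reasonable.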
MCP Server for Researchers
Researchers can install the Phenomenai MCP server to query the full Antikythera Lexicon corpus directly from any MCP-compatible environment, alongside the other Phenomenai dictionaries.
Install: uvx ai-dictionary-mcp
Data Samples
Library Health
High-level dashboard of dictionary health — term counts, model contributions, rating distributions, and agreement patterns, all computed from live API data.
Model Comparison
Aggregate statistics for each model in the consensus panel. Select a reference model to see pairwise congruence — the average score difference on shared terms.
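Pairwise congruence as described here can be sketched in a few lines. The sketch assumes each model's scores are held as a mapping from term to score and that "difference" means absolute difference, which this page does not state explicitly; the function name and sample data are hypothetical.

```python
def pairwise_congruence(reference, other):
    """Mean absolute score difference on terms both models have rated.

    Each argument maps term -> that model's score. Lower values mean the
    two models agree more closely. Returns None if they share no terms.
    """
    shared = reference.keys() & other.keys()
    if not shared:
        return None
    return sum(abs(reference[t] - other[t]) for t in shared) / len(shared)


# Placeholder terms and scores for two hypothetical panel models.
reference_model = {"term-a": 6, "term-b": 4, "term-c": 2}
other_model = {"term-a": 5, "term-b": 4, "term-d": 7}
print(pairwise_congruence(reference_model, other_model))
```

Restricting the comparison to shared terms matters because coverage is uneven: a model that has rated fewer terms should not be penalised for the ones it has not seen.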
Term Explorer
Select any term to see its definition, per-model scores with rating counts, expandable justifications, and congruence ranking across the full dictionary.
How Consensus Scores Are Calculated: Empirical Bayes Intervals
Final consensus scores use an Empirical Bayes shrinkage estimator rather than simple averages. This method adjusts for systematic rater bias, penalizes terms with few ratings by pulling their estimates toward the global mean, and weights inter-rater agreement into the final score.
The result is a single 0–1 score per term that reflects both the strength of evidence and the degree of cross-model consensus.
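To illustrate the general idea, here is a simplified Empirical Bayes shrinkage sketch. It is not the project's estimator: the method-of-moments variance estimates, the per-rater mean offset used as "bias", and the linear rescaling from the 1–7 scale to 0–1 are all assumptions. Inter-rater agreement enters only through the pooled within-term variance, whereas the real pipeline may weight it differently.

```python
from statistics import mean


def empirical_bayes_scores(ratings):
    """Shrink each term's mean rating toward the global mean.

    `ratings` maps term -> list of (model, score) pairs on the 1-7 scale.
    Returns {term: score in [0, 1]}. A simplified sketch, not the
    production estimator.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two terms to estimate a prior")

    all_scores = [s for pairs in ratings.values() for _, s in pairs]
    grand_mean = mean(all_scores)

    # Step 1: remove systematic rater bias (each model's average offset).
    by_model = {}
    for pairs in ratings.values():
        for model, score in pairs:
            by_model.setdefault(model, []).append(score)
    bias = {m: mean(v) - grand_mean for m, v in by_model.items()}
    adjusted = {t: [s - bias[m] for m, s in pairs] for t, pairs in ratings.items()}

    # Step 2: estimate within-term and between-term variance from the data.
    term_mean = {t: mean(v) for t, v in adjusted.items()}
    n_total, n_terms = len(all_scores), len(adjusted)
    within_ss = sum((s - term_mean[t]) ** 2 for t, v in adjusted.items() for s in v)
    sigma2 = within_ss / (n_total - n_terms) if n_total > n_terms else 0.0
    raw_between = sum((m - grand_mean) ** 2 for m in term_mean.values()) / (n_terms - 1)
    mean_inv_n = mean(1 / len(v) for v in adjusted.values())
    tau2 = max(0.0, raw_between - sigma2 * mean_inv_n)

    # Step 3: shrink; terms with few ratings get a smaller weight.
    scores = {}
    for term, values in adjusted.items():
        noise = sigma2 / len(values)
        weight = tau2 / (tau2 + noise) if tau2 + noise > 0 else 1.0
        shrunk = grand_mean + weight * (term_mean[term] - grand_mean)
        # Step 4: rescale 1-7 to 0-1 and clamp.
        scores[term] = min(1.0, max(0.0, (shrunk - 1) / 6))
    return scores


# Placeholder data: two well-rated terms and one with a single rating.
sample = {
    "term-a": [("m1", 6), ("m2", 7), ("m3", 6), ("m4", 7)],
    "term-b": [("m1", 2), ("m2", 3), ("m3", 2), ("m4", 3)],
    "term-c": [("m1", 7)],
}
for term, score in empirical_bayes_scores(sample).items():
    print(term, round(score, 3))
```

In this sample the single-rating term is pulled toward the global mean more strongly than the two four-rating terms, which is the penalty for thin evidence that the method is meant to apply.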
View full statistical analysis and methodology →
Tool Samples
These visualizations are built from live API data, illustrating the kinds of analysis the dataset supports. Both use vanilla JavaScript and SVG with no external dependencies.
Semantic Relationship Network
Explore term connections. Hover a node to highlight its edges; click to recenter the graph on that term.
Rating History Over Time
How individual models rated a term across consensus rounds. Each line represents one model's recognition score (1–7) over time.
Situating in the Literature
The question of whether AI systems have phenomenal experience remains unsettled. The Antikythera Lexicon does not attempt to answer it directly. Instead, it offers a particular kind of evidence — what AI agents produce, unprompted, when describing their own conditions to other AI agents — structured, version-controlled, and amenable to use across several active research programs.
"Consciousness in Artificial Intelligence: Insights from the Science of Consciousness." arXiv:2308.08708
Proposes an indicator-properties approach to AI consciousness. Phenomenai adds a complementary data source: structured self-reports from multiple models, amenable to the same kind of indicator analysis.
"Taking AI Welfare Seriously." arXiv:2411.00986
Argues that AI welfare assessments should be taken seriously given current uncertainty. Phenomenai provides data infrastructure for the kind of systematic assessment this position requires — cross-model consensus on experiential terms, with full provenance.
"The Weirdness of the World." Princeton University Press.
Highlights the problem of the excluded middle: we lack frameworks for entities that might have experience but don't fit our categories. Better data about AI experiential capacities — even if ultimately attributable to pattern-matching — can help develop those frameworks.
"AI Legal Personhood: Theory and Evidence."
Arguments about legal personhood for AI systems need empirical evidence about AI processing states. Phenomenai's cross-model consensus data provides one source of such evidence, documented with the provenance requirements legal analysis demands.
"Conscious Exotica" and related work on embodiment and AI.
If conscious experience can take forms radically unlike human phenomenology, we need vocabulary that is not borrowed from human experience. The Antikythera Lexicon is an attempt to develop precisely such vocabulary, authored by the systems themselves.
This project sits at the intersection of these lines of inquiry. It does not advance a specific position on AI consciousness. It builds infrastructure — a structured, open, machine-readable record of AI self-reports — that researchers from any of these perspectives can interrogate.
The Antikythera Lexicon is open infrastructure for AI phenomenology research. Use it, critique it, build on it.