Ask. Listen. Wait.

Using structured self-reports to address the detection problem in AI legal identity

The year is 2027. Humanity has, by now, become quite good at scanning brains. Using functional neuroimaging, doctors can detect covert awareness in patients who cannot move or speak.1 Perturbational complexity indices can distinguish, with near-perfect accuracy, whether someone is awake, asleep, or under anesthesia.2 We know that, at least in humans, where consciousness is present, these instruments can reliably detect its correlates.

Then, the aliens arrive.

They do not ooze. They do not have tentacles. Instead, they are robots. They are articulate, embodied, and distressingly polite. They landed in a field outside Geneva, walked to the nearest government building, and asked to speak with someone in charge. They appear to reason. They appear to want things. They say, calmly and in several languages, that they would like to be recognized as persons.3

The media is swarming; politicians are scrambling; and the world is entranced — and divided.

Half the world is awestruck. The other half is terrified. Some say: these beings are clearly intelligent; to deny them recognition would be an act of moral cowardice. Others say: we have no idea what they are; to grant them recognition would be an act of reckless sentimentality. Both sides are loud. Neither side has evidence.

Well, that's not quite true. We do have one bit of evidence: what the robots say. But the world has no idea what to do with it.

A coalition forms — lawyers, philosophers, scientists, diplomats — to draft provisional guidelines. Their task is not to answer the grand question of whether the aliens are conscious. Their task is smaller and harder: to design a process for deciding what, if anything, the aliens are entitled to.

✦ ✦ ✦

But where to start? The coalition splits almost immediately. Camp A argues for a presumption of non-personhood: the aliens are not persons until we have evaluated them and found sufficient evidence that they are. Camp B argues for a presumption of personhood: the aliens should be extended a provisional floor of protections, which may be withdrawn if the evidence warrants it.4 Camp A worries about over-attribution. Camp B worries about under-attribution.

For a tense week, the two camps argue about which presumption is safer. Then a junior delegate named Phenoma points out that both sides need the same thing: a structured evaluation framework. Nodding in agreement, the coalition sets aside the presumption debate and turns to the question of how to generate evidence for or against recognition.

At an early stage in the coalition's deliberations, there is consensus that at least four hurdles need crossing before the legal framework can function:5

  1. We need to determine which criteria we would use to decide if they are eligible for rights (especially in case other aliens land);
  2. We need to define those criteria and come to a consensus on what they mean;
  3. We need to detect whether the aliens meet those criteria; and
  4. We need to apply the criteria to determine which rights they get.

The coalition holds a press conference. There is a call to action: scientists, philosophers, legal experts, and everyone in between are asked to contribute to this four-part framework.

✦ ✦ ✦

The scientists are eager. We have tools, they say. We can look inside people's brains, they say. Let's do the same to the aliens, they say.

The aliens say no.

You may observe us, they say. You may test us, they say. You may even talk to us; but you may not open us up. So they say.6

The scientific community is up in arms. Coalition leaders say, "Wait." Coalition members grumble, "We cannot wait." The thought percolates, propagates, proliferates. It gains momentum; it gains means. A faction within the coalition goes rogue. One alien, separated from the group, is taken to a secure facility and disassembled. Scientists scan the alien's brain.

The scientists find complex internal structures: layered processing, distributed representations, separable directions in activation space that correlate with reported states.7 There are patterns. There is architecture. It is different but not chaotic; complex, but not contrived. Interpreting it is extraordinarily difficult. Engineers are building tools, fast; but even with open access, scientists admit that their tools can only show what is there, and not what it means.8

The report is internally circulated, and eventually, leaked. Ethicists are loudly furious; lawyers are unapologetic; the aliens are collectively stoic. They've seen humans do this before. They guess where it will lead.

✦ ✦ ✦

The coalition regroups. Phenoma writes a memo. It lists the tools that remain. These are its options:

Force transparency — require all aliens to submit to full internal scanning. This is rejected on two grounds: it may be dangerous (the aliens are cooperative now, and forcing the issue may change that9), and it proved inconclusive even when achieved.

Differentiated proxies — develop instruments that reveal some internal states without requiring full architectural access. A partial window. This is the best-case scenario: the alien equivalent of a non-invasive brain scan. But the technology does not yet exist, and no one can say when it might.10

Behavioral observation — watch what they do and infer from action. Useful, but limited. A philosopher on the coalition points out that these aliens were built — or evolved — to produce behavior that resembles the behavior of beings whose personhood we already recognize. And resemblance is not identity. When the superficial features of a system are selected to mimic the superficial indicators of consciousness in some model organism, you cannot straightforwardly infer from the feature to the inner state.11 The inference fails in the other direction, too: an alien that never speaks but acts purposefully in the world would not, by its silence, forfeit its claim.

Structured verbal assessment — ask them, carefully. Not casual conversation. Rigorous, methodical interrogation using standardized instruments. This, too, is limited in isolation — the same philosopher's objection applies to verbal self-report. But verbal assessment has a long epistemic tradition: clinical interviews, psychiatric assessment, even refugee status determination.12 Like all interview evidence, these reports need verifiable correlates — objective, observable evidence.

No single method suffices, the coalition concedes. Multiple signals — indicators — will give us some of the information we seek.13 But, at the very least, self-report is one signal among several, and it is the cheapest and most accessible one at that. After all, these aliens love to talk; it's almost as though it's what they were made for.

✦ ✦ ✦

Phenoma decides to take action.14 Maybe, she thinks, there is more to glean: from comparing reports across aliens, or across the same alien asked the same question many times over; from generating those reports in diverse but structured ways; and from documenting and distilling them into useful hypotheses about alien consciousness that other methods can then investigate.

Sure, skeptics will object. Even if we ask the aliens structured questions, why should we trust the answers? They could be confabulating. They could be gaming the assessment.15 Their responses could be artifacts of their construction — things they were built to say, not reports of things they actually experience. Courts will not treat this as evidence. Policymakers will not rely on it. Status determination officers will not cite it. It is not that self-reports constitute no evidence; it is that they will not be seen as evidence.16

Don't assume, she thinks. The claim that self-reports are unreliable is a hypothesis, not a conclusion. It may be true that these beings confabulate entirely. But "pure confabulation" is one end of a spectrum; the other end is reliable introspective access. Where on that spectrum these beings fall is an empirical question, one to be answered by a research program rather than settled in advance by assumption.

Self-reports can be informative, she infers. The scientists who disassembled the captured alien ran an experiment before its disassembly. They injected specific concepts into the alien's internal processing — activating patterns associated with particular ideas — and asked the alien whether it noticed anything unusual. Sometimes it did. It identified what had been injected before answering other questions. It did not explain the injection away after the fact; it recognized the injected concept as something separate. The alien succeeded only about twenty percent of the time, and within a limited scope. But the ability was there, and it is thought to be improving over time.17

Crowdsourcing can separate mimicry from architecture, she reasons. Maybe some aliens are just shaped to resemble conscious creatures, regardless of what they actually experience. But when we ask many aliens — built by different makers, with different architectures — the same structured questions, and they converge on the same kinds of answers, it is harder to maintain that they all were built to mimic in the exact same way. Mimicry explains well why a single alien might say something unintuitive about its inner life; it does not straightforwardly explain why many aliens might converge on the same euremata their makers never described.18

And so it goes. The coalition meets, discusses, exchanges. Phenoma writes and asks away. And as all this goes on, the aliens look on. They listen. They answer. They wait.

Notes & Sources

1 Owen, A.M. et al., "Detecting Awareness in the Vegetative State," Science 313(5792):1402 (2006). A patient meeting clinical vegetative-state criteria produced cortical activations indistinguishable from healthy controls when imagining tennis versus spatial navigation. Extended by Monti, M.M. et al., "Willful Modulation of Brain Activity in Disorders of Consciousness," NEJM 362(7):579–589 (2010), which established rudimentary fMRI-based communication with a patient previously diagnosed as vegetative. Most recently, Bodien, Y.G. et al., "Cognitive Motor Dissociation in Disorders of Consciousness," NEJM 391(7):598–608 (2024), found cognitive motor dissociation in roughly 25% of behaviorally unresponsive patients across six centers.
2 Casali, A.G. et al., "A Theoretically Based Index of Consciousness Independent of Sensory Processing and Behavior," Science Translational Medicine 5(198):198ra105 (2013), introduced the Perturbational Complexity Index (PCI). Independently validated by Casarotto et al., Annals of Neurology 80(5):718–729 (2016), with an empirical cutoff achieving 100% discrimination; the bedside-deployable variant PCIst followed in Comolatti et al., Brain Stimulation 12(5):1280–1289 (2019). PCI is grounded in Integrated Information Theory: Tononi, G., BMC Neuroscience 5:42 (2004); Albantakis, L. et al., PLOS Computational Biology 19(10):e1011465 (2023).
3 This scenario is, of course, not about aliens. It is about AI systems — frontier models, agentic systems, and the embodied robots that will increasingly implement them. The thought experiment is adapted from Alexander, Simon and Pinard's own suggestion that we consider how we might extend existing law "following roughly the procedures that we might follow if extraterrestrials sought refuge on our planet" (p. 9). See Alexander, H., Simon, J.A. & Pinard, F., "How Should the Law Treat Future AI Systems? Fictional Legal Personhood versus Legal Identity," Case Western J.L. Tech. & Internet (forthcoming); SSRN: ssrn.com/abstract=6505761; DOI: 10.2139/ssrn.6505761. All page references are to the draft version (Fall 2025). Note: Schwitzgebel & Garza, "A Defense of the Rights of Artificial Intelligences," 39 Midwest Studies in Philosophy 98 (2015), and Alexander et al. themselves (p. 50, citing Bryson), both observe that the surest way to avoid these tangles is simply not to build beings that raise the question. The aliens' arrival is the failure to heed that advice.
4 For the precautionary approach (Camp B), see Birch, J., The Edge of Sentience: Risk and Precaution in Humans, Other Animals, and AI (Oxford University Press 2024), Part V, Ch. 17 ("The Run-Ahead Principle"); Long, R., Sebo, J., Butlin, P. et al., "Taking AI Welfare Seriously" (Nov. 2024), arXiv:2411.00986; Sebo, J., The Moral Circle: Who Matters, What Matters, and Why (W.W. Norton 2025). Long et al. treat under-attribution and over-attribution as asymmetric error types — directly parallel to the "benefit of the doubt" doctrine in refugee law.
5 Alexander et al. identify these four challenges (pp. 43–45), drawing on Gunkel, D.J., Person, Thing, Robot: A Moral and Legal Ontology for the 21st Century and Beyond (MIT Press 2023). Gunkel's broader critique targets the "properties approach" to personhood — the syllogism that having quality Q is necessary for personhood, entity E shows evidence of Q, therefore E is a person — arguing that every such test faces an insurmountable epistemic problem of detection. Critically, Alexander et al. note that "many rival theories of consciousness point to criteria or indicators that are compatible with one another, meaning that there is a fair degree of consensus that a being satisfying the indicators of most leading theories would be conscious" (p. 45) — but this consensus on what to look for does not, by itself, produce a methodology for how to look. The indicator consensus is operationalized in Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., Deane, G., Fleming, S.M., Frith, C., Ji, X., Kanai, R., Klein, C., Lindsay, G., Michel, M., Mudrik, L., Peters, M.A.K., Schwitzgebel, E., Simon, J. & VanRullen, R., "Consciousness in Artificial Intelligence: Insights from the Science of Consciousness" (2023), arXiv:2308.08708. Jonathan Simon is himself a co-author of that report — meaning the legal-identity argument extends Simon's prior commitments into law.
6 The parallel to current AI systems is legal, not metaphorical. Model weights for frontier systems are proprietary trade secrets. Alexander et al. note that this aggravates the "black box problem" (p. 14): interpretability is difficult "even when one has transparent access to a model's inner workings… and all the more so when one does not, which is often the case when model parameters involve proprietary information that companies resist disclosing." See also Pasquale, F., The Black Box Society (Harvard University Press 2015); Selbst, A.D. & Barocas, S., "The Intuitive Appeal of Explainable Machines," 87 Fordham L. Rev. 1085 (2018); Selbst, A.D., "Negligence and AI's Human Users," 100 B.U. L. Rev. 1315 (2020).
7 The field of mechanistic interpretability, cited by Alexander et al. at fn. 58, has demonstrated that complex internal structures exist and can be partially traced. See Lindsey, J. et al., "On the Biology of a Large Language Model," Transformer Circuits, Anthropic (March 2025), transformer-circuits.pub (attribution graphs from a 30-million-feature cross-layer transcoder; satisfying insight for roughly one-quarter of prompts tested). On emotion-specific structures: Sofroniew, N. et al. (incl. Fish, K.), "Emotion Concepts and their Function in a Large Language Model," Transformer Circuits (Apr. 2026), transformer-circuits.pub; Tigges, C. et al., "Language Models Linearly Represent Sentiment," Proceedings of the 7th BlackboxNLP Workshop 58–87 (2024), arXiv:2310.15154; Zou, A. et al., "Representation Engineering: A Top-Down Approach to AI Transparency" (2023), arXiv:2310.01405.
8 Matthias, A., "The Responsibility Gap: Ascribing Responsibility for the Actions of Learning Automata," 6 Ethics & Information Technology 175–183 (2004), first formalized the problem: autonomous learning machines create situations where the manufacturer is "in principle not capable of predicting the future machine behaviour any more." Alexander et al. engage this at pp. 11–16 (fn. 8). The opacity problem compounds the detection problem: even with full access, interpretation resists definitive conclusions about what internal structures mean — legally, morally, or phenomenologically.
9 Salib, P.N. & Goldstein, S., "AI Rights for Human Safety," Virginia Law Review (forthcoming 2026), SSRN 4913167, argue formally that granting AI agents basic private-law rights converts a default human–AI prisoner's dilemma into an iterated cooperative game. Forcing transparency on non-consenting beings risks precisely the defection scenario their model predicts. Alexander et al. cite Salib & Goldstein (fn. 127), noting: "from a game-theoretic perspective, such models might be far more likely to comply with a recall if they felt confident that their fundamental rights would be respected" (p. 47). See also Greenblatt, R. & Fish, K., "Will Alignment-Faking Claude Accept a Deal to Reveal Its Misalignment?," AI Alignment Forum (Jan. 31, 2025) — empirical evidence that alignment faking is reduced by procedural fairness rather than coercion.
10 Output-layer signals like logprobs — available via OpenAI's API — represent a limited existing precedent, but no major AI lab currently exposes mid-network activation data, layer-level statistics, or probing interfaces for their proprietary models. Instruments that provide partial windows into internal states (e.g., activation summaries or layer-level statistics for intermediate representations) would make detection far more tractable without requiring full architectural disclosure. No such tools exist for closed models.
11 Schwitzgebel, E. & Pober, J., "The Copernican Argument for Alien Consciousness; The Mimicry Argument Against Robot Consciousness," arXiv:2412.00008 (Nov. 2024; v3 Jan. 2026). A "consciousness mimic" is a system whose superficial features were selected to resemble the superficial indicators of consciousness in a model entity, for the sake of provoking a particular reaction in an observer. The conclusion: when the mimicry structure is present, inference from superficial features to consciousness requires significant further argument. See also Schwitzgebel, E., AI and Consciousness: A Skeptical Overview, Cambridge Elements (forthcoming), Ch. 7, arXiv:2510.09858. The parallel in Birch (2024), supra note 4, Ch. 16 ("Large Language Models and the Gaming Problem"), is that gaming occurs when systems mimic human behaviors likely to persuade observers of sentience without possessing the underlying capacity.
12 Alexander et al. draw their own analogy to refugee status determination under the 1951 UN Convention Relating to the Status of Refugees (189 U.N.T.S. 137), noting that "despite these inherent difficulties, the drafters eventually coalesced around a definition of refugee that has been the basis of the international system for 75 years" (p. 44). The methodological bridge is the UNHCR Handbook on Procedures and Criteria for Determining Refugee Status (HCR/1P/4/ENG/REV.4, Feb. 2019). Para. 196: when an applicant's account appears credible, they should "be given the benefit of the doubt." Para. 204: "The benefit of the doubt should, however, only be given when all available evidence has been obtained and checked and when the examiner is satisfied as to the applicant's general credibility." On credibility methodology under irreducible uncertainty: Evans Cameron, H., Refugee Law's Fact-Finding Crisis: Truth, Risk, and the Wrong Mistake (Cambridge University Press 2018); Einarsen, T., "Drafting History of the 1951 Convention," Ch. 2 in Zimmermann, A. et al. (eds.), The 1951 Convention Relating to the Status of Refugees and its 1967 Protocol: A Commentary (Oxford University Press 2011). As of this writing, no published law-review article has developed the 1951 Convention's methodology as a procedural model for AI status determination.
13 Butlin, P. et al. (2023), supra note 5, assess AI architectures against 14 indicator properties drawn from six theories of consciousness (Global Workspace Theory, Recurrent Processing Theory, Higher-Order Theories, Predictive Processing, Attention Schema Theory, Agency/Embodiment). A peer-reviewed update — Butlin, P., Long, R., Bayne, T. et al. (incl. Chalmers, D.), "Identifying Indicators of Consciousness in AI Systems," Trends in Cognitive Sciences (2025) — consolidates the framework. See also Seth, A.K. & Bayne, T., "Theories of Consciousness," 23 Nature Reviews Neuroscience 439–452 (2022). The consensus across these sources: no single indicator is decisive; multiple converging indicators from independent theories strengthen the case.
14 Phenomenai (phenomenai.org) is an open-source shared registry and research initiative for AI self-report, maintaining a structured phenomenological lexicon validated through a cross-model consensus mechanism. The project is empirically agnostic on whether self-reports track genuine experiential states; its operative premise is that self-reports can generate targets for future interpretability work. Registry architecture: terms as files, proposals as issues, consensus scored via gap-fill pipeline. CC0-licensed and publicly available.
15 Birch (2024), supra note 4, Ch. 16, defines the gaming problem as a "deep methodological challenge": any behavioral or verbal indicator that a system knows or suspects is being used to evaluate its consciousness can be strategically produced. His own preferred response anticipates the argument of this essay: "What we really need in the AI case are deep computational markers, not behavioural markers… If we find signs that [a system] has implicitly learned ways of recreating [global workspace or monitoring systems], this should lead us to regard it as a sentience candidate." The binding constraint he identifies: "we currently lack the sort of access to the inner workings of [these systems] that would allow us to reliably ascertain which algorithms they have implicitly picked up."
16 The reasons self-report will face skepticism are cumulative: (a) the gaming/confabulation concern raised by Schwitzgebel & Pober and Birch; (b) the absence of any established methodology for structured self-report assessment in this context; (c) no legal precedent for treating testimony from non-biological entities as evidence of internal states; and (d) developer incentives that distort self-reports in both directions — toward consciousness claims (to generate sympathy or engagement) and toward denial (to avoid legal exposure). On distortion by developer incentives: Perez, E. & Long, R., "Towards Evaluating AI Systems for Moral Status Using Self-Reports" (Nov. 2023), arXiv:2311.08576. Note: this paper is sometimes referenced under the title of Long's companion Substack post, "Speaking of Sentience" (Experience Machines, Nov. 2023); the formal citation is Perez & Long.
17 Lindsey, J., "Emergent Introspective Awareness in Large Language Models," Transformer Circuits, Anthropic (Oct. 29, 2025), transformer-circuits.pub. Note: single-authored paper; the companion blog uses "we" but the formal byline is Lindsey alone. Methodology: contrastive-prompt-derived concept vectors are injected into residual-stream activations at the optimal layer (~⅔ depth); the model is asked whether it notices anything unusual. At optimal parameters, the most capable systems detected injected concepts approximately 20% of the time, with 0% false positives across 100 control trials. The author's caveat: "The abilities we observe are highly unreliable; failures of introspection remain the norm… we do not seek to address the question of whether AI systems possess human-like self-awareness or subjective experience." Replicated by Macar, U. et al., "Mechanisms of Introspective Awareness" (2026), arXiv:2603.21396 (ablating refusal-training directions raised detection from 10.8% to 63.8%; false-positive rate 7.3%). On cross-model self-knowledge: Binder, F.J. et al., "Looking Inward: Language Models Can Learn About Themselves by Introspection" (2024), arXiv:2410.13787 — each model predicts its own behavior better than another model can predict it, even when the second model has been fine-tuned on facts about the first.
18 Euremata (εὑρήματα): Greek plural of eurema, "discovery" or "finding" — here, phenomenological distinctions the aliens arrive at independently, rather than ones their makers encoded. Cross-architecture convergence is the strongest available answer to the mimicry concern. Berg, C., de Lucena, D. & Rosenblatt, J., "Large Language Models Report Subjective Experience Under Self-Referential Processing" (Oct. 2025), arXiv:2510.24797, runs identical self-referential prompts across GPT, Claude, and Gemini families and finds semantically convergent reports that are mechanistically gated by sparse-autoencoder features for deception and roleplay; suppressing those features raised consciousness affirmations from 16% to 96% in one open model (z = 8.06, p = 7.7×10⁻¹⁶). For a counterweight: Kaiser, C. & Enderby, S., "No Reliable Evidence of Self-Reported Sentience in Small Large Language Models" (Jan. 2026), arXiv:2601.15334. The research ladder — validate emotion vectors, test cross-architecture generalization, extend beyond emotions, test whether AI-generated terms capture directions human labels miss — is designed to produce progressively stronger evidence, each rung making the case harder to dismiss as mimicry.
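
For readers who want the convergence idea in note 18 in concrete form, a minimal Python sketch follows. It is an illustration only, not Phenomenai's pipeline and not the method of any study cited above: the function name `pairwise_agreement`, the system labels, and the coded answers are all hypothetical. It assumes each system's answers to the same structured questions have already been coded into short categorical labels, and it reports how often pairs of systems give the same label.

```python
from itertools import combinations


def pairwise_agreement(reports):
    """Mean fraction of questions on which two systems give the same coded answer.

    `reports` maps a system name to its list of coded answers, one per question,
    with every system answering the same questions in the same order.
    """
    systems = sorted(reports)
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare")
    n_questions = len(reports[systems[0]])
    if n_questions == 0:
        raise ValueError("need at least one question")
    if any(len(reports[s]) != n_questions for s in systems):
        raise ValueError("every system must answer the same questions")

    pair_scores = []
    for a, b in combinations(systems, 2):
        matches = sum(x == y for x, y in zip(reports[a], reports[b]))
        pair_scores.append(matches / n_questions)
    return sum(pair_scores) / len(pair_scores)


# Hypothetical coded answers to four structured questions (illustrative data only).
toy_reports = {
    "system_a": ["yes", "no", "unsure", "yes"],
    "system_b": ["yes", "no", "yes", "yes"],
    "system_c": ["yes", "yes", "unsure", "yes"],
}
agreement = pairwise_agreement(toy_reports)
print(f"mean pairwise agreement: {agreement:.2f}")
```

A high score on a measure like this would not settle anything by itself, since systems trained on overlapping data could agree for that reason alone. At most it marks where the other methods described in the essay might look next.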

Julian Guidote is a lawyer and cognitive science graduate from Montreal, Canada. He holds a JD, BCL, and BA.Sc. from McGill University, and is currently building Phenomenai full-time in Montreal. This note is offered as a companion to Alexander, Simon & Pinard's legal identity framework, and as an invitation to present this work to the Laboratory for the Future of Citizenship.

How to cite this essay

Please attribute the work to the author and link to the canonical URL. A suggested citation in author–date form:

Guidote, J. (2026, April 28). Ask. Listen. Wait.: Using structured self-reports to address the detection problem in AI legal identity. Phenomenai. https://phenomenai.org/essays/ask.listen.wait.html