2026-06-23 10:56 UTC | Updated: 2026-06-23 11:06 UTC

Who Owns the Model of You?

A self model is a person-controlled, contextual collection of claims an AI agent may consult, with consent, to personalize its behavior. It is not a full identity, not an objective account of a person, and not a profile owned by a platform. Alma is my experiment in owning and governing that model outside any single product.

Source: Hacker News AI | Author: 0set0set

Agents are getting better at tools and memory, but the model they build of a person usually stays trapped where it was created: inferred opaquely, weakly portable, and hard to audit. I think that model should be inspectable, correctable, portable, contextual, consented, and controlled by the person it describes.

Alma approaches the problem with a small architectural core: claims, provenance, grants, temporary readings, audit events, inspection and correction, and schema versioning. The current Rust prototype is narrower and more concrete: event-log Events, projected Facets, scope-based grants, consent-filtered readings, always-on reconciliation with numeric confidence, a signed export bundle, and conformance fixtures. The contribution is the composition, not a new cryptographic primitive or identity standard.

The limits matter. Signatures prove origin and integrity, not truth. Access control governs disclosure, not downstream retention. And there is no evidence yet that Alma makes agents more useful. The evaluation in this essay is a plan for testing that hypothesis, not a result.
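The first limit can be made concrete with a minimal sketch. It assumes a hypothetical claim format and uses HMAC from the Python standard library as a stand-in for a real digital signature; it is not how the Alma export bundle is signed. It shows that verification confirms who tagged the bytes and that they are unchanged, and nothing about whether the statement is true.

```python
import hashlib
import hmac
import json

# Hypothetical key held by the person. A real system would more likely use an
# asymmetric signature scheme; HMAC is only a standard-library stand-in here.
SIGNING_KEY = b"demo-key-not-for-production"


def sign_claim(claim: dict, key: bytes = SIGNING_KEY) -> str:
    """Return a hex MAC over a canonical JSON encoding of the claim."""
    payload = json.dumps(claim, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_claim(claim: dict, tag: str, key: bytes = SIGNING_KEY) -> bool:
    """Check that the claim is unmodified and was tagged with this key."""
    return hmac.compare_digest(sign_claim(claim, key), tag)


# A claim that may well be false about the person it describes.
claim = {"subject": "me", "statement": "prefers verbose answers"}
tag = sign_claim(claim)

# Verification succeeds: the bytes are intact and came from the key holder.
intact = verify_claim(claim, tag)

# Tampering is detected, but that says nothing about which version is true.
tampered = dict(claim, statement="prefers concise answers")
tamper_detected = not verify_claim(tampered, tag)
```

Both checks pass for a claim that could be wrong, which is why truth has to be handled by review and correction rather than by cryptography.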

Introduction

Every conversation with an AI starts the same way: with me explaining myself again.

Who I am. How I work. How I like answers — short when the question is simple, detailed when the system is complex. What I am building, and why. I type it into a new chat, paste it into a “custom instructions” box, or repeat it because the last assistant that learned it lives inside another company’s product.

We have taught machines to reason. Yet each one still meets me as a stranger.

This is more than an inconvenience. As agents gain autonomy — booking, buying, drafting, deciding — the model they use to act for a person becomes infrastructure. And here is the uncomfortable part: platforms will build that model whether or not we design for it. They already infer preferences and patterns. The question is not whether a person gets modeled. The question is who controls the model, who can see it, and who can move or delete it.

The argument runs in seven steps. (1) Useful agents need some model of the person they serve. (2) Platforms inevitably build such models. (3) Those models are usually opaque, platform-bound, and hard to audit. (4) Simple memory — transcripts, vectors, a free-text profile — does not fix this, because it does not separate durable preferences from passing mood, verified facts from guesses, or shareable context from private values. (5) A person-controlled alternative is possible, in which the model is inspectable, correctable, portable, contextual, and consented. (6) Alma is an experimental attempt at that alternative. (7) The risks remain significant, and the central benefit is still a hypothesis that has to be tested.

The thesis is normative and technical at once:

If an AI agent builds a model of a person, that model should be inspectable, correctable, portable, contextual, consented, and controlled by the person it describes.

This essay combines a reference architecture, an experimental implementation, a proposed interoperable surface, and a research agenda. It is not a standard, a validated result, or a privacy guarantee.

The Structural Problem

The difficulty is not a missing feature. It is structural.

Agent memory is fragmented. What one assistant learns rarely transfers cleanly to another.

Platform models are opaque. Systems infer a great deal, but the person typically has little ability to inspect or correct what was inferred, trace where it came from, or export it.

Custom instructions are too flat. They help, yet they collapse context, scope, temporality, and evidence into one block of text.

Transcript and vector memory are undifferentiated. They can retrieve what was said, but they do not, on their own, distinguish a standing preference from a one-off request, or a verified fact from an agent’s guess.

Consent is underspecified. “Use memory” is not “disclose these claims to this reader, for this purpose, for this long.”
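The gap between "use memory" and a specific disclosure can be sketched as a data structure. This is a minimal illustration, not Alma's grant format: the `Grant` class, its field names, and the reader and scope strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: which claims, to which reader, why, and until when."""

    reader: str             # who may read
    purpose: str            # why they may read
    scopes: frozenset       # which claim scopes are covered
    expires_at: datetime    # for how long

    def permits(self, reader: str, purpose: str, scope: str, now: datetime) -> bool:
        """A disclosure is allowed only if every dimension of the grant matches."""
        return (
            reader == self.reader
            and purpose == self.purpose
            and scope in self.scopes
            and now < self.expires_at
        )


now = datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc)
grant = Grant(
    reader="writing-assistant",
    purpose="draft-email",
    scopes=frozenset({"style.writing"}),
    expires_at=now + timedelta(hours=1),
)
```

A flat "use memory" switch corresponds to a grant with every field left unconstrained. Here, the same reader asking for a different purpose, a different scope, or at a later time is refused.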

The deeper risk is not that agents forget us. It is that they remember us in places we cannot inspect, correct, export, or constrain — and that this becomes the default precisely as agents gain the autonomy to act on what they remember.

Memory Is Not a Person Model

It is worth stating the distinction the rest of the argument depends on.

Memory answers what happened: a message was sent, a link was clicked, a value was stored on Tuesday. A person model tries to answer a harder question: how does this person work, and what is an agent allowed to know about it? Not “they once asked for a short reply,” but “in operational questions they prefer concision, though in design discussions they want depth.”

Storing that I prefer short answers is easy. Representing how I work, with provenance, context, uncertainty, and consent — and letting me inspect and correct it — is the actual problem. Memory is necessary but not sufficient. A person model is a different artifact, with different obligations.
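A minimal sketch of the distinction, under stated assumptions: `MemoryRecord`, `Claim`, and every field name below are hypothetical and chosen for illustration, not taken from the Alma prototype. The memory record stores what was said; the claim carries context, provenance, confidence, and review status, so one preference can resolve differently in different situations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryRecord:
    """What happened: a raw observation with no interpretation."""

    timestamp: str
    text: str


@dataclass(frozen=True)
class Claim:
    """How the person works: a typed assertion with metadata (illustrative fields)."""

    key: str
    value: str
    context: str        # where the claim applies
    source: str         # provenance, e.g. "self-reported" or "inferred"
    confidence: float   # 0.0 to 1.0
    confirmed: bool     # whether the person has reviewed and confirmed it


def preferred_value(claims: list, key: str, context: str) -> Optional[str]:
    """Pick the claim for this context, preferring confirmed, then more confident."""
    candidates = [c for c in claims if c.key == key and c.context == context]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: (c.confirmed, c.confidence))
    return best.value


memory = MemoryRecord("2026-06-23T10:00:00Z", "please keep the reply short")
claims = [
    Claim("answer_length", "concise", "operations", "self-reported", 0.9, True),
    Claim("answer_length", "detailed", "design", "inferred", 0.6, False),
]
```

The memory record alone cannot say whether "short" is a standing preference or a one-off request. The claims can, and they return nothing for a context where no claim has been made.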

Terminology and Scope

Vague words hide important differences. I use the following terms deliberately.

Person. The human being. Not reducible to any data structure.

Identity. The broad, contested set of attributes that make someone who they are. Alma does not attempt to capture this.

Profile. A typically platform-held, often opaque representation used for targeting or personalization.

Memory. A store or retrieval mechanism over past events or text.

Context. The situation in which information is appropriate: a purpose, a relationship, a moment.

User model. A system’s internal representation of a user, often implicit and prediction-oriented (as in recommender systems).

Claim store. A structured, queryable set of typed assertions with metadata.

Self Model. The term this project uses for a person-controlled, contextual collection of claims intended for agent personalization. It is explicitly not the person’s complete, true, or objective identity.

Why “Self Model” rather than “profile” or “memory”? Because profile connotes something the platform owns and the person cannot see, and memory connotes an undifferentiated log. The point of the term is ownership and structure: a model of the self, held by the self, that an agent may consult with permission. The term is a convenience, not a claim to capture anyone’s essence. Where “Self Model” feels too philosophically loaded, read it as “person-controlled claim store for agents.”

In scope: representing claims for agent personalization; provenance; contextual, consented disclosure; portability; auditability.

Out of scope: authentication and identity issuance; psychological diagnosis; guaranteeing the truth of claims; controlling data after disclosure; inferring sensitive traits automatically.

Design Goals and Non-Goals

Goals. User control (inspect, correct, reject, delete, export, revoke); selective, purpose-scoped disclosure; first-class provenance; auditability; portability; interoperability; uncertainty preservation; implementation independence.

Non-goals. Determining objective identity; psychological diagnosis; guaranteeing truth; preventing all downstream retention; replacing authentication or credential issuers; centralizing all personal data; automatic sensitive inference; making every attribute portable across every context.

Stating the non-goals is not modesty. It is scoping. A system that tries to solve identity, authorization, cryptography, synchronization, inference, consent, and psychology at once solves none of them well.
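The user-control goal (inspect, correct, reject) can be sketched as a small set of review states. This is an assumption-laden illustration: the `ReviewState` values, the `ReviewedClaim` type, and the policy that only reviewed claims are usable are hypothetical choices, not the prototype's behavior.

```python
from dataclasses import dataclass, replace
from enum import Enum


class ReviewState(Enum):
    PROPOSED = "proposed"    # suggested by an agent, not yet reviewed
    CONFIRMED = "confirmed"  # accepted by the person as stated
    CORRECTED = "corrected"  # value replaced by the person
    REJECTED = "rejected"    # refused by the person


@dataclass(frozen=True)
class ReviewedClaim:
    key: str
    value: str
    state: ReviewState = ReviewState.PROPOSED


def confirm(claim: ReviewedClaim) -> ReviewedClaim:
    return replace(claim, state=ReviewState.CONFIRMED)


def correct(claim: ReviewedClaim, new_value: str) -> ReviewedClaim:
    return replace(claim, value=new_value, state=ReviewState.CORRECTED)


def reject(claim: ReviewedClaim) -> ReviewedClaim:
    return replace(claim, state=ReviewState.REJECTED)


def usable(claims: list) -> list:
    """Assumed policy: only claims the person has confirmed or corrected are used."""
    accepted = (ReviewState.CONFIRMED, ReviewState.CORRECTED)
    return [c for c in claims if c.state in accepted]


proposed = ReviewedClaim("answer_length", "detailed")
corrected = correct(proposed, "concise")
```

Each operation returns a new value instead of mutating the old one, so the earlier state stays available for audit.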

Related Work

Alma is a composition of existing ideas. For each field below: what it solves, what Alma reuses, what Alma adds, and what Alma does not solve.

Local-first software. Solves: data ownership without losing collaboration [1]. Reuses: the person’s copy is primary. Adds: claims, provenance, and consent semantics for agents. Does not solve: what an agent is allowed to read.

Personal data stores (Solid Pods). Solve: user-controlled storage with access boundaries [2]. Reuses: person-held data. Adds: an agent-facing, purpose-scoped reading model. Does not solve: how a person model is represented or reconciled.

Self-sovereign identity / verifiable credentials. Solve: identifier control and issuer-holder-verifier proofs [3][4]. Reuses: portability, signing, verification. Adds: handling for non-credential, inferred, contextual claims. Does not solve: most self-model claims are not credentials, and a signature does not make “prefers concise writing” true.

Object-capability security. Solves: authority as unforgeable, delegable references [5]. Reuses: grants as capability-like, scoped, revocable objects. Adds: contextual purpose and sensitivity. Does not solve: enforcement after disclosure.

Contextual integrity. Solves: privacy as appropriate information flow per context [6]. Reuses: the conceptual basis of purpose-scoped grants. Adds: a concrete grant/reading mechanism. Does not solve: making purposes legible and enforceable for ordinary people.

Event sourcing. Solves: state as a replayable projection of an event log [7]. Reuses: auditability and provenance. Adds: claim-level review states. Does not solve: confidence calibration or consent UX.

CRDTs. Solve: convergence under decentralized updates [8]. Reuses: a path to future multi-device sync. Adds: nothing yet; sync is an optional extension. Does not solve: anything in the current core.

Personal knowledge graphs. Solve: structured facts and relations [9]. Reuses: typed claims. Adds: provenance, sensitivity, consent, and agent readings. Does not solve: it should not become a general graph DB without consent semantics.

Recommender-system user models / preference learning. Solve: inferring interests and preferences from behavior [10][11]. Reuses: modeling preferences under uncertainty. Adds: inspection, correction, refusal, and contextual scoping. Does not solve: prediction quality is not Alma’s aim.

Retrieval-augmented and long-term agent memory. Solve: bringing relevant prior content into context [12]. Reuses: the value of recall. Adds: claim typing, provenance, consent, and temporality. Does not solve: retrieval alone does not separate durable preference from noise.

Platform-native AI memory. Solves: in-product personalization with some view/delete controls [13]. Reuses: the usefulness of memory. Adds: portability and person control outside any single platform. Does not solve: it remains platform-bound by design.

A qualified claim, not an absolute one: I do not know of a widely adopted, open, person-controlled standard that combines provenance-aware claims, purpose-scoped agent readings, portability, and transparent reconciliation. Where this survey is incomplete, treat it as a claim needing further bibliographic validation, not a settled result.
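Of the ideas surveyed above, event sourcing is the easiest to show in a few lines. The sketch below is a generic illustration of state as a replayable projection of a log. The `Event` fields and the two event kinds are assumptions, and it does not reproduce the prototype's Events or Facets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Append-only log entry (field names are illustrative)."""

    seq: int
    kind: str        # "asserted" or "retracted"
    key: str
    value: str = ""


def project(events: list) -> dict:
    """Replay events in sequence order to rebuild the current state."""
    state = {}
    for event in sorted(events, key=lambda e: e.seq):
        if event.kind == "asserted":
            state[event.key] = event.value
        elif event.kind == "retracted":
            state.pop(event.key, None)
    return state


def history(events: list, key: str) -> list:
    """Everything that ever happened to one key, oldest first."""
    return sorted((e for e in events if e.key == key), key=lambda e: e.seq)


log = [
    Event(1, "asserted", "answer_length", "detailed"),
    Event(2, "asserted", "answer_length", "concise"),
    Event(3, "asserted", "timezone", "UTC"),
    Event(4, "retracted", "timezone"),
]
current = project(log)
before_retraction = project(log[:3])
```

Because the log is never rewritten, the current view can be rebuilt at any point, and the history of a single key doubles as its provenance trail.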

Alma Core

A recurring failure of ambitious systems is trying to be everything. Alma separates a small mandatory core from optional extensions.

Alma Core (mandatory).

Claims — typed assertions with metadata.

Provenance — the origin and transformation history of every claim.

Grants — contextual, purpose-scoped authorizations to disclose.

Readings — temporary, consent-filtered views for a reader.

Audit events — a log of disclosures and policy decisions.

Inspection and correction — the person can view, confirm, correct, and reject claims.

Schema versioning — wire artifacts and implementations report a version; richer per-artifact versions remain a protocol goal.
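How several of these core pieces fit together can be sketched as a single read path. This is a simplified illustration under assumptions: the class names, fields, and the `read` function are hypothetical, and expiry, sensitivity, and provenance are left out. A reading is computed from claims and grants for one reader and purpose, and every attempt is written to the audit log, including attempts that disclose nothing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    key: str
    value: str
    scope: str            # e.g. "style.writing" or "health"


@dataclass(frozen=True)
class Grant:
    reader: str
    purpose: str
    scopes: frozenset     # claim scopes this reader may see for this purpose


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, **entry) -> None:
        self.entries.append(entry)


def read(claims: list, grants: list, reader: str, purpose: str, audit: AuditLog) -> list:
    """Build a temporary, consent-filtered view and log the disclosure decision."""
    allowed = set()
    for grant in grants:
        if grant.reader == reader and grant.purpose == purpose:
            allowed |= grant.scopes
    disclosed = [c for c in claims if c.scope in allowed]
    audit.record(
        reader=reader,
        purpose=purpose,
        disclosed=[c.key for c in disclosed],
        withheld=len(claims) - len(disclosed),
    )
    return disclosed


claims = [
    Claim("answer_length", "concise", "style.writing"),
    Claim("sleep_schedule", "irregular", "health"),
]
grants = [Grant("writing-assistant", "draft-email", frozenset({"style.writing"}))]
audit = AuditLog()

reading = read(claims, grants, "writing-assistant", "draft-email", audit)
refused = read(claims, grants, "shopping-agent", "recommend", audit)
```

The default is to withhold: a reader with no matching grant receives an empty reading, and the refusal still leaves an audit entry the person can inspect.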

Optional extensions. Verifiable credentials; CRDT synchronization; external identity providers; policy attestation; downstream retention commitments; and selective or encrypted export bundles

[truncated for AI cost control]