API

Status: Draft · Owner: Ben · Last Updated: 2026-02-28

Overview

The LifeDB API is the primary product surface — the contract between the unified data layer and everything built on top of it. It provides structured access to canonical entities with rich filtering, search, temporal querying, and full read-write capability over entities and their relationships.

Design principles:

  • Data layer first. The API serves data. Intelligence (summarization, topic extraction, intent detection) lives in consumers or in pre-computed artifacts the API serves as data.
  • Platform-transparent by default. Queries span all platforms. Platform is an optional filter, not a structural part of the API.
  • Programmatic consumers. Optimized for AI agents, dashboards, and automation tools. Error responses, pagination, and rate limits are designed for machines to act on.
  • Composable. Filters, search, ordering, and pagination combine freely. Consumers build the queries they need from simple primitives.
  • Unified read-write contract. The same API that consumers use to query data is the API that ingestion connectors use to write data. Ingestion is not a separate system — it’s a consumer of the write API with elevated permissions and platform-reported provenance.
  • Paradigm-neutral. This spec defines the capabilities, query patterns, and contracts the API must support — not the transport mechanism. Whether the implementation is REST, GraphQL, or both is a technical decision. The product contract is: these operations must be possible, composable, and efficient.

Consumers

The API serves two categories of consumers through the same contract:

External Consumers

  • AI agents — pulling context for reasoning, making sequential queries to drill into relevant data, correcting inferences, building up a picture of a person’s interactions
  • Personal dashboards — rendering timelines, activity summaries, communication patterns
  • Automation tools — triggering workflows based on new data, monitoring patterns
  • Applications — any downstream product that needs rich personal context

Ingestion Connectors

  • Platform connectors — iMessage, Slack, Gmail, etc. These are API consumers that create entities and relationships from platform data
  • Import tools — bulk data import from exports, backups, or migrations

The difference between external consumers and ingestion connectors is authorization and trust, not different endpoints:

  • Permissions — connectors have broader write access and higher throughput limits
  • Provenance — connector writes carry platform_reported provenance; external writes carry user_confirmed or manual
  • Bulk access — connectors may use batch operations not available to standard consumers

This unified model means the write API is battle-tested by ingestion, third parties can build connectors using the same public API, and there’s one contract to maintain.

Read Operations

Entity Queries

Structured access to each core entity type. The foundation of the read API.

Entity types:

  • Persons — unified identities
  • PlatformIdentities — platform-specific identities for a Person
  • Conversations — communication contexts (with recursive parent traversal)
  • Messages — messages within a Conversation or across all Conversations
  • Events — calendar events
  • Documents — notes and authored content
  • Attachments — media and files

Each entity type supports: list with filters, single entity retrieval by ID, and traversal of relationships to related entities.

Filtering

All list queries accept combinable filters:

  • Person — filter by participant or sender/recipient
  • Time range — after/before timestamps
  • Platform — optional, cross-platform by default
  • Conversation — scope to a conversation
  • Content type — for Messages (text, call, media, etc.)
  • Provenance — filter by relationship confidence (e.g., only user_confirmed and platform_reported)

Filters compose freely. Querying “text messages involving a specific person since February 15” is a single query combining person, time, and content type filters.
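As a consumer-side sketch of how these filters compose, assuming illustrative field names (`participant_ids`, `sent_at`, `platform`, `content_type`) rather than the real schema:

```python
from datetime import datetime, timezone

def filter_messages(messages, person_id=None, after=None, before=None,
                    platform=None, content_type=None):
    """Apply any combination of filters; unset filters match everything."""
    def matches(m):
        if person_id is not None and person_id not in m["participant_ids"]:
            return False
        if after is not None and m["sent_at"] < after:
            return False
        if before is not None and m["sent_at"] > before:
            return False
        if platform is not None and m["platform"] != platform:
            return False
        if content_type is not None and m["content_type"] != content_type:
            return False
        return True
    return [m for m in messages if matches(m)]

# "text messages involving person p1 since February 15" as one call:
feb15 = datetime(2026, 2, 15, tzinfo=timezone.utc)
msgs = [
    {"id": "m1", "participant_ids": ["p1"], "platform": "imessage",
     "content_type": "text", "sent_at": datetime(2026, 2, 20, tzinfo=timezone.utc)},
    {"id": "m2", "participant_ids": ["p2"], "platform": "slack",
     "content_type": "text", "sent_at": datetime(2026, 2, 21, tzinfo=timezone.utc)},
    {"id": "m3", "participant_ids": ["p1"], "platform": "slack",
     "content_type": "call", "sent_at": datetime(2026, 2, 10, tzinfo=timezone.utc)},
]
hits = filter_messages(msgs, person_id="p1", after=feb15, content_type="text")
```

Note that platform is absent from the example call: cross-platform is the default, and each filter narrows the result set only when supplied.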

Timeline

A cross-entity temporal view — the “everything that happened” query.

Returns a mixed, chronologically ordered stream of Messages, Events, and Documents within a time range. Supports the same filters as entity queries, including person filtering — “everything involving Person X” is a single timeline query that returns messages, events, and documents involving that person in one unified stream, not multiple entity queries merged client-side. Default sort: reverse chronological.

Primary use cases: “show me my day,” “what happened this week with Person X.”
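A minimal sketch of the server-side merge the timeline implies, assuming each entity carries a comparable `timestamp` field:

```python
import heapq

def timeline(*streams, reverse=True):
    """Merge per-entity streams (each sorted ascending by 'timestamp')
    into one chronological stream; reverse-chronological by default."""
    merged = list(heapq.merge(*streams, key=lambda e: e["timestamp"]))
    return merged[::-1] if reverse else merged

messages = [{"kind": "message", "timestamp": 1},
            {"kind": "message", "timestamp": 5}]
events = [{"kind": "event", "timestamp": 3}]
documents = [{"kind": "document", "timestamp": 4}]
stream = timeline(messages, events, documents)
```

The point of the sketch is the shape of the contract: one interleaved stream comes back from a single query, rather than three per-entity result sets the consumer must merge.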

Search

Content search is a first-class filter integrated into all query types, not a separate system.

  • Composable — combines with all other filters. Searching “dinner plans” filtered by person and time range is a single query.
  • Scope — searches across message bodies, document content, event titles and descriptions, and person names.
  • Implementation-neutral — whether search is full-text, semantic, or hybrid is a technical concern. The API contract is: a search parameter accepts a query string and returns relevant results.

Ordering and Relevance

Three ordering modes:

  • Temporal — chronological or reverse chronological. Default when no search terms are present.
  • Relevance — ranked by match quality. Default when search terms are present. Re-ranking models may be applied under the hood.
  • Hybrid — relevance-ranked within a temporal window.

Intent hints. An optional intent parameter allows consumers to signal what they’re trying to accomplish (e.g., find_action_items, catch_up, research_topic). The API can use this to optimize ranking. Intent is never required, and all ordering behavior is overridable — a consumer that knows better than the API’s defaults can always specify exact sort and order.
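The default-selection and override behavior can be sketched as follows; `score` and `timestamp` are assumed field names, and re-ranking models are omitted:

```python
def order_results(results, search_query=None, sort=None):
    """Explicit sort always wins; otherwise default to relevance when a
    search query is present, reverse-chronological when not."""
    mode = sort or ("relevance" if search_query else "temporal")
    if mode == "relevance":
        return sorted(results, key=lambda r: r["score"], reverse=True)
    return sorted(results, key=lambda r: r["timestamp"], reverse=True)

results = [
    {"id": "a", "timestamp": 1, "score": 0.9},
    {"id": "b", "timestamp": 2, "score": 0.4},
]
```

Hybrid ordering falls out of the same primitives: filter to a temporal window first, then order the window by relevance.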

Context Retrieval

Search results identify relevant entities. Context retrieval expands outward from those results within their Conversation.

The drill-down pattern:

  1. Search: broad query returns ranked Messages across the dataset
  2. Identify: agent spots a relevant Message
  3. Drill down: agent requests surrounding context within that Message’s Conversation

Anchor-based windowing. Query messages within a Conversation anchored on a specific message: “give me N messages before and M messages after this message.” Supports the same filtering as other queries.
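A sketch of anchor-based windowing over a chronologically ordered Conversation, with hypothetical message shapes:

```python
def context_window(conversation_messages, anchor_id, before=3, after=3):
    """Return 'before' messages preceding and 'after' messages following
    the anchor (anchor inclusive). Messages must be in chronological
    order; windows are clamped at the Conversation boundaries."""
    index = [m["id"] for m in conversation_messages].index(anchor_id)
    return conversation_messages[max(0, index - before): index + after + 1]

msgs = [{"id": f"m{i}"} for i in range(10)]
window = context_window(msgs, "m5", before=2, after=1)
```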

Search result metadata. Each entity in a search result carries enough context for the consumer to decide whether to drill down: Conversation ID, conversation name, participant summary, timestamp. Results are minimal by default — consumers make explicit drill-down requests for full context.

Inline expansion. An optional expand mechanism for consumers who want related data inline (e.g., include conversation metadata with each message). This is an optimization, not the primary pattern.

Relationship Traversal

Consumers can traverse relationships between entities:

  • Person → their PlatformIdentities, Messages, Conversations, Events
  • Conversation → Messages, participants, parent/child Conversations
  • Message → sender, recipients, Conversation, Attachments, cross-entity references

The depth and shape of traversal is consumer-controlled. This is a key reason the API should remain paradigm-neutral — GraphQL’s nested selection sets and REST’s expand parameters are both valid implementations of this capability.

Write Operations

Entity Management

Create, update, and delete entities:

  • Create — add a new Message, Event, Document, PlatformIdentity, etc. Every write carries provenance metadata indicating who/what created it and how.
  • Update — modify entity fields. The system preserves edit history where applicable.
  • Delete — soft delete with audit trail. Entities are marked as deleted, not destroyed.
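The soft-delete contract can be sketched as below; the audit-field names (`deleted_at`, `deleted_by`, `audit_log`) are assumptions, not the real schema:

```python
from datetime import datetime, timezone

def soft_delete(entity, deleted_by):
    """Mark an entity deleted without destroying it, recording who
    deleted it and when in an append-only audit log."""
    entity["deleted_at"] = datetime.now(timezone.utc)
    entity["deleted_by"] = deleted_by
    entity.setdefault("audit_log", []).append(
        {"action": "delete", "actor": deleted_by, "at": entity["deleted_at"]})
    return entity

def visible(entities):
    """Default reads exclude soft-deleted entities."""
    return [e for e in entities if e.get("deleted_at") is None]
```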

Relationship Management

Create, confirm, reject, and destroy relationships between entities. This is core to the system’s ability to improve over time.

Identity resolution:

  • Confirm — confirm an inferred PlatformIdentity → Person link (“yes, these are the same person”)
  • Reject — reject an incorrect inference (“no, these are different people”)
  • Merge — manually link two PlatformIdentities to the same Person
  • Split — undo a merge, separating PlatformIdentities back into distinct Persons

Cross-entity references:

  • Create — manually link entities (“this message is about this event,” “these conversations are related”)
  • Destroy — remove an incorrect link
  • Confirm/reject — respond to inferred references

Conversation membership:

  • Add/remove participants — modify conversation participant lists
  • Correct membership — fix incorrect participant data from ingestion

Every relationship write carries provenance. A consumer confirming an AI-suggested identity merge produces a user_confirmed provenance record that supersedes the original ai_suggested record.
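The precedence rule can be sketched as below. The ordering user_confirmed > platform_reported > ai_suggested follows the spec's statements about confirmation and overrides; the record shape is illustrative:

```python
PRECEDENCE = {"ai_suggested": 0, "platform_reported": 1, "user_confirmed": 2}

def effective_record(records):
    """The highest-precedence provenance record wins; among records of
    equal precedence, the most recent supersedes earlier ones."""
    return max(records, key=lambda r: (PRECEDENCE[r["provenance"]], r["at"]))

records = [
    {"link": "identity-merge", "provenance": "ai_suggested", "at": 1},
    {"link": "identity-merge", "provenance": "user_confirmed", "at": 2},
]
```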

Annotations

Consumers can attach metadata to entities without modifying the entity itself:

  • Tags — user-defined labels (“action-item”, “follow-up”, “important”)
  • Notes — freeform annotations on any entity
  • Flags — system or user markers (starred, archived, muted)

Annotations are first-class data — queryable and filterable like any other field.

Connector Management

Self-service lifecycle management for data source connectors:

  • Register — create a connector instance with platform-specific credentials (opaque JSON). Credentials are envelope-encrypted at rest. One instance per connector type per tenant.
  • List / Get — query connector instances and their current state (ACTIVE, SYNCING, ERRORED, STALLED, DISCONNECTED).
  • Trigger Sync — fire-and-forget: returns immediately with the connector in SYNCING state. The actual sync runs asynchronously. Returns SyncAlreadyInProgressError if a sync is already in flight.
  • Unregister — remove a connector instance and its stored credentials.

Permission model: read for list/get, manage for register/unregister/sync. When connector infrastructure is not configured (no CREDENTIAL_KEK), queries return empty results and mutations return CONNECTOR_INFRASTRUCTURE_DISABLED.
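The trigger-sync contract can be sketched as a guarded state flip; the connector shape is illustrative and the asynchronous sync itself is omitted:

```python
class SyncAlreadyInProgressError(Exception):
    """Raised when a sync is triggered while one is already in flight."""

def trigger_sync(connector):
    """Fire-and-forget: flip the connector to SYNCING and return
    immediately; the actual sync runs asynchronously elsewhere."""
    if connector["state"] == "SYNCING":
        raise SyncAlreadyInProgressError(connector["id"])
    connector["state"] = "SYNCING"
    return connector
```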

Ingestion Operations

The same write API used by external consumers, with additional capabilities for connectors:

  • Batch creation — create many entities in a single operation (messages from a platform export, contacts from an address book)
  • Upsert semantics — create-or-update based on platform-specific identifiers, enabling idempotent re-ingestion
  • Provenance defaults — connector writes automatically carry platform_reported provenance with the source platform identified
  • Override support — consumers can override ingested data at any time. Manual overrides carry user_confirmed provenance and take precedence over platform_reported data.
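Upsert semantics can be sketched as create-or-update keyed on the platform-specific identifier; the in-memory `store` stands in for the real persistence layer, and field names are assumptions:

```python
def upsert_messages(store, incoming, platform):
    """Create-or-update keyed on (platform, platform_id), so re-running
    an ingestion batch is idempotent rather than duplicating entities."""
    for msg in incoming:
        key = (platform, msg["platform_id"])
        if key in store:
            store[key].update(msg)  # re-ingestion updates in place
        else:
            store[key] = {**msg, "provenance": "platform_reported"}
    return store
```

Running the same batch twice leaves one entity per platform identifier, which is what makes connector re-syncs safe.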

Pre-Computed Intelligence

The API serves pre-computed artifacts as data fields on entities, not as “smart endpoints.” Computation happens asynchronously (at ingestion time, on a schedule, or on demand).

Examples:

  • Person summary — a condensed profile based on all interactions
  • Conversation summary — distilled overview of a long thread
  • Extracted entities — action items, decisions, key dates pulled from messages

These fields are optional and may not be populated for all entities. When present, they include provenance metadata indicating how and when they were computed.

As intelligence capabilities mature, dedicated operations for on-demand computation (e.g., summarization) may be added. Initially, all intelligence is pre-computed and served as data.

Pagination

Cursor-based pagination across all list queries. Not offset-based — cursors are stable as new data arrives.

  • Cursor — opaque token to resume from a previous position
  • Limit — results per page (with a sensible default and maximum)
  • Response metadata — next cursor (if more results exist), has-more indicator

Cursor semantics support “everything since my last request” patterns, providing a foundation for polling-based sync without requiring streaming infrastructure.
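A sketch of stable, opaque cursors over a `(timestamp, id)` sort key; the base64 encoding is an illustrative choice, not the real token format:

```python
import base64, json

def encode_cursor(position):
    """Opaque cursor: consumers never parse it, only pass it back."""
    return base64.urlsafe_b64encode(json.dumps(position).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor))

def list_page(items, limit=2, cursor=None):
    """One page over items sorted by a stable (timestamp, id) key. Because
    the cursor names a position rather than an offset, pages stay stable
    as new rows arrive."""
    start = 0
    if cursor:
        last = tuple(decode_cursor(cursor))
        start = next(i + 1 for i, it in enumerate(items)
                     if (it["timestamp"], it["id"]) == last)
    page = items[start:start + limit]
    has_more = start + limit < len(items)
    next_cursor = (encode_cursor([page[-1]["timestamp"], page[-1]["id"]])
                   if has_more and page else None)
    return {"items": page, "next_cursor": next_cursor, "has_more": has_more}
```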

Error Handling

Structured error responses designed for programmatic consumers:

  • Machine-readable error codes — specific error identifiers beyond HTTP status codes
  • Actionable detail — if a filter is invalid, identify which one and why. If a time range exceeds limits, state the limits.
  • Consistent shape — every error response follows the same structure regardless of operation
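One possible consistent error shape, with assumed field and code names:

```python
def error_response(code, message, details=None):
    """Every error follows the same structure; 'details' names the
    offending input so machines can act on it, not just report it."""
    return {"error": {"code": code, "message": message,
                      "details": details or {}}}

# e.g. an invalid filter identifies which filter failed and why:
invalid_filter = error_response(
    "INVALID_FILTER",
    "Unknown content type 'vidoe' for filter 'content_type'.",
    {"filter": "content_type", "value": "vidoe",
     "allowed": ["text", "call", "media"]},
)
```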

Rate Limiting

Rate limit communication designed for sequential query patterns:

  • Remaining requests in current window
  • Window reset time
  • Limit ceiling

Different consumer tiers (external consumers vs. ingestion connectors) may have different rate limits. The metadata is designed so AI agents making sequential query chains (search → drill-down → drill-down) can pace themselves without guessing.
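A sketch of how an agent might pace a query chain from this metadata; the function and its inputs are illustrative, not part of the contract:

```python
def pace_delay(remaining, reset_in_seconds, planned_requests):
    """Given rate-limit metadata (requests remaining, seconds until the
    window resets), return the per-request delay an agent should insert
    to finish a planned query chain without exhausting the window."""
    if planned_requests <= remaining:
        return 0.0  # enough headroom: no pacing needed
    # otherwise spread the remaining budget across the window
    return reset_in_seconds / max(remaining, 1)
```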

Future Considerations

Areas intentionally deferred, with the API designed to accommodate them later:

  • Streaming/real-time — webhooks, SSE, or websockets for push-based updates. Cursor-based pagination provides polling-based sync in the interim.
  • Smart operations — on-demand summarization, topic extraction, semantic analysis. Pre-computed artifacts served as data initially.
  • Advanced query language — if composable filters prove insufficient for complex query patterns, a more expressive query interface may be added.
  • Subscription/watch — subscribe to changes on specific entities or query results.

Vision

  • Project Vision — The thesis: the data model and API are the product

Specifications

  • Data Model — The canonical entities this API exposes
  • Entity Resolution — Resolution lifecycle and consumer interaction model
  • Ingestion — Connector architecture and the unified write API in practice
  • API Implementation — Implementation: GraphQL with gqlgen, REST endpoints, authentication middleware
  • Module Interfaces — Store interfaces the API layer consumes

Operational

  • CLI Data Exploration — Terminal-based read commands (ls, show, search) that consume the GraphQL API
  • Using the API — Consumer guide for GraphQL queries, filtering, pagination, and common patterns
  • Operations — Data export via paginated GraphQL queries, troubleshooting

Decisions

  • API paradigm — GraphQL primary with minimal REST surface; see API Implementation
  • Search implementation — PostgreSQL full-text search initially, interface-separated for engine swap; see Search & Indexing
  • Pagination strategy — Relay-style cursor-based connections; see API Implementation
  • Authentication model — split-key API keys resolving to scoped permissions; see Security & Privacy