OriginChain
Industries · catalog, search, recommendations

AI database for media & content. Catalog, search, and recommendations — one substrate.

The problem

Newsrooms and streaming teams keep their catalog in one system, search in another, recommendations in a third, and analytics in a warehouse — and every recommendation feels a day stale.

The OriginChain answer

OriginChain stores articles, asset metadata, embeddings, and engagement events on one substrate. HNSW retrieves the stories most similar to the one a reader just finished: recall@10 = 0.96 with p99 109 ms in the default high_recall mode, or p99 37 ms in fast mode (recall 0.69) for the inner loop of a recommendation rail. BM25 powers headline and body search in 12 ms, SQL aggregates daily retention by cohort, and the same bearer token serves both the recommendation rail and the editor's CMS.

vector recall@10 (100k articles): 0.96
BM25 search p99: < 14 ms
SQL aggregate p99: < 80 ms
tenancy: single-tenant, region-isolated

what they use OriginChain for

One bearer token. One endpoint. Every query shape.

Each example below is a real call against the public HTTP API. Copy the curl, set $OC_TOKEN, and you'll see the same shape of response your app gets in production. Latency numbers are measured against a Storm-tier instance in ap-south-1.

Schemas you'd register

Register these once via oc schema put or the /v1/schema endpoint, and every example below resolves against them.

schema          purpose                       key fields
articles        Article master + body text    article_id · headline · body · published_at · section
articles_embed  Article embeddings            article_id · embedding[768]
assets          Image / video metadata        asset_id · kind · duration_s · rights
events          Engagement events             ts · user_id · article_id · event
users           Reader / subscriber registry  user_id · tier · joined_at

SQL for analytics and reconciliation

Standard SQL with JOIN, GROUP BY, HAVING, and window functions against the same store.

sql POST /v1/sql

Daily cohort retention for the last 14 days

request: SELECT date_trunc('day', u.joined_at) AS cohort_day, COUNT(DISTINCT e.user_id) AS readers FROM users u JOIN events e ON e.user_id = u.user_id WHERE e.ts > now() - interval '14 days' GROUP BY 1 ORDER BY 1
JOIN + group_by · ~78 ms · schemas: users · events
curl
curl -X POST https://oc-acme.ap-south-1.originchain.ai/v1/sql \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"sql":"SELECT date_trunc('\''day'\'', u.joined_at) AS cohort_day, COUNT(DISTINCT e.user_id) AS readers FROM users u JOIN events e ON e.user_id = u.user_id WHERE e.ts > now() - interval '\''14 days'\'' GROUP BY 1 ORDER BY 1"}'
response · application/json
{
  "rows": [
    { "cohort_day": "2026-04-19", "readers": 41028 },
    { "cohort_day": "2026-04-20", "readers": 39812 }
  ],
  "meta": { "latency_ms": 78 }
}
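
If you call the endpoint from application code rather than a shell, building the JSON body with a serializer avoids hand-quoting the SQL string literals. The sketch below is a minimal example using only the Python standard library. The host, path, headers, and the "rows" key are copied from the curl call and response above; the function names are hypothetical, and error handling is left out.

```python
import json
import urllib.request

# Host copied from the curl example; replace with your own instance.
BASE_URL = "https://oc-acme.ap-south-1.originchain.ai"

COHORT_SQL = (
    "SELECT date_trunc('day', u.joined_at) AS cohort_day, "
    "COUNT(DISTINCT e.user_id) AS readers "
    "FROM users u JOIN events e ON e.user_id = u.user_id "
    "WHERE e.ts > now() - interval '14 days' "
    "GROUP BY 1 ORDER BY 1"
)


def build_sql_request(sql: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/sql request."""
    # json.dumps handles all quoting of the SQL text inside the JSON body.
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/sql",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run_sql(sql: str, token: str) -> list:
    """Send the query and return the "rows" array from the response."""
    with urllib.request.urlopen(build_sql_request(sql, token)) as resp:
        return json.load(resp)["rows"]
```

With OC_TOKEN exported in your environment, `run_sql(COHORT_SQL, os.environ["OC_TOKEN"])` would return a list of row objects shaped like the response above.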

Vector search for similarity

HNSW with a tunable speed/recall trade-off. Default high_recall mode: recall@10 = 0.96 at 100k articles, p99 109 ms. Fast mode: p99 37 ms (recall 0.69). Metadata filters are applied during graph traversal.
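
To make the recall figures concrete: this page does not say how they were measured, but recall@k is conventionally the share of the true k nearest neighbours (from an exact, brute-force search) that the approximate index also returns. A minimal sketch of that conventional definition, with hypothetical article ids:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k that the approximate search also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must not be empty")
    found = set(approx_ids[:k]) & truth
    return len(found) / len(truth)


# Hypothetical ids: the approximate search found 9 of the 10 true neighbours.
exact = [f"A-{i}" for i in range(10)]
approx = exact[:9] + ["A-99"]
print(recall_at_k(approx, exact))  # 0.9
```

Read this way, recall@10 = 0.96 means that on average about 9.6 of the 10 true nearest articles appear in each result.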

vector · hnsw POST /v1/vector/topk

Ten articles most similar to the one just finished

request: topk against articles_embed, filtered to the same section
recall@10 = 0.96 · p99 109 ms at 100k articles (high_recall) · schemas: articles_embed · articles
curl
curl -X POST https://oc-acme.ap-south-1.originchain.ai/v1/vector/topk \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": "articles_embed",
    "field":  "embedding",
    "query":  "@A-2026-04-23-1041",
    "k":      10,
    "metric": "cosine",
    "filter": { "section": "world" }
  }'
response · application/json
{
  "rows": [
    { "article_id": "A-2026-04-22-0918", "score": 0.967, "headline": "Cease-fire holds in fourth week" },
    { "article_id": "A-2026-04-21-1144", "score": 0.954, "headline": "Aid corridor reopens after talks" }
  ],
  "meta": { "latency_ms": 109, "index_size": 100000, "mode": "high_recall" }
}
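
For a recommendation rail you would typically build this body per reader and then trim the response. The sketch below mirrors the fields of the curl example in Python; the helper names and the client-side `min_score` cut-off are hypothetical, not part of the API shown here.

```python
import json


def build_topk_payload(article_id, section, k=10):
    """Request body for POST /v1/vector/topk, mirroring the curl example."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "schema": "articles_embed",
        "field": "embedding",
        # "@<id>" is copied from the curl example, where it appears to mean
        # "use the stored embedding of this article as the query vector".
        "query": f"@{article_id}",
        "k": k,
        "metric": "cosine",
        "filter": {"section": section},
    }


def rail_headlines(response, min_score=0.0):
    """Pick (article_id, headline) pairs from a topk response, best first."""
    # min_score is an assumed client-side filter, applied after the call.
    rows = [r for r in response["rows"] if r["score"] >= min_score]
    rows.sort(key=lambda r: r["score"], reverse=True)
    return [(r["article_id"], r["headline"]) for r in rows]


payload = build_topk_payload("A-2026-04-23-1041", "world")
print(json.dumps(payload, indent=2))
```

Passing the sample response above to `rail_headlines` with `min_score=0.96` keeps only the first row, since the second scores 0.954.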

Full-text search with BM25

Unicode tokenizer, stop-words, language stemming. Phrase, OR, and field-scoped queries.
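
For readers new to BM25, the ranking idea fits in a few lines. This is a sketch of a common Okapi BM25 variant, not OriginChain's implementation: the tokenizer is a naive whitespace split with no stop-words or stemming, and `k1 = 1.2`, `b = 0.75` are conventional defaults assumed here, not values taken from this page.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query terms with BM25."""
    n = len(docs)
    if n == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" keeps the value positive.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalised via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


headlines = [
    "cease-fire holds in fourth week",
    "aid corridor reopens after talks",
    "markets rally as talks resume",
]
docs = [h.split() for h in headlines]
scores = bm25_scores(["talks"], docs)
```

Here the first headline scores zero because it never mentions "talks", while the other two tie: each contains the term once and all three headlines have the same length.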

Natural-language questions

Plain English in. JSON out. Compiled plan cached after first touch.

why one substrate

Cross-shape consistency, by construction.

When SQL, vector, full-text, and graph all read from the same hash-keyed k/v store, a row written at 09:14:02.118 is visible to every shape on the next read. No ETL window, no replication lag, no consistency tax across vendors.

single-tenant

Region-isolated dedicated instance

Your data sits in your region, on a dedicated instance with its own keys and its own resource budget. No noisy neighbours. No shared control plane.

durable

PITR + cross-AZ replication

Every write goes to a durable WAL, replicated to a hot standby in a second AZ. Restore to any second in your retention window.

observable

OTLP metrics + audit log

Per-key latency histograms, hit rate on the plan cache, and an append-only audit log of every privileged action — exported via OTLP to your observability stack.

ready when you are

Ninety seconds to an endpoint. No stack to wire up.

Pick a region, pick a tier, and we provision a single-tenant instance on AWS. The first query you send is the first query we'll show you how to write — in English.

talk to a human