SELECT id, amount, status
FROM orders
WHERE status = 'pending'
AND amount > 100
LIMIT 50 Limit(50)
└─ Project[id, amount, status]
└─ Filter(status = 'pending' AND amount > 100)
└─ Scan(orders) OriginChain runs SQL with full JOIN support against the same hash-keyed substrate that powers vector search, BM25 full-text, and graph traversal. POST a SQL string, get JSON rows back. p99 latency under 8 ms on indexed reads. SELECT, INSERT, DELETE, GROUP BY, HAVING, plus INNER, LEFT, RIGHT, and FULL OUTER joins on up to five tables (left-deep).
The parser is sqlparser (GenericDialect).
The translator folds the AST into an oc_query::Plan
tree and runs it. Anything outside the supported surface returns a typed
TranslateError::Unsupported at parse time — never a silent surprise at runtime.
POST /v1/tenants/:tenant/sql
Content-Type: application/json
Authorization: Bearer $OC_TOKEN
{ "sql": "SELECT ... " }
The translator never executes SQL directly. It folds the AST into a tree of physical
operators — Scan,
Filter,
Project,
Limit,
Aggregate,
HashJoin,
OuterJoin. The same
tree shape that /v1/ask
produces from natural language. The same plan cache. The same executor.
SQL string
│
▼
sqlparser AST ──────► TranslateError::Unsupported (typed, at parse time)
│
▼
oc_query::Plan tree ──► plan cache ──► executor ──► rows JSON
Six shapes. Each one has the SQL on the left and the Plan tree it produces on the right. AND-only
WHERE; literals are integer, string, float, boolean. Comparison operators
= != < <= > >= plus
IN (..).
SELECT id, amount, status
FROM orders
WHERE status = 'pending'
AND amount > 100
LIMIT 50 Limit(50)
└─ Project[id, amount, status]
└─ Filter(status = 'pending' AND amount > 100)
└─ Scan(orders) SELECT status, COUNT(*), SUM(amount), AVG(amount)
FROM orders
GROUP BY status Aggregate[group=status,
agg=count(*), sum(amount), avg(amount)]
└─ Scan(orders) SELECT status, COUNT(*) AS n
FROM orders
GROUP BY status
HAVING COUNT(*) > 100 Filter(count(*) > 100)
└─ Aggregate[group=status, agg=count(*)]
└─ Scan(orders) SELECT o.id, o.amount, c.name
FROM orders o
INNER JOIN clients c ON o.client_id = c.id
WHERE o.status = 'pending' Project[o.id, o.amount, c.name]
└─ HashJoin(o.client_id = c.id)
├─ Filter(o.status = 'pending')
│ └─ Scan(orders)
└─ Scan(clients) SELECT o.id, c.name, r.region
FROM orders o
INNER JOIN clients c ON o.client_id = c.id
INNER JOIN regions r ON c.region_id = r.id HashJoin(c.region_id = r.id)
├─ HashJoin(o.client_id = c.id)
│ ├─ Scan(orders)
│ └─ Scan(clients)
└─ Scan(regions) SELECT c.id, c.name, o.amount
FROM clients c
LEFT OUTER JOIN orders o ON c.id = o.client_id OuterJoin[kind=Left, on=c.id = o.client_id,
null_fill=orders.*]
├─ Scan(clients)
└─ Scan(orders) INSERT INTO t (cols...) VALUES (..)
translates to the /v1/rows/:t write
path — one WAL frame per batch, atomic with all index, FTS, vector, and relation maintenance.
DELETE FROM t WHERE pk = ..
requires equality on the manifest's declared primary key.
INSERT INTO orders (id, amount, status)
VALUES ('0f8a-...', 49.90, 'pending'),
('1c2b-...', 12.00, 'paid');
DELETE FROM orders WHERE id = '0f8a-...'; UPDATE is intentionally not in the surface. Use
PUT /v1/rows/:t
(overwrite-on-pk) or put_row_cas
(compare-and-set on _oc_row_version) instead.
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/sql" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "SELECT id, amount, status FROM orders WHERE status = '\''pending'\'' LIMIT 50"
}'import urllib.request, json, os
req = urllib.request.Request(
f"{os.environ['OC_ENDPOINT']}/v1/tenants/{os.environ['T']}/sql",
method="POST",
headers={
"Authorization": f"Bearer {os.environ['OC_TOKEN']}",
"Content-Type": "application/json",
},
data=json.dumps({
"sql": "SELECT id, amount, status FROM orders WHERE status = 'pending' LIMIT 50"
}).encode(),
)
with urllib.request.urlopen(req) as r:
print(json.loads(r.read())) curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/sql" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "INSERT INTO orders (id, amount, status) VALUES ('\''0f8a-...'\'', 49.90, '\''pending'\'')"
}'oc.sql("INSERT INTO orders (id, amount, status) "
"VALUES ('0f8a-...', 49.90, 'pending')") curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/sql" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d @- <<'JSON'
{
"sql": "SELECT o.id, o.amount, c.name FROM orders o INNER JOIN clients c ON o.client_id = c.id WHERE o.status = 'pending' LIMIT 100"
}
JSONrows = oc.sql("""
SELECT o.id, o.amount, c.name
FROM orders o
INNER JOIN clients c ON o.client_id = c.id
WHERE o.status = 'pending'
LIMIT 100
""")
Anything not in the supported-shape list above returns a typed
TranslateError with a 400 and a
pointer at the right primitive. No silent reinterpretation, no implicit fallback.
UPDATE Use PUT /v1/rows/:t (overwrite) or put_row_cas (compare-and-set on _oc_row_version). BEGIN / COMMIT inside SQL Use single-row CAS via the typed rows API. DDL — CREATE / ALTER / DROP Use POST /v1/schemas with a TOML manifest. CROSS JOIN, comma joins, NATURAL JOIN, USING Refused — be explicit with ON. Subqueries (scalar, EXISTS, IN-SELECT) Express as a JOIN with an inner aggregate or filter. Window functions (OVER) Use GROUP BY for partitioned aggregates. CTE (WITH) Express the CTE as a separate query and JOIN the result. ORDER BY Order client-side; LIMIT alone returns deterministic key-prefix order. OR in WHERE Run two AND-only queries and union client-side. COUNT(DISTINCT col) Use GROUP BY col then COUNT(*). Bind params ($1, ?) Inline literals — the parser strips them deterministically.