Challenge
Art market data comes in two forms, and neither is enough on its own. You can license raw tabular auction results — thousands of rows, no interpretation — or buy aggregated artist sales reports that summarize rather than interrogate. What neither provides is an intelligence layer: custom answers to custom questions, contextualised for a specific work, a specific collector, a specific negotiation.
For that, professionals have had two options: hire an independent art advisor (expensive, scarce, and opinionated) or lean on a gallery director who is, structurally, a salesperson. Asking a foundation model such as ChatGPT or Claude directly is a third option, but it brings problems of its own: partial auction data in training sets, stale gallery representation records, and a tendency to hallucinate specifics that art market professionals notice immediately. Industry trust is hard to build and easy to lose.
Archibald fills that gap.
Approach
Archibald is an AI art advisor you email like a colleague. Ask about an artist’s auction trajectory, recent comparable sales, or exhibition history — a structured analysis lands in your inbox within seconds. No app to download, no dashboard to learn, no new software to embed in an existing workflow.
The interface choice was deliberate and counterintuitive. Art professionals live in their inbox. Building another app would mean another login, another tab, another onboarding conversation. Email removes all of that friction and puts Archibald exactly where the work already happens.
The architectural challenge was that art market data doesn’t live in one form. Structured data — 8M+ auction results with prices, dates, sale houses, and lot details — needed precise filtering, so I chose text-to-SQL to query it directly. Narrative data — provenance records, exhibition histories, artist biographies — required semantic search, where meaning matters more than exact match. Two retrieval strategies, one coherent response.
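As a rough illustration of the two retrieval styles, here is a minimal sketch. It assumes a hypothetical `auction_results` schema and uses hand-written three-number vectors in place of real embeddings. In a text-to-SQL setup the model would generate the query; it is written by hand here only to show the kind of exact filtering involved. None of these names come from Archibald's codebase.

```python
import math
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a hypothetical auction schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auction_results ("
        "artist TEXT, sale_house TEXT, sale_year INTEGER, price_usd INTEGER)"
    )
    conn.executemany(
        "INSERT INTO auction_results VALUES (?, ?, ?, ?)",
        [
            ("Artist A", "House X", 2021, 120_000),
            ("Artist A", "House Y", 2023, 180_000),
            ("Artist B", "House X", 2023, 45_000),
        ],
    )
    return conn


def recent_sales(conn: sqlite3.Connection, artist: str, min_year: int) -> list:
    """Structured retrieval: exact filters on artist and sale year."""
    cursor = conn.execute(
        "SELECT sale_house, sale_year, price_usd FROM auction_results "
        "WHERE artist = ? AND sale_year >= ? ORDER BY sale_year DESC",
        (artist, min_year),
    )
    return cursor.fetchall()


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec: list[float], records: list, top_k: int = 1) -> list[str]:
    """Narrative retrieval: rank (text, vector) records by similarity to the query."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The SQL path returns only rows that match the filters exactly, while the vector path returns whichever records sit closest in meaning, even when no keyword matches.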
Grounding responses in current, comprehensive market data — rather than relying on a base model’s training set — reduces hallucinations by roughly 90%. The model handles language and reasoning; the database handles facts.
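One common way to enforce that split is to hand the model only retrieved records and tell it to stay within them. The sketch below is illustrative: the function name and the prompt wording are assumptions, not Archibald's actual prompt.

```python
def build_grounded_prompt(question: str, facts: list[str]) -> str:
    """Assemble a prompt that restricts the model to retrieved records."""
    if facts:
        # Number each record so the answer can point back to its evidence.
        evidence = "\n".join(f"[{i}] {fact}" for i, fact in enumerate(facts, start=1))
    else:
        evidence = "(no matching records found)"
    return (
        "Answer using only the numbered records below. "
        "If they do not contain the answer, say so.\n\n"
        f"Records:\n{evidence}\n\n"
        f"Question: {question}"
    )
```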
Architecture
Archibald runs as a collection of loosely coupled Docker services, each with a single responsibility. The flow is linear: inbound email → email-monitor → inference-engine → data layer → reply via JMAP.
- email-monitor — A custom JMAP client connected directly to the mail server. Polls for inbound messages, parses content, authenticates the user, maintains message history in PostgreSQL, and hands the request off.
- inference-engine — The LLM orchestration layer. Interprets the query, plans tool calls, and runs a multi-step reasoning loop — querying structured data, triggering semantic search, initiating live scrapes, or some combination. OpenAI and Gemini for full inference; Groq for lightweight intermediate classification. Composes the final response and passes it back to email-monitor.
- data layer — Serves two query strategies behind a unified interface: text-to-SQL for structured auction data, vector search for unstructured records. The inference engine remains agnostic to the strategy in use.
- on-demand scrapers — Triggered when a query requires data not yet in the database — fetch, parse, and ingest in real time.
- background scrapers — Run on a continuous schedule, harvesting auction results and artist records. No user-triggered latency.
- Cloud database — PostgreSQL (structured) and a vector store (unstructured) in managed cloud infrastructure. Both accessed exclusively through the data layer.
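To make the email-monitor's polling step concrete, the sketch below builds a JMAP request body that lists the newest messages in a mailbox and fetches their contents in the same round trip. The method names and the back-reference syntax follow the JMAP specifications (RFC 8620 and RFC 8621); the function name, the account and mailbox ids, and the chosen properties are placeholders, and Archibald's actual client may poll differently.

```python
JMAP_CAPABILITIES = ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"]


def build_poll_request(account_id: str, inbox_id: str, limit: int = 10) -> dict:
    """Build a JMAP request that queries recent inbox messages and fetches them."""
    return {
        "using": JMAP_CAPABILITIES,
        "methodCalls": [
            [
                "Email/query",
                {
                    "accountId": account_id,
                    "filter": {"inMailbox": inbox_id},
                    # Newest messages first.
                    "sort": [{"property": "receivedAt", "isAscending": False}],
                    "limit": limit,
                },
                "q0",
            ],
            [
                "Email/get",
                {
                    "accountId": account_id,
                    # Back-reference: reuse the ids returned by the query call above.
                    "#ids": {"resultOf": "q0", "name": "Email/query", "path": "/ids"},
                    "properties": ["from", "subject", "receivedAt", "textBody", "bodyValues"],
                    "fetchTextBodyValues": True,
                },
                "g0",
            ],
        ],
    }
```

The resulting dictionary would be serialized as JSON and POSTed to the API URL advertised in the server's JMAP session resource.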
The email-as-interface constraint shaped this architecture: no UI to fall back on meant every service boundary had to be clean and every failure mode had to degrade gracefully. Either reply with a useful answer, or reply with a clear explanation — no spinners, no error pages.
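The answer-or-explanation rule can be sketched as a wrapper that always returns an email body. The function name, the failure categories, and the message wording below are assumptions for illustration, not Archibald's actual error handling.

```python
from typing import Callable


def compose_reply(question: str, answer_fn: Callable[[str], str]) -> str:
    """Always return an email body: the answer, or a plain explanation."""
    try:
        answer = answer_fn(question)
    except TimeoutError:
        # A slow upstream source becomes a retry suggestion, not silence.
        return (
            "I could not reach the market data in time. "
            "Please send your question again in a few minutes."
        )
    except Exception:
        # Any other failure still produces a readable reply.
        return "Something went wrong while researching this, so I have no answer yet."
    if not answer.strip():
        return "I could not find records relevant to this question."
    return answer
```

Because every branch returns text, the sender always receives a reply, whatever happened upstream.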
Results & Impact
Archibald is in private beta with art professionals across the US. Early usage shows consistent patterns:
- 5× faster research — Multi-year market analysis that previously required hours of manual lookup now arrives in seconds
- 90% reduction in hallucinations — Grounding every response in comprehensive, current market data rather than base model training sets
- Zero interface friction — No downloads, no logins, no new tabs; users stay in their existing workflow
“By the time I arrived at the restaurant, multi-year market analysis and recent comps were waiting in my inbox. Archibald helped me close a sale that evening.”
— Senior Director, Buchmann Galerie
Related Writing
- Giving an LLM access to a million rows: The structured data architecture that actually works