Originally published by Cyril Scetbon on June 16, 2025
Building a smart knowledge agent with SurrealDB and Rig.rs
In my previous post, I explored why SurrealDB is unlike any other database: it blends document, graph, relational, and vector models into one engine - and it's written in Rust.
That flexibility makes it a perfect match for Rig.rs, a toolkit for building LLM-native agents in Rust. Today, I’m going to show exactly how the two fit together by building a basic knowledge agent - one that embeds word definitions and answers questions using that content without having to manage the lookup yourself.
Let me rig this thing. 🔧
🧰 Use case: RAG that just works
I’m building a context-aware support agent that:
Ingests internal documents
Embeds them into vector space
Stores everything in SurrealDB
Finds relevant documents using cosine similarity
Passes those to an LLM to generate a response
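To make the retrieval step concrete, here is a minimal, standard-library-only sketch of ranking documents by cosine similarity. It is not how SurrealDB or Rig implement the search internally; `StoredDoc` and `top_n` are hypothetical names I made up for illustration, and the embeddings are assumed to be plain `f64` vectors.

```rust
/// Cosine similarity between two vectors.
/// Returns 0.0 if the lengths differ or either vector has zero magnitude.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical in-memory stand-in for a stored document and its embedding.
struct StoredDoc {
    text: String,
    embedding: Vec<f64>,
}

/// Returns the `n` documents most similar to `query`, best match first.
fn top_n<'a>(query: &[f64], docs: &'a [StoredDoc], n: usize) -> Vec<(&'a StoredDoc, f64)> {
    let mut scored: Vec<(&StoredDoc, f64)> = docs
        .iter()
        .map(|d| (d, cosine_similarity(query, &d.embedding)))
        .collect();
    // Sort by descending similarity score.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(n);
    scored
}
```

Calling `top_n(&query_embedding, &docs, 3)` would return the three closest documents along with their scores. In the real setup, the vector store does this ranking for us.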
🚀 Step 1: Initiate the project
The first step is to create the project and add all the dependencies we're going to need:
cargo new surrealdb-rig
cd surrealdb-rig
cargo add anyhow rig-surrealdb serde surrealdb tokio
cargo add rig-core --features derive
I’ll need an OpenAI key in my environment for the program to work.
To ingest documents, we need a structure that derives at least the Serialize and Embed macros. The Serialize derive macro allows us to serialize data to store in SurrealDB, while the Embed derive macro lets us mark the definition field as the source for embeddings. We can now ingest and embed content using SurrealStore from Rig:
let words = vec![
    WordDefinition {
        word: "flurbo".to_string(),
        definition: "A fictional currency from Rick and Morty.".to_string(),
    },
    WordDefinition {
        word: "glarb-glarb".to_string(),
        definition: "A creature from the marshlands of Glibbo.".to_string(),
    },
    WordDefinition {
        word: "wubba-lubba".to_string(),
        definition: "A catchphrase popularized by Rick Sanchez.".to_string(),
    },
    WordDefinition {
        word: "schmeckle".to_string(),
        definition: "A small unit of currency in some fictional universes.".to_string(),
    },
    WordDefinition {
        word: "plumbus".to_string(),
        definition: "A common household device with an unclear purpose.".to_string(),
    },
    WordDefinition {
        word: "zorp".to_string(),
        definition: "A term used to describe an alien greeting.".to_string(),
    },
];
Creates a WordDefinition struct, where embeddings (vectors of floating-point numbers used to find the content relevant to a question) are generated from the definition field.
Embeds the content using OpenAI's text-embedding-3-small embedding model, which is small and highly efficient
Stores these embeddings in SurrealDB
Ensures everything is searchable with metadata
🤖 Step 3: Ask the agent
Now, I connect a RAG agent and ask it a few questions:
...
let linguist_agent = client
    .agent(openai::GPT_4_1_NANO)
    .preamble("You are a linguist. If you don't know don't make up an answer.")
    .dynamic_context(3, vector_store)
    .build();
let prompts = vec![
    "What is a zorp?",
    "What's the word that corresponds to a small unit of currency?",
    "What is a gloubi-boulga?",
];
Creates an LLM agent using GPT-4.1 Nano, known as the fastest and most cost-effective GPT-4.1 model.
Uses the agent to find the meaning of a word it recognizes.
Uses the agent to find a word based on a known definition.
Asks about an unknown word to ensure it doesn't make up an answer.
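Conceptually, the dynamic context step means the documents retrieved for a question are handed to the model alongside the preamble and the question itself. Here is a rough sketch of that idea in plain Rust. The `build_prompt` helper and its output layout are assumptions of mine for illustration only; Rig's actual prompt format is not shown in this post and may well differ.

```rust
/// Hypothetical helper: combines a preamble, retrieved documents, and the
/// user's question into one prompt string. Illustrative layout only.
fn build_prompt(preamble: &str, context: &[&str], question: &str) -> String {
    let mut prompt = String::new();
    prompt.push_str(preamble);
    prompt.push_str("\n\n");
    // Only add a context section when retrieval returned something.
    if !context.is_empty() {
        prompt.push_str("Context:\n");
        for doc in context {
            prompt.push_str("- ");
            prompt.push_str(doc);
            prompt.push('\n');
        }
        prompt.push('\n');
    }
    prompt.push_str("Question: ");
    prompt.push_str(question);
    prompt
}
```

With a preamble like the one above and up to three retrieved definitions as `context`, the model sees both the instruction not to invent answers and the material it is allowed to draw on.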
🤖 Full example
You can test the entire example by cloning the repository cscetbon/surrealdb-rig and running the program.
git clone https://github.com/cscetbon/surrealdb-rig
cd surrealdb-rig
cargo r
...
A zorp is a term used to describe an alien greeting.
The word that corresponds to a small unit of currency in some fictional universes is "schmeckle."
A gloubi-boulga is a fictional dish from the animated series "Fraggle Rock," created by Jim Henson. It is known as a colorful, messy, and somewhat unappetizing mixture of various ingredients assembled by the character Gunge. The term is used humorously to describe a confusing or disorderly mixture, but it is not a real word or culinary item outside of that context.
🕵️ Under the hood: how Rig stores data in SurrealDB
Internally, when documents are stored in SurrealDB, Rig uses a query similar to the following for each item:
CREATE ONLY documents CONTENT {
    document: '{"word":"schmeckle", "definition":"A small unit of currency ..."}',
    embedded_text: 'A small unit of currency ...',
    embedding: [...]
}