# Embedding models and build channels

Which BGE model the brain uses, why it can't change after init, and how the airgapped and online channels differ.
GigaBrain runs the embedding model itself, on the CPU you already own. The defaults are tuned for a useful tradeoff between quality and binary size; the online build lets you trade those off at install time.
## The model menu

| Alias | Model ID | Dimensions | Notes |
|---|---|---|---|
| small (default) | BAAI/bge-small-en-v1.5 | 384 | Fastest. Embedded in the airgapped binary. |
| base | BAAI/bge-base-en-v1.5 | 768 | Stronger English recall. Online build only. |
| large | BAAI/bge-large-en-v1.5 | 1024 | Highest English recall. Online build only. |
| m3 | BAAI/bge-m3 | 1024 | Multilingual. Online build only. |
You can also pass a full Hugging Face model ID as --model <ORG>/<NAME> on the online channel; it’s resolved at runtime, downloaded, and cached.
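For illustration, the alias table above can be expressed as a simple lookup. This is a hypothetical sketch, not GigaBrain's actual resolver; the real online channel additionally accepts any full Hugging Face model ID:

```rust
/// Map a CLI alias to its Hugging Face model ID and embedding width.
/// Mirrors the model menu table; anything unrecognized falls through to
/// runtime resolution as a full <ORG>/<NAME> ID (online channel only).
fn resolve_alias(alias: &str) -> Option<(&'static str, usize)> {
    match alias {
        "small" => Some(("BAAI/bge-small-en-v1.5", 384)),
        "base" => Some(("BAAI/bge-base-en-v1.5", 768)),
        "large" => Some(("BAAI/bge-large-en-v1.5", 1024)),
        "m3" => Some(("BAAI/bge-m3", 1024)),
        _ => None, // full HF IDs are resolved and cached at runtime instead
    }
}
```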
## Why BGE

We picked the BGE family because:

- Quality. Strong on retrieval benchmarks (BEIR) for the size class.
- Open weights. No license entanglement.
- Sane defaults. bge-small-en-v1.5 punches above its weight on personal-knowledge corpora.
We use candle for inference — pure Rust, no Python or ONNX runtime. The whole pipeline (tokenizer → model forward pass → pooled embedding) runs inside the binary.
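To make the last stage of that pipeline concrete, here is a minimal sketch of the pooling step: mean-pool the per-token vectors into one fixed-width embedding, then L2-normalize it. The tokenizer and the candle forward pass are elided; this is an illustration of the operation, not GigaBrain's actual code.

```rust
/// Mean-pool per-token vectors into a single embedding, then L2-normalize
/// so that dot product equals cosine similarity.
fn mean_pool(token_vecs: &[Vec<f32>]) -> Vec<f32> {
    let dim = token_vecs[0].len();
    let mut out = vec![0.0f32; dim];
    // Sum each dimension across tokens.
    for v in token_vecs {
        for (o, x) in out.iter_mut().zip(v) {
            *o += x;
        }
    }
    // Divide by token count to get the mean.
    let n = token_vecs.len() as f32;
    for o in &mut out {
        *o /= n;
    }
    // L2-normalize (guard against a zero vector).
    let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt().max(1e-12);
    for o in &mut out {
        *o /= norm;
    }
    out
}
```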
## Two build channels

The same source compiles two binaries, selected by Cargo features:
| Channel | Features | Binary size | Model assets |
|---|---|---|---|
| Airgapped (default) | embedded-model | ~180 MB | BGE-small bytes embedded via include_bytes! |
| Online | bundled,online-model | ~90 MB | Cached on first semantic use under ~/.gbrain/models/ |
```sh
# Default — airgapped
cargo build --release

# Online build
cargo build --release --no-default-features --features bundled,online-model
```

On the airgapped binary, --model and GBRAIN_MODEL are warning-only no-ops: the embedded model is the only one that will ever load. The warning fires loudly enough that it’s hard to miss in CI.
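The channel split could look something like the following in source. This is a hypothetical sketch (names and paths invented): the airgapped feature bakes the weights into the binary via include_bytes!, while the online build resolves a cache directory instead. Compiled standalone without Cargo features, the sketch takes the online path.

```rust
use std::path::{Path, PathBuf};

enum ModelSource {
    Embedded(&'static [u8]), // weights compiled into the binary
    Cached(PathBuf),         // downloaded and cached on first semantic use
}

fn model_source(home: &Path) -> ModelSource {
    if cfg!(feature = "embedded-model") {
        // In a real airgapped build the bytes would come from something like:
        // static WEIGHTS: &[u8] = include_bytes!("../assets/bge-small.safetensors");
        ModelSource::Embedded(&[])
    } else {
        // Online channel: models land under ~/.gbrain/models/.
        ModelSource::Cached(home.join(".gbrain/models"))
    }
}
```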
## What init records

gbrain init writes the active model into the immutable brain_config table:
| Key | Example | Meaning |
|---|---|---|
| model_id | BAAI/bge-large-en-v1.5 | The full Hugging Face model ID. |
| model_alias | large | The alias the user passed (or default). |
| embedding_dim | 1024 | The vector width — drives which page_embeddings_vec_<dim> table is used. |
| schema_version | 5 | The schema generation. |
On every subsequent open, the binary verifies that the requested model matches the recorded one. A mismatch fails the call before any command runs. This isn’t a UX choice; it’s a correctness one — embeddings produced by different models live in different vector spaces and aren’t comparable.
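The open-time guard can be sketched as follows (struct and function names hypothetical, not GigaBrain's actual API): compare the requested model against the recorded one, and only on a match derive the vector table name from the recorded width.

```rust
/// A slice of the recorded brain_config relevant to the open-time check.
struct BrainConfig {
    model_id: String,
    embedding_dim: usize,
}

/// Fail before any command runs if the requested model doesn't match the
/// one recorded at init; on success, return the vector table this brain uses.
fn verify_model(cfg: &BrainConfig, requested_id: &str) -> Result<String, String> {
    if cfg.model_id != requested_id {
        return Err(format!(
            "model mismatch: brain was initialized with {}, got {}",
            cfg.model_id, requested_id
        ));
    }
    // The recorded width selects the table, e.g. page_embeddings_vec_1024.
    Ok(format!("page_embeddings_vec_{}", cfg.embedding_dim))
}
```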
## Switching models

You can’t change a brain’s model in place. The vector space is part of the database identity. To switch:

1. gbrain init ~/brain-large.db --model large (online build).
2. gbrain export ~/brain.db ~/brain-export/ (or just import your source markdown directory again).
3. gbrain import ~/brain-export/ --db ~/brain-large.db.
4. Drop ~/brain.db when you’re satisfied.
See Switch embedding models for the full recipe.
## When to use which model

- small (default) — Almost everyone, almost always. Fast, accurate enough for personal-knowledge scale, ships in the airgapped binary.
- base / large — When you have ≥100K pages and you’re noticing recall degradation on paraphrased queries. Costs you binary size and per-query latency.
- m3 — When your corpus is multilingual or you specifically want the m3 retrieval profile.
- A specific HF model ID — When you’ve benchmarked something specific to your domain and you want to use it.
We do not recommend swapping models without a measured reason. The defaults are tuned for the median user.
## What about API embeddings?

Out of scope. The whole point of GigaBrain is that the brain doesn’t talk to the network. If you want OpenAI or Voyage embeddings, you’re in the wrong tool — but those tools work fine alongside GigaBrain for other concerns.
## Stale embeddings

When brain_put updates a page, the embedding job queue (embedding_jobs) re-chunks and re-embeds whatever changed. If you suspect drift — a model upgrade, a chunking-strategy change, an aborted batch — gbrain embed --stale re-embeds only chunks whose content_hash no longer matches; gbrain embed --all rebuilds everything.
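The --stale selection rule amounts to a hash comparison. A minimal sketch, with the hashing scheme swapped for std's DefaultHasher (the real content_hash is unspecified here): a chunk is re-embedded only when its current text no longer hashes to the value stored alongside its embedding.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for the real content hash; any stable digest of the chunk text works.
fn content_hash(text: &str) -> u64 {
    let mut h = DefaultHasher::new();
    text.hash(&mut h);
    h.finish()
}

/// A chunk as stored: its current text plus the hash recorded when it was embedded.
struct Chunk {
    text: String,
    stored_hash: u64,
}

/// --stale re-embeds exactly the chunks where this returns true.
fn is_stale(chunk: &Chunk) -> bool {
    content_hash(&chunk.text) != chunk.stored_hash
}
```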