Configuration¶
Everything you need to configure MemoryMesh: embedding providers, storage paths, encryption, and relevance tuning.
Embedding Providers¶
MemoryMesh supports multiple embedding backends. Choose the one that fits your constraints:
| Provider | Install | Requires | Best For |
|---|---|---|---|
| none | pip install memorymesh | Nothing | Getting started, keyword-based matching |
| local (default) | pip install memorymesh[local] | ~500MB model download | Privacy-sensitive apps, offline use |
| ollama | pip install memorymesh[ollama] | Running Ollama instance | Local semantic search, GPU acceleration |
| openai | pip install memorymesh[openai] | OpenAI API key | Highest quality embeddings |
# Use local embeddings (runs on your machine, no API calls) -- this is the default
memory = MemoryMesh(embedding="local")
# Use Ollama (connect to local Ollama server)
memory = MemoryMesh(embedding="ollama", ollama_model="nomic-embed-text")
# Use OpenAI embeddings
memory = MemoryMesh(embedding="openai", openai_api_key="sk-...")
# No embeddings (pure keyword matching, zero dependencies)
memory = MemoryMesh(embedding="none")
Using Ollama¶
What is Ollama?¶
Ollama is a free, open-source application that runs AI models locally on your machine. Think of it like a local server -- it runs in the background and applications connect to it over HTTP. MemoryMesh uses Ollama for one specific purpose: converting text into numerical vectors (embeddings) that enable semantic search.
Without Ollama (or another embedding provider), MemoryMesh falls back to keyword matching -- recall("testing") will only find memories containing the exact word "testing". With Ollama, MemoryMesh understands meaning -- recall("testing") finds memories about "pytest", "unit tests", "test coverage", and "CI pipeline" because they are semantically related.
How it works¶
┌──────────────────┐ HTTP (localhost:11434) ┌──────────────────┐
│ Your AI Tool │ │ │
│ (Claude Code, │ │ Ollama │
│ Gemini CLI, │ │ (background │
│ Cursor, etc.) │ │ service) │
│ │ │ │ │
│ MemoryMesh │ ───── "embed this text" ──────────> │ nomic-embed- │
│ (MCP server) │ <──── [0.02, -0.15, 0.89, ...] ─── │ text model │
│ │ 768 numbers back │ │
└──────────────────┘ └──────────────────┘
│
SQLite DB
(memories.db)
- When you call remember("User prefers dark mode"), MemoryMesh sends the text to Ollama
- Ollama runs the embedding model and returns a vector of 768 numbers representing the meaning
- MemoryMesh stores the text + vector in SQLite
- When you call recall("theme preferences"), MemoryMesh embeds the query and finds stored memories with similar vectors
- This is why recall("theme preferences") finds "User prefers dark mode" even though no words match
Step 1: Install Ollama¶
Ollama is a separate application. Install it first:
macOS: install the desktop app from ollama.com/download, or use Homebrew (command below).
Linux: use the official install script (command below).
Windows: Download from ollama.com/download.
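For reference, the usual commands are shown here; check ollama.com/download if they have changed for your platform.

# macOS (Homebrew)
brew install ollama

# Linux (official install script)
curl -fsSL https://ollama.com/install.sh | sh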
Step 2: Start Ollama¶
Ollama runs as a background service on port 11434. You start it once and it stays running.
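If it is not already running, it can typically be started like this (the exact command depends on how you installed it):

# Run Ollama in the foreground (any OS)
ollama serve

# Or run it as a managed background service (macOS, Homebrew)
brew services start ollama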
Already running? If you see "address already in use", Ollama is already running. This is fine -- on macOS with Homebrew it often auto-starts.
# Check if Ollama is running
brew services info ollama # macOS (Homebrew)
curl http://localhost:11434 # Any OS -- should return "Ollama is running"
Step 3: Pull the embedding model¶
Download the embedding model that MemoryMesh will use. This is a one-time download (~274MB):
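The one-time pull for the default model:

ollama pull nomic-embed-text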
What is nomic-embed-text? It is an embedding model, not a chat model. It does one thing: convert text into a vector of 768 numbers that capture semantic meaning. Similar texts produce similar vectors. This is what powers MemoryMesh's semantic search.
| Embedding Model | Dimensions | Size | Quality | Speed |
|---|---|---|---|---|
| nomic-embed-text | 768 | 274MB | Very good | Fast |
| all-minilm | 384 | 46MB | Good | Very fast |
| mxbai-embed-large | 1024 | 670MB | Best | Slower |
nomic-embed-text is the recommended default -- good quality, reasonable size, fast. You can use a different model by passing ollama_model="model-name".
Step 4: Install and configure MemoryMesh¶
# Install MemoryMesh (Ollama support uses only stdlib -- no extra deps needed)
pip install memorymesh
Important: You do NOT need a special install for Ollama support. pip install memorymesh is sufficient because MemoryMesh communicates with Ollama via HTTP using Python's built-in urllib -- no extra packages required.
As a Python library:
from memorymesh import MemoryMesh
memory = MemoryMesh(
embedding="ollama",
ollama_model="nomic-embed-text", # default
# ollama_base_url="http://localhost:11434", # default, change if Ollama is on another machine
)
As an MCP server (for Claude Code, Gemini CLI, Cursor, etc.):
{
"mcpServers": {
"memorymesh": {
"command": "memorymesh-mcp",
"env": {
"MEMORYMESH_EMBEDDING": "ollama",
"MEMORYMESH_OLLAMA_MODEL": "nomic-embed-text"
}
}
}
}
Step 5: Verify it works¶
# Quick test
python -c "
from memorymesh import MemoryMesh
m = MemoryMesh(embedding='ollama')
m.remember('User prefers Python and dark mode')
results = m.recall('programming language preferences')
print(results[0].text if results else 'No results')
m.close()
"
If you see "User prefers Python and dark mode", semantic search is working.
FAQ¶
Q: Does Ollama need to be running in the same terminal as MemoryMesh?
No. Ollama is a background service. Once started, it listens on port 11434. MemoryMesh connects to it via HTTP from any process. You can run pip install, memorymesh-mcp, and your AI tools in any terminal.
Q: Does Ollama use my GPU?
Yes, if available. Ollama automatically uses your GPU (CUDA on Linux/Windows, Metal on macOS) for faster inference. The embedding model is small enough that CPU is also fast (~10ms per embedding).
Q: Can I use Ollama on a remote server?
Yes. Set ollama_base_url="http://your-server:11434" or MEMORYMESH_OLLAMA_BASE_URL env var. MemoryMesh will connect to the remote Ollama instance.
Q: What if Ollama is not running when MemoryMesh starts?
MemoryMesh gracefully falls back to keyword-only search. Your memories are still stored and recalled, just without semantic matching. Start Ollama and the next recall() will use embeddings automatically.
Constructor Options¶
from memorymesh import MemoryMesh
memory = MemoryMesh(
# Storage (dual-store)
path=".memorymesh/memories.db", # Project-specific database (optional)
global_path="~/.memorymesh/global.db", # User-wide global database
# Embeddings
embedding="local", # "none", "local", "ollama", "openai"
# Embedding provider options (passed as **kwargs)
# ollama_model="nomic-embed-text", # Ollama model name
# ollama_base_url="http://localhost:11434",
# openai_api_key="sk-...", # OpenAI API key
# local_model="all-MiniLM-L6-v2", # sentence-transformers model
# local_device="cpu", # PyTorch device
# Encryption (optional)
# encryption_key="my-secret-key", # Encrypt text and metadata at rest
# Relevance tuning (optional)
# relevance_weights=RelevanceWeights(
# semantic=0.5,
# recency=0.2,
# importance=0.2,
# frequency=0.1,
# ),
)
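As a concrete combination, the sketch below enables Ollama embeddings, encryption, and custom relevance weights together. It assumes RelevanceWeights is importable from the top-level memorymesh package; adjust the import to match the actual API.

from memorymesh import MemoryMesh, RelevanceWeights  # RelevanceWeights import path assumed

memory = MemoryMesh(
    path=".memorymesh/memories.db",
    embedding="ollama",
    ollama_model="nomic-embed-text",
    encryption_key="my-secret-passphrase",
    relevance_weights=RelevanceWeights(
        semantic=0.6,    # weight semantic similarity more heavily
        recency=0.2,
        importance=0.15,
        frequency=0.05,
    ),
)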
Encrypted Storage¶
MemoryMesh can encrypt memory text and metadata at rest. Pass an encryption_key to the constructor:
memory = MemoryMesh(
path=".memorymesh/memories.db",
encryption_key="my-secret-passphrase",
)
# Memories are encrypted before writing to SQLite
memory.remember("Sensitive API key: sk-abc123")
# Decrypted transparently on recall
results = memory.recall("API key")
How it works:
- A key is derived from your passphrase using PBKDF2-HMAC-SHA256.
- The text and metadata fields are encrypted before storage and decrypted on read.
- IDs, timestamps, importance, and embeddings are not encrypted (needed for queries and indexing).
- A random salt is stored in the database and reused across sessions.
- Uses only the Python standard library (hashlib, hmac, os) -- zero external dependencies.
This protects against casual inspection of the database file on disk. It is not a substitute for full-disk encryption for highly sensitive data.
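For the curious, the derivation step is roughly the standard-library pattern below. The salt size and iteration count are illustrative, not MemoryMesh's actual parameters.

import hashlib, os

salt = os.urandom(16)  # random salt, stored alongside the database and reused across sessions
key = hashlib.pbkdf2_hmac(
    "sha256",                             # PBKDF2-HMAC-SHA256
    "my-secret-passphrase".encode(),      # the passphrase you pass as encryption_key
    salt,
    200_000,                              # iteration count (illustrative)
)
print(len(key))  # 32-byte derived key used to encrypt text and metadata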
Auto-Importance Scoring¶
Instead of manually setting importance on every remember() call, let MemoryMesh score it automatically:
# Manual importance (default behavior)
memory.remember("User prefers dark mode", importance=0.7)
# Auto-scored importance based on text analysis
memory.remember("Critical security decision: use JWT with RS256", auto_importance=True)
The auto-scorer analyzes text using four heuristic signals:
| Signal | Weight | What it detects |
|---|---|---|
| Keywords | 35% | Decision words ("critical", "always", "security") boost; tentative words ("maybe", "temporary") reduce |
| Specificity | 30% | File paths, version numbers, proper nouns, URLs indicate high-value information |
| Structure | 20% | Code patterns (backticks, function names, imports) suggest technical decisions |
| Length | 15% | Very short texts score lower; detailed texts score higher |
The output is clamped to [0.0, 1.0] with a baseline of 0.5.
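A toy illustration of how four weighted signals can be blended around the 0.5 baseline and clamped; this shows the shape of the computation only, not MemoryMesh's actual scorer.

def toy_importance(text: str) -> float:
    t = text.lower()
    # Each signal is a crude 0.0-1.0 score
    keywords    = 1.0 if any(w in t for w in ("critical", "always", "security")) else 0.0
    specificity = 1.0 if any(s in text for s in ("/", "http", ".py", "v1.")) else 0.0
    structure   = 1.0 if any(s in text for s in ("`", "()", "import ")) else 0.0
    length      = min(len(text) / 200, 1.0)
    # Weighted blend around the 0.5 baseline (weights 35/30/20/15), clamped to [0.0, 1.0]
    score = 0.5 + (0.35 * (keywords - 0.5) + 0.30 * (specificity - 0.5)
                   + 0.20 * (structure - 0.5) + 0.15 * (length - 0.5))
    return max(0.0, min(1.0, score))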
Memory Categories¶
MemoryMesh supports automatic memory categorization. When you set a category, the scope is automatically routed:
# Category determines scope automatically
memory.remember("I prefer dark mode", category="preference") # -> global
memory.remember("Never auto-commit", category="guardrail") # -> global
memory.remember("Chose SQLite over Postgres", category="decision") # -> project
# Or let MemoryMesh detect the category from text
memory.remember("I always use black for formatting", auto_categorize=True)
# Detected as "preference" -> stored in global scope
| Category | Auto-Scope | Description |
|---|---|---|
| preference | global | User coding style, tool preferences |
| guardrail | global | Rules AI must follow |
| mistake | global | Past mistakes to avoid |
| personality | global | User character traits |
| question | global | Recurring questions/concerns |
| decision | project | Architecture/design decisions |
| pattern | project | Code patterns and conventions |
| context | project | Project-specific facts |
| session_summary | project | Auto-generated session summaries |
When auto_categorize=True, MemoryMesh also enables auto_importance=True automatically.
Scope Inference¶
When scope is not explicitly set in remember(), MemoryMesh automatically infers the correct scope from the text content:
- User-focused text -> global scope: "User prefers dark mode", "Krishna's workflow: review then merge", "Coding style: functional over OOP"
- Project-focused text -> project scope: file paths (src/, *.py), config files (pyproject.toml), implementation state, version numbers, commit hashes
The inference uses a scoring system -- when both user and project signals are present, the stronger signal wins. Product/project names detected from the project directory add extra weight toward project scope.
Category routing still applies first. Inference refines it when the subject disagrees -- for example, a memory categorized as "pattern" (normally project) that says "Krishna's patterns: tests CLI hands-on" is about the user, so it routes to global.
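A deliberately simplified sketch of the idea -- count user-vs-project signals and let the stronger side win. MemoryMesh's real heuristics are more involved; the signal lists here are made up for illustration.

USER_SIGNALS = ("user ", "prefers", "i always", "my workflow", "coding style")
PROJECT_SIGNALS = ("src/", ".py", "pyproject.toml", "commit", "version")

def toy_infer_scope(text: str) -> str:
    t = text.lower()
    user_score = sum(sig in t for sig in USER_SIGNALS)
    project_score = sum(sig in t for sig in PROJECT_SIGNALS)
    # Stronger signal wins; ties lean toward the user (global) side
    return "global" if user_score >= project_score else "project"

print(toy_infer_scope("User prefers dark mode"))             # global
print(toy_infer_scope("Refactored src/auth.py for v0.3.1"))  # project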
To override inference, set scope explicitly:
memory.remember("User prefers dark mode", scope="project") # forced project despite user-focused text
Session Start¶
Retrieve structured context at the beginning of every AI session:
context = memory.session_start(project_context="working on auth module")
# Returns:
# {
# "user_profile": ["Senior Python developer", "Prefers dark mode"],
# "guardrails": ["Never auto-commit without asking"],
# "common_mistakes": ["Forgot to run tests before pushing"],
# "common_questions": ["Always asks about test coverage"],
# "project_context": ["Uses SQLite for storage", "Google-style docstrings"],
# "last_session": ["Implemented auth module, 15 tests added"],
# }
This is available as an MCP tool (session_start) that AI assistants can call at the beginning of every conversation.
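One way an integration might use this: flatten the returned sections into a system-prompt preamble. This is a sketch of a possible pattern, not a prescribed one.

context = memory.session_start(project_context="working on auth module")

# Turn each non-empty section into a small block of bullet points
preamble_lines = []
for section, items in context.items():
    if items:
        preamble_lines.append(f"## {section}")
        preamble_lines.extend(f"- {item}" for item in items)

system_preamble = "\n".join(preamble_lines)
# Prepend system_preamble to the assistant's system prompt before the session starts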
Auto-Compaction¶
MemoryMesh automatically detects and merges duplicate memories during normal operation. Every 50 remember() calls, a lightweight compaction pass runs in the background. This is like SQLite's auto-vacuum -- you never need to think about it.
# Adjust the interval (default: 50)
memory.compact_interval = 100 # compact every 100 writes
memory.compact_interval = 0 # disable auto-compaction
# Manual compaction is still available
result = memory.compact(scope="project", dry_run=True)
print(f"Would merge {result.merged_count} duplicates")
Pin Support¶
Pin critical memories so they never fade and always appear in recall results:
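An illustrative call, assuming pin is passed directly to remember():

# Pin a guardrail so it always surfaces and never decays
memory.remember("Never commit directly to main", pin=True, category="guardrail")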
When pin=True:
- Importance is set to 1.0 (maximum).
- Decay rate is set to 0.0 (never fades).
- The metadata field pinned: true is set for identification.
Use pinning for guardrails, non-negotiable rules, and critical identity facts that should always influence the AI's behavior.
Privacy Guard¶
MemoryMesh scans all text for potential secrets before storing. Detected patterns include:
| Pattern | Example |
|---|---|
| API keys | sk-abc123..., pk-xyz... |
| GitHub tokens | ghp_..., gho_... |
| Passwords | password: mySecret |
| Private keys | -----BEGIN PRIVATE KEY----- |
| JWT tokens | eyJhbG... |
| AWS access keys | AKIA... |
| Slack tokens | xoxb-... |
When secrets are detected, a warning is logged and metadata flags (has_secrets_warning, detected_secret_types) are added. To redact secrets before storing, one option is to pass the text through the redact_secrets helper, as sketched below.
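A minimal sketch using the documented redact_secrets function to strip secrets before the memory is written:

from memorymesh.privacy import redact_secrets

# Secrets are replaced with [REDACTED] before the text reaches storage
memory.remember(redact_secrets("Deploy token: sk-abc123456789"))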
You can also use the functions directly:
from memorymesh.privacy import check_for_secrets, redact_secrets
secrets = check_for_secrets("my password: hunter2")
# ["password"]
clean = redact_secrets("token: sk-abc123456789")
# "token: [REDACTED]"
Contradiction Detection¶
When storing a new memory, MemoryMesh can check for existing memories that contradict it. Control the behavior with the on_conflict parameter:
# Default: store both, flag the contradiction in metadata
memory.remember("Use PostgreSQL for production", on_conflict="keep_both")
# Replace the most similar existing memory
memory.remember("Use PostgreSQL for production", on_conflict="update")
# Don't store if a contradiction is found
memory.remember("Use PostgreSQL for production", on_conflict="skip")
The three conflict modes:
| Mode | Behavior |
|---|---|
| keep_both | Store the new memory alongside existing ones. Adds has_contradiction flag to metadata. |
| update | Replace the most similar existing memory with the new text. |
| skip | Discard the new memory if a contradiction is found. Returns empty string. |
You can also find contradictions directly:
from memorymesh.contradiction import find_contradictions, ConflictMode
# Find memories that may contradict new text
contradictions = find_contradictions(text, embedding, store, threshold=0.75)
# Returns: [(memory, similarity_score), ...]
Retrieval Filters¶
recall() supports additional filters to narrow down results:
results = memory.recall(
"auth decisions",
k=10,
category="decision", # Only return memories with this category
min_importance=0.7, # Only return memories with importance >= 0.7
time_range=("2026-01-01", "2026-02-01"), # Filter by creation date
metadata_filter={"pinned": True}, # Match specific metadata keys
)
| Filter | Type | Description |
|---|---|---|
| category | str | Only return memories with this category in metadata |
| min_importance | float | Minimum importance threshold |
| time_range | tuple[str, str] | ISO-8601 date range (start, end) for creation time |
| metadata_filter | dict | Key-value pairs that must match in memory metadata |
Filters are applied before ranking, so they reduce the candidate set rather than post-filtering results.