Configuration

Everything you need to configure MemoryMesh: embedding providers, storage paths, encryption, and relevance tuning.

Embedding Providers

MemoryMesh supports multiple embedding backends. Choose the one that fits your constraints:

| Provider | Install | Requires | Best For |
| --- | --- | --- | --- |
| none | pip install memorymesh | Nothing | Getting started, keyword-based matching |
| local (default) | pip install memorymesh[local] | ~500MB model download | Privacy-sensitive apps, offline use |
| ollama | pip install memorymesh | Running Ollama instance | Local semantic search, GPU acceleration |
| openai | pip install memorymesh[openai] | OpenAI API key | Highest quality embeddings |

# Use local embeddings (runs on your machine, no API calls) -- this is the default
memory = MemoryMesh(embedding="local")

# Use Ollama (connect to local Ollama server)
memory = MemoryMesh(embedding="ollama", ollama_model="nomic-embed-text")

# Use OpenAI embeddings
memory = MemoryMesh(embedding="openai", openai_api_key="sk-...")

# No embeddings (pure keyword matching, zero dependencies)
memory = MemoryMesh(embedding="none")

Using Ollama

What is Ollama?

Ollama is a free, open-source application that runs AI models locally on your machine. Think of it like a local server -- it runs in the background and applications connect to it over HTTP. MemoryMesh uses Ollama for one specific purpose: converting text into numerical vectors (embeddings) that enable semantic search.

Without Ollama (or another embedding provider), MemoryMesh falls back to keyword matching -- recall("testing") will only find memories containing the exact word "testing". With Ollama, MemoryMesh understands meaning -- recall("testing") finds memories about "pytest", "unit tests", "test coverage", and "CI pipeline" because they are semantically related.

How it works

┌──────────────────┐         HTTP (localhost:11434)        ┌──────────────────┐
│   Your AI Tool   │                                       │                  │
│  (Claude Code,   │                                       │     Ollama       │
│   Gemini CLI,    │                                       │  (background     │
│   Cursor, etc.)  │                                       │   service)       │
│        │         │                                       │                  │
│   MemoryMesh     │  ───── "embed this text" ──────────>  │  nomic-embed-    │
│   (MCP server)   │  <──── [0.02, -0.15, 0.89, ...] ────  │  text model      │
│                  │         768 numbers back              │                  │
└──────────────────┘                                       └──────────────────┘
    SQLite DB
  (memories.db)
  1. When you call remember("User prefers dark mode"), MemoryMesh sends the text to Ollama
  2. Ollama runs the embedding model and returns a vector of 768 numbers representing the meaning
  3. MemoryMesh stores the text + vector in SQLite
  4. When you call recall("theme preferences"), MemoryMesh embeds the query and finds stored memories with similar vectors
  5. This is why recall("theme preferences") finds "User prefers dark mode" even though no words match

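To make this concrete, here is a minimal sketch of the same round trip done by hand: it calls Ollama's /api/embeddings endpoint with the standard library and compares two texts with cosine similarity. This is an illustration of the mechanism only -- the helper functions are made up for the example and are not part of MemoryMesh's API.

# Hand-rolled version of the remember/recall round trip above (illustration
# only -- MemoryMesh does this internally). Uses only the standard library.
import json
import math
import urllib.request

def embed(text, model="nomic-embed-text"):
    """Ask the local Ollama server for an embedding vector."""
    req = urllib.request.Request(
        "http://localhost:11434/api/embeddings",
        data=json.dumps({"model": model, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine(a, b):
    """1.0 = identical meaning, ~0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

stored = embed("User prefers dark mode")
query = embed("theme preferences")
print(len(stored))             # 768 dimensions
print(cosine(stored, query))   # high similarity despite zero shared keywords
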
Step 1: Install Ollama

Ollama is a separate application. Install it first:

macOS:

brew install ollama

Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows: Download from ollama.com/download.

Step 2: Start Ollama

Ollama runs as a background service on port 11434. You start it once and it stays running.

# Start Ollama (runs in the background)
ollama serve

Already running? If you see "address already in use", Ollama is already running. This is fine -- on macOS with Homebrew it often auto-starts.

# Check if Ollama is running
brew services info ollama       # macOS (Homebrew)
curl http://localhost:11434     # Any OS -- should return "Ollama is running"

Step 3: Pull the embedding model

Download the embedding model that MemoryMesh will use. This is a one-time download (~274MB):

ollama pull nomic-embed-text

What is nomic-embed-text? It is an embedding model, not a chat model. It does one thing: convert text into a vector of 768 numbers that capture semantic meaning. Similar texts produce similar vectors. This is what powers MemoryMesh's semantic search.

| Embedding Model | Dimensions | Size | Quality | Speed |
| --- | --- | --- | --- | --- |
| nomic-embed-text | 768 | 274MB | Very good | Fast |
| all-minilm | 384 | 46MB | Good | Very fast |
| mxbai-embed-large | 1024 | 670MB | Best | Slower |

nomic-embed-text is the recommended default -- good quality, reasonable size, fast. You can use a different model by passing ollama_model="model-name".

Step 4: Install and configure MemoryMesh

# Install MemoryMesh (Ollama support uses only stdlib -- no extra deps needed)
pip install memorymesh

Important: You do NOT need a special install for Ollama support. pip install memorymesh is sufficient because MemoryMesh communicates with Ollama via HTTP using Python's built-in urllib -- no extra packages required.

As a Python library:

from memorymesh import MemoryMesh

memory = MemoryMesh(
    embedding="ollama",
    ollama_model="nomic-embed-text",           # default
    # ollama_base_url="http://localhost:11434", # default, change if Ollama is on another machine
)

As an MCP server (for Claude Code, Gemini CLI, Cursor, etc.):

{
  "mcpServers": {
    "memorymesh": {
      "command": "memorymesh-mcp",
      "env": {
        "MEMORYMESH_EMBEDDING": "ollama",
        "MEMORYMESH_OLLAMA_MODEL": "nomic-embed-text"
      }
    }
  }
}

Step 5: Verify it works

# Quick test
python -c "
from memorymesh import MemoryMesh
m = MemoryMesh(embedding='ollama')
m.remember('User prefers Python and dark mode')
results = m.recall('programming language preferences')
print(results[0].text if results else 'No results')
m.close()
"

If you see "User prefers Python and dark mode", semantic search is working.

FAQ

Q: Does Ollama need to be running in the same terminal as MemoryMesh? No. Ollama is a background service. Once started, it listens on port 11434. MemoryMesh connects to it via HTTP from any process. You can run pip install, memorymesh-mcp, and your AI tools in any terminal.

Q: Does Ollama use my GPU? Yes, if available. Ollama automatically uses your GPU (CUDA on Linux/Windows, Metal on macOS) for faster inference. The embedding model is small enough that CPU is also fast (~10ms per embedding).

Q: Can I use Ollama on a remote server? Yes. Set ollama_base_url="http://your-server:11434" or MEMORYMESH_OLLAMA_BASE_URL env var. MemoryMesh will connect to the remote Ollama instance.

Q: What if Ollama is not running when MemoryMesh starts? MemoryMesh gracefully falls back to keyword-only search. Your memories are still stored and recalled, just without semantic matching. Start Ollama and the next recall() will use embeddings automatically.

Constructor Options

from memorymesh import MemoryMesh

memory = MemoryMesh(
    # Storage (dual-store)
    path=".memorymesh/memories.db",       # Project-specific database (optional)
    global_path="~/.memorymesh/global.db", # User-wide global database

    # Embeddings
    embedding="local",                    # "none", "local", "ollama", "openai"

    # Embedding provider options (passed as **kwargs)
    # ollama_model="nomic-embed-text",    # Ollama model name
    # ollama_base_url="http://localhost:11434",
    # openai_api_key="sk-...",            # OpenAI API key
    # local_model="all-MiniLM-L6-v2",    # sentence-transformers model
    # local_device="cpu",                 # PyTorch device

    # Encryption (optional)
    # encryption_key="my-secret-key",     # Encrypt text and metadata at rest

    # Relevance tuning (optional)
    # relevance_weights=RelevanceWeights(
    #     semantic=0.5,
    #     recency=0.2,
    #     importance=0.2,
    #     frequency=0.1,
    # ),
)

Encrypted Storage

MemoryMesh can encrypt memory text and metadata at rest. Pass an encryption_key to the constructor:

memory = MemoryMesh(
    path=".memorymesh/memories.db",
    encryption_key="my-secret-passphrase",
)

# Memories are encrypted before writing to SQLite
memory.remember("Sensitive API key: sk-abc123")

# Decrypted transparently on recall
results = memory.recall("API key")

How it works:

  • A key is derived from your passphrase using PBKDF2-HMAC-SHA256.
  • The text and metadata fields are encrypted before storage and decrypted on read.
  • IDs, timestamps, importance, and embeddings are not encrypted (needed for queries and indexing).
  • A random salt is stored in the database and reused across sessions.
  • Uses only Python standard library (hashlib, hmac, os) -- zero external dependencies.

This protects against casual inspection of the database file on disk. For highly sensitive data, it is not a substitute for full-disk encryption.
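
For reference, the key-derivation step above looks roughly like this with the standard library. The iteration count and key length are assumptions for illustration; MemoryMesh's actual parameters may differ.

import hashlib
import os

# A random salt like this is generated once, stored in the database, and reused
salt = os.urandom(16)

# Derive an encryption key from the passphrase (PBKDF2-HMAC-SHA256).
# 100_000 iterations and a 32-byte key are illustrative values, not
# MemoryMesh's documented parameters.
key = hashlib.pbkdf2_hmac(
    "sha256",
    b"my-secret-passphrase",
    salt,
    100_000,
    dklen=32,
)

# Only the text and metadata fields are encrypted with this key; IDs,
# timestamps, importance, and embeddings stay in plaintext for querying.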

Auto-Importance Scoring

Instead of manually setting importance on every remember() call, let MemoryMesh score it automatically:

# Manual importance (default behavior)
memory.remember("User prefers dark mode", importance=0.7)

# Auto-scored importance based on text analysis
memory.remember("Critical security decision: use JWT with RS256", auto_importance=True)

The auto-scorer analyzes text using four heuristic signals:

| Signal | Weight | What it detects |
| --- | --- | --- |
| Keywords | 35% | Decision words ("critical", "always", "security") boost; tentative words ("maybe", "temporary") reduce |
| Specificity | 30% | File paths, version numbers, proper nouns, URLs indicate high-value information |
| Structure | 20% | Code patterns (backticks, function names, imports) suggest technical decisions |
| Length | 15% | Very short texts score lower; detailed texts score higher |

The output is clamped to [0.0, 1.0] with a baseline of 0.5.
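
To make the weighting concrete, here is a toy version of such a scorer. The keyword lists, regular expressions, and per-signal values are invented for illustration and are not MemoryMesh's actual rules; only the weights and clamping mirror the table above.

import re

def toy_importance(text):
    """Illustrative only -- a rough stand-in for MemoryMesh's auto-scorer."""
    lowered = text.lower()

    # Keywords: decision words boost, tentative words reduce (35%)
    boost = any(w in lowered for w in ("critical", "always", "never", "security"))
    hedge = any(w in lowered for w in ("maybe", "temporary", "for now"))
    keywords = 0.5 + (0.5 if boost else 0.0) - (0.5 if hedge else 0.0)

    # Specificity: file paths, version numbers, URLs (30%)
    specificity = 1.0 if re.search(r"(/[\w.-]+|\d+\.\d+|https?://)", text) else 0.3

    # Structure: code-like patterns (20%)
    structure = 1.0 if re.search(r"(`|\bdef\b|\bimport\b|\(\))", text) else 0.3

    # Length: very short texts score lower (15%)
    length = min(len(text) / 200, 1.0)

    score = 0.35 * keywords + 0.30 * specificity + 0.20 * structure + 0.15 * length
    return max(0.0, min(1.0, score))  # clamp to [0.0, 1.0]

print(toy_importance("Critical security decision: use JWT with RS256"))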

Memory Categories

MemoryMesh supports automatic memory categorization. When you set a category, the memory is routed to the appropriate scope automatically:

# Category determines scope automatically
memory.remember("I prefer dark mode", category="preference")        # -> global
memory.remember("Never auto-commit", category="guardrail")          # -> global
memory.remember("Chose SQLite over Postgres", category="decision")  # -> project

# Or let MemoryMesh detect the category from text
memory.remember("I always use black for formatting", auto_categorize=True)
# Detected as "preference" -> stored in global scope

| Category | Auto-Scope | Description |
| --- | --- | --- |
| preference | global | User coding style, tool preferences |
| guardrail | global | Rules AI must follow |
| mistake | global | Past mistakes to avoid |
| personality | global | User character traits |
| question | global | Recurring questions/concerns |
| decision | project | Architecture/design decisions |
| pattern | project | Code patterns and conventions |
| context | project | Project-specific facts |
| session_summary | project | Auto-generated session summaries |

When auto_categorize=True, MemoryMesh also enables auto_importance=True automatically.
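
The category-to-scope routing in the table above amounts to a simple lookup. The sketch below mirrors that table; the names are illustrative, and MemoryMesh performs this routing internally when you pass category=....

# The scope routing from the table above, as a plain lookup (illustrative --
# not part of MemoryMesh's public API)
CATEGORY_SCOPE = {
    "preference": "global",
    "guardrail": "global",
    "mistake": "global",
    "personality": "global",
    "question": "global",
    "decision": "project",
    "pattern": "project",
    "context": "project",
    "session_summary": "project",
}

def route_scope(category, explicit_scope=None):
    """An explicit scope always wins; otherwise the category decides."""
    if explicit_scope is not None:
        return explicit_scope
    return CATEGORY_SCOPE.get(category, "project")  # fallback assumed for illustration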

Scope Inference

When scope is not explicitly set in remember(), MemoryMesh automatically infers the correct scope from the text content:

  • User-focused text -> global scope: "User prefers dark mode", "Krishna's workflow: review then merge", "Coding style: functional over OOP"
  • Project-focused text -> project scope: file paths (src/, *.py), config files (pyproject.toml), implementation state, version numbers, commit hashes

The inference uses a scoring system -- when both user and project signals are present, the stronger signal wins. Product/project names detected from the project directory add extra weight toward project scope.

Category routing still applies first. Inference refines it when the subject disagrees -- for example, a memory categorized as "pattern" (normally project) that says "Krishna's patterns: tests CLI hands-on" is about the user, so it routes to global.
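
A rough sketch of that scoring is below; the signal lists and thresholds are invented for illustration, and MemoryMesh's real heuristics are broader (they also weigh the project's own name, as noted above).

import re

# Illustrative signal lists -- not MemoryMesh's actual patterns
USER_SIGNALS = (r"(?i)\buser\b", r"(?i)\bprefers?\b", r"(?i)coding style",
                r"(?i)workflow", r"\b[A-Z][a-z]+'s\b")
PROJECT_SIGNALS = (r"(?i)\bsrc/", r"\.py\b", r"(?i)pyproject\.toml",
                   r"\b\d+\.\d+\.\d+\b", r"\b[0-9a-f]{7,40}\b")

def infer_scope(text, category_scope):
    """Stronger signal wins; a tie keeps the category's default routing."""
    user_score = sum(bool(re.search(p, text)) for p in USER_SIGNALS)
    project_score = sum(bool(re.search(p, text)) for p in PROJECT_SIGNALS)
    if user_score == project_score:
        return category_scope
    return "global" if user_score > project_score else "project"

# "pattern" normally routes to project, but the subject here is the user:
print(infer_scope("Krishna's patterns: tests CLI hands-on", category_scope="project"))  # -> global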

To override inference, set scope explicitly:

memory.remember("User prefers dark mode", scope="project")  # forced project despite user-focused text

Session Start

Retrieve structured context at the beginning of every AI session:

context = memory.session_start(project_context="working on auth module")

# Returns:
# {
#     "user_profile": ["Senior Python developer", "Prefers dark mode"],
#     "guardrails": ["Never auto-commit without asking"],
#     "common_mistakes": ["Forgot to run tests before pushing"],
#     "common_questions": ["Always asks about test coverage"],
#     "project_context": ["Uses SQLite for storage", "Google-style docstrings"],
#     "last_session": ["Implemented auth module, 15 tests added"],
# }

This is available as an MCP tool (session_start) that AI assistants can call at the beginning of every conversation.

Auto-Compaction

MemoryMesh automatically detects and merges duplicate memories during normal operation. Every 50 remember() calls, a lightweight compaction pass runs in the background. This is like SQLite's auto-vacuum -- you never need to think about it.

# Adjust the interval (default: 50)
memory.compact_interval = 100   # compact every 100 writes
memory.compact_interval = 0     # disable auto-compaction

# Manual compaction is still available
result = memory.compact(scope="project", dry_run=True)
print(f"Would merge {result.merged_count} duplicates")

Pin Support

Pin critical memories so they never fade and always appear in recall results:

memory.remember("NEVER auto-commit without asking the user", pin=True)

When pin=True:

  • Importance is set to 1.0 (maximum).
  • Decay rate is set to 0.0 (never fades).
  • The metadata field pinned: true is set for identification.

Use pinning for guardrails, non-negotiable rules, and critical identity facts that should always influence the AI's behavior.
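
Because pinning is recorded in metadata, you can also pull back just the pinned rules later using the metadata filter described under Retrieval Filters below:

# Retrieve only pinned memories, via the pinned flag that pin=True sets
guardrails = memory.recall("rules to follow", metadata_filter={"pinned": True})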

Privacy Guard

MemoryMesh scans all text for potential secrets before storing. Detected patterns include:

| Pattern | Example |
| --- | --- |
| API keys | sk-abc123..., pk-xyz... |
| GitHub tokens | ghp_..., gho_... |
| Passwords | password: mySecret |
| Private keys | -----BEGIN PRIVATE KEY----- |
| JWT tokens | eyJhbG... |
| AWS access keys | AKIA... |
| Slack tokens | xoxb-... |

When secrets are detected, a warning is logged and metadata flags (has_secrets_warning, detected_secret_types) are added. To automatically redact secrets before storing:

memory.remember("API key is sk-abc123456789", redact=True)
# Stored text: "API key is [REDACTED]"

You can also use the functions directly:

from memorymesh.privacy import check_for_secrets, redact_secrets

secrets = check_for_secrets("my password: hunter2")
# ["password"]

clean = redact_secrets("token: sk-abc123456789")
# "token: [REDACTED]"

Contradiction Detection

When storing a new memory, MemoryMesh can check for existing memories that contradict it. Control the behavior with the on_conflict parameter:

# Default: store both, flag the contradiction in metadata
memory.remember("Use PostgreSQL for production", on_conflict="keep_both")

# Replace the most similar existing memory
memory.remember("Use PostgreSQL for production", on_conflict="update")

# Don't store if a contradiction is found
memory.remember("Use PostgreSQL for production", on_conflict="skip")

The three conflict modes:

| Mode | Behavior |
| --- | --- |
| keep_both | Store the new memory alongside existing ones. Adds has_contradiction flag to metadata. |
| update | Replace the most similar existing memory with the new text. |
| skip | Discard the new memory if a contradiction is found. Returns empty string. |

You can also find contradictions directly:

from memorymesh.contradiction import find_contradictions, ConflictMode

# Find memories that may contradict new text
contradictions = find_contradictions(text, embedding, store, threshold=0.75)
# Returns: [(memory, similarity_score), ...]

Retrieval Filters

recall() supports additional filters to narrow down results:

results = memory.recall(
    "auth decisions",
    k=10,
    category="decision",           # Only return memories with this category
    min_importance=0.7,            # Only return memories with importance >= 0.7
    time_range=("2026-01-01", "2026-02-01"),  # Filter by creation date
    metadata_filter={"pinned": True},         # Match specific metadata keys
)

| Filter | Type | Description |
| --- | --- | --- |
| category | str | Only return memories with this category in metadata |
| min_importance | float | Minimum importance threshold |
| time_range | tuple[str, str] | ISO-8601 date range (start, end) for creation time |
| metadata_filter | dict | Key-value pairs that must match in memory metadata |

Filters are applied before ranking, so they reduce the candidate set rather than post-filtering results.

