Search code by meaning.
Rebuild legacy systems with confidence.
ogrep is semantic search for codebases: it chunks your repository, embeds those chunks, stores them in a single local SQLite index, and retrieves the most relevant code for any question. It is designed for Claude Code as a Skill (not MCP) and works just as well from the CLI.
Token savings: ogrep uses embeddings for indexing + retrieval. It does not require a chat model. Chat/completion tokens are only spent if you choose to have an LLM interpret the retrieved snippets.
# Install with AST support
pip install "ogrep[ast]"
# Index (AST chunking is now default)
export OPENAI_API_KEY="sk-..."
ogrep index .
# Ask questions (JSON output is default)
ogrep query "where is authentication handled?" -n 12
ogrep query "how are API errors mapped?" -M hybrid
What's New in v0.8.7
Tune command uses AST mode. Smart API key detection. AST chunking default. Voyage AI support. FlashRank reranking.
Tune Uses AST Mode
The ogrep tune command now uses AST chunking when available,
matching the index command's behavior for consistent tuning.
ogrep tune . # Now AST-aware
Smart API Key Detection
ogrep now auto-selects the best embedding model based on available API keys.
Just set your key and go—no -m flag needed.
export VOYAGE_API_KEY=... # Uses voyage-code-3
AST Chunking Default
AST-aware chunking is now enabled by default when tree-sitter is available.
No more --ast flag needed.
ogrep index . # AST by default
Voyage AI Embeddings
Code-optimized embeddings from Voyage AI. 32K token context, 1024D vectors. Best quality for semantic code search.
ogrep index . -m voyage-code-3
FlashRank Reranking
Lightweight ONNX reranker (~4MB). Parallel-safe, no file locking. Helps local embeddings; skip for Voyage/OpenAI.
pip install "ogrep[rerank-light]"Benchmark-Driven Advice
Recommendations based on real MRR benchmarks. We tested so you don't have to. Reranking hurts strong embeddings—skip it.
Key Finding: Reranking Hurts Strong Embeddings
Our benchmarks show that reranking degrades results for Voyage and OpenAI embeddings (MRR drops 12-21%).
Only use --rerank with local embeddings like Nomic. High-quality embeddings are already well-calibrated.
Open Source on GitHub
ogrep is MIT licensed. Star the repo, report issues, contribute, or fork it for your own use.
AST-Aware Chunking
Now the default. Respects function, class, and method boundaries for better search accuracy.
Line-based chunking — lines 55-115 land in one chunk:
- End of ClassA
- Start of ClassB ← Semantic mixing!
- Beginning of method foo()
AST chunking — the same lines split along code boundaries:
- Chunk 1: ClassA (complete)
- Chunk 2: ClassB.foo() method
- Chunk 3: ClassB.bar() method
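The difference can be sketched with a toy chunker. This is an illustration only, not ogrep's implementation (ogrep uses tree-sitter ASTs): fixed-size line splitting cuts wherever the counter says, while a boundary-aware splitter starts a new chunk at each top-level `class`/`def` line.

```python
# Toy illustration of line-based vs boundary-aware chunking.
# NOT ogrep's implementation; ogrep parses real ASTs with tree-sitter.

SOURCE = """\
class ClassA:
    def run(self):
        return 1

class ClassB:
    def foo(self):
        return 2

    def bar(self):
        return 3
"""

def line_chunks(text, size=3):
    """Split every `size` lines, ignoring code structure."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]

def boundary_chunks(text):
    """Start a new chunk at each top-level `class`/`def` line."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith(("class ", "def ")) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

# Line-based: foo()'s header and body end up in different chunks.
# Boundary-aware: each class stays intact.
print(len(line_chunks(SOURCE)), len(boundary_chunks(SOURCE)))
```

With line-based splitting, `foo()`'s signature lands in one chunk and its body in the next, which is exactly the "semantic mixing" shown above.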
Supported Languages
Install with pip install "ogrep[ast]" for core languages (Python, JS, TS, Go, Rust) or pip install "ogrep[ast-all]" for all.
Unsupported languages fall back to line-based chunking automatically.
Recommended Configurations
Based on benchmarks with 10 ground-truth queries. MRR = Mean Reciprocal Rank (higher is better).
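For reference, MRR is computed from the rank of the first relevant result for each query; a minimal sketch (the example ranks are made up, not taken from the benchmark):

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """MRR over a set of queries, given the 1-based rank of the
    first relevant result per query (None = no relevant hit)."""
    scores = [0.0 if r is None else 1.0 / r for r in first_relevant_ranks]
    return sum(scores) / len(scores)

# Three queries: relevant result at rank 1, rank 2, and rank 1.
print(round(mean_reciprocal_rank([1, 2, 1]), 3))  # → 0.833
```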
🥇 Best Quality: Voyage AI
MRR: 0.717 • Code-optimized • No reranking needed
pip install "ogrep[ast,voyage]"
export VOYAGE_API_KEY="pa-..."
ogrep index . -m voyage-code-3
ogrep query "your search"
Best for production systems where search quality matters most.
🥈 Best Value: OpenAI
MRR: 0.700 • 3x cheaper • No reranking needed
pip install "ogrep[ast]"
export OPENAI_API_KEY="sk-..."
ogrep index . -m small
ogrep query "your search"
Only 2.4% quality drop vs Voyage. Great balance of cost and quality.
🥉 Offline/Free: Nomic + FlashRank
MRR: ~0.63 • Free • Reranking helps
pip install "ogrep[ast,rerank-light]"
export OGREP_BASE_URL=http://localhost:1234/v1
ogrep index . -m nomic
ogrep query "your search" --rerank
Zero API costs. Works offline. FlashRank compensates for weaker embeddings.
| Configuration | Quality | Cost | Rerank? | Best For |
|---|---|---|---|---|
| Voyage + AST | MRR 0.717 | $0.06/M tokens | ❌ Skip | Production quality |
| OpenAI + AST | MRR 0.700 | $0.02/M tokens | ❌ Skip | Budget-conscious |
| Nomic + FlashRank | MRR ~0.63 | Free | ✅ Use | Offline/privacy |
Reranking: When to Use It
Reranking helps weak embeddings but hurts strong ones. Use it selectively.
⚠️ Don't Rerank Strong Embeddings
Our benchmarks show reranking degrades Voyage and OpenAI results by 12-21%.
These embeddings are already well-calibrated. Only use --rerank with local embeddings like Nomic.
| Embedding | Without Rerank | With FlashRank | Change | Recommendation |
|---|---|---|---|---|
| Voyage | MRR 0.717 | MRR ~0.60 | -16% | ❌ Skip reranking |
| OpenAI | MRR 0.700 | MRR 0.550 | -21% | ❌ Skip reranking |
| Nomic (local) | MRR 0.545 | MRR 0.633 | +16% | ✅ Use reranking |
FlashRank (Recommended)
Lightweight ONNX model (~4MB). Parallel-safe, no file locking. Best balance of speed and quality for local embeddings.
pip install "ogrep[rerank-light]"Voyage Reranker
Voyage AI's rerank-2.5 model. 32K context, instruction-following.
Cloud API, requires VOYAGE_API_KEY.
--rerank-model voyage
sentence-transformers
Heavy PyTorch models (90-300MB). bge-m3 is slow on CPU (~30s/query).
Use only with GPU acceleration.
pip install "ogrep[rerank]"Performance & Requirements
Reranking uses cross-encoder models that benefit from GPU acceleration. Here's what to expect.
| Hardware | Reranking Speed | Notes |
|---|---|---|
| NVIDIA GPU (CUDA) | ~10x faster | Requires CUDA 12.x drivers. Best experience. |
| Apple Silicon (MPS) | ~3-5x faster | Automatic on macOS 12.3+. No setup needed. |
| CPU only | Baseline | Works but slower. Expect 2-5 seconds per query. |
Model Downloads
The reranker model (bge-reranker-v2-m3) is ~300MB, downloaded on first use.
AST parsers add ~5-15MB per language. All cached locally after first download.
Check Your Hardware
Run ogrep device to detect GPU/CPU capabilities and get recommendations.
JSON output is the default; use --no-json for text.
ogrep device
CPU-Only Tips
Without GPU: use --rerank-top 20 for faster response, or skip --rerank
entirely. Hybrid search without reranking is still very accurate and fast.
Graceful Degradation
If reranking fails (missing dependencies, GPU issues), ogrep automatically falls back to non-reranked results.
The JSON output includes "rerank_skipped": true and a "suggestion" field explaining why
and what to do. Your queries always return results—never fail due to reranking issues.
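A consumer of the JSON can check for that flag before trusting rerank ordering. A sketch, using an inline payload shaped like the fields described above (the exact `suggestion` text is hypothetical):

```python
import json

# Illustrative payload mimicking ogrep's degraded-rerank output;
# the "suggestion" wording here is invented for the example.
payload = json.loads("""
{
  "query": "database connections",
  "results": [{"rank": 1, "chunk_ref": "src/db.py:2", "score": 0.72}],
  "stats": {"reranked": false},
  "rerank_skipped": true,
  "suggestion": "Install ogrep[rerank-light] to enable reranking."
}
""")

if payload.get("rerank_skipped"):
    # Results are still usable -- just not reranked.
    print("note:", payload.get("suggestion", "reranking unavailable"))
for r in payload["results"]:
    print(r["rank"], r["chunk_ref"], r["score"])
```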
Real-world scenarios
This is where ogrep shines: legacy code archaeology, behavior reconstruction, and fast intent-level navigation.
Legacy archaeology → rebuild by outcome
Instead of rewriting in place and fighting old architecture, use ogrep to extract what the system does: flows, invariants, edge cases, and the real source-of-truth logic. Build a clean replacement that mimics the original behavior while enabling modern development.
Stop the grep → paste → token black hole loop
Index once, then retrieve small, high-signal snippets. This reduces the need to shovel entire files into a chat model. With local embeddings (LM Studio), indexing is fully local and cost-free; with cloud embeddings, you still avoid repeated read-everything prompts.
Understand "meaning", not naming
Names lie in legacy repos. ogrep helps you find the intent behind code: where auth truly happens, how state transitions work, where validation is enforced, or which code actually sends emails.
How it works
A simple pipeline: chunk, embed, store, retrieve. No external services required when using local embeddings.
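The retrieve step boils down to nearest-neighbor search over stored vectors. A stripped-down sketch with toy 3-D "embeddings" in an in-memory SQLite table (ogrep's real vectors are high-dimensional and come from an embedding model, and its schema is not shown here):

```python
import json
import math
import sqlite3

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Store: toy chunks with hand-made 3-D vectors (stand-ins for embeddings).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (ref TEXT, vec TEXT)")
db.executemany("INSERT INTO chunks VALUES (?, ?)", [
    ("auth.py:1", json.dumps([0.9, 0.1, 0.0])),
    ("mail.py:1", json.dumps([0.1, 0.9, 0.0])),
])

# Retrieve: rank stored chunks by similarity to the query vector.
query_vec = [0.8, 0.2, 0.0]  # pretend this came from the embedding model
ranked = sorted(
    ((cosine(query_vec, json.loads(vec)), ref)
     for ref, vec in db.execute("SELECT ref, vec FROM chunks")),
    reverse=True,
)
print(ranked[0][1])  # → auth.py:1
```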
Embedding Providers
Three options: Voyage AI (best quality), OpenAI (best value), or local (free/offline).
| Provider | Model | Quality | Cost | Setup |
|---|---|---|---|---|
| Voyage AI | voyage-code-3 | MRR 0.717 | $0.06/M tokens | Set VOYAGE_API_KEY |
| OpenAI | text-embedding-3-small | MRR 0.700 | $0.02/M tokens | Set OPENAI_API_KEY |
| LM Studio (local) | nomic-embed-text-v1.5 | MRR ~0.63* | Free | Set OGREP_BASE_URL |
*With FlashRank reranking. Without reranking: MRR ~0.55.
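As a rough worked example using the table's prices (the 2M-token repository size is hypothetical), indexing once costs:

```python
# Hypothetical repo of 2M tokens; $/M-token prices from the table above.
repo_tokens_m = 2.0
print(f"Voyage: ${repo_tokens_m * 0.06:.2f}")  # → Voyage: $0.12
print(f"OpenAI: ${repo_tokens_m * 0.02:.2f}")  # → OpenAI: $0.04
print("Local:  $0.00")                         # free, runs offline
```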
pip install "ogrep[voyage]"
export VOYAGE_API_KEY="pa-..."
ogrep index . -m voyage-code-3
export OPENAI_API_KEY="sk-..."
ogrep index . -m small
export OGREP_BASE_URL=http://localhost:1234/v1
ogrep index . -m nomic
Install
Use the CLI directly, or integrate as a Claude Code Skill via marketplace plugin (not MCP).
CLI (recommended: pip)
Simple pip install. Add optional extras for AST and reranking.
# Basic install
pip install ogrep
# With AST chunking (recommended)
pip install "ogrep[ast]"
# With reranking
pip install "ogrep[rerank]"
# Full install (AST + reranking)
pip install "ogrep[ast,rerank]"
Claude Code
Marketplace plugin + Skills integration. This is the primary integration mode for Claude Code.
/plugin marketplace add gplv2/ogrep-marketplace
/plugin install ogrep@ogrep-marketplace
Get Started
Pick your embedding provider. Index once. Query forever. AST chunking is automatic.
# Install with AST support
pip install "ogrep[ast]"
# Choose your embedding provider:
## Option A: Voyage AI (best quality, MRR 0.717)
pip install "ogrep[voyage]"
export VOYAGE_API_KEY="pa-..."
ogrep index . -m voyage-code-3
ogrep query "where is authentication handled?"
## Option B: OpenAI (best value, MRR 0.700)
export OPENAI_API_KEY="sk-..."
ogrep index . -m small
ogrep query "where is authentication handled?"
## Option C: Local/Free (MRR ~0.63 with reranking)
pip install "ogrep[rerank-light]"
export OGREP_BASE_URL=http://localhost:1234/v1
ogrep index . -m nomic
ogrep query "where is authentication handled?" --rerank
# Check index status
ogrep status
JSON Output is Default
All commands output JSON by default. Use --no-json for human-readable text.
The JSON includes confidence scoring, language detection, and search stats.
{
"query": "database connections",
"results": [
{
"rank": 1,
"chunk_ref": "src/db.py:2",
"score": 0.72,
"confidence": {"level": "high", "relative_pct": 100.0},
"language": "python",
"text": "def connect_to_database(config):\n ..."
}
],
"stats": {
"search_mode": "hybrid",
"reranked": false,
"ast_mode": true
}
}
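Because the default output is JSON, results pipe cleanly into scripts. A sketch that filters high-confidence hits; to stay self-contained it parses an inline payload instead of invoking ogrep, and the second result (`src/util.py:7`) is invented for the example:

```python
import json

# In practice you would capture stdout from `ogrep query ...`;
# this inline payload mirrors the shape shown above.
raw = """{"query": "database connections",
          "results": [
            {"rank": 1, "chunk_ref": "src/db.py:2", "score": 0.72,
             "confidence": {"level": "high", "relative_pct": 100.0}},
            {"rank": 2, "chunk_ref": "src/util.py:7", "score": 0.41,
             "confidence": {"level": "low", "relative_pct": 56.9}}
          ]}"""

data = json.loads(raw)
high = [r["chunk_ref"] for r in data["results"]
        if r["confidence"]["level"] == "high"]
print(high)  # → ['src/db.py:2']
```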
Fast
Local SQLite index. Semantic queries ~200ms. Fulltext queries ~5ms.
Accurate
AST chunking preserves code structure. Hybrid search combines meaning + keywords.
Cost-aware
Index once, reuse forever. Embedding tokens only. No chat model required.
FAQ
Short answers to the questions people ask immediately.
Is this MCP?
No. ogrep is primarily delivered as a Claude Code marketplace plugin that exposes a Skill-style workflow. You run indexing and queries locally; Claude Code can call into it as a tool without requiring an MCP server.
Does ogrep use tokens?
ogrep uses embeddings for indexing and query retrieval. With local embeddings, there is no per-token bill. With cloud embeddings, indexing costs embedding tokens (and a small amount per query). ogrep does not require a chat model; chat/completion tokens are only spent if you choose to have an LLM interpret the retrieved snippets.
Is AST chunking automatic now?
Yes! As of v0.8, AST chunking is enabled by default when tree-sitter is installed.
Install with pip install "ogrep[ast]" and just run ogrep index ..
Use --no-ast to disable if needed.
Should I use --rerank?
Only with local embeddings like Nomic. Our benchmarks show reranking hurts Voyage and OpenAI results by 12-21%.
If you're using local embeddings, --rerank with FlashRank helps. Otherwise, skip it.
Which embedding model should I use?
Voyage AI (voyage-code-3) for best quality. OpenAI (text-embedding-3-small) for best value. Nomic (local) for free/offline use. See the recommendations section for benchmarks.
Where does the index live?
By default: .ogrep/index.sqlite. It is a single local file, so it is easy to keep per repo or per profile.
How do I get better results?
1. Use AST chunking: install pip install "ogrep[ast]" (enabled by default)
2. Use reranking with local embeddings: --rerank (skip for Voyage/OpenAI)
3. Tune chunk size: ogrep tune .
4. Use fulltext mode for exact identifiers: --mode fulltext
What does the JSON hint mean?
When querying an index built without AST chunking, the JSON output includes a hint:
"hint": "Index was built without AST chunking. For better semantic boundaries, run: ogrep reindex .".
This tells AI tools to suggest rebuilding the index with AST (now the default) for better results.