Local-first code search that finds intent, not strings Open Source

Search code by meaning.
Rebuild legacy systems with confidence.

ogrep is semantic search for codebases: it chunks your repository, embeds those chunks, stores them in a single local SQLite index, and retrieves the most relevant code for any question. It is designed for Claude Code as a Skill (not MCP) and works just as well from the CLI.

Token savings: ogrep uses embeddings for indexing + retrieval. It does not require a chat model. Chat/completion tokens are only spent if you choose to have an LLM interpret the retrieved snippets.

Quick start CLI or Claude Code Skill
# Install
pip install ogrep

# Index (creates .ogrep/index.sqlite)
export OPENAI_API_KEY="sk-..."
ogrep index . --ast

# Ask questions (always use --json for AI tools)
ogrep query "where is authentication handled?" -n 12 --json
ogrep query "how are API errors mapped?" -n 12 --rerank --json

What's New in v0.7

Major search accuracy improvements: AST-aware chunking, cross-encoder reranking, and RRF hybrid fusion.

AST chunking Reranking RRF fusion

AST-Aware Chunking

Chunk by function/class/method boundaries instead of arbitrary line counts. Produces semantically coherent chunks that dramatically improve search accuracy.

ogrep index . --ast

Cross-Encoder Reranking

Add --rerank to apply a cross-encoder model for high-precision ranking. Solves the "right file is in top 30 but not #1" problem.

ogrep query "..." --rerank

RRF Hybrid Fusion

Reciprocal Rank Fusion replaces alpha weighting as the default. It combines results by rank position, giving more robust ranking with no tuning required.

OGREP_FUSION_METHOD=rrf
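The fusion rule behind RRF is simple enough to sketch in a few lines. This is an illustration of the idea, not ogrep's internal code; k=60 is the constant commonly used in the RRF literature.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score wins; no weights to tune.
    return sorted(scores, key=scores.get, reverse=True)

# A chunk that ranks well in BOTH the vector and keyword lists beats one
# that ranks high in only one of them.
vector_hits = ["auth.py:3", "db.py:1", "utils.py:7"]
keyword_hits = ["db.py:1", "auth.py:3", "mail.py:2"]
print(rrf_fuse([vector_hits, keyword_hits]))
```

Because only rank positions matter, RRF is insensitive to the (incomparable) score scales of embedding similarity and BM25, which is why it needs no alpha weight.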

AST Mode Tracking

Index tracks whether AST chunking was used. Query results include hints when the index could benefit from AST mode.

"ast_mode": false, "hint": "..."

AI Tool Integration: Always Use --json

When calling ogrep from AI tools, scripts, or programmatic contexts, always use --json for structured output. The JSON includes full chunk text, confidence scores, language detection, search stats, and AST mode hints. For Claude Code Skills: use --json and --refresh together for accurate, up-to-date results.
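As a sketch, a script can shell out to the CLI and parse the payload like this. The field names follow the JSON example shown further down this page; treat the exact schema as an assumption and verify it against your installed version.

```python
import json
import subprocess

def search(query, n=12):
    """Run ogrep and return the parsed --json payload."""
    proc = subprocess.run(
        ["ogrep", "query", query, "-n", str(n), "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

def top_refs(payload, confidence="high"):
    """Pull chunk references at a given confidence from a parsed payload."""
    return [r["chunk_ref"] for r in payload["results"]
            if r["confidence"] == confidence]

# Example (requires ogrep installed and an index in the current repo):
# payload = search("where is authentication handled?")
# print(top_refs(payload))
```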

Open Source on GitHub

ogrep is MIT licensed. Star the repo, report issues, contribute, or fork it for your own use.

View on GitHub

AST-Aware Chunking

The biggest accuracy improvement for code search. Respects function, class, and method boundaries.

Without AST (line-based): semantic mixing
Lines 55-115 (one chunk):
  - End of ClassA
  - Start of ClassB  ← Semantic mixing!
  - Beginning of method foo()
With AST chunking: coherent boundaries
Chunk 1: ClassA (complete)
Chunk 2: ClassB.foo() method
Chunk 3: ClassB.bar() method
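To make the idea concrete, here is a minimal Python-only sketch of boundary-based chunking using the stdlib ast module: one chunk per top-level function or class. ogrep's actual chunker is more sophisticated and covers many languages; this only illustrates why boundaries beat fixed line windows.

```python
import ast

def ast_chunks(source):
    """Yield (name, text) for each top-level function/class in Python source."""
    lines = source.splitlines()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno give the node's exact span (Python 3.8+).
            yield node.name, "\n".join(lines[node.lineno - 1 : node.end_lineno])

src = """\
class ClassA:
    def run(self):
        return 1

class ClassB:
    def foo(self):
        return 2
"""
for name, text in ast_chunks(src):
    print(name, "->", len(text.splitlines()), "lines")
```

Each chunk now contains one complete definition, so its embedding represents a single concept instead of the tail of one class and the head of the next.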

Supported Languages

Python JavaScript TypeScript Go Rust C/C++ Java Ruby PHP C# Scala Kotlin

Install with pip install "ogrep[ast]" for core languages or pip install "ogrep[ast-all]" for all 13. Files in unsupported languages fall back to line-based chunking automatically.

Cross-Encoder Reranking

Solves the "right file in top 30 but not #1" problem with high-precision ranking.

Two-stage pipeline fast retrieval + precise reranking
Query → Stage 1: Fast Retrieval (embeddings + BM25) → Top 50 candidates
                              ↓
      Stage 2: Slow Reranking (cross-encoder) → Top 10 results

Why it works

Cross-encoders process (query, document) pairs together, modeling fine-grained relationships that embeddings miss. The result is much higher precision in the final ranking.
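The two-stage shape is easy to sketch. The scorers below are toy stand-ins: in ogrep, stage 1 is embeddings + BM25 and stage 2 is a cross-encoder such as BAAI/bge-reranker-v2-m3.

```python
def two_stage_search(query, docs, cheap_score, precise_score,
                     stage1_top=50, stage2_top=10):
    """Fast scorer narrows the pool; slow pairwise scorer reorders survivors."""
    candidates = sorted(docs, key=lambda d: cheap_score(query, d),
                        reverse=True)[:stage1_top]
    return sorted(candidates, key=lambda d: precise_score(query, d),
                  reverse=True)[:stage2_top]

# Toy stand-in scorers: word overlap (stage 1) vs. exact-phrase bonus (stage 2).
def overlap(q, d):
    return len(set(q.lower().split()) & set(d.lower().split()))

def phrase_bonus(q, d):
    return overlap(q, d) + (10 if q.lower() in d.lower() else 0)

docs = ["token auth helper", "handle auth errors here", "misc notes"]
print(two_stage_search("auth errors", docs, overlap, phrase_bonus,
                       stage1_top=3, stage2_top=1))
# → ['handle auth errors here']
```

The economics are the point: the expensive pairwise model only ever sees the stage-1 survivors (50 pairs per query, not the whole index).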

Usage

pip install "ogrep[rerank]"
ogrep query "..." --rerank
ogrep query "..." --rerank-top 100

Model

Uses BAAI/bge-reranker-v2-m3 by default (~300MB, auto-downloaded). Works well with code and is multilingual.

Real-world scenarios

This is where ogrep shines: legacy code archaeology, behavior reconstruction, and fast intent-level navigation.

Legacy archaeology → rebuild by outcome

Instead of rewriting in place and fighting old architecture, use ogrep to extract what the system does: flows, invariants, edge cases, and the real source-of-truth logic. Build a clean replacement that mimics the original behavior while enabling modern development.

Stop the grep → paste → token black hole loop

Index once, then retrieve small, high-signal snippets. This reduces the need to shovel entire files into a chat model. With local embeddings (LM Studio), indexing is fully local and cost-free; with cloud embeddings, you still avoid repeated read-everything prompts.

Understand "meaning", not naming

Names lie in legacy repos. ogrep helps you find the intent behind code: where auth truly happens, how state transitions work, where validation is enforced, or which code actually sends emails.

How it works

A simple pipeline: chunk, embed, store, retrieve. No external services required when using local embeddings.

Index: SQLite · Embeddings: OpenAI or LM Studio · Query: top-K code chunks
Step 1
Scan
Walk the repo with sane defaults (source-first; skip noise).
Step 2
Chunk
Split by AST boundaries (functions, classes) or line-based with overlap.
Step 3
Embed
Generate vectors via OpenAI embeddings or local LM Studio models.
Step 4
Store
Save embeddings + metadata in a single local SQLite file.
Step 5
Retrieve
Query by meaning, rerank, and return the best-matching code snippets.
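The five steps above can be sketched end to end in a few dozen lines. The "embedding" here is a trivial hashed bag-of-words vector so the example stays self-contained; ogrep uses real embedding models (OpenAI or LM Studio) and its own schema in .ogrep/index.sqlite.

```python
import math
import re
import sqlite3

DIM = 256

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text):
    """Toy stand-in for an embedding model: L2-normalized hashed bag-of-words."""
    vec = [0.0] * DIM
    for tok in tokens(text):
        vec[hash(tok) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def index_chunks(db, chunks):
    """Store each chunk's text and vector in a single SQLite table."""
    db.execute("CREATE TABLE IF NOT EXISTS chunks (ref TEXT, text TEXT, vec TEXT)")
    for ref, text in chunks:
        db.execute("INSERT INTO chunks VALUES (?, ?, ?)",
                   (ref, text, ",".join(map(str, embed(text)))))

def query(db, question, n=3):
    """Rank stored chunks by cosine similarity to the question."""
    qv = embed(question)
    scored = []
    for ref, text, vec in db.execute("SELECT ref, text, vec FROM chunks"):
        dv = [float(x) for x in vec.split(",")]
        scored.append((sum(a * b for a, b in zip(qv, dv)), ref))
    return [ref for _, ref in sorted(scored, reverse=True)[:n]]

db = sqlite3.connect(":memory:")  # ogrep persists this as .ogrep/index.sqlite
index_chunks(db, [("auth.py:1", "def check_password(user, pw): ..."),
                  ("mail.py:1", "def send_email(to, body): ...")])
print(query(db, "password check", n=1))
```

Everything lives in one SQLite file, which is what makes the index trivially portable per repo and cheap to update incrementally.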

Embedding providers

Pick cloud embeddings for convenience or local embeddings for privacy and zero run-cost.

Provider Cost Privacy Setup
OpenAI API $0.02 / 1M tokens (embedding) Cloud Set OPENAI_API_KEY
LM Studio (local) Free 100% local Run lms server start, set OGREP_BASE_URL
OpenAI (cloud): embedding tokens during indexing
export OPENAI_API_KEY="sk-..."
ogrep index . --ast -m small
LM Studio (local): offline, free
export OGREP_BASE_URL=http://localhost:1234/v1
ogrep index . --ast -m nomic
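For a sense of scale, the cloud rate above works out as follows; the repo token count is a made-up example figure.

```python
# Embedding cost at the OpenAI rate quoted above: $0.02 per 1M tokens.
def index_cost_usd(total_tokens, rate_per_million=0.02):
    return total_tokens / 1_000_000 * rate_per_million

# Hypothetical repo that chunks into ~2.5M embedding tokens:
print(f"${index_cost_usd(2_500_000):.2f}")  # → $0.05 one-time indexing cost
```

Queries embed only the question text, so per-query cost is a tiny fraction of that.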

Install

Use the CLI directly, or integrate as a Claude Code Skill via marketplace plugin (not MCP).

CLI (recommended: pip)

Simple pip install. Add optional extras for AST and reranking.

Install via pip recommended
# Basic install
pip install ogrep

# With AST chunking (recommended)
pip install "ogrep[ast]"

# With reranking
pip install "ogrep[rerank]"

# Full install (AST + reranking)
pip install "ogrep[ast,rerank]"

Claude Code

Marketplace plugin + Skills integration. This is the primary integration mode for Claude Code.

Claude Code marketplace
/plugin marketplace add gplv2/ogrep-marketplace
/plugin install ogrep@ogrep-marketplace

Get started

Index the repo once with AST. Ask questions forever. Use --json for AI tools.

Quick start copy/paste
# Install with AST support
pip install "ogrep[ast,rerank]"

# Choose embeddings

## Option A: OpenAI (cloud embeddings)
export OPENAI_API_KEY="sk-..."
ogrep index . --ast -m small

## Option B: LM Studio (local embeddings)
# Start LM Studio server: lms server start
export OGREP_BASE_URL=http://localhost:1234/v1
ogrep index . --ast -m nomic

# Query (use --json for AI tools)
ogrep query "where is authentication handled?" -n 12 --json
ogrep query "how do we map API errors to exceptions?" -n 12 --rerank --json

# Check index status
ogrep status --json

# (Optional) tune chunking for this repo
ogrep tune .

JSON Output for AI Tools

Always use --json when calling from AI tools, scripts, or Claude Code Skills. The JSON includes AST mode status and hints when the index could benefit from AST chunking.

JSON response example structured for parsing
{
  "query": "database connections",
  "results": [
    {
      "rank": 1,
      "chunk_ref": "src/db.py:2",
      "score": 0.8923,
      "confidence": "high",
      "language": "python",
      "text": "def connect_to_database(config):\n    ..."
    }
  ],
  "stats": {
    "search_mode": "hybrid",
    "fusion_method": "rrf",
    "reranked": true,
    "ast_mode": true
  }
}

Fast

Local SQLite index, incremental updates, and quick top-K retrieval.

Accurate

AST chunking + RRF fusion + cross-encoder reranking = precise results.

Cost-aware

Index once, reuse forever. Chat tokens only when you choose to interpret results.

FAQ

Short answers to the questions people ask immediately.

Is this MCP?

No. ogrep is primarily delivered as a Claude Code marketplace plugin that exposes a Skill-style workflow. You run indexing and queries locally; Claude Code can call into it as a tool without requiring an MCP server.

Does ogrep use tokens?

ogrep uses embeddings for indexing and query retrieval. With local embeddings, there is no per-token bill. With cloud embeddings, indexing costs embedding tokens (and a small amount per query). ogrep does not require a chat model; chat/completion tokens are only spent if you choose to have an LLM interpret the retrieved snippets.

Should I use --ast?

Yes, for code search. AST-aware chunking produces semantically coherent chunks that dramatically improve accuracy. Install with pip install "ogrep[ast]" and use ogrep index . --ast. If your index was built without AST, run ogrep reindex . --ast.

Should I use --rerank?

Use --rerank when you need the most accurate ordering, especially for complex queries where the right result might be in the top 30 but not #1. It's slower but much more precise. Great for difficult searches.

Where does the index live?

By default: .ogrep/index.sqlite. It is a single local file, so it is easy to keep per repo or per profile.

How do I get better results?

1. Use AST chunking: ogrep reindex . --ast
2. Use reranking: --rerank for complex queries
3. Tune chunk size: ogrep tune .
4. Use fulltext mode for exact identifiers: --mode fulltext

What does the JSON hint mean?

When querying an index built without AST chunking, the JSON output includes a hint: "hint": "Index was built without AST chunking. For better semantic boundaries, run: ogrep reindex . --ast". This tells AI tools to suggest rebuilding the index with AST for better results.

ogrep v0.7 — semantic grep for codebases (local-first, SQLite-backed, Claude Code Skills)
GitHub MIT License