Stop wasting tokens on file navigation.
Sema indexes your codebase once. Your AI assistant searches it forever — 4–9× fewer tokens per question, no API keys, runs locally.
Claude Code · OpenAI Codex
Numbers measured via tiktoken (cl100k_base) on the real hoppscotch repo. Run sema against fastapi-users and the ratio holds at 9×.
Index once.
Search forever.
Sema parses every function, class, and method, embeds them locally with SBERT, and stores them in a ChromaDB index on disk. Your AI assistant connects over MCP and queries the index instead of reading files.
Parse
Every function, class, method, and section becomes a Chunk — with its full source, signature, line range, and call list.
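As a rough illustration of the parse step, here is a minimal sketch that uses Python's standard `ast` module to turn one Python source string into chunk records. The `Chunk` field names and the `parse_chunks` helper are hypothetical stand-ins chosen for this example, not Sema's actual schema or parser, and the signature is simplified to the first line of each definition.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical record for one parsed symbol; field names are illustrative."""
    name: str
    kind: str            # "function", "class", or "method"
    signature: str       # simplified: first line of the definition
    start_line: int
    end_line: int
    source: str
    calls: list[str] = field(default_factory=list)


def _call_names(node: ast.AST) -> list[str]:
    """Collect the names of everything called inside a node, sorted."""
    names = set()
    for child in ast.walk(node):
        if isinstance(child, ast.Call):
            func = child.func
            if isinstance(func, ast.Name):
                names.add(func.id)
            elif isinstance(func, ast.Attribute):
                names.add(func.attr)
    return sorted(names)


def parse_chunks(source: str) -> list[Chunk]:
    """Turn Python source into one Chunk per function, class, and method."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def visit(node: ast.AST, inside_class: bool) -> None:
        # Only direct children are inspected, so definitions nested inside
        # if/try blocks are skipped in this sketch.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if inside_class else "function"
            elif isinstance(child, ast.ClassDef):
                kind = "class"
            else:
                continue
            chunks.append(Chunk(
                name=child.name,
                kind=kind,
                signature=lines[child.lineno - 1].strip(),
                start_line=child.lineno,
                end_line=child.end_lineno,
                source="\n".join(lines[child.lineno - 1:child.end_lineno]),
                calls=_call_names(child),
            ))
            visit(child, inside_class=(kind == "class"))

    visit(tree, inside_class=False)
    return chunks
```

Each record carries enough to answer a search with a signature and a line range, while the full body stays available for a later fetch.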
Embed
SBERT runs locally — no API key, no network. ~80MB model downloaded once and cached globally.
Store
ChromaDB persists vectors + source bodies on disk. SHA-256 hashes skip unchanged files on re-index.
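The hash-based skip can be sketched with the standard library alone. This is an assumption-laden toy, not Sema's implementation: the JSON store file and the `file_digest` and `changed_files` helpers are hypothetical names, and a real indexer would also need to handle deleted files.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths, store_path: Path):
    """Return paths whose hash differs from the stored one, then update the store."""
    known = json.loads(store_path.read_text()) if store_path.exists() else {}
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known.get(str(path)) != digest:
            changed.append(path)          # new or modified: needs re-indexing
            known[str(path)] = digest
    store_path.write_text(json.dumps(known, indent=2))
    return changed
```

On a second run over an untouched tree the function returns an empty list, so nothing is re-parsed or re-embedded.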
Serve
Claude Code and Codex call search_code and get_code instead of running grep.
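The search-then-fetch pattern can be pictured as a two-step lookup: a cheap query that returns only signatures and locations, followed by a targeted fetch of one body. The toy below swaps in plain word-overlap scoring over an in-memory dict in place of SBERT embeddings and ChromaDB so that it runs with no dependencies. `INDEX`, `search_signatures`, and `fetch_body` are hypothetical stand-ins, not the real search_code and get_code tools.

```python
import re

# Hypothetical in-memory index: symbol name -> (signature, location, body).
INDEX = {
    "hash_password": (
        "def hash_password(raw: str) -> str",
        "auth/crypto.py:12",
        "def hash_password(raw: str) -> str:\n"
        "    return sha256(raw.encode()).hexdigest()",
    ),
    "send_email": (
        "def send_email(to: str, subject: str, body: str) -> None",
        "notify/mail.py:30",
        "def send_email(to: str, subject: str, body: str) -> None:\n"
        "    smtp.send(to, subject, body)",
    ),
}


def _words(text: str) -> set:
    """Lowercase alphabetic tokens; underscores act as separators."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search_signatures(query: str, limit: int = 3) -> list:
    """Rank symbols by word overlap with the query; return no bodies."""
    query_words = _words(query)
    scored = []
    for name, (signature, location, _body) in INDEX.items():
        score = len(query_words & _words(signature))
        if score:
            scored.append((score, name, signature, location))
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [
        {"name": name, "signature": signature, "location": location}
        for _score, name, signature, location in scored[:limit]
    ]


def fetch_body(name: str) -> str:
    """Return the full source of one symbol by exact name."""
    return INDEX[name][2]


hits = search_signatures("hash a password")
body = fetch_body(hits[0]["name"])
```

Keeping bodies out of the search result is the design choice that saves tokens here: the assistant pays for full source only for the symbol it decides to open.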
Six new tools your AI gets the moment you install.
Find a function, class, or method by natural-language description. Returns signatures + file locations, no bodies.
Fetch the full source body of a symbol by exact name. Returns every implementation if the name appears in multiple files.
Compressed architecture overview — files with their exported symbols. The fastest way to orient a new session.
Locate every call site and reference to a symbol. Returns signatures only — load bodies on demand.
Summary of what a file exports — classes, functions, imports — without dumping the source.
Bidirectional call graph: what the symbol calls, and what calls it. Run it before refactoring to see the blast radius.
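To make the bidirectional call graph and its blast radius concrete, here is a minimal sketch built on Python's standard `ast` module. It only tracks direct calls between top-level functions in a single source string, and `build_call_graph` and `blast_radius` are hypothetical helper names rather than Sema's API.

```python
import ast
from collections import defaultdict

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def build_call_graph(source: str):
    """Map each top-level function to what it calls, and to what calls it."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, FUNC_TYPES)}
    callees = defaultdict(set)   # function -> functions it calls
    callers = defaultdict(set)   # function -> functions that call it
    for node in tree.body:
        if not isinstance(node, FUNC_TYPES):
            continue
        for child in ast.walk(node):
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                target = child.func.id
                if target in defined:
                    callees[node.name].add(target)
                    callers[target].add(node.name)
    return callees, callers


def blast_radius(symbol: str, callers) -> set:
    """Every function that reaches `symbol` directly or transitively."""
    seen = set()
    stack = [symbol]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen
```

Walking the `callers` map upward gives the set of functions that could be affected by changing a symbol, which is the question to ask before a refactor.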
AST-aware where it counts.
Text-aware everywhere else.
AI assistants navigate by reading.
Sema teaches them to search.
| | find · cat · read | sema |
|---|---|---|
| Tool calls per question | 4–8 reads | 3 (search → fetch → fetch) |
| Tokens for "how does X work?" | 5,000–15,000 | 500–1,500 |
| Scales with repo size | ✗ cost grows linearly | ✓ constant per query |
| Symbol-level fetch | ✗ whole file or nothing | ✓ one function at a time |
| Call graph / blast radius | ✗ manual grep across files | ✓ impact_analysis() |
| Stays consistent across sessions | ✗ every cold start re-explores | ✓ index persists on disk |
| Data leaves your machine | depends on assistant | ✓ 100% local |
Shipping in the open.
Tool improvements
- impact_analysis with call graph
- explain_file with import graph
- Better stale-index errors
Incremental indexing
- sema watch — re-index on save
- Workspace support for monorepos
- SHA-256 hash store, 20× faster re-index
More AST parsers
- Rust · tree-sitter-rust
- Java / Kotlin
- Ruby · C# · C/C++
Public release
- Publish to PyPI
- Homebrew formula
- Cursor, Copilot, Windsurf auto-config