CodeIndex.Mcp
MCP server for semantic code indexing with Qdrant and llama.cpp.
Changelog
1.0.7
- Added runtime controls for expensive background work:
  - New --disable-rerank flag disables the local reranker service and removes rerank from advertised search modes
  - New --filewatcher-interval-seconds option controls the delay between background FileWatcher passes
  - get_index no longer requires the reranker service unless search_mode=rerank is requested
  - MCP search_mode schema default now matches runtime default: hybrid
1.0.6
- Breaking Change: Replaced adaptive HTTP concurrency with static --workers parameter:
  - New CLI option --workers (default: 2, min: 1, max: 32)
  - Controls both HTTP request concurrency (tokenize + embed) and llama embedding workers
  - Removed adaptive HTTP limiter and health-based dynamic adjustment
  - Simplified progress logging: removed tok/emb/lim/health counters, kept only ok/err
1.0.5
- Optimized external llama server throughput for multi-GPU setups:
  - Increased HTTP parallelism from CPU * 2 to CPU * 4 for external llama mode
  - Increased backpressure queue capacity from 1024 to 2048
  - Added explicit HttpClient connection pooling (64 connections, 5 min lifetime)
1.0.4
- Merged C++ and C files into the same category
- Removed the fusion field from the get_index response
1.0.3
- Added centralized TextNormalization utility class for consistent text handling
- Consolidated line ending normalization (CRLF → LF) and path separator normalization (\ → /)
- Upgraded file path normalization, taking into account quotation marks returned by git commands
- Improved code maintainability by removing duplicate normalization methods
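The normalization rules listed for 1.0.3 can be illustrated with a small sketch. This is not the package's TextNormalization class (which is .NET and not shown in this README); the function names below are hypothetical, and the unquoting step assumes git's usual C-style quoting of paths with octal escapes.

```python
def normalize_line_endings(text: str) -> str:
    """Convert CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


def unquote_git_path(path: str) -> str:
    """Undo git's C-style quoting of a path, if the path is quoted."""
    if len(path) >= 2 and path.startswith('"') and path.endswith('"'):
        # Decode escapes such as \303\251 into raw bytes, then read them as UTF-8.
        raw = path[1:-1].encode("utf-8").decode("unicode_escape")
        return raw.encode("latin-1").decode("utf-8")
    return path


def normalize_path(path: str) -> str:
    """Strip git quoting first, then convert backslashes to forward slashes."""
    return unquote_git_path(path.strip()).replace("\\", "/")


if __name__ == "__main__":
    print(normalize_path("src\\app\\Main.cs"))
    print(normalize_path('"caf\\303\\251.txt"'))
```

Unquoting happens before separator replacement so that escape sequences inside a quoted path are decoded rather than turned into slashes.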
Features
- Semantic Code Search: Index and search code using vector embeddings
- Git Repository Support: Automatic file discovery via git ls-files
- Tree-sitter Chunking: Intelligent code chunking based on AST
- Multiple Embedding Backends:
  - Local llama.cpp (GPU accelerated via CUDA/Vulkan/SYCL)
  - External OpenAI-compatible APIs (OpenRouter, etc.)
  - Remote llama.cpp server
- Hybrid Search: Dense + sparse (BM25) vectors with RRF/DBSF fusion
- Reranking: Optional local llama.cpp reranker for improved relevance
- Distributed Indexing: Shard support for parallel one-shot indexing
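To give a feel for what RRF fusion in hybrid search does, here is a minimal sketch of reciprocal rank fusion over a dense and a sparse result list. It is an illustration only: the fusion in this tool is presumably performed by Qdrant, whose implementation and constants may differ. The function name, the sample file names, and the constant k = 60 (the value commonly used in the RRF literature) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids: each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense = ["a.cs", "b.cs", "c.cs"]   # hypothetical dense-vector ranking
    sparse = ["b.cs", "d.cs", "a.cs"]  # hypothetical BM25 ranking
    print(rrf_fuse([dense, sparse]))
```

Items that appear near the top of both lists accumulate the largest scores, which is why a result ranked well by both dense and sparse retrieval tends to lead the fused list.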
Installation
dotnet tool install -g CodeIndex.Mcp
Usage
MCP Mode (for AI assistants)
code-index-mcp
The server communicates via JSON-RPC over stdio, following the Model Context Protocol (MCP).
One-shot Mode (single indexing run)
code-index-mcp --repo C:\path\to\repo --one-shot
Show Version
code-index-mcp --version
Command Line Options
Core Options
| Option | Default | Description |
|---|---|---|
| --repo | (auto-detected) | Git repository path. Auto-detected via: (1) MCP roots/list, (2) current working directory. Required for one-shot mode. |
| --one-shot | false | Run single indexing pass and exit |
| --verbose | false | Enable debug logging, disable interactive progress |
| --disable-filewatcher | false | Disable background FileWatcher (implicitly disabled with --one-shot) |
| --filewatcher-interval-seconds | 60 | Delay between background FileWatcher passes after each pass completes (10-86400) |
| --disable-rerank | false | Disable local reranker service and remove rerank from advertised search modes |
| --qdrant-url | http://localhost:6333 | Qdrant server URL (always local, no API key) |
| --collection | (repo folder name) | Collection name in Qdrant. Creates {collection} and {collection}-files |
| --vendor | (auto) | Force GPU vendor for local llama: nvidia, amd, intel |
| --workers | 2 | Number of parallel workers for HTTP requests and llama embedding (1-32) |
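The static --workers limit can be pictured as a fixed-size worker pool. The sketch below is a rough Python analogy, not the tool's .NET implementation: check_workers and run_with_workers are hypothetical names, and rejecting out-of-range values (rather than clamping them) is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def check_workers(value: int) -> int:
    """Accept a worker count in the documented 1-32 range (rejection is assumed)."""
    if not 1 <= value <= 32:
        raise ValueError("workers must be between 1 and 32")
    return value


def run_with_workers(items, fn, workers=2):
    """Apply fn to every item with at most `workers` calls in flight at once."""
    with ThreadPoolExecutor(max_workers=check_workers(workers)) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(fn, items))


if __name__ == "__main__":
    print(run_with_workers(["a", "bb", "ccc"], len, workers=2))
```

A fixed limit like this trades peak throughput for predictability: concurrency never changes at runtime, unlike the adaptive limiter removed in 1.0.6.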
Local llama.cpp Options
Used for local embedding (when --external-embed-api-key is not set) and for reranking unless --disable-rerank is set.
| Option | Default | Description |
|---|---|---|
| --llama-embed-model | jina-code-embeddings-0.5b-Q8_0 | Local embedding model filename (without .gguf extension) |
| --llama-rerank-model | jina-reranker-v3-Q8_0 | Local reranker model filename (without .gguf extension) |
| --llama-embed-parallel | 2 | Embedding parallel slots (1-4) |
| --llama-embed-slot-tokens | 2048 | Tokens per embedding slot (1024-4096) |
| --llama-rerank-parallel | 1 | Rerank parallel slots (1-2) |
| --llama-rerank-slot-tokens | 3072 | Tokens per rerank slot (2048-4096) |
External Embedding API Options
When --external-embed-api-key is set, embedding is performed via external API. Reranking still uses local llama.cpp unless --disable-rerank is set.
| Option | Default | Description |
|---|---|---|
| --external-embed-api-key | (empty) | API key for external embedding service. Enables external mode when set. |
| --external-embed-api-url | https://openrouter.ai/api/v1 | External API URL |
| --external-type | service | External embedder type: service or llama |
| --service-embed-model | openai/text-embedding-3-small | Model identity for external service |
| --service-embed-provider | (empty) | Provider for external service (OpenRouter-specific) |
| --service-embed-context-length | 8192 | Context window size (8192-32768) |
External Types
- service: OpenRouter/OpenAI-compatible payload. Includes model field, and provider field per OpenRouter rules.
- llama: llama-compatible payload. Uses --llama-embed-model for model name, no provider field. For remote llama.cpp servers.
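As a rough illustration of how the two external types differ, the sketch below builds an embedding request body for each. The README does not show the real payloads, so several details are assumptions: the function name, the OpenAI-style input field, and in particular the shape of the provider object, which is only a guess at OpenRouter's routing format. The default model names come from the option tables above.

```python
def build_embed_payload(external_type, texts,
                        service_model="openai/text-embedding-3-small",
                        llama_model="jina-code-embeddings-0.5b-Q8_0",
                        provider=""):
    """Build a request body for the given external embedder type (hypothetical)."""
    if external_type == "service":
        payload = {"model": service_model, "input": list(texts)}
        if provider:
            # Assumed shape; check OpenRouter's documentation for the real rules.
            payload["provider"] = {"order": [provider]}
        return payload
    if external_type == "llama":
        # The llama type uses the llama embed model name and never sends provider.
        return {"model": llama_model, "input": list(texts)}
    raise ValueError("external type must be 'service' or 'llama'")


if __name__ == "__main__":
    print(build_embed_payload("service", ["int main() {}"], provider="openai"))
    print(build_embed_payload("llama", ["int main() {}"]))
```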
Distributed One-shot Options
For parallel indexing across multiple processes/machines. Only valid with --one-shot.
| Option | Default | Description |
|---|---|---|
| --scan-shard-index | (null) | Shard index (0-based). Requires --scan-shard-count. |
| --scan-shard-count | (null) | Total number of shards (minimum 2). |
Files are distributed by: ordered_file_index % shard_count == shard_index
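The distribution rule can be sketched in a few lines. This is an illustration under assumptions: the function name is hypothetical, and plain lexicographic sorting stands in for whatever ordering the tool actually applies to produce ordered_file_index.

```python
def files_for_shard(files, shard_index, shard_count):
    """Return the files a shard owns: ordered_file_index % shard_count == shard_index."""
    if shard_count < 2:
        raise ValueError("shard_count must be at least 2")
    if not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must be in [0, shard_count)")
    ordered = sorted(files)  # assumed ordering; every shard must use the same one
    return [path for i, path in enumerate(ordered) if i % shard_count == shard_index]


if __name__ == "__main__":
    repo_files = ["a.cs", "b.cs", "c.cs", "d.cs", "e.cs"]
    for index in range(2):
        print(index, files_for_shard(repo_files, index, 2))
```

Because every process derives the same ordering independently, the shards partition the file list without any coordination: each file lands in exactly one shard.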
Requirements
- Qdrant: Running instance at http://localhost:6333 (or custom URL via --qdrant-url)
- llama.cpp (for local embedding/reranking):
  - C:\llama\cuda\llama-server.exe for NVIDIA
  - C:\llama\vulkan\llama-server.exe for AMD
  - C:\llama\sycl\llama-server.exe for Intel
  - C:\llama\models\jina-code-embeddings-0.5b-Q8_0.gguf (embedding)
  - C:\llama\models\jina-reranker-v3-Q8_0.gguf (reranking, unless disabled)
MCP Tools
The server provides these MCP tools:
set_index
Index or update repository code.
{"name": "set_index", "arguments": {"reindex": false}}
- reindex (bool, default: false): Full reindex; deletes and recreates both collections
get_index
Semantic code search.
{"name": "get_index", "arguments": {"query": "search query", "count": 32, "search_mode": "rerank"}}
- query (string, required): Search query
- count (int, default: 32, range: 16-64): Number of results
- path_prefix (string, default: empty): Filter by path prefix
- lang (string, default: empty): Filter by programming language
- search_mode (string, default: hybrid): dense, sparse, hybrid, or rerank
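For clients that talk to the server directly, a get_index call is wrapped in a JSON-RPC 2.0 tools/call request, as in standard MCP. The sketch below builds such a request and checks the arguments against the documented ranges on the client side. The helper name is hypothetical, the server's own validation behavior is not described here, and a real session also needs the MCP initialize handshake before any tool call.

```python
import json

SEARCH_MODES = {"dense", "sparse", "hybrid", "rerank"}


def build_get_index_request(request_id, query, count=32, search_mode="hybrid",
                            path_prefix="", lang=""):
    """Build a JSON-RPC 2.0 tools/call message for the get_index tool."""
    if not query:
        raise ValueError("query is required")
    if not 16 <= count <= 64:
        raise ValueError("count must be between 16 and 64")
    if search_mode not in SEARCH_MODES:
        raise ValueError("unknown search_mode")
    arguments = {"query": query, "count": count, "search_mode": search_mode}
    if path_prefix:
        arguments["path_prefix"] = path_prefix
    if lang:
        arguments["lang"] = lang
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_index", "arguments": arguments},
    }


if __name__ == "__main__":
    # Over the stdio transport, each message is written as one line of JSON.
    print(json.dumps(build_get_index_request(1, "normalize git paths")))
```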
recovery
Cleanup inconsistent state after errors.
{"name": "recovery", "arguments": {}}
Removes stale entries from collections based on current repository state.
Examples
Basic MCP Mode
code-index-mcp
One-shot Indexing
code-index-mcp --repo C:\projects\myapp --one-shot
External Embedding API (OpenRouter)
code-index-mcp --repo C:\projects\myapp `
--external-embed-api-key sk-xxx `
--external-embed-api-url https://openrouter.ai/api/v1 `
--external-type service `
--service-embed-model openai/text-embedding-3-small
External llama.cpp Server
code-index-mcp --repo C:\projects\myapp `
--external-embed-api-key any `
--external-embed-api-url http://192.168.1.100:8080 `
--external-type llama
Force NVIDIA GPU
code-index-mcp --repo C:\projects\myapp --vendor nvidia
Distributed Indexing (2 shards)
# Terminal 1 - Shard 0
code-index-mcp --repo C:\projects\myapp --one-shot --scan-shard-index 0 --scan-shard-count 2
# Terminal 2 - Shard 1
code-index-mcp --repo C:\projects\myapp --one-shot --scan-shard-index 1 --scan-shard-count 2
License
MIT