SharpCoreDB.VectorSearch 1.3.0


๐Ÿ” SharpCoreDB.VectorSearch

High-performance vector search extension for SharpCoreDB: SIMD-accelerated similarity search with HNSW indexing, quantization, and encrypted storage.


🚀 Overview

SharpCoreDB.VectorSearch enables semantic search, similarity matching, and AI/RAG applications by storing and querying high-dimensional embeddings directly within your SharpCoreDB database. It's built for production workloads with:

  • ✅ Pure managed C# 14: zero native dependencies
  • ✅ SIMD-accelerated: AVX-512, AVX2, and ARM NEON support
  • ✅ HNSW indexing: logarithmic-time approximate nearest neighbor search
  • ✅ Quantization: scalar and binary quantization for memory efficiency
  • ✅ Encrypted storage: AES-256-GCM for sensitive embeddings
  • ✅ NativeAOT compatible: deploy as trimmed, self-contained executables
  • ✅ SQL integration: native VECTOR(N) type and vec_*() functions
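
To give a feel for what SIMD acceleration looks like in managed code, here is a minimal sketch of a vectorized dot product built on System.Numerics.Vector<T>. It is an illustration under stated assumptions, not SharpCoreDB's actual kernel: the helper name DotSimd is hypothetical, and the real implementation may use different intrinsics.

```csharp
using System;
using System.Numerics;

// Hypothetical helper (not part of SharpCoreDB): dot product that processes
// Vector<float>.Count components per step, with a scalar loop for the tail.
static float DotSimd(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    int lanes = Vector<float>.Count;   // hardware dependent, e.g. 8 with AVX2
    var acc = Vector<float>.Zero;
    int i = 0;

    // Main loop: multiply and accumulate one full SIMD register at a time.
    for (; i <= a.Length - lanes; i += lanes)
    {
        var va = new Vector<float>(a.Slice(i, lanes));
        var vb = new Vector<float>(b.Slice(i, lanes));
        acc += va * vb;
    }

    // Horizontal sum of the accumulator, then the leftover components.
    float sum = Vector.Sum(acc);
    for (; i < a.Length; i++)
        sum += a[i] * b[i];

    return sum;
}

float[] x = [1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f, 10f];
float[] y = [1f, 1f, 1f, 1f, 1f, 1f, 1f, 1f, 1f, 10f];
Console.WriteLine(DotSimd(x, y));   // 145
```

The same pattern extends to squared L2 distance by accumulating (va - vb) * (va - vb) instead of va * vb.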

Performance Highlights

| Operation | Typical Value | Notes |
|-----------|---------------|-------|
| Vector Search (k=10) | 0.5-2ms | 1M vectors, HNSW index, cosine similarity |
| Index Build (1M vectors) | 2-5 seconds | M=16, efConstruction=200 |
| Memory Overhead | 200-400 bytes/vector | HNSW graph structure (M=16) |
| Throughput | 500-2000 queries/sec | Single-threaded on modern CPU |

Benchmarks were run on an AMD Ryzen 9 5950X with 1536-dimensional vectors. See tests/SharpCoreDB.Benchmarks/VectorSearchPerformanceBenchmark.cs for reproducible results.


📦 Installation

# Install SharpCoreDB core (if not already installed)
dotnet add package SharpCoreDB --version 1.3.0

# Install vector search extension
dotnet add package SharpCoreDB.VectorSearch --version 1.3.0

Requirements:

  • .NET 10.0 or later
  • SharpCoreDB 1.3.0+
  • 64-bit runtime (x64, ARM64)

🎯 Quick Start

1. Register Vector Support

using Microsoft.Extensions.DependencyInjection;
using SharpCoreDB;
using SharpCoreDB.VectorSearch;

var services = new ServiceCollection();
services.AddSharpCoreDB()
    .AddVectorSupport(options =>
    {
        options.EnableQueryOptimization = true;  // Auto-select indexes
        options.DefaultIndexType = VectorIndexType.Hnsw;
        options.MaxCacheSize = 1_000_000;       // Cache 1M vectors
    });

var provider = services.BuildServiceProvider();
var factory = provider.GetRequiredService<DatabaseFactory>();

using var db = factory.Create("./vector_db", "StrongPassword!");

2. Create Vector Schema

// Create table with VECTOR column
await db.ExecuteSQLAsync(@"
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(1536)  -- OpenAI text-embedding-3-small dimensions
    )
");

// Build HNSW index for fast similarity search
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_doc_embedding ON documents(embedding)
    WITH (index_type='hnsw', m=16, ef_construction=200)
");

3. Insert Vectors

// Insert embeddings (e.g., from OpenAI API)
var embedding = new float[1536]; // Your embedding vector
// ... populate embedding from your ML model ...

await db.ExecuteSQLAsync(@"
    INSERT INTO documents (id, title, content, embedding)
    VALUES (?, ?, ?, ?)
", [1, "AI Overview", "Artificial Intelligence is...", embedding]);

4. Search Similar Vectors

// Search for similar documents
var queryEmbedding = new float[1536]; // Query embedding
var k = 10;  // Top-10 results

var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS distance
    FROM documents
    ORDER BY distance ASC
    LIMIT ?
", [queryEmbedding, k]);

foreach (var row in results)
{
    Console.WriteLine($"Document: {row["title"]}, Distance: {row["distance"]:F3}");
}

๐Ÿ› ๏ธ Features

Distance Metrics

Choose the right metric for your embeddings:

| Metric | Use Case | SQL Function |
|--------|----------|--------------|
| Cosine | Text embeddings (normalized) | vec_distance_cosine(v1, v2) |
| Euclidean (L2) | Image embeddings, general purpose | vec_distance_l2(v1, v2) |
| Dot Product | Recommendation systems, max similarity | vec_dot_product(v1, v2) |
| Hamming | Binary embeddings | vec_distance_hamming(v1, v2) |

// Example: Dot product search (higher = more similar)
var results = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_dot_product(embedding, ?) AS score
    FROM documents
    ORDER BY score DESC
    LIMIT 10
", [queryEmbedding]);
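
For reference, the four metrics can be written out in a few lines of plain C#. This is a sketch of the standard mathematical definitions, assuming vec_distance_cosine is computed as 1 minus cosine similarity (which matches its documented 0-2 range); the function names below are illustrative and are not the library's API.

```csharp
using System;
using System.Numerics;

// Dot product: larger values mean more similar (for comparable magnitudes).
static float Dot(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
    return sum;
}

// Euclidean (L2) distance: straight-line distance between the two points.
static float L2Distance(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return MathF.Sqrt(sum);
}

// Cosine distance, assumed here to be 1 - cos(angle): 0 = same direction,
// 1 = orthogonal, 2 = opposite direction.
static float CosineDistance(float[] a, float[] b)
    => 1f - Dot(a, b) / (MathF.Sqrt(Dot(a, a)) * MathF.Sqrt(Dot(b, b)));

// Hamming distance over bit-packed vectors: number of differing bits.
static int HammingDistance(ulong[] a, ulong[] b)
{
    int bits = 0;
    for (int i = 0; i < a.Length; i++) bits += BitOperations.PopCount(a[i] ^ b[i]);
    return bits;
}

float[] east = [1f, 0f];
float[] north = [0f, 1f];
float[] west = [-1f, 0f];

Console.WriteLine(CosineDistance(east, east));   // 0
Console.WriteLine(CosineDistance(east, north));  // 1
Console.WriteLine(CosineDistance(east, west));   // 2
Console.WriteLine(HammingDistance([0b1010UL], [0b0110UL]));  // 2
```

Because cosine and L2 are distances, they sort ascending; dot product is a similarity score, so it sorts descending.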

Index Types

HNSW (Hierarchical Navigable Small World)

Best for: Large datasets (10K+ vectors), fast approximate search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_hnsw ON vectors(embedding)
    WITH (
        index_type='hnsw',
        m=16,               -- Neighbors per layer (higher = more recall, slower build)
        ef_construction=200, -- Build-time beam search width
        ef_search=50        -- Query-time beam search width
    )
");

Tuning Guide:

  • M=8-16: good default (16 for high recall, 8 for faster build)
  • ef_construction=100-400: higher = better quality, slower build
  • ef_search=10-100: higher = better recall, slower search
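
When tuning these parameters it helps to measure recall directly: run the same query against an exact (flat) search and the HNSW index, then compare the returned ids. The sketch below shows the usual recall@k calculation; RecallAtK is a hypothetical helper, and how you obtain the two id lists from your database is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: fraction of the true top-k neighbors (from an exact
// search) that also appear in the approximate top-k result.
static double RecallAtK(IReadOnlyList<int> approximate, IReadOnlyList<int> exact, int k)
{
    var truth = new HashSet<int>(exact.Take(k));
    if (truth.Count == 0) return 0.0;

    int hits = approximate.Take(k).Count(id => truth.Contains(id));
    return (double)hits / truth.Count;
}

int[] exactIds = [1, 2, 3, 4, 5];    // ids from an exact search
int[] approxIds = [1, 2, 3, 9, 5];   // ids from an approximate search

// 4 of the 5 true neighbors were found, so recall@5 is 4/5.
Console.WriteLine(RecallAtK(approxIds, exactIds, 5));
```

Averaging this value over a representative set of queries gives a single number to track while raising or lowering ef_search.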

Flat Index

Best for: Small datasets (<1K vectors), exact search

await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_flat ON vectors(embedding)
    WITH (index_type='flat')
");
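
Conceptually, a flat index answers a query by scanning every stored vector and keeping the k closest. The following is a minimal sketch of that idea using a bounded heap; it assumes squared L2 distance and in-memory arrays, and FlatSearch is a hypothetical name rather than the library's implementation.

```csharp
using System;
using System.Collections.Generic;

static float L2Squared(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Exhaustive scan that keeps the k nearest vectors seen so far.
static List<(int Id, float Distance)> FlatSearch(float[][] vectors, float[] query, int k)
{
    // PriorityQueue is a min-heap on priority, so distances are negated:
    // the farthest current candidate sits at the root and is evicted first.
    var heap = new PriorityQueue<int, float>();

    for (int id = 0; id < vectors.Length; id++)
    {
        float d = L2Squared(vectors[id], query);
        if (heap.Count < k)
            heap.Enqueue(id, -d);
        else if (heap.TryPeek(out _, out float worst) && d < -worst)
            heap.EnqueueDequeue(id, -d);   // replace the farthest candidate
    }

    var result = new List<(int Id, float Distance)>(heap.Count);
    while (heap.TryDequeue(out int nearId, out float negDistance))
        result.Add((nearId, -negDistance));

    result.Reverse();   // dequeued farthest-first; return nearest-first
    return result;
}

float[][] data = [[0f, 0f], [1f, 1f], [5f, 5f], [0.5f, 0.5f]];
var top2 = FlatSearch(data, [0f, 0f], 2);
foreach (var (id, distance) in top2)
    Console.WriteLine($"id={id}");   // id=0, then id=3
```

The cost grows linearly with the number of vectors, which is why exact search suits small tables and HNSW suits large ones.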

Quantization

Reduce vector memory usage by 4x (scalar) or 32x (binary) in exchange for a small drop in recall:

// Scalar Quantization (4x reduction: float32 → int8)
var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar
);

// Binary Quantization (32x reduction: float32 → bit)
await indexManager.CreateIndexAsync(
    tableName: "documents",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Binary
);

Tradeoffs:

  • Scalar: ~1-3% recall drop, 4x memory savings
  • Binary: ~5-10% recall drop, 32x memory savings, best for cosine similarity
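
The two schemes can be sketched in a few lines to show where the 4x and 32x figures come from. This is one common formulation (min/max linear mapping to int8, and sign bits packed into 64-bit words); the source does not describe the library's exact encoding, so the helper names and the choice of range and threshold here are assumptions.

```csharp
using System;

// Scalar quantization: map [min, max] linearly onto the 256 int8 levels.
static sbyte[] QuantizeScalar(float[] v, float min, float max)
{
    float scale = 255f / (max - min);
    var q = new sbyte[v.Length];
    for (int i = 0; i < v.Length; i++)
    {
        float clamped = Math.Clamp(v[i], min, max);
        q[i] = (sbyte)(MathF.Round((clamped - min) * scale) - 128f);
    }
    return q;
}

// Approximate inverse of QuantizeScalar (lossy: values snap to 256 levels).
static float[] Dequantize(sbyte[] q, float min, float max)
{
    float step = (max - min) / 255f;
    var v = new float[q.Length];
    for (int i = 0; i < q.Length; i++)
        v[i] = (q[i] + 128) * step + min;
    return v;
}

// Binary quantization: keep one bit per component (1 if positive).
static ulong[] QuantizeBinary(float[] v)
{
    var bits = new ulong[(v.Length + 63) / 64];
    for (int i = 0; i < v.Length; i++)
        if (v[i] > 0f)
            bits[i / 64] |= 1UL << (i % 64);
    return bits;
}

float[] sample = [-1f, 0.25f, 1f];
sbyte[] scalar = QuantizeScalar(sample, -1f, 1f);   // -128 ... 127
ulong[] binary = QuantizeBinary(sample);            // bits 1 and 2 set

// Storage for one 1536-dimensional vector under each scheme.
int dims = 1536;
Console.WriteLine($"float32: {dims * sizeof(float)} bytes");                   // 6144
Console.WriteLine($"int8:    {dims * sizeof(sbyte)} bytes");                   // 1536
Console.WriteLine($"binary:  {QuantizeBinary(new float[dims]).Length * 8} bytes");  // 192
```

Binary codes are compared with Hamming distance, which tracks angular similarity; that is consistent with the note above that binary quantization works best with cosine similarity.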

SQL Functions

-- Distance/similarity functions
vec_distance_cosine(v1, v2)    -- Returns 0-2 (lower = more similar)
vec_distance_l2(v1, v2)        -- Euclidean distance
vec_dot_product(v1, v2)        -- Dot product (higher = more similar)
vec_distance_hamming(v1, v2)   -- Hamming distance (binary vectors)

-- Vector operations
vec_length(v)                  -- Vector L2 norm
vec_normalize(v)               -- Normalize to unit length
vec_add(v1, v2)                -- Element-wise addition
vec_subtract(v1, v2)           -- Element-wise subtraction
vec_multiply(v, scalar)        -- Scalar multiplication

-- Metadata
vec_dimensions(v)              -- Get vector dimensions
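
As a rough guide to what the vector operations compute, here is a sketch of the underlying arithmetic in C#. The helper names are illustrative; in particular, how vec_normalize treats a zero vector is not stated in the source, so the exception below is an assumption.

```csharp
using System;

// L2 norm: square root of the sum of squared components.
static float Length(float[] v)
{
    float sum = 0f;
    foreach (float x in v) sum += x * x;
    return MathF.Sqrt(sum);
}

// Scale a vector to unit length (assumption: zero vectors are rejected).
static float[] Normalize(float[] v)
{
    float len = Length(v);
    if (len == 0f) throw new ArgumentException("Cannot normalize a zero vector.");

    var unit = new float[v.Length];
    for (int i = 0; i < v.Length; i++) unit[i] = v[i] / len;
    return unit;
}

// Element-wise addition of two vectors with the same dimensions.
static float[] Add(float[] a, float[] b)
{
    var sum = new float[a.Length];
    for (int i = 0; i < a.Length; i++) sum[i] = a[i] + b[i];
    return sum;
}

float[] v = [3f, 4f];
Console.WriteLine(Length(v));              // 5
float[] unit = Normalize(v);               // approximately [0.6, 0.8]
Console.WriteLine(Add(v, [1f, 1f])[0]);    // 4
```

Normalizing embeddings once at insert time means cosine distance and dot product produce the same ranking.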

📊 Use Cases

1. AI/RAG Applications

Store document embeddings for retrieval-augmented generation:

// Index knowledge base
var docs = await LoadDocumentsAsync();
foreach (var doc in docs)
{
    var embedding = await GetEmbeddingAsync(doc.Content);  // OpenAI, Ollama, etc.
    await db.ExecuteSQLAsync(@"
        INSERT INTO knowledge_base (id, content, embedding)
        VALUES (?, ?, ?)
    ", [doc.Id, doc.Content, embedding]);
}

// Retrieve context for LLM
var userQuestion = "What is vector search?";
var queryEmbedding = await GetEmbeddingAsync(userQuestion);
var context = await db.ExecuteSQLAsync(@"
    SELECT content
    FROM knowledge_base
    ORDER BY vec_distance_cosine(embedding, ?)
    LIMIT 5
", [queryEmbedding]);

// Send context + question to LLM...

2. Semantic Search

Search by meaning, not just keywords:

// Traditional keyword search (may miss relevant docs)
var results = await db.ExecuteSQLAsync(@"
    SELECT * FROM articles
    WHERE content LIKE '%machine learning%'
");

// Semantic vector search (finds conceptually similar docs)
var queryEmbedding = await GetEmbeddingAsync("machine learning");
var semanticResults = await db.ExecuteSQLAsync(@"
    SELECT id, title, vec_distance_cosine(embedding, ?) AS distance
    FROM articles
    ORDER BY distance ASC
    LIMIT 10
", [queryEmbedding]);

3. Recommendation Systems

Find similar products, users, or content:

// Find similar products based on embedding similarity
var productEmbedding = await GetProductEmbeddingAsync(productId);
var recommendations = await db.ExecuteSQLAsync(@"
    SELECT id, name, price, vec_dot_product(embedding, ?) AS score
    FROM products
    WHERE id != ?
    ORDER BY score DESC
    LIMIT 5
", [productEmbedding, productId]);

4. Image/Audio Similarity

Compare media by their embeddings (e.g., CLIP, Wav2Vec):

// Find visually similar images
var imageEmbedding = await GetImageEmbeddingAsync(imagePath);  // CLIP model
var similarImages = await db.ExecuteSQLAsync(@"
    SELECT id, path, vec_distance_l2(embedding, ?) AS distance
    FROM images
    ORDER BY distance ASC
    LIMIT 20
", [imageEmbedding]);

๐Ÿ” Security

Encrypted Vector Storage

All vectors are encrypted at rest using AES-256-GCM when you create an encrypted database:

using var db = factory.CreateEncrypted(
    dbPath: "./secure_vectors",
    password: "YourStrongPassword123!",
    options: new DatabaseOptions
    {
        EnableEncryption = true  // Vectors encrypted automatically
    }
);

What's encrypted:

  • ✅ Vector embeddings (VECTOR columns)
  • ✅ HNSW graph structure
  • ✅ Quantization tables
  • ✅ All metadata

⚡ Performance Tips

1. Choose the Right Index

| Dataset Size | Recommended Index | Search Time |
|--------------|-------------------|-------------|
| < 1K vectors | Flat | 0.1-1ms |
| 1K-10K vectors | HNSW (M=8) | 0.2-0.5ms |
| 10K-100K vectors | HNSW (M=16) | 0.5-2ms |
| 100K+ vectors | HNSW (M=16) + Quantization | 1-5ms |

2. Tune HNSW Parameters

// High recall (slower)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_high_recall ON vectors(embedding)
    WITH (index_type='hnsw', m=32, ef_construction=400, ef_search=100)
");

// Fast search (lower recall)
await db.ExecuteSQLAsync(@"
    CREATE INDEX idx_fast ON vectors(embedding)
    WITH (index_type='hnsw', m=8, ef_construction=100, ef_search=10)
");

3. Use Quantization for Large Datasets

// 1M vectors, 1536 dimensions:
// - Unquantized: ~6GB RAM
// - Scalar:      ~1.5GB RAM (4x reduction)
// - Binary:      ~200MB RAM (32x reduction)

var indexManager = provider.GetRequiredService<VectorIndexManager>();
await indexManager.CreateIndexAsync(
    tableName: "large_embeddings",
    columnName: "embedding",
    indexType: VectorIndexType.Hnsw,
    quantization: QuantizationType.Scalar  // 4x memory savings
);
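
The memory figures in the comment above follow from simple arithmetic on the raw vector payload. The sketch below reproduces them; it counts vector data only and leaves out the HNSW graph, which the Performance Highlights table puts at 200-400 bytes per vector. VectorBytes is a hypothetical helper.

```csharp
using System;

// Raw storage for `count` vectors of `dims` components at a given bit width.
static long VectorBytes(long count, int dims, int bitsPerComponent)
    => count * dims * bitsPerComponent / 8;

const long Count = 1_000_000;
const int Dims = 1536;

long float32Bytes = VectorBytes(Count, Dims, 32);  // 6,144,000,000 (~6 GB)
long scalarBytes  = VectorBytes(Count, Dims, 8);   // 1,536,000,000 (~1.5 GB)
long binaryBytes  = VectorBytes(Count, Dims, 1);   //   192,000,000 (~200 MB)

Console.WriteLine($"float32: {float32Bytes:N0} bytes");
Console.WriteLine($"scalar:  {scalarBytes:N0} bytes ({float32Bytes / scalarBytes}x smaller)");
Console.WriteLine($"binary:  {binaryBytes:N0} bytes ({float32Bytes / binaryBytes}x smaller)");
```

Swapping in your own vector count and dimensions gives a quick estimate of whether a dataset fits in RAM before you build the index.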

4. Batch Operations

// ✅ DO: Batch inserts
using var transaction = db.BeginTransaction();
foreach (var doc in documents)
{
    await db.ExecuteSQLAsync(@"
        INSERT INTO documents (id, embedding) VALUES (?, ?)
    ", [doc.Id, doc.Embedding]);
}
transaction.Commit();

// โŒ DON'T: Individual transactions
foreach (var doc in documents)
{
    using var tx = db.BeginTransaction();
    await db.ExecuteSQLAsync("INSERT INTO documents ...");
    tx.Commit();  // Slow!
}

🧪 Testing

Run the included benchmarks to verify performance on your hardware:

cd tests/SharpCoreDB.Benchmarks
dotnet run -c Release -- --filter "*VectorSearch*"

Example output:

| Method        | VectorCount | Dimensions | K   | Mean      | Error   | StdDev  | Allocated |
|-------------- |------------ |----------- |---- |----------:|--------:|--------:|----------:|
| HnswSearch    | 100000      | 1536       | 10  | 1.845 ms  | 0.032 ms| 0.028 ms|     2.1 KB|
| FlatSearch    | 100000      | 1536       | 10  | 89.32 ms  | 1.23 ms | 1.15 ms |     2.1 KB|

📚 Documentation


๐Ÿค Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

  • 🚀 Additional distance metrics (Manhattan, Mahalanobis, etc.)
  • 🔬 New quantization strategies (e.g., product quantization (PQ))
  • 📊 Performance benchmarks on different hardware
  • 📖 Documentation improvements and examples
  • 🐛 Bug reports and fixes

📄 License

This project is licensed under the MIT License. See LICENSE for details.


๐Ÿ™ Acknowledgments


📞 Support


Made with ❤️ by MPCoreDeveloper


| Version | Downloads | Last Updated |
|---------|-----------|--------------|
| 1.3.5 | 90 | 2/21/2026 |
| 1.3.0 | 90 | 2/14/2026 |