UniversalQueryBuilder.InMemory 10.0.0-beta

This is a prerelease version of UniversalQueryBuilder.InMemory.

QueryBuilder.InMemory

In-memory execution engine for the Universal Query Builder system


Overview

QueryBuilder.InMemory is the in-memory execution strategy for Universal Query Builder. It provides high-performance LINQ-based query execution against in-memory data sources and complements the EntityFramework strategy by handling data that does not come from a database or that needs advanced operators not available in SQL.

Key Features

  • LINQ Expression Compilation - Converts QueryDefinition to compiled LINQ expressions for maximum performance
  • Custom Data Providers - Plugin architecture for any in-memory data source (cache, API, file, queue)
  • Expression Caching - Leverages QueryBuilder.Expressions L1/L2 caching for 10-40x faster repeated queries
  • Advanced Operators - Supports regex and fuzzy matching operators NOT available in SQL Server
  • Parallel Execution - PLINQ support for large datasets with configurable parallelism
  • Timeout Protection - Configurable query timeouts to prevent runaway queries
  • Rich Metadata - Detailed execution metrics including compilation time, execution time, materialization time
  • Schema Registry Integration - Full validation and provider resolution via Schema Registry
  • Hierarchical Projections - Support for nested object and collection selection
  • Nested Relation Options - Apply filters, ordering, and limits to child collections
  • Aggregate Ordering - Order by MIN, MAX, COUNT, SUM, AVG over collection navigation properties
  • ⚠️ Phase 10+ Features - GROUP BY, aggregations planned for future phases

Architecture Role

┌─────────────────────────────────────────────────────────────┐
│                    Universal Query API                       │
│                    (QueryDefinition)                         │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┴────────────────┐
         │                                │
  ┌──────▼────────┐                  ┌────▼────────┐
  │EntityFramework│                  │  InMemory   │
  │   Strategy    │                  │  Strategy   │
  └───────────────┘                  └────┬────────┘
                                          │
                ┌────────────────────────┴─────────────────────┐
                │   InMemoryExecutionStrategy                  │
                │   • Provider resolution                      │
                │   • Expression compilation (with caching)    │
                │   • LINQ query execution                     │
                │   • Result materialization                   │
                └────┬─────────────────────────────────────────┘
                     │
        ┌────────────┴────────────────┐
        │                             │
   ┌────▼─────────────┐      ┌────────▼─────────────┐
   │ LinqQueryBuilder │      │ IDataSourceProvider  │
   │ • Build IQueryable│      │ • GetDataAsync()     │
   │ • Apply filters  │      │ • Metadata           │
   │ • Apply ordering │      └──────────────────────┘
   │ • Apply pagination│             │
   └────┬─────────────┘              │
        │                             │
   ┌────▼────────────────────────────▼──┐
   │ QueryBuilder.Expressions           │
   │ • FilterDefinition → Expression    │
   │ • L1/L2 Caching (10-40x speedup)   │
   └────────────────────────────────────┘

Table of Contents

  1. Core Concepts
  2. Installation
  3. Quick Start
  4. Data Provider Pattern
  5. Expression Compilation
  6. Supported Operators
  7. Configuration
  8. Performance
  9. Advanced Features
  10. Testing
  11. Advanced Usage
  12. Integration
  13. Best Practices
  14. Limitations

Core Concepts

Execution Strategy Pattern

QueryBuilder.InMemory implements the IExecutionStrategy interface, allowing it to be selected automatically by the ExecutionStrategyFactory when a query targets an in-memory data source.

public interface IExecutionStrategy
{
    string Name { get; }                           // "InMemory"
    int Priority { get; }                          // 100 (highest priority for in-memory sources)
    DataSourceType SupportedDataSourceType { get; } // DataSourceType.InMemory

    bool CanExecute(QueryDefinition query, IExecutionContext? context = null);
    Task<IQueryResult<Dictionary<string, object>>> ExecuteAsync(QueryDefinition query, IExecutionContext? context = null);
    Task<IEstimatedMetrics> EstimateAsync(QueryDefinition query, IExecutionContext? context = null);
    ValidationResult Validate(QueryDefinition query);
}

Priority Order:

  • InMemory: Priority 100 (highest - for in-memory sources)
  • EntityFramework: Priority 10 (for database sources)
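The priority order above can be illustrated with a small selection routine. This is only a sketch: StrategySketch and StrategySelector are hypothetical stand-ins, not the library's IExecutionStrategy or ExecutionStrategyFactory, and the demo assumes both strategies claim the same source so that the priority decides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo: both strategies can execute "cachedUsers", so the higher priority wins.
var strategies = new List<StrategySketch>
{
    new("EntityFramework", Priority: 10, CanExecute: source => true),
    new("InMemory", Priority: 100, CanExecute: source => source == "cachedUsers"),
};

Console.WriteLine(StrategySelector.Select(strategies, "cachedUsers").Name); // InMemory
Console.WriteLine(StrategySelector.Select(strategies, "dbo.Orders").Name);  // EntityFramework

// Hypothetical, simplified stand-in for IExecutionStrategy.
public sealed record StrategySketch(string Name, int Priority, Func<string, bool> CanExecute);

public static class StrategySelector
{
    // Keep the strategies that accept the source, then take the highest priority.
    public static StrategySketch Select(IEnumerable<StrategySketch> strategies, string source) =>
        strategies
            .Where(s => s.CanExecute(source))
            .OrderByDescending(s => s.Priority)
            .FirstOrDefault()
        ?? throw new InvalidOperationException($"No strategy can execute '{source}'.");
}
```

The real factory may apply additional checks (for example on DataSourceType) before comparing priorities.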

Three-Phase Execution Model

Phase 1: Build IQueryable<T>
   ├─ Resolve data provider from Schema Registry
   ├─ Call provider.GetDataAsync() → IEnumerable<T>
   ├─ Convert to IQueryable<T>
   ├─ Build LINQ expression from FilterDefinition (with L1/L2 caching)
   └─ Apply expression to IQueryable<T>.Where()

Phase 2: Execute Query
   ├─ Apply OrderBy
   ├─ Apply Skip/Take (pagination)
   ├─ Execute with timeout protection
   └─ Optionally enable PLINQ for parallel execution

Phase 3: Materialize Results
   ├─ Call .ToListAsync() to materialize
   ├─ Count total records (if pagination enabled)
   └─ Return IQueryResult<T> with metadata
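The three phases can be sketched with plain LINQ. This is a simplified, synchronous illustration under assumptions: PipelineSketch, PageSketch and UserRow are hypothetical names, the filter is passed in directly instead of being built from a FilterDefinition, and timeout handling and PLINQ are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var users = new List<UserRow>
{
    new(1, "carol", true), new(2, "alice", true), new(3, "bob", false), new(4, "dave", true),
};

var page = PipelineSketch.Execute(users, u => u.IsActive, u => u.Username, offset: 1, limit: 2);
Console.WriteLine($"{page.TotalCount} matches; page: {string.Join(", ", page.Items.Select(u => u.Username))}");
// 3 matches; page: carol, dave

public sealed record UserRow(int Id, string Username, bool IsActive);
public sealed record PageSketch<T>(IReadOnlyList<T> Items, int TotalCount);

public static class PipelineSketch
{
    public static PageSketch<T> Execute<T, TKey>(
        IEnumerable<T> data,
        Expression<Func<T, bool>> filter,
        Expression<Func<T, TKey>> orderKey,
        int offset,
        int limit)
    {
        // Phase 1: wrap the provider's data and apply the filter expression.
        IQueryable<T> filtered = data.AsQueryable().Where(filter);

        // Phase 2: ordering, then pagination.
        IQueryable<T> paged = filtered.OrderBy(orderKey).Skip(offset).Take(limit);

        // Phase 3: materialize the page and count all matches for the metadata.
        return new PageSketch<T>(paged.ToList(), filtered.Count());
    }
}
```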

Data Provider Architecture

public interface IDataSourceProvider<T> where T : class
{
    string SourceName { get; }           // Unique identifier (e.g., "cachedUsers")
    string DisplayName { get; }          // Human-readable name
    string? Description { get; }         // Optional description

    Task<IEnumerable<T>> GetDataAsync(CancellationToken cancellationToken = default);
}

Key Design Principles:

  • Single Responsibility: Provider only fetches data, doesn't filter/sort
  • Async First: All data retrieval is async for I/O efficiency
  • Stateless: Providers should be thread-safe and hold no per-query state; any shared state (such as an internal cache) must be safe for concurrent access
  • Cacheable: Providers can implement internal caching if needed

Installation

NuGet Package

dotnet add package UniversalQueryBuilder.InMemory --version 10.0.0-beta

Project Reference

<ItemGroup>
  <ProjectReference Include="../QueryBuilder.InMemory/QueryBuilder.InMemory.csproj" />
</ItemGroup>

Dependencies

<PackageReference Include="Microsoft.Extensions.Caching.Memory" Version="10.0.0" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.0" />
<PackageReference Include="System.Linq.Async" Version="6.0.1" />

Implicit Dependencies:

  • QueryBuilder.Core - Core models and abstractions
  • QueryBuilder.Expressions - Expression compilation with caching
  • QueryBuilder.SchemaRegistry - Provider discovery and registration

Quick Start

1. Create a Data Provider

using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Logging;
using QueryBuilder.Core.Abstractions;

public sealed class CachedUsersProvider : IDataSourceProvider<User>
{
    private readonly IMemoryCache _cache;
    private readonly ILogger<CachedUsersProvider> _logger;

    public string SourceName => "cachedUsers";
    public string DisplayName => "Cached Users";
    public string? Description => "User data from in-memory cache";

    public CachedUsersProvider(IMemoryCache cache, ILogger<CachedUsersProvider> logger)
    {
        _cache = cache;
        _logger = logger;
    }

    public async Task<IEnumerable<User>> GetDataAsync(CancellationToken cancellationToken = default)
    {
        return await _cache.GetOrCreateAsync("users", async entry =>
        {
            _logger.LogInformation("Loading users from source");
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);

            // Load from actual source (database, API, file, etc.)
            var users = await LoadUsersFromSourceAsync(cancellationToken);
            return users;
        }) ?? Enumerable.Empty<User>();
    }

    private async Task<List<User>> LoadUsersFromSourceAsync(CancellationToken ct)
    {
        // Your data loading logic here
        await Task.Delay(100, ct); // Simulate I/O
        return new List<User>
        {
            new() { Id = 1, Username = "alice", Email = "alice@example.com", IsActive = true },
            new() { Id = 2, Username = "bob", Email = "bob@example.com", IsActive = true }
        };
    }
}

2. Configure Services

using QueryBuilder.InMemory.Extensions;

var builder = WebApplication.CreateBuilder(args);

// Register in-memory execution strategy
builder.Services.AddInMemorySupport(options =>
{
    // Optional: Configure parallelism
    options.EnableParallelExecution = true;
    options.MaxDegreeOfParallelism = Environment.ProcessorCount;

    // Optional: Configure timeout
    options.DefaultTimeout = TimeSpan.FromSeconds(30);
});

// Register custom data provider
builder.Services.AddDataSourceProvider<CachedUsersProvider>(ServiceLifetime.Scoped);

3. Register Provider in Schema Registry

using QueryBuilder.SchemaRegistry;

// Register in-memory provider
await registrationService.RegisterInMemoryProviderAsync(
    sourceName: "cachedUsers",        // Must match provider's SourceName property
    displayName: "Cached Users",
    description: "User data from cache with 5-minute TTL",
    createdBy: "admin"
);

4. Execute a Query

using QueryBuilder.Core.Models;
using QueryBuilder.Core.Extensions;

// Build query
var query = new QueryDefinition
{
    From = "cachedUsers",
    Where = FilterDefinitionExtensions.And(
        FilterDefinitionExtensions.Equal("IsActive", true),
        FilterDefinitionExtensions.Contains("Email", "@example.com")
    ),
    OrderBy = new[] {
        new OrderingDefinition { Field = "Username", Direction = SortDirection.Ascending }
    },
    Limit = 50,
    Offset = 0
};

// Execute via strategy
var strategy = serviceProvider.GetRequiredService<InMemoryExecutionStrategy>();
var result = await strategy.ExecuteAsync<User>(query);

// Check results
if (result.Success)
{
    Console.WriteLine($"Found {result.TotalCount} users, showing {result.Data.Count()}");

    foreach (var user in result.Data)
    {
        Console.WriteLine($"{user.Username} - {user.Email}");
    }

    // Access detailed metadata
    Console.WriteLine($"Total execution time: {result.ExecutionTime.TotalMilliseconds}ms");
    Console.WriteLine($"Expression compilation: {result.AdditionalMetadata["CompilationTime"]}ms");
    Console.WriteLine($"Query execution: {result.AdditionalMetadata["ExecutionTime"]}ms");
    Console.WriteLine($"Materialization: {result.AdditionalMetadata["MaterializationTime"]}ms");
    Console.WriteLine($"Cache hit: {result.AdditionalMetadata["CacheHit"]}");
}
else
{
    Console.WriteLine($"Error: {result.Error?.Message}");
}

5. Example Output

Illustrative output (the figures assume a larger dataset than the two-user sample provider above):

Found 127 users, showing 50
alice - alice@example.com
bob - bob@example.com
charlie - charlie@example.com
...
Total execution time: 12.3ms
Expression compilation: 0.8ms (L1 cache hit)
Query execution: 8.5ms
Materialization: 3.0ms
Cache hit: true

Data Provider Pattern

Why Data Providers?

In-memory execution requires a source of data. Unlike SQL Server (where data comes from tables), in-memory sources can be:

  • Caches (IMemoryCache, Redis, distributed cache)
  • APIs (REST, GraphQL, gRPC)
  • Files (JSON, CSV, XML)
  • Message Queues (RabbitMQ, Kafka, Azure Service Bus)
  • External Services (search engines, feature stores)
  • Static Data (configuration, lookup tables)

The IDataSourceProvider<T> interface abstracts these diverse sources into a unified contract.

Provider Contract

public interface IDataSourceProvider<T> where T : class
{
    /// <summary>
    /// Unique identifier for this data source (e.g., "cachedUsers", "apiOrders")
    /// MUST match the SourceName registered in Schema Registry
    /// </summary>
    string SourceName { get; }

    /// <summary>
    /// Human-readable display name (e.g., "Cached Users", "Orders from API")
    /// </summary>
    string DisplayName { get; }

    /// <summary>
    /// Optional description of the data source
    /// </summary>
    string? Description { get; }

    /// <summary>
    /// Fetch all data from the source.
    /// Should return the FULL dataset - filtering happens later via LINQ.
    /// Implement internal caching if data is expensive to fetch.
    /// </summary>
    Task<IEnumerable<T>> GetDataAsync(CancellationToken cancellationToken = default);
}

Example Providers

1. Memory Cache Provider:

public sealed class CachedProductsProvider : IDataSourceProvider<Product>
{
    private readonly IMemoryCache _cache;
    private readonly IProductService _productService;

    public string SourceName => "cached-products";
    public string DisplayName => "Cached Products";
    public string? Description => "Product catalog with 10-minute cache";

    public CachedProductsProvider(IMemoryCache cache, IProductService productService)
    {
        _cache = cache;
        _productService = productService;
    }

    public async Task<IEnumerable<Product>> GetDataAsync(CancellationToken ct = default)
    {
        // Cache the full catalog; filtering happens later in the LINQ pipeline
        return await _cache.GetOrCreateAsync("products", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return await _productService.GetAllProductsAsync(ct);
        }) ?? Enumerable.Empty<Product>();
    }
}

2. REST API Provider:

public sealed class OrdersApiProvider : IDataSourceProvider<Order>
{
    private readonly HttpClient _httpClient;
    private readonly ILogger<OrdersApiProvider> _logger;

    public string SourceName => "orders-api";
    public string DisplayName => "Orders from API";
    public string? Description => "Real-time orders from external API";

    public OrdersApiProvider(HttpClient httpClient, ILogger<OrdersApiProvider> logger)
    {
        _httpClient = httpClient;
        _logger = logger;
    }

    public async Task<IEnumerable<Order>> GetDataAsync(CancellationToken ct = default)
    {
        try
        {
            var response = await _httpClient.GetAsync("/api/orders", ct);
            response.EnsureSuccessStatusCode();

            var orders = await response.Content.ReadFromJsonAsync<List<Order>>(cancellationToken: ct);
            return orders ?? Enumerable.Empty<Order>();
        }
        // Let cancellation propagate so query timeouts are not reported as empty results
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            _logger.LogError(ex, "Failed to fetch orders from API");
            return Enumerable.Empty<Order>();
        }
    }
}

3. JSON File Provider:

public sealed class ConfigurationProvider : IDataSourceProvider<ConfigItem>
{
    private readonly IWebHostEnvironment _env;

    // Cached after the first read. Not synchronized: concurrent first calls may each read the file.
    private List<ConfigItem>? _cachedData;

    public string SourceName => "app-config";
    public string DisplayName => "Application Configuration";
    public string? Description => "Configuration items from config/items.json";

    public ConfigurationProvider(IWebHostEnvironment env) => _env = env;

    public async Task<IEnumerable<ConfigItem>> GetDataAsync(CancellationToken ct = default)
    {
        if (_cachedData != null)
            return _cachedData;

        var path = Path.Combine(_env.ContentRootPath, "config", "items.json");
        var json = await File.ReadAllTextAsync(path, ct);
        _cachedData = JsonSerializer.Deserialize<List<ConfigItem>>(json) ?? new();

        return _cachedData;
    }
}

4. Static Data Provider:

public sealed class CountriesProvider : IDataSourceProvider<Country>
{
    private static readonly List<Country> _countries = new()
    {
        new Country { Code = "US", Name = "United States", Region = "North America" },
        new Country { Code = "CA", Name = "Canada", Region = "North America" },
        new Country { Code = "GB", Name = "United Kingdom", Region = "Europe" },
        // ... more countries
    };

    public string SourceName => "countries";
    public string DisplayName => "Countries";
    public string? Description => "ISO country codes and regions";

    public Task<IEnumerable<Country>> GetDataAsync(CancellationToken ct = default)
    {
        return Task.FromResult<IEnumerable<Country>>(_countries);
    }
}

Provider Registration

Option 1: Extension Method (Recommended):

services.AddDataSourceProvider<CachedUsersProvider>(ServiceLifetime.Scoped);
services.AddDataSourceProvider<OrdersApiProvider>(ServiceLifetime.Singleton);
services.AddDataSourceProvider<CountriesProvider>(ServiceLifetime.Singleton);

Option 2: Manual Registration:

services.AddScoped<IDataSourceProvider<User>, CachedUsersProvider>();
services.AddSingleton<IDataSourceProvider<Order>, OrdersApiProvider>();

Option 3: Configure Options:

services.AddInMemorySupport(options =>
{
    options.AddDataProvider<CachedUsersProvider>(ServiceLifetime.Scoped);
    options.AddDataProvider<OrdersApiProvider>(ServiceLifetime.Singleton);
});

Provider Discovery

After registering providers in DI, they must be registered in the Schema Registry:

// Discover all registered providers
var providerDiscovery = serviceProvider.GetRequiredService<ProviderDiscoveryService>();
var availableProviders = await providerDiscovery.DiscoverProvidersAsync();

// Admin reviews and selects which providers to make queryable
foreach (var provider in availableProviders)
{
    Console.WriteLine($"Found provider: {provider.SourceName} - {provider.DisplayName}");

    // Register selected providers
    if (ShouldRegister(provider))
    {
        await registrationService.RegisterInMemoryProviderAsync(
            sourceName: provider.SourceName,
            displayName: provider.DisplayName,
            description: provider.Description,
            createdBy: "admin"
        );
    }
}

Provider Resolution at Query Time

// User submits query
var query = new QueryDefinition
{
    From = "cachedUsers"
};

// InMemoryExecutionStrategy resolves provider
// 1. Lookup "cachedUsers" in Schema Registry
var registry = await _schemaRegistry.GetBySourceNameAsync("cachedUsers");

// 2. Get provider type from registry
//    (Type.GetType needs an assembly-qualified name for types outside the calling assembly)
var providerType = Type.GetType(registry.ProviderTypeName, throwOnError: true)!;

// 3. Resolve from DI container
var provider = _serviceProvider.GetRequiredService(providerType);

// 4. Cast to IDataSourceProvider<T> (a direct cast fails fast instead of yielding null)
var typedProvider = (IDataSourceProvider<User>)provider;

// 5. Fetch data
var data = await typedProvider.GetDataAsync(cancellationToken);

// 6. Execute LINQ query against data
var results = data.AsQueryable().Where(compiledExpression);

Expression Compilation

Integration with QueryBuilder.Expressions

InMemory execution leverages QueryBuilder.Expressions for converting FilterDefinition to LINQ expressions with two-level caching.

Architecture:

FilterDefinition
    ↓
IExpressionBuilder<T>.BuildAsync()
    ↓
Check L1 Cache (ConcurrentDictionary)
    ├─ Hit → Return cached Expression<Func<T, bool>> (<0.1ms)
    └─ Miss ↓
Check L2 Cache (MemoryCache)
    ├─ Hit → Store in L1 → Return (<2ms)
    └─ Miss ↓
Build Expression Tree
    ├─ Recursive descent through FilterDefinition
    ├─ Create Expression nodes for each operator
    ├─ Combine with AndAlso/OrElse for logical operators
    └─ Return Expression<Func<T, bool>>
    ↓
Compile Expression (FastExpressionCompiler)
    ├─ Standard compilation: ~5-15ms
    └─ FastExpressionCompiler: ~0.5-2ms (10-40x faster)
    ↓
Store in L1 and L2 caches
    ↓
Return compiled expression
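The lookup order in the diagram above can be sketched as follows. This is an illustration under assumptions, not the QueryBuilder.Expressions implementation: TwoLevelCacheSketch is a hypothetical class, and the L2 tier is modeled with a second ConcurrentDictionary instead of MemoryCache so the example needs no extra packages.

```csharp
using System;
using System.Collections.Concurrent;

var cache = new TwoLevelCacheSketch<string, Func<int, bool>>();
int builds = 0;
Func<int, bool> Build() { builds++; return age => age > 18; }

cache.GetOrAdd("Age:GreaterThan", Build);  // miss in L1 and L2: build and store in both
cache.GetOrAdd("Age:GreaterThan", Build);  // L1 hit
cache.EvictL1("Age:GreaterThan");
cache.GetOrAdd("Age:GreaterThan", Build);  // L2 hit, restored to L1
Console.WriteLine(builds); // 1

public sealed class TwoLevelCacheSketch<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, TValue> _l1 = new();
    private readonly ConcurrentDictionary<TKey, TValue> _l2 = new(); // stand-in for MemoryCache

    public TValue GetOrAdd(TKey key, Func<TValue> build)
    {
        if (_l1.TryGetValue(key, out var l1Hit)) return l1Hit;

        if (_l2.TryGetValue(key, out var l2Hit))
        {
            _l1[key] = l2Hit; // promote so the next lookup is an L1 hit
            return l2Hit;
        }

        var built = build();  // cache miss: build (and compile) once
        _l2[key] = built;
        _l1[key] = built;
        return built;
    }

    public bool EvictL1(TKey key) => _l1.TryRemove(key, out _);
}
```

Under contention two threads may both build the same entry; a production cache would typically guard the build step.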

Compilation Flow Example

Input FilterDefinition:

var filter = new FilterDefinition
{
    LogicalOperator = LogicalOperator.And,
    Expressions = new[]
    {
        new FilterDefinition { Field = "IsActive", Operator = FilterOperator.Equal, Value = true },
        new FilterDefinition { Field = "Age", Operator = FilterOperator.GreaterThan, Value = 18 }
    }
};

Generated Expression Tree:

// Conceptual representation
Expression<Func<User, bool>> expression =
    user => user.IsActive == true && user.Age > 18;

Compiled Delegate:

// After compilation
Func<User, bool> compiledPredicate = expression.Compile();

// Used in LINQ
var results = users.Where(compiledPredicate);
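For readers unfamiliar with expression trees, the same predicate can be assembled by hand with System.Linq.Expressions. This is a minimal sketch: Person and PredicateSketch are hypothetical names, and the library's own builder presumably walks the FilterDefinition recursively rather than hard-coding two conditions.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

var predicate = PredicateSketch.ActiveAndOlderThan(18);
Func<Person, bool> compiled = predicate.Compile();

var people = new[] { new Person(true, 30), new Person(true, 17), new Person(false, 40) };
Console.WriteLine(people.Count(compiled)); // 1

public sealed record Person(bool IsActive, int Age);

public static class PredicateSketch
{
    // Builds: user => user.IsActive == true && user.Age > minAge
    public static Expression<Func<Person, bool>> ActiveAndOlderThan(int minAge)
    {
        var user = Expression.Parameter(typeof(Person), "user");

        var isActive = Expression.Equal(
            Expression.Property(user, nameof(Person.IsActive)),
            Expression.Constant(true));

        var olderThan = Expression.GreaterThan(
            Expression.Property(user, nameof(Person.Age)),
            Expression.Constant(minAge));

        // LogicalOperator.And maps to AndAlso (short-circuiting &&).
        return Expression.Lambda<Func<Person, bool>>(Expression.AndAlso(isActive, olderThan), user);
    }
}
```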

Performance Impact of Caching

First execution (cache miss):

L1 Cache: Miss
L2 Cache: Miss
Expression Building: ~5ms
FastExpressionCompiler: ~1.5ms
Total: ~6.5ms

Second execution (L1 hit):

L1 Cache: Hit
Total: <0.1ms

After L1 eviction (L2 hit):

L1 Cache: Miss
L2 Cache: Hit
Restore to L1: <0.1ms
Total: ~2ms

Speedup for repeated queries: roughly 3x on an L2 hit (~6.5ms vs ~2ms) and 65x or more on an L1 hit (~6.5ms vs <0.1ms)

Expression Building Operators

QueryBuilder.Expressions maps the following 17 filter operators to LINQ expressions:

Operator | LINQ Expression | Example
Equal | user.Field == value | user.Status == "Active"
NotEqual | user.Field != value | user.Status != "Deleted"
GreaterThan | user.Field > value | user.Age > 18
GreaterThanOrEqual | user.Field >= value | user.Price >= 10.00
LessThan | user.Field < value | user.Quantity < 100
LessThanOrEqual | user.Field <= value | user.Score <= 100
Contains | user.Field.Contains(value) | user.Email.Contains("@example.com")
StartsWith | user.Field.StartsWith(value) | user.Name.StartsWith("Dr")
EndsWith | user.Field.EndsWith(value) | user.Filename.EndsWith(".pdf")
In | values.Contains(user.Field) | new[] {"A","B"}.Contains(user.Grade)
NotIn | !values.Contains(user.Field) | !new[] {"X","Y"}.Contains(user.Code)
Between | user.Field >= min && user.Field <= max | user.Age >= 18 && user.Age <= 65
IsNull | user.Field == null | user.DeletedAt == null
IsNotNull | user.Field != null | user.Email != null
Like | Regex.IsMatch(user.Field, pattern) | Regex.IsMatch(user.Name, "J.*")
Regex | Regex.IsMatch(user.Field, pattern) | Regex.IsMatch(user.Email, @".*@example\.com")
Fuzzy | FuzzyMatcher.Match(user.Field, value, threshold) | FuzzyMatcher.Match(user.Name, "Jon", 0.8)
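The table shows Like being evaluated through Regex.IsMatch. One common way to get there is to translate SQL-style wildcards into a regex, sketched below. Treat this as an assumption: LikeSketch is a hypothetical helper, and whether the library anchors the pattern or supports the `_` wildcard is not stated in this README.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

Console.WriteLine(LikeSketch.ToRegex("J%"));                    // ^J.*$
Console.WriteLine(LikeSketch.IsLike("John", "J%"));             // True
Console.WriteLine(LikeSketch.IsLike("report.pdfx", "%.pdf"));   // False

public static class LikeSketch
{
    // '%' matches any run of characters, '_' matches exactly one; everything else is literal.
    public static string ToRegex(string likePattern)
    {
        var sb = new StringBuilder("^");
        foreach (var ch in likePattern)
        {
            sb.Append(ch switch
            {
                '%' => ".*",
                '_' => ".",
                _ => Regex.Escape(ch.ToString()),
            });
        }
        return sb.Append('$').ToString();
    }

    public static bool IsLike(string input, string likePattern) =>
        Regex.IsMatch(input, ToRegex(likePattern), RegexOptions.Singleline);
}
```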

Advanced Operators (InMemory Only):

// Regex operator (NOT supported in SQL Server)
Where = new FilterDefinition
{
    Field = "Email",
    Operator = FilterOperator.Regex,
    Value = @"^[a-zA-Z0-9._%+-]+@example\.com$"
}
// Expression: user => Regex.IsMatch(user.Email, @"^[a-zA-Z0-9._%+-]+@example\.com$")

// Fuzzy matching (NOT supported in SQL Server)
Where = new FilterDefinition
{
    Field = "Name",
    Operator = FilterOperator.Fuzzy,
    Value = "John",
    Threshold = 0.8  // 80% similarity
}
// Expression: user => FuzzyMatcher.Match(user.Name, "John", 0.8)

Supported Operators

Production-Ready Operators (17)

Category | Operators | Status
Comparison | Equal, NotEqual, GreaterThan, GreaterThanOrEqual, LessThan, LessThanOrEqual | ✅ Production
String | Contains, StartsWith, EndsWith, Like | ✅ Production
List | In, NotIn | ✅ Production
Range | Between | ✅ Production
Null | IsNull, IsNotNull | ✅ Production
Advanced | Regex, Fuzzy | ✅ Production (InMemory only)
Flags Enum | Equal, NotEqual, In, NotIn on [Flags] fields | ✅ Production

Flags enum fields use bitwise membership semantics: Equal checks flag presence via (field & mask) == mask, and In checks any-of membership. Comparison, string, and range operators on [Flags] fields throw QueryValidationException.
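The bitwise semantics can be illustrated in a few lines. This is a sketch with a hypothetical Permissions enum; it reads "any-of membership" as "at least one of the listed values is fully present in the field", which is an interpretation rather than something the README spells out.

```csharp
using System;
using System.Linq;

var field = Permissions.Read | Permissions.Write;
Console.WriteLine(FlagsSketch.HasAll(field, Permissions.Read));                        // True
Console.WriteLine(FlagsSketch.HasAll(field, Permissions.Read | Permissions.Delete));   // False
Console.WriteLine(FlagsSketch.HasAnyOf(field, Permissions.Delete, Permissions.Write)); // True

[Flags]
public enum Permissions { None = 0, Read = 1, Write = 2, Delete = 4 }

public static class FlagsSketch
{
    // Equal on a [Flags] field: every bit of the mask must be set.
    public static bool HasAll(Permissions field, Permissions mask) => (field & mask) == mask;

    // In on a [Flags] field: at least one listed value must be present.
    public static bool HasAnyOf(Permissions field, params Permissions[] masks) =>
        masks.Any(mask => (field & mask) == mask);
}
```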

Hierarchical Projections

Hierarchical projections allow you to select specific fields from in-memory objects, including nested objects and collections. This implementation uses the INestedProjectionBuilder to build efficient LINQ projection expressions.

Features:

  • Object Navigation: Select properties from nested child objects.
  • Collection Support: Project elements from child collections.
  • Nested Relation Options: Apply Where, OrderBy, Limit, and Offset to child collections.

Example: Hierarchical Selection

{
  "from": "users",
  "select": {
    "Name": true,
    "Department": {
      "select": { "Name": true }
    },
    "Orders": {
      "where": { "field": "Total", "operator": "gt", "value": 100 },
      "orderBy": [{ "field": "Date", "direction": "desc" }],
      "limit": 3,
      "select": {
        "OrderNumber": true,
        "Total": true
      }
    }
  }
}

This query returns a dictionary structure in which Orders contains only the 3 most recent orders with a Total over 100, each reduced to OrderNumber and Total.
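In plain LINQ terms, the selection above corresponds roughly to the projection below. This is a hand-written sketch with hypothetical types (UserNode, DepartmentNode, OrderNode); the library generates an equivalent expression through INestedProjectionBuilder instead of hard-coding it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var user = new UserNode("alice", new DepartmentNode("Sales"), new List<OrderNode>
{
    new("A-1", 50m, new DateTime(2024, 1, 1)),
    new("A-2", 150m, new DateTime(2024, 2, 1)),
    new("A-3", 300m, new DateTime(2024, 3, 1)),
    new("A-4", 120m, new DateTime(2024, 4, 1)),
    new("A-5", 900m, new DateTime(2024, 5, 1)),
});

var row = ProjectionSketch.Project(user);
var orders = (List<Dictionary<string, object>>)row["Orders"];
Console.WriteLine(string.Join(", ", orders.Select(o => o["OrderNumber"]))); // A-5, A-4, A-3

public sealed record DepartmentNode(string Name);
public sealed record OrderNode(string OrderNumber, decimal Total, DateTime Date);
public sealed record UserNode(string Name, DepartmentNode Department, List<OrderNode> Orders);

public static class ProjectionSketch
{
    public static Dictionary<string, object> Project(UserNode user) => new()
    {
        ["Name"] = user.Name,
        // Nested object: only the selected field is kept.
        ["Department"] = new Dictionary<string, object> { ["Name"] = user.Department.Name },
        // Nested collection: where, orderBy and limit are applied before the inner select.
        ["Orders"] = user.Orders
            .Where(o => o.Total > 100)
            .OrderByDescending(o => o.Date)
            .Take(3)
            .Select(o => new Dictionary<string, object>
            {
                ["OrderNumber"] = o.OrderNumber,
                ["Total"] = o.Total,
            })
            .ToList(),
    };
}
```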

Implementation Details:

  • Service Registration: Requires AddQueryBuilderExpressions() (registered automatically by AddInMemorySupport()).
  • Performance: Projections are compiled into highly efficient LINQ delegates and cached.
  • Type Safety: Field names are validated against the DataSourceDefinition.

Planned Features (Phase 10+)

Feature | Status | Notes
Aggregations | 🚧 Planned | COUNT, SUM, AVG, MIN, MAX with GROUP BY
Subqueries | 🚧 Planned | Nested queries in WHERE/HAVING
Window Functions | 🚧 Planned | ROW_NUMBER, LAG, LEAD over partitions
Distinct | ✅ Implemented | Single-field projected queries. Uses DistinctBy on field value post-materialization. Multi-field DISTINCT throws QueryExecutionException. Offset is not supported with DISTINCT.
HAVING | 🚧 Planned | Post-aggregation filtering
Complex Expressions | 🚧 Planned | CASE, CAST, COALESCE in SELECT
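The Distinct row describes deduplication by a single field after materialization. A minimal sketch of that idea with Enumerable.DistinctBy (.NET 6+) is shown below; DistinctSketch is a hypothetical helper, and the library's actual handling (including the QueryExecutionException for multi-field DISTINCT) is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var rows = new List<Dictionary<string, object>>
{
    new() { ["Department"] = "Sales" },
    new() { ["Department"] = "Marketing" },
    new() { ["Department"] = "Sales" },
};

Console.WriteLine(DistinctSketch.ByField(rows, "Department").Count); // 2

public static class DistinctSketch
{
    // Runs on already-materialized rows and keeps one row per distinct field value.
    public static List<Dictionary<string, object>> ByField(
        IEnumerable<Dictionary<string, object>> materializedRows, string field) =>
        materializedRows.DistinctBy(row => row[field]).ToList();
}
```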

Operator Examples

Comparison Operators:

// Equal
Where = FilterDefinitionExtensions.Equal("Status", "Active")
// LINQ: user => user.Status == "Active"

// Greater Than
Where = FilterDefinitionExtensions.GreaterThan("Age", 18)
// LINQ: user => user.Age > 18

// Between
Where = FilterDefinitionExtensions.Between("Price", 10.00m, 100.00m)
// LINQ: user => user.Price >= 10.00m && user.Price <= 100.00m

String Operators:

// Contains
Where = FilterDefinitionExtensions.Contains("Email", "@example.com")
// LINQ: user => user.Email.Contains("@example.com")

// StartsWith
Where = FilterDefinitionExtensions.StartsWith("Name", "Dr")
// LINQ: user => user.Name.StartsWith("Dr")

// Regex (InMemory only)
Where = new FilterDefinition
{
    Field = "PhoneNumber",
    Operator = FilterOperator.Regex,
    Value = @"^\+1\d{10}$"  // US phone number
}
// LINQ: user => Regex.IsMatch(user.PhoneNumber, @"^\+1\d{10}$")

List Operators:

// IN
Where = new FilterDefinition
{
    Field = "Department",
    Operator = FilterOperator.In,
    Values = new object[] { "Sales", "Marketing", "Engineering" }
}
// LINQ: user => new[] { "Sales", "Marketing", "Engineering" }.Contains(user.Department)

// NOT IN
Where = new FilterDefinition
{
    Field = "Status",
    Operator = FilterOperator.NotIn,
    Values = new object[] { "Deleted", "Suspended" }
}
// LINQ: user => !new[] { "Deleted", "Suspended" }.Contains(user.Status)

Null Operators:

// IS NULL
Where = FilterDefinitionExtensions.IsNull("DeletedAt")
// LINQ: user => user.DeletedAt == null

// IS NOT NULL
Where = FilterDefinitionExtensions.IsNotNull("Email")
// LINQ: user => user.Email != null

Fuzzy Matching (InMemory only):

// Fuzzy match with 80% similarity threshold
Where = new FilterDefinition
{
    Field = "Name",
    Operator = FilterOperator.Fuzzy,
    Value = "John",
    Threshold = 0.8
}
// LINQ: user => FuzzyMatcher.Match(user.Name, "John", 0.8)
// Matches: "John", "Jon", "Johan", but not "Jane"

Configuration

Basic Configuration

using QueryBuilder.InMemory.Extensions;

services.AddInMemorySupport();

Advanced Configuration

services.AddInMemorySupport(options =>
{
    // Parallel execution
    options.EnableParallelExecution = true;
    options.MaxDegreeOfParallelism = Environment.ProcessorCount;  // Default: CPU count

    // Timeout protection
    options.DefaultTimeout = TimeSpan.FromSeconds(30);  // Default: 30s

    // Expression compilation
    options.UseFastExpressionCompiler = true;  // Default: true (10-40x faster)

    // Data providers (optional, can also use AddDataSourceProvider extension)
    options.AddDataProvider<CachedUsersProvider>(ServiceLifetime.Scoped);
    options.AddDataProvider<OrdersApiProvider>(ServiceLifetime.Singleton);
});

InMemoryOptions Reference

public sealed class InMemoryOptions
{
    /// <summary>
    /// Enable PLINQ for parallel query execution.
    /// Recommended for large datasets (10,000+ items).
    /// Default: true
    /// </summary>
    public bool EnableParallelExecution { get; set; } = true;

    /// <summary>
    /// Max degree of parallelism for PLINQ.
    /// Default: Environment.ProcessorCount (number of CPU cores)
    /// </summary>
    public int MaxDegreeOfParallelism { get; set; } = Environment.ProcessorCount;

    /// <summary>
    /// Default query timeout.
    /// Default: 30 seconds
    /// </summary>
    public TimeSpan DefaultTimeout { get; set; } = TimeSpan.FromSeconds(30);

    /// <summary>
    /// Use FastExpressionCompiler for 10-40x faster compilation.
    /// Default: true
    /// </summary>
    public bool UseFastExpressionCompiler { get; set; } = true;

    /// <summary>
    /// Register data providers (alternative to AddDataSourceProvider extension).
    /// </summary>
    public void AddDataProvider<TProvider>(ServiceLifetime lifetime = ServiceLifetime.Scoped)
        where TProvider : class { }
}

Parallel Execution (PLINQ)

When to Enable:

  • Large datasets (10,000+ items)
  • CPU-intensive filters (regex, fuzzy matching)
  • Multi-core systems

When to Disable:

  • Small datasets (<1,000 items) - overhead not worth it
  • I/O-bound operations - no CPU benefit
  • Memory-constrained environments

Configuration:

options.EnableParallelExecution = true;
options.MaxDegreeOfParallelism = 4;  // Limit to 4 threads

Performance Impact:

Dataset: 100,000 items
Filter: Complex regex pattern

Sequential: ~250ms
Parallel (4 cores): ~75ms
Parallel (8 cores): ~45ms

Speedup: 3-5x on multi-core systems
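What these options plausibly translate to is a PLINQ pipeline along the following lines. This is a sketch under assumptions (ParallelFilterSketch is a hypothetical helper and the data is synthetic); it shows the mechanism only and makes no claim about the timings above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var emails = Enumerable.Range(0, 100_000)
    .Select(i => i % 4 == 0 ? $"user{i}@example.com" : $"user{i}@other.org")
    .ToArray();

Console.WriteLine(ParallelFilterSketch.CountMatches(emails, maxDegreeOfParallelism: 4)); // 25000

public static class ParallelFilterSketch
{
    // A CPU-heavy predicate is where parallel execution tends to pay off.
    private static readonly Regex ExampleDomain =
        new(@"^[a-zA-Z0-9._%+-]+@example\.com$", RegexOptions.Compiled);

    public static int CountMatches(string[] emails, int maxDegreeOfParallelism) =>
        emails.AsParallel()
              .WithDegreeOfParallelism(maxDegreeOfParallelism) // caps the number of worker threads
              .Count(e => ExampleDomain.IsMatch(e));
}
```

When result order matters (for example before Skip/Take), PLINQ additionally needs AsOrdered() or an explicit OrderBy.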

Timeout Configuration

Purpose: Prevent runaway queries from consuming resources indefinitely.

Configuration:

// Global default
options.DefaultTimeout = TimeSpan.FromSeconds(30);

// Per-query override
var context = new ExecutionContext
{
    Timeout = TimeSpan.FromMinutes(2)
};
var result = await strategy.ExecuteAsync<User>(query, context);

Behavior:

  • Timeout applies to entire execution (data fetch + LINQ query + materialization)
  • Throws TimeoutException if exceeded
  • Respects cancellation tokens
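The behavior listed above can be approximated with a linked CancellationTokenSource. This is a sketch, not the strategy's actual code: TimeoutSketch is a hypothetical helper, and the delay stands in for data fetch, query and materialization.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

try
{
    // Simulated 5-second query with a 50 ms budget.
    await TimeoutSketch.RunAsync(
        ct => Task.Delay(TimeSpan.FromSeconds(5), ct),
        TimeSpan.FromMilliseconds(50));
}
catch (TimeoutException ex)
{
    Console.WriteLine(ex.Message);
}

public static class TimeoutSketch
{
    public static async Task RunAsync(
        Func<CancellationToken, Task> work,
        TimeSpan timeout,
        CancellationToken callerToken = default)
    {
        using var timeoutCts = new CancellationTokenSource();
        // Either the caller or the timeout can cancel the work.
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(callerToken, timeoutCts.Token);
        timeoutCts.CancelAfter(timeout);

        try
        {
            await work(linked.Token);
        }
        // Translate only timeout-driven cancellation; caller cancellation propagates unchanged.
        catch (OperationCanceledException)
            when (timeoutCts.IsCancellationRequested && !callerToken.IsCancellationRequested)
        {
            throw new TimeoutException($"Query exceeded the {timeout.TotalMilliseconds} ms timeout.");
        }
    }
}
```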

Performance

Execution Time Breakdown

Typical Query Performance (1,000 items):

Phase 1: Expression Compilation
  ├─ L1 Cache Hit: <0.1ms
  ├─ L2 Cache Hit: ~2ms
  └─ Cache Miss: ~6.5ms (building + FastExpressionCompiler)

Phase 2: Data Provider Fetch
  ├─ Memory Cache: ~0.5ms
  ├─ Redis Cache: ~5-10ms
  ├─ REST API: ~50-200ms
  └─ Database: ~10-50ms

Phase 3: LINQ Query Execution
  ├─ Simple filter (Age > 18): ~1-2ms
  ├─ Complex filter (5+ conditions): ~3-5ms
  └─ Regex filter: ~10-20ms

Phase 4: Materialization
  ├─ 100 items: ~0.5ms
  ├─ 1,000 items: ~2ms
  └─ 10,000 items: ~15ms

Total (cache hit + memory source): ~5-10ms
Total (cache miss + API source): ~60-220ms

Performance by Dataset Size

Dataset Size | Sequential | Parallel (4 cores) | Parallel (8 cores)
100 items | ~1ms | ~2ms (overhead) | ~3ms (overhead)
1,000 items | ~5ms | ~6ms | ~7ms
10,000 items | ~30ms | ~12ms | ~8ms
100,000 items | ~250ms | ~75ms | ~45ms
1,000,000 items | ~2,500ms | ~750ms | ~450ms

Recommendation: Enable parallel execution for datasets of 10,000 items or more. At 1,000 items or fewer, the PLINQ overhead outweighs the gain.
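
The recommendation can be applied per call by checking the collection size before opting into PLINQ. The following is a sketch under assumptions: AdaptiveExecution, Filter, and ParallelThreshold are hypothetical names, not part of the package, and the threshold simply mirrors the figure above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdaptiveExecution
{
    // Cut-over point taken from the recommendation; tune it with your own measurements.
    public const int ParallelThreshold = 10_000;

    public static List<T> Filter<T>(
        IReadOnlyCollection<T> source,
        Func<T, bool> predicate,
        int maxDegreeOfParallelism)
    {
        // Small inputs stay sequential to avoid PLINQ's partitioning and merge overhead.
        if (source.Count < ParallelThreshold || maxDegreeOfParallelism <= 1)
        {
            return source.Where(predicate).ToList();
        }

        return source
            .AsParallel()
            .AsOrdered()                                      // keep the source order in the output
            .WithDegreeOfParallelism(maxDegreeOfParallelism)  // cap the number of worker threads
            .Where(predicate)
            .ToList();
    }
}
```

Both branches return the matches in source order, so callers see the same result whichever path runs.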

Memory Usage

Memory Footprint:

Expression Cache (L1): ~1-5 MB (10,000 cached expressions)
Expression Cache (L2): ~5-20 MB (100,000 cached expressions, evicted after 1 hour)
Data Provider: Depends on data size (see below)

Dataset Size → Memory
  1,000 simple objects: ~100 KB
  10,000 simple objects: ~1 MB
  100,000 simple objects: ~10 MB
  1,000,000 simple objects: ~100 MB

Optimization Tips:

  • Implement pagination (Limit/Offset) to reduce materialized dataset size
  • Use streaming APIs in providers when possible
  • Consider implementing provider-side caching with eviction policies
  • Monitor memory usage in production

Optimization Strategies

1. Leverage Expression Caching:

// First query: ~6.5ms (cache miss)
var result1 = await strategy.ExecuteAsync<User>(query1);

// Identical query: ~0.1ms (L1 cache hit)
var result2 = await strategy.ExecuteAsync<User>(query1);

// Similar query with different values: ~0.1ms (structural hash matches)
var query2 = new QueryDefinition
{
    From = query1.From,
    Where = FilterDefinitionExtensions.Equal("Age", 25)  // Different value, same structure
};
var result3 = await strategy.ExecuteAsync<User>(query2);  // Cache hit!
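
One way to get a cache hit for "different value, same structure" is to key the cache on the filter's shape and pass the value in as an argument. The sketch below illustrates that idea only; it is not the code in QueryBuilder.Expressions. It uses the standard System.Linq.Expressions compiler instead of FastExpressionCompiler, and StructuralCache, GetEqualPredicate, and Person are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Threading;

public sealed record Person(string Name, int Age);

public static class StructuralCache
{
    // The key describes the filter's structure only; the compared value is not part of it.
    private static readonly ConcurrentDictionary<(Type Type, string Field, string Operator), Delegate> Cache = new();
    private static int _compileCount;

    public static int CompileCount => Volatile.Read(ref _compileCount);

    public static Func<T, object, bool> GetEqualPredicate<T>(string field)
    {
        var key = (typeof(T), field, "Equal");
        if (Cache.TryGetValue(key, out var cached))
        {
            return (Func<T, object, bool>)cached;
        }

        // Build (item, value) => item.<field> == (TField)value and compile it once.
        Interlocked.Increment(ref _compileCount);
        var item = Expression.Parameter(typeof(T), "item");
        var value = Expression.Parameter(typeof(object), "value");
        var member = Expression.Property(item, field);
        var body = Expression.Equal(member, Expression.Convert(value, member.Type));
        var compiled = Expression.Lambda<Func<T, object, bool>>(body, item, value).Compile();

        return (Func<T, object, bool>)Cache.GetOrAdd(key, compiled);
    }
}
```

Calling GetEqualPredicate<Person>("Age") twice returns the same delegate, which can then be evaluated against 25, 30, or any other value without recompiling.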

2. Implement Provider-Side Caching:

public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    // Cache at provider level to avoid repeated fetches
    return await _cache.GetOrCreateAsync("users", async entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
        return await _userService.GetAllUsersAsync(ct);
    });
}

3. Use Pagination:

// ❌ Bad: Materialize all 1,000,000 items in the result
var query = new QueryDefinition
{
    From = "large-dataset"
};

// ✅ Good: Return 50 items per page
var query = new QueryDefinition
{
    From = "large-dataset",
    OrderBy = new[] { new OrderingDefinition { Field = "Id", Direction = SortDirection.Ascending } },
    Offset = 0,
    Limit = 50
};

4. Enable Parallel Execution for Large Datasets:

options.EnableParallelExecution = true;
options.MaxDegreeOfParallelism = Environment.ProcessorCount;

// Sequential: ~250ms for 100,000 items
// Parallel (8 cores): ~45ms for 100,000 items
// 5x speedup

5. Avoid Expensive Operators on Large Datasets:

// ⚠️ Slower: Regex on 100,000 items (~250ms)
Where = new FilterDefinition { Field = "Email", Operator = FilterOperator.Regex, Value = @".*@example\.com" }

// ✅ Faster: Contains on 100,000 items (~30ms)
Where = FilterDefinitionExtensions.Contains("Email", "@example.com")

Advanced Features

Parallel Execution (PLINQ)

Enabling PLINQ:

services.AddInMemorySupport(options =>
{
    options.EnableParallelExecution = true;
    options.MaxDegreeOfParallelism = 8;
});

How it Works:

// Without PLINQ
var results = data
    .AsQueryable()
    .Where(compiledExpression)
    .OrderBy(x => x.Name)
    .Skip(offset)
    .Take(limit)
    .ToList();

// With PLINQ
var results = data
    .AsParallel()                              // Enable parallel execution
    .WithDegreeOfParallelism(maxParallelism)   // Limit thread count
    .AsOrdered()                               // Preserve ordering
    .Where(compiledExpression)                 // Parallel filter
    .OrderBy(x => x.Name)                      // Parallel sort
    .Skip(offset)
    .Take(limit)
    .ToList();

Performance Comparison:

Dataset: 100,000 users
Filter: Age > 18 AND Email.Contains("@example.com")
System: 8-core CPU

Sequential (LINQ): 250ms
Parallel (PLINQ, 4 threads): 85ms (2.9x faster)
Parallel (PLINQ, 8 threads): 50ms (5x faster)

Timeout Protection

Configuration:

// Global default
options.DefaultTimeout = TimeSpan.FromSeconds(30);

// Per-query override
var context = new ExecutionContext
{
    Timeout = TimeSpan.FromMinutes(2)
};

try
{
    var result = await strategy.ExecuteAsync<User>(query, context);
}
catch (TimeoutException ex)
{
    Console.WriteLine($"Query timed out after {context.Timeout.TotalSeconds}s");
}

Implementation:

using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(timeout);

try
{
    var data = await provider.GetDataAsync(cts.Token);
    var results = data.AsQueryable().Where(expression).ToList();
    return results;
}
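// Translate to TimeoutException only when the caller's own token was not cancelled;
// caller-initiated cancellation should surface as OperationCanceledException.
catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
{
    throw new TimeoutException($"Query execution exceeded {timeout.TotalSeconds}s");
}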

Regex Operator (InMemory Only)

FilterDefinition Usage:

var query = new QueryDefinition
{
    From = "users",
    Where = new FilterDefinition
    {
        Field = "Email",
        Operator = FilterOperator.Regex,
        Value = @"^[a-zA-Z0-9._%+-]+@(example\.com|test\.com)$"
    }
};

// LINQ: user => Regex.IsMatch(user.Email, @"^[a-zA-Z0-9._%+-]+@(example\.com|test\.com)$")

Shorthand Syntax:

The shorthand parser supports regex literals with modifiers:

email:/^admin@/              # Starts with "admin@"
name:/smith$/i               # Ends with "smith" (case-insensitive)
phone:/^\d{3}-\d{4}$/        # Matches XXX-XXXX format

Supported Modifiers (via RegexPatternValue):

| Flag | Effect                                    | .NET RegexOptions                    |
|------|-------------------------------------------|--------------------------------------|
| i    | Case-insensitive                          | RegexOptions.IgnoreCase              |
| m    | Multiline (^ and $ match line boundaries) | RegexOptions.Multiline               |
| s    | Singleline (. matches newline)            | RegexOptions.Singleline              |
| n    | Explicit capture only                     | RegexOptions.ExplicitCapture         |
| x    | Ignore pattern whitespace                 | RegexOptions.IgnorePatternWhitespace |
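
The flag-to-option mapping in the table can be expressed as a small parser. This is an illustrative sketch, not the package's RegexPatternValue implementation; RegexFlags and Parse are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexFlags
{
    // Converts a modifier string such as "im" into the matching RegexOptions.
    public static RegexOptions Parse(string flags)
    {
        var options = RegexOptions.None;
        foreach (var flag in flags)
        {
            options |= flag switch
            {
                'i' => RegexOptions.IgnoreCase,
                'm' => RegexOptions.Multiline,
                's' => RegexOptions.Singleline,
                'n' => RegexOptions.ExplicitCapture,
                'x' => RegexOptions.IgnorePatternWhitespace,
                _ => throw new ArgumentException($"Unsupported regex modifier '{flag}'", nameof(flags))
            };
        }
        return options;
    }
}
```

For example, the shorthand name:/smith$/i corresponds to new Regex("smith$", RegexFlags.Parse("i")).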

RegexPatternValue Usage:

// Programmatic regex with modifiers
var query = new QueryDefinition
{
    From = "users",
    Where = new FilterDefinition
    {
        Field = "Email",
        Operator = FilterOperator.Regex,
        Value = new RegexPatternValue("admin", RegexModifier.IgnoreCase)
    }
};

ReDoS Protection:

Regex patterns are compiled with protection against Regular Expression Denial of Service attacks:

  1. NonBacktracking Engine (Primary): Uses RegexOptions.NonBacktracking for linear-time matching
  2. Timeout Fallback: If the pattern uses constructs that NonBacktracking does not support (backreferences, lookarounds), falls back to the standard engine with a configurable match timeout (default: 1 second)

Configuration:

services.AddQueryBuilder(options =>
{
    options.ConfigureRegex(regex =>
    {
        regex.MatchTimeout = TimeSpan.FromMilliseconds(500);
        regex.PreferNonBacktracking = true;
    });
});
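
The two-step strategy described above can be sketched as follows. This is an assumption-laden illustration, not the package's code: SafeRegex and Create are hypothetical names, and the sketch relies on the .NET 7+ behavior where constructing a Regex with RegexOptions.NonBacktracking throws NotSupportedException for unsupported constructs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegex
{
    // Prefer the linear-time engine; fall back to the backtracking engine guarded by a timeout.
    public static Regex Create(string pattern, RegexOptions options, TimeSpan matchTimeout)
    {
        try
        {
            return new Regex(pattern, options | RegexOptions.NonBacktracking, matchTimeout);
        }
        catch (NotSupportedException)
        {
            // Lookarounds and backreferences land here; the timeout bounds worst-case matching.
            return new Regex(pattern, options, matchTimeout);
        }
    }
}
```

When the fallback engine exceeds the timeout, IsMatch throws RegexMatchTimeoutException, which callers can treat as a failed or rejected filter.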

Common Patterns:

// Email validation
Value = @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

// US Phone number
Value = @"^\+1\d{10}$"

// ZIP code
Value = @"^\d{5}(-\d{4})?$"

// Strong password (8+ chars, upper, lower, digit, special)
Value = @"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"

// URL
Value = @"^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"

Performance:

  • Simple patterns: ~1-2ms per 1,000 items
  • Complex patterns: ~10-20ms per 1,000 items
  • Regex instances are compiled once and cached in the expression tree

Fuzzy Matching (InMemory Only)

Usage:

var query = new QueryDefinition
{
    From = "users",
    Where = new FilterDefinition
    {
        Field = "Name",
        Operator = FilterOperator.Fuzzy,
        Value = "John",
        Threshold = 0.8  // 80% similarity required
    }
};

// Matches: "John", "Jon", "Johan", "Johnny"
// Doesn't match: "Jane", "Jim"

Similarity Algorithms:

// Levenshtein Distance (default)
FuzzyMatcher.Match("John", "Jon", 0.8)    // true (1 edit)
FuzzyMatcher.Match("John", "Jane", 0.8)   // false (3 edits)

// Jaccard Similarity (set-based)
FuzzyMatcher.JaccardSimilarity("hello", "hallo")  // 0.6 (60% similar)

// Soundex (phonetic)
FuzzyMatcher.Soundex("Smith", "Smythe")  // Same soundex code
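
For readers who want to see what these measures compute, here is a minimal sketch of Levenshtein distance and character-set Jaccard similarity. FuzzySketch is a hypothetical class, not the package's FuzzyMatcher. Dividing the edit distance by the longer string's length is an assumption, and the library may normalize differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuzzySketch
{
    // Classic dynamic-programming edit distance using two rolling rows.
    public static int LevenshteinDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (var j = 0; j <= b.Length; j++) previous[j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),  // insertion, deletion
                    previous[j - 1] + cost);                        // substitution or match
            }
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Assumed normalization: 1 - distance / length of the longer string.
    public static double Similarity(string a, string b)
    {
        var longest = Math.Max(a.Length, b.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)LevenshteinDistance(a, b) / longest;
    }

    // Jaccard similarity over the sets of distinct characters.
    public static double Jaccard(string a, string b)
    {
        var setA = new HashSet<char>(a);
        var setB = new HashSet<char>(b);
        if (setA.Count == 0 && setB.Count == 0) return 1.0;
        return (double)setA.Intersect(setB).Count() / setA.Union(setB).Count();
    }
}
```

Under this normalization "John" and "Jon" are one edit apart and score 0.75, while "hello" and "hallo" share three of five distinct characters for a Jaccard score of 0.6.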

Threshold Guidelines:

0.9-1.0: Typos only (Jon → John)
0.8-0.9: Close matches (John → Johan)
0.7-0.8: Loose matches (Smith → Smythe)
<0.7: Too permissive (may return unrelated results)

Performance:

  • Levenshtein: ~5-10ms per 1,000 comparisons
  • Consider pre-indexing for large datasets

Testing

Test Patterns

1. Unit Testing Data Providers:

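public class CachedUsersProviderTests
{
    // Assumes the provider takes an ILogger<CachedUsersProvider>; a no-op logger is enough for tests
    private readonly ILogger<CachedUsersProvider> logger = NullLogger<CachedUsersProvider>.Instance;

    [Fact]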
    public async Task GetDataAsync_ShouldReturnCachedData()
    {
        // Arrange
        var cache = new MemoryCache(new MemoryCacheOptions());
        var provider = new CachedUsersProvider(cache, logger);

        // Act
        var data1 = await provider.GetDataAsync();
        var data2 = await provider.GetDataAsync();

        // Assert
        Assert.NotEmpty(data1);
        Assert.Same(data1, data2);  // Same instance from cache
    }

    [Fact]
    public async Task GetDataAsync_ShouldHandleCacheMiss()
    {
        var cache = new MemoryCache(new MemoryCacheOptions());
        var provider = new CachedUsersProvider(cache, logger);

        var data = await provider.GetDataAsync();

        Assert.NotEmpty(data);
    }
}

2. Integration Testing Execution Strategy:

public class InMemoryExecutionStrategyTests : IClassFixture<InMemoryFixture>
{
    private readonly InMemoryFixture _fixture;

    public InMemoryExecutionStrategyTests(InMemoryFixture fixture)
    {
        _fixture = fixture;
    }

    [Fact]
    public async Task ExecuteAsync_ShouldFilterData()
    {
        // Arrange
        var query = new QueryDefinition
        {
            From = "test-users",
            Where = FilterDefinitionExtensions.Equal("IsActive", true)
        };

        // Act
        var result = await _fixture.ExecuteAsync<User>(query);

        // Assert
        Assert.True(result.Success);
        Assert.All(result.Data, u => Assert.True(u.IsActive));
    }

    [Fact]
    public async Task ExecuteAsync_ShouldApplyPagination()
    {
        var query = new QueryDefinition
        {
            From = "test-users",
            OrderBy = new[] { new OrderingDefinition { Field = "Id", Direction = SortDirection.Ascending } },
            Offset = 10,
            Limit = 5
        };

        var result = await _fixture.ExecuteAsync<User>(query);

        Assert.Equal(5, result.Data.Count());
        Assert.True(result.TotalCount > 10);
    }
}

3. Testing Expression Compilation:

public class ExpressionCompilationTests
{
    [Fact]
    public async Task BuildAsync_ShouldCacheExpressions()
    {
        // Arrange
        var builder = serviceProvider.GetRequiredService<IExpressionBuilder<User>>();
        var filter = FilterDefinitionExtensions.Equal("IsActive", true);

        // Act
        var stopwatch1 = Stopwatch.StartNew();
        var expr1 = await builder.BuildAsync(filter);
        stopwatch1.Stop();

        var stopwatch2 = Stopwatch.StartNew();
        var expr2 = await builder.BuildAsync(filter);
        stopwatch2.Stop();

        // Assert
        Assert.NotNull(expr1);
        Assert.Same(expr1, expr2);  // Same cached instance
        // Compare Elapsed (tick resolution): both runs can round down to 0 ms
        Assert.True(stopwatch2.Elapsed < stopwatch1.Elapsed);  // Cache hit is faster
    }

    [Theory]
    [InlineData(FilterOperator.Equal)]
    [InlineData(FilterOperator.GreaterThan)]
    [InlineData(FilterOperator.Contains)]
    public async Task BuildAsync_ShouldSupportAllOperators(FilterOperator op)
    {
        var builder = serviceProvider.GetRequiredService<IExpressionBuilder<User>>();
        var filter = new FilterDefinition { Field = "Name", Operator = op, Value = "Test" };

        var expr = await builder.BuildAsync(filter);

        Assert.NotNull(expr);
    }
}

4. Testing Parallel Execution:

public class ParallelExecutionTests
{
    [Fact]
    public async Task ExecuteAsync_WithPLINQ_ShouldBeFaster()
    {
        // Arrange
        var largeDataset = Enumerable.Range(1, 100000).Select(i => new User { Id = i, Age = i % 100 }).ToList();
        var provider = new StaticDataProvider<User>(largeDataset);

        var query = new QueryDefinition
        {
            From = "large-dataset",
            Where = FilterDefinitionExtensions.GreaterThan("Age", 50)
        };

        // Sequential
        var options1 = new InMemoryOptions { EnableParallelExecution = false };
        var strategy1 = new InMemoryExecutionStrategy(provider, options1, ...);

        var stopwatch1 = Stopwatch.StartNew();
        await strategy1.ExecuteAsync<User>(query);
        stopwatch1.Stop();

        // Parallel
        var options2 = new InMemoryOptions { EnableParallelExecution = true, MaxDegreeOfParallelism = 8 };
        var strategy2 = new InMemoryExecutionStrategy(provider, options2, ...);

        var stopwatch2 = Stopwatch.StartNew();
        await strategy2.ExecuteAsync<User>(query);
        stopwatch2.Stop();

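        // Assert: Parallel should be faster. Wall-clock comparisons are noisy, so treat this
        // as a smoke test on a multi-core machine and compare Elapsed rather than whole milliseconds.
        Assert.True(stopwatch2.Elapsed < stopwatch1.Elapsed);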
    }
}

Test Fixtures

InMemoryFixture:

public class InMemoryFixture : IAsyncLifetime
{
    private IServiceProvider _serviceProvider = null!;

    public async Task InitializeAsync()
    {
        var services = new ServiceCollection();

        // Register in-memory strategy
        services.AddInMemorySupport();

        // Register test provider
        services.AddSingleton<IDataSourceProvider<User>, TestUsersProvider>();

        // Register schema registry (in-memory)
        services.AddSchemaRegistryInMemory();

        _serviceProvider = services.BuildServiceProvider();

        // Register test data source
        var registration = _serviceProvider.GetRequiredService<RegistrationService>();
        await registration.RegisterInMemoryProviderAsync(
            sourceName: "test-users",
            displayName: "Test Users",
            createdBy: "test"
        );
    }

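    public async Task DisposeAsync()
    {
        // Dispose the container so singleton providers and caches are released
        if (_serviceProvider is IAsyncDisposable disposable)
        {
            await disposable.DisposeAsync();
        }
    }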

    public T GetService<T>() where T : notnull => _serviceProvider.GetRequiredService<T>();

    public async Task<IQueryResult<T>> ExecuteAsync<T>(QueryDefinition query) where T : class
    {
        var strategy = GetService<InMemoryExecutionStrategy>();
        return await strategy.ExecuteAsync<T>(query);
    }
}

Advanced Usage

Custom Data Provider with Caching

public sealed class OrdersApiProvider : IDataSourceProvider<Order>
{
    private readonly HttpClient _httpClient;
    private readonly IMemoryCache _cache;
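    private readonly ILogger<OrdersApiProvider> _logger;

    public OrdersApiProvider(HttpClient httpClient, IMemoryCache cache, ILogger<OrdersApiProvider> logger)
    {
        _httpClient = httpClient;
        _cache = cache;
        _logger = logger;
    }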

    public string SourceName => "orders-api";
    public string DisplayName => "Orders from API";
    public string? Description => "Real-time orders with 2-minute cache";

    public async Task<IEnumerable<Order>> GetDataAsync(CancellationToken ct = default)
    {
        var cacheKey = $"{SourceName}:data";

        return await _cache.GetOrCreateAsync(cacheKey, async entry =>
        {
            _logger.LogInformation("Fetching orders from API");

            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(2);
            entry.SetPriority(CacheItemPriority.High);

            try
            {
                var response = await _httpClient.GetAsync("/api/orders", ct);
                response.EnsureSuccessStatusCode();

                var orders = await response.Content.ReadFromJsonAsync<List<Order>>(cancellationToken: ct);

                _logger.LogInformation("Fetched {Count} orders from API", orders?.Count ?? 0);

                return orders ?? Enumerable.Empty<Order>();
            }
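            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                _logger.LogError(ex, "Failed to fetch orders from API");

                // Return empty on failure (or throw depending on requirements).
                // The empty result is cached as well, so shorten its lifetime to retry soon
                // (10 seconds is an illustrative value).
                entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromSeconds(10);
                return Enumerable.Empty<Order>();
            }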
        }) ?? Enumerable.Empty<Order>();
    }
}

Combining Multiple Providers

public sealed class AggregatedUsersProvider : IDataSourceProvider<User>
{
    private readonly IEnumerable<IDataSourceProvider<User>> _providers;

    public string SourceName => "all-users";
    public string DisplayName => "Aggregated Users from All Sources";
    public string? Description => "Combined users from database, cache, and API";

    public AggregatedUsersProvider(IEnumerable<IDataSourceProvider<User>> providers)
    {
        _providers = providers;
    }

    public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct = default)
    {
        var tasks = _providers.Select(p => p.GetDataAsync(ct));
        var results = await Task.WhenAll(tasks);

        // Merge all users, deduplicating by ID
        var allUsers = results
            .SelectMany(users => users)
            .GroupBy(u => u.Id)
            .Select(g => g.First())  // Take first occurrence
            .ToList();

        return allUsers;
    }
}

Dynamic Provider with Configuration

public sealed class ConfigurableApiProvider<T> : IDataSourceProvider<T> where T : class
{
    private readonly HttpClient _httpClient;
    private readonly string _endpoint;
    private readonly string _sourceName;

    public string SourceName => _sourceName;
    public string DisplayName => $"API: {_endpoint}";
    public string? Description => $"Data from {_endpoint}";

    public ConfigurableApiProvider(
        HttpClient httpClient,
        string endpoint,
        string sourceName)
    {
        _httpClient = httpClient;
        _endpoint = endpoint;
        _sourceName = sourceName;
    }

    public async Task<IEnumerable<T>> GetDataAsync(CancellationToken ct = default)
    {
        var response = await _httpClient.GetAsync(_endpoint, ct);
        response.EnsureSuccessStatusCode();

        var data = await response.Content.ReadFromJsonAsync<List<T>>(cancellationToken: ct);
        return data ?? Enumerable.Empty<T>();
    }
}

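// Registration
// The endpoint is relative, so the HttpClient needs a BaseAddress.
// "users-api" is a hypothetical named client configured elsewhere via AddHttpClient.
services.AddSingleton<IDataSourceProvider<User>>(sp =>
{
    var httpClient = sp.GetRequiredService<IHttpClientFactory>().CreateClient("users-api");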
    return new ConfigurableApiProvider<User>(
        httpClient,
        endpoint: "/api/users",
        sourceName: "users-api"
    );
});

Query with Custom Execution Context

var context = new ExecutionContext
{
    PreferredStrategy = "InMemory",
    Timeout = TimeSpan.FromMinutes(2),
    CancellationToken = cancellationToken,
    Features = new IExecutionFeature[]
    {
        new UserContextFeature { UserId = currentUser.Id },
        new TenantContextFeature { TenantId = tenant.Id },
        new FeatureFlagFeature { EnableRegex = true, EnableFuzzyMatch = true }
    }
};

var result = await strategy.ExecuteAsync<User>(query, context);

// Access execution metadata
Console.WriteLine($"Strategy used: {result.Strategy}");
Console.WriteLine($"Execution time: {result.ExecutionTime.TotalMilliseconds}ms");
Console.WriteLine($"Expression compiled: {result.AdditionalMetadata["CompilationTime"]}ms");
Console.WriteLine($"Cache hit: {result.AdditionalMetadata["CacheHit"]}");

Integration

With QueryBuilder.Core

Models Used:

QueryDefinition              // Universal query format
FilterDefinition            // Filter expressions
DataSourceDefinition        // FROM sources
OrderingDefinition          // ORDER BY fields

Abstractions Implemented:

IExecutionStrategy          // Strategy pattern
IDataSourceProvider<T>      // Data provider contract

With QueryBuilder.Expressions

Expression Building:

// InMemory delegates to QueryBuilder.Expressions for expression compilation
var builder = _serviceProvider.GetRequiredService<IExpressionBuilder<T>>();
var expression = await builder.BuildAsync(filter, cancellationToken);

// Benefits:
// - L1/L2 caching (10-40x faster for repeated queries)
// - FastExpressionCompiler (10-40x faster compilation)
// - Shared cache across all execution strategies

Cache Sharing:

// First query (InMemory): Expression compiled and cached
var result1 = await inMemoryStrategy.ExecuteAsync<User>(query1);

// Second query (InMemory): Expression retrieved from L1 cache
var result2 = await inMemoryStrategy.ExecuteAsync<User>(query1);

// Even if another strategy runs, cache is shared
// (Note: EntityFramework also benefits from expression caching)

With QueryBuilder.SchemaRegistry

Provider Resolution:

// User submits query
var query = new QueryDefinition
{
    From = "cachedUsers"
};

// InMemory strategy resolves provider
var registry = await _schemaRegistry.GetBySourceNameAsync("cachedUsers");

// Validate it's an in-memory source
if (registry.Type != DataSourceType.InMemory)
{
    throw new CoreException(CoreErrorCode.InvalidConfiguration, "Not an in-memory source");
}

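// Get provider type (Type.GetType returns null when the name cannot be resolved)
var providerType = Type.GetType(registry.ProviderTypeName)
    ?? throw new CoreException(CoreErrorCode.InvalidConfiguration, $"Provider type '{registry.ProviderTypeName}' not found");
var provider = (IDataSourceProvider<T>)_serviceProvider.GetRequiredService(providerType);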

// Fetch data
var data = await provider.GetDataAsync(cancellationToken);

Discovery and Registration:

// 1. Discover available providers
var discovery = serviceProvider.GetRequiredService<ProviderDiscoveryService>();
var providers = await discovery.DiscoverProvidersAsync();

// 2. Admin selects which to register
foreach (var provider in providers)
{
    await registrationService.RegisterInMemoryProviderAsync(
        sourceName: provider.SourceName,
        displayName: provider.DisplayName,
        description: provider.Description,
        createdBy: "admin"
    );
}

// 3. Each registered provider is now queryable by its source name
var query = new QueryDefinition
{
    From = new DataSourceDefinition { SourceName = providers.First().SourceName }
};

With ASP.NET Core

Controller Example:

[ApiController]
[Route("api/[controller]")]
public class InMemoryQueryController : ControllerBase
{
    private readonly InMemoryExecutionStrategy _strategy;

    public InMemoryQueryController(InMemoryExecutionStrategy strategy)
    {
        _strategy = strategy;
    }

    [HttpPost("users")]
    public async Task<ActionResult<IQueryResult<User>>> QueryUsers(
        [FromBody] QueryDefinition query,
        CancellationToken ct)
    {
        var result = await _strategy.ExecuteAsync<User>(query, null, ct);

        if (!result.Success)
        {
            return StatusCode(500, new { error = result.Error?.Message });
        }

        return Ok(new
        {
            data = result.Data,
            totalCount = result.TotalCount,
            executionTime = result.ExecutionTime.TotalMilliseconds,
            metadata = result.AdditionalMetadata
        });
    }
}

Best Practices

1. Implement Provider-Side Caching

// ✅ Good - Cache expensive data fetches
public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    return await _cache.GetOrCreateAsync("users", async entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
        return await _apiClient.GetUsersAsync(ct);
    });
}

// ❌ Bad - Fetch from API every time
public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    return await _apiClient.GetUsersAsync(ct);  // Slow and wasteful
}

2. Return Full Datasets from Providers

// ✅ Good - Return all data, let LINQ filter
public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    return await _repository.GetAllUsersAsync(ct);
}

// ❌ Bad - Don't filter in provider (defeats purpose)
public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    // Provider shouldn't know about query filters
    return await _repository.GetActiveUsersAsync(ct);  // Too specific
}

3. Use Pagination for Large Datasets

// ✅ Good - Paginate large result sets
var query = new QueryDefinition
{
    From = "large-dataset",
    OrderBy = new[] { new OrderingDefinition { Field = "Id", Direction = SortDirection.Ascending } },
    Offset = (pageNumber - 1) * pageSize,
    Limit = pageSize
};

// ❌ Bad - Materialize millions of rows
var query = new QueryDefinition
{
    From = "large-dataset"
};
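
The offset arithmetic used in the good example, together with the matching page count, can be wrapped in a small helper. This is a sketch with hypothetical names (Paging, ToOffsetLimit, TotalPages, Page) that applies the same Skip/Take pattern to any in-memory sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Converts a 1-based page number into the Offset/Limit pair used by the query.
    public static (int Offset, int Limit) ToOffsetLimit(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return ((pageNumber - 1) * pageSize, pageSize);
    }

    // Ceiling division: 101 items at 50 per page need 3 pages.
    public static int TotalPages(int totalCount, int pageSize) =>
        (totalCount + pageSize - 1) / pageSize;

    // Ordering first keeps page boundaries stable between requests.
    public static List<T> Page<T, TKey>(IEnumerable<T> source, Func<T, TKey> orderBy, int pageNumber, int pageSize)
    {
        var (offset, limit) = ToOffsetLimit(pageNumber, pageSize);
        return source.OrderBy(orderBy).Skip(offset).Take(limit).ToList();
    }
}
```

The explicit ordering matters: without a deterministic sort, consecutive pages can overlap or skip items.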

4. Enable Parallel Execution for Large Datasets

// ✅ Good - Enable PLINQ for 10,000+ items
services.AddInMemorySupport(options =>
{
    options.EnableParallelExecution = true;
    options.MaxDegreeOfParallelism = Environment.ProcessorCount;
});

// ⚠️ Caution - PLINQ adds overhead for small datasets
// Only enable if typical dataset size > 10,000 items

5. Set Appropriate Timeouts

// ✅ Good - Set reasonable timeouts based on data source
services.AddInMemorySupport(options =>
{
    // Fast in-memory cache: 10s
    // Slow external API: 60s
    options.DefaultTimeout = TimeSpan.FromSeconds(30);
});

// Per-query override for expensive operations
var context = new ExecutionContext
{
    Timeout = TimeSpan.FromMinutes(2)
};

6. Handle Provider Errors Gracefully

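// ✅ Good - Log the failure, then return empty or rethrow. Returning empty hides the failure
// from the strategy (see Known Issues), so rethrow when callers must tell "no data" from "fetch failed"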
public async Task<IEnumerable<User>> GetDataAsync(CancellationToken ct)
{
    try
    {
        return await _apiClient.GetUsersAsync(ct);
    }
    catch (HttpRequestException ex)
    {
        _logger.LogError(ex, "API request failed");
        return Enumerable.Empty<User>();  // or throw, depending on requirements
    }
}

7. Use Appropriate Service Lifetimes

// ✅ Good - Choose lifetime based on provider characteristics
services.AddDataSourceProvider<CachedUsersProvider>(ServiceLifetime.Scoped);     // Per-request cache
services.AddDataSourceProvider<StaticCountriesProvider>(ServiceLifetime.Singleton);  // Static data
services.AddDataSourceProvider<ApiOrdersProvider>(ServiceLifetime.Transient);    // New instance each time

8. Avoid Regex/Fuzzy on Large Datasets

// ⚠️ Slower - Regex on 100,000 items
Where = new FilterDefinition { Field = "Email", Operator = FilterOperator.Regex, Value = @".*@example\.com" }

// ✅ Faster - Use simpler operators when possible
Where = FilterDefinitionExtensions.Contains("Email", "@example.com")

// If regex required, filter to smaller dataset first
Where = FilterDefinitionExtensions.And(
    FilterDefinitionExtensions.Equal("IsActive", true),  // Reduce dataset
    new FilterDefinition { Field = "Email", Operator = FilterOperator.Regex, Value = @".*@example\.com" }
)

9. Monitor Expression Cache Hit Rate

// Log cache metrics
_logger.LogInformation(
    "Cache hit: {CacheHit}, compilation time: {CompilationTime}ms",
    result.AdditionalMetadata["CacheHit"],           // true/false
    result.AdditionalMetadata["CompilationTime"]);   // 0.1ms (hit) vs 6.5ms (miss)

// High cache hit rate (>80%) = good performance
// Low cache hit rate (<50%) = investigate query variation
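
To turn the per-query CacheHit flag into a rate that can be compared against these thresholds, a small thread-safe counter is enough. CacheHitTracker is a hypothetical helper, not part of the package. The sketch assumes you feed it the CacheHit value from each result.

```csharp
using System;
using System.Threading;

public sealed class CacheHitTracker
{
    private long _hits;
    private long _total;

    // Call once per executed query with the CacheHit flag from its metadata.
    public void Record(bool cacheHit)
    {
        Interlocked.Increment(ref _total);
        if (cacheHit)
        {
            Interlocked.Increment(ref _hits);
        }
    }

    // Fraction of recorded queries that hit the cache; 0 when nothing has been recorded.
    public double HitRate
    {
        get
        {
            var total = Interlocked.Read(ref _total);
            return total == 0 ? 0.0 : (double)Interlocked.Read(ref _hits) / total;
        }
    }
}
```

A single instance registered as a singleton can be shared across requests, and its HitRate exported to whatever metrics system is in use.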

10. Test Providers Independently

// ✅ Good - Unit test provider in isolation
[Fact]
public async Task GetDataAsync_ShouldReturnData()
{
    var provider = new CachedUsersProvider(_cache, _logger);
    var data = await provider.GetDataAsync();
    Assert.NotEmpty(data);
}

// ✅ Good - Integration test full query execution
[Fact]
public async Task ExecuteAsync_ShouldFilterData()
{
    var query = new QueryDefinition { /* ... */ };
    var result = await _strategy.ExecuteAsync<User>(query);
    Assert.True(result.Success);
}

Limitations

Current Limitations (Phase 10+ Features)

1. No Custom Projections:

// ❌ Not Supported - Custom SELECT columns
Select = new List<ProjectionDefinition>
{
    new() { Field = "Id" },
    new() { Field = "Name" }
}

// Currently returns full object with all properties
// Workaround: Use .Select() on returned data
var result = await strategy.ExecuteAsync<User>(query);
var projected = result.Data.Select(u => new { u.Id, u.Name });

2. No Aggregations:

// ❌ Not Supported - GROUP BY, COUNT, SUM, AVG
Select = new List<ProjectionDefinition>
{
    new()
    {
        Function = new FunctionDefinition { Name = "COUNT", IsAggregate = true }
    }
}

// Workaround: Use LINQ on returned data
var result = await strategy.ExecuteAsync<User>(query);
var count = result.Data.Count();
var sum = result.Data.Sum(u => u.Age);
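
The same workaround extends to GROUP BY style aggregation by grouping the returned data with LINQ. The sketch below is illustrative only: Customer and ClientSideAggregation are hypothetical types standing in for your own entity and helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record Customer(string Country, int Age);

public static class ClientSideAggregation
{
    // Equivalent of: SELECT Country, COUNT(*), AVG(Age) ... GROUP BY Country
    public static Dictionary<string, (int Count, double AverageAge)> ByCountry(IEnumerable<Customer> rows)
    {
        return rows
            .GroupBy(row => row.Country)
            .ToDictionary(
                group => group.Key,
                group => (Count: group.Count(), AverageAge: group.Average(row => row.Age)));
    }
}
```

Because the grouping runs on the materialized result, filter and paginate in the query first so that only the rows you need are aggregated.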

3. No Subqueries:

// ❌ Not Supported - Subqueries in WHERE
Where = new FilterDefinition
{
    Operator = FilterOperator.In,
    Subquery = new QueryDefinition { /* ... */ }
}

// Workaround: Execute subquery first, use results in main query.
// Projections are not supported either, so fetch entities and extract the IDs client-side.
var subResult = await strategy.ExecuteAsync<User>(subquery);
var ids = subResult.Data.Select(u => u.Id).ToArray();

var mainQuery = new QueryDefinition
{
    Where = new FilterDefinition { Field = "Id", Operator = FilterOperator.In, Values = ids }
};

4. No Window Functions:

// ❌ Not Supported - ROW_NUMBER, LAG, LEAD
Function = new FunctionDefinition { Name = "ROW_NUMBER", IsWindowFunction = true }

// Workaround: Use LINQ window function libraries or manual implementation
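
As an example of the manual route, ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) can be emulated on the returned data with GroupBy and an indexed Select. WindowSketch and RowNumber are hypothetical names; this is a sketch, not a planned API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowSketch
{
    // Numbers rows from 1 within each partition, following the given ordering.
    public static IEnumerable<(T Item, int RowNumber)> RowNumber<T, TPartition, TOrder>(
        IEnumerable<T> source,
        Func<T, TPartition> partitionBy,
        Func<T, TOrder> orderBy)
    {
        return source
            .GroupBy(partitionBy)
            .SelectMany(group => group
                .OrderBy(orderBy)
                .Select((item, index) => (Item: item, RowNumber: index + 1)));
    }
}
```

Partitions come out in the order their keys first appear in the source, and rows inside each partition follow the requested ordering.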

Known Issues

1. Large Datasets Without Pagination:

Issue: Materializing millions of rows can cause OutOfMemoryException
Impact: High memory usage, slow performance
Workaround: Always use pagination (Limit/Offset) for large datasets

2. Provider Errors Not Surfaced:

Issue: If provider returns empty on error, strategy doesn't know
Impact: Silent failures, misleading results
Workaround: Log errors in provider, or throw exceptions

3. No Query Plan Analysis:

Issue: No cost estimation for in-memory queries
Impact: Can't predict performance of complex queries
Workaround: Performance test representative queries

Planned Enhancements (Roadmap)

Phase 10: Projections and Aggregations

  • Custom SELECT columns
  • COUNT, SUM, AVG, MIN, MAX
  • GROUP BY with HAVING

Phase 11: Subqueries

  • Subqueries in WHERE (IN, EXISTS)
  • Scalar subqueries in SELECT
  • Correlated subqueries

Phase 12: Window Functions

  • ROW_NUMBER, RANK, DENSE_RANK
  • LAG, LEAD over partitions
  • FIRST_VALUE, LAST_VALUE

Phase 13: Advanced Features

  • Multi-field DISTINCT (single-field DISTINCT is supported)
  • UNION, INTERSECT, EXCEPT
  • Common Table Expressions (CTEs)

★ Insight ─────────────────────────────────────

QueryBuilder.InMemory demonstrates three powerful architectural patterns:

  1. Provider Abstraction: The IDataSourceProvider<T> interface unifies diverse data sources (caches, APIs, files, queues) into a single contract. This allows the execution strategy to treat a Redis cache, a REST API, and a JSON file identically, enabling true polyglot data access.

  2. Shared Expression Caching: By delegating expression compilation to QueryBuilder.Expressions, InMemory gains automatic L1/L2 caching without implementing it. This "composition over inheritance" approach reduces code duplication and ensures consistent performance across all strategies.

  3. PLINQ as Opt-In Optimization: Parallel execution is configurable, not mandatory. This recognizes that parallelism has overhead - for small datasets (<10,000 items), sequential execution is faster. The strategy adapts to data size, showing that performance optimization isn't always about "more threads."

─────────────────────────────────────────────────

Contributing

Contributions are welcome! Please:

  1. Follow the existing code style and patterns
  2. Add comprehensive tests for new features
  3. Update documentation (README, XML comments)
  4. Ensure all tests pass
  5. Consider performance implications of changes

License

MIT License - See LICENSE file for details


| Product | Compatible and additional computed target framework versions |
|---------|---------------------------------------------------------------|
| .NET    | net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed. |

NuGet packages (1)

Showing the top 1 NuGet packages that depend on UniversalQueryBuilder.InMemory:

Package Downloads
UniversalQueryBuilder.EntityFramework

Entity Framework Core execution strategy for Universal Query Builder. Provides SQL query generation with automatic navigation property includes, DTO projections, and multi-DbContext support.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
10.0.13-beta 47 6/3/2026
10.0.12-beta 61 6/1/2026
10.0.11-beta 68 5/31/2026
10.0.10-beta 65 5/28/2026
10.0.9-beta 60 5/27/2026
10.0.8-beta 66 5/18/2026
10.0.7-beta 64 5/16/2026
10.0.6-beta 66 5/11/2026
10.0.5-beta 65 4/30/2026
10.0.4-beta 60 4/23/2026
10.0.3-beta 75 4/23/2026
10.0.2-beta 69 4/10/2026
10.0.1-beta 56 4/10/2026
10.0.0-beta 62 4/9/2026