Linq.OpenJson.InClause.Middleware 1.0.16

dotnet add package Linq.OpenJson.InClause.Middleware --version 1.0.16                
NuGet\Install-Package Linq.OpenJson.InClause.Middleware -Version 1.0.16                
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Linq.OpenJson.InClause.Middleware" Version="1.0.16" />                
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add Linq.OpenJson.InClause.Middleware --version 1.0.16                
#r "nuget: Linq.OpenJson.InClause.Middleware, 1.0.16"                
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
// Install Linq.OpenJson.InClause.Middleware as a Cake Addin
#addin nuget:?package=Linq.OpenJson.InClause.Middleware&version=1.0.16

// Install Linq.OpenJson.InClause.Middleware as a Cake Tool
#tool nuget:?package=Linq.OpenJson.InClause.Middleware&version=1.0.16                

Linq Performance Interceptor (SQL Server 16+ and EF Core 6\7)

This NuGet Package has been developed for scenarios where an application that can't (yet) be migrated to Entity Framework Core 8 because of legacy issues, but needs to be scaled up and handle queries that will contain millions of values within the IN or NOT IN clause (generated by the LINQ Contains extension method).

This package will interpret the LINQ queries and replace all large IN or NOT IN clauses (that have more than 1000 values, as EF can make queries more complex with duplicating the clauses against joined tables, it's best to implement OpenJson all the time, like EFC8) with TSQL's OpenJson command. It will work for combinations of IN and NOT IN clauses, along with multiples of these clauses.

Within the IN and NOT IN clauses the following datatypes have been catered for:

  1. Integer
  2. Double | Decimal | Float
  3. String

Update Your Code

There are two places that you can update your code, just pick the one that best fits your project type:

Your dbContext Class

Add a reference to the LinqOpenJsonInClauseMiddleware.Classes namespace

using LinqOpenJsonInClauseMiddleware.Classes;

Add the middleware reference within the OnConfiguring method using optionsBuilder parameter.

Add this line below: optionsBuilder.AddInterceptors(new SqlServerReaderExecutingOpenJson());

 protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
 {
     var config = new ConfigurationBuilder()
      .SetBasePath(AppContext.BaseDirectory)
      .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
      .Build();
     
     var connectionString = config.GetConnectionString("DefaultConnection"); // Get connection string

     // Configure the context to use SQL Server
     optionsBuilder.UseSqlServer(connectionString, options =>
     {
         options.CommandTimeout(180); // Set command timeout (in seconds)                        
     });

     optionsBuilder.AddInterceptors(new SqlServerReaderExecutingOpenJson()); <-- Add this line
 }

Your Program.cs Class

Add a reference to the LinqOpenJsonInClauseMiddleware.Classes namespace

using LinqOpenJsonInClauseMiddleware.Classes;

For each context that you create, add the Interceptor middleware, so that your LINQ statements will be parsed before being executed on SQL Server:

Add this line below: options.AddInterceptors(new SqlServerReaderExecutingOpenJson());

var builder = Host.CreateDefaultBuilder(args).ConfigureServices((context, services) =>
{
    services.AddDbContextFactory<LinqPerformanceDemoContext>(options =>
    {
        options.UseSqlServer(context.Configuration.GetConnectionString("DefaultConnection"));

        options.AddInterceptors(new SqlServerReaderExecutingOpenJson()); <-- Add this line

    }, ServiceLifetime.Scoped);

    services.AddDbContext<LinqPerformanceDemoContext>(options =>
    {
        options.UseSqlServer(context.Configuration.GetConnectionString("DefaultConnection"));

        options.AddInterceptors(new SqlServerReaderExecutingOpenJson()); <-- Add this line

    }, ServiceLifetime.Scoped);

}).Build();

Example Code Usage

The concept when using a large amount (in the millions) of values for your LINQ Contains extension is to create a Parallel loop and iterate over a chunked (manageable) loop of values (say in the 100K range - you can certainly go higher, it all just depends on the complexity of your individual LINQ statement). Add the results to a ConcurrentBag and when finished, convert the ConcurrentBag back to a normal list for further processing locally.

For example below:

  1. Generate a list of values with a chunk size of 100,000.
  2. Create a collection to manage the tasks executed in parallel.
  3. Configure Parallel Options, setting the number of cores to 2. You can load this value from a configuration file to easily adjust it per environment and deployment.
  4. Utilize a ConcurrentBag to store the results, which will be instances of a model class called LargeTable.
const int CHUNK_SIZE = 100_000;            
var tskDatabaseQueries = new List<Task>(); // maintain the running background tasks
var parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = 2 };
var resultBag = new ConcurrentBag<List<LargeTable>>(); // ConcurrentBag to hold all results safely

I call a method that will return 1 million integers, which is then chucked into a list of integers, 100K in size.

List<int> cacheInt = await Retrieve1MCacheIntegerAsync();
var chunkInt = cacheInt.Chunk(CHUNK_SIZE);

private async Task<List<int>> Retrieve1MCacheIntegerAsync(int recoundCount = 1_000_000)
{
    using (var context = _dbContextFactory.CreateDbContext())
    {
        return await context.LargeTables.Select(itm => itm.ID).Take(recoundCount).ToListAsync();
    }
}

I then make numerous calls to the database using a Parallel loop (but restraining the calls to 2 at a time). When the call returns, I add the results into a (thread safe) ConcurrentBag (that is expecting the appropriate datatype - LargeTable class model). Finally, I wait for all the threaded tasks to be completed before continuing.

tskDatabaseQueries.Add(Task.Run(async () =>
{
    await Parallel.ForEachAsync(chunkInt, parallelOptions, async (sublist, cancellationToken) =>
    {
        var queryResult = await RetrieveIntegerChunkAsync(sublist);
        resultBag.Add(queryResult);
    });
}));

// Wait for all tasks to complete
await Task.WhenAll(tskDatabaseQueries);

Here is an example of where I am making multiple parallel calls with multiple datatypes, adding the results safely to the ConcurrentBag and waiting for all the tasks to complete.

// make a linq call using Integer as the IN clause
tskDatabaseQueries.Add(Task.Run(async () =>
{
    await Parallel.ForEachAsync(chunkInt, parallelOptions, async (sublist, cancellationToken) =>
    {
        var queryResult = await RetrieveIntegerChunkAsync(sublist);
        resultBag.Add(queryResult);
    });
}));

// make a linq call using Char as the IN clause
tskDatabaseQueries.Add(Task.Run(async () =>
{
    await Parallel.ForEachAsync(chunkChar, parallelOptions, async (sublist, cancellationToken) =>
    {
        var queryResult = await RetrieveCharChunkAsync(sublist);
        resultBag.Add(queryResult);
    });
}));

// make a linq call using Double as the IN clause
tskDatabaseQueries.Add(Task.Run(async () =>
{
    await Parallel.ForEachAsync(chunkDouble, parallelOptions, async (sublist, cancellationToken) =>
    {
        var queryResult = await RetrieveDoubleChunkAsync(sublist);
        resultBag.Add(queryResult);
    });
}));

// make a linq call using String as the IN clause
tskDatabaseQueries.Add(Task.Run(async () =>
{
    await Parallel.ForEachAsync(chunkString, parallelOptions, async (sublist, cancellationToken) =>
    {
        var queryResult = await RetrieveStringChunkAsync(sublist);
        resultBag.Add(queryResult);
    });
}));

// Wait for all tasks to complete
await Task.WhenAll(tskDatabaseQueries);

Limitations

  1. Currently only SQL Server 2016+ will be able to execute SQL statements with OpenJson.
  2. Entity Framework Core 8 Nuget package already has this functionality implemented.
Product Compatible and additional computed target framework versions.
.NET net6.0 is compatible.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 was computed.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last updated
1.0.16 88 5/31/2024

Accounting for varchar conversion using N.
For e.g. [l].[Team]) NOT IN (N'Finn', N'Harps', N'FC')