ParquetSharpLINQ.Azure 1.1.0-alpha.3

.NET CLI:
dotnet add package ParquetSharpLINQ.Azure --version 1.1.0-alpha.3

Package Manager:
NuGet\Install-Package ParquetSharpLINQ.Azure -Version 1.1.0-alpha.3

PackageReference:
<PackageReference Include="ParquetSharpLINQ.Azure" Version="1.1.0-alpha.3" />

Central Package Management:
<PackageVersion Include="ParquetSharpLINQ.Azure" Version="1.1.0-alpha.3" />
<PackageReference Include="ParquetSharpLINQ.Azure" />

Paket CLI:
paket add ParquetSharpLINQ.Azure --version 1.1.0-alpha.3

Script & Interactive:
#r "nuget: ParquetSharpLINQ.Azure, 1.1.0-alpha.3"

File-based apps:
#:package ParquetSharpLINQ.Azure@1.1.0-alpha.3

Cake:
#addin nuget:?package=ParquetSharpLINQ.Azure&version=1.1.0-alpha.3&prerelease
#tool nuget:?package=ParquetSharpLINQ.Azure&version=1.1.0-alpha.3&prerelease
ParquetSharpLINQ.Azure
Azure Blob Storage support for ParquetSharpLINQ with Delta Lake.
Overview
This library extends ParquetSharpLINQ with Azure Blob Storage streaming capabilities, allowing you to query Parquet and Delta Lake tables in Azure without downloading them to disk.
Features
- ✅ Stream Parquet files directly from Azure Blob Storage
- ✅ Delta Lake Support - Automatic Delta transaction log reading from Azure
- ✅ Zero disk I/O - all data cached in memory
- ✅ Same LINQ API as local files
- ✅ Partition pruning works seamlessly
- ✅ Column projection supported
- ✅ Multiple authentication methods
Quick Start
Hive-style Parquet Files:
using ParquetSharpLINQ;
using ParquetSharpLINQ.Azure;
// Create table from Azure Blob Storage
var connectionString = "DefaultEndpointsProtocol=https;AccountName=...";
using var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    connectionString: connectionString,
    containerName: "sales-data"
);

// Query with LINQ - partition pruning works automatically!
var results = table
    .Where(s => s.Year == 2024)         // Only reads year=2024 partitions
    .Where(s => s.Region == "us-east")  // Further filters partitions
    .Select(s => new { s.ProductId, s.Amount })
    .ToList();
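The SalesRecord type used in these examples is not defined by this package; a minimal sketch of what it might look like, assuming the ParquetSharpLINQ core library maps public properties to Parquet columns and Hive partition keys by name (the exact mapping rules and any required attributes come from the core library, so treat this as illustrative):

// Hypothetical POCO for the Quick Start examples. Property names are assumed
// to line up with the Parquet column names and the partition keys
// (year=.../region=...); the property types are also assumptions.
public class SalesRecord
{
    public int Year { get; set; }              // partition column: year=2024/
    public string Region { get; set; } = "";   // partition column: region=us-east/
    public string ProductId { get; set; } = "";
    public decimal Amount { get; set; }
}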
Delta Lake Tables:
using ParquetSharpLINQ;
using ParquetSharpLINQ.Azure;
// Delta Lake tables work automatically - just point to the container
using var deltaTable = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    connectionString: connectionString,
    containerName: "delta-sales"  // Container with _delta_log/ prefix
);

// Delta transaction log is read automatically from Azure
// Only active files (after updates/deletes) are queried
var results = deltaTable
    .Where(s => s.Year == 2024)
    .ToList();
Factory Methods
ParquetTable<T>.Factory.FromAzureBlob()
Creates a ParquetTable for querying Parquet files from Azure Blob Storage with Hive-style partitioning and Delta Lake support.
With connection string:
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    connectionString: "DefaultEndpointsProtocol=https;AccountName=...",
    containerName: "sales-data"
);
With existing BlobContainerClient:
var containerClient = new BlobContainerClient(connectionString, containerName);
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(containerClient);
With blob prefix (subfolder):
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    connectionString: connectionString,
    containerName: "data",
    blobPrefix: "sales/2024/"
);
With custom cache settings:
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    connectionString: connectionString,
    containerName: "sales-data",
    blobPrefix: "",
    cacheExpiration: TimeSpan.FromMinutes(10),
    maxCacheSizeBytes: 8L * 1024 * 1024 * 1024  // 8 GB
);
Features:
- Automatically configures optimized BlobContainerClient with HTTP/2, connection pooling, and retry logic
- Supports both Hive-style partitioning and Delta Lake
- Automatically detects _delta_log/ blobs in the container
- Reads Delta transaction logs from Azure Blob Storage
- Only queries active files according to the Delta log
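The first bullet above covers the factory's own client setup; if you instead build the BlobContainerClient yourself (for the overload shown earlier), you can apply retry and transport settings through the standard Azure SDK options. A sketch follows; the values are illustrative and not the package's internal defaults:

using Azure.Core;
using Azure.Storage.Blobs;

// Standard Azure SDK client options; tune these to your workload.
var options = new BlobClientOptions
{
    Retry =
    {
        Mode = RetryMode.Exponential,
        MaxRetries = 5,
        Delay = TimeSpan.FromMilliseconds(200),
        MaxDelay = TimeSpan.FromSeconds(10)
    }
};

var containerClient = new BlobContainerClient(connectionString, "sales-data", options);
using var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(containerClient);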
Core Components
AzureBlobParquetReader
Low-level reader for streaming Parquet files from Azure with file-based caching and LRU eviction.
var containerClient = new BlobContainerClient(connectionString, containerName);
var reader = new AzureBlobParquetReader(containerClient, maxCacheSizeBytes: 4L * 1024 * 1024 * 1024);
var discoveryStrategy = new AzureBlobPartitionDiscovery(containerClient);
var table = new ParquetTable<SalesRecord>(discoveryStrategy, reader);
AzureBlobPartitionDiscovery
Discovers Hive-style partitions and Delta Lake tables in Azure Blob Storage.
Features:
- Hive-style partitioning (year=2024/region=us-east/)
- Delta Lake transaction logs (_delta_log/*.json)
Authentication
Connection String (Development)
var connectionString = "DefaultEndpointsProtocol=https;AccountName=...";
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(connectionString, "container");
Managed Identity (Production - Recommended)
using Azure.Identity;
var containerClient = new BlobServiceClient(
    new Uri("https://myaccount.blob.core.windows.net"),
    new DefaultAzureCredential()
).GetBlobContainerClient("container");
var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(containerClient);
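When the host exposes several identities, or you use a user-assigned managed identity, DefaultAzureCredential can be pointed at a specific client ID. This is standard Azure.Identity usage; the client ID below is a placeholder:

using Azure.Identity;
using Azure.Storage.Blobs;

var credential = new DefaultAzureCredential(new DefaultAzureCredentialOptions
{
    // Placeholder: client ID of the user-assigned managed identity.
    ManagedIdentityClientId = "<client-id>"
});

var containerClient = new BlobServiceClient(
    new Uri("https://myaccount.blob.core.windows.net"),
    credential
).GetBlobContainerClient("container");

using var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(containerClient);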
Testing with Azurite
Test locally without an Azure subscription using Azurite (Azure Storage Emulator):
# Start Azurite with Docker
docker run -d -p 10000:10000 --name azurite \
    mcr.microsoft.com/azure-storage/azurite

# Use Azurite connection string
const string AzuriteConnectionString =
    "DefaultEndpointsProtocol=http;" +
    "AccountName=devstoreaccount1;" +
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;" +
    "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;";

// Test your code
using var table = ParquetTable<SalesRecord>.Factory.FromAzureBlob(
    AzuriteConnectionString,
    "test-container"
);
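Azurite starts empty, so tests have to create the container and upload Parquet data before querying. A minimal setup sketch using the standard blob SDK, assuming your test data already sits on disk with Hive-style partition folders under ./testdata (a hypothetical path):

using Azure.Storage.Blobs;

var containerClient = new BlobContainerClient(AzuriteConnectionString, "test-container");
containerClient.CreateIfNotExists();

// Upload files, keeping relative paths as blob names so Hive-style partition
// folders (e.g. year=2024/region=us-east/) are preserved in the container.
var root = Path.GetFullPath("./testdata");
foreach (var file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
{
    var blobName = Path.GetRelativePath(root, file).Replace('\\', '/');
    using var stream = File.OpenRead(file);
    containerClient.UploadBlob(blobName, stream);
}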
Benefits:
- No Azure costs
- Fast local development (~40x faster than real Azure)
- Perfect for CI/CD pipelines
- Works offline
Dependencies
- ParquetSharpLINQ (core library with Delta Lake support)
- Azure.Storage.Blobs 12.19.1+
License
Same as ParquetSharpLINQ core library.
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies by target framework:
- net8.0
  - Azure.Storage.Blobs (>= 12.26.0)
  - ParquetSharpLINQ (>= 1.1.0-alpha.3)
- net9.0
  - Azure.Storage.Blobs (>= 12.26.0)
  - ParquetSharpLINQ (>= 1.1.0-alpha.3)
- net10.0
  - Azure.Storage.Blobs (>= 12.26.0)
  - ParquetSharpLINQ (>= 1.1.0-alpha.3)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.1.0-alpha.3 | 128 | 12/23/2025 |
| 1.1.0-alpha.2 | 124 | 12/23/2025 |
| 1.1.0-alpha.1 | 108 | 12/21/2025 |
| 1.0.0-alpha.6 | 73 | 12/11/2025 |
| 1.0.0-alpha.5 | 386 | 12/10/2025 |
| 1.0.0-alpha.4 | 373 | 12/9/2025 |
| 1.0.0-alpha.3 | 381 | 12/9/2025 |
| 1.0.0-alpha.2 | 391 | 12/8/2025 |
| 1.0.0-alpha.1 | 368 | 12/8/2025 |
Initial release with Azure Blob Storage streaming support, Delta Lake support, and in-memory caching.