Pure .NET library to read and write Apache Parquet files, targeting .NET 4.5 and .NET Standard 1.4 and up. Linux, Windows and Mac are first-class citizens, but the library also works everywhere .NET runs (Android, iOS, IoT). Has zero dependencies on third-party libraries or any native code. Provides both low-level access to Apache Parquet files and high-level utilities for more traditional, human-readable row-based access. Includes an automatic serializer/deserializer between C# classes and Parquet files that works by generating MSIL (bytecode) on the fly and is therefore very fast.
Package Manager: Install-Package Parquet.Net -Version 3.2.5
.NET CLI: dotnet add package Parquet.Net --version 3.2.5
PackageReference: <PackageReference Include="Parquet.Net" Version="3.2.5" />
Paket CLI: paket add Parquet.Net --version 3.2.5
- bug fixed: multiple nesting levels were not correctly read by row-based helpers (#363)
- bug fixed: Equals method on Row didn't compare list elements correctly (#361)
- bug fixed: multi-page columns were not read to the end if their order was not what parquet-dotnet expects (#370)
- new feature: POCO serialiser supports repeatable fields (#358)
- bug fixed: --max-rows 10 not honored by PARQ Global Tool 📺 (#357)
- bug fixed: failure to read columns if a data page is larger than it should be (supposedly padded by Spark) 🐛
- improvement: PARQ Global Tool 📺 now limits the number of rows printed, showing only the first 10 rows by default
- includes massive performance improvements in the parquet reader, which is now faster than fastparquet (Python library)
- new feature: replaced default ToString() method in Table and Row object to produce json (#346)
- new feature: parquet CLI supports conversion from parquet to json (#341)
- parq CLI improvements
- re-introduced utilities for row-based access, allowing you to access and create parquet files in a more readable format
- Field class now supports MaxRepetitionLevel and MaxDefinitionLevel
- fixed bug #334 preventing reading generated files in Impala
- parquet.net library supports SourceLink
- #321 bug fixed: a nullable field should support all-non-nullable values
- performance improvement around packing definition levels
- bug fixed: Cannot read schema where map elements are structures (#320)
- critical bug fixed: reading parquet files with multiple pages doesn't read beyond 1st page (#318)
- performance improvements (#317)
- improvement: better column validation in row group writer
- bug fixed: Snappy compression writer fails on certain encodings (#315)
- the first release of a major rewrite
Top GitHub repository that depends on Parquet.Net:
ML.NET is an open source and cross-platform machine learning framework for .NET.