Lokad.Utf8Regex.PythonRe 0.2.0

Prefix Reserved
dotnet add package Lokad.Utf8Regex.PythonRe --version 0.2.0
                    
NuGet\Install-Package Lokad.Utf8Regex.PythonRe -Version 0.2.0
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Lokad.Utf8Regex.PythonRe" Version="0.2.0" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="Lokad.Utf8Regex.PythonRe" Version="0.2.0" />
                    
Directory.Packages.props
<PackageReference Include="Lokad.Utf8Regex.PythonRe" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add Lokad.Utf8Regex.PythonRe --version 0.2.0
                    
#r "nuget: Lokad.Utf8Regex.PythonRe, 0.2.0"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package Lokad.Utf8Regex.PythonRe@0.2.0
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=Lokad.Utf8Regex.PythonRe&version=0.2.0
                    
Install as a Cake Addin
#tool nuget:?package=Lokad.Utf8Regex.PythonRe&version=0.2.0
                    
Install as a Cake Tool

Lokad.Utf8Regex.PythonRe

Lokad.Utf8Regex.PythonRe is a net10.0 side extension for Lokad.Utf8Regex that exposes a UTF-8-first Python re flavored API through Utf8PythonRegex.

It is intended for workloads where the input already exists as UTF-8 bytes and you need Python-style regular expression semantics on top of the base Lokad.Utf8Regex library.

Support Scope

  • Semantic target: Utf8PythonRegex models the standard re surface, while remaining a managed .NET library.
  • Implementation model: parser-first translation. Python-style patterns are parsed into a dedicated AST before any backend is selected.
  • Translation-first execution: when a Python-style pattern can be translated without semantic drift, execution is delegated to Utf8Regex.
  • Fallback execution: if UTF-8 native execution cannot be created, the package falls back to managed .NET Regex on the translated pattern.
  • Dependency model: the only non-BCL implementation dependency is Lokad.Utf8Regex.
  • Primary I/O model: the main API surface operates on UTF-8 ReadOnlySpan<byte>.
  • Capture-aware APIs use PythonRe-specific result/context types, independent from the PCRE2 sidecar.
  • Culture model: the profile is culture-invariant by construction; Python re.LOCALE is intentionally unsupported.
  • Named Unicode escapes: \N{...} is intentionally unsupported to avoid embedding a large Unicode-name table in the package.
  • FindAll(...) is a structural full-match collector.
  • FindAllToStrings(...) provides CPython-shaped re.findall(...) string result shaping.
  • FindAllToUtf8(...) provides the same host-facing shape with UTF-8 byte outputs.
  • SearchToString(...), MatchToString(...), and FullMatchToString(...) provide host-friendly scalar helpers.
  • SearchDetailedData(...), MatchDetailedData(...), FullMatchDetailedData(...), and FindIterDetailed(...) provide non-ref struct match snapshots for runtime hosts.
  • ReplaceToString<TState>(...) / SubnToString<TState>(...) support evaluator-driven replacement over match snapshots.
  • Subn(...) / Subn<TState>(...) provide UTF-8 replacement outputs with replacement counts for bytes-first hosts.
  • Replace<TState>(...) / Subn<TState>(...) also expose byte-oriented evaluator overloads for hosts that want to stay UTF-8-native end to end.
  • SplitDetailed(...) exposes split output with segment/capture metadata.

Public API

using Lokad.Utf8Regex.PythonRe;

var regex = new Utf8PythonRegex(
    pattern: @"(?P<word>foo)-(?P=word)",
    options: PythonReCompileOptions.None);

bool isMatch = regex.IsMatch("foo-foo"u8);
var first = regex.Search("xx foo-foo yy"u8);
int count = regex.Count("foo-foo x foo-foo"u8);
byte[] replaced = regex.Replace("foo x foo"u8, "<\\g<word>>");
Utf8PythonMatchData[] matches = regex.FindAll("foo-foo xx foo-foo"u8);
var iter = regex.FindIterDetailed("foo-foo xx foo-foo"u8);
string?[] parts = regex.SplitToStrings("foo-foo xx foo-foo"u8);

String-returning helpers such as SearchToString(...), ReplaceToString(...), and FindAllToStrings(...) are also available as host-friendly conveniences when a runtime needs scalar string results.

Development Notes

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on Lokad.Utf8Regex.PythonRe:

Package Downloads
Lokad.Lython

Embeddable, contained-by-design Python runtime for .NET, implemented in managed C# with host-mediated file and process capabilities.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
0.2.0 150 4/29/2026
0.1.0 101 4/29/2026