FasterWhisper.NET.Gpu 1.0.6

.NET 8.0

dotnet add package FasterWhisper.NET.Gpu --version 1.0.6

NuGet\Install-Package FasterWhisper.NET.Gpu -Version 1.0.6

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="FasterWhisper.NET.Gpu" Version="1.0.6" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package:

<PackageVersion Include="FasterWhisper.NET.Gpu" Version="1.0.6" />

Then reference the package, without a version, in the project file:

<PackageReference Include="FasterWhisper.NET.Gpu" />

paket add FasterWhisper.NET.Gpu --version 1.0.6

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: FasterWhisper.NET.Gpu, 1.0.6"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package FasterWhisper.NET.Gpu@1.0.6

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Install as a Cake Addin:

#addin nuget:?package=FasterWhisper.NET.Gpu&version=1.0.6

Install as a Cake Tool:

#tool nuget:?package=FasterWhisper.NET.Gpu&version=1.0.6

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

FasterWhisper.NET.Gpu

by Qourex — Bringing high-performance GPU-accelerated speech recognition to .NET

📖 Read the Documentation for detailed guides, .NET 10.0 samples, and mobile deployment walkthroughs.

FasterWhisper.NET.Gpu is the GPU-accelerated release of the C# port of the popular Python library faster-whisper. It bundles pre-compiled native CTranslate2 binaries built with CUDA and cuDNN enabled, so that transcription runs on NVIDIA GPUs.

For CPU-only execution without CUDA prerequisites, please use the base FasterWhisper.NET package.

⚡ Key GPU Advantages

🎮 CUDA & cuDNN Acceleration — Native GPU-bound inference for Whisper models.
🚀 Flash Attention Support — Substantial throughput improvements on Ampere (RTX 30-series) and newer architectures.
📉 Mixed Precision Compute — Full support for "float16" and "int8_float16" compute types to minimize GPU memory (VRAM) footprint.
🔄 Parallel Mel Extraction — Managed multi-threaded audio pipeline maximizing core usage before GPU scheduling.

📦 Installation

To install the GPU-enabled package:

dotnet add package FasterWhisper.NET.Gpu

🚀 CUDA Prerequisites

To run this package with GPU acceleration (device: "cuda"), you must have the following NVIDIA runtimes installed and configured on your host system:

Windows

NVIDIA CUDA Toolkit 12.x (Compiled with 12.8) — CUDA Downloads
NVIDIA cuDNN 8.9.x — cuDNN Downloads Archive

Ensure that the following DLLs from these installations are available in your system PATH:

cudart64_12.dll (or other CUDA 12 runtime versions)
cublas64_12.dll
cublasLt64_12.dll
cudnn64_8.dll (specifically cuDNN v8)

Linux / WSL2

NVIDIA CUDA Toolkit 12.x (Compiled with 12.8) — WSL/Linux CUDA Downloads
NVIDIA cuDNN 8.9.x — cuDNN Downloads Archive

Ensure that the following shared libraries from these installations are available in your LD_LIBRARY_PATH or system library paths (e.g. /usr/local/cuda/lib64):

libcudart.so.12
libcublas.so.12
libcublasLt.so.12
libcudnn.so.8 (specifically cuDNN v8)
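
The library check described above can be scripted. The following is a minimal sketch, assuming a POSIX-like shell; `find_lib` and `check_cuda_libs` are hypothetical helper names that are not part of this package, and /usr/local/cuda/lib64 is only the example path mentioned above.

```shell
#!/usr/bin/env bash
# Sketch only: find_lib and check_cuda_libs are illustrative helpers,
# not part of FasterWhisper.NET.Gpu.

# Search each directory of a colon-separated path list for a file name.
# Prints the first match and returns 0; returns 1 if nothing is found.
find_lib() {
  local name="$1" paths="$2" dir
  local IFS=':'
  for dir in $paths; do
    [ -n "$dir" ] || continue
    if [ -e "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# Report each required library as found or missing.
# Looks in LD_LIBRARY_PATH, the example CUDA path, then the ldconfig cache.
check_cuda_libs() {
  local search_path="${LD_LIBRARY_PATH:-}:/usr/local/cuda/lib64"
  local lib missing=0
  for lib in libcudart.so.12 libcublas.so.12 libcublasLt.so.12 libcudnn.so.8; do
    if find_lib "$lib" "$search_path" >/dev/null; then
      echo "found    $lib"
    elif command -v ldconfig >/dev/null 2>&1 && ldconfig -p 2>/dev/null | grep -qF "$lib"; then
      echo "found    $lib (ldconfig cache)"
    else
      echo "MISSING  $lib"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_cuda_libs` prints one line per library and returns a non-zero status if any of the four is missing, which makes it usable as a pre-flight step before starting the application.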

🐳 Docker Compilation (For Linux GPU Binaries)

For Linux and WSL2 environments, you can compile the CUDA-enabled native libraries inside a Docker container, without installing compilers on your host system.

Run the following command from the root of the repository:

docker run --rm --gpus all -v "$(pwd)":/workspace -w /workspace nvcr.io/nvidia/cuda:12.8.0-devel-ubuntu22.04 bash -c "
  apt-get update && \
  apt-get install -y ca-certificates gpg wget && \
  wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && \
  echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | tee /etc/apt/sources.list.d/kitware.list >/dev/null && \
  apt-get update && \
  apt-get install -y cmake build-essential libopenblas-dev ninja-build libcudnn8-dev git && \
  ./build.sh --gpu-only
"

Note: If your local Docker setup does not have the NVIDIA Container Toolkit configured, you can omit the --gpus all flag, as a physical GPU is not required during the compilation step.

This command compiles the wrapper and automatically stages the output qourex_fasterwhisper_native.so and libctranslate2.so files under the C# GPU project runtimes directory (src/Qourex.FasterWhisper.NET.Gpu/runtimes/linux-x64/native/).
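
To confirm that the build staged both files, a check along these lines may help. It is a sketch: `check_staged` is a hypothetical helper name, and the two file names and the directory are the ones stated above.

```shell
# Sketch only: check_staged is an illustrative helper, not part of the repository.
# Prints every expected native library that is absent from the given directory.
# Returns 0 when both files exist, 1 otherwise.
check_staged() {
  local dir="$1" file status=0
  for file in qourex_fasterwhisper_native.so libctranslate2.so; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing: $dir/$file"
      status=1
    fi
  done
  return "$status"
}
```

From the repository root, `check_staged src/Qourex.FasterWhisper.NET.Gpu/runtimes/linux-x64/native` prints nothing and returns 0 when both libraries are in place.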

💻 Quick Start

Concurrency & Thread-Safety: WhisperModel is thread-safe and supports concurrent transcription calls. Internally, concurrent calls are queued with a SemaphoreSlim. If you configure the model with NumReplicas > 1, transcription calls run concurrently on CTranslate2's native thread-safe replica pool, which shares the loaded model weights in memory to minimize VRAM overhead.

using Qourex.FasterWhisper.NET;

// 1. Download and load the model on CUDA (cached to ~/.cache/qourex-fasterwhisper)
using var model = await WhisperModel.LoadAsync(
    modelNameOrPath: "large-v3",
    device:          "cuda",       // Use GPU
    computeType:     "float16",    // Half-precision for optimal GPU performance
    flashAttention:  true          // Enable Flash Attention (requires compute capability >= 8.0)
);

// 2. Configure transcription options
var options = new WhisperOptions
{
    BeamSize = 5,
    WordTimestamps = true
};

// 3. Transcribe
var segments = model.Transcribe(
    mediaPath:  "audio.wav",
    language:   "en",
    options:    options
);

// 4. Print timing and text
foreach (var segment in segments)
{
    Console.WriteLine($"[{segment.Start:F2}s -> {segment.End:F2}s] {segment.Text}");
}

🔧 GPU Configuration Options

Compute Types

Choose the optimal precision for your GPU memory and compute capabilities:

Compute Type	Description
`"default"`	Selects `float16` if supported by the GPU, else fallback
`"float16"`	Recommended. Fast FP16 execution, about half the weight memory of `float32`
`"float32"`	Standard 32-bit floating point precision
`"int8_float16"`	INT8-quantized weights with FP16 computation for the remaining operations; smallest VRAM footprint
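
As a rough guide to how the compute type affects memory, the weight size can be estimated as parameter count times bytes per parameter. The sketch below assumes about 1.55 billion parameters, the figure commonly reported for Whisper large models; `weights_mib` is a hypothetical helper. It covers weights only, so actual VRAM use is higher once activations, decoding caches, and CUDA runtime overhead are included, and `int8_float16` keeps some tensors in FP16.

```shell
# Sketch only: weights_mib is an illustrative helper.
# Estimate the size of the model weights in MiB (integer division, rounds down).
# $1 = parameter count
# $2 = bytes per parameter (4 = float32, 2 = float16, 1 = int8)
weights_mib() {
  echo $(( $1 * $2 / 1024 / 1024 ))
}

# Assumed parameter count for a Whisper large model (approximate).
PARAMS=1550000000

echo "float32:      $(weights_mib "$PARAMS" 4) MiB"   # 5912 MiB
echo "float16:      $(weights_mib "$PARAMS" 2) MiB"   # 2956 MiB
echo "int8 weights: $(weights_mib "$PARAMS" 1) MiB"   # 1478 MiB
```

These numbers are lower bounds for planning, not measurements of this package.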

Flash Attention

Enable Flash Attention for compatible GPUs:

flashAttention: true

Note: Flash Attention requires an NVIDIA GPU with compute capability ≥ 8.0 (Ampere architecture or newer, e.g. RTX 30-series, 40-series, A100, H100).
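
Whether a given GPU meets this requirement can be checked from the shell before enabling the option. This is a sketch under assumptions: `supports_flash_attention` and `gpu_compute_cap` are hypothetical helper names, and the `compute_cap` query field of nvidia-smi is only available with reasonably recent NVIDIA drivers.

```shell
# Sketch only: both functions are illustrative helpers, not part of the package.

# Returns 0 when a compute capability string such as "8.6" is >= 8.0,
# 1 when it is lower, and 2 when the input is not numeric.
supports_flash_attention() {
  local major="${1%%.*}"        # keep the part before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;    # empty or non-numeric input
  esac
  [ "$major" -ge 8 ]
}

# Print the compute capability of the first GPU, for example "8.6".
gpu_compute_cap() {
  nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1
}
```

For example, `supports_flash_attention "$(gpu_compute_cap)" && echo "Flash Attention can be enabled"` prints the message only on a GPU with compute capability 8.0 or higher.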

📄 License

This package is licensed under the MIT License — see the LICENSE file for details.

MIT License · Copyright (c) 2026 Qourex

Product	Compatible and additional computed target framework versions
.NET	Compatible: net8.0, net8.0-windows, net9.0, net9.0-windows, net10.0, net10.0-windows.
.NET	Computed: net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos.


net10.0
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)
net10.0-windows
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)
net8.0
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)
net8.0-windows
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)
net9.0
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)
net9.0-windows
- Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
- Microsoft.ML.OnnxRuntime (>= 1.20.1)


Version	Downloads	Last Updated
1.0.6	62	6/30/2026
1.0.5	92	6/28/2026
1.0.4	102	6/26/2026
1.0.3	91	6/26/2026
1.0.2	116	6/25/2026
1.0.0	92	6/25/2026

See https://github.com/qourex/fasterwhisper.net/blob/main/CHANGELOG.md