FasterWhisper.NET.Gpu 1.0.6


FasterWhisper.NET.Gpu

by Qourex: bringing high-performance GPU-accelerated speech recognition to .NET


📖 Read the Documentation for detailed guides, .NET 10.0 samples, and mobile deployment walkthroughs.


FasterWhisper.NET.Gpu is the GPU-accelerated release of FasterWhisper.NET, a C# port of the popular Python library faster-whisper. It bundles pre-compiled native binaries of CTranslate2 built with CUDA and cuDNN enabled, so transcription runs on NVIDIA GPUs instead of the CPU.

For CPU-only execution without CUDA prerequisites, please use the base FasterWhisper.NET package.


⚡ Key GPU Advantages

  • 🎮 CUDA & cuDNN Acceleration: Native GPU-bound inference for Whisper models.
  • 🚀 Flash Attention Support: Substantial throughput improvements on Ampere (RTX 30-series) and newer architectures.
  • 📉 Mixed Precision Compute: Full support for the "float16" and "int8_float16" compute types to minimize GPU memory (VRAM) footprint.
  • 🔄 Parallel Mel Extraction: Managed multi-threaded audio pipeline that keeps CPU cores busy before work is scheduled on the GPU.

📦 Installation

To install the GPU-enabled package:

dotnet add package FasterWhisper.NET.Gpu

🚀 CUDA Prerequisites

To run this package with GPU acceleration (device: "cuda"), you must have the following NVIDIA runtimes installed and configured on your host system:

Windows

  1. NVIDIA CUDA Toolkit 12.x (compiled with 12.8): CUDA Downloads
  2. NVIDIA cuDNN 8.9.x: cuDNN Downloads Archive

Ensure that the following DLLs from these installations are available in your system PATH:

  • cudart64_12.dll (or other CUDA 12 runtime versions)
  • cublas64_12.dll
  • cublasLt64_12.dll
  • cudnn64_8.dll (specifically cuDNN v8)

Linux / WSL2

  1. NVIDIA CUDA Toolkit 12.x (compiled with 12.8): WSL/Linux CUDA Downloads
  2. NVIDIA cuDNN 8.9.x: cuDNN Downloads Archive

Ensure that the following shared libraries from these installations are available in your LD_LIBRARY_PATH or system library paths (e.g. /usr/local/cuda/lib64):

  • libcudart.so.12
  • libcublas.so.12
  • libcublasLt.so.12
  • libcudnn.so.8 (specifically cuDNN v8)
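
Before loading a model, it can help to confirm that the operating system loader can actually resolve the libraries listed above. The following is a minimal sketch that uses the standard .NET NativeLibrary.TryLoad API. The helper names RequiredLibraries and CanLoad are hypothetical and are not part of FasterWhisper.NET.Gpu. A library may also be reported as missing because one of its own dependencies cannot be found.

```csharp
using System;
using System.Runtime.InteropServices;

// Library names are copied from the prerequisite lists above. The probe is an
// illustrative sketch and not an API of FasterWhisper.NET.Gpu.
static string[] RequiredLibraries(bool isWindows) => isWindows
    ? new[] { "cudart64_12.dll", "cublas64_12.dll", "cublasLt64_12.dll", "cudnn64_8.dll" }
    : new[] { "libcudart.so.12", "libcublas.so.12", "libcublasLt.so.12", "libcudnn.so.8" };

// Returns true when the OS loader can resolve and load the library by name
// (searching PATH on Windows, LD_LIBRARY_PATH and system paths on Linux).
static bool CanLoad(string name)
{
    if (!NativeLibrary.TryLoad(name, out IntPtr handle))
        return false;
    NativeLibrary.Free(handle); // release the probe handle again
    return true;
}

foreach (string name in RequiredLibraries(OperatingSystem.IsWindows()))
{
    Console.WriteLine($"{name}: {(CanLoad(name) ? "found" : "MISSING")}");
}
```

Any line reporting MISSING points at a PATH or LD_LIBRARY_PATH entry that still needs to be added.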

🐳 Docker Compilation (For Linux GPU Binaries)

For Linux and WSL2 environments, you can compile the CUDA native libraries inside a Docker container, without installing compilers on your host system.

Run the following command from the root of the repository:

docker run --rm --gpus all -v "$(pwd)":/workspace -w /workspace nvcr.io/nvidia/cuda:12.8.0-devel-ubuntu22.04 bash -c "
  apt-get update && \
  apt-get install -y ca-certificates gpg wget && \
  wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && \
  echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | tee /etc/apt/sources.list.d/kitware.list >/dev/null && \
  apt-get update && \
  apt-get install -y cmake build-essential libopenblas-dev ninja-build libcudnn8-dev git && \
  ./build.sh --gpu-only
"

Note: If your local Docker setup does not have the NVIDIA Container Toolkit configured, you can omit the --gpus all flag, as a physical GPU is not required during the compilation step.

This command compiles the wrapper and automatically stages the output qourex_fasterwhisper_native.so and libctranslate2.so files under the C# GPU project runtimes directory (src/Qourex.FasterWhisper.NET.Gpu/runtimes/linux-x64/native/).


💻 Quick Start

Concurrency & Thread-Safety: WhisperModel is thread-safe and supports concurrent transcription calls. Under the hood, concurrent calls are queued and processed safely using a SemaphoreSlim. If you configure the model with NumReplicas > 1, transcription calls will execute concurrently utilizing CTranslate2's native thread-safe replica pool, sharing the same loaded model weights in memory to minimize VRAM overhead.
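
To illustrate the queuing behavior described above, here is a minimal, self-contained sketch of serializing concurrent calls with a single-permit SemaphoreSlim. It does not reproduce the library's internals: FakeTranscribeAsync is a hypothetical stand-in for a real transcription call, and the counters exist only to show that no two callers run inside the guarded section at once.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// One permit: at most one caller is inside the guarded section at a time.
var gate = new SemaphoreSlim(1, 1);
int active = 0;      // callers currently inside the guarded section
int maxActive = 0;   // highest concurrency observed inside it

// Hypothetical stand-in for a transcription call; not a FasterWhisper.NET API.
async Task<string> FakeTranscribeAsync(string path)
{
    await gate.WaitAsync();                    // later callers queue here
    try
    {
        int now = Interlocked.Increment(ref active);
        maxActive = Math.Max(maxActive, now);  // safe: only the gate holder writes
        await Task.Delay(20);                  // simulated inference work
        return $"transcript of {path}";
    }
    finally
    {
        Interlocked.Decrement(ref active);
        gate.Release();                        // admit the next waiting caller
    }
}

// Start four calls at once; the gate lets them through one by one.
string[] results = await Task.WhenAll(
    Enumerable.Range(1, 4).Select(i => FakeTranscribeAsync($"clip{i}.wav")));

Console.WriteLine($"{results.Length} calls completed, peak concurrency {maxActive}");
```

With a single permit the peak concurrency stays at 1. Raising the initial permit count would be the analogue of allowing several calls in flight, which is the role the text above assigns to NumReplicas.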

using Qourex.FasterWhisper.NET;

// 1. Download and load the model on CUDA (cached to ~/.cache/qourex-fasterwhisper)
using var model = await WhisperModel.LoadAsync(
    modelNameOrPath: "large-v3",
    device:          "cuda",       // Use GPU
    computeType:     "float16",    // Half-precision for optimal GPU performance
    flashAttention:  true          // Enable Flash Attention (requires compute capability >= 8.0)
);

// 2. Configure transcription options
var options = new WhisperOptions
{
    BeamSize = 5,
    WordTimestamps = true
};

// 3. Transcribe
var segments = model.Transcribe(
    mediaPath:  "audio.wav",
    language:   "en",
    options:    options
);

// 4. Print timing and text
foreach (var segment in segments)
{
    Console.WriteLine($"[{segment.Start:F2}s -> {segment.End:F2}s] {segment.Text}");
}

🔧 GPU Configuration Options

Compute Types

Choose the optimal precision for your GPU memory and compute capabilities:

Compute Type      Description
"default"         Selects float16 if supported by the GPU, otherwise falls back
"float16"         Recommended. Fast FP16 execution with low VRAM utilization
"float32"         Standard 32-bit floating point precision
"int8_float16"    INT8 quantized calculations with FP16 storage
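
As a rough guide to how the compute type affects VRAM, the sketch below estimates the memory taken by model weights alone. It rests on assumptions that are not stated in this README: roughly 1.55 billion parameters for large-v3 (the approximate figure published for Whisper large models), and about 1 byte per parameter for "int8_float16", which ignores any weights kept in FP16. Activations, caches and CUDA context overhead are not counted, so real usage will be higher.

```csharp
using System;

// Assumption: approximate parameter count of a Whisper large model.
const double Parameters = 1.55e9;

// Weight-only footprint in GiB for a given storage width per parameter.
static double WeightGiB(double parameters, double bytesPerParameter) =>
    parameters * bytesPerParameter / (1024.0 * 1024.0 * 1024.0);

// Bytes per parameter: 4 for float32, 2 for float16,
// and an assumed ~1 for int8_float16 (INT8 weights).
foreach (var (name, bytes) in new[] { ("float32", 4.0), ("float16", 2.0), ("int8_float16", 1.0) })
{
    Console.WriteLine($"{name,-13} ~{WeightGiB(Parameters, bytes):F1} GiB of weights");
}
```

Under these assumptions the weights come to roughly 5.8 GiB for float32, 2.9 GiB for float16 and 1.4 GiB for int8_float16.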

Flash Attention

Enable Flash Attention for compatible GPUs:

flashAttention: true

Note: Flash Attention requires an NVIDIA GPU with compute capability ≥ 8.0 (Ampere architecture or newer, e.g. RTX 30-series, 40-series, A100, H100).
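
The threshold in the note above can be checked in code before setting flashAttention. This is a minimal sketch: SupportsFlashAttention is a hypothetical helper, not a library API. It assumes you already have the compute capability as a string such as "8.6", which recent NVIDIA drivers can report through nvidia-smi --query-gpu=compute_cap --format=csv,noheader (availability of that query field depends on the driver version).

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses a compute capability string such as "8.6" and
// checks it against the 8.0 threshold. Malformed input is treated as unsupported.
static bool SupportsFlashAttention(string computeCapability)
{
    string[] parts = computeCapability.Trim().Split('.');
    if (parts.Length != 2
        || !int.TryParse(parts[0], NumberStyles.None, CultureInfo.InvariantCulture, out int major)
        || !int.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out _))
    {
        return false;
    }
    return major >= 8; // any 8.x or later satisfies "compute capability >= 8.0"
}

foreach (string cap in new[] { "7.5", "8.0", "8.6", "9.0" })
{
    string advice = SupportsFlashAttention(cap) ? "can be enabled" : "should stay off";
    Console.WriteLine($"compute capability {cap}: flashAttention {advice}");
}
```

The boolean result can be passed straight to the flashAttention argument shown in the Quick Start.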


📄 License

This package is licensed under the MIT License; see the LICENSE file for details.

MIT License · Copyright (c) 2026 Qourex

Target frameworks

Compatible (included in package): net8.0, net8.0-windows, net9.0, net9.0-windows, net10.0, net10.0-windows.
Computed (for each of net8.0, net9.0 and net10.0): -android, -browser, -ios, -maccatalyst, -macos, -tvos.

Version Downloads Last Updated
1.0.6 62 6/30/2026
1.0.5 92 6/28/2026
1.0.4 102 6/26/2026
1.0.3 91 6/26/2026
1.0.2 116 6/25/2026
1.0.0 92 6/25/2026