<img src="https://github.com/jchristn/PolyPrompt/blob/main/assets/logo.png?raw=true" width="192" height="192">
PolyPrompt
PolyPrompt is a lightweight, unified .NET library for chat completions, text generation, embeddings, and model management across Ollama, OpenAI, and Google Gemini APIs. Write your LLM integration code once and swap providers without changing your application logic.
What It Does
PolyPrompt provides a single, consistent API surface for interacting with multiple LLM providers. Instead of learning three different SDKs with different conventions, response formats, and streaming patterns, you use one set of methods that work identically across all supported providers.
- Chat Completions — Streaming and non-streaming conversational AI with system prompts
- Text Generation — Streaming and non-streaming text generation (completion-style)
- Embeddings — Single and batch embedding vector generation for semantic search and RAG
- Model Management — List models, check existence, get model details, pull, and delete
- Connectivity Validation — Verify provider reachability before running workloads
- Timing Metrics — Built-in performance tracking including time-to-first-token, tokens/sec, and overall throughput
- Call Recording — Every HTTP call is recorded with full request/response details for debugging and auditing
- Provider-Specific Options — Fine-tune each provider's unique parameters without losing portability
Use Cases
PolyPrompt is a good fit when you need to:
- Build provider-agnostic applications — Let users choose their preferred LLM provider (local Ollama, cloud OpenAI, or Google Gemini) without rewriting integration code
- Compare providers side-by-side — Benchmark the same prompts across Ollama, OpenAI, and Gemini to evaluate quality, latency, and cost
- Prototype rapidly — Get a chat completion, embedding, or text generation working in a few lines of code without studying provider-specific SDKs
- Build RAG pipelines — Generate embeddings for document chunks using any provider's embedding models, then query with semantic search
- Create AI-powered CLI tools — The simple API makes it easy to add LLM capabilities to command-line applications
- Manage local model infrastructure — Pull, list, inspect, and delete Ollama models programmatically
- Monitor LLM performance — Use built-in timing metrics and call recording to track latency, throughput, and errors in production
- Build multi-model workflows — Use different providers for different tasks (e.g., Ollama for embeddings, OpenAI for chat) through the same interface
When Not to Use It
PolyPrompt may not be the right choice if you need:
- Advanced provider-specific features — Tool/function calling, vision/image inputs, structured outputs, or fine-tuning APIs are not currently supported
- Multi-turn conversation management — PolyPrompt handles single-turn request/response; it does not manage conversation history or context windows
- Token counting or cost estimation — While some providers return token usage in responses, PolyPrompt does not provide pre-request token counting
- Official SDK parity — If you need every feature of a specific provider's API, use their official SDK instead
Installation
```
dotnet add package PolyPrompt
```
PolyPrompt targets both .NET 8.0 and .NET 10.0.
Quick Start
Ollama
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";

ChatResponse response = await client.ChatAsync("What is the capital of France?");
Console.WriteLine(response.Text);
```
OpenAI
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OpenAiClient client = new OpenAiClient("https://api.openai.com", "sk-your-api-key");
client.Model = "gpt-4o";

ChatResponse response = await client.ChatAsync("What is the capital of France?");
Console.WriteLine(response.Text);
```
Gemini
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using GeminiClient client = new GeminiClient(
    "https://generativelanguage.googleapis.com",
    "your-api-key");
client.Model = "gemini-2.5-flash";

ChatResponse response = await client.ChatAsync("What is the capital of France?");
Console.WriteLine(response.Text);
```
Detailed Examples
Chat with System Prompt
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";
client.SystemPrompt = "You are a helpful assistant that responds in haiku format.";
client.Temperature = 0.7;
client.MaxTokens = 256;

ChatResponse response = await client.ChatAsync("Tell me about the ocean.");
if (response.Success)
{
    Console.WriteLine(response.Text);
    Console.WriteLine("Runtime: " + response.OverallRuntimeMs + " ms");
}
else
{
    Console.WriteLine("Error: " + response.Error);
}
```
Chat with Provider-Specific Options
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;
using PolyPrompt.Options;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";

OllamaChatCompletionOptions options = new OllamaChatCompletionOptions();
options.Temperature = 0.5;
options.TopP = 0.9;
options.MaxTokens = 512;
options.TopK = 40;
options.RepeatPenalty = 1.1;
options.Seed = 42;
options.SystemPrompt = "You are a concise technical writer.";

ChatResponse response = await client.ChatAsync("Explain dependency injection.", options);
Console.WriteLine(response.Text);
```
Streaming Chat
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OpenAiClient client = new OpenAiClient("https://api.openai.com", "sk-your-api-key");
client.Model = "gpt-4o";

ChatStreamingResponse stream = await client.ChatStreamingAsync("Write a short story about a robot.");
await foreach (ChatStreamingChunk chunk in stream.Chunks)
{
    if (!string.IsNullOrEmpty(chunk.Text))
    {
        Console.Write(chunk.Text);
    }
}

Console.WriteLine();
Console.WriteLine("Time to first token: " + stream.TimeToFirstTokenMs + " ms");
Console.WriteLine("Tokens/sec: " + stream.OverallTokensPerSecond.ToString("F1"));
Console.WriteLine("Total chunks: " + stream.ChunkCount);
```
Single Embedding
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");

OllamaEmbeddingOptions options = new OllamaEmbeddingOptions();
options.Model = "all-minilm";

EmbeddingResponse response = await client.EmbedAsync("The quick brown fox jumps over the lazy dog.", options);
if (response.Success && response.Embeddings.Count > 0)
{
    float[] vector = response.Embeddings[0].Embedding;
    Console.WriteLine("Dimensions: " + vector.Length);
    Console.WriteLine("First 5 values: " + string.Join(", ", vector.Take(5)));
}
```
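A common next step after generating embeddings is comparing them. The helper below is plain C#, not part of the PolyPrompt API; it is a minimal sketch of cosine similarity over vectors like those returned in `response.Embeddings[0].Embedding`:

```csharp
// Cosine similarity between two embedding vectors.
// Plain C# helper; NOT part of PolyPrompt itself.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length) throw new ArgumentException("Vector lengths must match.");
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];     // accumulate dot product
        normA += a[i] * a[i];   // accumulate squared magnitudes
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```

Values close to 1.0 indicate semantically similar texts; this is the usual ranking function for the RAG and semantic-search use cases above.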
Batch Embeddings
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OpenAiClient client = new OpenAiClient("https://api.openai.com", "sk-your-api-key");

OpenAiEmbeddingOptions options = new OpenAiEmbeddingOptions();
options.Model = "text-embedding-3-small";
options.Dimensions = 256;

List<string> documents = new List<string>
{
    "Machine learning is a subset of artificial intelligence.",
    "Neural networks are inspired by biological neurons.",
    "Deep learning uses multiple layers of neural networks."
};

EmbeddingResponse response = await client.EmbedAsync(documents, options);
if (response.Success)
{
    for (int i = 0; i < response.Embeddings.Count; i++)
    {
        Console.WriteLine("Document " + i + ": " + response.Embeddings[i].Embedding.Length + " dimensions");
    }
}
```
Text Generation (Non-Streaming)
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";

GenerationResponse response = await client.GenerateAsync("Once upon a time, in a land far away,");
Console.WriteLine(response.Text);
Console.WriteLine("Runtime: " + response.OverallRuntimeMs + " ms");
```
Text Generation (Streaming)
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using GeminiClient client = new GeminiClient(
    "https://generativelanguage.googleapis.com",
    "your-api-key");
client.Model = "gemini-2.5-flash";

GenerationStreamingResponse stream = await client.GenerateStreamingAsync("Write a limerick about coding.");
await foreach (GenerationStreamingChunk chunk in stream.Chunks)
{
    if (!string.IsNullOrEmpty(chunk.Text))
    {
        Console.Write(chunk.Text);
    }
}

Console.WriteLine();
Console.WriteLine("Time to first token: " + stream.TimeToFirstTokenMs + " ms");
Console.WriteLine("Tokens/sec: " + stream.OverallTokensPerSecond.ToString("F1"));
```
List Available Models
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");

await foreach (ModelInformation model in client.ListModelsAsync())
{
    Console.WriteLine(model.Name
        + (model.DisplayName != null ? " (" + model.DisplayName + ")" : "")
        + (model.SizeBytes.HasValue ? " [" + (model.SizeBytes.Value / 1_000_000_000.0).ToString("F1") + " GB]" : ""));
}
```
Check If a Model Exists
```csharp
using PolyPrompt.Clients;

using OllamaClient client = new OllamaClient("http://localhost:11434");

bool exists = await client.ModelExistsAsync("gemma3:4b");
Console.WriteLine("gemma3:4b exists: " + exists);

// Also matches without tags: "gemma3" matches "gemma3:latest"
bool existsNoTag = await client.ModelExistsAsync("gemma3");
Console.WriteLine("gemma3 exists: " + existsNoTag);
```
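`ModelExistsAsync` composes naturally with `PullModelAsync` into an "ensure model is present" guard before running a workload. A sketch using only the calls shown in this README (the progress callback here just prints status):

```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");

// Pull the model only if it is not already available locally.
if (!await client.ModelExistsAsync("gemma3:4b"))
{
    bool pulled = await client.PullModelAsync("gemma3:4b",
        async (ModelPullProgress progress) => Console.WriteLine(progress.Status));
    if (!pulled) throw new InvalidOperationException("Model pull failed.");
}

client.Model = "gemma3:4b";
```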
Get Model Details
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");

ModelInformation? info = await client.GetModelInformationAsync("gemma3:4b");
if (info != null)
{
    Console.WriteLine("Name: " + info.Name);
    Console.WriteLine("Modified: " + info.ModifiedUtc);
    foreach (KeyValuePair<string, string?> kv in info.Metadata)
    {
        Console.WriteLine("  " + kv.Key + ": " + kv.Value);
    }
}
```
Pull a Model (Ollama)
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");

bool success = await client.PullModelAsync("gemma3:4b", async (ModelPullProgress progress) =>
{
    if (progress.PercentComplete.HasValue)
    {
        Console.Write("\r" + progress.Status + " " + progress.PercentComplete.Value.ToString("F1") + "%");
    }
    else
    {
        Console.WriteLine(progress.Status);
    }
});

Console.WriteLine();
Console.WriteLine(success ? "Pull succeeded." : "Pull failed.");
```
Delete a Model (Ollama)
```csharp
using PolyPrompt.Clients;

using OllamaClient client = new OllamaClient("http://localhost:11434");

bool deleted = await client.DeleteModelAsync("gemma3:4b");
Console.WriteLine(deleted ? "Model deleted." : "Delete failed.");
```
Validate Connectivity
```csharp
using PolyPrompt.Clients;

using GeminiClient client = new GeminiClient(
    "https://generativelanguage.googleapis.com",
    "your-api-key");

bool reachable = await client.ValidateConnectivityAsync();
Console.WriteLine(reachable ? "Connected." : "Cannot reach provider.");
```
Inspect Call Details
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";

ChatResponse response = await client.ChatAsync("Hello!");

foreach (CompletionCallDetail detail in client.CallDetails)
{
    Console.WriteLine(detail.Method + " " + detail.Url);
    Console.WriteLine("  Status: " + detail.StatusCode);
    Console.WriteLine("  Time: " + detail.ResponseTimeMs + " ms");
    Console.WriteLine("  Success: " + detail.Success);
}
```
Using CancellationToken
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

using OllamaClient client = new OllamaClient("http://localhost:11434");
client.Model = "gemma3:4b";

using CancellationTokenSource cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
try
{
    ChatResponse response = await client.ChatAsync("Write a very long essay.", token: cts.Token);
    Console.WriteLine(response.Text);
}
catch (OperationCanceledException)
{
    Console.WriteLine("Request was cancelled.");
}
```
Provider-Agnostic Code
```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;

CompletionClientBase CreateClient(string provider, string endpoint, string? apiKey)
{
    switch (provider)
    {
        case "ollama":
            return new OllamaClient(endpoint, apiKey);
        case "openai":
            return new OpenAiClient(endpoint, apiKey);
        case "gemini":
            return new GeminiClient(endpoint, apiKey);
        default:
            throw new ArgumentException("Unknown provider: " + provider);
    }
}

// Same code works regardless of provider
using CompletionClientBase client = CreateClient("ollama", "http://localhost:11434", null);
client.Model = "gemma3:4b";

ChatResponse chat = await client.ChatAsync("Hello!");
Console.WriteLine(chat.Text);

await foreach (ModelInformation model in client.ListModelsAsync())
{
    Console.WriteLine("  " + model.Name);
}
```
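Because the factory returns the shared `CompletionClientBase`, the side-by-side comparison use case reduces to a loop. A sketch building on the `CreateClient` factory above, using the built-in `OverallRuntimeMs` metric; the endpoints, key, and model names are placeholders you would substitute for your own setup:

```csharp
// Benchmark the same prompt across providers via the CreateClient factory above.
// Endpoints, API key, and model names are illustrative placeholders.
(string Provider, string Endpoint, string? ApiKey, string Model)[] targets =
{
    ("ollama", "http://localhost:11434", null, "gemma3:4b"),
    ("openai", "https://api.openai.com", "sk-your-api-key", "gpt-4o-mini"),
};

foreach ((string provider, string endpoint, string? apiKey, string model) in targets)
{
    using CompletionClientBase client = CreateClient(provider, endpoint, apiKey);
    client.Model = model;

    ChatResponse response = await client.ChatAsync("Summarize the benefits of unit testing.");
    Console.WriteLine(provider + ": " + response.OverallRuntimeMs + " ms, success=" + response.Success);
}
```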
API Reference
Client Properties
| Property | Type | Default | Description |
|---|---|---|---|
| `Endpoint` | `string` | varies | API endpoint URL (read-only) |
| `ApiKey` | `string?` | `null` | API key (read-only) |
| `Model` | `string` | varies | Model name for requests |
| `MaxTokens` | `int` | `4096` | Maximum tokens to generate (1 to 10,000,000) |
| `TimeoutMs` | `int` | `120000` | HTTP timeout in milliseconds (1,000 to 600,000) |
| `Temperature` | `double?` | `null` | Sampling temperature (0.0 to 2.0) |
| `TopP` | `double?` | `null` | Nucleus sampling threshold (0.0 to 1.0) |
| `SystemPrompt` | `string?` | `null` | System prompt for chat completions |
| `CallDetails` | `List<CompletionCallDetail>` | empty | Recorded HTTP call details |
Client Methods
| Method | Description |
|---|---|
| `ChatAsync` | Non-streaming chat completion |
| `ChatStreamingAsync` | Streaming chat completion with timing metrics |
| `EmbedAsync(string)` | Generate embedding for a single text |
| `EmbedAsync(List<string>)` | Generate embeddings for a batch of texts |
| `GenerateAsync` | Non-streaming text generation |
| `GenerateStreamingAsync` | Streaming text generation with timing metrics |
| `ListModelsAsync` | List available models (returns `IAsyncEnumerable<ModelInformation>`) |
| `ModelExistsAsync` | Check if a specific model exists |
| `GetModelInformationAsync` | Get detailed information about a model |
| `PullModelAsync` | Pull/download a model with progress callbacks (Ollama only) |
| `DeleteModelAsync` | Delete a model (Ollama only) |
| `ValidateConnectivityAsync` | Verify the provider is reachable |
Provider-Specific Options
Each provider exposes option classes that extend the base options with provider-specific parameters:
| Provider | Chat Options | Embedding Options | Generation Options |
|---|---|---|---|
| Ollama | `OllamaChatCompletionOptions` | `OllamaEmbeddingOptions` | `OllamaGenerationOptions` |
| OpenAI | `OpenAiChatCompletionOptions` | `OpenAiEmbeddingOptions` | `OpenAiGenerationOptions` |
| Gemini | `GeminiChatCompletionOptions` | `GeminiEmbeddingOptions` | `GeminiGenerationOptions` |

Ollama-specific parameters: `ContextLength`, `TopK`, `RepeatPenalty`, `Seed`, `MinP`, `RepeatLastN`

OpenAI-specific parameters: `FrequencyPenalty`, `PresencePenalty`, `Seed`, `Dimensions`, `EncodingFormat`, `Echo`, `Suffix`, `Logprobs`

Gemini-specific parameters: `TopK`, `CandidateCount`, `PresencePenalty`, `FrequencyPenalty`, `TaskType`, `Title`
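As a Gemini counterpart to the Ollama options example earlier, a sketch is shown below. It assumes `GeminiChatCompletionOptions` exposes the Gemini-specific parameters listed above as settable properties; check the option class for the exact member names:

```csharp
using PolyPrompt.Clients;
using PolyPrompt.Models;
using PolyPrompt.Options;

using GeminiClient client = new GeminiClient(
    "https://generativelanguage.googleapis.com",
    "your-api-key");
client.Model = "gemini-2.5-flash";

// Assumption: these Gemini-specific parameters are exposed as
// properties on GeminiChatCompletionOptions, mirroring the Ollama example.
GeminiChatCompletionOptions options = new GeminiChatCompletionOptions();
options.Temperature = 0.4;
options.TopK = 32;
options.CandidateCount = 1;

ChatResponse response = await client.ChatAsync("Explain eventual consistency.", options);
Console.WriteLine(response.Text);
```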
Default Models
| Provider | Default Inference Model | Suggested Embedding Model |
|---|---|---|
| Ollama | `gemma3:4b` | `all-minilm` |
| OpenAI | `gpt-4o-mini` | `text-embedding-3-small` |
| Gemini | `gemini-2.5-flash` | `gemini-embedding-001` |
Provider Feature Support
| Feature | Ollama | OpenAI | Gemini |
|---|---|---|---|
| Chat (non-streaming) | Yes | Yes | Yes |
| Chat (streaming) | Yes | Yes | Yes |
| Text Generation | Yes | Legacy only | Yes |
| Embeddings (single) | Yes | Yes | Yes |
| Embeddings (batch) | Yes | Yes | Yes |
| List Models | Yes | Yes | Yes |
| Model Exists | Yes | Yes | Yes |
| Get Model Info | Yes | Yes | Yes |
| Pull Model | Yes | No | No |
| Delete Model | Yes | No | No |
| Validate Connectivity | Yes | Yes | Yes |
Project Structure
```
PolyPrompt/
├── src/
│   ├── PolyPrompt/        # Core library (NuGet package)
│   │   ├── Clients/       # CompletionClientBase, OllamaClient, OpenAiClient, GeminiClient
│   │   ├── Models/        # Request/response data models
│   │   └── Options/       # Provider-specific option classes
│   ├── OllamaConsole/     # Interactive Ollama test harness
│   ├── OpenAIConsole/     # Interactive OpenAI test harness
│   ├── GeminiConsole/     # Interactive Gemini test harness
│   └── Test.Automated/    # Automated test suite
└── assets/
    └── logo.png
```
Building from Source
```
dotnet restore src/PolyPrompt.sln
dotnet build src/PolyPrompt.sln
```
Running the Automated Tests
```
# Ollama
dotnet run --project src/Test.Automated -- ollama http://localhost:11434

# OpenAI
dotnet run --project src/Test.Automated -- openai https://api.openai.com sk-your-key gpt-4o text-embedding-3-small

# Gemini
dotnet run --project src/Test.Automated -- gemini https://generativelanguage.googleapis.com your-key gemini-2.5-flash gemini-embedding-001
```
Issues and Discussions
Have a bug to report or a feature to request? Please open an issue on GitHub:
https://github.com/jchristn/PolyPrompt/issues
Want to ask a question or start a conversation? Use GitHub Discussions:
https://github.com/jchristn/PolyPrompt/discussions
License
PolyPrompt is available under the MIT License. See the LICENSE.md file for full details.