Compendium.Adapters.Ollama
1.0.0-preview.0
dotnet add package Compendium.Adapters.Ollama --version 1.0.0-preview.0
NuGet\Install-Package Compendium.Adapters.Ollama -Version 1.0.0-preview.0
<PackageReference Include="Compendium.Adapters.Ollama" Version="1.0.0-preview.0" />
<PackageVersion Include="Compendium.Adapters.Ollama" Version="1.0.0-preview.0" />
<PackageReference Include="Compendium.Adapters.Ollama" />
paket add Compendium.Adapters.Ollama --version 1.0.0-preview.0
#r "nuget: Compendium.Adapters.Ollama, 1.0.0-preview.0"
#:package Compendium.Adapters.Ollama@1.0.0-preview.0
#addin nuget:?package=Compendium.Adapters.Ollama&version=1.0.0-preview.0&prerelease
#tool nuget:?package=Compendium.Adapters.Ollama&version=1.0.0-preview.0&prerelease
compendium-adapter-ollama
Ollama AI provider adapter for the Compendium event-sourcing framework. Implements IAIProvider from Compendium.Abstractions.AI against the Ollama HTTP API for local LLM inference — designed for developer workflows and air-gapped production environments.
Extracted from sassy-solutions/compendium per ADR-0006 (multi-repo adapter split).
Install
dotnet add package Compendium.Adapters.Ollama
services.AddCompendiumOllama(builder.Configuration.GetSection("Ollama"));
Quick-start
- Install Ollama (one of):
# macOS brew install ollama && brew services start ollama # Linux curl -fsSL https://ollama.com/install.sh | sh # Docker docker run -d -p 11434:11434 --name ollama ollama/ollama - Pull a model:
ollama pull llama3.2 # general chat + tool calling ollama pull nomic-embed-text # embeddings - Wire the adapter into your DI container:
services.AddCompendiumOllama(opt => { opt.BaseUrl = "http://localhost:11434"; opt.DefaultModel = "llama3.2"; opt.DefaultEmbeddingModel = "nomic-embed-text"; }); - Use
IAIProviderlike any other Compendium AI provider:var result = await provider.CompleteAsync(new CompletionRequest { Model = "llama3.2", Messages = new[] { Message.User("Tell me a joke.") }, });
A runnable example lives in samples/01-local-chat.
Options
| Key | Default | Notes |
|---|---|---|
BaseUrl |
http://localhost:11434 |
Where the Ollama HTTP API listens. |
DefaultModel |
llama3.2 |
Used when CompletionRequest.Model is empty. |
DefaultEmbeddingModel |
nomic-embed-text |
Used when EmbeddingRequest.Model is empty. |
DefaultMaxTokens |
2048 |
Maps to options.num_predict. Use -1 to let the model finish naturally. |
TimeoutSeconds |
600 |
Local inference can be slow on first load — default is intentionally generous. |
AutoPullModel |
false |
On 404 / model-not-found, call POST /api/pull then retry. Dev-only — pulls can be huge. |
EnableLogging |
false |
Logs full request/response bodies at Debug. Never enable in production with PII. |
KeepAlive |
"5m" |
Forwarded as Ollama's keep_alive — how long the model stays resident in memory. |
Tool calling
Some Ollama models advertise OpenAI-compatible tool calling. Use the WithTools() extension:
using Compendium.Adapters.Ollama.Tools;
var tools = new List<AgentTool>
{
new("get_weather", "Returns the current weather.",
"""{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}"""),
};
var request = new CompletionRequest
{
Model = "llama3.2",
Messages = new[] { Message.User("What's the weather in Paris?") },
}.WithTools(tools);
var result = await provider.CompleteAsync(request);
var calls = result.Value.GetToolCalls(); // IReadOnlyList<AgentToolInvocation>
Model support is uneven. Tool calling generally works on:
| Model family | Tool calling? |
|---|---|
llama3.2, llama3.3 |
yes |
llama4 (when out) |
yes |
qwen2.5, qwen3 |
yes |
mistral-nemo, mixtral |
yes |
command-r, command-r-plus |
yes |
older llama2, llama3.1 |
partial / no |
| embedding models | no |
Older models silently ignore the tools field; the response will simply have no tool_calls. Verify against your specific model + Ollama version before relying on tool flow in production.
Streaming
Ollama streams newline-delimited JSON (not Server-Sent Events). The adapter parses chunks automatically — consume them as any other IAsyncEnumerable<Result<CompletionChunk>>:
await foreach (var chunk in provider.StreamCompleteAsync(request, ct))
{
if (chunk.IsFailure) break;
Console.Write(chunk.Value.ContentDelta);
}
Auto-pull
Set opt.AutoPullModel = true to automatically pull a missing model the first time it's requested. Useful for dev and one-shot ops; not recommended for production because pulls block the request for the duration of the download (often gigabytes).
Security
Ollama runs unauthenticated by default on localhost. For production:
- Never expose the raw API to the public internet — there is no auth, no rate-limiting, no per-tenant isolation.
- Put it behind a reverse proxy that handles authentication (e.g. nginx + OIDC, or a Kubernetes ingress with mTLS).
- Bind Ollama to
127.0.0.1only and let the proxy do the public-facing TLS termination. - Treat the model surface as untrusted code execution context — anyone with API access can pull and run arbitrary models.
Recommended models
| Use case | Suggested model | Approx. size |
|---|---|---|
| General chat (development) | llama3.2, qwen2.5:7b |
2–5 GB |
| General chat (production) | llama3.3:70b, qwen2.5:72b |
40–50 GB |
| Tool calling (smallest) | llama3.2:3b, qwen2.5:3b |
2–3 GB |
| Code completion | qwen2.5-coder, deepseek-coder-v2 |
4–8 GB |
| Embeddings | nomic-embed-text, mxbai-embed-large |
0.3–1 GB |
| Tiny CI / smoke tests | qwen2.5:0.5b, tinyllama, all-minilm |
0.3–1 GB |
Versioning
Versions are driven by git tags via MinVer. The first published version is 1.0.0-preview.0 (the orchestrator tags on merge — do not tag from a feature branch).
Build & test locally
dotnet restore
dotnet build -c Release
dotnet test -c Release --filter "FullyQualifiedName!~IntegrationTests"
Integration tests boot a real Ollama container and pull tiny models — opt in via:
RUN_OLLAMA_INTEGRATION=1 dotnet test -c Release --filter "FullyQualifiedName~IntegrationTests"
Expect them to take several minutes the first time the image and models are pulled.
License
MIT — Copyright © 2026 Sassy Solutions.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
-
net9.0
- Compendium.Abstractions.AI (>= 1.0.1)
- Compendium.Core (>= 1.0.1)
- Microsoft.Extensions.Configuration.Abstractions (>= 9.0.16)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.16)
- Microsoft.Extensions.Http (>= 9.0.16)
- Microsoft.Extensions.Logging.Abstractions (>= 9.0.16)
- Microsoft.Extensions.Options (>= 9.0.16)
- Microsoft.Extensions.Options.ConfigurationExtensions (>= 9.0.16)
- Microsoft.Extensions.Options.DataAnnotations (>= 9.0.16)
- System.Text.Json (>= 9.0.0)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0-preview.0 | 33 | 5/17/2026 |