JetsonPDF.Composition 1.1.0

JetsonPDF.Composition

Page-level PDF assembly for JetsonPDF: extract specific pages from a PDF into a new file, and merge several PDFs into one.

Both operations are lossless at the COS level (ISO 32000-2 §7.3): a page's content streams, resources, fonts, images, and annotations are copied over in their original encoded form. Nothing is re-rendered, rasterized, or re-encoded — the bytes that described a glyph or a JPEG in the source describe it in the output too.

dotnet add package JetsonPDF.Composition

using JetsonPDF.Composition;

// Pull pages 1, 3 and 5 out of a report into a new PDF
byte[] excerpt  = PageExtractor.Extract(reportBytes, 1, 3, 5);

// Concatenate three PDFs into one
byte[] combined = Merger.Merge(coverBytes, bodyBytes, appendixBytes);

| Targets | Depends on | Namespace |
| --- | --- | --- |
| net8.0, netstandard2.0, net462 | JetsonPDF.Common, JetsonPDF.Reader | JetsonPDF.Composition |

The package builds on the Reader only — it does not pull in the Writer. The two public types are both static:

  • PageExtractor — copy a chosen subset of pages into a new document.
  • Merger — concatenate whole documents into one.

How it works

A merge or extract is a COS object-graph copy, not a render pass:

  1. The source is parsed by the Reader's FileParser, which resolves the cross-reference table/stream and decrypts the file if a password was supplied.
  2. For each selected page, the page dictionary and everything reachable from it (content streams, /Resources, fonts, images, annotations, …) is deep-copied into a fresh object table, with all indirect references remapped. A dedup map keyed by (source, object-number) handles cycles and resources shared between pages, so a font referenced by ten pages is copied once.
  3. Inherited page attributes (/Resources, /MediaBox, /CropBox, /Rotate) are materialized onto each copied page before it is reparented under the new page tree (ISO 32000-2 §7.7.3.4), and /Parent is dropped.
  4. A fresh catalog, page tree, classic cross-reference table (§7.5.4), and trailer (with a new /ID) are written. The output PDF version is the maximum of the source versions.
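
Step 2 can be pictured with a toy model. The sketch below is an illustration only, not the package's implementation: each object is reduced to an object number plus the numbers it references, and the names source, output, remap, and Copy are invented for the example. It shows why registering an object in the dedup map before recursing makes shared resources copy once and cycles terminate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy source file: object number -> object numbers it references.
var source = new Dictionary<int, int[]>
{
    [1]  = new[] { 10, 20 },    // page 1 -> font, annotation
    [2]  = new[] { 10 },        // page 2 -> the same font
    [10] = Array.Empty<int>(),  // shared font
    [20] = new[] { 21 },        // annotation -> popup
    [21] = new[] { 20 },        // popup -> annotation (a cycle)
};

var output = new Dictionary<int, int[]>();                  // fresh object table
var remap  = new Dictionary<(int Source, int Obj), int>();  // dedup map

int page1 = Copy(sourceId: 0, objNum: 1);
int page2 = Copy(sourceId: 0, objNum: 2);

Console.WriteLine($"copied {output.Count} objects; pages are {page1} and {page2}");
// prints: copied 5 objects; pages are 1 and 5

// Copies objNum and everything reachable from it; returns its new object number.
int Copy(int sourceId, int objNum)
{
    if (remap.TryGetValue((sourceId, objNum), out int existing))
        return existing;                   // already copied: shared resource or cycle

    int newNum = remap.Count + 1;          // allocate the next output object number
    remap[(sourceId, objNum)] = newNum;    // register before recursing so cycles stop here
    output[newNum] = source[objNum].Select(r => Copy(sourceId, r)).ToArray();
    return newNum;
}
```

Both pages end up referencing the same copied font (output object 2), so the output holds five objects rather than six.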

Because the copy is byte-faithful, output quality is identical to the input — there is no generational loss from repeatedly extracting and merging.


PageExtractor

Extracts a subset of pages from an existing PDF into a brand-new PDF.

Page numbers are 1-based. The output keeps pages in the exact order you list them, so the same call also reorders and duplicates pages.

In-memory (byte[])

using JetsonPDF.Composition;

// Single page
byte[] cover = PageExtractor.Extract(sourceBytes, 1);

// Several pages, in the order given
byte[] picked = PageExtractor.Extract(sourceBytes, 3, 1, 5);

// Reorder + duplicate: page 2, then page 1 twice
byte[] shuffled = PageExtractor.Extract(sourceBytes, 2, 1, 1);

// Inclusive 1-based range (pages 5..12)
byte[] chapter = PageExtractor.ExtractRange(sourceBytes, 5, 12);

Encrypted input

Pass the user (or owner) password and the file is decrypted before extraction. The output is not encrypted.

byte[] unlocked = PageExtractor.Extract(sourceBytes, password: "secret", 1, 2);

File to file

PageExtractor.Extract("report.pdf", "summary.pdf", 1, 2, 10);
PageExtractor.Extract("locked.pdf", "summary.pdf", password: "secret", 1, 2);

Stream to stream

Neither stream is closed by the call, so you stay in control of their lifetimes.

using var input  = File.OpenRead("report.pdf");
using var output = File.Create("summary.pdf");
PageExtractor.Extract(input, output, 1, 2, 10);

Method summary

| Member | Returns | Notes |
| --- | --- | --- |
| Extract(byte[] source, params int[] pageNumbers) | byte[] | 1-based page numbers, order preserved. |
| Extract(byte[] source, string password, params int[] pageNumbers) | byte[] | Decrypts an encrypted source. |
| Extract(string inputPath, string outputPath, params int[] pageNumbers) | void | Reads and writes files. |
| Extract(string inputPath, string outputPath, string password, params int[] pageNumbers) | void | File overload with password. |
| Extract(Stream input, Stream output, params int[] pageNumbers) | void | Streams; neither is closed. |
| ExtractRange(byte[] source, int firstPage, int lastPage) | byte[] | Inclusive 1-based range. |
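
ExtractRange covers one contiguous range, while the params overloads take any page list. A small helper can bridge the two when a range needs to be combined with other pages. ExpandRange below is hypothetical (it is not part of the package), and its validation only mirrors the 1-based, inclusive convention described here.

```csharp
using System;
using System.Linq;

// Hypothetical helper: expands an inclusive 1-based range into an explicit page list.
static int[] ExpandRange(int firstPage, int lastPage)
{
    if (firstPage < 1 || lastPage < firstPage)
        throw new ArgumentException("Expected 1 <= firstPage <= lastPage.");
    return Enumerable.Range(firstPage, lastPage - firstPage + 1).ToArray();
}

// Cover page, then pages 5..12, then the cover again as a back page.
int[] pages = new[] { 1 }.Concat(ExpandRange(5, 12)).Append(1).ToArray();

Console.WriteLine(string.Join(", ", pages));
// prints: 1, 5, 6, 7, 8, 9, 10, 11, 12, 1
```

Because pageNumbers is a params int[], the resulting array can be passed directly, as in PageExtractor.Extract(sourceBytes, pages).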

Merger

Concatenates multiple PDFs into one, in the order supplied. Every page of every source is copied losslessly into a fresh page tree and catalog.

In-memory (byte[])

using JetsonPDF.Composition;

// params overload
byte[] combined = Merger.Merge(firstBytes, secondBytes, thirdBytes);

// IEnumerable overload — e.g. merge a whole folder, in name order
byte[] all = Merger.Merge(
    Directory.EnumerateFiles("chapters", "*.pdf")
             .OrderBy(p => p)
             .Select(File.ReadAllBytes));

File to file

Merger.Merge(new[] { "a.pdf", "b.pdf", "c.pdf" }, "combined.pdf");

Stream to stream

The output stream is written but not closed.

using var a = File.OpenRead("a.pdf");
using var b = File.OpenRead("b.pdf");
using var output = File.Create("combined.pdf");
Merger.Merge(new Stream[] { a, b }, output); // inputs are disposed with the enclosing scope

Method summary

| Member | Returns | Notes |
| --- | --- | --- |
| Merge(params byte[][] sources) | byte[] | Concatenate in argument order. |
| Merge(IEnumerable<byte[]> sources) | byte[] | Concatenate a sequence. |
| Merge(IEnumerable<string> inputPaths, string outputPath) | void | Read files, write the result. |
| Merge(IEnumerable<Stream> inputs, Stream output) | void | Streams; the output is not closed. |

Encrypted sources must be decrypted first — Merge has no password parameter and throws if it meets an encrypted file it can't read. Extract each source with its password (which produces a decrypted byte[]), then merge the results.
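
The decrypt-then-merge workflow might be wrapped as follows. This is a sketch under assumptions: DecryptThenMerge is a hypothetical helper, each source's page count is assumed to be known in advance (the method summaries show no password overload of ExtractRange), and the two operations are passed in as delegates so the example runs without the package installed.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: decrypts password-protected sources by extracting
// every page, then merges the resulting plain PDFs in order.
static byte[] DecryptThenMerge(
    IEnumerable<(byte[] Pdf, string? Password, int PageCount)> sources,
    Func<byte[], string, int[], byte[]> extract,
    Func<IEnumerable<byte[]>, byte[]> merge)
{
    var plain = new List<byte[]>();
    foreach (var (pdf, password, pageCount) in sources)
    {
        plain.Add(password is null
            ? pdf  // not encrypted: use as-is
            : extract(pdf, password, Enumerable.Range(1, pageCount).ToArray())); // all pages, 1-based
    }
    return merge(plain);
}

// Stand-in delegates so the sketch runs on its own.
// In real use: (pdf, pw, pages) => PageExtractor.Extract(pdf, pw, pages)
//         and: parts => Merger.Merge(parts)
byte[] merged = DecryptThenMerge(
    new (byte[] Pdf, string? Password, int PageCount)[]
    {
        (new byte[] { 1 }, null, 1),
        (new byte[] { 2 }, "secret", 3),
    },
    extract: (pdf, pw, pages) => new byte[] { (byte)pages.Length },
    merge: parts => parts.SelectMany(p => p).ToArray());

Console.WriteLine(string.Join(", ", merged)); // prints: 1, 3
```

The stand-ins only record how many pages were requested, which is enough to see that the unencrypted source passes through untouched and the encrypted one is extracted in full.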


Document-level features are where Composition does more than naive byte-splicing. Those that reference pages are merged across all sources, and cross-document name collisions are disambiguated so nothing silently shares state.

  • Outlines / bookmarks — each source's outline tree is appended under one merged /Outlines root. Destinations are remapped to the new page objects; a bookmark whose target page was dropped (and which has no surviving children) is pruned. Prev/Next/First/Last/Count linkage is rebuilt, preserving open/closed state.
  • Named destinations — the modern /Names /Dests name tree and the legacy /Dests dictionary are merged into one name tree. Destinations targeting dropped pages are removed; name collisions across documents are suffixed (intro, intro_2, …).
  • AcroForm fields — a combined /AcroForm with a unified /Fields list, a merged default-resource (/DR) dictionary, OR-combined /NeedAppearances and /SigFlags, and a concatenated calculation order (/CO). Top-level field-name collisions are suffixed (signature → signature + signature_2) so two forms that happen to use the same field name stay independent instead of sharing a value.
  • Default-resource fonts — identical standard fonts coming from different documents are shared under one resource name; a genuinely different font that lands under an already-used name (F1 from doc A vs. a different F1 from doc B) is added under a fresh name and the referring /DA appearance strings are rewritten to match.

Collision suffixing is consistent across features: a bookmark that points at a renamed named destination follows the rename, and a widget on a renamed field carries the new name too.

// Two PDFs that both define a "signature" field merge into
// "signature" + "signature_2" — each keeps its own value.
byte[] combined = Merger.Merge(formA, formB);
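
The suffixing rule itself is simple to model. The sketch below is not the package's code: Claim and used are invented names, and continuing past _2 with _3, _4, and so on is an assumption extrapolated from the "intro, intro_2, …" pattern.

```csharp
using System;
using System.Collections.Generic;

// Names already taken in the merged document.
var used = new HashSet<string>(StringComparer.Ordinal);

// Returns the name unchanged if it is free, otherwise name_2, name_3, ...
string Claim(string name)
{
    string candidate = name;
    for (int n = 2; !used.Add(candidate); n++)  // Add returns false when the name is taken
        candidate = $"{name}_{n}";
    return candidate;
}

Console.WriteLine(Claim("signature")); // prints: signature
Console.WriteLine(Claim("date"));      // prints: date
Console.WriteLine(Claim("signature")); // prints: signature_2
Console.WriteLine(Claim("signature")); // prints: signature_3
```

Keeping one map from each original name to its assigned name per source document is what would let every reference to a renamed field or destination follow the rename.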

Full example

A runnable, end-to-end demo lives in samples/PdfCompositionDemo. It builds two source documents with the Writer, then exercises both operations. The shape of it:

using JetsonPDF;
using JetsonPDF.Reading;
using JetsonPDF.Composition;
using Path = System.IO.Path; // disambiguate from JetsonPDF.Path (the vector-path type)

// --- 1. Build two sources -----------------------------------------------------
byte[] reportBytes = BuildReport(); // 4 pages, with an outline + named destinations
byte[] formBytes   = BuildForm();   // 1 page, with AcroForm fields

// --- 2. Extract pages 4 and 1 (reordered) into a 2-page summary ----------------
byte[] summaryBytes = PageExtractor.Extract(reportBytes, 4, 1);
var summary = Reader.Load(summaryBytes);
Console.WriteLine($"summary: {summary.Pages.Count} pages, " +
                  $"dests [{string.Join(", ", summary.NamedDestinations.Keys.OrderBy(k => k))}]");

// --- 3. Merge the report and the form into one document ------------------------
byte[] mergedBytes = Merger.Merge(reportBytes, formBytes);
var merged = Reader.Load(mergedBytes);
Console.WriteLine($"merged: {merged.Pages.Count} pages, " +
                  $"{merged.Outlines.Count} top-level bookmarks");

File.WriteAllBytes("summary.pdf", summaryBytes);
File.WriteAllBytes("merged.pdf",  mergedBytes);

static byte[] BuildReport()
{
    var doc  = new Document { Title = "Quarterly Report" };
    var head = new Font(FontFamily.Helvetica, 20, FontStyle.Bold);
    var body = new Font(FontFamily.Helvetica, 12);

    string[] sections = { "Overview", "Revenue", "Costs", "Outlook" };
    for (int i = 0; i < sections.Length; i++)
    {
        var page = doc.AddPage(PageSize.Letter);
        page.DrawText(sections[i], head, 72, 720);
        page.DrawText($"The {sections[i].ToLowerInvariant()} section.", body, 72, 690);
        doc.Outlines.Add(sections[i], Destination.FitEntire(i));
        doc.NamedDestinations[sections[i].ToLowerInvariant()] = Destination.FitEntire(i);
    }
    return doc.Save();
}

static byte[] BuildForm()
{
    var doc   = new Document { Title = "Signup Form" };
    var label = new Font(FontFamily.Helvetica, 12);
    var page  = doc.AddPage(PageSize.Letter);

    page.DrawText("Name:", label, 72, 700);
    page.AddTextField("name", 130, 696, 200, 18, label).Value = "Ada Lovelace";
    page.DrawText("Subscribe:", label, 72, 660);
    page.AddCheckBox("subscribe", 130, 658, 14, 14).IsChecked = true;
    return doc.Save();
}

Run it with dotnet run --project samples/PdfCompositionDemo; it prints the page counts, outline titles, named-destination keys, and field names of every output so you can confirm what carried over.


Errors

The operations validate their arguments eagerly:

| Condition | Exception |
| --- | --- |
| source / sources / inputPaths is null | ArgumentNullException |
| No page numbers passed to Extract | ArgumentException |
| A page number < 1 | ArgumentOutOfRangeException (page numbers are 1-based) |
| ExtractRange with firstPage < 1 or lastPage < firstPage | ArgumentException |
| Empty sources passed to Merge | ArgumentException |
| Source is encrypted and the password didn't unlock it (or none was supplied) | InvalidOperationException |

Scope & limitations

A fresh catalog and page tree are always emitted. What is and isn't carried over:

| Carried over | Not carried over |
| --- | --- |
| Page content, resources, fonts, images | Document structure tree (tagged-PDF /StructTreeRoot) |
| Per-page annotations (links, widgets, markup) | Catalog-level viewer preferences |
| Outlines / bookmarks (remapped + pruned) | Page labels |
| Named destinations (modern tree + legacy dict) | Article threads, OCG layer config |
| AcroForm fields (/Fields, /DR, /CO, flags) | |

The document information dictionary (/Info) is preserved — from the source on extract, and from the first document on merge.

Both types are stateless and their methods are safe to call concurrently from multiple threads (each call builds its own assembler over the input bytes). Input streams are read fully into memory before processing.

| Version | Downloads | Last Updated |
| --- | --- | --- |
| 1.1.0 | 33 | 6/6/2026 |