PommaLabs.HtmlArk 1.3.0

This package has a SemVer 2.0.0 package version: 1.3.0+e23dae1.
There is a newer version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package PommaLabs.HtmlArk --version 1.3.0

Package Manager:
NuGet\Install-Package PommaLabs.HtmlArk -Version 1.3.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="PommaLabs.HtmlArk" Version="1.3.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add PommaLabs.HtmlArk --version 1.3.0

Script & Interactive:
#r "nuget: PommaLabs.HtmlArk, 1.3.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy it into the interactive tool or into the source code of the script to reference the package.

Cake:
// Install PommaLabs.HtmlArk as a Cake Addin
#addin nuget:?package=PommaLabs.HtmlArk&version=1.3.0

// Install PommaLabs.HtmlArk as a Cake Tool
#tool nuget:?package=PommaLabs.HtmlArk&version=1.3.0

HtmlArk

Embeds images, fonts, CSS and JavaScript into an HTML file. Resources are embedded using data URIs.
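To make the idea of a data URI concrete, here is a minimal sketch of how a resource can be turned into one. It is not HtmlArk's actual implementation: the ToDataUri helper and the sample style sheet are hypothetical, and only standard .NET APIs are used.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the HtmlArk API: builds a base64 data URI
// from raw resource bytes and a media type.
static string ToDataUri(byte[] content, string mediaType) =>
    $"data:{mediaType};base64,{Convert.ToBase64String(content)}";

// A tiny style sheet standing in for a downloaded resource.
byte[] css = Encoding.UTF8.GetBytes("body{}");
string dataUri = ToDataUri(css, "text/css");

// A reference to an external style sheet could then point at the data URI instead.
Console.WriteLine($"<link rel=\"stylesheet\" href=\"{dataUri}\">");
// prints: <link rel="stylesheet" href="data:text/css;base64,Ym9keXt9">
```

Because base64 encodes every 3 bytes as 4 characters, embedded resources grow by roughly one third, which is one reason compression matters when serving packed pages.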

This project is a .NET rewrite of the Python project of the same name. Its command line interface has been copied from that project to ease interoperability.

Most disclaimers which were valid for the original library apply here too:

  • ⚠️ HtmlArk should be used only with trusted HTML pages or in a sandboxed environment. Untrusted HTML pages might contain resource links that HtmlArk accepts as valid but that pose a serious security risk to your organization.
  • HtmlArk works with static HTML pages only. If an image or other resource is loaded with JavaScript, HtmlArk won't even know it exists.
  • Most browsers support data URIs, but Internet Explorer support might be less than ideal. Check data URI compatibility on Can I use.

HtmlArk can be used to "pack" web pages into single HTML files. However, HtmlArk is not a crawler, so it must be paired with one in order to pack entire websites.

💡 If you plan to serve packed web pages, please remember to turn on GZIP compression. It usually yields good results and it helps to reduce download size.

Install

The NuGet package PommaLabs.HtmlArk is available for download:

dotnet add package PommaLabs.HtmlArk

The HtmlArk .NET tool can be installed with the following command:

dotnet tool install PommaLabs.HtmlArk.Tool

Usage

Library

As a library, HtmlArk can be imported with the following using directive:

using PommaLabs.HtmlArk;

It can then be used like this, for example:

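// NullLogger<T> comes from the Microsoft.Extensions.Logging.Abstractions namespace.
IHtmlArchiver htmlArchiver = new HtmlArchiver(NullLogger<HtmlArchiver>.Instance);
// Downloads the page and returns its HTML with resources embedded as data URIs.
string archivedHtml = await htmlArchiver.ArchiveAsync(new Uri("https://www.example.com/"));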
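As a slightly fuller sketch, the same calls can be placed in a small console program that saves the packed page to disk. The output file name is a hypothetical choice, and the program assumes that the PommaLabs.HtmlArk and Microsoft.Extensions.Logging.Abstractions packages are referenced and that the target page is reachable.

```csharp
using System;
using System.IO;
using Microsoft.Extensions.Logging.Abstractions;
using PommaLabs.HtmlArk;

// Create the archiver with a logger that discards all log messages.
IHtmlArchiver htmlArchiver = new HtmlArchiver(NullLogger<HtmlArchiver>.Instance);

// Pack the page into a single HTML string.
string archivedHtml = await htmlArchiver.ArchiveAsync(new Uri("https://www.example.com/"));

// Hypothetical output path: write the packed page next to the executable.
await File.WriteAllTextAsync("example.packed.html", archivedHtml);
```

The resulting file can be opened directly in a browser, since its resources no longer depend on external requests.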

If you use dependency injection, it can be registered this way:

services.AddHtmlArchiver(); // Maps IHtmlArchiver to HtmlArchiver as singleton.
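Once registered, the archiver can be requested through constructor injection. The class below is a hypothetical consumer written for illustration, not part of HtmlArk; it relies only on the IHtmlArchiver interface and the ArchiveAsync call shown above.

```csharp
using System;
using System.Threading.Tasks;
using PommaLabs.HtmlArk;

// Hypothetical consumer class, not part of HtmlArk: receives the archiver
// through constructor injection once AddHtmlArchiver() has been called.
public sealed class PageSnapshotService
{
    private readonly IHtmlArchiver _htmlArchiver;

    public PageSnapshotService(IHtmlArchiver htmlArchiver)
    {
        _htmlArchiver = htmlArchiver;
    }

    // Returns the page at the given URI packed into a single HTML string.
    public async Task<string> SnapshotAsync(Uri pageUri)
    {
        return await _htmlArchiver.ArchiveAsync(pageUri);
    }
}
```

Depending on IHtmlArchiver rather than on HtmlArchiver keeps the consumer easy to test with a stub implementation.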

Tool

The HtmlArk .NET tool accepts the following command line arguments:

  -M, --http-client-max-resource-size    How many bytes can be downloaded for each resource.

  -T, --http-client-timeout              Timeout of the internal HTTP client.

  -A, --ignore-audios                    Ignores audios during archival.

  -C, --ignore-css                       Ignores style sheets during archival.

  -E, --ignore-errors                    Ignores unreadable resources.

  -I, --ignore-images                    Ignores images during archival.

  -J, --ignore-js                        Ignores external JavaScript during archival.

  -V, --ignore-videos                    Ignores videos during archival.

  -m, --minify                           Minifies output HTML.

  -o, --output                           Output file path. If not specified, output will be written to STDOUT.

  -v, --verbose                          Prints detailed information during HTML archival.

  --help                                 Display this help screen.

  --version                              Display version information.

  input (pos. 0)                         Required. Input URI or file path.

The interface is modeled after the original Python project, so switching between the two should be easy.

Maintainers

@pomma89.

Contributing

MRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

Editing

Visual Studio Code, with the Remote Containers extension, is the recommended way to work on this project.

A development container has been configured with all required tools.

Visual Studio Community is also supported and an updated solution file, htmlark.sln, has been provided.

Restoring dependencies

When the development container is opened, dependencies should be restored automatically.

They can also be restored manually with the following command:

dotnet restore

Running tests

Tests can be run with the following command:

dotnet test

Tests can also be run with the following command, which collects coverage information:

./build.sh --target run-tests

License

MIT © 2020-2023 Alessio Parma

Compatible and additional computed target framework versions (.NET):

  • Compatible: net6.0, net7.0.
  • Computed: net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version  Downloads  Last updated
1.6.0    150        11/15/2023
1.5.0    88         5/6/2023
1.4.0    67         4/28/2023
1.3.0    78         12/18/2022
1.2.0    413        11/21/2021
1.1.0    327        8/13/2021
1.0.0    286        8/12/2021
0.2.0    315        4/25/2021
0.1.1    491        11/15/2020
0.1.0    390        11/15/2020