X.Web.MetaExtractor 2.0.2

X.Web.MetaExtractor

X.Web.MetaExtractor is a library that extracts meta information from a web page, given its URL. It ships several content loaders, each of which performs the HTTP request through a different HTTP library.

Breaking Changes

  • Metadata class changed: The Content field has been removed from the Metadata class. Update any code that used the Content field.
  • Description Extraction Logic: The Extractor class now only extracts the description from meta tags, without attempting to parse the content of the page. Adjust your implementation if it relied on content parsing for the description.
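
To illustrate what "description from meta tags only" means in practice, here is a minimal C# sketch. It is not the library's implementation: MetaDescriptionSketch and FindDescription are hypothetical names, and the regular expression assumes the name attribute appears before content, which a real HTML parser would not require. The point is that body text never contributes to the result.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Sample page: the description lives in a meta tag; the body text is ignored.
string html =
    "<html><head><meta name=\"description\" content=\"A sample page\"></head>" +
    "<body><p>Body text is never used as a fallback.</p></body></html>";

Console.WriteLine(MetaDescriptionSketch.FindDescription(html));

// Hypothetical helper, for illustration only (not part of X.Web.MetaExtractor).
public static class MetaDescriptionSketch
{
    // Matches <meta ... name="description" ... content="...">.
    // Group 1 captures the opening quote so the same quote closes the value.
    private static readonly Regex DescriptionTag = new Regex(
        @"<meta\s+[^>]*name\s*=\s*[""']description[""'][^>]*content\s*=\s*([""'])(.*?)\1",
        RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.CultureInvariant);

    // Returns the decoded, trimmed description, or an empty string when
    // the page has no matching meta tag.
    public static string FindDescription(string html)
    {
        Match match = DescriptionTag.Match(html);
        return match.Success
            ? WebUtility.HtmlDecode(match.Groups[2].Value).Trim()
            : string.Empty;
    }
}
```

If your code previously relied on a description derived from page content, a page without a description meta tag will now yield no description, so handle the empty case explicitly.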

Features

  • Extract meta information from any web page URL.
  • Support for multiple HTTP libraries:
    • Flurl
    • FsHttp
    • HttpClient
    • RestSharp
  • Detect the language of the page content.

Installation

To install the library, use the following command:

dotnet add package X.Web.MetaExtractor

Usage

Here is a basic example of how to use the X.Web.MetaExtractor library:

using X.Web.MetaExtractor;
using X.Web.MetaExtractor.ContentLoaders;
using X.Web.MetaExtractor.LanguageDetectors;

// Create instances of the necessary components
IPageContentLoader contentLoader = new FlurlPageContentLoader();
ILanguageDetector languageDetector = new LanguageDetector();
string defaultImage = "https://example.com/example.jpg";

// Create an instance of the Extractor
IExtractor extractor = new Extractor(defaultImage, contentLoader, languageDetector);

// Extract meta information from a URL
var metaInfo = await extractor.ExtractAsync(new Uri("https://example.com"));

// Display the extracted meta information
Console.WriteLine($"Title: {metaInfo.Title}");
Console.WriteLine($"Description: {metaInfo.Description}");
Console.WriteLine($"Keywords: {metaInfo.Keywords}");
Console.WriteLine($"Language: {metaInfo.Language}");

Interfaces and Classes

IExtractor

IExtractor defines the interface for extracting meta information.

ILanguageDetector

ILanguageDetector defines the interface for detecting the language of the page content.

IPageContentLoader

IPageContentLoader defines the interface for loading the content of a web page.

Metadata

Metadata is a class that holds the meta information of a web page, including the title, description, keywords, and language.

Content Loaders

Flurl

X.Web.MetaExtractor.ContentLoaders.Flurl provides a content loader using the Flurl HTTP library, enabling efficient and fluent HTTP request handling for meta information extraction from any page URL.

FsHttp

X.Web.MetaExtractor.ContentLoaders.FsHttp leverages the FsHttp library to load content, facilitating robust and type-safe HTTP request execution for extracting meta information from any page URL.

HttpClient

X.Web.MetaExtractor.ContentLoaders.HttpClient utilizes the HttpClient class to load content, offering a flexible and reliable approach to perform HTTP requests for meta information extraction from any page URL.

RestSharp

X.Web.MetaExtractor.ContentLoaders.RestSharp uses the RestSharp library for content loading, providing an intuitive and powerful way to handle HTTP requests for extracting meta information from any page URL.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 is compatible.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 is compatible. 
.NET Framework net461 was computed.  net462 was computed.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 

NuGet packages (3)

Showing the top 3 NuGet packages that depend on X.Web.MetaExtractor:

  • X.Bluesky: Simple client for posting to Bluesky
  • X.Web.MetaExtractor.ContentLoaders.RestSharp
  • X.Web.MetaExtractor.ContentLoaders.FsHttp

Version Downloads Last updated
2.0.2 150 7/9/2024
1.8.0 11,961 3/5/2023
1.7.0 520 9/8/2022
1.5.7688.23013 732 1/18/2021
1.4.7641.34535 401 12/2/2020
1.4.7312.26620 663 1/8/2020
1.1.7147.30235 599 9/26/2019
1.1.7147.30212 511 7/27/2019
1.0.12 847 11/24/2018
1.0.11 694 11/9/2018
1.0.10 657 11/5/2018
1.0.9 819 9/3/2018
1.0.8 747 8/31/2018
1.0.7 988 4/26/2018
1.0.5 905 4/11/2018
1.0.4 884 4/4/2018
1.0.3 1,021 1/31/2018
1.0.2 1,109 8/20/2017
1.0.0 888 7/23/2017