ExcavatorSharp.WebScraper.x64 1.2.3

ExcavatorSharp is a multi-threaded server for scraping web data. It converts HTML into structured data. The library can scrape multiple sites in parallel within a single running application. You can create scraping tasks and run data extraction on a schedule.

The library is designed for professional extraction and parsing of large volumes of data. It supports CSS selectors and XPath, data export to .csv/.xlsx/.sql/.json, online data export, proxy servers, dynamic content crawling, interaction with the site via JavaScript, and much more. The library uses .NET Sockets and the Chromium Embedded Framework (CEF).

The library can be used separately as a crawler or as a parser. It supports sitemap.xml and robots.txt, as well as gzip and deflate compression.

Attention! Only x64 builds are supported, for the .NET 4.5.2 and 4.6 platforms. AnyCPU builds are not supported: you will NOT be able to run the library if your project is built as AnyCPU. This limitation comes from CEF.
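
Because AnyCPU builds cannot run the library, the consuming project has to target x64 explicitly. A minimal sketch of the relevant project-file setting, assuming an MSBuild project (.csproj). The surrounding PropertyGroup is illustrative; your project file may already contain one, possibly with a Condition attribute:

```xml
<!-- Illustrative fragment of a .csproj file: forces an x64 build instead of AnyCPU. -->
<PropertyGroup>
  <PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
```

In Visual Studio the same setting appears as "Platform target" on the project's Build properties page.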

Install-Package ExcavatorSharp.WebScraper.x64 -Version 1.2.3
dotnet add package ExcavatorSharp.WebScraper.x64 --version 1.2.3
<PackageReference Include="ExcavatorSharp.WebScraper.x64" Version="1.2.3" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add ExcavatorSharp.WebScraper.x64 --version 1.2.3

Release Notes

1) Fixed an error in handling the test project directory.
2) Fixed an error in handling the project directory for viewing links.
3) Fixed an error that caused a log file to stay locked indefinitely.
4) Corrected the text of log messages. Several additional branches of log records have been added.
5) Changed the algorithm that analyzes links added by the user for indexing. A sequential foreach is now used instead of Parallel.ForEach. This is more reliable, with minimal difference in performance.


Version History

Version   Downloads   Last updated
1.2.3     40          5/20/2020
1.2.2     22          5/10/2020
1.2.1     35          5/5/2020
1.2.0     30          4/30/2020
1.1.0     36          4/23/2020
1.0.53    56          4/12/2020
1.0.52    57          4/11/2020
1.0.51    61          4/11/2020
1.0.6     31          4/23/2020
1.0.5     61          4/11/2020
1.0.4     96          4/3/2020
1.0.3     52          2/12/2020
1.0.2     113         1/30/2020
1.0.1     64          1/30/2020
1.0.0     46          1/23/2020