ExcavatorSharp is a multi-threaded server for scraping web data. It converts HTML into a structured array of data. The library can scrape multiple sites in parallel within a single running application. You can create scraping tasks and run data extraction on a schedule.
The library can also be used on its own as either a crawler or a parser. It supports the sitemap.xml and robots.txt formats, as well as gzip and deflate compression.
Attention! Only x64 builds are supported, on the .NET 4.5.2 and 4.6 platforms. AnyCPU builds are not supported: you will NOT be able to run the library if your project is built as AnyCPU. This restriction comes from CEF.
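One way to satisfy the x64 requirement is to pin the platform target in the project file of the consuming application. The fragment below is a minimal sketch, assuming a standard MSBuild .csproj; PlatformTarget is a regular MSBuild property, not something specific to ExcavatorSharp, and your project may already set it per build configuration.

```xml
<!-- Assumption: added to the .csproj of the application that references the package. -->
<PropertyGroup>
  <!-- Build as x64 instead of the default AnyCPU -->
  <PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
```

If the project defines PlatformTarget inside configuration-specific property groups (for example Debug and Release), the value would need to be changed in each of them.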
Install-Package ExcavatorSharp.WebScraper.x64 -Version 1.2.3
dotnet add package ExcavatorSharp.WebScraper.x64 --version 1.2.3
<PackageReference Include="ExcavatorSharp.WebScraper.x64" Version="1.2.3" />
paket add ExcavatorSharp.WebScraper.x64 --version 1.2.3
1) Fixed an error in handling the test project directory.
2) Fixed an error in handling the project directory used for viewing links.
3) Fixed a bug that left a log file locked indefinitely.
4) Corrected the text of log messages. Several additional branches of log records have been added.
5) Changed the algorithm that analyzes the links a user adds for indexing. A sequential foreach loop is now used instead of Parallel.ForEach. This is more reliable, with minimal difference in performance.