10 packages returned for Tags:"extraction"
- 5,473 total downloads
- last updated 9/22/2015
- Latest version: 1.2.0
The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page. Boilerpipe.Net is a port of the Java boilerpipe library.
- 2,165 total downloads
- last updated 8/20/2017
- Latest version: 1.0.1
Turn unstructured HTML pages into structured data. The OpenScraping library extracts information from HTML pages using a JSON config file with XPath rules. It can also scrape complex multi-level objects such as tables and forum posts.
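To make the idea of a JSON config with XPath rules concrete, here is a minimal Python sketch of rule-driven extraction. It is not OpenScraping's API or its actual config schema: the `CONFIG` layout, the `_xpath` key for nested objects, the `extract` helper, and the sample page are all assumptions made for illustration, and the standard-library parser used here only handles well-formed markup and a limited XPath subset.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical rule set: keys are output fields, values are XPath expressions.
# A nested object carries its own "_xpath" that selects the repeated nodes.
CONFIG = json.loads("""
{
  "title": ".//h1",
  "posts": {
    "_xpath": ".//div[@class='post']",
    "author": ".//span[@class='author']",
    "text": ".//p"
  }
}
""")


def extract(node, rules):
    """Apply the rules to one node and return a plain dict of results."""
    result = {}
    for field, rule in rules.items():
        if field == "_xpath":
            continue  # selector for this level, already applied by the caller
        if isinstance(rule, dict):
            # Nested rule: one sub-object per node matched by its "_xpath".
            children = node.findall(rule["_xpath"])
            result[field] = [extract(child, rule) for child in children]
        else:
            match = node.find(rule)
            if match is None:
                result[field] = None
            else:
                result[field] = "".join(match.itertext()).strip()
    return result


# Sample page (assumed, well-formed so the XML parser accepts it).
PAGE = """<html><body>
<h1>Forum thread</h1>
<div class="post"><span class="author">ann</span><p>First post</p></div>
<div class="post"><span class="author">bob</span><p>Second post</p></div>
</body></html>"""

data = extract(ET.fromstring(PAGE), CONFIG)
print(json.dumps(data, indent=2))
```

Keeping the rules in data rather than code means a new page layout only needs a new config, which is the appeal of this style of scraper.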
Find and extract translatable strings from C# and F# sources
- 195 total downloads
- last updated 4/12/2017
- Latest version: 1.1.0
Finds localizable messages in *.fs and *.cs files by looking for calls such as I18n.Translate("message") in those sources. Puts the unique messages into a specified JSON file (updating it if necessary). The class name, method name and other settings are configurable.
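The approach described above can be sketched in a few lines of Python. This is only an illustration of the technique, not the package's implementation: the regular expression, the `find_messages` and `update_catalog` helpers, and the catalog layout (message mapped to an empty translation) are assumptions, and a plain regex will not handle every C# or F# string form, such as verbatim or interpolated strings.

```python
import json
import re
import tempfile
from pathlib import Path

# Assumed call shape: I18n.Translate("...") with an ordinary string literal.
CALL_PATTERN = re.compile(r'I18n\.Translate\(\s*"((?:[^"\\]|\\.)*)"')


def find_messages(source_text):
    """Return every message literal passed to I18n.Translate, in order."""
    return [m.group(1) for m in CALL_PATTERN.finditer(source_text)]


def update_catalog(catalog_path, messages):
    """Add unseen messages to the JSON catalog, keeping existing entries."""
    path = Path(catalog_path)
    if path.exists():
        catalog = json.loads(path.read_text(encoding="utf-8"))
    else:
        catalog = {}
    for message in messages:
        catalog.setdefault(message, "")  # empty string marks "not translated"
    path.write_text(
        json.dumps(catalog, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return catalog


# Sample source text (assumed) mixing C#-style and F#-style lines.
SAMPLE = (
    'var a = I18n.Translate("Hello");\n'
    'let b = I18n.Translate("Bye")\n'
    'var c = I18n.Translate("Hello");\n'
)

with tempfile.TemporaryDirectory() as tmp:
    catalog_file = Path(tmp) / "messages.json"
    catalog = update_catalog(catalog_file, find_messages(SAMPLE))
    print(catalog_file.read_text(encoding="utf-8"))
```

Using `setdefault` keeps translations that already exist in the file, so rerunning the extraction after the sources change only appends the new messages.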