TikaOnDotnet.TextExtractor 1.17.1

Classes for running Apache Tika through **TikaOnDotNet**. Just use TextExtractor.Extract() and you'll be on your way.
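As a minimal sketch of the `TextExtractor.Extract()` call mentioned above: the program below extracts a single file and prints what comes back. It assumes the `TikaOnDotNet.TextExtraction` namespace and a result object exposing `Text`, `ContentType`, and a string-to-string `Metadata` dictionary. The class name `ExtractDemo` and the file name `sample.pdf` are hypothetical placeholders.

```cs
using System;
using TikaOnDotNet.TextExtraction;

public static class ExtractDemo
{
    public static void Main(string[] args)
    {
        // Hypothetical input path; pass a real file as the first argument.
        var path = args.Length > 0 ? args[0] : "sample.pdf";

        // Extract returns the detected content type, the plain text,
        // and the metadata Tika found in the file.
        var result = new TextExtractor().Extract(path);

        Console.WriteLine("Content type: " + result.ContentType);
        Console.WriteLine(result.Text);

        // Assumed shape: one string value per metadata key.
        foreach (var pair in result.Metadata)
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

The default result joins multi-valued metadata into a single string per key, which is why a custom result assembler can be useful when individual values matter.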

- Package Manager: `Install-Package TikaOnDotnet.TextExtractor -Version 1.17.1`
- .NET CLI: `dotnet add package TikaOnDotnet.TextExtractor --version 1.17.1`
- PackageReference: `<PackageReference Include="TikaOnDotnet.TextExtractor" Version="1.17.1" />` (for projects that support PackageReference, copy this XML node into the project file to reference the package)
- Paket CLI: `paket add TikaOnDotnet.TextExtractor --version 1.17.1` (the NuGet Team does not provide support for this client; please contact its maintainers for support)

Release Notes

- Add new overloads of `TextExtractor.Extract` that let callers supply their own extraction result assembler. Example:
```cs
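using System.Collections.Generic;
using System.Linq;
using FluentAssertions;
using NUnit.Framework;
using org.apache.tika.metadata; // Tika's Metadata type, exposed to .NET via IKVM
using TikaOnDotNet.TextExtraction;

public class CustomResult
{
    public string Text { get; set; }
    public IDictionary<string, string[]> Metadata { get; set; }
}

public class CustomResultTests
{
    // Assembler passed to Extract: keeps every value of each metadata key
    // rather than a single string per key.
    public static CustomResult CreateCustomResult(string text, Metadata metadata)
    {
        var metaDataDictionary = metadata.names().ToDictionary(name => name, metadata.getValues);
        return new CustomResult
        {
            Metadata = metaDataDictionary,
            Text = text,
        };
    }

    [Test]
    public void should_extract_author_list_from_pdf()
    {
        var textExtractionResult = new TextExtractor().Extract("file_with_authors.pdf", CreateCustomResult);
        textExtractionResult.Metadata["meta:author"].Should().ContainInOrder("Fred Jones, M. D.", "Donald Evans D. M.");
    }
}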
```

This package is not used by any popular GitHub repositories.

Version History

| Version | Downloads | Last updated |
| --- | --- | --- |
| 1.17.1 | 37,840 | 4/3/2018 |
| 1.17.0 | 6,653 | 2/15/2018 |
| 1.16.0 | 12,817 | 7/30/2017 |
| 1.15.0 | 284 | 7/30/2017 |
| 1.14.2 | 7,850 | 4/22/2017 |
| 1.14.2-pre | 270 | 4/15/2017 |
| 1.14.1 | 6,986 | 1/13/2017 |
| 1.14.0 | 1,219 | 12/8/2016 |
| 1.13.1 | 1,964 | 8/16/2016 |
| 1.13.0 | 1,094 | 6/30/2016 |
| 1.12.2 | 1,027 | 4/12/2016 |
| 1.12.1 | 330 | 4/12/2016 |
| 1.12.0 | 366 | 4/11/2016 |