Bytescout.PDFExtractor
13.4.1.4801
dotnet add package Bytescout.PDFExtractor --version 13.4.1.4801
NuGet\Install-Package Bytescout.PDFExtractor -Version 13.4.1.4801
<PackageReference Include="Bytescout.PDFExtractor" Version="13.4.1.4801" />
paket add Bytescout.PDFExtractor --version 13.4.1.4801
#r "nuget: Bytescout.PDFExtractor, 13.4.1.4801"
// Install Bytescout.PDFExtractor as a Cake Addin
#addin nuget:?package=Bytescout.PDFExtractor&version=13.4.1.4801
// Install Bytescout.PDFExtractor as a Cake Tool
#tool nuget:?package=Bytescout.PDFExtractor&version=13.4.1.4801
Bytescout PDF Extractor SDK for .NET, ASP.NET, ActiveX - extract data from PDF documents
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Framework | net20 is compatible. net35 was computed. net40 is compatible. net403 was computed. net45 was computed. net451 was computed. net452 was computed. net46 was computed. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
- .NETCoreApp 2.1
  - Microsoft.Windows.Compatibility (>= 5.0.2)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Bytescout.PDFExtractor:
Package | Downloads |
---|---|
BizDoc.Applications.Invoice-Scan: Invoice for BizDoc | |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
13.4.1.4801 | 7,317 | 7/24/2023 |
13.4.1.4792 | 1,624 | 7/20/2023 |
13.4.1.4787 | 1,855 | 7/14/2023 |
13.4.0.4760 | 2,484 | 5/24/2023 |
13.4.0.4755 | 1,652 | 5/24/2023 |
13.4.0.4734 | 2,014 | 5/9/2023 |
13.4.0.4727 | 1,840 | 5/2/2023 |
13.4.0.4717 | 1,739 | 4/17/2023 |
13.3.0.4514 | 6,279 | 9/27/2022 |
13.2.1.4489 | 8,383 | 6/13/2022 |
13.2.0.4485 | 5,565 | 6/7/2022 |
13.1.1.4480 | 4,498 | 5/25/2022 |
13.1.0.4386 | 30,575 | 1/25/2022 |
13.0.1.4281 | 3,055 | 11/8/2021 |
13.0.0.4254 | 2,098 | 10/4/2021 |
12.1.5.4183 | 2,883 | 7/5/2021 |
12.1.5.4181 | 1,829 | 7/5/2021 |
12.1.4.4171 | 2,146 | 6/17/2021 |
12.1.4.4169 | 1,634 | 6/17/2021 |
12.1.3.4167 | 2,027 | 6/16/2021 |
12.1.2.4156 | 1,970 | 5/28/2021 |
12.1.1.4149 | 2,051 | 5/26/2021 |
12.1.1.4145 | 1,866 | 5/26/2021 |
12.1.0.4136 | 2,167 | 5/18/2021 |
12.0.0.4062 | 2,689 | 2/8/2021 |
11.3.0.3983 | 2,818 | 10/26/2020 |
11.2.1.3959 | 4,009 | 9/1/2020 |
11.2.1.3929 | 2,652 | 7/14/2020 |
11.2.1.3926 | 2,103 | 7/9/2020 |
11.2.0.3919 | 2,343 | 6/30/2020 |
11.1.0.3869 | 4,799 | 4/10/2020 |
11.1.0.3864 | 2,321 | 4/4/2020 |
11.1.0.3849 | 2,431 | 3/27/2020 |
11.1.0.3845 | 2,182 | 3/19/2020 |
11.0.0.3834 | 2,313 | 3/6/2020 |
11.0.0.3832 | 2,329 | 3/4/2020 |
11.0.0.3830 | 2,402 | 3/4/2020 |
11.0.0.3815 | 2,350 | 2/21/2020 |
11.0.0.3805 | 2,451 | 2/11/2020 |
10.8.0.3758 | 3,519 | 12/19/2019 |
10.8.0.3750 | 2,341 | 12/17/2019 |
10.8.0.3744 | 2,281 | 12/12/2019 |
10.8.0.3741 | 2,080 | 12/10/2019 |
10.8.0.3736 | 2,299 | 12/6/2019 |
10.8.0.3732 | 2,216 | 12/4/2019 |
10.7.2.3710 | 2,779 | 11/13/2019 |
10.7.1.3705 | 2,266 | 11/11/2019 |
10.7.0.3697 | 2,411 | 11/2/2019 |
10.6.0.3666 | 3,649 | 10/1/2019 |
10.5.0.3637 | 3,051 | 9/2/2019 |
10.4.0.3618 | 2,707 | 8/15/2019 |
10.4.0.3613 | 2,249 | 8/13/2019 |
10.4.0.3602 | 2,625 | 8/7/2019 |
10.3.0.3566 | 2,921 | 7/2/2019 |
10.2.0.3548 | 3,598 | 6/13/2019 |
10.2.0.3534 | 2,200 | 6/11/2019 |
10.2.0.3525 | 2,283 | 6/7/2019 |
10.2.0.3514 | 2,270 | 5/28/2019 |
10.1.0.3444 | 2,730 | 4/5/2019 |
10.1.0.3439 | 2,014 | 4/4/2019 |
10.0.0.3429 | 2,081 | 3/25/2019 |
10.0.0.3427 | 2,080 | 3/25/2019 |
10.0.0.3424 | 2,026 | 3/23/2019 |
10.0.0.3423 | 1,952 | 3/23/2019 |
10.0.0.3422 | 1,989 | 3/23/2019 |
10.0.0.3421 | 1,963 | 3/21/2019 |
9.4.0.3398 | 2,000 | 3/12/2019 |
9.3.0.3366 | 4,046 | 2/12/2019 |
9.3.0.3357 | 1,799 | 2/4/2019 |
9.3.0.3354 | 1,768 | 1/31/2019 |
9.2.0.3293 | 3,593 | 11/20/2018 |
9.2.0.3262 | 2,071 | 10/24/2018 |
9.2.0.3259 | 1,711 | 10/24/2018 |
9.1.0.3170 | 3,212 | 7/26/2018 |
9.1.0.3167 | 2,077 | 7/18/2018 |
9.1.0.3165 | 1,885 | 7/18/2018 |
9.1.0.3163 | 1,888 | 7/18/2018 |
9.0.0.3095 | 3,270 | 4/23/2018 |
9.0.0.3087 | 2,286 | 4/13/2018 |
9.0.0.3080 | 2,162 | 4/11/2018 |
8.8.1.3046 | 3,048 | 2/20/2018 |
8.8.1.3025 | 3,141 | 1/29/2018 |
8.8.0.3021 | 2,219 | 1/23/2018 |
8.7.0.2981 | 16,736 | 11/8/2017 |
8.6.0.2917 | 3,081 | 8/2/2017 |
8.6.0.2912 | 1,903 | 8/1/2017 |
8.5.0.2863 | 2,246 | 6/9/2017 |
8.5.0.2861 | 2,371 | 6/8/2017 |
8.5.0.2856 | 2,116 | 6/1/2017 |
8.4.1.2829 | 6,520 | 4/12/2017 |
8.4.0.2821 | 2,175 | 3/29/2017 |
8.3.0.2809 | 3,061 | 3/13/2017 |
8.3.0.2806 | 1,968 | 3/12/2017 |
8.3.0.2803 | 1,990 | 3/6/2017 |
8.3.0.2801 | 2,006 | 3/6/2017 |
8.3.0.2800 | 1,989 | 3/6/2017 |
8.3.0.2798 | 2,060 | 3/6/2017 |
8.3.0.2796 | 1,968 | 3/6/2017 |
8.3.0.2794 | 1,988 | 3/6/2017 |
8.2.0.2699 | 2,451 | 1/11/2017 |
8.1.1.2606 | 3,500 | 10/25/2016 |
8.1.0.2600 | 2,023 | 10/21/2016 |
8.0.0.2542 | 2,337 | 9/1/2016 |
8.0.0.2541 | 2,074 | 9/1/2016 |
8.0.0.2528 | 2,105 | 8/23/2016 |
8.0.0.2523 | 2,201 | 8/19/2016 |
7.0.0.2493 | 32,763 | 6/27/2016 |
7.0.0.2489 | 1,775 | 6/27/2016 |
7.0.0.2480 | 4,534 | 6/10/2016 |
7.0.0.2474 | 6,654 | 5/26/2016 |
6.30.0.2421 | 2,110 | 3/24/2016 |
6.20.0.2354 | 2,273 | 1/20/2016 |
6.12.0.2239 | 5,180 | 9/22/2015 |
5.20.0.1871 | 2,723 | 2/5/2015 |
5.0.0.1626 | 2,582 | 8/14/2014 |
4.0.0.1487 | 1,913 | 5/31/2014 |
3.40.0.1349 | 2,183 | 3/11/2014 |
3.20.0.1092 | 2,368 | 8/5/2013 |
3.20.0.1075 | 3,211 | 7/12/2013 |
3.10.0.1051 | 2,123 | 6/29/2013 |
3.0.0.839 | 2,094 | 3/26/2013 |
2.50.0.769 | 2,116 | 2/25/2013 |
Bytescout PDF Extractor SDK for .NET, ASP.NET, ActiveX.
Artifex Software, Inc. (c) 2008-2023.
Compatibility: .NET Framework 2.0 or later; .NET Core 2.1 or later.
Works with: .NET, ASP.NET, ActiveX, Visual Basic 6, Classic ASP, Delphi and others.
Features:
- Extracts data from PDF files in TXT, CSV, XML, XLS, XLSX, JSON formats;
- Extracts embedded images, files and attachments from PDF files;
- Splits and merges PDF files, extracts a single page or range of pages;
- Extracts data from a whole document page or a specified rectangular region;
- Extracts PDF document information (author, subject, producer, etc.);
- Detects tables;
- Searches text inside document with regex support;
- Extracts data from PDF forms;
- Reads text from scanned PDF documents using OCR (Optical Character Recognition);
- Provides an ActiveX interface for use from legacy programming languages (Visual Basic 6, Delphi) and scripting languages (VBScript, JScript and others);
- And much more...
History of changes:
13.4.1.4780 (July 14, 2023)
================================
+ Enhanced text parsing
+ Improved image file rendering
= Other minor fixes and improvements.
13.4.0.4659 (April 10, 2023)
================================
+ Added support for WEBP image format in 'RasterRenderer' and 'HTMLExtractor'
+ Added Variant methods to extractors
- Improved font rendering
- Fixed crash on text object where contentLength
= Performance improvements
= Other minor fixes and improvements.
13.3.0.4514 (September 27, 2022)
================================
+ DocumentSplitter: added support for the "**" split range, which splits a document into pairs of pages.
+ Added methods to all extractors that support the Variant datatype for input and output. They allow in-memory processing when the SDK is used as a COM/ActiveX object from Delphi, VC++, VBScript, etc.
- Fixed text search for RTL languages.
- Input photo images are now rotated according to EXIF information.
= Improved parsing of PDF documents.
= Other minor fixes and improvements.
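The "**" split range mentioned above can be pictured with a short, SDK-independent sketch in Python. The function name page_pairs is hypothetical and is not part of 'DocumentSplitter'. The handling of a trailing odd page (it forms its own one-page range) is an assumption, since the note above does not say what the SDK does in that case.

```python
def page_pairs(page_count):
    """Return 1-based inclusive (first, last) page ranges, two pages per range."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    ranges = []
    for first in range(1, page_count + 1, 2):
        # Assumption: a trailing odd page becomes a one-page range.
        last = min(first + 1, page_count)
        ranges.append((first, last))
    return ranges
```

For a five-page document this yields the ranges 1-2, 3-4 and 5-5.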
13.2.0.4485 (June 7, 2022)
==========================
= 'DocumentRotator' can now automatically fix the rotation of PDF files using OCR.
= Improved line removal algorithm.
= Improved loading of embedded fonts.
= Performance improvements.
- Rotated text objects were combined with unrotated ones. Fixed now.
- Fixed parsing of names of file attachments.
- 'SearchablePDFMaker': fixed coordinates of transparent text in the output document when the input is an image.
= Suppressed junk console message.
= Improved parsing of PDF documents.
= Other minor fixes and improvements.
13.1.0.4386 (January 24, 2022)
==============================
+ DocumentMerger: Added property 'MergedDocumentTitle' that overrides the title of the merged document.
+ XLSExtractor: Added property 'CustomColumnWidths' that specifies exact column widths in the generated Excel spreadsheet.
= JSONExtractor: The mode 'OutputStructure.Full' is renamed to 'OutputStructure.LegacyFixed' and made maximally compatible in field names with the mode 'OutputStructure.Legacy'.
+ Added support for UniKS-UCS2-H text encoding.
+ InfoExtractor: Added method 'GetFormFields()' returning information about form fields in PDF document.
= Improved COM/ActiveX interfaces for in-memory processing without file operations.
+ Extractors and SearchablePDFMaker: Added property 'OCRDisableAutoSegmentation' to solve OCR engine's segmentation issues.
= The minimum required .NET Core version is now 2.1 (was 2.0).
- Line grouping was not affected by 'ConsiderFontSizes' and 'ConsiderFontColors' properties. Fixed now.
- Fixed disposing issue in 'SearchablePDFMaker'.
= Improved parsing of PDF documents.
= Other minor fixes and improvements.
13.0.0.4253 (October 4, 2021)
=============================
+ New column detection mode 'ColumnDetectionMode.ContentGroupsAI' that works better on tables without borders and on pages with multiple tables.
= Greatly improved table detection in 'TableDetector2'.
= Improved filtering of shadow-like text ('ExtractShadowLikeText' option).
= Improved the 'LineGroupingMode.JoinOrphanedRows' mode.
= 'DocumentMerger': Improved merging of PDF forms. Now it can link fields with matching names or rename them to avoid unwanted linking. See the property 'RenameMatchingFieldsDuringMerge'.
= 'JSONExtractor' and 'XMLExtractor' now output the page size for each page.
= All extractor classes now support extraction of page ranges.
+ Added properties 'DetectUnderlineTextStyle' and 'DetectStrikeoutTextStyle' to 'CSVExtractor' and 'XLSExtractor'. They help prevent underlined text from affecting the line grouping in table cells.
= Improved background color detection for the option 'ConsiderBackgroundColors'.
+ Added property 'NormalizeText' to all extractors. It replaces Unicode spaces and hyphens in the extracted text with normal ' ' and '-' characters.
- 'Remover2': fixed handling of PDF page rotation.
- 'Remover2': making text unsearchable is now performed only for edited pages.
+ 'XMLExtractor': Added property 'IndentedXML' to control indentation.
+ 'JSONExtractor': Added property 'IndentedJSON' to control indentation.
- 'Stamper': fixed stamping of rotated pages.
+ Added new OCR mode - 'OCRMode.AutoRepairFonts'. It automatically tries to detect PDF documents with corrupted text and forces OCR font repair for them. Works only for English texts.
+ Added property 'PageSeparator' to CSV and XLS extractors.
= 'XLSExtractor': improved detection of negative numbers.
- 'TextExtractor.FindAll()' method was ignoring the case sensitivity option. Fixed now.
+ Added property 'OCRDetectLines' that helps to detect table structure in scanned documents.
+ 'JSONExtractor' and 'XMLExtractor' now output the number of pages in the result and the number of pages for which OCR was performed.
+ Added property 'OCRPageCount' to extractors. It contains the number of pages for which OCR was performed during the last extraction.
+ 'JSONExtractor': Added property 'OutputStructure' that selects the structure of the output JSON.
+ 'JSONExtractor': Added property 'OutputTransformation' that applies a JSONPath expression to the output JSON.
= Performance improvements.
= Improved parsing of PDF documents.
= Other minor fixes and improvements.
12.1.0.4136 (May 18, 2021)
==========================
+ Added property 'TextExtractor.FuzzySearch' that enables a 'fuzzy' text search algorithm. It finds 'approximately equal' strings.
+ Added 'DocumentSplitter2' class that splits document by found text.
+ Added 'CSVExtractor.NormalizeCSV' property. It makes CSV data produced from different document pages contain the same number of columns.
+ Added property 'JSONExtractor.OutputStructure' that changes the structure of the generated JSON to one of the predefined variants for easier postprocessing.
+ Added property 'JSONExtractor.OutputTransformation' that applies a JSONPath expression to the generated JSON.
+ Added property 'OCRPageCount' to extractor classes. It contains the number of pages for which OCR was performed.
+ 'JSONExtractor' and 'XMLExtractor' now add to the generated JSON and XML result the number of processed pages and the number of pages for which OCR was performed.
+ Added property 'OCRDetectLines' to extractor classes that improves column detection in scanned documents.
+ Added property 'ConsiderBackgroundColors' to extractor classes that enables detection of the background color under text objects. It may help to improve row and column detection in tables without borders but with color stripes.
+ Added properties 'DocumentMerger.GenerateBookmarks' and 'DocumentMerger.BookmarkTitles' to enable automatic generation of bookmarks pointing to the merged parts.
= Improved PDF optimization in 'DocumentSplitter'.
= 'DocumentMerger' now uses the first input document as the base for the merged document. This preserves document information properties and outlines.
= DocumentMerger: added support for profiles.
= MultimediaExtractor: added support for more media types.
- 'TextExtractor.FindAll()' method was ignoring the case sensitivity option.
- Fixed issue with junk empty temporary files generated during OCR.
= Improved parsing of PDF documents.
= Other minor fixes and improvements.
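The idea behind 'CSVExtractor.NormalizeCSV' above (every page's CSV data ends up with the same number of columns) can be sketched independently of the SDK in Python. The function name normalize_rows is hypothetical, and padding short rows with empty cells on the right is an assumption. The notes above do not describe how the SDK aligns the columns.

```python
def normalize_rows(pages):
    """Pad every row on every page to the widest row found across all pages.

    pages: list of pages, each page a list of rows, each row a list of cell strings.
    """
    width = max((len(row) for page in pages for row in page), default=0)
    # Assumption: missing cells are added as empty strings on the right.
    return [[row + [""] * (width - len(row)) for row in page] for page in pages]
```

With this approach a two-column page and a three-column page both come out three columns wide, so the pages can be concatenated into one consistent CSV file.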
...