OCR PDF and convert to searchable document in C# and VB.NET

This sample shows how to create a searchable PDF document from a non-searchable (image-based) document using the Docotic.Pdf library and the Tesseract OCR engine.

Follow these steps to do OCR when a PDF page does not contain searchable text:

1. Save the page as a high-resolution image using Docotic.Pdf. Higher resolution generally leads to better recognition quality.
2. Recognize the text in the image using the Tesseract OCR engine.
3. Insert the recognized text chunks back into the PDF using Docotic.Pdf.
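
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the file names `input.pdf` and `output.pdf`, the `tessdata` folder location, and the 200 DPI value are assumptions, and the member names are written from memory of the public Docotic.Pdf and Tesseract .NET wrapper APIs, so check them against the versions you have installed. The sketch also ignores page rotation and crop box offsets, and it relies on the default canvas font.

```csharp
using System.IO;
using BitMiracle.Docotic.Pdf;
using Tesseract;

class OcrSketch
{
    static void Main()
    {
        const int Dpi = 200;              // assumed rendering resolution
        const double Scale = 72.0 / Dpi;  // image pixels -> PDF points

        // "input.pdf" and "tessdata" are placeholder paths (assumptions).
        using (var pdf = new PdfDocument("input.pdf"))
        using (var engine = new TesseractEngine("tessdata", "eng", EngineMode.LstmOnly))
        {
            foreach (PdfPage page in pdf.Pages)
            {
                // Skip pages that already contain searchable text.
                if (!string.IsNullOrWhiteSpace(page.GetText()))
                    continue;

                // Step 1: render the page to an image at the chosen resolution.
                PdfDrawOptions options = PdfDrawOptions.Create();
                options.BackgroundColor = new PdfRgbColor(255, 255, 255);
                options.HorizontalResolution = Dpi;
                options.VerticalResolution = Dpi;

                using (var stream = new MemoryStream())
                {
                    page.Save(stream, options);

                    // Step 2: recognize the rendered image with Tesseract.
                    using (Pix image = Pix.LoadFromMemory(stream.ToArray()))
                    using (Page recognized = engine.Process(image))
                    using (ResultIterator iter = recognized.GetIterator())
                    {
                        // Step 3: draw each recognized word as invisible text
                        // at the position where it appears in the image.
                        PdfCanvas canvas = page.Canvas;
                        canvas.TextRenderingMode = PdfTextRenderingMode.NeitherFillNorStroke;

                        iter.Begin();
                        do
                        {
                            string word = iter.GetText(PageIteratorLevel.Word);
                            if (string.IsNullOrWhiteSpace(word))
                                continue;
                            if (!iter.TryGetBoundingBox(PageIteratorLevel.Word, out Rect box))
                                continue;

                            // Approximate the font size from the word's box height.
                            canvas.FontSize = box.Height * Scale;
                            canvas.DrawString(box.X1 * Scale, box.Y1 * Scale, word);
                        } while (iter.Next(PageIteratorLevel.Word));
                    }
                }
            }

            pdf.Save("output.pdf");
        }
    }
}
```

The text is drawn with a rendering mode that neither fills nor strokes the glyphs, so the page keeps its original appearance while the text becomes selectable and searchable.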

If your documents contain text in languages other than English, provide the Language Data Files for Tesseract 4.00 for those languages.
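
As a hedged illustration, the snippet below creates an engine for a document that mixes English and German. It assumes that `eng.traineddata` and `deu.traineddata` have been placed in a `tessdata` folder next to the executable, and that the Tesseract .NET wrapper accepts the same `+`-joined language codes as Tesseract itself. The folder name and the choice of German are examples only.

```csharp
using Tesseract;

class LanguageSketch
{
    static void Main()
    {
        // "tessdata" is an assumed folder that must contain
        // eng.traineddata and deu.traineddata.
        using (var engine = new TesseractEngine("tessdata", "eng+deu", EngineMode.LstmOnly))
        {
            // Use engine.Process(...) here as for an English-only document.
        }
    }
}
```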
