The software release for DocSight OCR 4.2.0 (18.104.22.16824) includes new features.
The following software is required to successfully use DocSight OCR.
- Windows Server™ 2008 R2, 2012 R2 or 2016
- Microsoft® .NET Framework 4.6.2 (installed automatically if it is not detected)
- Supported browsers (for DocSight Verifier): Chrome and Firefox
- Pentium 1.6 GHz or higher processor (Intel Core or higher CPU is recommended)
- 4 GB minimum RAM; 6 GB recommended for grayscale or color images, and more for multithreaded applications
- 1 GB of free hard disk space
- 2 GB minimum RAM; 4 GB recommended
- 600 MB of free hard disk space
Note: If you are installing an ActivePDF product on a Windows 2012 R2 server for the first time, you must first download and install two Microsoft updates for Windows 2012 R2 servers. The updates resolve issues with the Microsoft Visual C++ Redistributable Runtime Components. For links and step-by-step instructions, see the ActivePDF Knowledge Base article "Installing Products on Windows 2012 R2 Servers".
DocSight OCR 4.2.0 has new features available through the Configuration Manager.
- Character Filter: Specify which characters are searchable in the output PDF by using the Character Filter option in the OCR Profiles General tab. The option applies to the Searchable PDF (Image over Text) OCR Type. You can make all characters searchable (the default), or restrict the search to numbers only, case-sensitive words, or punctuation.
- Auto Detect Language: OCR can automatically detect the language used for word recognition. Select the Auto Detect Language check box in the OCR Profiles Character Recognition tab to have OCR recognize the language of your input document.
Note: Install the corresponding language font locally for auto detect to work. For example, if OCR detects the document's language as Japanese, OCR requires a Japanese font to correctly process the document characters.
- File Mask: Create a filter to ignore a file during processing. Enter a file name, such as Thumbs.db, and OCR ignores that file during conversion, but processes all other files in the Input folder.
Note: The text box accepts one specific file name only. Wildcard syntax is not supported; for example, entering *.txt does not mask all .txt files.
- Document Confidence Level: When Debug is enabled, the log now reports the confidence level for the entire document as a percentage. The log also reports the number of suspicious characters out of the total number of characters in the document.
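The File Mask behavior described above (one exact file name, no wildcard expansion) can be illustrated with a minimal Python sketch. This is not DocSight OCR code or its API. The function names `is_masked` and `files_to_process` and the sample file names are hypothetical, and the case-sensitive comparison is an assumption, since the notes do not say how letter case is handled.

```python
from fnmatch import fnmatch


def is_masked(file_name: str, mask: str) -> bool:
    """Return True only when the file name equals the mask exactly.

    Assumption: the comparison is case-sensitive; the release notes
    do not specify this.
    """
    return file_name == mask


def files_to_process(input_files: list[str], mask: str) -> list[str]:
    """Keep every file in the input list except the one masked file."""
    return [name for name in input_files if not is_masked(name, mask)]


# Hypothetical contents of an Input folder.
input_files = ["invoice.tif", "Thumbs.db", "notes.txt", "scan.txt"]

# An exact name masks that one file.
print(files_to_process(input_files, "Thumbs.db"))

# A wildcard is treated as a literal name, so nothing is masked.
print(files_to_process(input_files, "*.txt"))

# For contrast, glob-style matching (which the File Mask does not do)
# would have matched both .txt files.
print([name for name in input_files if fnmatch(name, "*.txt")])
```

Under these assumptions, the first call drops only Thumbs.db, and the second call returns the full list because no file is literally named *.txt.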
Arabic letters now display correctly.
OCR remote conversion for .NET works as expected.
Installation and Getting Started
API information is available in the DocSight OCR User Guide:
OCR Product Page
For more information, go to the DocSight OCR product page: