Optical Character Recognition
17-Mar-2017
Generate text from screenshots using the free program "tesseract-ocr".
"tesseract-ocr" is a capable free program for optical character recognition (OCR). I used it to generate German text from screenshots. Here is how to proceed:
Installation
- Learn about tesseract-ocr: tesseract/wiki
- Download it from github.com/UB-Mannheim. I got version 3.05.
- Download the language file "deu.traineddata" for German from tessdata/tree.
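Once the files are in place, a quick check from the shell can confirm that tesseract actually sees the German data. The following is only a minimal sketch and not part of my original setup: `TESSERACT` is a hypothetical variable holding the path to the executable, and `has_lang` is a helper name chosen for illustration. It relies on the `--list-langs` option, which prints a header line followed by one language code per line.

```shell
# Assumption: TESSERACT points at the executable; adjust it to your install location.
TESSERACT="${TESSERACT:-tesseract}"

# has_lang LANG: succeed if LANG appears in the list of installed languages.
has_lang() {
    # Merge stderr into stdout (some versions print the list there) and
    # drop carriage returns that Windows builds may emit.
    "$TESSERACT" --list-langs 2>&1 | tr -d '\r' | grep -qx "$1"
}
```

For example, `has_lang deu && echo "German data found"` prints the message only if "deu" is listed.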
Usage
- Use cygwin on Windows
- Get help:
$PATH/tesseract.exe -h
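Here $PATH stands for the path to the directory with tesseract.exe (a placeholder, not the shell's own PATH variable).
- Generate text from image:
$PATH/tesseract.exe test1.tif test1 -l deu
The second argument is the output base name, so the recognized text is written to test1.txt. The option "-l deu" selects the German language data.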
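With more than a handful of screenshots, the same command can be wrapped in a loop. The sketch below is one possible way to do this, under stated assumptions: `TESSERACT_DIR` and `TESSERACT` are hypothetical variables replacing the $PATH placeholder used above, the default directory is only a guess at a typical Windows install location, and `ocr_dir` is a helper name made up for this example. It assumes the screenshots are saved as .tif files.

```shell
# Assumption: a typical install location seen from cygwin; change it to yours.
TESSERACT_DIR="${TESSERACT_DIR:-/cygdrive/c/Program Files/Tesseract-OCR}"
TESSERACT="${TESSERACT:-$TESSERACT_DIR/tesseract.exe}"

# ocr_dir DIR [LANG]: run tesseract on every .tif file in DIR.
# LANG defaults to "deu" (German), as in the command above.
ocr_dir() {
    dir="$1"
    lang="${2:-deu}"
    for img in "$dir"/*.tif; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$img" ] || continue
        base="${img%.tif}"                       # output base name without extension
        "$TESSERACT" "$img" "$base" -l "$lang"   # writes "$base.txt"
    done
}
```

Calling `ocr_dir ./screenshots` would then leave one .txt file next to each .tif file in that directory.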