Tesseract on Windows

Originally Posted On: https://medium.com/@ahmetxgenc/how-to-use-tesseract-on-windows-fe9d2a9ba5c6

Tesseract is an open-source optical character recognition (OCR) engine. It was originally developed at Hewlett-Packard, and its development was later sponsored by Google. There are many versions of Tesseract, but we will use version 4.0.

In version 4, Tesseract added a recognition engine based on Long Short-Term Memory (LSTM), a kind of recurrent neural network (RNN). The LSTM-based engine is generally more accurate than the legacy engine, which works by recognizing character patterns.

Thanks to Tesseract, we will be able to save the text in our images to text files.

Installation

If you’d prefer skipping the setup entirely, IronOCR offers a simpler approach for .NET developers. It installs via NuGet with no PATH configuration or external dependencies. Just add the package and start extracting text.

using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput("receipt.png");
var result = ocr.Read(input);
Console.WriteLine(result.Text);

IronOCR also handles low-DPI images and preprocessing automatically, which addresses the optimization challenges mentioned later in this article.

That said, Tesseract remains a solid choice if you need a free, open-source solution and don’t mind the configuration steps. Let’s continue with the Windows setup.

The installation depends on your operating system; here we will go through the Windows setup. First, download the Tesseract installer from this link (it downloads an .exe file) and run it to install Tesseract.

After that, we need to add Tesseract to the PATH in the Windows system variables. This is an easy step. First, find and copy the root folder of the Tesseract installation. It should look like this:

C:\Program Files\Tesseract-OCR

Then type "Advanced system settings" into the Windows search bar, open it, and follow this path:

Advanced system settings > Advanced > Environment Variables > Path > Edit > New

Paste the folder path you copied and save the configuration. Command prompts that are already open will not see the change, so open a new one; if the command is still not found, reboot the computer.

The Tesseract installation is now complete. You can confirm it from the command line: running the tesseract command should print information about the program.
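The same check can be scripted. The sketch below is not from the original article: it assumes Python is already installed and uses only the standard library. The helper name tesseract_version is made up for illustration, while --version is a standard Tesseract flag.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version(command: str = "tesseract") -> Optional[str]:
    """Return the first line of `<command> --version`, or None if it is not on PATH."""
    exe = shutil.which(command)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds print the version banner to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    print(tesseract_version() or "tesseract was not found on PATH")
```

If this prints the "not found" message in a freshly opened terminal, the PATH entry is probably missing or points to the wrong folder.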

Now we can move on to the Python part. To use Tesseract from Python, we need the pytesseract library, a wrapper that calls the Tesseract executable. It can be installed with pip into the environment you are using.

pip install pytesseract

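Now Tesseract is ready to use! The code below also imports Pillow, OpenCV, and NumPy; if they are missing, they can be installed the same way (pip install pillow opencv-python numpy).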
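If you would rather not edit PATH, pytesseract can be pointed at the executable directly through its tesseract_cmd attribute. The sketch below assumes the default install folder shown earlier; the helper names tesseract_exe and configure_pytesseract are illustrative and not part of pytesseract.

```python
from pathlib import PureWindowsPath

# Install folder from the article; adjust it if you installed elsewhere.
INSTALL_DIR = r"C:\Program Files\Tesseract-OCR"


def tesseract_exe(install_dir: str = INSTALL_DIR) -> str:
    """Return the full path of tesseract.exe inside the install folder."""
    return str(PureWindowsPath(install_dir) / "tesseract.exe")


def configure_pytesseract(install_dir: str = INSTALL_DIR) -> str:
    """Tell pytesseract where the executable lives instead of relying on PATH."""
    # Imported here so that tesseract_exe() can be used without pytesseract.
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = tesseract_exe(install_dir)
    return pytesseract.pytesseract.tesseract_cmd
```

Calling configure_pytesseract() once, before any OCR call, should be enough for the rest of the script.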

Coding

It’s really simple to use Tesseract. The hard part is optimizing the settings: for successful OCR, you need to be careful with both the image processing step and the OCR settings.

Let’s apply OCR to the receipt.

Importing The Libraries

import pytesseract
from PIL import Image
import cv2
import numpy as np

Setting DPI Value of Image

Dots per inch (DPI, or dpi) is a measure of the dot density of a video display or image scanner. The DPI value matters for OCR: if it is lower than 300, recognition accuracy may suffer.

file_path = 'receipt.jpg'
im = Image.open(file_path)
im.save('ocr.png', dpi=(300, 300))
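One caveat: the dpi argument of Pillow's save() only records the resolution in the file's metadata and does not resample the pixels. To actually gain detail, the image has to be enlarged, as in the scaling step that follows. As a rough rule of thumb (an assumption, not something the original article states), the scale factor can be estimated from the DPI at which the image was really captured. The helper names below are hypothetical.

```python
def scale_for_target_dpi(current_dpi: float, target_dpi: float = 300.0) -> float:
    """Return the resize factor that brings an image up to the target DPI.

    Images already at or above the target are left at their original size.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return max(1.0, target_dpi / current_dpi)


def scaled_size(width: int, height: int, factor: float) -> tuple:
    """Return the pixel dimensions after resizing by the given factor."""
    return round(width * factor), round(height * factor)


# A 150 DPI scan needs to be doubled to match a 300 DPI scan.
factor = scale_for_target_dpi(150)
print(factor, scaled_size(800, 600, factor))
```

The resulting factor could then be used for the fx and fy arguments of cv2.resize.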

Applying Some Techniques to Make Image Cleaner

First, we scale the image by a factor of 2. If the characters are small, the image needs to be enlarged for them to be recognized. After that, we apply a simple thresholding technique, the binary threshold. Try a threshold value of 127 first; after that, other values can be tried. With a binary threshold, a pixel becomes white (255) if its value is above the threshold and black (0) otherwise. If the image is converted to grayscale first, the result is a pure black-and-white image.

There are different thresholding techniques. You can check the source website with this link.

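image = cv2.imread('ocr.png')
# Double the size so that small characters are easier to recognize.
image = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
# Binary threshold at 127 (applied per channel on a color image).
retval, threshold = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)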
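To make the rule concrete, here is a small pure-Python sketch of what cv2.THRESH_BINARY computes for each pixel: values above the threshold become the maximum value (white), and everything else becomes 0 (black). The function binary_threshold is written only for illustration; in practice cv2.threshold does this far faster on whole arrays.

```python
def binary_threshold(pixels, thresh=127, maxval=255):
    """Apply the binary threshold rule to a 2D list of grayscale values."""
    return [
        [maxval if value > thresh else 0 for value in row]
        for row in pixels
    ]


sample = [
    [10, 127, 128],
    [200, 50, 255],
]
print(binary_threshold(sample))
# [[0, 0, 255], [255, 0, 255]]
```

Note that a pixel exactly equal to the threshold (127 here) stays black, because the comparison is strictly greater than.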

Running Tesseract

Now we can run Tesseract. pytesseract has an image_to_string() function, which returns the recognized text as a string.

text = pytesseract.image_to_string(threshold)
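Since the OCR settings matter as much as the image, it may help to see how options are passed. image_to_string() accepts a lang argument and a config string of Tesseract command-line options such as --psm (page segmentation mode, 0 to 13) and --oem (engine mode, 0 to 3). The sketch below is an assumption about reasonable starting values, not the author's configuration, and the helpers build_config and run_ocr are hypothetical names.

```python
def build_config(psm: int = 6, oem: int = 1) -> str:
    """Build the extra command-line options passed to Tesseract.

    psm 6 treats the image as a single uniform block of text;
    oem 1 selects the LSTM engine only.
    """
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    if not 0 <= oem <= 3:
        raise ValueError("oem must be between 0 and 3")
    return f"--oem {oem} --psm {psm}"


def run_ocr(image, lang: str = "eng", psm: int = 6, oem: int = 1) -> str:
    """Run Tesseract on an image with the chosen language and modes."""
    # Imported here so that build_config() can be used without pytesseract.
    import pytesseract

    return pytesseract.image_to_string(
        image, lang=lang, config=build_config(psm, oem)
    )


# Example (assumes the German language data is installed):
# text = run_ocr(threshold, lang="deu")
```

The receipt used here appears to be in German, so lang="deu" might improve the result, provided that language data was selected during installation.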

Saving Output

We can save the output with the following code.

with open("Output.txt", "w", encoding="utf-8") as text_file:
    text_file.write(text)

The output of the OCR is as follows:

Berghote 1
Grosse Scheidegg
3818 Grindelsald
Fam ie
Rech. tin, 4572 30. 07, 20077 13:29: 17
Ban Tach — 7/01
Pxlatte Macchiato — 4 4.50 CHF — 9,00
IxGlcki a 5.00 CF — 5.00
IxSchusinschnitze} A 22.00 OF 22.00
IxChasspatz li a 18,50
Total : _ {HF
Incl. 7.6% HuSt — 54.50 CHF: 3.85
Entsnricht in Euro — 36.33 EUR
Es bediente Sig: Ursula
BuSt Nn. : 430 25
Tel.: 033 853 67 16
Fax. : 033 853 67 19
E-mail: grossescn¢idegg@bluswin. ch

The result is largely successful: most of the receipt is readable, although some characters are misrecognized. If higher accuracy is desired, different operations can be applied to the image.
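One such operation is to choose the threshold automatically instead of fixing it at 127. Otsu's method picks the value that best separates dark and light pixels in a grayscale image. The pure-Python sketch below only illustrates the idea and is not what the article's code does; the function name otsu_threshold is hypothetical. OpenCV offers the same technique through the cv2.THRESH_OTSU flag of cv2.threshold, which expects an 8-bit single-channel image.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold (0-255) for an iterable of grayscale values."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        # Background = pixels at or below the candidate threshold.
        weight_bg += histogram[level]
        sum_bg += level * histogram[level]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance: larger means a cleaner split.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


# Dark text pixels (20, 30) against a light background (200, 220).
sample = [20] * 50 + [30] * 50 + [200] * 50 + [220] * 50
print(otsu_threshold(sample))
```

For this sample, the chosen threshold falls between the dark and light groups, so applying the binary rule with it turns the dark values black and the light values white.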

The Tesseract GitHub page was used as a reference.
