
Tutorial Install and Use Tesseract OCR on Debian 11

Richard 7 Min Read

Tesseract is widely regarded as one of the most accurate open-source OCR engines, and Google has sponsored its development since 2006. Its feature set is more limited than that of commercial software such as Adobe Acrobat Pro and ABBYY FineReader. In this article, you will learn how to install and use Tesseract OCR on Debian 11. You can visit the packages available in Eldernode if you wish to purchase a Linux VPS server.

How to Install and Use Tesseract OCR on Debian Linux

Introduction to Tesseract OCR

Tesseract is a free and open-source optical character recognition (OCR) engine that runs from the command line.

 

How Tesseract analyzes documents

  • The user gives Tesseract an input image, a name for the output document, and the desired output format.
  • Tesseract analyzes the image and creates a new, searchable document in that format.
  • Tesseract cannot scan a page itself; it only works on image files that already exist.
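
On the command line, those three inputs map onto three arguments. The following is a minimal sketch rather than a fixed recipe: the ocr_to helper is a hypothetical name introduced here for illustration, while txt, pdf and hocr are output configurations that ship with Tesseract.

```shell
#!/bin/sh
# ocr_to (hypothetical helper): run Tesseract on an existing image file.
# Arguments: input image, output base name (no extension), format.
ocr_to() {
    image=$1
    base=$2
    format=${3:-txt}
    # Accept only the three output formats used in this sketch.
    case "$format" in
        txt|pdf|hocr) ;;
        *) echo "unsupported format: $format" >&2; return 2 ;;
    esac
    # Tesseract reads image files; it does not talk to a scanner.
    if [ ! -f "$image" ]; then
        echo "no such image: $image" >&2
        return 1
    fi
    # Tesseract appends the extension itself, so pass only the base name.
    tesseract "$image" "$base" "$format"
}
```

For example, ocr_to scan.png report pdf should write a searchable report.pdf, assuming scan.png exists in the current directory.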

Install Tesseract OCR on Debian 11 | Debian 10

First, refresh the package index with the following command:

sudo apt update

Then install Tesseract on Debian 11 by executing the following command:

sudo apt install tesseract-ocr

The tesseract binary is installed in /usr/bin, and its language data files are placed under /usr/share/tesseract-ocr/4.00/tessdata.
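
To confirm that the installation worked, it can help to print the version and check which language data is available. The sketch below assumes the Debian package layout; has_lang is a hypothetical helper name, and tesseract-ocr-eng is the Debian package that carries the English data.

```shell
#!/bin/sh
# has_lang (hypothetical helper): succeed if the given language code
# appears as its own line in the output of `tesseract --list-langs`.
has_lang() {
    tesseract --list-langs 2>&1 | grep -qx "$1"
}

if command -v tesseract >/dev/null 2>&1; then
    # The first line of the version output names the Tesseract release.
    tesseract --version 2>&1 | head -n 1
    if has_lang eng; then
        echo "English data is installed"
    else
        echo "English data is missing; try: sudo apt install tesseract-ocr-eng"
    fi
else
    echo "tesseract not found; install it with: sudo apt install tesseract-ocr"
fi
```

Other languages follow the same package naming pattern, for example tesseract-ocr-deu for German.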

The convert command can convert between image formats and resize, blur, crop, despeckle, dither, draw on, flip, join, and resample images, among other operations, which makes it useful for preparing images for OCR. It is provided by ImageMagick, which you can install with the following command:

sudo apt install imagemagick
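
As an illustration of how convert might be used before OCR, the sketch below writes a grayscale, enlarged, despeckled copy of an image. The helper names clean_name and preprocess are hypothetical, and the particular options and the 200% factor are assumptions chosen for the example, not settings recommended by Tesseract or ImageMagick.

```shell
#!/bin/sh
# clean_name (hypothetical helper): derive an output name such as
# page-clean.png from an input name such as page.jpg.
clean_name() {
    printf '%s-clean.png\n' "${1%.*}"
}

# preprocess (hypothetical helper): write a grayscale, upscaled and
# despeckled PNG copy of the source image and print its name.
preprocess() {
    src=$1
    dst=${2:-$(clean_name "$src")}
    if [ ! -f "$src" ]; then
        echo "no such image: $src" >&2
        return 1
    fi
    convert "$src" -colorspace Gray -resize 200% -despeckle "$dst" && echo "$dst"
}
```

Calling preprocess page.jpg would then produce page-clean.png next to the original, which can be passed to Tesseract instead of the raw image.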

Now test Tesseract. Find an image that contains text and run the following command:

tesseract <image_name> <output_file_name>

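Tesseract extracts the text from the image and saves it to a text file, adding the .txt extension to the output name you supplied. You can then open the result in any text editor or word processor. Tesseract is designed for printed text; you have to train it before it can recognize handwriting.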
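
If no suitable image is at hand, one can be generated for a quick end-to-end check. This is only a sketch under a few assumptions: both tesseract and convert are installed, ImageMagick can find a default font for its label: input, and the sample text and point size are arbitrary choices.

```shell
#!/bin/sh
# Render a line of text to an image, OCR it, and count the recognized words.
if command -v tesseract >/dev/null 2>&1 && command -v convert >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    # Draw black text on a white background as a clean test image.
    convert -background white -fill black -pointsize 48 \
        label:'Hello Tesseract' "$workdir/sample.png"
    # Tesseract writes its result to sample.txt in the same directory.
    tesseract "$workdir/sample.png" "$workdir/sample"
    cat "$workdir/sample.txt"
    # Number of words Tesseract recognized.
    wc -w < "$workdir/sample.txt"
    rm -rf "$workdir"
else
    echo "tesseract and convert are both required for this check" >&2
fi
```

Comparing the printed text with the text that was rendered gives a rough idea of whether the installation is working.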

Installing Tesseract from Source

On other Linux distributions, or when you want a newer version than the packaged one, you can build Tesseract from source. Start by cloning the repository:

git clone https://github.com/tesseract-ocr/tesseract.git

Now change into the tesseract directory:

cd tesseract
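
The build needs a compiler toolchain and the Leptonica development files, and the training tools need a few extra libraries. The sketch below checks for them with dpkg. The package lists follow the Tesseract build instructions for Debian-based systems as best recalled, so treat them as an assumption and adjust them if configure reports something missing; missing_pkgs is a hypothetical helper name.

```shell
#!/bin/sh
# Packages commonly needed to build Tesseract on Debian (assumed list).
BUILD_DEPS="automake ca-certificates g++ git libtool libleptonica-dev make pkg-config"
# Extra libraries needed only for the training tools (assumed list).
TRAINING_DEPS="libicu-dev libpango1.0-dev libcairo2-dev"

# missing_pkgs (hypothetical helper): print each named package that
# dpkg does not report as installed.
missing_pkgs() {
    for pkg in "$@"; do
        dpkg -s "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Word splitting of the two lists is intentional here.
missing=$(missing_pkgs $BUILD_DEPS $TRAINING_DEPS)
if [ -n "$missing" ]; then
    echo "Install the missing packages with:"
    echo "sudo apt install" $missing
else
    echo "All build dependencies are present"
fi
```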

At this point, run the autogen.sh script, which generates the configure script:

./autogen.sh

Next, run configure to check the build environment and generate the Makefiles:

./configure

Then compile Tesseract:

make

Next, install the compiled binaries and libraries:

sudo make install

Then run ldconfig to refresh the shared library cache:

sudo ldconfig

Now you need to compile the training tools. To do this, run the following command:

sudo make training

Finally, install the training tools:

sudo make training-install
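
After a source build, it is worth checking that the new binary runs and that language data is available, because the build does not install any traineddata files by itself. The sketch below assumes the default /usr/local prefix and a tessdata directory of /usr/local/share/tessdata; both paths, and the check_build helper name, are assumptions that may differ on your system.

```shell
#!/bin/sh
# Assumed default locations for a source build with the /usr/local prefix.
PREFIX=${PREFIX:-/usr/local}
TESSDATA_DIR=${TESSDATA_PREFIX:-$PREFIX/share/tessdata}

# check_build (hypothetical helper): report whether the binary and the
# English language data are where this sketch expects them.
check_build() {
    if [ -x "$PREFIX/bin/tesseract" ]; then
        "$PREFIX/bin/tesseract" --version 2>&1 | head -n 1
    else
        echo "no tesseract binary under $PREFIX/bin"
    fi
    if [ -f "$TESSDATA_DIR/eng.traineddata" ]; then
        echo "found eng.traineddata in $TESSDATA_DIR"
    else
        echo "eng.traineddata not found in $TESSDATA_DIR"
    fi
}

check_build
```

If the language file is reported as missing, download eng.traineddata from one of the Tesseract tessdata repositories and place it in the tessdata directory before running OCR.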

Conclusion

This article taught you how to install and use Tesseract on Debian 11. We hope this article was useful for you.
