Recognize Text & Objects in Graphical Images with PHP

21 05 2008

An OCR in PHP? It doesn't sound like a very common topic for PHP developers, but Andrey Kucherenko from Ukraine has started a very interesting project to build the first one, phpOCR. His classes can recognize text in monochrome graphical images after a training phase. The training phase is needed so the class can build its recognition data structures from images of known characters. Those data structures are then used during recognition to try to identify text in real images using the corner algorithm.

phpOCR won the PHPClasses innovation award for March 2006, and it shows the power of what can be implemented with PHP 5.

Certain types of applications need to read text from documents that are stored as graphical images. Scanned documents are the typical case.

An OCR (Optical Character Recognition) tool can be used to recover the original text that is written in scanned documents. These are sophisticated tools that are trained to recognize text in graphical images.

This class provides a base implementation for an OCR tool. It can be trained to learn how to recognize each letter drawn in an image. Then it can be used to recognize longer texts in real documents.
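
To make the train-then-recognize idea concrete, here is a minimal sketch. It is not phpOCR's API: the class name `TinyOcr` and its methods are hypothetical, glyphs are written as small text bitmaps instead of real image files, and plain nearest-template matching (fewest differing pixels) stands in for the corner algorithm. It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: NOT phpOCR's API and NOT the corner algorithm.
// Glyphs are tiny monochrome bitmaps given as rows of text,
// where '#' is ink and '.' is background.

final class TinyOcr
{
    /** @var array<string, string> label => flattened bitmap */
    private array $templates = [];

    // Training phase: remember the bitmap of a known character.
    public function train(string $label, array $rows): void
    {
        $this->templates[$label] = implode('', $rows);
    }

    // Recognition phase: return the label of the stored template that
    // differs from the input in the fewest pixels, or null if no
    // template of the same size has been trained.
    public function recognize(array $rows): ?string
    {
        $input = implode('', $rows);
        $best = null;
        $bestDistance = PHP_INT_MAX;

        foreach ($this->templates as $label => $template) {
            if (strlen($template) !== strlen($input)) {
                continue; // only compare bitmaps of the same size
            }
            // Bytewise XOR of two strings gives "\0" wherever pixels match.
            $diff = $input ^ $template;
            $distance = strlen($diff) - substr_count($diff, "\0");
            if ($distance < $bestDistance) {
                $bestDistance = $distance;
                $best = (string) $label;
            }
        }

        return $best;
    }
}

$ocr = new TinyOcr();
$ocr->train('I', ['.#.', '.#.', '.#.']);
$ocr->train('L', ['#..', '#..', '###']);
$ocr->train('T', ['###', '.#.', '.#.']);

// An "L" with one missing pixel still lands on the closest template.
echo $ocr->recognize(['#..', '#..', '##.']), "\n"; // prints "L"
```

A real tool would first have to cut the scanned page into individual glyphs and scale them to a common size, which is where most of the work (and most of the errors) tends to be.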



3 responses

24 05 2008
codebudo

While I appreciate the noble efforts of Andrey Kucherenko, writing image processing software in PHP will never match the speeds of a compiled language. This project is still quite young and requires a good deal of effort to actually do anything with. If you’re looking to get something working quickly, GOCR or Tesseract are much better options.

2 09 2009
Dipesh

Hi,

Great work Andrey Kucherenko.
By the way, how do I train the class for empty space?
So do I first have to make an image of all the characters, train the class, and then add the image to process?

20 10 2009
Web design company

Yep, it's great. Very useful.
