public interface IRecognizedText
This interface provides access to recognized text. The result is available in several formats: plain text, an array of parts with details on each part, and hOCR.

    OcrEngine ocr = new OcrEngine();
    ocr.getLanguages().addLanguage(Language.load("english"));
    ocr.setImage(ImageStream.fromFile("image.tiff"));
    // Read the results only if recognition succeeded.
    if (ocr.process()) {
        for (IRecognizedBlockInfo recognizedBlockInfo : ocr.getText().getBlocksInfo()) {
            String text = recognizedBlockInfo.getText();
            // Wrap bold and italic blocks in the matching HTML tags.
            if (recognizedBlockInfo.getBold())
                text = String.format("<b>%s</b>", text);
            if (recognizedBlockInfo.getItalic())
                text = String.format("<i>%s</i>", text);
            System.out.println(text);
        }
    }

Modifier and Type | Method and Description |
---|---|
IRecognizedTextPartInfo[] | getPartsInfo(): Gets an array of recognized text by parts. |
java.lang.String | toString(): Gets whole recognized text without formatting. |
IRecognizedTextPartInfo[] getPartsInfo()
Each part has its own style, font, text size, color, language, and other attributes. If the text consists of several parts written in different fonts (or in different languages, etc.), each of them has its own element in this array. A long run of text with a uniform style is split into parts by words. The parts appear in the array in the same order as in the original text.
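Because uniformly styled text is split into word-sized parts, a caller may want to join neighbouring parts back into runs. The following is a minimal sketch of that idea, not part of the library: `Part` is a hypothetical stand-in for IRecognizedTextPartInfo, and its `getText()`, `getBold()` and `getItalic()` accessors are assumed by analogy with the block example above rather than confirmed for this interface.

```java
import java.util.ArrayList;
import java.util.List;

public class PartRunsSketch {
    // Hypothetical stand-in for IRecognizedTextPartInfo; only three assumed accessors are modeled.
    static final class Part {
        private final String text;
        private final boolean bold;
        private final boolean italic;

        Part(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }

        String getText() { return text; }
        boolean getBold() { return bold; }
        boolean getItalic() { return italic; }
    }

    // Concatenates consecutive parts that share the same bold/italic flags,
    // visiting the parts in array order so the original text order is kept.
    static List<String> mergeRuns(Part[] parts) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Part previous = null;
        for (Part p : parts) {
            boolean sameStyle = previous != null
                    && previous.getBold() == p.getBold()
                    && previous.getItalic() == p.getItalic();
            // A style change closes the run collected so far.
            if (!sameStyle && current.length() > 0) {
                runs.add(current.toString());
                current.setLength(0);
            }
            current.append(p.getText());
            previous = p;
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        Part[] parts = {
            new Part("Plain ", false, false),
            new Part("text ", false, false),
            new Part("then ", true, false),
            new Part("bold", true, false)
        };
        System.out.println(mergeRuns(parts)); // prints [Plain text , then bold]
    }
}
```

With a real engine, the array returned by getPartsInfo() would take the place of the hand-built `parts` array, provided the part type exposes equivalent accessors.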
java.lang.String toString()
Overrides: toString in class java.lang.Object