public final class OcrEngine
extends java.lang.Object
The main Aspose.OCR class. Most users will work with an instance of this class.
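OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
ocr.getLanguages().addLanguage(Language.load("english"));
// Resource argument kept from the original example; the method summary lists
// only FolderStream and ZipFileStream overloads of setResource.
ocr.setResource(new FileStream(resourceFileName, FileMode.Open));
if (ocr.process()) {
    System.out.println(ocr.getText());
}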
Constructor and Description |
---|
OcrEngine(): Initializes a new instance of the OcrEngine class. |
Modifier and Type | Method and Description |
---|---|
void | addNotifier(INotifier processor): Adds a notifier. |
void | addRecognitionBlock(IRecognitionBlock recognitionBlock): Adds a recognition region. |
void | clearNotifies(): Clears the notifier list. |
void | clearRecognitionBlocks(): Clears the recognition block array. |
float | getConfidence(): Gets the confidence of the recognized text. |
OCRConfig | getConfig(): Gets the configuration. |
boolean | getDetectTextOnly(): Gets a value indicating whether automatic detection of text regions must be used. |
LanguageContainer | getLanguages(): Gets the container of recognition languages. |
Page[] | getPages(): Gets the recognized text divided into pages. |
java.lang.String | getProbabilitySymbols(): Gets the next most probable symbols for the recognized text. |
boolean | getProcessAllPages(): Gets a value indicating whether all frames in the image must be processed. |
IRecognizedText | getText(): Gets the recognized text. |
boolean | process(): Runs the recognition process. |
void | setConfig(OCRConfig value): Sets the configuration. |
void | setDetectTextOnly(boolean value): Sets a value indicating whether automatic detection of text regions must be used. |
void | setImage(IImageStream value): Sets the picture to recognize the text from. |
void | setProcessAllPages(boolean value): Sets a value indicating whether all frames in the image must be processed. |
void | setResource(FolderStream value): Sets the resource stream. |
void | setResource(ZipFileStream value): Sets the resource stream. |
public void addNotifier(INotifier processor)
Adds a notifier.
Each notifier raises an event when its condition is met (for example, when a word or a block of several characters has been recognized). You can add multiple notifiers.
[C#]
OcrEngine ocr = new OcrEngine();
ocr.Image = ImageStream.fromFile(@"image.tiff");
ocr.Languages.AddLanguage(Language.Load("english"));
INotifier wordNotifier = Notifier.Word();
wordNotifier.Elapsed += delegate { Console.WriteLine("1!" + wordNotifier.Text); };
INotifier blockNotifier = Notifier.Block(1024);
blockNotifier.Elapsed += delegate { Console.WriteLine("2!" + blockNotifier.Text); };
ocr.AddNotifier(wordNotifier);
ocr.AddNotifier(blockNotifier);
using (ocr.Resource = new FileStream(resourceFileName, FileMode.Open))
{
    if (ocr.Process())
    {
        Console.WriteLine(ocr.Text);
    }
}
processor - The processor to add.
public void addRecognitionBlock(IRecognitionBlock recognitionBlock)
Adds a recognition region. When a region is added, the DetectTextOnly property is automatically switched to "false".
OcrEngine supports several kinds of recognition block: a rectangular block, a page, or a rectangular block on a particular page. Several recognition blocks may be defined for one recognition process, and after recognition the result for each block can be read separately. When a block is not valid, an empty string is returned and no error is raised.
[C#]
OcrEngine ocr = new OcrEngine();
ocr.Image = ImageStream.fromFile(@"image.tiff");
ocr.Languages.AddLanguage(Language.Load("english"));
IRecognitionBlock rectangleRecognitionBlock = RecognitionBlock.fromRectangle(0, 0, 100, 200);
ocr.AddRecognitionBlock(rectangleRecognitionBlock);
IRecognitionBlock pageRecognitionBlock = RecognitionBlock.fromPageBlock(4);
ocr.AddRecognitionBlock(pageRecognitionBlock);
using (ocr.Resource = new FileStream(resourceFileName, FileMode.Open))
{
    if (ocr.Process()) { }
}
Console.WriteLine(rectangleRecognitionBlock.Text);
Console.WriteLine(pageRecognitionBlock.Text);
recognitionBlock - The region to add.
public void clearNotifies()
Clears the notifier list.
public void clearRecognitionBlocks()
Clears the recognition block array.
public float getConfidence()
Gets the confidence of the recognized text. The confidence value is the minimum probability over all recognized text symbols.
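To illustrate the "minimum probability" rule, here is a minimal sketch in plain Java. It does not call Aspose.OCR: the class ConfidenceSketch, the method minConfidence, and the sample probabilities are hypothetical, and returning 0 for empty input is an assumption rather than documented behavior.

```java
// Hypothetical stand-in for illustration only; not part of Aspose.OCR.
final class ConfidenceSketch {

    /**
     * Returns the smallest per-symbol probability.
     * Returning 0 for an empty array is an assumption made for this sketch.
     */
    static float minConfidence(float[] symbolProbabilities) {
        if (symbolProbabilities.length == 0) {
            return 0f;
        }
        float min = symbolProbabilities[0];
        for (float p : symbolProbabilities) {
            min = Math.min(min, p);
        }
        return min;
    }

    public static void main(String[] args) {
        // One poorly recognized symbol pulls the overall confidence down.
        float[] probabilities = {0.98f, 0.91f, 0.42f, 0.99f};
        System.out.println(minConfidence(probabilities)); // prints 0.42
    }
}
```

Under this reading, a single uncertain symbol determines the confidence of the whole text, so a low value does not necessarily mean that most of the text is wrong.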
public OCRConfig getConfig()
Gets configuration.
public boolean getDetectTextOnly()
Gets a value indicating whether automatic detection of text regions must be used. If this property is set to "true", manually added blocks are ignored.
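The interaction between DetectTextOnly and manually added blocks, as described here and under addRecognitionBlock, can be modeled roughly as follows. This is a plain-Java sketch, not the library's implementation: BlockModeSketch, blocksUsedForRecognition, the string block descriptions, and the initial value of the flag are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for illustration only; not part of Aspose.OCR.
final class BlockModeSketch {
    // The real default is not stated in this page; false is assumed here.
    private boolean detectTextOnly = false;
    private final List<String> blocks = new ArrayList<>();

    void setDetectTextOnly(boolean value) {
        detectTextOnly = value;
    }

    boolean getDetectTextOnly() {
        return detectTextOnly;
    }

    /** Adding a block switches automatic detection off, as the text describes. */
    void addRecognitionBlock(String block) {
        blocks.add(block);
        detectTextOnly = false;
    }

    /** While automatic detection is on, manually added blocks are ignored. */
    List<String> blocksUsedForRecognition() {
        return detectTextOnly
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(blocks);
    }

    public static void main(String[] args) {
        BlockModeSketch engine = new BlockModeSketch();
        engine.setDetectTextOnly(true);
        engine.addRecognitionBlock("rect(0,0,100,200)");
        System.out.println(engine.getDetectTextOnly());        // prints false

        engine.setDetectTextOnly(true);                         // turn detection back on
        System.out.println(engine.blocksUsedForRecognition()); // prints []
    }
}
```

The order of calls matters in this model: setting the flag to "true" after adding blocks hides them again, while adding a block after setting the flag turns the flag off.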
public LanguageContainer getLanguages()
Gets the container of recognition languages. It lets the user define the list of languages to use for recognition, and several languages may be added. This property must be set before recognition or specified as one of the arguments of the “Process” call.
When multiple languages are used, the text is recognized word by word, and each recognized word is assigned a specific language. Recognition languages have a priority order: a language added to the collection earlier has a higher priority. If a word is identical in several languages, the language that was added earliest is selected.
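The priority rule can be pictured with a small, self-contained model. This sketch does not use LanguageContainer or any Aspose.OCR type: LanguagePrioritySketch, its methods, and the tiny word lists are hypothetical, and the real engine's word matching is certainly more involved than a set lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical model for illustration only; not part of Aspose.OCR.
final class LanguagePrioritySketch {
    // LinkedHashMap iterates in insertion order, so earlier languages come first.
    private final Map<String, Set<String>> dictionaries = new LinkedHashMap<>();

    void addLanguage(String name, String... words) {
        dictionaries.put(name, new HashSet<>(Arrays.asList(words)));
    }

    /** Returns the first (highest-priority) language whose word list contains the word. */
    Optional<String> languageOf(String word) {
        for (Map.Entry<String, Set<String>> entry : dictionaries.entrySet()) {
            if (entry.getValue().contains(word)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        LanguagePrioritySketch languages = new LanguagePrioritySketch();
        languages.addLanguage("english", "hand", "tree");
        languages.addLanguage("german", "hand", "baum");

        // "hand" exists in both lists; the language added first wins.
        System.out.println(languages.languageOf("hand").get()); // prints english
        System.out.println(languages.languageOf("baum").get()); // prints german
    }
}
```

In practice this suggests adding the language expected to dominate the document first, so that ambiguous words are attributed to it.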
public Page[] getPages()
Gets the recognized text divided into pages. This property is available only after recognition is complete; otherwise an exception is raised.
OcrException - Thrown when used before recognition is complete.
public java.lang.String getProbabilitySymbols()
Gets the next most probable symbols for the recognized text. The probability level depends on the ocrEngine.Config.ProbabilityRow property. The result also depends on ocrEngine.Config.UseDefaultDictionaries: if it is set to "true" and a recognized word was confirmed by dictionary validation, the probability symbols for that word are always the same as the word itself.
public boolean getProcessAllPages()
Gets a value indicating whether all frames in image must be processed.
public IRecognizedText getText()
Gets the recognized text. This property is available only after recognition is complete; otherwise an exception is raised.
OcrException - Thrown when used before recognition is complete.
public boolean process()
Runs the recognition process.
OcrEngine must be configured before this method is called; otherwise an exception is thrown. Before calling it, add at least one language to Languages and set the image. Once the method has run, the recognized text is available from the Text property.
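The call order described above (configure, process, then read the text) can be sketched as a tiny state model. Nothing here is Aspose.OCR code: EngineStateSketch is hypothetical, IllegalStateException stands in for the library's OcrException, and the "recognized" text is a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not part of Aspose.OCR.
final class EngineStateSketch {
    private String image;                                  // null until an image is set
    private final List<String> languages = new ArrayList<>();
    private String text;                                   // null until process() has run

    void setImage(String path) {
        image = path;
    }

    void addLanguage(String name) {
        languages.add(name);
    }

    /** Fails fast when the image or the language list is missing. */
    boolean process() {
        if (image == null || languages.isEmpty()) {
            // Stand-in for the OcrException mentioned in the documentation.
            throw new IllegalStateException("engine is not configured");
        }
        text = "recognized text from " + image;            // placeholder result
        return true;
    }

    /** Reading the text before process() is an error in this model. */
    String getText() {
        if (text == null) {
            throw new IllegalStateException("getText() called before process()");
        }
        return text;
    }

    public static void main(String[] args) {
        EngineStateSketch engine = new EngineStateSketch();
        engine.setImage("image.tiff");
        engine.addLanguage("english");
        if (engine.process()) {
            System.out.println(engine.getText()); // prints recognized text from image.tiff
        }
    }
}
```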
OcrException - Thrown if the instance is not configured.
public void setConfig(OCRConfig value)
Sets configuration.
public void setDetectTextOnly(boolean value)
Sets a value indicating whether automatic detection of text regions must be used. If this property is set to "true", manually added blocks are ignored.
public void setImage(IImageStream value)
Sets the picture to recognize the text from. This property must be set before recognition or specified as one of the arguments of the “Process” call.
public void setProcessAllPages(boolean value)
Sets a value indicating whether all frames in image must be processed.
public void setResource(FolderStream value)
Sets the resource stream. This property must be set before recognition or specified as one of the arguments of the “Process” call.
public void setResource(ZipFileStream value)
Sets the resource stream. This property must be set before recognition or specified as one of the arguments of the “Process” call.