TessEngineMBS.AllWordConfidences as Integer()
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
The number of confidences should correspond to the number of space-delimited words in GetText.
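A minimal sketch of how the confidences might be paired with the words from GetText. The tessdata path and the image file name are hypothetical placeholders, and the splitting logic is only one possible reading of "space-delimited":

```xojo
// Sketch only: path and file name below are placeholders, not plugin defaults.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  engine.SetImageFile(SpecialFolder.Desktop.Child("scan.png"))

  // Turn line endings into spaces, then split and drop empty entries.
  Dim flat As String = ReplaceLineEndings(engine.GetText, " ")
  Dim words() As String
  For Each part As String In Split(flat, " ")
    If part <> "" Then words.Append part
  Next

  Dim confidences() As Integer = engine.AllWordConfidences

  // Pair them only if the counts really match.
  If words.Ubound = confidences.Ubound Then
    For i As Integer = 0 To words.Ubound
      System.DebugLog words(i) + ": " + Str(confidences(i))
    Next
  Else
    System.DebugLog "Word count and confidence count differ."
  End If
End If
```

The count check is deliberate: the text above only says the numbers should correspond, so the sketch does not assume they always do.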
TessEngineMBS.AnalyseLayout as TessPageIteratorMBS
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
May optionally be called prior to Recognize to get access to just the page layout results. Returns an iterator to the results.
Returns nil on error or an empty page.
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Afterwards, you must call SetImage or TesseractRect before doing any Recognize or Get* operation.
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Please call Initialize after this to get started.
TessEngineMBS.GetAltoText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
TessEngineMBS.GetAvailableLanguages as String()
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
TessEngineMBS.GetBoolVariable(Name as String, byref value as boolean) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.
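As a rough illustration of the byref pattern shared by GetBoolVariable, GetDoubleVariable and GetIntVariable, the following sketch reads one parameter. The tessdata path is a placeholder, and the parameter name is borrowed from the SetVariable example on this page on the assumption that it is a Boolean parameter:

```xojo
// Sketch only: the tessdata path is a hypothetical placeholder.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  Dim numericMode As Boolean

  // The function result says whether the name was found;
  // the byref argument receives the actual value.
  If engine.GetBoolVariable("classify_bln_numeric_mode", numericMode) Then
    If numericMode Then
      System.DebugLog "classify_bln_numeric_mode is on"
    Else
      System.DebugLog "classify_bln_numeric_mode is off"
    End If
  Else
    System.DebugLog "Parameter not found"
  End If
End If
```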
TessEngineMBS.GetBoxText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Constructs coordinates in the original image - not just the rectangle.
PageNumber is a 0-based page index that will appear in the box file.
TessEngineMBS.GetDoubleVariable(Name as String, byref value as Double) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.
TessEngineMBS.GetHOCRText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
PageNumber is 0-based but will appear in the output as 1-based.
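A short sketch of writing hOCR output for a single image to disk. The paths and file names are hypothetical, and error handling for the output stream is left out:

```xojo
// Sketch only: paths and file names are placeholders.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  engine.SetImageFile(SpecialFolder.Desktop.Child("scan.png"))

  // Pass 0 for the first page; the generated markup numbers it as page 1.
  Dim hocr As String = engine.GetHOCRText(0)

  Dim outFile As FolderItem = SpecialFolder.Desktop.Child("scan.hocr")
  Dim stream As TextOutputStream = TextOutputStream.Create(outFile)
  stream.Write hocr
  stream.Close
End If
```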
TessEngineMBS.GetIntVariable(Name as String, byref value as Integer) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.
TessEngineMBS.GetLoadedLanguages as String()
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Includes all languages loaded by the last Init, including those loaded as dependencies of other loaded languages.
TessEngineMBS.GetLSTMBoxText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Constructs coordinates in the original image - not just the rectangle.
PageNumber is a 0-based page index that will appear in the box file.
TessEngineMBS.GetStringVariable(Name as String) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns the value of the parameter as a string.
Unlike the other Get*Variable methods, the signature has no byref argument and no Boolean result.
TessEngineMBS.GetText as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
TessEngineMBS.GetTsvText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
PageNumber is 0-based but will appear in the output as 1-based.
TessEngineMBS.GetUNLVText as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
TessEngineMBS.GetWordStrBoxText(PageNumber as Integer) as String
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
PageNumber is a 0-based page index that will appear in the box file.
TessEngineMBS.Initialize(dataPath as String, language as String, Mode as Integer = 3, configs() as String = nil) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true on success and false on failure.
dataPath must be the name of the tessdata directory.
The language is (usually) an ISO 639-3 string; an empty string ("") defaults to eng.
It is entirely safe (and eventually will be efficient too) to call Initialize multiple times on the same instance to change language, or just to reset the classifier.
The language may be a string of the form [~]<lang>[+[~]<lang>]* indicating that multiple languages are to be loaded. E.g. hin+eng will load Hindi and English.
Languages may specify internally that they want to be loaded with one or more other languages, so the ~ sign is available to override that. E.g. if hin were set to load eng by default, then hin+~eng would force loading only hin.
The number of loaded languages is limited only by memory, with the caveat that loading additional languages will impact both speed and accuracy, as there is more work to do to decide on the applicable language, and there is more chance of hallucinating incorrect words.
Warning: On changing languages, all Tesseract parameters are reset back to their default values. (Which may vary between languages.)
If you have a rare need to set a Variable that controls initialization for a second call to Init you should explicitly call End() and then use SetVariable before Init. This is only a very rare use case, since there are very few uses that require any parameters to be set before Init.
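The language string described above could be used roughly as follows. This is a sketch under assumptions: the tessdata path is a placeholder, and both hin and eng trained data files are assumed to be present in that directory:

```xojo
// Sketch only: the tessdata path is a hypothetical placeholder.
Dim engine As New TessEngineMBS
Dim dataPath As String = "/path/to/tessdata"

// Mode and configs are left at their declared defaults.
If engine.Initialize(dataPath, "hin+eng") Then
  // Lists everything that was loaded, including dependencies.
  For Each lang As String In engine.GetLoadedLanguages
    System.DebugLog "Loaded: " + lang
  Next
Else
  System.DebugLog "Initialize failed; check the tessdata path and language files."
End If
```

Replacing the language string with "hin+~eng" would, per the description above, keep eng from being loaded as a dependency.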
TessEngineMBS.IsValidWord(Word as String) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns false if the word is invalid, true if valid.
TessEngineMBS.PrintVariablesToFile(File as FolderItem) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true on success.
Fails if the file can't be created.
TessEngineMBS.PrintVariablesToPath(Path as String) as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true on success.
Fails if the file can't be created.
TessEngineMBS.Recognize as Boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true on success.
Optional. The Get*Text functions below will call Recognize if needed.
After Recognize, the output is kept internally until the next SetImage.
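A sketch of an explicit Recognize call followed by two output requests for the same image. The tessdata path and the image file name are placeholders:

```xojo
// Sketch only: path and file name are placeholders.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  engine.SetImageFile(SpecialFolder.Desktop.Child("scan.png"))

  If engine.Recognize Then
    // Both calls are made after a single recognition pass;
    // no new image is set in between.
    Dim plainText As String = engine.GetText
    Dim tsv As String = engine.GetTsvText(0)
    System.DebugLog plainText
    System.DebugLog tsv
  Else
    System.DebugLog "Recognize failed"
  End If
End If
```

Calling Recognize explicitly is optional, but it gives one place to check for failure before asking for text.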
TessEngineMBS.ResultIterator as TessResultIteratorMBS
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Loop over it to query details.
The result iterator is only valid until you end the engine.
TessEngineMBS.SetImage(pic as picture)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Pass a Xojo picture; the plugin copies the pixels.
Any mask or alpha channel is ignored.
TessEngineMBS.SetImageData(Data as MemoryBlock)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Image data can be the content of an image file such as JPEG or PNG.
Supported formats depend on what Leptonica was compiled to support.
TessEngineMBS.SetImageData(Data as String)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Image data can be the content of an image file such as JPEG or PNG.
Supported formats depend on what Leptonica was compiled to support.
TessEngineMBS.SetImageFile(File as FolderItem)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Point to an image file such as JPEG or PNG.
Supported formats depend on what Leptonica was compiled to support.
TessEngineMBS.SetImageFile(Path as String)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Point to an image file such as JPEG or PNG.
Supported formats depend on what Leptonica was compiled to support.
TessEngineMBS.SetRectangle(Left as Integer, Top as Integer, Width as Integer, Height as Integer)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Call after SetImage.
Each SetRectangle clears the recognition results, so multiple rectangles can be recognized with the same image.
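A sketch of recognizing two regions of one image, here simply the top and bottom halves. The tessdata path and the image file name are placeholders:

```xojo
// Sketch only: path and file name are placeholders.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  Dim pic As Picture = Picture.Open(SpecialFolder.Desktop.Child("scan.png"))
  If pic <> Nil Then
    engine.SetImage(pic)

    Dim half As Integer = pic.Height \ 2

    // Top half: Left, Top, Width, Height.
    engine.SetRectangle(0, 0, pic.Width, half)
    Dim topText As String = engine.GetText

    // Bottom half of the same image; no second SetImage call is needed.
    engine.SetRectangle(0, half, pic.Width, pic.Height - half)
    Dim bottomText As String = engine.GetText

    System.DebugLog "Top: " + topText
    System.DebugLog "Bottom: " + bottomText
  End If
End If
```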
TessEngineMBS.SetVariable(Name as String, Value as String)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | OCR | MBS OCR Plugin | 21.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Supply the name of the parameter and the value as a string, just as you would in a config file.
Returns false if the name lookup failed.
E.g. SetVariable("tessedit_char_blacklist", "xyz") to ignore x, y and z, or SetVariable("classify_bln_numeric_mode", "1") to set numeric-only mode.
SetVariable may be used before Init, but settings will revert to defaults on End().
Note: Must be called after Initialize(). Only works for non-init variables (init variables should be passed to Initialize()).
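A sketch of setting a parameter after Initialize and reading it back with GetStringVariable. The tessdata path is a placeholder, and the parameter name is taken from the example above. Because the signature shown declares no return value, the sketch does not test a result from SetVariable:

```xojo
// Sketch only: the tessdata path is a hypothetical placeholder.
Dim engine As New TessEngineMBS
If engine.Initialize("/path/to/tessdata", "eng") Then
  // Values are passed as strings, as they would appear in a config file.
  engine.SetVariable("tessedit_char_blacklist", "xyz")

  // Read the value back to see what the engine now holds.
  Dim blacklist As String = engine.GetStringVariable("tessedit_char_blacklist")
  System.DebugLog "tessedit_char_blacklist = " + blacklist
End If
```

Since changing languages resets all parameters to their defaults, a call like this would have to be repeated after every Initialize.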
The items on this page are in the following plugins: MBS OCR Plugin.
