TesseractMBS.Clear

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Free up recognition results and any stored image data, without actually freeing any recognition data that would be time-consuming to reload.

Afterwards, you must call SetImage or TesseractRect before doing any Recognize or Get* operation.

TesseractMBS.ClearAdaptiveClassifier

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Call between pages or documents, etc., to free up memory and forget adaptive data.

TesseractMBS.Constructor

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Constructor that only calls InitForAnalysePage.

TesseractMBS.Constructor(folder as folderitem, lang as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Initializes tesseract.

Same as Init method.
Pass a folderitem for the parent folder of the tessdata folder, and the language you need.

Instances are now mostly thread-safe and totally independent, but some global parameters remain. Basically it is safe to use multiple TessBaseAPIs in different threads in parallel, UNLESS: you use SetVariable on some of the Params in classify and textord. If you do, then the effect will be to change it for all your instances.

Note that the only members that may be called before Init are: SetInputName, SetOutputName, SetVariable, Get*Variable and PrintVariables.

The language is (usually) an ISO 639-3 string; an empty string "" defaults to eng.
To use multiple languages, join them with a plus sign, e.g. "eng+deu".
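
As an illustration of the parameters described above, here is a minimal sketch that creates an instance for English plus German. The function name NewEnglishGermanOCR and the parameter tessdataParent are hypothetical; the sketch assumes tessdataParent is the folder that contains the tessdata folder, and it uses NumDawgs (documented on this page) as a rough check that data files were loaded.

```xojo
Function NewEnglishGermanOCR(tessdataParent As FolderItem) As TesseractMBS
  // tessdataParent is assumed to be the parent folder of "tessdata".
  Dim ocr As New TesseractMBS(tessdataParent, "eng+deu")

  // NumDawgs should be bigger than 0 if data files have been loaded.
  If ocr.NumDawgs <= 0 Then
    System.DebugLog "Tesseract language data was not loaded."
    Return Nil
  End If

  Return ocr
End Function
```

Returning Nil on failure is a design choice of this sketch, not something the plugin prescribes.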

TesseractMBS.Constructor(path as string, lang as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Initializes tesseract.

Same as Init method.
Pass the path to the parent folder of the tessdata folder, and the language you need.

Instances are now mostly thread-safe and totally independent, but some global parameters remain. Basically it is safe to use multiple TessBaseAPIs in different threads in parallel, UNLESS: you use SetVariable on some of the Params in classify and textord. If you do, then the effect will be to change it for all your instances.

Note that the only members that may be called before Init are: SetInputName, SetOutputName, SetVariable, Get*Variable and PrintVariables.

The language is (usually) an ISO 639-3 string; an empty string "" defaults to eng.
To use multiple languages, join them with a plus sign, e.g. "eng+deu".

TesseractMBS.GetBoolVariable(name as string, byref value as boolean) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Queries a variable as boolean value.

Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.
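
The byref pattern above might be used as in the following sketch. The method name LogBoolParameter and the parameter paramName are hypothetical; which parameter names exist depends on the Tesseract version in use.

```xojo
Sub LogBoolParameter(ocr As TesseractMBS, paramName As String)
  Dim value As Boolean

  // The return value tells whether the parameter was found;
  // value is only meaningful when the call returned true.
  If ocr.GetBoolVariable(paramName, value) Then
    If value Then
      System.DebugLog paramName + " = true"
    Else
      System.DebugLog paramName + " = false"
    End If
  Else
    System.DebugLog paramName + " was not found among the Tesseract parameters."
  End If
End Sub
```

GetIntVariable and GetDoubleVariable follow the same shape, with an Integer or Double variable passed byref.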

TesseractMBS.GetBoxText(page as Integer) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns the recognized text as a string coded in the same format as a box file used in training.

Coordinates refer to the original image, not just the rectangle. page is a 0-based page index that will appear in the box file.

TesseractMBS.GetDoubleVariable(name as string, byref value as Double) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Queries a variable as Double value.

Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.

TesseractMBS.GetHOCRText(page as Integer) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Makes an HTML-formatted string with hOCR markup from the internal data structures.

Page is 0-based but will appear in the output as 1-based.

TesseractMBS.GetIntVariable(name as string, byref value as Integer) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Queries a variable as Integer value.

Returns true if the parameter was found among Tesseract parameters.
Fills in value with the value of the parameter.

TesseractMBS.GetLastInitLanguage as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Return the language used in the last valid initialization.

TesseractMBS.GetStringVariable(name as string) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Queries variable's value as string.

TesseractMBS.GetText as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns recognized text.

Calls Recognize if needed internally.
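
A minimal sketch of the usual flow, using only methods documented on this page: provide an image, read the text, then check the mean confidence. The function name RecognizePicture is hypothetical, and ocr is assumed to be an instance that was already initialized with language data.

```xojo
Function RecognizePicture(ocr As TesseractMBS, pic As Picture) As String
  // SetImage returns false on failure.
  If Not ocr.SetImage(pic) Then
    System.DebugLog "SetImage failed."
    Return ""
  End If

  // GetText calls Recognize internally if needed.
  Dim result As String = ocr.GetText

  // MeanTextConf is the average confidence between 0 and 100.
  System.DebugLog "Mean confidence: " + Str(ocr.MeanTextConf)

  Return result
End Function
```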

TesseractMBS.GetVariableAsString(name as string) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Get value of named variable as a string, if it exists.

TesseractMBS.Init(folder as folderitem, lang as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Initializes tesseract.

Pass a folderitem for the parent folder of the tessdata folder, and the language you need.

Instances are now mostly thread-safe and totally independent, but some global parameters remain. Basically it is safe to use multiple TessBaseAPIs in different threads in parallel, UNLESS: you use SetVariable on some of the Params in classify and textord. If you do, then the effect will be to change it for all your instances.

Start tesseract. Returns zero on success and -1 on failure.

Note that the only members that may be called before Init are: SetInputName, SetOutputName, SetVariable, Get*Variable and PrintVariables.

The language is (usually) an ISO 639-3 string; an empty string "" defaults to eng.

TesseractMBS.Init(path as string, lang as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Initializes tesseract.

Pass the path to the parent folder of the tessdata folder, and the language you need.

Instances are now mostly thread-safe and totally independent, but some global parameters remain. Basically it is safe to use multiple TessBaseAPIs in different threads in parallel, UNLESS: you use SetVariable on some of the Params in classify and textord. If you do, then the effect will be to change it for all your instances.

Start tesseract. Returns zero on success and -1 on failure.

Note that the only members that may be called before Init are: SetInputName, SetOutputName, SetVariable, Get*Variable and PrintVariables.

The language is (usually) an ISO 639-3 string; an empty string "" defaults to eng.

TesseractMBS.InitForAnalysePage

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Init only for page layout analysis.

Use only for calls to SetImage and AnalysePage. Calls that attempt recognition will generate an error.

TesseractMBS.MeanTextConf as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns the (average) confidence value between 0 and 100.

TesseractMBS.NumDawgs as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Return the number of dawgs loaded.

Should be bigger than 0 if data files have been loaded.

TesseractMBS.PageSegMode as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
property OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
The current page segmentation mode.

Defaults to kPageSegModeSingleBlock.
The mode is stored as an IntParam so it can also be modified by ReadConfigFile or SetVariable("tessedit_pageseg_mode", mode as string).
(Read and Write computed property)

TesseractMBS.PrintVariablesToStdErr

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Print Tesseract parameters to standard error.

TesseractMBS.PrintVariablesToStdOut

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Print Tesseract parameters to standard output.

TesseractMBS.Recognize as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Recognize the image.

Returns 0 on success.
Optional. The Get*Text methods call Recognize if needed. After Recognize, the output is kept internally until the next SetImage.

TesseractMBS.RecognizeMT as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 14.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Recognize the image.

Returns 0 on success.
Optional. The Get*Text methods call Recognize if needed. After Recognize, the output is kept internally until the next SetImage.

Same as Recognize, but thread friendly.

The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. It must be called from a Xojo thread to get this benefit. If called on the main thread, it blocks the main thread but keeps other background threads running.
If you run several threads calling MT methods, you can keep all CPU cores busy while the main thread shows a GUI with a progress window.
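
The following sketch shows one way to call RecognizeMT from a Xojo thread. It assumes a hypothetical Thread subclass named OCRThread with two properties you add yourself: ocr As TesseractMBS (already initialized, with SetImage already called) and resultText As String. Only the Run event of that subclass is shown.

```xojo
// Run event of the hypothetical OCRThread subclass.
// Assumed properties: ocr As TesseractMBS, resultText As String.
Sub Run()
  // RecognizeMT returns 0 on success and lets other Xojo threads
  // keep running while the recognition work is done.
  If ocr.RecognizeMT = 0 Then
    resultText = ocr.GetText
  Else
    resultText = ""
    System.DebugLog "RecognizeMT failed."
  End If
End Sub
```

Reading resultText back on the main thread after the thread has finished keeps GUI updates out of the background thread.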

TesseractMBS.ResultIterator as TesseractResultIteratorMBS

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Get an iterator to the results of LayoutAnalysis and/or Recognize.

This object points to data held within the TesseractMBS class, and therefore can only be used while the TesseractMBS class still exists and has not been subjected to a call of Init, SetImage, Recognize, Clear, End, DetectOS, or anything else that changes the internal PAGE_RES.

TesseractMBS.SetImage(buffer as memoryblock, width as Integer, height as Integer, BytesPerPixel as Integer, BytesPerLine as Integer) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Provide an image for Tesseract to recognize as a memoryblock.

Returns true on success and false on failure.

TesseractMBS.SetImage(Pic as Picture) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Provide an image for Tesseract to recognize.

The plugin makes a copy of the picture.
Returns true on success and false on failure.

TesseractMBS.SetInputName(name as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Set the name of the input file. Needed only for training and reading a UNLV zone file.

TesseractMBS.SetOutputName(name as string)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Set the name of the bonus output files. Needed only for debugging.

TesseractMBS.SetRectangle(left as Integer, top as Integer, width as Integer, height as Integer)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Restrict recognition to a sub-rectangle of the image.

Call after SetImage. Each SetRectangle clears the recognition results, so multiple rectangles can be recognized with the same image.
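
As a sketch of recognizing several rectangles of one image, the hypothetical function below reads the top and bottom halves of a picture separately. It assumes ocr was already initialized; splitting the image into halves is only for illustration.

```xojo
Function RecognizeHalves(ocr As TesseractMBS, pic As Picture) As String
  If Not ocr.SetImage(pic) Then
    Return ""
  End If

  Dim halfHeight As Integer = pic.Height \ 2

  // Top half. SetRectangle clears earlier results, so read the text
  // before moving on to the next rectangle.
  ocr.SetRectangle(0, 0, pic.Width, halfHeight)
  Dim topText As String = ocr.GetText

  // Bottom half of the same image, without calling SetImage again.
  ocr.SetRectangle(0, halfHeight, pic.Width, pic.Height - halfHeight)
  Dim bottomText As String = ocr.GetText

  Return topText + EndOfLine + bottomText
End Function
```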

TesseractMBS.SetResolution(Resolution as Integer)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 15.1 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Set the resolution of the source image in pixels per inch so font size information can be calculated in results.

Call this after SetImage.

TesseractMBS.SetVariable(name as string, value as string) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method OCR MBS OCR Plugin 12.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Set the value of an internal "parameter."

Supply the name of the parameter and the value as a string, just as you would in a config file.
Returns false if the name lookup failed.
E.g. SetVariable("tessedit_char_blacklist", "xyz") to ignore x, y and z.
Or SetVariable("classify_bln_numeric_mode", "1") to set numeric-only mode.
SetVariable may be used before Init, but settings will revert to defaults on End().

Note: Must be called after Init(). Only works for non-init variables (init variables should be passed to Init()).
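
A short sketch of setting a parameter and reading it back, using the numeric-only example from the text. The method name ConfigureNumericMode is hypothetical, and the sketch assumes ocr has already been initialized, following the note above.

```xojo
Sub ConfigureNumericMode(ocr As TesseractMBS)
  Dim paramName As String = "classify_bln_numeric_mode"

  // SetVariable returns false if the name lookup failed.
  If Not ocr.SetVariable(paramName, "1") Then
    System.DebugLog "Unknown parameter: " + paramName
    Return
  End If

  // Read the value back as a string to confirm what is stored.
  Dim readBack As String = ocr.GetVariableAsString(paramName)
  System.DebugLog paramName + " is now " + readBack
End Sub
```

Keep in mind the thread-safety caveat in the Init notes: some classify and textord parameters are global, so changing them can affect all instances.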

The items on this page are in the following plugins: MBS OCR Plugin.

