RegExMBS.Compile(pattern as string) as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Some predefined patterns like \b do not support Unicode well, so you may work around that by using your own pattern.
Returns true on success and false on failure.
ErrorMessage, Lasterror, ErrorOffset and Handle are set.
The following table lists the error codes that may be returned by Compile(), along with the error messages that may be returned by both compiling functions.
0 | no error |
1 | \ at end of pattern |
2 | \c at end of pattern |
3 | unrecognized character follows \ |
4 | numbers out of order in {} quantifier |
5 | number too big in {} quantifier |
6 | missing terminating ] for character class |
7 | invalid escape sequence in character class |
8 | range out of order in character class |
9 | nothing to repeat |
10 | operand of unlimited repeat could match the empty string |
11 | internal error: unexpected repeat |
12 | unrecognized character after (? |
13 | POSIX named classes are supported only within a class |
14 | missing ) |
15 | reference to non-existent subpattern |
16 | erroffset passed as NULL |
17 | unknown option bit(s) set |
18 | missing ) after comment |
19 | parentheses nested too deeply |
20 | regular expression too large |
21 | failed to get memory |
22 | unmatched parentheses |
23 | internal error: code overflow |
24 | unrecognized character after (?< |
25 | lookbehind assertion is not fixed length |
26 | malformed number after (?( |
27 | conditional group contains more than two branches |
28 | assertion expected after (?( |
29 | (?R or (?digits must be followed by ) |
30 | unknown POSIX class name |
31 | POSIX collating elements are not supported |
32 | this version of PCRE is not compiled with PCRE_UTF8 support |
33 | spare error |
34 | character value in \x{...} sequence is too large |
35 | invalid condition (?(0) |
36 | \C not allowed in lookbehind assertion |
37 | PCRE does not support \L, \l, \N, \U, or \u |
38 | number after (?C is > 255 |
39 | closing ) for (?C expected |
40 | recursive call could loop indefinitely |
41 | unrecognized character after (?P |
42 | syntax error after (?P |
43 | two named groups have the same name |
44 | invalid UTF-8 string |
45 | support for \P, \p, and \X has not been compiled |
46 | malformed \P or \p sequence |
47 | unknown property name after \P or \p |
RegExMBS.CompileMemory(pattern as memoryblock, ByteOffset as Integer) as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Same as Compile, but the pattern text is stored in a memoryblock and must be a zero-terminated C string.
Make sure the input is valid UTF-8 and that the offset is given in bytes, not in characters.
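The two requirements above (a terminating zero byte and an offset counted in bytes) can be illustrated outside Xojo. The following Python sketch is not plugin code: `pack_pattern` and `read_c_string` are hypothetical helper names, and the `HDR` prefix merely stands in for whatever data precedes the pattern in the memoryblock.

```python
from typing import Tuple


def pack_pattern(prefix: bytes, pattern: str) -> Tuple[bytes, int]:
    """Store a zero-terminated UTF-8 pattern after some prefix bytes.

    Returns the buffer and the byte offset at which the pattern starts.
    """
    encoded = pattern.encode("utf-8") + b"\x00"  # C strings end with a zero byte
    return prefix + encoded, len(prefix)


def read_c_string(buffer: bytes, byte_offset: int) -> str:
    """Read the zero-terminated UTF-8 string that starts at byte_offset."""
    end = buffer.index(b"\x00", byte_offset)
    return buffer[byte_offset:end].decode("utf-8")


buffer, offset = pack_pattern(b"HDR", "grün+")
print(offset)                         # byte offset of the pattern in the buffer
print(read_c_string(buffer, offset))  # the pattern, recovered from the buffer
```

Note that "ü" occupies two bytes in UTF-8, so byte counts and character counts diverge as soon as the data contains non-ASCII text.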
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
A value of 0 means that \R matches any Unicode line ending sequence; a value of 1 means that \R matches only CR, LF, or CRLF. The default can be overridden when a pattern is compiled or matched.
RegExMBS.ConfigLinkSize as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
The value is 2, 3, or 4. Larger values allow larger regular expressions to be compiled, at the expense of slower matching. The default value of 2 is sufficient for all but the most massive patterns, since it allows the compiled pattern to be up to 64K in size.
RegExMBS.ConfigMallocThreshold as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
RegExMBS.ConfigMatchLimit as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
RegExMBS.ConfigMatchLimitRecursion as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
RegExMBS.ConfigNewLine as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
The output is an integer whose value specifies the default character sequence that is recognized as meaning "newline". The five values that are supported are: 10 for LF, 13 for CR, 3338 for CRLF, -2 for ANYCRLF, and -1 for ANY. Though they are derived from ASCII, the same values are returned in EBCDIC environments. The default should normally correspond to the standard sequence for your operating system.
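The value 3338 is simply the two character codes packed together: 13 * 256 + 10. A minimal Python sketch of decoding the values listed above follows; `describe_newline` is a hypothetical helper, not part of the plugin.

```python
NAMES = {10: "LF", 13: "CR"}


def describe_newline(value: int) -> str:
    """Translate a newline configuration value into a readable name."""
    if value == -1:
        return "ANY"
    if value == -2:
        return "ANYCRLF"
    if value > 255:
        # Two character codes packed into one integer: high * 256 + low.
        high, low = divmod(value, 256)
        return NAMES[high] + NAMES[low]
    return NAMES[value]


for value in (10, 13, 3338, -2, -1):
    print(value, describe_newline(value))
```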
RegExMBS.ConfigStackRecurse as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true if internal recursion during matching is implemented by recursive function calls that use the stack to remember their state. This is the usual way that PCRE is compiled. Returns false if PCRE was compiled to use blocks of data on the heap instead of recursive function calls. In this case, malloc and free are called to manage memory blocks on the heap, thus avoiding the use of the stack.
RegExMBS.ConfigUnicodeProperties as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Should be true for the plugin.
RegExMBS.ConfigUTF8 as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 11.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
If this is ever false, please report it. This plugin is designed to work only on UTF-8 strings for best performance.
RegExMBS.Constructor(VecSize as Integer = 0)
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.4 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Pass the internal vector size here; it limits how many substrings can be found.
For 20 substrings, pass (20+1)*3 as the vector size.
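The sizing rule can be written as a one-line formula. This Python sketch only restates the arithmetic given above; `vector_size` is a hypothetical name.

```python
def vector_size(substrings: int) -> int:
    """Vector size needed for the given number of substrings: (n + 1) * 3."""
    return (substrings + 1) * 3


print(vector_size(20))  # (20 + 1) * 3
```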
RegExMBS.Escape(text as string) as string
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 7.8 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
The string is converted to UTF8 and all the RegEx special characters are escaped.
Returns "" on low memory.
RegExMBS.Execute(start as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.1 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
You can use this variant of execute to continue a search in the same string/memoryblock at a new starting offset.
RegExMBS.Execute(text as string, start as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns the number of found offsets.
text must be in UTF-8 text encoding.
Start must be 0 for the first character and the byte offset for other characters. Do not pass values from OffsetCharacters here!
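To see why a character position cannot be passed as Start, compare the two kinds of offset for a text with non-ASCII characters. This is a Python illustration only, not plugin code; `char_to_byte_offset` is a hypothetical helper.

```python
def char_to_byte_offset(text: str, char_index: int) -> int:
    """Return the UTF-8 byte offset of the character at char_index."""
    return len(text[:char_index].encode("utf-8"))


text = "Größe: 42"
char_index = text.index("4")
byte_offset = char_to_byte_offset(text, char_index)

# "ö" and "ß" each take two bytes in UTF-8, so the byte offset is larger.
print(char_index, byte_offset)
```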
Return values from Execute:
If Execute() fails, it returns a negative number. The following are defined in the header file:
PCRE_ERROR_NOMATCH | -1 | The subject string did not match the pattern. |
PCRE_ERROR_NULL | -2 | Either code or subject was passed as "". |
PCRE_ERROR_BADOPTION | -3 | An unrecognized bit was set in the options argument. |
PCRE_ERROR_BADMAGIC | -4 | PCRE stores a 4-byte "magic number" at the start of the compiled code, to catch the case when it is passed a junk pointer and to detect when a pattern that was compiled in an environment of one endianness is run in an environment with the other endianness. This is the error that PCRE gives when the magic number is not present. |
PCRE_ERROR_UNKNOWN_NODE | -5 | While running the pattern match, an unknown item was encountered in the compiled pattern. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern. |
PCRE_ERROR_NOMEMORY | -6 | If a pattern contains back references, but the ovector that is passed to Execute() is not big enough to remember the referenced substrings, PCRE gets a block of memory at the start of matching to use for this purpose. If the call via pcre_malloc() fails, this error is given. The memory is automatically freed at the end of matching. |
PCRE_ERROR_MATCHLIMIT | -8 | The backtracking limit, as specified by the match_limit field in a pcre_extra structure (or defaulted) was reached. |
PCRE_ERROR_RECURSIONLIMIT | -21 | The internal recursion limit, as specified by the match_limit_recursion field in a pcre_extra structure (or defaulted) was reached. |
PCRE_ERROR_CALLOUT | -9 | This error is never generated by Execute() itself. It is provided for use by callout functions that want to yield a distinctive error code. See the pcrecallout documentation for details. |
PCRE_ERROR_BADUTF8 | -10 | A string that contains an invalid UTF-8 byte sequence was passed as a subject. |
PCRE_ERROR_BADUTF8_OFFSET | -11 | The UTF-8 byte sequence that was passed as a subject was valid, but the value of startoffset did not point to the beginning of a UTF-8 character. |
PCRE_ERROR_PARTIAL | -12 | The subject string did not match, but it did match partially. See the pcrepartial documentation for details of partial matching. |
PCRE_ERROR_BADPARTIAL | -13 | The PCRE_PARTIAL option was used with a compiled pattern containing items that are not supported for partial matching. See the pcrepartial documentation for details of partial matching. |
PCRE_ERROR_INTERNAL | -14 | An unexpected internal error has occurred. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern. |
PCRE_ERROR_BADCOUNT | -15 | This error is given if the value of the ovecsize argument is negative. |
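A caller typically branches on the sign of the result. The sketch below, in Python rather than Xojo, maps the codes from the table above to their names; `describe_result` is a hypothetical helper, and treating every non-negative value as success is an assumption based on the statement that failures are negative.

```python
EXECUTE_ERRORS = {
    -1: "PCRE_ERROR_NOMATCH",
    -2: "PCRE_ERROR_NULL",
    -3: "PCRE_ERROR_BADOPTION",
    -4: "PCRE_ERROR_BADMAGIC",
    -5: "PCRE_ERROR_UNKNOWN_NODE",
    -6: "PCRE_ERROR_NOMEMORY",
    -8: "PCRE_ERROR_MATCHLIMIT",
    -9: "PCRE_ERROR_CALLOUT",
    -10: "PCRE_ERROR_BADUTF8",
    -11: "PCRE_ERROR_BADUTF8_OFFSET",
    -12: "PCRE_ERROR_PARTIAL",
    -13: "PCRE_ERROR_BADPARTIAL",
    -14: "PCRE_ERROR_INTERNAL",
    -15: "PCRE_ERROR_BADCOUNT",
    -21: "PCRE_ERROR_RECURSIONLIMIT",
}


def describe_result(result: int) -> str:
    """Turn an Execute-style return value into a readable description."""
    if result >= 0:
        return f"{result} offsets found"
    return EXECUTE_ERRORS.get(result, f"unknown error {result}")


print(describe_result(2))
print(describe_result(-1))
```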
RegExMBS.ExecuteMemory(text as memoryblock, ByteOffset as Integer = 0, ByteLength as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Same as Execute, but the text is stored in a memoryblock.
Be careful to use valid UTF8 input and provide offset and length in byte units and not in characters.
If ByteLength is zero, we take the length of the memoryblock.
RegExMBS.ExecuteMemoryMT(text as memoryblock, ByteOffset as Integer = 0, ByteLength as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.1 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Same as ExecuteMemory, but more thread friendly.
The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit, call it from a Xojo thread. If called on the main thread, it blocks the main thread but keeps other background threads running.
RegExMBS.ExecuteMT(start as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.1 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Same as Execute, but more thread friendly.
The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit, call it from a Xojo thread. If called on the main thread, it blocks the main thread but keeps other background threads running.
RegExMBS.ExecuteMT(text as string, start as Integer = 0) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.1 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Same as Execute, but more thread friendly.
The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit, call it from a Xojo thread. If called on the main thread, it blocks the main thread but keeps other background threads running.
RegExMBS.InfoNameEntry(Index as Integer) as string
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 13.4 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Only valid after pattern was compiled.
RegExMBS.Match(text as string) as boolean
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 16.0 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns true if text matches.
Unlike Execute, this method does not set the match properties, so SubString() won't work afterwards.
RegExMBS.Match(text() as string, inverse as boolean = false) as string()
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 16.0 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Returns the list of matching values.
If inverse is set to true, it returns the list of non matching values.
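The filtering behavior can be sketched with Python's own `re` module as a stand-in for the plugin. `match_list` is a hypothetical function, and the sketch assumes a value counts as matching when the pattern is found anywhere in it; the text above does not say whether the plugin requires a full match.

```python
import re
from typing import List


def match_list(pattern: str, values: List[str], inverse: bool = False) -> List[str]:
    """Return the values that match the pattern, or the others if inverse is set."""
    rx = re.compile(pattern)
    # Assumption: a match anywhere in the value counts as a match.
    return [v for v in values if (rx.search(v) is not None) != inverse]


values = ["a1", "b", "c2"]
print(match_list(r"\d", values))                # values containing a digit
print(match_list(r"\d", values, inverse=True))  # values without a digit
```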
RegExMBS.Match(text() as Variant, inverse as boolean = false) as string()
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 16.0 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
The entries of the variant array should convert cleanly to strings.
Returns the list of matching values.
If inverse is set to true, it returns the list of non matching values.
RegExMBS.Offset(index as Integer) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
After a pattern has been found in a string, the indexes map to byte offsets as follows:
index | offset |
0 | start of matched pattern |
1 | end of matched pattern |
2 | start of subexpression 1 |
3 | end of subexpression 1 |
2*n | start of subexpression n |
2*n+1 | end of subexpression n |
Invalid indexes return 0.
Count is the number of entries here.
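The same flat layout (start and end of the whole match, followed by start and end of each subexpression) can be reproduced with Python's `re` module, used here purely as a stand-in for the plugin. `flat_offsets` is a hypothetical helper; matching on `bytes` makes Python report byte offsets.

```python
import re
from typing import List


def flat_offsets(match: "re.Match") -> List[int]:
    """Flatten a match into [start0, end0, start1, end1, ...]."""
    offsets: List[int] = []
    for group in range(match.re.groups + 1):
        start, end = match.span(group)
        offsets += [start, end]
    return offsets


subject = b"id 12-345"
match = re.search(rb"(\d+)-(\d+)", subject)
offsets = flat_offsets(match)

n = 2  # subexpression number
start, end = offsets[2 * n], offsets[2 * n + 1]
print(offsets)
print(subject[start:end])  # text of subexpression 2
```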
RegExMBS.OffsetCharacters(index as Integer) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.3 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
This function is identical to Offset(), but returns characters instead of bytes.
Works only with valid UTF-8 strings as input.
Value is calculated on each function call based on Offset(index) and current text.
After a pattern has been found in a string, the indexes map to character offsets as follows:
index | offset |
0 | start of matched pattern |
1 | end of matched pattern |
2 | start of subexpression 1 |
3 | end of subexpression 1 |
2*n | start of subexpression n |
2*n+1 | end of subexpression n |
Invalid indexes return 0.
Count is the number of entries here.
Please note that if you only need offsets for calling the Mid() function, you get better performance by using Offset together with the MidB function. Then neither Mid nor OffsetCharacters needs to calculate the character offset.
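The performance advice rests on the difference between byte and character offsets. In this Python illustration (not plugin code), slicing the UTF-8 bytes directly plays the role of MidB, while `byte_to_char_offset` is a hypothetical helper showing the extra work a character offset requires.

```python
def byte_to_char_offset(data: bytes, byte_offset: int) -> int:
    """Count the characters that precede byte_offset in valid UTF-8 data."""
    return len(data[:byte_offset].decode("utf-8"))


data = "naïve café".encode("utf-8")
needle = "café".encode("utf-8")

start = data.find(needle)   # byte offset of the match
end = start + len(needle)   # byte offset just past the match

# Byte slicing needs no conversion at all.
print(data[start:end].decode("utf-8"))

# Character offsets require decoding everything before the match.
print(byte_to_char_offset(data, start), byte_to_char_offset(data, end))
```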
RegExMBS.Replace(NewText as string) as string
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
You need to call Execute first.
Lasterror is set.
NewText must have UTF-8 text encoding.
\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.
RegExMBS.ReplaceAll(Target as string, NewText as string = "") as string
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
You need to call Compile first to initialize the pattern, and you should also call Study first to optimize it.
Lasterror is set.
Target and NewText must have UTF-8 text encoding.
\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.
RegExMBS.ReplaceSelection(NewText as string) as string
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
This method is intended for text editors: store the result in editfield.seltext to replace the current selection.
Lasterror is set.
You need to call Execute first.
NewText must have UTF-8 text encoding.
\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.
RegExMBS.StringNumber(name as string) as Integer
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
name: Name whose number is required
Returns the number of the capturing parenthesis if the name is found, or PCRE_ERROR_NOSUBSTRING (-7) otherwise.
Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
method | Regular Expressions | MBS RegEx Plugin | 6.2 | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | All |
Only useful if you use Execute several times.
In that case, call Compile once, Study once, and Execute several times.
Errormessage is set.
The items on this page are in the following plugins: MBS RegEx Plugin.