RegExMBS.Compile(pattern as string) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Compiles a pattern.
Example
Var r as new RegExMBS
Var searchString as string = ".o"

if r.Compile(searchString) then
  MsgBox "OK"
else
  MsgBox "failed to compile"
end if

Some predefined patterns like \b do not support Unicode well, so you may work around that by using your own pattern.

Returns true on success and false on failure.
ErrorMessage, Lasterror, ErrorOffset and Handle are set.
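
As a minimal sketch of how these properties might be inspected after a failed compile: the unbalanced pattern below is only an illustrative assumption, and the exact message text depends on the PCRE version inside the plugin.

```xojo
Var r As New RegExMBS

// "(abc" is deliberately invalid: the opening parenthesis is never closed
If Not r.Compile("(abc") Then
  // Lasterror holds the numeric code, ErrorOffset the position in the pattern
  MsgBox "Compile failed, code " + Str(r.Lasterror) + " at offset " + Str(r.ErrorOffset) + ": " + r.ErrorMessage
End If
```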

The following table lists the error codes that may be returned by Compile(), along with the error messages that may be returned by both compiling functions (Compile and CompileMemory).

Code  Message
0     no error
1     \ at end of pattern
2     \c at end of pattern
3     unrecognized character follows \
4     numbers out of order in {} quantifier
5     number too big in {} quantifier
6     missing terminating ] for character class
7     invalid escape sequence in character class
8     range out of order in character class
9     nothing to repeat
10    operand of unlimited repeat could match the empty string
11    internal error: unexpected repeat
12    unrecognized character after (?
13    POSIX named classes are supported only within a class
14    missing )
15    reference to non-existent subpattern
16    erroffset passed as NULL
17    unknown option bit(s) set
18    missing ) after comment
19    parentheses nested too deeply
20    regular expression too large
21    failed to get memory
22    unmatched parentheses
23    internal error: code overflow
24    unrecognized character after (?<
25    lookbehind assertion is not fixed length
26    malformed number after (?(
27    conditional group contains more than two branches
28    assertion expected after (?(
29    (?R or (?digits must be followed by )
30    unknown POSIX class name
31    POSIX collating elements are not supported
32    this version of PCRE is not compiled with PCRE_UTF8 support
33    spare error
34    character value in \x{...} sequence is too large
35    invalid condition (?(0)
36    \C not allowed in lookbehind assertion
37    PCRE does not support \L, \l, \N, \U, or \u
38    number after (?C is > 255
39    closing ) for (?C expected
40    recursive call could loop indefinitely
41    unrecognized character after (?P
42    syntax error after (?P
43    two named groups have the same name
44    invalid UTF-8 string
45    support for \P, \p, and \X has not been compiled
46    malformed \P or \p sequence
47    unknown property name after \P or \p

RegExMBS.CompileMemory(pattern as memoryblock, ByteOffset as Integer) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Compiles a pattern.

Same as Compile, but the pattern is stored in a memoryblock and must be a zero-terminated C string.
Be careful to use valid UTF-8 input and to provide the offset in bytes, not in characters.
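
A minimal sketch of preparing such a memoryblock, assuming the standard Xojo MemoryBlock.CString setter is used to write the terminating zero byte. The pattern ".o" is only an example.

```xojo
Var r As New RegExMBS

// Make sure the pattern bytes are UTF-8
Var pattern As String = ConvertEncoding(".o", Encodings.UTF8)

// Reserve one extra byte for the terminating zero of the C string
Var mb As New MemoryBlock(LenB(pattern) + 1)
mb.CString(0) = pattern // writes the bytes followed by a zero byte

// The pattern starts at byte offset 0 of the memoryblock
If r.CompileMemory(mb, 0) Then
  MsgBox "OK"
Else
  MsgBox "failed to compile"
End If
```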

RegExMBS.ConfigBSR as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Reports which character sequences the \R escape sequence matches by default.

In PCRE, the underlying configuration value is 0 if \R matches any Unicode line ending sequence and 1 if \R matches only CR, LF, or CRLF. Since this method is declared as boolean, these values presumably correspond to false and true. The default can be overridden when a pattern is compiled or matched.

RegExMBS.ConfigLinkSize as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns an integer that contains the number of bytes used for internal linkage in compiled regular expressions.

The value is 2, 3, or 4. Larger values allow larger regular expressions to be compiled, at the expense of slower matching. The default value of 2 is sufficient for all but the most massive patterns, since it allows the compiled pattern to be up to 64K in size.

RegExMBS.ConfigMallocThreshold as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns an integer that contains the threshold above which the POSIX interface uses malloc() for output vectors.

RegExMBS.ConfigMatchLimit as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns an integer that gives the default limit for the number of internal matching function calls in an Execute() call.

RegExMBS.ConfigMatchLimitRecursion as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns an integer that gives the default limit for the depth of recursion when calling the internal matching function in an Execute() call.

RegExMBS.ConfigNewLine as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns which newline sequence is used by default.

The output is an integer whose value specifies the default character sequence that is recognized as meaning "newline". The five values that are supported are: 10 for LF, 13 for CR, 3338 for CRLF, -2 for ANYCRLF, and -1 for ANY. Though they are derived from ASCII, the same values are returned in EBCDIC environments. The default should normally correspond to the standard sequence for your operating system.
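
As a sketch of how the returned value could be interpreted, the following maps each documented value to a label. The variable names are illustrative, and calling the method on an instance is an assumption. The CRLF value packs both characters into one integer, since 3338 = 13 * 256 + 10.

```xojo
Var r As New RegExMBS
Var v As Integer = r.ConfigNewLine
Var desc As String

Select Case v
Case 10
  desc = "LF"
Case 13
  desc = "CR"
Case 3338
  // High byte is 13 (CR), low byte is 10 (LF)
  desc = "CRLF (" + Str(v \ 256) + ", " + Str(v Mod 256) + ")"
Case -2
  desc = "ANYCRLF"
Case -1
  desc = "ANY"
Else
  desc = "unknown (" + Str(v) + ")"
End Select

MsgBox "Default newline: " + desc
```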

RegExMBS.ConfigStackRecurse as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns true if internal recursion when running Execute() is implemented by recursive function calls that use the stack to remember their state.

This is the usual way that PCRE is compiled. The result is false if PCRE was compiled to use blocks of data on the heap instead of recursive function calls. In this case, malloc and free are called to manage memory blocks on the heap, thus avoiding the use of the stack.

RegExMBS.ConfigUnicodeProperties as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns true if Unicode properties are available.

Should be true for the plugin.

RegExMBS.ConfigUTF8 as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 11.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns whether UTF-8 is supported.

This should always be true; if it is ever false, please report it. This plugin is designed to work only on UTF-8 strings for best performance.

RegExMBS.Constructor(VecSize as Integer = 0)

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.4 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
The constructor.

Pass the internal vector size here. It limits how many substrings can be found.
For 20 substrings, pass (20+1)*3 = 63 as the vector size.
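
The sizing rule above could be written as follows. This is a sketch; the variable name maxSubstrings is hypothetical.

```xojo
// Hypothetical requirement: up to 20 substrings per match
Var maxSubstrings As Integer = 20

// (20 + 1) * 3 = 63 entries in the internal vector
Var r As New RegExMBS((maxSubstrings + 1) * 3)
```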

RegExMBS.Escape(text as string) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 7.8 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Escapes the string.
Example
Var r as new RegExMBS

Var s as string = "Hello []"
Var e as string = r.Escape(s)
MsgBox e // shows Hello \[\]

Var d as string = r.Unescape(e)
MsgBox d // shows original string

The string is converted to UTF8 and all the RegEx special characters are escaped.
Returns "" on low memory.

RegExMBS.Execute(start as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.1 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern again.

You can use this variant of execute to continue a search in the same string/memoryblock at a new starting offset.
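
A minimal sketch of collecting all matches this way, assuming the end offset of each match (Offset(1), in bytes) is passed as the next start. The pattern and sample text are illustrative only.

```xojo
Var r As New RegExMBS
Var s As String = "123 ABC 456"
Var found() As String

If r.Compile("\d+") Then
  // The first search passes the text; later searches only pass a new byte offset
  Var c As Integer = r.Execute(s, 0)
  While c > 0
    Var startByte As Integer = r.Offset(0)
    Var endByte As Integer = r.Offset(1)
    If endByte <= startByte Then Exit // guard: an empty match would loop forever
    found.Append MidB(s, startByte + 1, endByte - startByte) // MidB is 1-based
    c = r.Execute(endByte)
  Wend
End If

MsgBox Join(found, ", ")
```

The loop ends as soon as Execute returns zero or a negative error code such as PCRE_ERROR_NOMATCH.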

RegExMBS.Execute(text as string, start as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern.
Example
Var r as RegExMBS
Var s as string
Var c as Integer

s = "123 ABC 456"

r = new RegExMBS
if r.Compile(" \D+ ") then
  c = r.Execute(s, 0)
  MsgBox str(c)+" "+str(r.Offset(0))+" "+str(r.Offset(1))
  // shows: 1 3 8
  // 1 for number of results
  // 3 for the 3 bytes before the matched pattern
  // 8 for the 8 bytes before the end of the matched pattern
end if

Returns the number of found offsets.
text must be in UTF-8 text encoding.
Start must be 0 for the first character and the byte offset for other characters. Do not pass values from OffsetCharacters here!

Return values from Execute:
If Execute() fails, it returns a negative number. The following are defined in the header file:

PCRE_ERROR_NOMATCH (-1): The subject string did not match the pattern.
PCRE_ERROR_NULL (-2): Either code or subject was passed as "".
PCRE_ERROR_BADOPTION (-3): An unrecognized bit was set in the options argument.
PCRE_ERROR_BADMAGIC (-4): PCRE stores a 4-byte "magic number" at the start of the compiled code, to catch the case when it is passed a junk pointer and to detect when a pattern that was compiled in an environment of one endianness is run in an environment with the other endianness. This is the error that PCRE gives when the magic number is not present.
PCRE_ERROR_UNKNOWN_NODE (-5): While running the pattern match, an unknown item was encountered in the compiled pattern. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern.
PCRE_ERROR_NOMEMORY (-6): If a pattern contains back references, but the ovector that is passed to Execute() is not big enough to remember the referenced substrings, PCRE gets a block of memory at the start of matching to use for this purpose. If the call via pcre_malloc() fails, this error is given. The memory is automatically freed at the end of matching.
PCRE_ERROR_MATCHLIMIT (-8): The backtracking limit, as specified by the match_limit field in a pcre_extra structure (or defaulted), was reached.
PCRE_ERROR_RECURSIONLIMIT (-21): The internal recursion limit, as specified by the match_limit_recursion field in a pcre_extra structure (or defaulted), was reached.
PCRE_ERROR_CALLOUT (-9): This error is never generated by Execute() itself. It is provided for use by callout functions that want to yield a distinctive error code. See the pcrecallout documentation for details.
PCRE_ERROR_BADUTF8 (-10): A string that contains an invalid UTF-8 byte sequence was passed as a subject.
PCRE_ERROR_BADUTF8_OFFSET (-11): The UTF-8 byte sequence that was passed as a subject was valid, but the value of startoffset did not point to the beginning of a UTF-8 character.
PCRE_ERROR_PARTIAL (-12): The subject string did not match, but it did match partially. See the pcrepartial documentation for details of partial matching.
PCRE_ERROR_BADPARTIAL (-13): The PCRE_PARTIAL option was used with a compiled pattern containing items that are not supported for partial matching. See the pcrepartial documentation for details of partial matching.
PCRE_ERROR_INTERNAL (-14): An unexpected internal error has occurred. This error could be caused by a bug in PCRE or by overwriting of the compiled pattern.
PCRE_ERROR_BADCOUNT (-15): This error is given if the value of the ovecsize argument is negative.
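
A sketch of how a caller might branch on these return values. The numeric codes are taken from the list above; the pattern, the sample text and the messages are illustrative.

```xojo
Var r As New RegExMBS

If r.Compile("\d+") Then
  Var c As Integer = r.Execute("no digits here", 0)

  Select Case c
  Case -1
    MsgBox "no match" // PCRE_ERROR_NOMATCH
  Case -10, -11
    MsgBox "invalid UTF-8 or bad start offset" // PCRE_ERROR_BADUTF8, PCRE_ERROR_BADUTF8_OFFSET
  Case -8, -21
    MsgBox "match or recursion limit reached" // PCRE_ERROR_MATCHLIMIT, PCRE_ERROR_RECURSIONLIMIT
  Else
    If c < 0 Then
      MsgBox "Execute failed with code " + Str(c)
    Else
      MsgBox "Execute returned " + Str(c)
    End If
  End Select
End If
```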

RegExMBS.ExecuteMemory(text as memoryblock, ByteOffset as Integer = 0, ByteLength as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern.

Same as Execute, but the text is stored in a memoryblock.
Be careful to use valid UTF8 input and provide offset and length in byte units and not in characters.
If ByteLength is zero, we take the length of the memoryblock.

RegExMBS.ExecuteMemoryMT(text as memoryblock, ByteOffset as Integer = 0, ByteLength as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.1 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern.

Same as ExecuteMemory, but more thread friendly.

The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit from this, call it from a Xojo thread. If called on the main thread, it blocks that thread but keeps other background threads running.

RegExMBS.ExecuteMT(start as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.1 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern.

Same as Execute, but more thread friendly.

The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit from this, call it from a Xojo thread. If called on the main thread, it blocks that thread but keeps other background threads running.

RegExMBS.ExecuteMT(text as string, start as Integer = 0) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.1 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Performs a search with the compiled pattern.

Same as Execute, but more thread friendly.

The work is performed on a preemptive thread, so this function does not block the application and can yield time to other Xojo threads. To benefit from this, call it from a Xojo thread. If called on the main thread, it blocks that thread but keeps other background threads running.

RegExMBS.InfoNameEntry(Index as Integer) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 13.4 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Queries a name entry in the list of captures.

Only valid after pattern was compiled.

RegExMBS.Match(text as string) as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 16.0 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Checks if current search pattern matches against given text.
Example
Var r as new RegExMBS

if r.Compile("e.l") then

if r.Match("Hello") then
MsgBox "match"
end if

if r.Match("Helro") then
MsgBox "wrong match"
end if
end if

Returns true if text matches.

Unlike Execute, Match does not set the result properties, so SubString() will not work afterwards.

RegExMBS.Match(text() as string, inverse as boolean = false) as string()

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 16.0 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Checks if current search pattern matches against given text array.
Example
Var r as new RegExMBS

r.CompileOptionCaseLess = true

if r.Compile("e.l") then

Var t() as string = array("Hello", "World", "Xojo", "test")

Var match1() as string = r.Match(t)
Var match2() as string = r.Match(t, true)

MsgBox "Matching: "+Join(match1, ", ")+EndOfLine+"Other: "+Join(match2, ", ")
end if

Returns the list of matching values.
If inverse is set to true, it returns the list of non-matching values.

RegExMBS.Match(text() as Variant, inverse as boolean = false) as string()

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 16.0 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Checks if current search pattern matches against given variant array.
Example
Var r as new RegExMBS

r.CompileOptionCaseLess = true

if r.Compile("e.l") then

Var dic as new Dictionary
dic.Value("Hello") = 1
dic.Value("World") = 2
dic.Value("Xojo") = 3
dic.Value("Test") = 4

Var match1() as string = r.Match(dic.keys)
Var match2() as string = r.Match(dic.keys, true)

MsgBox "Matching: "+Join(match1, ", ")+EndOfLine+"Other: "+Join(match2, ", ")
end if

The variant array should have entries which convert well to string.
Returns the list of matching values.
If inverse is set to true, it returns the list of non-matching values.

RegExMBS.Offset(index as Integer) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns the offset, in bytes, stored at the given index of the offset list.
Example
Var r as RegExMBS
Var s as string
Var c as Integer

s = "123 äöü ABC 456"

r = new RegExMBS
if r.Compile(".ö.") then
  c = r.Execute(s, 0)
  MsgBox str(c)+" "+str(r.Offset(0))+" "+str(r.Offset(1))
  // shows: 1 4 10
  // 1 for ubound of the offset array
  // 4 for the 4 bytes before the matched pattern
  // 10 for the 10 bytes before the end of the matched pattern
end if

r = new RegExMBS
if r.Compile(".\xF6.") then // finds ö using Unicode codepoint
  c = r.Execute(s, 0)
  MsgBox str(c)+" "+str(r.Offset(0))+" "+str(r.Offset(1))
  // shows: 1 4 10
  // 1 for ubound of the offset array
  // 4 for the 4 bytes before the matched pattern
  // 10 for the 10 bytes before the end of the matched pattern
end if

If you found a pattern in a string you get here:

index   offset
0       start of matched pattern
1       end of matched pattern
2       start of subexpression 1
3       end of subexpression 1
2*n     start of subexpression n
2*n+1   end of subexpression n

Invalid indexes return 0.
Count is the number of entries here.
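
As a sketch of the index scheme above, the text of subexpression n could be cut out of the subject with the byte offsets at 2*n and 2*n+1. The pattern, the sample text and the variable names are illustrative.

```xojo
Var r As New RegExMBS
Var s As String = "123 ABC"

If r.Compile("(\d+) (\w+)") Then
  If r.Execute(s, 0) > 0 Then
    Var n As Integer = 2 // second subexpression, (\w+)
    Var startByte As Integer = r.Offset(2 * n)
    Var endByte As Integer = r.Offset(2 * n + 1)

    // Offsets are zero-based byte positions, MidB is 1-based
    MsgBox MidB(s, startByte + 1, endByte - startByte)
  End If
End If
```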

RegExMBS.OffsetCharacters(index as Integer) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.3 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Returns the offset, in characters, stored at the given index of the offset list.
Example
Var r as new RegExMBS
Var searchString as string = ".o"

if r.Compile(searchString) then

  Var s as string = "äöü Hello World"

  if r.Execute(s, 0) > 0 then
    Var lines(-1) as string

    lines.Append str(r.Count)+" offset found."
    lines.Append "In Bytes:"
    lines.Append " Start of matched pattern: "+str(r.Offset(0))
    lines.Append " End of matched pattern: "+str(r.Offset(1))
    lines.Append " Length of matched pattern: "+str(r.Offset(1)-r.Offset(0))

    lines.Append "In Characters:"
    lines.Append " Start of matched pattern: "+str(r.OffsetCharacters(0))
    lines.Append " End of matched pattern: "+str(r.OffsetCharacters(1))
    lines.Append " Length of matched pattern: "+str(r.OffsetCharacters(1)-r.OffsetCharacters(0))

    MsgBox Join(lines, EndOfLine)
  end if

else
  MsgBox "failed to compile"
end if

This function is identical to Offset(), but returns characters instead of bytes.
Works only with valid UTF-8 strings as input.
Value is calculated on each function call based on Offset(index) and current text.

If you found a pattern in a string you get here:

index   offset
0       start of matched pattern
1       end of matched pattern
2       start of subexpression 1
3       end of subexpression 1
2*n     start of subexpression n
2*n+1   end of subexpression n

Invalid indexes return 0.
Count is the number of entries here.

Please note that if you only need offsets for calling the Mid() function, you can get better performance by using Offset and the MidB function instead. Then neither Mid nor OffsetCharacters needs to calculate the character offset.
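
A minimal sketch comparing the two approaches on the sample string from the example above. Both calls are meant to extract the same matched text. The variable names are illustrative.

```xojo
Var r As New RegExMBS
Var s As String = "äöü Hello World"

If r.Compile(".o") Then
  If r.Execute(s, 0) > 0 Then
    // Byte based: no character counting needed
    Var viaBytes As String = MidB(s, r.Offset(0) + 1, r.Offset(1) - r.Offset(0))

    // Character based: OffsetCharacters and Mid both have to count characters
    Var viaChars As String = Mid(s, r.OffsetCharacters(0) + 1, r.OffsetCharacters(1) - r.OffsetCharacters(0))

    MsgBox viaBytes + " / " + viaChars
  End If
End If
```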

RegExMBS.Replace(NewText as string) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Replaces the text at the currently found position and returns the complete new text.

You need to call Execute first.
Lasterror is set.
NewText must have UTF-8 text encoding.

\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.
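
A sketch of the call order, assuming a pattern with one subexpression so that \1 can be used in the replacement text. The pattern and the strings are illustrative.

```xojo
Var r As New RegExMBS
Var s As String = "abc 123 def"

If r.Compile("(\d+)") Then
  // Execute must find a match before Replace can be used
  If r.Execute(s, 0) > 0 Then
    // \1 refers to the first subexpression, here the digits
    Var result As String = r.Replace("[\1]")
    MsgBox result
  End If
End If
```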

RegExMBS.ReplaceAll(Target as string, NewText as string = "") as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Searches the target string for current pattern and replaces all occurrences with the new text.

You need to call Compile first to initialize the pattern, and you should call Study afterwards to optimize it before calling ReplaceAll.
Lasterror is set.
Target and NewText must have UTF-8 text encoding.

\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.
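
A sketch of the recommended sequence of Compile, Study and ReplaceAll. The whitespace pattern and the sample text are illustrative.

```xojo
Var r As New RegExMBS

If r.Compile("\s+") Then
  // Optional optimization; the boolean result is ignored in this sketch
  Call r.Study

  // Collapse every run of whitespace into a single space
  Var result As String = r.ReplaceAll("a   b    c", " ")
  MsgBox result
End If
```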

RegExMBS.ReplaceSelection(NewText as string) as string

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
Replaces the text at the currently found position and returns the new text for that selection.

This method is intended for text editors, where you store the result in editfield.seltext to replace the current selection.
Lasterror is set.
You need to call Execute first.
NewText must have UTF-8 text encoding.

\0 references the whole found pattern, \1 to \15 the subexpressions.
\t is replaced with chr(9), \r and \n with chr(13) and \\ with \.

RegExMBS.StringNumber(name as string) as Integer

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
This convenience function finds the number of a named capturing parenthesis (named substring) in a compiled pattern.

name: The name whose number is required.
Returns the number of the parenthesis if the name is found, or PCRE_ERROR_NOSUBSTRING (-7) otherwise.
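
A sketch of looking up a named group and reading its text through the offset list, assuming the (?P<name>...) syntax for named groups. The pattern, the group names and the sample text are illustrative.

```xojo
Var r As New RegExMBS
Var s As String = "2024-05"

If r.Compile("(?P<year>\d{4})-(?P<month>\d\d)") Then
  If r.Execute(s, 0) > 0 Then
    Var n As Integer = r.StringNumber("month")

    // A negative value means the name was not found
    If n > 0 Then
      Var startByte As Integer = r.Offset(2 * n)
      Var endByte As Integer = r.Offset(2 * n + 1)
      MsgBox MidB(s, startByte + 1, endByte - startByte) // MidB is 1-based
    End If
  End If
End If
```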

RegExMBS.Study as boolean

Type Topic Plugin Version macOS Windows Linux iOS Targets
method Regular Expressions MBS RegEx Plugin 6.2 ✅ Yes ✅ Yes ✅ Yes ✅ Yes All
After you have compiled a pattern, Study can optimize it.

Only useful if you use Execute several times.
In that case, call Compile once, Study once, and Execute several times.
ErrorMessage is set.
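
The call pattern described above could look like this. It is a sketch; the pattern and the list of texts are illustrative.

```xojo
Var r As New RegExMBS
Var texts() As String = Array("Hello", "World", "Xojo", "test")
Var hits As Integer = 0

If r.Compile("o") Then
  // Study once after compiling; the boolean result is ignored in this sketch
  Call r.Study

  // Reuse the compiled and studied pattern for every text
  For Each t As String In texts
    If r.Execute(t, 0) > 0 Then hits = hits + 1
  Next
End If

MsgBox Str(hits) + " texts matched"
```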

The items on this page are in the following plugins: MBS RegEx Plugin.