Comparison of regular expression engines

From Wikipedia, the free encyclopedia

This is a comparison of regular expression engines.

Libraries

More information Name, Official website ...
  1. Formerly called Regex++.
  2. Included since version 2.13.0.
  3. ICU4J, the Java version, does not support regular expressions.
  4. C++ bindings were developed by Google and became officially part of PCRE in 2006.

Languages

More information Language, Official website ...
  1. "STD.regex - D Programming Language - Digital Mars".
  2. "Dotnet/Corefx". GitHub. 16 February 2022.
  3. "Dotnet/Corefx". GitHub. 16 February 2022.

Language features

NOTE: An application that uses a library for regular expression support does not necessarily expose the library's full feature set; e.g., GNU grep uses PCRE but supports no lookahead, even though PCRE does.

Part 1

More information "+" quantifier, Negated character classes ...
  1. Non-greedy quantifiers match as few characters as possible, instead of the default of as many as possible. Note that many older, pre-POSIX engines were non-greedy and had no greedy quantifiers at all.
  2. Shy groups, also called non-capturing groups, cannot be referred to with backreferences; they are used to speed up matching where the group's content does not need to be accessed later.
  3. Backreferences enable referring to previously matched groups in later parts of the regex and/or replacement string (where applicable). For instance, ([ab]+)\1 matches the whole string "abab" but cannot match the whole string "abaab".
  4. FREJ has no repetition quantifiers, but has an "optional" element that behaves similarly to a simple "?" quantifier.
  5. As of ES2018
  6. Lua's only non-greedy quantifier is -, which is a non-greedy version of *. It does not have non-greedy versions of + or ?; in the former case, the non-greedy effect can be achieved by repeating the token followed by -, but in the latter case, there is no equivalent.
  7. Supported by the optional regex library only.
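
Notes 1, 2, 3 and 6 above can be illustrated with a short sketch. It uses Python's standard `re` module purely as a convenient engine; the sample strings and variable names are assumptions made for the example, not taken from the comparison tables. Lua's `-` is approximated here by Python's `*?`.

```python
import re

text = "<a><b>"

# Greedy "+" takes as much as possible; "+?" takes as little as possible.
greedy = re.search(r"<.+>", text).group(0)
lazy = re.search(r"<.+?>", text).group(0)

# Note 6's workaround, in Python syntax: "x+?" behaves like "xx*?"
# (Lua would write the lazy star as "-").
lazy_via_star = re.search(r"<..*?>", text).group(0)

# Non-capturing group: (?:...) groups without creating a numbered group,
# so the only captured group here is the digit.
captured = re.fullmatch(r"(?:ab|cd)+(\d)", "abcd7").groups()

# Backreference from note 3, tested against the whole string.
doubled = re.compile(r"([ab]+)\1")
whole_abab = doubled.fullmatch("abab") is not None
whole_abaab = doubled.fullmatch("abaab") is not None

# An unanchored search still finds the doubled "a" inside "abaab".
inner = doubled.search("abaab").group(0)

print(greedy, lazy, lazy_via_star, captured, whole_abab, whole_abaab, inner)
```

The backreference check uses `fullmatch` on purpose: the "abab" versus "abaab" contrast holds for whole-string matching, whereas an unanchored search can still find a shorter doubled substring.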

Part 2

More information Directives, Conditionals ...
  1. Also known as flag modifiers, mode modifiers, or option letters. Example pattern: "(?i:test)".
  2. Also called independent sub-expressions.
  3. Similar to back references, but with names instead of indices.
  4. A special feature that allows matching balanced constructs without recursion.
  5. Refers to the possibility of including quantifiers in look-behinds, thus making their length unpredictable.
  6. Unicode property support may be incomplete, since products are updated continuously. Every implementation is incomplete from the release of a new Unicode revision until it is updated to comply.
  7. Available as of ICU55.
  8. Available as of JDK7.
  9. The support and range of properties is dependent on implementation.
  10. Experimental support added in v5.29.9.
  11. Supported by Python v3.11 and later, and the optional regex library only.
  12. May only be available in the regex library when used with Python versions after 3.3.
  13. Supported by the optional regex library only.
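
As a rough illustration of notes 1 to 3 above (directives, independent sub-expressions, and named backreferences), the following sketch again uses Python's standard `re` module. The test strings are made up for the example. The atomic-group syntax `(?>...)` is accepted by `re` only from Python 3.11 on, so that part is guarded by a version check.

```python
import re
import sys

# Scoped directive from note 1: (?i:...) is case-insensitive only inside the group.
scoped = re.compile(r"(?i:test)ing")
scoped_ok = scoped.fullmatch("TESTing") is not None
scoped_rest_still_strict = scoped.fullmatch("TESTING") is None

# Named backreference (note 3): (?P=q) must repeat whatever group "q" matched.
quoted = re.compile(r"""(?P<q>['"]).*?(?P=q)""")
quoted_match = quoted.search('say "it\'s" now').group(0)

# Atomic group (note 2): once (?>a+) has matched, it gives nothing back.
# The plain pattern succeeds only because a+ can backtrack by one "a".
backtracking = re.search(r"a+ab", "aaab") is not None
if sys.version_info >= (3, 11):
    atomic = re.search(r"(?>a+)ab", "aaab") is not None
else:
    atomic = None  # syntax not available in this interpreter's re module

print(scoped_ok, scoped_rest_still_strict, quoted_match, backtracking, atomic)
```

On Python 3.11 or later, the atomic variant fails to match where the backtracking variant succeeds, which is the behavioral difference an independent sub-expression is meant to introduce.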

API features

More information Native UTF-16 support, Native UTF-8 support ...
  1. Means the format can be used internally without explicit conversion.
  2. Partial match of the whole regular expression. For example, the pattern ".*END$" will match any string partially, but only strings ending with END fully.
  3. Supports Unicode 15.0 standard from 2023.
  4. Implementation uses original UCS-2 support/features, so it only recognizes 64K chars total (vs UTF-16's 1,112,064 characters). A Microsoft developer-representative answered a bug report on this as "will not fix" in 2010.
  5. Since version 8.30.
  6. Partial matching is performed implicitly, requiring a separate call to matchedLength() if an exact match fails.
  7. Tcl includes facilities to convert to and from UTF-8.
  8. wxRegEx uses any system-supplied POSIX library; if none is available, and in Unicode mode, it uses Henry Spencer's library.
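
The figures in note 4 can be checked with a little arithmetic. The sketch below is only an illustration in Python; the variable names and the sample character (U+1F600) are assumptions chosen for the example. It shows where 1,112,064 comes from and why an engine limited to 16-bit UCS-2 units would see such a character as two units rather than one.

```python
import re

# Unicode code points run from U+0000 to U+10FFFF; the 2,048 surrogate
# code points (U+D800..U+DFFF) cannot be encoded as characters.
total_code_points = 0x110000
surrogates = 0xE000 - 0xD800
encodable = total_code_points - surrogates

# UCS-2 uses a single 16-bit unit per character, hence "64K".
ucs2_limit = 0x10000

# A character outside the 16-bit range needs two UTF-16 code units.
sample = "\U0001F600"
utf16_units = len(sample.encode("utf-16-le")) // 2

# Python's re works on code points, so "." consumes the whole character.
one_code_point = re.fullmatch(r".", sample) is not None

print(encodable, ucs2_limit, utf16_units, one_code_point)
```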

See also

References
