Group match

Group number rules

\b(\d\d(\d\d))-\2-\2\b
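Groups are numbered by the position of their opening parenthesis, from left to right, starting at 1. An outer group therefore gets a lower number than the groups nested inside it.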
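
A minimal sketch of the numbering rule, assuming Python's re module (the sample date string is made up for illustration). The outer pair of parentheses opens first and becomes group 1; the nested pair becomes group 2.

```python
import re

# Group 1 is the outer pair (it opens first); group 2 is the nested pair.
pattern = re.compile(r"\b(\d\d(\d\d))-\2-\2\b")

# Hypothetical sample text, not taken from the notes.
match = pattern.search("released 2008-08-08")

outer = match.group(1)  # text captured by the outer group
inner = match.group(2)  # text captured by the nested group
print(outer, inner)
```

Here \2 refers back to the nested group, so the regex only matches dates whose month and day repeat the last two digits of the year.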

Noncapturing group

\b(?i:Mary|Jane|Sue)\b

  • Regex options: None
  • Regex flavors: .NET, Java, PCRE, Perl, Ruby

(?i-sm:Mary.*?\r\n)

  • Regex options: None
  • Regex flavors: .NET, Java, PCRE, Perl, Ruby

Match options can be set inside the group itself.
A group that starts with ‘?:’, or with mode modifiers such as ‘?i:’ or ‘?i-sm:’, groups its contents but does not capture them.
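
A small sketch of both forms, assuming Python 3.6 or later. Python is not in the flavor lists above, but its re module accepts scoped modifiers such as (?i:...) from version 3.6 on. The surname "Smith" is a made-up addition to show where the case-insensitive scope ends.

```python
import re

# (?:...) groups the alternatives without creating a numbered group.
plain = re.compile(r"\b(?:Mary|Jane|Sue)\b")

# (?i:...) is also noncapturing, and makes only the part inside the
# group case insensitive. " Smith" is a hypothetical suffix.
scoped = re.compile(r"\b(?i:Mary|Jane|Sue) Smith\b")

group_count = plain.groups          # number of capturing groups
hit = scoped.search("JANE Smith")   # the name may be in any case
miss = scoped.search("Jane SMITH")  # "Smith" stays case sensitive
print(group_count, hit is not None, miss is None)
```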

Match Previously Matched Again

\b\d\d(\d\d)-\1-\1\b

  • Regex options: None
  • Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
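
As an illustration, here is the same regex run with Python's re module (one of the listed flavors). The two date strings are invented test inputs.

```python
import re

# \1 must match exactly the text that group 1 captured earlier.
pattern = re.compile(r"\b\d\d(\d\d)-\1-\1\b")

magic = pattern.search("2008-08-08")     # month and day repeat "08"
ordinary = pattern.search("2008-05-24")  # nothing repeats, so no match

print(magic.group(0), magic.group(1), ordinary)
```

A backreference matches the captured text, not the pattern inside the group, so \1 here cannot match just any two digits.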

\01 is either an octal escape or an error, so don’t use it.
Hexadecimal escapes such as \xFF are much easier to understand.

In JavaScript, a backreference that appears before its group matches the empty string (a zero-length match).
For example, \d\d\1-\1-(\d\d) can match 12--34:
each \1 matches at a zero-length position, one after ‘2’ and one after the first ‘-’.

Name the capture group

Named capture

\b(?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d)\b

  • Regex options: None
  • Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9

\b(?'year'\d\d\d\d)-(?'month'\d\d)-(?'day'\d\d)\b

  • Regex options: None
  • Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9

\b(?P<year>\d\d\d\d)-(?P<month>\d\d)-(?P<day>\d\d)\b

  • Regex options: None
  • Regex flavors: PCRE 4 and later, Perl 5.10, Python
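
A short sketch of the (?P<name>...) form in Python's re module, assuming an invented sample sentence. Named groups can be read back by name, and they still receive ordinary numbers as well.

```python
import re

pattern = re.compile(
    r"\b(?P<year>\d\d\d\d)-(?P<month>\d\d)-(?P<day>\d\d)\b"
)

# Hypothetical input text.
match = pattern.search("due on 2008-05-24")

parts = match.groupdict()   # all named groups as a dict
year = match.group("year")  # look a group up by name
first = match.group(1)      # named groups are numbered too
print(parts, year, first)
```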

Named backreferences

\b\d\d(?<magic>\d\d)-\k<magic>-\k<magic>\b

  • Regex options: None
  • Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9

\b\d\d(?'magic'\d\d)-\k'magic'-\k'magic'\b

  • Regex options: None
  • Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9

\b\d\d(?P<magic>\d\d)-(?P=magic)-(?P=magic)\b

  • Regex options: None
  • Regex flavors: PCRE 4 and later, Perl 5.10, Python
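
The Python form of the named backreference can be tried out as below. This is only a sketch with made-up inputs; (?P=magic) plays the same role that \1 plays in the numbered version of this regex.

```python
import re

# (?P=magic) must match the same text that the group named
# "magic" captured earlier in the match.
pattern = re.compile(r"\b\d\d(?P<magic>\d\d)-(?P=magic)-(?P=magic)\b")

magic = pattern.search("2008-08-08")
ordinary = pattern.search("2008-05-24")

print(magic.group("magic"), ordinary)
```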

A group’s name must consist of word characters matched by \w.

Angle brackets are awkward in XML-style environments,
so some flavors also allow single quotes (') in place of the angle brackets.

Using the same name for more than one group in an expression is not recommended.

Syntax history

Python was the first regular expression flavor to support named capture.

Perhaps due to .NET’s popularity over Python, the .NET syntax seems to be the one
that other regex library developers prefer to copy. Perl 5.10 and later have it, and so
does the Oniguruma engine in Ruby 1.9. Perl 5.10 and Ruby 1.9 support both the syntax
using angle brackets and single quotes. Java 7 also copied the .NET syntax, but only
the variant using angle brackets. Standard JavaScript does not support named capture.
XRegExp adds support for named capture using the .NET syntax, but only the variant
with angle brackets.

PCRE copied Python’s syntax long ago, at a time when Perl did not support named
capture at all. PCRE 7, the version that adds the new features in Perl 5.10, supports
both the .NET syntax and the Python syntax. Perhaps as a testament to the success of
PCRE, in a reverse compatibility move, Perl 5.10 also supports the Python syntax. In
PCRE and Perl 5.10, the functionality of the .NET syntax and the Python syntax for
named capture is identical.