Match based on condition

Problem

Create a regular expression that matches a comma-delimited list of the words one,
two, and three. Each word can occur any number of times in the list, and the words
can occur in any order, but each word must appear at least once.

Solution

\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))

  • Regex options: None
  • Regex flavors: .NET, PCRE, Perl, Python

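The regex starts with a word boundary, followed by a noncapturing group that holds three capturing groups, one for each word.
Each word must be followed by a comma or a word boundary, and the whole noncapturing group must repeat at least three times.
Three conditionals then check whether capturing groups 1, 2, and 3 each took part in the match. If a group participated, the empty "then" part matches and the regex continues.
If it did not, the "else" part is the empty negative lookahead (?!), which always fails, so the overall match fails.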
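
The behavior described above can be tried in Python, one of the flavors listed for this regex. This is only a sketch: the helper name has_all_three and the use of re.fullmatch (so the whole string has to be the list) are assumptions made for illustration, not part of the recipe.

```python
import re

# The solution regex, unchanged. Each conditional (?(n)|(?!)) has an empty
# "then" part and an always-failing "else" part.
LIST_RE = re.compile(
    r"\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}"
    r"(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))"
)

def has_all_three(text):
    """Return True if text is a list that contains one, two, and three."""
    return LIST_RE.fullmatch(text) is not None
```

A list such as three,one,two,one should pass, while one,two,two should be rejected because group 3 never participates.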


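This variant drops the conditionals, so it also works in flavors that lack them. Extra code must then check that each of the three capturing groups took part in the match.

\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}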

  • Regex options: None
  • Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
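
The variant without conditionals leaves the "each word at least once" check to the calling code. A minimal sketch of that check, written in Python purely for illustration (the name has_all_three_checked and the choice of re.fullmatch are assumptions of this sketch):

```python
import re

# The variant without conditionals, unchanged.
WORDS_RE = re.compile(r"\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}")

def has_all_three_checked(text):
    """Match the list, then verify in code that every group participated."""
    match = WORDS_RE.fullmatch(text)
    if match is None:
        return False
    # A group that never took part in the match is reported as None.
    return all(match.group(n) is not None for n in (1, 2, 3))
```

The same idea carries over to other languages: run the regex, then inspect the three capturing groups.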

Discussion

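  • .NET, Python, and PCRE 6.7 and later support named conditionals: (?(name)then|else)
  • Perl 5.10 and later requires angle brackets or quotes around the name: (?(<name>)then|else) or (?('name')then|else)
  • PCRE 7.0 and later supports Perl’s named-conditional syntax in addition to the syntax used by .NET and Python.
  • .NET, PCRE, and Perl support lookaround conditionals: (?(?=if)then|else) or (?(?<=if)then|else)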
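
As a small illustration of the named form in Python syntax, the sketch below defines a group with (?P<name>...) and tests it with (?(name)then|else). The group name lead and the helper matches_named are made up for this example.

```python
import re

# (?P<lead>a)? optionally captures "a"; (?(lead)c|d) then requires "c" if the
# group participated and "d" if it did not.
NAMED_RE = re.compile(r"(?P<lead>a)?b(?(lead)c|d)")

def matches_named(text):
    """Return True if the whole string matches the named-conditional regex."""
    return NAMED_RE.fullmatch(text) is not None
```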

Example

(a)?b(?(1)c|d) matches abc and bd
(a)?b(?(?<=ab)c|d) matches abc and bd

Tip

A lookaround conditional can be written without the conditional as (?=if)then|(?!if)else

(a)?b((?<=ab)c|(?<!ab)d) matches abc and bd
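
To see the group conditional and the lookbehind rewrite side by side, here is a short Python sketch. Python's re module accepts the group conditional and fixed-width lookbehind, so both patterns compile there. The names CONDITIONAL_RE, LOOKBEHIND_RE, and compare are assumptions of this sketch.

```python
import re

# Conditional on capturing group 1.
CONDITIONAL_RE = re.compile(r"(a)?b(?(1)c|d)")
# The same logic written as an alternation of two opposite lookbehinds.
LOOKBEHIND_RE = re.compile(r"(a)?b((?<=ab)c|(?<!ab)d)")

def compare(text):
    """Return (conditional result, lookbehind result) for a whole-string match."""
    return (
        CONDITIONAL_RE.fullmatch(text) is not None,
        LOOKBEHIND_RE.fullmatch(text) is not None,
    )
```

Both patterns accept abc and bd and reject abd and bc when the whole string must match. The two forms are not interchangeable in every situation: the conditional tests whether group 1 participated, while the lookbehind tests the preceding text. With re.search on abd, for instance, the first pattern can match bd starting at the second character, and the second cannot.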