Perl is one of the strongest languages for working with regular expressions. The operators =~ and !~ are used for matching.
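A minimal sketch of the two binding operators follows. The helper name classify_line and the sample lines are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a line of source text with =~ and !~.
sub classify_line {
    my ($line) = @_;
    return 'comment' if $line =~ /^\s*#/;   # =~ is true when the pattern matches
    return 'blank'   if $line !~ /\S/;      # !~ is true when the pattern does NOT match
    return 'code';
}

print classify_line($_), "\n" for ('# note', '   ', 'my $x = 1;');
```

Here !~ is simply the negation of =~, so `$line !~ /\S/` reads as "the line contains no non-whitespace character".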
Regular expression modifiers:
i - case-insensitive matching
g - global search (all occurrences)
m - multiline mode (^ and $ match at the start and end of each line)
s - dot (.) also matches the newline character
Example:
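    # Single quotes: in a double-quoted string, @mail and @ya would be interpolated as arrays.
    my $text = 'My email: max@mail.com and ann@ya.ru.';
    # The domain must end in a word character, so the sentence's final period is not captured.
    while ($text =~ /([\w.]+\@\w+(?:\.\w+)+)/g) {
        print "Email: $1\n";
    }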
In this example, the loop extracts all emails from the string.
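The modifiers listed above can be compared side by side. The following is a small sketch; the sample text and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $text = "Name: Ann\nname: Bob\n";

# /i ignores case; with /g in list context the match returns every capture.
my @names_ci = $text =~ /name: (\w+)/gi;   # Ann, Bob
my @names_cs = $text =~ /name: (\w+)/g;    # Bob only: "Name" differs in case

# /m lets ^ match at the start of every line, not only at the start of the string.
my @keys_m    = $text =~ /^(\w+):/mg;      # Name, name
my @keys_no_m = $text =~ /^(\w+):/g;       # Name only

# /s lets the dot match a newline.
my $dot_with_s    = $text =~ /Ann.name/s ? 1 : 0;   # 1
my $dot_without_s = $text =~ /Ann.name/  ? 1 : 0;   # 0

print join(', ', @names_ci), "\n";
print join(', ', @keys_m), "\n";
```

Note that /g and /m are independent: /g controls how many matches are found, while /m only changes where ^ and $ may match.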
After a match, values are captured in special variables $1, $2, ... (for each capturing group).
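As a short sketch of capturing groups, the code below reads $1, $2, $3 after a successful match and then shows the list-context form, where the match returns the captures directly. The sample date, the helper name parse_pair, and the key = value format are assumptions made for illustration.

```perl
use strict;
use warnings;

# Numbered variables: read them only inside the branch where the match succeeded.
my $date    = '2024-05-17';    # sample value, chosen for illustration
my $summary = '';
if ($date =~ /^(\d{4})-(\d{2})-(\d{2})$/) {
    $summary = "year=$1 month=$2 day=$3";
}
print "$summary\n";

# List context: the match returns the captured groups directly.
# parse_pair is a hypothetical helper name.
sub parse_pair {
    my ($line) = @_;
    my ($key, $value) = $line =~ /^(\w+)\s*=\s*(.*)$/
        or return;             # empty list when the line does not match
    return ($key, $value);
}

my ($k, $v) = parse_pair('timeout = 30');
print "$k => $v\n";
```

Copying captures into named variables this way keeps later code independent of whatever $1 happens to hold.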
What will be stored in variable $1 after an unsuccessful search using a regular expression?
Answer:
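$1 is not updated on an unsuccessful match: it still holds the value from the last successful match in the current scope. This is a potential source of errors! $1, $2, ... are read-only and cannot be cleared by assignment, so in complex conditions read them only after checking that the match succeeded, or copy the captures into your own variables right away.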
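The behaviour described in the answer can be reproduced in a few lines, followed by one possible defensive pattern. This is a sketch under stated assumptions: the helper name extract_email, the sample strings, and the simplified address pattern are illustrative and not a complete email validator.

```perl
use strict;
use warnings;

# Pitfall: a failed match leaves $1 untouched.
my $good = 'max@mail.com';
my $bad  = 'no address here';

$good =~ /(\w+)\@/;    # succeeds, $1 is 'max'
my $first = $1;
$bad =~ /(\w+)\@/;     # fails, $1 is NOT reset
my $stale = $1;        # still 'max'

# Defensive pattern: read $1 only in the branch where the match succeeded.
# extract_email is a hypothetical helper name.
sub extract_email {
    my ($input) = @_;
    if ($input =~ /([\w.]+\@\w+(?:\.\w+)+)/) {
        return $1;
    }
    return undef;      # explicit "no match" instead of a leftover capture
}

my $found = extract_email($bad);
print defined $found ? "found $found\n" : "no email found\n";
```

Returning undef on failure forces the caller to handle the "no match" case rather than silently reusing an earlier capture.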
History
Input validation with a complex regular expression relied on the value of $1. When a later match failed to find a suitable string, $1 did not change, and processing continued with the old, incorrect email address.
History
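The programmer chose the wrong modifier. On a large text, /abc/g was applied in the expectation that it would enable multiline mode. However, /g only makes the search global; it does not change how ^, $, or . treat line breaks. As a result, the pattern did not find the required lines across line breaks, although /abc/ms should have been used.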
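The difference between the modifiers in this story can be sketched as follows. The log text and the anchored pattern are assumptions made for illustration, since the story only names /abc/ as a placeholder.

```perl
use strict;
use warnings;

my $log = "info: start\nerror: disk full\nerror: timeout\n";

# /g alone: ^ still matches only at the very start of the string,
# and the string starts with "info", so nothing is found.
my @with_g = $log =~ /^error: (.*)$/g;

# /m added: ^ and $ now match at every line boundary.
my @with_mg = $log =~ /^error: (.*)$/mg;

# /s is needed only when the dot itself must cross a line break.
my $spans_lines = $log =~ /start.error/s ? 1 : 0;

print scalar(@with_g), " match(es) with /g\n";
print join('; ', @with_mg), "\n";
```

For a pattern without ^, $, or ., such as a literal /abc/, the /m and /s modifiers make no difference, so the choice matters only once anchors or dots are involved.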
History
The project forgot the global flag /g when searching for all occurrences. As a result, the script extracted only the first match, while the user expected a complete list of data.
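The effect of the missing flag can be shown in a short sketch. The sample string and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $text = 'ids: 10, 20, 30';

# Without /g the match stops after the first occurrence.
my ($first_only) = $text =~ /(\d+)/;

# With /g in list context every occurrence is returned.
my @all = $text =~ /(\d+)/g;

# Counting matches: assign to an empty list, then take the count in scalar context.
my $count = () = $text =~ /\d+/g;

print "first: $first_only\n";
print "all: ", join(', ', @all), "\n";
print "count: $count\n";
```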