Answer.

In Perl, the regex quantifiers *, +, ?, and {n,m} are greedy by default: each one matches as many characters as possible while still allowing the rest of the pattern to match.

Adding ? after a quantifier (*?, +?, ??, {n,m}?) makes it lazy (non-greedy): it matches as few characters as possible while still allowing the entire regex to match.

Example of greedy matching:

my $str = 'foo <bar> baz <quux>';
$str =~ /<.*>/;   # Matches '<bar> baz <quux>'

Example of lazy matching:

my $str = 'foo <bar> baz <quux>';
$str =~ /<.*?>/;   # Matches '<bar>'

Caveat:

A greedy quantifier can "eat" far more than you expect when parsing HTML or other nested constructs.
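
To see how much each form consumes, here is a minimal runnable sketch that reuses the sample string from the examples above. The variable names $greedy and $lazy are illustrative only; the outer parentheses are added so the whole match can be read in list context.

```perl
use strict;
use warnings;

my $str = 'foo <bar> baz <quux>';

# Greedy: .* runs to the end of the string, then backs off to the last '>'.
my ($greedy) = $str =~ /(<.*>)/;

# Lazy: .*? grows one character at a time and stops at the first '>'.
my ($lazy) = $str =~ /(<.*?>)/;

print "greedy: $greedy\n";   # greedy: <bar> baz <quux>
print "lazy:   $lazy\n";     # lazy:   <bar>
```

Both patterns start matching at the same leftmost '<'; they differ only in where the match ends.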

Trick question.

What are the differences between the following two regexes when parsing the string <a><b><c>: /<(.*)>/ and /<(.*?)>/?

Answer:

/<(.*)>/ (greedy) matches the longest possible span: the whole string <a><b><c>, with $1 set to 'a><b><c'.
/<(.*?)>/ (lazy) matches only the first tag, <a>, with $1 set to 'a'.

Example:

my $s = '<a><b><c>';
$s =~ /<(.*)>/;    # $1: 'a><b><c'
$s =~ /<(.*?)>/;  # $1: 'a'

Examples of real-world bugs caused by not knowing these subtleties.

Story

In a news headline import application, the programmer wanted to extract the tag name from the string <title>News</title> using /\<(.*)\>/. Instead, the regex captured everything between the first < and the last > (here, title>News</title) rather than the tag name. The bug was noticed only when nested tags appeared.
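
The behavior described in this story can be reproduced with a short sketch. The lazy variant is one possible fix and is an assumption here, since the story does not say how the bug was resolved; the variable names are illustrative.

```perl
use strict;
use warnings;

my $line = '<title>News</title>';

# Greedy, as in the story: $1 spans from the first '<' to the last '>'.
my ($greedy_name) = $line =~ /<(.*)>/;

# Lazy (assumed fix): $1 stops at the first '>', leaving only the tag name.
my ($lazy_name) = $line =~ /<(.*?)>/;

print "greedy: $greedy_name\n";   # greedy: title>News</title
print "lazy:   $lazy_name\n";     # lazy:   title
```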

Story

In a logical parser for extracting quoted strings, the pattern /"(.*)"/ unexpectedly captured everything between the first and the last quote. As a result, the markup was split incorrectly until the pattern was replaced with /"(.*?)"/.
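
A small sketch of the two patterns from this story, applied with the /g modifier in list context so that every capture is collected. The input string is a made-up example, not data from the original parser.

```perl
use strict;
use warnings;

# Hypothetical input; the story does not show the real text.
my $text = 'say "hello" and "goodbye"';

# Greedy: one capture running from the first quote to the last one.
my @greedy = $text =~ /"(.*)"/g;

# Lazy: one capture per quoted string.
my @lazy = $text =~ /"(.*?)"/g;

print join(' | ', @greedy), "\n";   # hello" and "goodbye
print join(' | ', @lazy), "\n";     # hello | goodbye
```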

Story

In an automatic CSV parser with support for quoted fields, a greedy pattern caused multiple columns to merge into one. The parser's limitation surfaced only on large data sets; switching to a lazy pattern solved the problem.
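
The column merge can be illustrated with a sketch under stated assumptions: the story shows neither the pattern nor the data, so both the sample row and the /"(.*)"/ pattern below are hypothetical. A row with a single quoted field gives the same result under both patterns, which may be why a bug like this stays hidden until a larger data set contains rows with two or more quoted fields.

```perl
use strict;
use warnings;

# Hypothetical row with two quoted fields that contain commas.
my $row = '1,"Smith, John",42,"New York, NY"';

# Greedy (assumed buggy pattern): both quoted fields and the
# column between them collapse into a single capture.
my @greedy = $row =~ /"(.*)"/g;

# Lazy: each quoted field is captured separately.
my @lazy = $row =~ /"(.*?)"/g;

print scalar(@greedy), ": ", join(' | ', @greedy), "\n";
# 1: Smith, John",42,"New York, NY
print scalar(@lazy), ": ", join(' | ', @lazy), "\n";
# 2: Smith, John | New York, NY
```

Note that a lazy pattern still does not handle escaped quotes inside a field, so this is a sketch of the fix described in the story rather than a complete CSV parser.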