Perl Tutorial
Regular expressions (regex) are a powerful feature in Perl, allowing for advanced pattern matching and manipulation. Perl's regex engine supports several operators to define these patterns. Let's dive into the primary regex operators in Perl:
m//
: Match operator. The m can be omitted when the delimiters are slashes.
if ($string =~ m/pattern/) {
    # code to execute if pattern matches
}
# or simply
if ($string =~ /pattern/) {
    # code to execute if pattern matches
}
s///
: Substitution operator. Replaces the matched pattern.
$string =~ s/pattern/replacement/;
To replace globally in the string:
$string =~ s/pattern/replacement/g;
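The difference between the two forms above can be sketched as follows. The sample string and variable names are illustrative assumptions, not part of the original tutorial.

```perl
use strict;
use warnings;

my $text = "cat hat cat";

# Without /g, only the first occurrence is replaced.
# The (my $copy = $text) idiom edits a copy and leaves $text alone.
(my $first_only = $text) =~ s/cat/dog/;

# With /g, every occurrence is replaced. In scalar context
# s/// returns the number of substitutions it made.
my $all   = $text;
my $count = ($all =~ s/cat/dog/g);

print "$first_only\n";      # dog hat cat
print "$all ($count)\n";    # dog hat dog (2)
```

Working on a copy is a common choice when the original string is still needed afterwards.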
=~
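: Binds a scalar value to a pattern match; the expression is true if the pattern matches.
!~
: Binds a scalar value to a pattern match and negates the result; the expression is true if the pattern does not match.
if ($string =~ /perl/) {
    print "Match found!";
}
if ($string !~ /ruby/) {
    print "Match not found!";
}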
Quantifiers define how many times an element can appear:
*
: 0 or more times
+
: 1 or more times
?
: 0 or 1 time
{n}
: Exactly n times
{n,}
: n or more times
{n,m}
: Between n and m times (inclusive)
$string =~ /ab*c/; # Matches 'ac', 'abc', 'abbc', etc.
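To compare the quantifiers listed above side by side, here is a minimal sketch that filters a small, made-up list of strings. The anchors ^ and $ are added so each pattern must match the whole string.

```perl
use strict;
use warnings;

my @candidates = ('ac', 'abc', 'abbbc', 'adc');

# grep keeps the strings for which the match succeeds.
my @star     = grep { /^ab*c$/ }     @candidates;  # zero or more 'b'
my @plus     = grep { /^ab+c$/ }     @candidates;  # one or more 'b'
my @optional = grep { /^ab?c$/ }     @candidates;  # zero or one 'b'
my @counted  = grep { /^ab{2,3}c$/ } @candidates;  # two or three 'b'

print "star:     @star\n";      # ac abc abbbc
print "plus:     @plus\n";      # abc abbbc
print "optional: @optional\n";  # ac abc
print "counted:  @counted\n";   # abbbc
```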
[...]
: Matches any one character inside the brackets.
$string =~ /[aeiou]/; # Matches any one vowel.
[^...]
: Matches any one character not inside the brackets.
$string =~ /[^0-9]/; # Matches any non-digit character.
\d
: Matches any digit ([0-9]).
\D
: Matches any non-digit.
\w
: Matches any word character (alphanumeric plus underscore).
\W
: Matches any non-word character.
\s
: Matches any whitespace character (spaces, tabs, line breaks).
\S
: Matches any non-whitespace character.
^
: Start of the string.
$
: End of the string.
\b
: Word boundary.
\B
: Not a word boundary.
()
: Groups several characters together and captures the matched text.
if ($string =~ /(ab)c/) {
    print "Captured: $1";  # Prints 'ab' if it matches.
}
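The shorthand classes, anchors, and capture groups described above are usually combined. The following sketch pulls a date out of a line of text; the sample line and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $line = "Released on 2024-03-15 as version v2";

# \b marks word boundaries, \d{4} and \d{2} match fixed digit runs,
# and each pair of parentheses captures one part of the date.
my ($year, $month, $day);
if ($line =~ /\b(\d{4})-(\d{2})-(\d{2})\b/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# ^ and $ force the pattern to cover the whole string.
my $all_word_chars = ("perl_5" =~ /^\w+$/) ? 1 : 0;  # 1: only word characters
my $has_space      = ("perl 5" =~ /^\w+$/) ? 1 : 0;  # 0: the space is not \w

print "$year/$month/$day\n";  # 2024/03/15
```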
Modifiers can be added at the end of the regex to alter its behavior:
i
: Case-insensitive search.
m
: Treats the string as multiple lines (^ and $ can match at the start/end of each line).
s
: Treats the string as a single line (dot . matches newline as well).
g
: Global match (finds all matches).
x
: Allows for extended whitespace and comments within the pattern.
$string =~ /pattern/ig; # Case-insensitive global match.
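A short sketch of each modifier in action may help here. The two-line sample text and the version string are made up for the example.

```perl
use strict;
use warnings;

my $text = "Perl is fun.\nperl is terse.";

# /g in list context returns every match; /i ignores case.
my @hits = $text =~ /perl/ig;               # ('Perl', 'perl')

# /m lets ^ match at the start of each line, not only of the string.
my @line_starts = $text =~ /^(\w+)/mg;      # ('Perl', 'perl')

# /s lets . match a newline, so the match can span both lines.
my $with_s    = ($text =~ /fun.+terse/s) ? 1 : 0;  # 1
my $without_s = ($text =~ /fun.+terse/)  ? 1 : 0;  # 0

# /x ignores whitespace in the pattern and allows comments.
my ($major, $minor) = "v5.36" =~ /
    ^ v
    (\d+)    # major version
    \.
    (\d+)    # minor version
    $
/x;

print "@hits | @line_starts | $with_s $without_s | $major.$minor\n";
```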
The operators, quantifiers, character classes, anchors, and modifiers above cover most everyday text-processing tasks in Perl, and they can be combined to build more intricate patterns.
Perl regex operators and their meanings:
The operators =~ (binding operator) and !~ (negative binding operator) are used for matching and testing matches.
my $string = "Hello, World!";
if ($string =~ /Hello/) {
    print "Match found!\n";
}
Using quantifiers in Perl regex:
my $pattern = 'a{2,4}'; # Matches 'aa', 'aaa', or 'aaaa'
Perl regex alternation operator:
The | operator allows for matching one of multiple patterns.
my $pattern = 'cat|dog';  # Matches 'cat' or 'dog'
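One detail worth illustrating is that | has low precedence, so anchors bind to a single alternative unless the alternatives are grouped. This sketch uses a made-up word list and a non-capturing group (?:...) to show the difference.

```perl
use strict;
use warnings;

my @words = ('cat', 'dog', 'catalog', 'hotdog', 'bird');

# Unanchored: matches any string containing 'cat' or 'dog'.
my @loose = grep { /cat|dog/ } @words;

# Without grouping this means (^cat) or (dog$), which is rarely intended.
my @misanchored = grep { /^cat|dog$/ } @words;

# Grouping the alternatives makes both anchors apply to each one.
my @exact = grep { /^(?:cat|dog)$/ } @words;

print "loose:       @loose\n";        # cat dog catalog hotdog
print "misanchored: @misanchored\n";  # cat dog catalog hotdog
print "exact:       @exact\n";        # cat dog
```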
Lookahead and lookbehind in Perl regex:
Lookahead ((?=...)) and lookbehind ((?<=...)) are zero-width assertions that check for patterns without consuming characters.
my $pattern = 'foo(?=bar)';  # Matches 'foo' only if followed by 'bar'
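Because the assertions are zero-width, the text they check is not part of the match. A minimal sketch with made-up sample strings:

```perl
use strict;
use warnings;

my $text = "foobar foobaz";

# Lookahead: 'foo' matches only where 'bar' follows,
# and 'bar' itself is not included in the match.
my @ahead = $text =~ /foo(?=bar)/g;        # ('foo')

# Lookbehind: digits match only where a literal '$' precedes them.
my $prices  = 'cost $42, qty 7';
my @amounts = $prices =~ /(?<=\$)\d+/g;    # ('42')

print scalar(@ahead), " lookahead match: @ahead\n";  # 1 lookahead match: foo
print "amounts: @amounts\n";                          # amounts: 42
```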
Backreferences and capture groups in Perl regex:
Capture groups ( ... ) capture portions of a matched string for later use, and backreferences (\1, \2, etc.) refer to those captured groups.
my $pattern = '(\d+)\s+([A-Za-z]+)';
if ($input =~ /$pattern/) {
    my $number = $1;
    my $word = $2;
}
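The example above uses $1 and $2 after the match. A backreference such as \1 is used inside the pattern itself, as in this sketch that finds and removes a doubled word (the sentence is a made-up sample):

```perl
use strict;
use warnings;

my $sentence = "this is is a test";

# \1 must match the same text that the first group just captured,
# so the pattern finds a word followed by a repeat of itself.
my ($doubled) = $sentence =~ /\b(\w+)\s+\1\b/;

# In the replacement part of s///, the captured text is available as $1.
(my $fixed = $sentence) =~ s/\b(\w+)\s+\1\b/$1/g;

print "doubled: $doubled\n";  # doubled: is
print "fixed:   $fixed\n";    # fixed:   this is a test
```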
Character classes and ranges in Perl regex:
Character classes ([...]) and ranges (-) match characters within a specified set or range.
my $pattern = '[aeiou]';    # Matches any vowel
my $pattern2 = '[0-9]';     # Matches any digit
my $pattern3 = '[A-Za-z]';  # Matches any uppercase or lowercase letter
Perl regex modifiers and flags:
Modifiers such as /i for case-insensitive matching change how a pattern is applied.
my $pattern = 'hello';
if ($input =~ /$pattern/i) {
    print "Case-insensitive match found!\n";
}
Using anchors in Perl regex patterns:
Anchors (^ for the start of the string, $ for the end of the string; with the m modifier they also match at the start and end of each line) specify where a pattern should occur in the string.
my $pattern = '^start';  # Matches 'start' only at the beginning of the string
my $pattern2 = 'end$';   # Matches 'end' only at the end of the string
Escape sequences in Perl regex:
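Escape sequences (\) allow you to match special characters literally (for example, \. matches a literal dot) and also introduce shorthand classes such as \d.
my $pattern = '\d+';  # Matches one or more digits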
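As a sketch of why escaping matters, compare an unescaped dot with an escaped one. When the text to match comes from a variable, \Q...\E escapes any metacharacters in it. The version string here is a made-up example.

```perl
use strict;
use warnings;

# An unescaped dot matches any character, so '105' slips through.
my $unescaped = ("105" =~ /^1.5$/)  ? 1 : 0;   # 1
# An escaped dot matches only a literal '.'.
my $escaped   = ("105" =~ /^1\.5$/) ? 1 : 0;   # 0

# \Q...\E escapes metacharacters in an interpolated variable.
my $version     = "1.5";
my $quoted_105  = ("105" =~ /^\Q$version\E$/) ? 1 : 0;  # 0
my $quoted_1dot = ("1.5" =~ /^\Q$version\E$/) ? 1 : 0;  # 1

print "$unescaped $escaped $quoted_105 $quoted_1dot\n";  # 1 0 0 1
```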