Regular expressions are a core feature of Perl, and character classes are fundamental to writing effective patterns. This tutorial covers regex character classes in Perl.
A character class specifies a set of characters and matches exactly one character from that set.
Syntax:
[...]
[abc]: Matches any one of the characters a, b, or c.
[a-z]: Matches any lowercase ASCII letter.
[A-Z]: Matches any uppercase ASCII letter.
[0-9]: Matches any ASCII digit.
[a-zA-Z]: Matches any ASCII letter (uppercase or lowercase).
[0-9a-fA-F]: Matches any hexadecimal digit.

By placing a caret (^) at the start of the character class, you can negate it. A negated class matches any single character that is not in the set.

[^a-z]: Matches any character that's not a lowercase letter.
[^0-9]: Matches any character that's not a digit.

Perl regex offers predefined shortcuts for commonly used character classes:
\d: Matches any digit. Equivalent to [0-9] for ASCII text or under the /a modifier; on Unicode strings it also matches digits from other scripts.
\D: Matches any non-digit. Equivalent to [^0-9] under the same conditions.
\w: Matches any word character (alphanumeric characters plus underscore). Equivalent to [a-zA-Z0-9_] for ASCII text or under the /a modifier.
\W: Matches any non-word character.
\s: Matches any whitespace character (spaces, tabs, line breaks).
\S: Matches any non-whitespace character.

Perl also supports POSIX-style character classes. These classes are more descriptive:
[:alpha:]: Matches any alphabetic character. Equivalent to [a-zA-Z] for ASCII text.
[:digit:]: Matches any digit. Equivalent to [0-9] for ASCII text.
[:alnum:]: Matches any alphanumeric character. Equivalent to [a-zA-Z0-9] for ASCII text.
[:space:]: Matches any whitespace character.
[:punct:]: Matches any punctuation character.
[:lower:]: Matches any lowercase alphabetic character.
[:upper:]: Matches any uppercase alphabetic character.

To use a POSIX character class, embed it within a bracket expression. For example: [[:digit:]].
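As a minimal sketch of how the POSIX classes might be used, the following labels each character of a sample string. The helper name classify_char and the sample input are illustrative assumptions, not part of the tutorial's own code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: label a single character using POSIX classes.
# Note the double brackets: the POSIX name sits inside a bracket expression.
sub classify_char {
    my ($ch) = @_;
    return 'digit'       if $ch =~ /\A[[:digit:]]\z/;
    return 'upper'       if $ch =~ /\A[[:upper:]]\z/;
    return 'lower'       if $ch =~ /\A[[:lower:]]\z/;
    return 'space'       if $ch =~ /\A[[:space:]]\z/;
    return 'punctuation' if $ch =~ /\A[[:punct:]]\z/;
    return 'other';
}

# Walk the sample string one character at a time.
for my $ch (split //, 'Ab 7!') {
    printf "'%s' is %s\n", $ch, classify_char($ch);
}
```

The \A and \z anchors ensure the whole one-character string is tested, so the helper never reports a partial match.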
Here are some Perl snippets that use character classes:
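    #!/usr/bin/perl
    use strict;
    use warnings;

    # Single quotes keep "$45" literal; inside double quotes Perl would
    # interpolate the (undefined) capture variable $45 and drop the price.
    my $str = 'Price: $45';

    # \d+ captures the first run of digits.
    if ($str =~ /(\d+)/) {
        print "The price is $1.\n";
    }

    # [^\d\s]+ matches the first run of characters that are neither
    # digits nor whitespace.
    if ($str =~ /([^\d\s]+)/) {
        print "Found non-digit, non-whitespace sequence: $1\n";
    }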
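The bracketed ranges and negated classes described earlier can also be sketched as small helpers. The names is_hex and digits_only, and the sample inputs, are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string consists only of hex digits.
sub is_hex {
    my ($s) = @_;
    return $s =~ /\A[0-9a-fA-F]+\z/ ? 1 : 0;
}

# Hypothetical helper: strip every character that is not an ASCII digit,
# using a negated class.
sub digits_only {
    my ($s) = @_;
    (my $copy = $s) =~ s/[^0-9]//g;
    return $copy;
}

print is_hex('DEADbeef') ? "hex\n" : "not hex\n";   # prints "hex"
print is_hex('xyz')      ? "hex\n" : "not hex\n";   # prints "not hex"
print digits_only('+1 (555) 010-9999'), "\n";       # prints "15550109999"
```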
Character classes are a vital tool in regex, allowing you to match specific sets of characters. Whether you're using basic bracket notation or predefined classes, understanding these patterns helps make your Perl regexes more effective and readable.
Using character classes in Perl regex patterns:
if ($string =~ /[aeiou]/) { print "Vowel found!\n"; }
Defining custom character classes in Perl:
if ($string =~ /[A-Za-z]/) { print "Alphabetic character found!\n"; }
Negating character classes in Perl:
if ($string =~ /[^0-9]/) { print "Non-digit character found!\n"; }
Predefined character classes in Perl regex:
The main shorthands are \d (digits), \w (word characters), and \s (whitespace).
if ($string =~ /\d/) { print "Digit found!\n"; }
Character class metacharacters in Perl:
Inside a character class, - specifies a range between two characters.
if ($string =~ /[0-9a-fA-F]/) { print "Hexadecimal digit found!\n"; }
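Because a few characters have special meaning inside a class, it can help to see how to match them literally. The sketch below is illustrative only; the variable names and sample strings are assumptions made for this example.

```perl
use strict;
use warnings;

# A hyphen is literal when it is the first or last character in the class.
my $sign_re = qr/[+-]/;

# A caret is only special in the first position; elsewhere it is literal.
# The asterisk is also literal inside a class.
my $caret_re = qr/[*^]/;

# Square brackets are escaped to make them members of the set.
my $bracket_re = qr/[\[\]]/;

for my $sample ('3-2', 'x^y', 'list[0]', 'plain') {
    my @found;
    push @found, 'sign'    if $sample =~ $sign_re;
    push @found, 'caret'   if $sample =~ $caret_re;
    push @found, 'bracket' if $sample =~ $bracket_re;
    printf "%-8s => %s\n", $sample, @found ? join(', ', @found) : 'none';
}
```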
Matching digits with \d in Perl regex:
\d matches any digit (0-9) in Perl regex patterns.
if ($string =~ /\d/) { print "Digit found!\n"; }
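One caveat worth illustrating: on Unicode strings, \d can match digits beyond 0-9 unless the /a modifier restricts it to ASCII. The following is a small sketch assuming Perl 5.14 or later (for /a); the variable names are illustrative.

```perl
use strict;
use warnings;

# ARABIC-INDIC DIGIT THREE (U+0663): a digit, but not an ASCII one.
my $arabic_three = "\x{0663}";

my $matches_d       = $arabic_three =~ /\d/    ? 1 : 0;  # Unicode-aware \d
my $matches_d_ascii = $arabic_three =~ /\d/a   ? 1 : 0;  # /a limits \d to 0-9
my $matches_range   = $arabic_three =~ /[0-9]/ ? 1 : 0;  # explicit ASCII range

print "\\d matches:     $matches_d\n";        # 1
print "\\d with /a:     $matches_d_ascii\n";  # 0
print "[0-9] matches:  $matches_range\n";     # 0
```

When input must be limited to ASCII digits, either [0-9] or \d with /a is the safer choice.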
Matching word characters with \w in Perl regex:
\w matches any word character (alphanumeric + underscore) in Perl regex patterns.
if ($string =~ /\w/) { print "Word character found!\n"; }
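Beyond a yes/no test, \w is often combined with + and the /g modifier to pull out whole tokens. This is a minimal sketch; the helper name word_tokens and the sample text are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of word characters in a string.
sub word_tokens {
    my ($text) = @_;
    my @tokens = $text =~ /\w+/g;   # list context returns all matches
    return @tokens;
}

my @tokens = word_tokens('foo_bar = baz42 + 7');
print join(', ', @tokens), "\n";    # prints "foo_bar, baz42, 7"
```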
Matching whitespace characters with \s in Perl regex:
\s matches any whitespace character (space, tab, newline) in Perl regex patterns.
if ($string =~ /\s/) { print "Whitespace character found!\n"; }
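A common practical use of \s is cleaning up input. The sketch below trims and collapses whitespace; normalize_ws is a hypothetical helper name introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: trim both ends and collapse inner whitespace runs.
sub normalize_ws {
    my ($text) = @_;
    $text =~ s/\A\s+|\s+\z//g;   # remove leading and trailing whitespace
    $text =~ s/\s+/ /g;          # replace each remaining run with one space
    return $text;
}

print normalize_ws("  a\t b\n\nc  "), "\n";   # prints "a b c"
```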
Unicode character classes in Perl regex:
if ($string =~ /\p{Greek}/) { print "Greek character found!\n"; }
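To make the Unicode property syntax more concrete, here is a small sketch that counts characters by property. The sample string is built from code point escapes so the source file's encoding does not matter; the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# Greek small letters alpha, beta, gamma, written as code point escapes.
my $greek = "\x{03B1}\x{03B2}\x{03B3}";
my $mixed = 'abc' . $greek;

# \p{Greek} matches a character in the Greek script; \P{Greek} negates it.
my $greek_count = () = $mixed =~ /\p{Greek}/g;
my $other_count = () = $mixed =~ /\P{Greek}/g;

# \p{L} matches any letter, regardless of script.
my $letter_count = () = $mixed =~ /\p{L}/g;

print "Greek: $greek_count, non-Greek: $other_count, letters: $letter_count\n";
# prints "Greek: 3, non-Greek: 3, letters: 6"
```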