Regular Expressions in Perl

Regular expressions (regex or regexp) are one of Perl's most powerful features. They let you match, extract, replace, and split text based on patterns. This tutorial walks through the core syntax and a few advanced constructs:

1. Basic Matching:

Use =~ to match a string against a regex:

my $str = "Hello, world!";
if ($str =~ /world/) {
    print "Matched!\n";
}

2. Basic Quantifiers:

  • *: Match 0 or more times.
  • +: Match 1 or more times.
  • ?: Match 0 or 1 time.
  • {n}: Match exactly n times.
  • {n,}: Match n or more times.
  • {n,m}: Match between n and m times.

my $str = "wooooow";
if ($str =~ /wo{5}w/) {
    print "5 'o' characters matched!\n";
}
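
The other quantifiers in the list can be tried the same way. The following is a minimal sketch: the helper name matching_quantifiers and the sample words are made up for illustration. It reports which quantified forms of "o" match a whole word.

```perl
use strict;
use warnings;

# Hypothetical helper: return the quantified patterns that match the whole word.
sub matching_quantifiers {
    my ($word) = @_;
    my @hits;
    push @hits, 'o*'     if $word =~ /^wo*w$/;      # zero or more
    push @hits, 'o+'     if $word =~ /^wo+w$/;      # one or more
    push @hits, 'o?'     if $word =~ /^wo?w$/;      # zero or one
    push @hits, 'o{2,}'  if $word =~ /^wo{2,}w$/;   # two or more
    push @hits, 'o{1,2}' if $word =~ /^wo{1,2}w$/;  # one or two
    return @hits;
}

for my $word ('ww', 'wow', 'woow', 'wooooow') {
    print "$word: ", join(' ', matching_quantifiers($word)), "\n";
}
# ww: o* o?
# wow: o* o+ o? o{1,2}
# woow: o* o+ o{2,} o{1,2}
# wooooow: o* o+ o{2,}
```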

3. Character Classes:

Character classes match one character out of a set:

  • [abc]: Matches a single character which can be a, b, or c.
  • [a-z]: Matches any single lowercase letter.
  • [A-Z]: Matches any single uppercase letter.
  • [^a-z]: Matches any single character that is not a lowercase letter.
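
As a rough illustration of the classes above, the sketch below labels the first character of a string. The helper name classify_first_char and the sample inputs are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: label the first character of a string.
sub classify_first_char {
    my ($text) = @_;
    return 'vowel'     if $text =~ /^[aeiou]/;    # one character from a listed set
    return 'lowercase' if $text =~ /^[a-z]/;      # a range
    return 'uppercase' if $text =~ /^[A-Z]/;
    return 'other'     if $text =~ /^[^a-zA-Z]/;  # negated class: anything but a letter
    return 'empty';                               # a class always needs one character
}

for my $sample ('apple', 'pear', 'Zebra', '42') {
    print "$sample => ", classify_first_char($sample), "\n";
}
# apple => vowel
# pear => lowercase
# Zebra => uppercase
# 42 => other
```

The order of the tests matters here: [aeiou] is checked before [a-z] because every lowercase vowel would also match the wider range.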

4. Predefined Character Classes:

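  • \d: Matches a digit ([0-9] for ASCII text; other Unicode decimal digits also match unless the a modifier is used).
  • \D: Matches any non-digit.
  • \w: Matches a word character ([a-zA-Z0-9_] for ASCII text; Unicode letters and digits also match unless the a modifier is used).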
  • \W: Matches any non-word character.
  • \s: Matches any whitespace character (spaces, tabs, newlines).
  • \S: Matches any non-whitespace character.
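
A short sketch of the predefined classes in use, assuming a made-up key=value line as input. In list context, a match with the g modifier returns every captured substring.

```perl
use strict;
use warnings;

# Illustrative input: fields separated by spaces and a tab.
my $line = "id=42  user_name=alice\tstatus=ok";

my @numbers = $line =~ /(\d+)/g;    # runs of digits
my @words   = $line =~ /(\w+)/g;    # runs of word characters
my @pairs   = split /\s+/, $line;   # split on runs of whitespace

print "numbers: @numbers\n";            # numbers: 42
print "words: @words\n";                # words: id 42 user_name alice status ok
print "pairs: ", scalar(@pairs), "\n";  # pairs: 3
```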

5. Anchors:

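  • ^: Matches the start of the string (or the start of each line under the m modifier).
  • $: Matches the end of the string, or just before a trailing newline (the end of each line under the m modifier).

if ($str =~ /^Hello/) { ... } # Matches "Hello" at the beginning.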
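
The sketch below, using a made-up two-line string, shows how the anchors behave with and without the m modifier. It also uses \z, which matches only at the very end of the string.

```perl
use strict;
use warnings;

my $text = "Hello, world!\nGoodbye, world!\n";

my $starts_with_hello = $text =~ /^Hello/    ? 1 : 0;  # 1
my $goodbye_plain     = $text =~ /^Goodbye/  ? 1 : 0;  # 0: ^ matches only at the string start
my $goodbye_multiline = $text =~ /^Goodbye/m ? 1 : 0;  # 1: with m, ^ also matches after a newline
my $dollar_end        = $text =~ /world!$/   ? 1 : 0;  # 1: $ may match before the final newline
my $strict_end        = $text =~ /world!\z/  ? 1 : 0;  # 0: \z requires the very end of the string

print "$starts_with_hello $goodbye_plain $goodbye_multiline $dollar_end $strict_end\n";
# 1 0 1 1 0
```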

6. Grouping and Capturing:

You can group parts of your pattern and capture matched content:

my $date = "2021-09-04";
if ($date =~ /(\d{4})-(\d{2})-(\d{2})/) {
    print "Year: $1, Month: $2, Day: $3\n";
}
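
Two common variations on capturing, sketched with the same sample date: a match in list context returns the captured groups directly, and named captures (available since Perl 5.10) can be read from the %+ hash. The variable names are illustrative.

```perl
use strict;
use warnings;

my $date = "2021-09-04";

# List context: the match returns the three captured groups.
my ($year, $month, $day) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

# Named captures: copy %+ right after a successful match.
my %parts;
if ($date =~ /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/) {
    %parts = %+;
}

print "$year/$month/$day\n";               # 2021/09/04
print "Month by name: $parts{month}\n";    # Month by name: 09
```

Copying %+ into an ordinary hash keeps the values available after a later match overwrites the capture variables.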

7. Modifiers:

  • i: Case-insensitive match.
  • m: Multi-line mode: ^ and $ match at the start and end of each line, not only of the whole string.
  • s: Single-line mode: . also matches a newline (\n).
  • x: Extended mode (allows you to use whitespace and comments in the regex).
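
Each modifier can be seen in a small sketch like the following. The sample text and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $text = "First line\nsecond LINE\n";

my @any_case = $text =~ /(line)/gi;               # i: matches "line" and "LINE"
my @starts   = $text =~ /^(\w+)/mg;               # m: ^ matches at the start of each line
my $dot_plain = $text =~ /First.*LINE/  ? 1 : 0;  # 0: . does not cross the newline
my $dot_s     = $text =~ /First.*LINE/s ? 1 : 0;  # 1: with s, . matches the newline too

# x: whitespace and comments inside the pattern are ignored.
my $extended = $text =~ /
    ^ (\w+)     # a leading word
    \s+ line    # whitespace, then the literal word
/x ? 1 : 0;

print "@any_case | @starts | $dot_plain $dot_s $extended\n";
# line LINE | First second | 0 1 1
```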

8. Substitution:

Use s/pattern/replacement/ to replace the first match in a string (add the g modifier to replace every match):

my $str = "blue sky";
$str =~ s/blue/green/;
print $str; # Outputs "green sky"
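
A few substitution variations, sketched under the assumption of Perl 5.14 or later (needed for the r modifier, which returns the changed copy and leaves the original untouched). The sample strings are made up.

```perl
use strict;
use warnings;

my $str = "blue sky, blue sea";

my $first_only = $str =~ s/blue/green/r;    # only the first match is replaced
my $all        = $str =~ s/blue/green/gr;   # g replaces every match

# Without r, s/// edits the variable in place and returns the number of substitutions.
my $copy  = $str;
my $count = ($copy =~ s/blue/green/g);

# Captured groups can be reused in the replacement.
my $swapped = "sky blue" =~ s/(\w+) (\w+)/$2 $1/r;

print "$first_only\n";   # green sky, blue sea
print "$all\n";          # green sky, green sea
print "$count\n";        # 2
print "$swapped\n";      # blue sky
```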

9. Global Matching:

Match all occurrences in a string:

my $str = "cats and dogs";
while ($str =~ /cat|dog/g) {
    print "Found: $&\n";
}
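
Global matching also works in list context, where it returns all matches at once. The sketch below, with a made-up sample string, collects the matches, counts them, and uses pos() to record where each match ended.

```perl
use strict;
use warnings;

my $str = "cats and dogs and cats";

my @animals = $str =~ /(cat|dog)/g;   # every match, in order
my $cats    = () = $str =~ /cat/g;    # count matches via an empty list assignment

# In scalar context, each iteration resumes where the previous match ended.
my @ends;
while ($str =~ /(cat|dog)/g) {
    push @ends, pos($str);            # offset just past the current match
}

print "@animals\n";   # cat dog cat
print "$cats\n";      # 2
print "@ends\n";      # 3 12 21
```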

10. Non-capturing Groups:

Use (?: ... ) for non-capturing groups:

if ($str =~ /(?:cat|dog)s/) {
    print "Found plural: $&\n";
}

11. Lookaheads and Lookbehinds:

  • Positive lookahead: (?=...)
  • Negative lookahead: (?!...)
  • Positive lookbehind: (?<=...)
  • Negative lookbehind: (?<!...)

if ($str =~ /cat(?=s)/) {
    print "Found 'cat' followed by 's'\n";
}
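
The other three assertions can be sketched as follows. The sample sentence is made up, and it sits in single quotes so that the dollar signs are not interpolated.

```perl
use strict;
use warnings;

my $line = '3 cats cost $100, cat food costs $5';

# Positive lookbehind: digits that directly follow a dollar sign.
my @prices = $line =~ /(?<=\$)(\d+)/g;

# Negative lookbehind: digits preceded by neither a dollar sign nor another digit.
my @plain = $line =~ /(?<![\$\d])(\d+)/g;

# Negative lookahead: "cat" not followed by "s", plus the next word.
my ($singular) = $line =~ /\b(cat(?!s)\s+\w+)/;

print "@prices\n";     # 100 5
print "@plain\n";      # 3
print "$singular\n";   # cat food
```

Because lookarounds are zero-width, the dollar sign itself is never part of the captured text.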

12. Splitting Strings:

You can split strings using regex:

my $data = "John:25:USA";
my @fields = split /:/, $data;
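
A slightly fuller sketch of split, using illustrative inputs: assigning the fields to named variables, limiting the number of pieces with a third argument, and splitting on a pattern that absorbs optional whitespace.

```perl
use strict;
use warnings;

my $data = "John:25:USA";
my ($name, $age, $country) = split /:/, $data;

# The optional third argument caps the number of fields returned.
my @limited = split /:/, "a:b:c:d", 2;

# The separator is a full regex: a comma with optional surrounding whitespace.
my @colors = split /\s*,\s*/, "red , green,blue";

print "$name is $age and lives in $country\n";   # John is 25 and lives in USA
print join('|', @limited), "\n";                 # a|b:c:d
print join('|', @colors), "\n";                  # red|green|blue
```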

Summary:

Regular expressions in Perl are a vast topic. This tutorial provides an overview of the basics and some advanced concepts. Mastery of regex can save a lot of time and code, making it a valuable skill for any Perl programmer. To dive deeper, consider studying the Perl regex documentation (perldoc perlre).

  1. Introduction to regex in Perl:

    • Description: Regular expressions are powerful tools for pattern matching in strings. In Perl, regexes are often used for tasks like validation, extraction, and substitution.
    • Code Example:
      my $string = "The quick brown fox jumps over the lazy dog.";
      
      if ($string =~ /quick/) {
          print "Match found!\n";
      }
      
  2. Perl regex pattern examples:

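    • Description: A common pattern is a simplified email-format check. It catches obvious typos but does not accept every valid address.
    • Code Example:
      my $email = 'user@example.com';  # single quotes: "@example" would be interpolated as an array in double quotes
      
      if ($email =~ /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/) {
          print "Looks like an email address!\n";
      }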
      
  3. Using regular expressions for pattern matching in Perl:

    • Description: Perl provides the =~ operator to apply regex patterns and perform pattern matching on strings.
    • Code Example:
      my $sentence = "The cat in the hat.";
      
      if ($sentence =~ /cat/) {
          print "Found 'cat' in the sentence!\n";
      }
      
  4. Quantifiers and modifiers in Perl regex:

    • Description: Quantifiers specify how many times a character or group should be repeated, and modifiers alter regex behavior.
    • Code Example:
      my $numbers = "123 456 789";
      
      if ($numbers =~ /\d{3}/) {
          print "Three consecutive digits found!\n";
      }
      
  5. Perl regex character classes:

    • Description: Character classes allow you to match a set of characters.
    • Code Example:
      my $code = "A1b C3";
      
      if ($code =~ /[A-Za-z]\d/) {
          print "Alphabetic character followed by a digit found!\n";
      }
      
  6. Lookahead and lookbehind in Perl regex:

    • Description: Lookahead and lookbehind are zero-width assertions that check for patterns without consuming characters.
    • Code Example:
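      my $price = '$100';  # single quotes: "$100" would be interpolated as a variable in double quotes
      
      # Lookbehind: match digits that directly follow a dollar sign.
      if ($price =~ /(?<=\$)(\d+)/) {
          print "Amount without the dollar sign: $1\n";
      }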
      
  7. Substitution and matching with regex in Perl:

    • Description: Regex can be used for both matching and substitution operations in Perl.
    • Code Example:
      my $text = "Replace me!";
      
      $text =~ s/Replace/Updated/;
      
      print "$text\n";  # Output: "Updated me!"
      
  8. Advanced regex techniques in Perl:

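    • Description: Advanced regex techniques include non-capturing groups, backreferences, alternate delimiters, and non-greedy quantifiers. The example below uses m{...} so that the slash in the closing tag needs no escaping, and .*? so that the match stops at the first </b>.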
    • Code Example:
      my $html = "<p>Hello, <b>world</b>!</p>";
      
      if ($html =~ m{<b>(.*?)</b>}) {
          my $bold_text = $1;
          print "Bold text: $bold_text\n";
      }
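
Backreferences and non-capturing groups, which the description mentions, might look like the sketch below. The sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "This is is a test of the the parser";

# Backreference: \1 must repeat whatever the first group captured.
my @doubled = $text =~ /\b(\w+)\s+\1\b/g;

# The same pattern in a substitution collapses each doubled word.
(my $fixed = $text) =~ s/\b(\w+)\s+\1\b/$1/g;

# Non-capturing group: (?:kg|lb) groups the alternatives without creating a capture.
my ($amount) = '12kg' =~ /^(\d+)(?:kg|lb)$/;

print "@doubled\n";   # is the
print "$fixed\n";     # This is a test of the parser
print "$amount\n";    # 12
```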