Perl Operators in Regular Expression

Regular expressions (regex) are a core feature of Perl for matching and transforming text. This section covers the operators that apply a pattern (m//, s///, =~, !~) and the pattern syntax they accept: quantifiers, character classes, anchors, groups, and modifiers.

1. Basic Matching

  • m//: Match operator. The m may be omitted when the delimiters are slashes; it is required with any other delimiter, such as m{...}.

    if ($string =~ m/pattern/) {
        # code to execute if pattern matches
    }
    
    # or simply
    
    if ($string =~ /pattern/) {
        # code to execute if pattern matches
    }
    

2. Substitution

  • s///: Substitution operator. Replaces the first match of the pattern with the replacement text.

    $string =~ s/pattern/replacement/;
    

    To replace every match in the string, add the g modifier:

    $string =~ s/pattern/replacement/g;
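
    The difference between the two forms can be sketched as follows. The variable names and sample text are illustrative assumptions, not part of the tutorial. The sketch also relies on s/// returning the number of substitutions it made.

```perl
use strict;
use warnings;

my $first_only = "red fish, blue fish";
my $every_one  = "red fish, blue fish";

# Without g, only the first occurrence is replaced.
$first_only =~ s/fish/bird/;

# With g, every occurrence is replaced; s/// returns the number of
# substitutions it performed.
my $count = ($every_one =~ s/fish/bird/g);

print "$first_only\n";   # red bird, blue fish
print "$every_one\n";    # red bird, blue bird
print "$count\n";        # 2
```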
    

3. Binding Operators

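  • =~: Binds the scalar on its left to the match or substitution on its right. For a match, the expression is true if the pattern matches.

  • !~: Binds in the same way but negates the result: the expression is true if the pattern does not match.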

    if ($string =~ /perl/) {
        print "Match found!";
    }
    
    if ($string !~ /ruby/) {
        print "Match not found!";
    }
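
    As a small sketch of how the two binding operators relate, with a made-up sample string, the results can be stored and compared. The point is that !~ gives the same answer as negating =~.

```perl
use strict;
use warnings;

my $language = "I write perl scripts";

# =~ is true when the pattern matches somewhere in the string.
my $has_perl = ($language =~ /perl/) ? 1 : 0;

# !~ is true when the pattern does NOT match.
my $lacks_ruby = ($language !~ /ruby/) ? 1 : 0;

# Negating =~ gives the same answer as !~.
my $same = (!($language =~ /ruby/)) ? 1 : 0;

print "has perl: $has_perl, lacks ruby: $lacks_ruby, same: $same\n";
# has perl: 1, lacks ruby: 1, same: 1
```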
    

4. Quantifiers

Quantifiers specify how many times the preceding element may repeat:

  • *: 0 or more times

  • +: 1 or more times

  • ?: 0 or 1 time

  • {n}: Exactly n times

  • {n,}: n or more times

  • {n,m}: At least n and at most m times

    $string =~ /ab*c/;    # Matches 'ac', 'abc', 'abbc', etc.
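
    A counted quantifier can be checked the same way. This sketch uses hypothetical sample words and the ^ and $ anchors (covered in section 7) so that the whole string has to fit the pattern.

```perl
use strict;
use warnings;

# b{1,2} allows one or two 'b' characters between 'a' and 'c'.
my @results;
for my $word (qw(ac abc abbbc)) {
    push @results, ($word =~ /^ab{1,2}c$/) ? "$word: yes" : "$word: no";
}

print join(", ", @results), "\n";   # ac: no, abc: yes, abbbc: no
```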
    

5. Character Classes

  • [...]: Matches any one character inside the brackets.

    $string =~ /[aeiou]/; # Matches any one vowel.
    
  • [^...]: Matches any one character not inside the brackets.

    $string =~ /[^0-9]/;  # Matches any non-digit character.
    

6. Predefined Character Classes

  • \d: Matches any digit ([0-9] in ASCII text; Unicode digits also match unless the a modifier is used).
  • \D: Matches any non-digit.
  • \w: Matches any word character (alphanumeric plus underscore).
  • \W: Matches any non-word character.
  • \s: Matches any whitespace character (spaces, tabs, line breaks).
  • \S: Matches any non-whitespace character.
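
The shorthand classes above can be tried on a single line of text. This is a minimal sketch in which the sample line and variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $line = "order_42 shipped\ton 2024-01-15";

my @numbers = $line =~ /\d+/g;      # runs of digits
my @words   = $line =~ /\w+/g;      # runs of word characters
my @fields  = split /\s+/, $line;   # split on runs of whitespace

print "digits: @numbers\n";                   # digits: 42 2024 01 15
print "word count: ", scalar(@words), "\n";   # word count: 6
print "fields: ", scalar(@fields), "\n";      # fields: 4
```

Note that the underscore counts as a word character, so order_42 is one \w+ run, while the hyphens in the date split it into three.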

7. Anchors

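  • ^: Start of the string (or of each line with the m modifier).
  • $: End of the string, or just before a newline at the end of the string (or the end of each line with the m modifier).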
  • \b: Word boundary.
  • \B: Not a word boundary.
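
The anchors can be compared on one sample string. This is a sketch with illustrative names; the string is chosen so that "cat" appears once as a whole word and once inside a longer word.

```perl
use strict;
use warnings;

my $text = "cat concatenate";

# \b on both sides: only the standalone word matches.
my @whole = $text =~ /\bcat\b/g;

# \B on both sides: only "cat" buried inside a longer word matches.
my @inner = $text =~ /\Bcat\B/g;

my $starts = ($text =~ /^cat/) ? 1 : 0;   # "cat" at the start of the string
my $ends   = ($text =~ /cat$/) ? 1 : 0;   # "cat" at the end of the string

print scalar(@whole), " ", scalar(@inner), " $starts $ends\n";   # 1 1 1 0
```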

8. Grouping and Capturing

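  • (): Groups several characters together and captures the matched text. After a successful match, the captures are available in $1, $2, and so on, numbered by opening parenthesis.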

    if ($string =~ /(ab)c/) {
        print "Captured: $1";  # Prints 'ab' if it matches.
    }
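
    With more than one group, a match in list context returns the captures in order, which avoids reading $1, $2, and $3 separately. A minimal sketch, assuming a date in YYYY-MM-DD form as sample input:

```perl
use strict;
use warnings;

my $date = "2024-01-15";

# In list context, a successful match returns the captured groups.
my ($year, $month, $day) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

if (defined $year) {
    print "year=$year month=$month day=$day\n";   # year=2024 month=01 day=15
}
```

    If the pattern fails, the list is empty and all three variables stay undefined, which is why the sketch tests $year before printing.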
    

9. Modifiers

Modifiers can be added at the end of the regex to alter its behavior:

  • i: Case-insensitive search.

  • m: Treats the string as multiple lines (^ and $ can match start/end of lines).

  • s: Treats the string as a single line (dot . matches newline as well).

  • g: Global match (finds all matches).

  • x: Ignores unescaped whitespace in the pattern and allows # comments, so a long pattern can be laid out readably.

    $string =~ /pattern/ig; # Case-insensitive global match.
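
    The m, s, and x modifiers are easier to see on a string that contains a newline. The following is a sketch with made-up sample text, not an exhaustive demonstration.

```perl
use strict;
use warnings;

my $text = "first line\nSecond line\n";

# m lets ^ match after each embedded newline, not only at the string start.
my @line_starts = $text =~ /^(\w+)/mg;

# Without s the dot stops at a newline; with s it can cross it.
my $dot_plain = ($text =~ /first.*Second/)  ? 1 : 0;
my $dot_s     = ($text =~ /first.*Second/s) ? 1 : 0;

# x allows layout and comments inside the pattern; i ignores case.
my $has_second = ($text =~ /
    second   # matches "Second" because of i
    \s+      # one or more whitespace characters
    line
/xi) ? 1 : 0;

print "@line_starts\n";                       # first Second
print "$dot_plain $dot_s $has_second\n";      # 0 1 1
```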
    

Conclusion

These operators and modifiers cover most everyday pattern matching in Perl: m// and s/// apply a pattern, =~ and !~ bind it to a string, and quantifiers, character classes, anchors, groups, and modifiers shape what it matches. The list below revisits each of them with a short example.

  1. Perl regex operators and their meanings:

    • Description: Regular expression operators in Perl include =~ (binding operator) and !~ (negative binding operator), used for matching and testing matches.
    • Example Code:
      my $string = "Hello, World!";
      if ($string =~ /Hello/) {
          print "Match found!\n";
      }
      
  2. Using quantifiers in Perl regex:

    • Description: Quantifiers specify the number of occurrences of a pattern.
    • Example Code:
      my $pattern = 'a{2,4}';  # Matches 'aa', 'aaa', or 'aaaa'
      
  3. Perl regex alternation operator:

    • Description: The alternation operator | allows for matching one of multiple patterns.
    • Example Code:
      my $pattern = 'cat|dog';  # Matches 'cat' or 'dog'
      
  4. Lookahead and lookbehind in Perl regex:

    • Description: Lookahead ((?=...)) and lookbehind ((?<=...)) are zero-width assertions that check for patterns without consuming characters.
    • Example Code:
      my $pattern = 'foo(?=bar)';  # Matches 'foo' only if followed by 'bar'
      
  5. Backreferences and capture groups in Perl regex:

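    • Description: Capture groups ( ... ) capture portions of a matched string for later use. Inside the pattern, backreferences (\1, \2, etc.) match the same text a group captured; after the match, $1, $2, etc. hold the captured text.
    • Example Code:
      my $input = "42 apples";   # sample input for illustration
      my $pattern = '(\d+)\s+([A-Za-z]+)';
      if ($input =~ /$pattern/) {
          my $number = $1;   # "42"
          my $word = $2;     # "apples"
      }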
      
  6. Character classes and ranges in Perl regex:

    • Description: Character classes ([...]) and ranges (-) match characters within a specified set or range.
    • Example Code:
      my $pattern = '[aeiou]';    # Matches any vowel
      my $pattern2 = '[0-9]';      # Matches any digit
      my $pattern3 = '[A-Za-z]';   # Matches any uppercase or lowercase letter
      
  7. Perl regex modifiers and flags:

    • Description: Modifiers and flags alter the behavior of a regex pattern. Common ones include /i for case-insensitive matching.
    • Example Code:
      my $input = "Hello, World!";   # sample input for illustration
      my $pattern = 'hello';
      if ($input =~ /$pattern/i) {
          print "Case-insensitive match found!\n";
      }
      
  8. Using anchors in Perl regex patterns:

    • Description: Anchors (^ for the start of the string, $ for the end of the string) specify where a pattern must occur. With the m modifier they match at the start and end of each line instead.
    • Example Code:
      my $pattern = '^start';   # Matches 'start' only at the beginning of the string
      my $pattern2 = 'end$';    # Matches 'end' only at the end of the string
      
  9. Escape sequences in Perl regex:

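    • Description: A backslash (\) before a metacharacter such as . or + matches that character literally. Before certain letters it instead introduces a shorthand such as \d or \s.
    • Example Code:
      my $pattern = '\d+';    # Shorthand: matches one or more digits
      my $pattern2 = '\.';    # Escaped metacharacter: matches a literal dot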
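
Several examples in the list above store a pattern in a string without applying it. The following sketch applies alternation, lookahead, lookbehind, and a backreference to sample strings; all of the inputs and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Alternation: g in list context returns every match of either branch.
my @pets = "cat, cow, dog" =~ /cat|dog/g;

# Lookahead: "foo" is replaced only where "bar" follows, and "bar" is
# left untouched because the assertion consumes nothing.
my $text = "foobar foobaz";
$text =~ s/foo(?=bar)/FOO/g;

# Lookbehind: digits are captured only where a literal dollar sign precedes them.
my ($amount) = 'price: $42, qty: 7' =~ /(?<=\$)(\d+)/;

# Backreference: \1 must repeat exactly what group 1 captured.
my ($repeated) = "this is is a test" =~ /\b(\w+)\s+\1\b/;

print "@pets\n";       # cat dog
print "$text\n";       # FOObar foobaz
print "$amount\n";     # 42
print "$repeated\n";   # is
```

The lookbehind input is single-quoted so that Perl does not try to interpolate $4 inside the string.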