More Examples of Regular Expressions in Perl

Some Examples of Basic Perl Regular Expressions

In this and the following pages we provide examples of Perl regular expressions (regexps).
Much of what we show here is applicable to PHP, Python, Ruby and many other languages.
However, beware: languages and utilities such as AWK, grep and egrep take a different,
more efficient, but less flexible approach to regular expression matching.
We will discuss these differences in future pages.

Example regular expression that matches zero or more characters

This example shows a Perl regular expression that matches zero or more characters. Here, and in the examples below, we show the results of each match: the part of the input before the match, the characters actually matched, and the characters after the match. Since .* matches the whole input string, the parts before and after the match are empty.

Example 1: matches zero or more characters: .*
| regexp | input | pre match $` | match $& | post match $' |
| .* | lower case (alpha str) | (empty) | lower case (alpha str) | (empty) |
| .* | UPPER CASE (ALHPA CHAR) | (empty) | UPPER CASE (ALHPA CHAR) | (empty) |
| .* | 12345116122 underscore _R_8 | (empty) | 12345116122 underscore _R_8 | (empty) |
| .* | (_1_(_Mix_It UPaBIT)1567)5551 | (empty) | (_1_(_Mix_It UPaBIT)1567)5551 | (empty) |
| .* | (Numbers)(01234567 89)2 | (empty) | (Numbers)(01234567 89)2 | (empty) |
| .* | meta(gr)[or]{qu}rn-esc/\(}-{) | (empty) | meta(gr)[or]{qu}rn-esc/\(}-{) | (empty) |
| .* | (www.review-pc.com) | (empty) | (www.review-pc.com) | (empty) |
| .* | lotsOf1.26(inside job)a4_ | (empty) | lotsOf1.26(inside job)a4_ | (empty) |
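
The three result columns correspond to Perl's special variables $`, $& and $'. The following is a minimal sketch of how such a row could be produced; show_match is a hypothetical helper name chosen for illustration and is not taken from the original page.

```perl
use strict;
use warnings;

# show_match is a hypothetical helper, not part of the original page.
# It returns (pre-match, match, post-match) on success and an empty
# list when the pattern does not match.
sub show_match {
    my ($pattern, $input) = @_;
    if ($input =~ /$pattern/) {
        # Copy the special variables immediately; the next successful
        # match would overwrite them.
        return ($`, $&, $');
    }
    return;
}

my @parts = show_match('.*', 'lower case (alpha str)');
printf "[%s] | [%s] | [%s]\n", @parts;
# prints: [] | [lower case (alpha str)] | []
```

The empty brackets on either side show that nothing is left before or after the match.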

Example regular expression that matches three digits

This example shows a Perl regular expression that matches three digits. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Note which three digits are matched: the leftmost run of three consecutive digits.

Example 2: matches three digits: \d\d\d
| regexp | input | pre match $` | match $& | post match $' |
| \d\d\d | lower case (alpha str) | | (no match) | |
| \d\d\d | UPPER CASE (ALHPA CHAR) | | (no match) | |
| \d\d\d | 12345116122 underscore _R_8 | (empty) | 123 | 45116122 underscore _R_8 |
| \d\d\d | (_1_(_Mix_It UPaBIT)1567)5551 | (_1_(_Mix_It UPaBIT) | 156 | 7)5551 |
| \d\d\d | (Numbers)(01234567 89)2 | (Numbers)( | 012 | 34567 89)2 |
| \d\d\d | meta(gr)[or]{qu}rn-esc/\(}-{) | | (no match) | |
| \d\d\d | (www.review-pc.com) | | (no match) | |
| \d\d\d | lotsOf1.26(inside job)a4_ | | (no match) | |
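
As a small illustration of the leftmost-match behaviour, the sketch below runs \d\d\d against one input from the table. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Input string taken from the table above.
my $input = '(_1_(_Mix_It UPaBIT)1567)5551';

my ($pre, $match, $post) = ('', '', '');
if ($input =~ /\d\d\d/) {
    # The single digit in "_1_" is skipped: three digits in a row are
    # needed, and the first such run starts at "1567".
    ($pre, $match, $post) = ($`, $&, $');
}
print "pre=[$pre] match=[$match] post=[$post]\n";
# prints: pre=[(_1_(_Mix_It UPaBIT)] match=[156] post=[7)5551]
```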

Example regular expression that matches three word characters

This example shows a Perl regular expression that matches three word characters. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. We see that word characters (\w) include digits and the underscore (_).

Example 3: matches three word characters: \w\w\w
| regexp | input | pre match $` | match $& | post match $' |
| \w\w\w | lower case (alpha str) | (empty) | low | er case (alpha str) |
| \w\w\w | UPPER CASE (ALHPA CHAR) | (empty) | UPP | ER CASE (ALHPA CHAR) |
| \w\w\w | 12345116122 underscore _R_8 | (empty) | 123 | 45116122 underscore _R_8 |
| \w\w\w | (_1_(_Mix_It UPaBIT)1567)5551 | ( | _1_ | (_Mix_It UPaBIT)1567)5551 |
| \w\w\w | (Numbers)(01234567 89)2 | ( | Num | bers)(01234567 89)2 |
| \w\w\w | meta(gr)[or]{qu}rn-esc/\(}-{) | (empty) | met | a(gr)[or]{qu}rn-esc/\(}-{) |
| \w\w\w | (www.review-pc.com) | ( | www | .review-pc.com) |
| \w\w\w | lotsOf1.26(inside job)a4_ | (empty) | lot | sOf1.26(inside job)a4_ |
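
To see that \w accepts digits and underscores as well as letters, a short sketch can collect the first three word characters from a few of the inputs above. The array names are illustrative assumptions.

```perl
use strict;
use warnings;

# Three inputs from the table above.
my @inputs = (
    '12345116122 underscore _R_8',
    '(_1_(_Mix_It UPaBIT)1567)5551',
    '(www.review-pc.com)',
);

my @found;
for my $input (@inputs) {
    # $& holds the text matched by the last successful match.
    push @found, $& if $input =~ /\w\w\w/;
}
print join(', ', @found), "\n";
# prints: 123, _1_, www
```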

Example regular expression that matches zero or more characters then 1

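This example shows a Perl regular expression that matches zero or more characters then 1. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Because .* is greedy, the match extends to the last 1 in the input. Since .* can also match zero characters, the matched text may itself start with a 1.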

Example 4: matches zero or more characters then 1: .*1
| regexp | input | pre match $` | match $& | post match $' |
| .*1 | lower case (alpha str) | | (no match) | |
| .*1 | UPPER CASE (ALHPA CHAR) | | (no match) | |
| .*1 | 12345116122 underscore _R_8 | (empty) | 123451161 | 22 underscore _R_8 |
| .*1 | (_1_(_Mix_It UPaBIT)1567)5551 | (empty) | (_1_(_Mix_It UPaBIT)1567)5551 | (empty) |
| .*1 | (Numbers)(01234567 89)2 | (empty) | (Numbers)(01 | 234567 89)2 |
| .*1 | meta(gr)[or]{qu}rn-esc/\(}-{) | | (no match) | |
| .*1 | (www.review-pc.com) | | (no match) | |
| .*1 | lotsOf1.26(inside job)a4_ | (empty) | lotsOf1 | .26(inside job)a4_ |
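
The sketch below, using illustrative variable names, shows that .*1 stops at the last 1 in the input rather than the first, and that a string consisting of just 1 also matches.

```perl
use strict;
use warnings;

# Input string taken from the table above.
my $input = '12345116122 underscore _R_8';

my ($match, $post) = ('', '');
if ($input =~ /.*1/) {
    # .* takes as much as it can while still leaving a final 1.
    ($match, $post) = ($&, $');
}
print "match=[$match] post=[$post]\n";
# prints: match=[123451161] post=[22 underscore _R_8]

# .* may match zero characters, so a lone "1" also matches.
my $lone_one_matches = ('1' =~ /.*1/) ? 1 : 0;
print "lone 1 matches: $lone_one_matches\n";
# prints: lone 1 matches: 1
```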

Example regular expression that matches strings containing a 1 but not ending with any of 1 2 3 4 5 6 7

This example shows a Perl regular expression that matches strings containing a 1 but not ending with any of 1 2 3 4 5 6 7. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. It's important to see that the ^ inside the [] is not the start-of-string anchor; there it negates the character class.

Example 5: matches strings containing a 1 but not ending with any of 1 2 3 4 5 6 7: 1.*[^1-7]$
| regexp | input | pre match $` | match $& | post match $' |
| 1.*[^1-7]$ | lower case (alpha str) | | (no match) | |
| 1.*[^1-7]$ | UPPER CASE (ALHPA CHAR) | | (no match) | |
| 1.*[^1-7]$ | 12345116122 underscore _R_8 | (empty) | 12345116122 underscore _R_8 | (empty) |
| 1.*[^1-7]$ | (_1_(_Mix_It UPaBIT)1567)5551 | | (no match) | |
| 1.*[^1-7]$ | (Numbers)(01234567 89)2 | | (no match) | |
| 1.*[^1-7]$ | meta(gr)[or]{qu}rn-esc/\(}-{) | | (no match) | |
| 1.*[^1-7]$ | (www.review-pc.com) | | (no match) | |
| 1.*[^1-7]$ | lotsOf1.26(inside job)a4_ | lotsOf | 1.26(inside job)a4_ | (empty) |
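
To contrast the two meanings of ^, the following sketch runs the pattern from this example and a variant with a leading ^ anchor against the same input. The variant is an assumption added for comparison and does not appear in the table.

```perl
use strict;
use warnings;

# Input string taken from the table above.
my $input = 'lotsOf1.26(inside job)a4_';

# ^ inside [] negates the class: [^1-7] is any character except 1 to 7.
my $class_match = ($input =~ /1.*[^1-7]$/) ? $& : '';

# ^ outside [] anchors at the start of the string, so this input,
# which starts with "l" rather than "1", does not match.
my $anchored_ok = ($input =~ /^1.*[^1-7]$/) ? 1 : 0;

print "negated class: [$class_match]\n";
# prints: negated class: [1.26(inside job)a4_]
print "start anchor: $anchored_ok\n";
# prints: start anchor: 0
```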

Example regular expression that matches strings not ending in a digit

This example shows a Perl regular expression that matches strings not ending in a digit. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Note that the capitalised classes \D, \W etc. act as the negation of \d, \w etc.

Example 6: matches strings not ending in a digit: \D$
| regexp | input | pre match $` | match $& | post match $' |
| \D$ | lower case (alpha str) | lower case (alpha str | ) | (empty) |
| \D$ | UPPER CASE (ALHPA CHAR) | UPPER CASE (ALHPA CHAR | ) | (empty) |
| \D$ | 12345116122 underscore _R_8 | | (no match) | |
| \D$ | (_1_(_Mix_It UPaBIT)1567)5551 | | (no match) | |
| \D$ | (Numbers)(01234567 89)2 | | (no match) | |
| \D$ | meta(gr)[or]{qu}rn-esc/\(}-{) | meta(gr)[or]{qu}rn-esc/\(}-{ | ) | (empty) |
| \D$ | (www.review-pc.com) | (www.review-pc.com | ) | (empty) |
| \D$ | lotsOf1.26(inside job)a4_ | lotsOf1.26(inside job)a4 | _ | (empty) |
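
A brief sketch of \D as the complement of \d: for each input, exactly one of \D$ and \d$ succeeds. The %result hash and its keys are illustrative names, not part of the original page.

```perl
use strict;
use warnings;

# Two inputs from the table above: one ends in ")", one ends in "2".
my @inputs = ('(www.review-pc.com)', '(Numbers)(01234567 89)2');

my %result;
for my $input (@inputs) {
    $result{$input} = {
        non_digit => ($input =~ /\D$/) ? 1 : 0,   # last character is not 0-9
        digit     => ($input =~ /\d$/) ? 1 : 0,   # last character is 0-9
    };
    printf "%-24s non-digit end: %d, digit end: %d\n",
        $input, $result{$input}{non_digit}, $result{$input}{digit};
}
```

The first input reports a non-digit ending and the second a digit ending, matching the corresponding rows of the table.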

Example regular expression that matches strings ending in three word characters

This example shows a Perl regular expression that matches strings ending in three word characters. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Here, the $ is the end-of-string anchor.

Example 7: matches strings ending in three word characters: \w\w\w$
| regexp | input | pre match $` | match $& | post match $' |
| \w\w\w$ | lower case (alpha str) | | (no match) | |
| \w\w\w$ | UPPER CASE (ALHPA CHAR) | | (no match) | |
| \w\w\w$ | 12345116122 underscore _R_8 | 12345116122 underscore _ | R_8 | (empty) |
| \w\w\w$ | (_1_(_Mix_It UPaBIT)1567)5551 | (_1_(_Mix_It UPaBIT)1567)5 | 551 | (empty) |
| \w\w\w$ | (Numbers)(01234567 89)2 | | (no match) | |
| \w\w\w$ | meta(gr)[or]{qu}rn-esc/\(}-{) | | (no match) | |
| \w\w\w$ | (www.review-pc.com) | | (no match) | |
| \w\w\w$ | lotsOf1.26(inside job)a4_ | lotsOf1.26(inside job) | a4_ | (empty) |
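
The effect of the $ anchor can be seen by running the pattern with and without it on the same input. This is a minimal sketch with illustrative variable names.

```perl
use strict;
use warnings;

# Input string taken from the table above.
my $input = '12345116122 underscore _R_8';

# Without the anchor, the leftmost three word characters are matched.
my $first = ($input =~ /\w\w\w/) ? $& : '';

# With $, the three word characters must sit at the end of the string.
my $last = ($input =~ /\w\w\w$/) ? $& : '';

print "unanchored: $first\n";   # prints: unanchored: 123
print "anchored: $last\n";      # prints: anchored: R_8
```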

Example regular expression that matches strings containing CASE

This example shows a Perl regular expression that matches strings containing CASE. As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Note that we can use the i option to force case insensitivity if need be.

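Example 8: matches strings containing CASE: CASE
| regexp | input | pre match $` | match $& | post match $' |
| CASE | lower case (alpha str) | | (no match) | |
| CASE | UPPER CASE (ALHPA CHAR) | UPPER | CASE | (ALHPA CHAR) |
| CASE | 12345116122 underscore _R_8 | | (no match) | |
| CASE | (_1_(_Mix_It UPaBIT)1567)5551 | | (no match) | |
| CASE | (Numbers)(01234567 89)2 | | (no match) | |
| CASE | meta(gr)[or]{qu}rn-esc/\(}-{) | | (no match) | |
| CASE | (www.review-pc.com) | | (no match) | |
| CASE | lotsOf1.26(inside job)a4_ | | (no match) | |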
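
As a sketch of the i option mentioned above, the same pattern is applied to the lower-case input with and without the flag. The variable names are illustrative.

```perl
use strict;
use warnings;

# Input string taken from the table above.
my $input = 'lower case (alpha str)';

# Case-sensitive: "CASE" does not occur in the input.
my $sensitive = ($input =~ /CASE/) ? $& : '';

# With the i option, "case" is accepted as a match for CASE.
my $insensitive = ($input =~ /CASE/i) ? $& : '';

print "without i: [$sensitive]\n";    # prints: without i: []
print "with i: [$insensitive]\n";     # prints: with i: [case]
```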

Example regular expression that escapes the meta characters ( )

This example shows a Perl regular expression that escapes the meta characters ( ). As above, we provide the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. We see that the backslash (\) can be used to escape meta-characters.

Example 9: escapes the meta characters ( ): \(\w*\)
| regexp | input | pre match $` | match $& | post match $' |
| \(\w*\) | lower case (alpha str) | | (no match) | |
| \(\w*\) | UPPER CASE (ALHPA CHAR) | | (no match) | |
| \(\w*\) | 12345116122 underscore _R_8 | | (no match) | |
| \(\w*\) | (_1_(_Mix_It UPaBIT)1567)5551 | | (no match) | |
| \(\w*\) | (Numbers)(01234567 89)2 | (empty) | (Numbers) | (01234567 89)2 |
| \(\w*\) | meta(gr)[or]{qu}rn-esc/\(}-{) | meta | (gr) | [or]{qu}rn-esc/\(}-{) |
| \(\w*\) | (www.review-pc.com) | | (no match) | |
| \(\w*\) | lotsOf1.26(inside job)a4_ | | (no match) | |

Example regular expression that does not escape the meta characters ( )

This example shows a Perl regular expression that does not escape the meta characters ( ). Lastly, and as before, we see the part of the input before the match, the matched sub-string (if any), and the post-match sub-string. Here, the () act as grouping parentheses. Since \w* can match zero characters, the match is empty when the input begins with a non-word character.

Example 10: does not escape the meta characters ( ): (\w*)
| regexp | input | pre match $` | match $& | post match $' |
| (\w*) | lower case (alpha str) | (empty) | lower | case (alpha str) |
| (\w*) | UPPER CASE (ALHPA CHAR) | (empty) | UPPER | CASE (ALHPA CHAR) |
| (\w*) | 12345116122 underscore _R_8 | (empty) | 12345116122 | underscore _R_8 |
| (\w*) | (_1_(_Mix_It UPaBIT)1567)5551 | (empty) | (empty) | (_1_(_Mix_It UPaBIT)1567)5551 |
| (\w*) | (Numbers)(01234567 89)2 | (empty) | (empty) | (Numbers)(01234567 89)2 |
| (\w*) | meta(gr)[or]{qu}rn-esc/\(}-{) | (empty) | meta | (gr)[or]{qu}rn-esc/\(}-{) |
| (\w*) | (www.review-pc.com) | (empty) | (empty) | (www.review-pc.com) |
| (\w*) | lotsOf1.26(inside job)a4_ | (empty) | lotsOf1 | .26(inside job)a4_ |
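
The difference between escaped and unescaped parentheses can be sketched as follows. The use of $1 to read the captured group is standard Perl but is not shown on the original page, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Input string taken from the tables above.
my $input = '(Numbers)(01234567 89)2';

# Escaped: \( and \) match literal parentheses in the input.
my $literal = ($input =~ /\(\w*\)/) ? $& : '';

# Unescaped: ( ) only group and capture. \w* matches zero word
# characters at the very start, because the input begins with "(".
my $captured = ($input =~ /(\w*)/) ? $1 : 'no match';

# On an input that starts with word characters, the group captures them.
my $captured_word = ('lotsOf1.26(inside job)a4_' =~ /(\w*)/) ? $1 : 'no match';

print "escaped: [$literal]\n";           # prints: escaped: [(Numbers)]
print "unescaped: [$captured]\n";        # prints: unescaped: []
print "unescaped: [$captured_word]\n";   # prints: unescaped: [lotsOf1]
```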

Summary

In these examples we have demonstrated some simple Perl-like regular
expressions. A good deal of what we have seen here is applicable to other languages, in particular
PHP, Ruby and Python. However, it needs to be stressed that languages and utilities such as AWK,
grep and egrep can behave very differently. These tools use a less flexible but significantly
more efficient matching algorithm than Perl does.

Perl examples illustrated

  • Example 1. The regexp .* matches zero or more characters.
  • Example 2. The regexp \d\d\d matches three digits.
  • Example 3. The regexp \w\w\w matches three word characters.
  • Example 4. The regexp .*1 matches zero or more characters then 1.
  • Example 5. The regexp 1.*[^1-7]$ matches strings containing a 1 but not ending with any of 1 2 3 4 5 6 7.
  • Example 6. The regexp \D$ matches strings not ending in a digit.
  • Example 7. The regexp \w\w\w$ matches strings ending in three word characters.
  • Example 8. The regexp CASE matches strings containing CASE.
  • Example 9. The regexp \(\w*\) escapes the meta characters ( ).
  • Example 10. The regexp (\w*) does not escape the meta characters ( ).