Matching Patterns (Regular Expressions)

function detect_zipcode($string) {
    // Use regex to look for zipcodes, return true or false.
    // preg_match() returns 1 on a match and 0 otherwise, so compare against 1.
    return preg_match('/\b\d{5}(-\d{4})?\b/', $string) === 1;
}

The first parameter to preg_match() is the regular expression pattern. Patterns always start and end with a delimiter character, traditionally a /. The next item, \b, is the syntax used to match a word boundary. It does not consume a character itself; it matches the position between a word character (a letter, digit, or underscore) and a non-word character such as whitespace or punctuation, as well as the beginning or end of the string. The next item, \d, indicates that the next character must be a digit. The {5} after it indicates that it must find five of the previous item, in this case, five digits. We then have an opening parenthesis. Parentheses are used for grouping items together. The next item, -, simply represents the dash character. We then have \d{4}, meaning four more digits. The parenthesis then closes, and we have a ?. The question mark is a modifier meaning that the previous item (in this case, the group specified by the parentheses) is optional.

Therefore, we have created a regex that matches a five-digit string, surrounded by word boundaries, optionally having a dash and four more digits after it.
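To see the pattern in action, here is a minimal sketch that runs detect_zipcode() against a few inputs. The function is repeated (with its result compared to 1 so that it returns a strict boolean) so that the listing runs on its own, and the sample strings are made up purely for illustration.

```php
<?php
// Repeated from the text so this sketch is self-contained.
function detect_zipcode($string) {
    return preg_match('/\b\d{5}(-\d{4})?\b/', $string) === 1;
}

// Hypothetical sample inputs, chosen only for illustration.
var_dump(detect_zipcode('Ship to Beverly Hills, CA 90210')); // bool(true)
var_dump(detect_zipcode('ZIP+4 format: 12345-6789'));        // bool(true)
var_dump(detect_zipcode('Order number 1234567'));            // bool(false)
var_dump(detect_zipcode('No digits here'));                  // bool(false)

// Passing a third argument to preg_match() collects what was matched:
// $matches[0] is the whole match, $matches[1] is the optional group.
preg_match('/\b\d{5}(-\d{4})?\b/', 'ZIP+4 format: 12345-6789', $matches);
print_r($matches);
```

The seven-digit order number fails because no run of exactly five digits in it is surrounded by word boundaries.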

As a quick reference, here are some of the most common syntax characters for use in PCRE regular expressions:

Pattern matches:

\d = Digit

\D = Not a digit

\s = Whitespace

\S = Not whitespace

. = Any character (except \n)

^ = Start of string

$ = End of string (or just before a trailing newline)

\b = Word boundary
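The pattern matches listed above can be tried out with a short sketch like the following. The helper is_match() is a hypothetical wrapper introduced only for this example, and the test strings are arbitrary.

```php
<?php
// Hypothetical helper: true if $pattern matches anywhere in $subject.
function is_match($pattern, $subject) {
    return preg_match($pattern, $subject) === 1;
}

var_dump(is_match('/^\d\d\d$/', '123'));         // bool(true): exactly three digits
var_dump(is_match('/^\d\d\d$/', '1234'));        // bool(false): one digit too many
var_dump(is_match('/\D/', '2024'));              // bool(false): no non-digit present
var_dump(is_match('/\s/', 'two words'));         // bool(true): contains a space
var_dump(is_match('/^\S\S\S$/', 'a b'));         // bool(false): the middle character is whitespace
var_dump(is_match('/^a.c$/', 'abc'));            // bool(true): . matches the b
var_dump(is_match('/^a.c$/', "a\nc"));           // bool(false): . does not match \n
var_dump(is_match('/\bcat\b/', 'the cat sat'));  // bool(true): cat is a whole word
var_dump(is_match('/\bcat\b/', 'concatenate'));  // bool(false): cat is inside a word
```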

Pattern match extenders:

? = Previous item is matched 0 or 1 times.

* = Previous item is matched 0 or more times.

+ = Previous item is matched 1 or more times.

{n} = Previous item is matched exactly n times.

{n,} = Previous item is matched at least n times.

{n,m} = Previous item is matched at least n and at most m times.

? (after any of the above) = Match as few times as possible.
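The difference between the default (greedy) behavior and the "as few times as possible" form is easiest to see side by side. The following is a small sketch using a made-up string; the variable names are ours.

```php
<?php
// Made-up sample text for illustration.
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* takes as much as it can, so the match runs to the last </b>.
preg_match('/<b>.*<\/b>/', $html, $greedy);
echo $greedy[0], "\n";  // <b>one</b> and <b>two</b>

// Lazy: .*? takes as little as it can, so the match stops at the first </b>.
preg_match('/<b>.*?<\/b>/', $html, $lazy);
echo $lazy[0], "\n";    // <b>one</b>

// Counted and optional repetition.
var_dump(preg_match('/^\d{3}$/', '123') === 1);      // bool(true): exactly 3 digits
var_dump(preg_match('/^\d{2,}$/', '1') === 1);       // bool(false): fewer than 2 digits
var_dump(preg_match('/^\d{2,4}$/', '12345') === 1);  // bool(false): more than 4 digits
var_dump(preg_match('/^colou?r$/', 'color') === 1);  // bool(true): the u is optional
```

Note that the / inside the pattern is escaped as \/ because / is also the delimiter.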

Option patterns:

(pattern) = Groups the pattern to act as one item and captures it

(x|y) = Matches either pattern x, or pattern y

[abc] = Matches either the character a, b, or c

[^abc] = Matches any character except a, b, or c

[a-f] = Matches characters a through f
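As a rough illustration of the option patterns above, the sketch below exercises grouping, alternation, and character classes. The sample words are arbitrary and exist only for the example.

```php
<?php
// (x|y): alternation inside a group. The group also captures what it matched.
preg_match('/^(cat|dog)s?$/', 'dogs', $m);
echo $m[1], "\n";  // dog

// (pattern): a group acts as one item, so {2} repeats the whole "ab".
var_dump(preg_match('/^(ab){2}$/', 'abab') === 1);   // bool(true)

// [abc] and [^abc]: one character from the set, or one character not in it.
var_dump(preg_match('/^[abc]$/', 'b') === 1);        // bool(true)
var_dump(preg_match('/^[^abc]$/', 'b') === 1);       // bool(false)

// [a-f]: a range of characters.
var_dump(preg_match('/^[a-f]+$/', 'facade') === 1);  // bool(true)
var_dump(preg_match('/^[a-f]+$/', 'coffee') === 1);  // bool(false): o is outside a-f
```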

Note

Regular expressions are powerful, and a full discussion of them is beyond the scope of this book. You may want to study them more by reading the PHP documentation at http://php.net/pcre.