Monday, October 4, 2010 | By: mayur | Javascript

Regular Expression (Pattern Matching in JavaScript)

The RegExp object is a core JavaScript object. RegExp stands for Regular Expression. The RegExp object is based on the Perl implementation of regular expressions; Perl is a very capable and powerful scripting language. To put it simply, the RegExp object describes a pattern of text you want to find, and various "switches" give you options on how to find it. The idea is similar to the "find" option in your browser's menu: you type in a string of text you would like to find, click OK, and the browser looks for a match to your text within the web page. A regular expression does the same job, but it can describe a whole pattern rather than only one fixed string. Regular expressions are very widely used, in many languages and applications.
With the RegExp object, you can not only find a match to your desired text, but also verify user input for things like valid postal and zip codes, telephone numbers, and account numbers. To use it, you first create a new instance of the core RegExp object and give it a pattern to match, as in the following syntax example.
var name = new RegExp("String of Text"); 
The above syntax example creates a new RegExp object, stored in a variable called name, which looks for the string String of Text. The constructor is most useful when the pattern is built from a string at run time. For a fixed pattern there is a shorter and more common form, as follows.
var name = /String of Text/; 
Placing the String of Text between two forward slashes tells JavaScript to treat the text, the "pattern", as a RegExp object. This form is usually called a regular expression literal (it is sometimes described as "Direct Assignment").
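As a minimal sketch of the two forms above, the following creates the same pattern both ways and checks it with the standard test() method. The sample sentence and the variable names are made up for this illustration.

```javascript
// Two equivalent ways to build the same pattern.
const fromConstructor = new RegExp("String of Text");
const fromLiteral = /String of Text/;

// Hypothetical sample text, used only for this illustration.
const sample = "This page contains a String of Text somewhere.";

// test() returns true when the pattern is found anywhere in the string.
const constructorFound = fromConstructor.test(sample);
const literalFound = fromLiteral.test(sample);

// Matching is case-sensitive unless an option says otherwise.
const lowerFound = fromLiteral.test("a string of text");

console.log(constructorFound, literalFound, lowerFound); // true true false
```

Both objects behave the same way here, so the choice between them is mostly about whether the pattern is known in advance.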

Defining Your Search Patterns

As mentioned earlier, there is an extensive set of "switches" used to further refine your search. Used properly and with some creativity, they let you describe almost any text you need to match. The pattern matching characters available to JavaScript are given in the list below.
  • \w - Find a match to any word character: a letter, a digit, or an underscore
  • \W - Find a match to any non-word character
  • \s - Find a match to any whitespace character such as a tab character, newline, carriage return (enter), form feed, or vertical tab.
  • \d - Find a match to any numeric digit
  • \D - Find any character that is not a numeric digit
  • [\b] - Find a match for a backspace.
  • . (period) - Find a match for any character except a newline character.
  • [...] - Match any one character within the square brackets.
  • [^...] - Match any one character not within the square brackets.
  • [x-y] - Match any one character in the range from x to y.
  • [^x-y] - Match any one character not in the range from x to y.
  • {x,y} - Match the preceding item at least x times, not to exceed y times.
  • {x,} - Match the preceding item at least x times.
  • {x} - Match the preceding item exactly x times.
  • ? - Match the preceding item once or not at all.
  • + - Match the preceding item at least once.
  • * - Match the preceding item any number of times, or not at all.
  • | - Match the expression to the left or the right of the | character.
  • ( ... ) - Group everything inside of the parentheses into one sub-pattern.
  • \1, \2, ... - Match the same text that sub-pattern group 1, 2, ... matched earlier in the pattern.
  • ^ - Match the beginning of the string, or the beginning of each line when the multi-line (m) option is used.
  • $ - Match the end of the string, or the end of each line when the multi-line (m) option is used.
  • \b - Match the position between a word character and a non-word character.
  • \B - Match the position that is not between a word character and a non-word character.
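To show how several of these pattern characters combine, here is a small sketch. The patterns for a US-style zip code and phone number are assumptions chosen for illustration, not complete validators, and all variable names are hypothetical.

```javascript
// Hypothetical validation patterns, written only for this example.
const zipPattern = /^\d{5}(-\d{4})?$/;      // 5 digits, optionally "-" and 4 more
const phonePattern = /^\d{3}-\d{3}-\d{4}$/; // a shape such as 555-123-4567

const zipOk = zipPattern.test("90210");
const zipPlus4Ok = zipPattern.test("90210-1234");
const zipTooShort = zipPattern.test("9021");
const phoneOk = phonePattern.test("555-123-4567");

// | chooses between alternatives; ^ and $ anchor the whole string.
const answerOk = /^(yes|no)$/.test("yes");

// \b matches at a word boundary, so "cat" is not found inside "concatenate".
const wholeWord = /\bcat\b/.test("concatenate");

// \1 refers back to whatever group 1 matched: here, a doubled word.
const doubledWord = /\b(\w+)\s+\1\b/.test("this is is a typo");

console.log(zipOk, zipPlus4Ok, zipTooShort, phoneOk); // true true false true
console.log(answerOk, wholeWord, doubledWord);        // true false true
```

Note that the quantifiers are written without spaces, as in {5} and {3}; adding spaces inside the braces changes what the pattern matches.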
Using the above list of options, you can describe almost any string you are looking for. But this is not all that JavaScript offers. Because many of these characters have a special meaning inside a pattern, there is also a list of what are called Literal Characters: escape sequences that match one specific character, including the special characters themselves. That list is as follows.
  • \f - Find a form feed character.
  • \n - Find a new line character.
  • \r - Find a carriage return character.
  • \t - Find a tab character.
  • \v - Find a vertical tab character.
  • \/ - Find a forward slash.
  • \\ - Find a backward slash.
  • \. - Find a period.
  • \* - Find an asterisk.
  • \+ - Find a plus character.
  • \? - Find a question mark.
  • \| - Find a vertical bar.
  • \( - Find a left parenthesis character.
  • \) - Find a right parenthesis character.
  • \[ - Find a left square bracket character.
  • \] - Find a right square bracket character.
  • \{ - Find a left curly brace character.
  • \} - Find a right curly brace character.
  • \XXX - Find the character whose code is the octal number XXX.
  • \xHH - Find the character whose code is the two-digit hexadecimal number HH.
  • \cX - Find the control character Ctrl-X (for example, \cI is a tab).
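The difference an escape makes can be seen in a short sketch like the one below. The file name and date strings are invented sample data, and the variable names are hypothetical.

```javascript
// Unescaped, the period matches any character except a newline.
const looseMatch = /file.txt/.test("fileXtxt");

// Escaped, it matches only a literal period.
const strictMiss = /file\.txt/.test("fileXtxt");
const strictMatch = /file\.txt/.test("file.txt");

// A literal forward slash must be escaped inside the /.../ form.
const hasDate = /\d{2}\/\d{2}\/\d{4}/.test("Posted 10/04/2010");

// In the constructor form the pattern is an ordinary string,
// so each backslash has to be doubled.
const fromString = new RegExp("file\\.txt").test("file.txt");

// \t matches a tab, and \x41 matches "A" (hexadecimal 41).
const hasTab = /\t/.test("a\tb");
const hexLetter = /\x41/.test("A");

console.log(looseMatch, strictMiss, strictMatch); // true false true
console.log(hasDate, fromString, hasTab, hexLetter); // true true true true
```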
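You can see that most of the conceivable options have been defined for you. The creators of the JavaScript specifications have been very thorough. But there are two more "switches" that are probably the most commonly used of them all. Unlike the entries above, these Pattern Attributes are not written with a backslash inside the pattern. They are placed after the closing slash, as in /pattern/gi, or passed as the second argument to the constructor, as in new RegExp("pattern", "gi"). Those two Pattern Attributes are as follows.
  • g - This is for a Global Match, and is used to find all possible matches to your search string.
  • i - This is for a case-insensitive match.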
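A brief sketch of the two attributes in use, with the standard match() and replace() string methods. The sample text and variable names are assumptions made for this illustration.

```javascript
// Hypothetical sample text.
const text = "Cat, cat, CAT";

// No attributes: only the first case-sensitive match is reported.
const first = text.match(/cat/);

// g: every case-sensitive match.
const allExact = text.match(/cat/g);

// g and i together: every match, ignoring case.
const allAnyCase = text.match(/cat/gi);

// Constructor form: the attributes go in the second argument.
const replaced = text.replace(new RegExp("cat", "gi"), "dog");

console.log(first[0], first.index); // cat 5
console.log(allExact);              // [ 'cat' ]
console.log(allAnyCase);            // [ 'Cat', 'cat', 'CAT' ]
console.log(replaced);              // dog, dog, dog
```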