a second glance at regular expressions
Posted by
shashank ( shantaram )
I remember learning all about regular expressions, grammars and the like in college, but it was only last week, when I was looking for a quick method for email parsing and validation, that I realised their practical use.
Using a good regex engine and a well-written regular expression, one can perform all kinds of text-manipulation tasks. Regular expressions can be used to identify certain conditions or character sequences in a text file or data stream.
The most common places you'd find regular expressions are email address validation and search-and-replace functions. A search for "email validation using regular expressions" on Google would prove my point.
So what do you need to start using regular expressions? Nothing you don't already have. Regular expressions are supported by most languages and tools in use.
I've used the java.util.regex API in Java, so here's a simple example in Java for email validation that should give you an idea of how regular expressions can be used.
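import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailValidator {
    public static void main(String[] args) {
        // Take the address from the command line, or fall back to a sample value
        String email = (args.length > 0 ? args[0] : "user1@example.com").trim();
        // Email address validation
        Pattern p = Pattern.compile("[a-zA-Z]*[0-9]*@[a-zA-Z]*\\.[a-zA-Z]*");
        /* If you need a more detailed validation
        Pattern p = Pattern.compile("^[a-zA-Z][\\w\\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\\w\\.-]*[a-zA-Z0-9]\\.[a-zA-Z][a-zA-Z\\.]*[a-zA-Z]$");
        */
        Matcher m = p.matcher(email);
        // matches() succeeds only if the whole string fits the pattern
        boolean result = m.matches();
        if (result) {
            System.out.println(email + " is a VALID email address");
        } else {
            System.out.println(email + " is an INVALID email address");
        }
    }
}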
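The same API also covers the search and replace use mentioned earlier. Here's a rough sketch of both, using String.replaceAll and Matcher.find from the standard library. The class name SearchReplaceDemo, its method names and the sample text are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample text are illustrative only
class SearchReplaceDemo {
    // Replace every run of digits with a single '#' placeholder
    static String maskDigits(String text) {
        return text.replaceAll("[0-9]+", "#");
    }

    // Collect every run of digits found in the text, in order of appearance
    static List<String> findDigitRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("[0-9]+").matcher(text);
        // find() scans for the next match anywhere in the input,
        // unlike matches(), which needs the whole input to fit
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String text = "order 66 shipped in 3 boxes";
        System.out.println(maskDigits(text));
        System.out.println(findDigitRuns(text));
    }
}
```

The thing to notice is the difference between find(), which searches inside the text, and matches(), which is what the validation example relies on.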
What it means
[a-zA-Z]*[0-9]*@[a-zA-Z]*\\.[a-zA-Z]*
[a-zA-Z] --- any character from the union of a to z and A to Z
[a-zA-Z]* --- the * means zero or more occurrences
similarly for [0-9]*
\\. --- a literal dot ( \ is the regex escape character; it is written as \\ because it sits inside a Java string literal )
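One consequence of * meaning "zero or more" is that the simple pattern accepts some odd-looking inputs. The sketch below checks a few strings against it, and against a variant that swaps * for + (one or more) on the letter groups. That + variant is my own assumption of a slightly stricter pattern, and the class name QuantifierDemo and the sample strings are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing '*' and '+' quantifiers
class QuantifierDemo {
    // The simple pattern from the example, unchanged
    static final Pattern STAR =
        Pattern.compile("[a-zA-Z]*[0-9]*@[a-zA-Z]*\\.[a-zA-Z]*");
    // Assumed variant: '+' (one or more) replaces '*' on the letter groups
    static final Pattern PLUS =
        Pattern.compile("[a-zA-Z]+[0-9]*@[a-zA-Z]+\\.[a-zA-Z]+");

    static boolean matchesStar(String s) {
        return STAR.matcher(s).matches();
    }

    static boolean matchesPlus(String s) {
        return PLUS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "abc123@example.com", "@.", "abc@example" };
        for (String s : samples) {
            System.out.println(s + " -> star: " + matchesStar(s)
                + ", plus: " + matchesPlus(s));
        }
    }
}
```

With the * version, the string "@." passes because every group is allowed to be empty. The + version rejects it, while both reject "abc@example" since the dot is missing.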
some more examples of character classes ( anything in [] )
[^x] - any character except x
[a-z&&[x-z]] - x, y, or z ie- intersection
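To try these two out, here's a small sketch using Pattern.matches on single-character strings. The class name CharClassDemo and its method names are made up; the intersection is written without spaces inside the brackets, which is the form the Java documentation uses.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for negation and intersection in character classes
class CharClassDemo {
    // True when s is exactly one character other than 'x'
    static boolean isNotX(String s) {
        return Pattern.matches("[^x]", s);
    }

    // True when s is exactly one character in both a-z and x-z
    static boolean isXtoZ(String s) {
        return Pattern.matches("[a-z&&[x-z]]", s);
    }

    public static void main(String[] args) {
        System.out.println("isNotX(a) = " + isNotX("a"));
        System.out.println("isNotX(x) = " + isNotX("x"));
        System.out.println("isXtoZ(y) = " + isXtoZ("y"));
        System.out.println("isXtoZ(w) = " + isXtoZ("w"));
    }
}
```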
predefined character classes
\d --- any digit
\w --- a word character ( ie [a-zA-Z_0-9])
. --- any character ( except line terminators, by default )
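A quick way to get a feel for the predefined classes is to test single characters against them. This is only a sketch; the class name PredefinedDemo and its method names are made up. Note that each backslash is doubled inside the Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the predefined character classes
class PredefinedDemo {
    // \d : a single digit
    static boolean isDigit(String s) {
        return Pattern.matches("\\d", s);
    }

    // \w : a single word character, i.e. [a-zA-Z_0-9]
    static boolean isWord(String s) {
        return Pattern.matches("\\w", s);
    }

    // . : any single character (line terminators excluded by default)
    static boolean isAny(String s) {
        return Pattern.matches(".", s);
    }

    public static void main(String[] args) {
        System.out.println("isDigit(7) = " + isDigit("7"));
        System.out.println("isWord(_) = " + isWord("_"));
        System.out.println("isWord(-) = " + isWord("-"));
        System.out.println("isAny(-) = " + isAny("-"));
    }
}
```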
For a detailed tutorial, check out http://java.sun.com/docs/books/tutorial/essential/regex/index.html
Cons: Regular expressions are easier to write than they are to read, so use them only if there aren't too many people apart from you maintaining the code.
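One thing that can soften the readability problem in Java is the Pattern.COMMENTS flag, which makes the engine ignore whitespace in the pattern and treat # as the start of a comment running to the end of the line. Here's a sketch of the simple email pattern written that way. The class name ReadableRegexDemo and the method isValid are made up for this example, and how far this helps in practice is a matter of taste.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: the simple email pattern, split into commented pieces
class ReadableRegexDemo {
    // With Pattern.COMMENTS, whitespace in the pattern is ignored and
    // '#' starts a comment that runs to the end of the line
    static final Pattern EMAIL = Pattern.compile(
          "[a-zA-Z]*   # letters \n"
        + "[0-9]*      # digits \n"
        + "@           # the at sign \n"
        + "[a-zA-Z]*   # letters of the domain name \n"
        + "\\.         # a literal dot \n"
        + "[a-zA-Z]*   # letters after the dot \n",
        Pattern.COMMENTS);

    // True when the trimmed input fits the whole pattern
    static boolean isValid(String email) {
        return EMAIL.matcher(email.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc123@example.com"));
        System.out.println(isValid("abc@example"));
    }
}
```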