Java – Regular Expressions

Java provides the java.util.regex package for pattern matching with regular expressions. Java's regular expression syntax is very similar to that of the Perl programming language.

A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. Regular expressions can be used to search, edit, or manipulate text and data.

The java.util.regex package primarily consists of the following three classes −

  • Pattern Class − A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile() methods, which will then return a Pattern object. These methods accept a regular expression as the first argument.

  • Matcher Class − A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher() method on a Pattern object.

  • PatternSyntaxException − A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.

Capturing Groups

Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters “d”, “o”, and “g”.

Capturing groups are numbered by counting their opening parentheses from the left to the right. In the expression ((A)(B(C))), for example, there are four such groups −

  • ((A)(B(C)))
  • (A)
  • (B(C))
  • (C)

To find out how many groups are present in the expression, call the groupCount method on a matcher object. The groupCount method returns an int showing the number of capturing groups present in the matcher’s pattern.

There is also a special group, group 0, which always represents the entire expression. This group is not included in the total reported by groupCount.
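
As a small sketch of the numbering rule described above, the following program prints the group count and every group for the expression ((A)(B(C))). The class name GroupCountDemo and the helper countGroups are illustrative names, not part of java.util.regex −

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {

   // Number of capturing groups in the regex; group 0 (the whole match) is not counted.
   static int countGroups(String regex) {
      return Pattern.compile(regex).matcher("").groupCount();
   }

   public static void main(String[] args) {
      Matcher m = Pattern.compile("((A)(B(C)))").matcher("ABC");
      System.out.println("groupCount(): " + m.groupCount());

      if (m.matches()) {
         // Group 0 is the whole match; groups 1 to 4 follow the opening parentheses.
         for (int i = 0; i <= m.groupCount(); i++) {
            System.out.println("group(" + i + "): " + m.group(i));
         }
      }
   }
}
```

The loop runs up to and including groupCount() because group 0 is always available in addition to the counted groups.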

Example

The following example illustrates how to find a digit string in a given alphanumeric string −

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   public static void main(String args[]) {
      // String to be scanned to find the pattern.
      String line = "This order was placed for QT3000! OK?";
      String pattern = "(.*)(\\d+)(.*)";

      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);

      // Now create matcher object.
      Matcher m = r.matcher(line);
      if (m.find()) {
         System.out.println("Found value: " + m.group(0));
         System.out.println("Found value: " + m.group(1));
         System.out.println("Found value: " + m.group(2));
      } else {
         System.out.println("NO MATCH");
      }
   }
}

This will produce the following result −

Output

Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
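Found value: 0

The second group contains only the final 0 because the first (.*) is greedy: it consumes as much of the input as it can while still leaving at least one digit for (\d+) to match.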

Regular Expression Syntax

The following table lists the regular expression metacharacter syntax available in Java −

Subexpression    Matches
^                Matches the beginning of the input, or of each line when the MULTILINE flag is set.
$                Matches the end of the input (or just before a final line terminator), or the end of each line when the MULTILINE flag is set.
.                Matches any single character except a line terminator. With the DOTALL flag, it matches line terminators as well.
[…]              Matches any single character in brackets.
[^…]             Matches any single character not in brackets.
\A               Matches the beginning of the entire input.
\z               Matches the end of the entire input.
\Z               Matches the end of the entire input, but before the final line terminator, if any.
re*              Matches 0 or more occurrences of the preceding expression.
re+              Matches 1 or more occurrences of the preceding expression.
re?              Matches 0 or 1 occurrence of the preceding expression.
re{n}            Matches exactly n occurrences of the preceding expression.
re{n,}           Matches n or more occurrences of the preceding expression.
re{n,m}          Matches at least n and at most m occurrences of the preceding expression.
a|b              Matches either a or b.
(re)             Groups regular expressions and remembers the matched text.
(?:re)           Groups regular expressions without remembering the matched text.
(?>re)           Matches the independent pattern without backtracking.
\w               Matches a word character. Equivalent to [a-zA-Z_0-9].
\W               Matches a nonword character.
\s               Matches a whitespace character. Equivalent to [ \t\n\x0B\f\r].
\S               Matches a nonwhitespace character.
\d               Matches a digit. Equivalent to [0-9].
\D               Matches a nondigit.
\G               Matches the point where the last match finished.
\1, \2, ...      Back-reference to capturing group number 1, 2, and so on.
\b               Matches a word boundary.
\B               Matches a nonword boundary.
\n, \t, etc.     Matches newlines, tabs, and other control characters.
\Q               Escape (quote) all characters up to \E.
\E               Ends quoting begun with \Q.
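
A few of the constructs in the table can be tried out with a short program. This is only a sketch: the class name SyntaxDemo and the helper fullMatch are illustrative, and the sample inputs are made up for the demonstration −

```java
import java.util.regex.Pattern;

class SyntaxDemo {

   // True if the whole input matches the regex (thin wrapper around Pattern.matches).
   static boolean fullMatch(String regex, String input) {
      return Pattern.matches(regex, input);
   }

   public static void main(String[] args) {
      // Quantifier with a predefined class: exactly three digits.
      System.out.println(fullMatch("\\d{3}", "123"));            // true

      // Back-reference: \1 must repeat whatever group 1 captured.
      System.out.println(fullMatch("(\\w+) \\1", "bye bye"));    // true

      // \Q...\E quotes metacharacters, so "." and "+" are literal here.
      System.out.println(fullMatch("\\Qa.b+c\\E", "a.b+c"));     // true
      System.out.println(fullMatch("\\Qa.b+c\\E", "aXbbc"));     // false

      // "." does not match a newline unless DOTALL is enabled.
      System.out.println(fullMatch("a.b", "a\nb"));              // false
      System.out.println(Pattern.compile("a.b", Pattern.DOTALL)
                                .matcher("a\nb").matches());     // true
   }
}
```

Note that every backslash in the regular expression is doubled inside a Java string literal.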

Methods of the Matcher Class

Here is a list of useful instance methods −

Index Methods

Index methods provide useful index values that show precisely where the match was found in the input string −

Sr.No.   Method & Description
1        public int start()
         Returns the start index of the previous match.
2        public int start(int group)
         Returns the start index of the subsequence captured by the given group during the previous match operation.
3        public int end()
         Returns the offset after the last character matched.
4        public int end(int group)
         Returns the offset after the last character of the subsequence captured by the given group during the previous match operation.
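
The group-specific variants can be sketched as follows. The class GroupIndexDemo, the helper groupSpan, and the input "id=42;" are assumptions made for this illustration −

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexDemo {

   // Returns {start, end} of the given group in the first match, or null if nothing matches.
   static int[] groupSpan(String regex, String input, int group) {
      Matcher m = Pattern.compile(regex).matcher(input);
      if (!m.find()) {
         return null;
      }
      return new int[] { m.start(group), m.end(group) };
   }

   public static void main(String[] args) {
      String regex = "(\\w+)=(\\d+)";
      String input = "id=42;";

      for (int group = 0; group <= 2; group++) {
         int[] span = groupSpan(regex, input, group);
         System.out.println("group " + group + ": start=" + span[0] + ", end=" + span[1]);
      }
   }
}
```

For this input, group 0 spans indexes 0 to 5, group 1 spans 0 to 2, and group 2 spans 3 to 5; in each case the end value is one past the last matched character.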

Study Methods

Study methods review the input string and return a boolean indicating whether or not the pattern is found −

Sr.No.   Method & Description
1        public boolean lookingAt()
         Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
2        public boolean find()
         Attempts to find the next subsequence of the input sequence that matches the pattern.
3        public boolean find(int start)
         Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
4        public boolean matches()
         Attempts to match the entire region against the pattern.
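
The sketch below contrasts these methods on one input. The class StudyMethodsDemo, the helper findFrom, and the sample sentence are hypothetical and chosen only for illustration −

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StudyMethodsDemo {

   // Start index of the first match at or after 'from', or -1 if there is none.
   static int findFrom(String regex, String input, int from) {
      Matcher m = Pattern.compile(regex).matcher(input);
      return m.find(from) ? m.start() : -1;
   }

   public static void main(String[] args) {
      String input = "42 apples and 7 pears";
      Matcher m = Pattern.compile("\\d+").matcher(input);

      // The input begins with digits, so lookingAt() succeeds.
      System.out.println("lookingAt(): " + m.lookingAt());   // true

      // The whole input is not made of digits, so matches() fails.
      System.out.println("matches(): " + m.matches());       // false

      // Searching from index 3 skips "42" and finds "7" at index 14.
      System.out.println("find(3) starts at: " + findFrom("\\d+", input, 3));
   }
}
```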

Replacement Methods

Replacement methods replace text in an input string −

Sr.No.   Method & Description
1        public Matcher appendReplacement(StringBuffer sb, String replacement)
         Implements a non-terminal append-and-replace step.
2        public StringBuffer appendTail(StringBuffer sb)
         Implements a terminal append-and-replace step.
3        public String replaceAll(String replacement)
         Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
4        public String replaceFirst(String replacement)
         Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
5        public static String quoteReplacement(String s)
         Returns a literal replacement String for the specified String. The result can be passed as a literal replacement to the appendReplacement method of the Matcher class.
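
In a replacement string, "$" refers to a captured group and "\" escapes the next character, which is why quoteReplacement exists. The following sketch shows both behaviours; the class QuoteReplacementDemo and the helper replaceLiteral are illustrative names −

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {

   // Replaces every match of regex with the given text, taken literally.
   static String replaceLiteral(String input, String regex, String literal) {
      return Pattern.compile(regex)
                    .matcher(input)
                    .replaceAll(Matcher.quoteReplacement(literal));
   }

   public static void main(String[] args) {
      // Quoted: "$5" is inserted as plain text.
      System.out.println(replaceLiteral("Total: PRICE", "PRICE", "$5"));   // Total: $5

      // Unquoted: "$2" and "$1" refer to the captured groups.
      String swapped = Pattern.compile("(\\d+)-(\\d+)")
                              .matcher("2024-01")
                              .replaceAll("$2/$1");
      System.out.println(swapped);                                         // 01/2024
   }
}
```

Without the quoting step, a replacement such as "$5" would be read as a reference to group 5, which does not exist in that pattern.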

The start and end Methods

The following example counts the number of times the word “cat” appears in the input string −

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "\\bcat\\b";
   private static final String INPUT = "cat cat cat cattie cat";

   public static void main(String args[]) {
      Pattern p = Pattern.compile(REGEX);
      Matcher m = p.matcher(INPUT);   // get a matcher object
      int count = 0;

      while (m.find()) {
         count++;
         System.out.println("Match number " + count);
         System.out.println("start(): " + m.start());
         System.out.println("end(): " + m.end());
      }
   }
}

This will produce the following result −

Output

Match number 1
start(): 0
end(): 3
Match number 2
start(): 4
end(): 7
Match number 3
start(): 8
end(): 11
Match number 4
start(): 19
end(): 22

You can see that this example uses word boundaries to ensure that the letters “c” “a” “t” are not merely a substring in a longer word. It also gives some useful information about where in the input string the match has occurred.

The start method returns the start index of the previous match, and the end method returns the index of the last character matched, plus one.

The matches and lookingAt Methods

The matches and lookingAt methods both attempt to match an input sequence against a pattern. The difference, however, is that matches requires the entire input sequence to be matched, while lookingAt does not.

Both methods always start at the beginning of the input string. The following example illustrates the difference −

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "foo";
   private static final String INPUT = "fooooooooooooooooo";
   private static Pattern pattern;
   private static Matcher matcher;

   public static void main(String args[]) {
      pattern = Pattern.compile(REGEX);
      matcher = pattern.matcher(INPUT);

      System.out.println("Current REGEX is: " + REGEX);
      System.out.println("Current INPUT is: " + INPUT);

      System.out.println("lookingAt(): " + matcher.lookingAt());
      System.out.println("matches(): " + matcher.matches());
   }
}

This will produce the following result −

Output

Current REGEX is: foo
Current INPUT is: fooooooooooooooooo
lookingAt(): true
matches(): false

The replaceFirst and replaceAll Methods

The replaceFirst and replaceAll methods replace the text that matches a given regular expression. As their names indicate, replaceFirst replaces the first occurrence, and replaceAll replaces all occurrences.

The following example demonstrates replaceAll −

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "dog";
   private static String INPUT = "The dog says meow. " + "All dogs say meow.";
   private static String REPLACE = "cat";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);

      // get a matcher object
      Matcher m = p.matcher(INPUT);
      INPUT = m.replaceAll(REPLACE);
      System.out.println(INPUT);
   }
}

This will produce the following result −

Output

The cat says meow. All cats say meow.
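
For comparison, here is a sketch that applies both replaceFirst and replaceAll to the same input. The class ReplaceFirstDemo and its two helper methods are hypothetical names used for this illustration −

```java
import java.util.regex.Pattern;

class ReplaceFirstDemo {

   private static final Pattern DOG = Pattern.compile("dog");

   // Replaces only the first occurrence of "dog".
   static String firstOnly(String input) {
      return DOG.matcher(input).replaceFirst("cat");
   }

   // Replaces every occurrence of "dog".
   static String all(String input) {
      return DOG.matcher(input).replaceAll("cat");
   }

   public static void main(String[] args) {
      String input = "The dog says meow. All dogs say meow.";
      System.out.println(firstOnly(input));   // The cat says meow. All dogs say meow.
      System.out.println(all(input));         // The cat says meow. All cats say meow.
   }
}
```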

The appendReplacement and appendTail Methods

The Matcher class also provides appendReplacement and appendTail methods for text replacement.

The following example uses both methods to replace every match of a*b with a hyphen −

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "a*b";
   private static String INPUT = "aabfooaabfooabfoob";
   private static String REPLACE = "-";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);

      // get a matcher object
      Matcher m = p.matcher(INPUT);
      StringBuffer sb = new StringBuffer();
      while (m.find()) {
         m.appendReplacement(sb, REPLACE);
      }
      m.appendTail(sb);
      System.out.println(sb.toString());
   }
}

This will produce the following result −

Output

-foo-foo-foo-

PatternSyntaxException Class Methods

A PatternSyntaxException is an unchecked exception that indicates a syntax error in a regular expression pattern. The PatternSyntaxException class provides the following methods to help you determine what went wrong −

Sr.No.   Method & Description
1        public String getDescription()
         Retrieves the description of the error.
2        public int getIndex()
         Retrieves the error index.
3        public String getPattern()
         Retrieves the erroneous regular expression pattern.
4        public String getMessage()
         Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern.
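
The sketch below compiles a deliberately malformed pattern and prints what each of these methods reports. The class SyntaxErrorDemo, the helper isValid, and the sample pattern "(unclosed" are assumptions for this illustration; the exact wording of the description depends on the Java runtime, so no output is shown −

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDemo {

   // True if the regex compiles, false if it has a syntax error.
   static boolean isValid(String regex) {
      try {
         Pattern.compile(regex);
         return true;
      } catch (PatternSyntaxException e) {
         return false;
      }
   }

   public static void main(String[] args) {
      try {
         Pattern.compile("(unclosed");   // missing closing parenthesis
      } catch (PatternSyntaxException e) {
         System.out.println("Description: " + e.getDescription());
         System.out.println("Index: " + e.getIndex());
         System.out.println("Pattern: " + e.getPattern());
         System.out.println("Message:\n" + e.getMessage());
      }
   }
}
```

Because the exception is unchecked, the try block is optional; catching it is mainly useful when the pattern comes from user input.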

Java – Methods

A Java method is a collection of statements that are grouped together to perform an operation. When you call the System.out.println() method, for example, the system actually executes several statements in order to display a message on the console.

Now you will learn how to create your own methods with or without return values, invoke a method with or without parameters, and apply method abstraction in the program design.

Creating Method

Consider the following example, which illustrates the syntax of a method −

Syntax

public static int methodName(int a, int b) {
   // body
}

Here,

  • public static − modifier

  • int − return type

  • methodName − name of the method

  • a, b − formal parameters

  • int a, int b − list of parameters

A method definition consists of a method header and a method body, as shown in the following syntax −

Syntax

modifier returnType nameOfMethod (Parameter List) {
   // method body
}

The syntax shown above includes −

  • modifier − It defines the access type of the method and it is optional to use.

  • returnType − The data type of the value the method returns, or void if the method does not return a value.

  • nameOfMethod − This is the method name. The method signature consists of the method name and the parameter list.

  • Parameter List − The type, order, and number of the method's parameters. Parameters are optional; a method may have zero parameters.

  • method body − The statements that define what the method does.

Example

Here is the source code of a method called minFunction(). This method takes two parameters, n1 and n2, and returns the smaller of the two −

/** the snippet returns the minimum between two numbers */
public static int minFunction(int n1, int n2) {
   int min;
   if (n1 > n2)
      min = n2;
   else
      min = n1;

   return min;
}

Method Calling

To use a method, you must call it. A method call takes one of two forms, depending on whether the method returns a value or returns nothing.

The process of method calling is simple. When a program invokes a method, the program control gets transferred to the called method. This called method then returns control to the caller in two conditions, when −

  • the return statement is executed.
  • the closing brace of the method is reached.
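
Both conditions can be seen in a short sketch. The class ReturnDemo and its methods firstNegative and printIfPositive are hypothetical names introduced only for this illustration −

```java
class ReturnDemo {

   // Returns the index of the first negative value, or -1 if there is none.
   static int firstNegative(int[] values) {
      for (int i = 0; i < values.length; i++) {
         if (values[i] < 0) {
            return i;   // control goes back to the caller immediately
         }
      }
      return -1;
   }

   // A void method can return early, or simply run to its closing brace.
   static void printIfPositive(int n) {
      if (n <= 0) {
         return;        // early exit: nothing is printed
      }
      System.out.println(n + " is positive");
   }                    // control returns to the caller here

   public static void main(String[] args) {
      System.out.println(firstNegative(new int[] {3, -1, -5}));   // 1
      printIfPositive(7);    // prints "7 is positive"
      printIfPositive(-2);   // prints nothing
   }
}
```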

A call to a method that returns void is a statement by itself. Consider the following example −

System.out.println("This is tutorialspoint.com!");

A call to a method that returns a value can be used as an expression, as in the following example −

int result = sum(6,9);

The following example demonstrates how to define a method and how to call it −

Example

public class ExampleMinNumber {

   public static void main(String[] args) {
      int a = 11;
      int b = 6;
      int c = minFunction(a, b);
      System.out.println("Minimum Value = " + c);
   }

   /** returns the minimum of two numbers */
   public static int minFunction(int n1, int n2) {
      int min;
      if (n1 > n2)
         min = n2;
      else
         min = n1;

      return min;
   }
}

This will produce the following result −

Output

Minimum Value = 6

The void Keyword

The void keyword allows us to create methods that do not return a value. The following example uses a void method named methodRankPoints. A call to a void method must be a statement, such as methodRankPoints(255.7);, which ends with a semicolon as shown in the following example.

Example

public class ExampleVoid {

   public static void main(String[] args) {
      methodRankPoints(255.7);
   }

   public static void methodRankPoints(double points) {
      if (points >= 202.5) {
         System.out.println("Rank:A1");
      } else if (points >= 122.4) {
         System.out.println("Rank:A2");
      } else {
         System.out.println("Rank:A3");
      }
   }
}

This will produce the following result −

Output

Rank:A1

Passing Parameters by Value

When calling a method, you must pass arguments in the same order as their respective parameters in the method specification. Java passes all arguments by value; when an argument is an object, the value passed is a copy of the reference to that object.

Passing parameters by value means that the value of each argument is copied into the corresponding parameter when the method is called. Changes made to the parameter inside the method do not affect the caller's variable.

Example

The following program shows an example of passing parameters by value. The values of the arguments remain the same even after the method invocation.

public class swappingExample {

   public static void main(String[] args) {
      int a = 30;
      int b = 45;
      System.out.println("Before swapping, a = " + a + " and b = " + b);

      // Invoke the swap method
      swapFunction(a, b);
      System.out.println("\n**Now, Before and After swapping values will be same here**:");
      System.out.println("After swapping, a = " + a + " and b is " + b);
   }

   public static void swapFunction(int a, int b) {
      System.out.println("Before swapping(Inside), a = " + a + " b = " + b);

      // Swap a with b (only the local copies change)
      int c = a;
      a = b;
      b = c;
      System.out.println("After swapping(Inside), a = " + a + " b = " + b);
   }
}

This will produce the following result −

Output

Before swapping, a = 30 and b = 45
Before swapping(Inside), a = 30 b = 45
After swapping(Inside), a = 45 b = 30

**Now, Before and After swapping values will be same here**:
After swapping, a = 30 and b is 45
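
The same rule applies when the argument is an object such as an array: the method receives a copy of the reference. The following sketch, which uses the hypothetical class PassByValueDemo and the hypothetical methods reassign and fill, shows that reassigning the parameter has no effect on the caller, while modifying the object it refers to does −

```java
class PassByValueDemo {

   // Points the local parameter at a new array; the caller's variable is unaffected.
   static void reassign(int[] data) {
      data = new int[] { -1 };
   }

   // Modifies the array that both the caller and the parameter refer to.
   static void fill(int[] data) {
      data[0] = 99;
   }

   public static void main(String[] args) {
      int[] numbers = { 1 };

      reassign(numbers);
      System.out.println("After reassign: " + numbers[0]);   // 1

      fill(numbers);
      System.out.println("After fill: " + numbers[0]);       // 99
   }
}
```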

Method Overloading

When a class has two or more methods with the same name but different parameters, it is known as method overloading. It is different from overriding, in which a subclass redefines an inherited method using the same name, parameter types, and number of parameters.

Consider the example discussed earlier for finding the minimum of two integers. Suppose we also want to find the minimum of two values of type double. Overloading lets us create two or more methods with the same name but different parameters.

The following example explains the same −

Example

public class ExampleOverloading {

   public static void main(String[] args) {
      int a = 11;
      int b = 6;
      double c = 7.3;
      double d = 9.4;
      int result1 = minFunction(a, b);

      // same function name with different parameters
      double result2 = minFunction(c, d);
      System.out.println("Minimum Value = " + result1);
      System.out.println("Minimum Value = " + result2);
   }

   // for integer
   public static int minFunction(int n1, int n2) {
      int min;
      if (n1 > n2)
         min = n2;
      else
         min = n1;

      return min;
   }

   // for double
   public static double minFunction(double n1, double n2) {
      double min;
      if (n1 > n2)
         min = n2;
      else
         min = n1;

      return min;
   }
}

This will produce the following result −

Output

Minimum Value = 6
Minimum Value = 7.3

Overloading methods makes a program more readable. Here, two methods have the same name but different parameters, and each returns the minimum of its two arguments, one for int values and one for double values.
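
For contrast with overloading, overriding can be sketched as follows. The classes Animal, Dog, and OverridingDemo are hypothetical and exist only to illustrate the distinction drawn at the start of this section −

```java
class Animal {
   // Method that subclasses may redefine.
   String sound() {
      return "...";
   }
}

class Dog extends Animal {
   // Same name and same (empty) parameter list as Animal.sound(): this is overriding.
   @Override
   String sound() {
      return "Woof";
   }
}

class OverridingDemo {
   public static void main(String[] args) {
      Animal pet = new Dog();
      // The Dog version runs, because the object is a Dog at run time.
      System.out.println(pet.sound());   // Woof
   }
}
```

Overloaded methods are selected at compile time from the argument types, whereas an overridden method is selected at run time from the actual class of the object.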

Using Command-Line Arguments

Sometimes you will want to pass some information into a program when you run it. This is accomplished by passing command-line arguments to main().

A command-line argument is the information that directly follows the program’s name on the command line when it is executed. Accessing the command-line arguments inside a Java program is straightforward: they are stored as strings in the String array passed to main().

Example

The following program displays all of the command-line arguments that it is called with −

public class CommandLine {

   public static void main(String args[]) {
      for (int i = 0; i < args.length; i++) {
         System.out.println("args[" + i + "]: " + args[i]);
      }
   }
}

Try executing this program as shown here −

$ java CommandLine this is a command line 200 -100

This will produce the following result −

Output

args[0]: this
args[1]: is
args[2]: a
args[3]: command
args[4]: line
args[5]: 200
args[6]: -100
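
Because every argument arrives as a String, numeric arguments must be converted before they can be used in arithmetic. The following sketch adds up the arguments that parse as integers and skips the rest; the class SumArgs and the method sumIntegers are illustrative names −

```java
class SumArgs {

   // Adds up the arguments that parse as integers; anything else is skipped.
   static int sumIntegers(String[] args) {
      int total = 0;
      for (String arg : args) {
         try {
            total += Integer.parseInt(arg);
         } catch (NumberFormatException e) {
            // Not an integer, so ignore this argument.
         }
      }
      return total;
   }

   public static void main(String[] args) {
      System.out.println("Sum of numeric arguments: " + sumIntegers(args));
   }
}
```

Run with the same arguments as above (this is a command line 200 -100), the program would report a sum of 100.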
