After reading my last post, I hope your fear of regular expressions has decreased. 😀
In today's post, I will cover repetitions, word boundaries, grouping and capturing, and alternation.


Before the explanations, a quick note: I use Java to solve the coding problems. The program below reads one line from standard input and prints how many times the pattern matches it.

import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class Regex_Pattern {    

    public static void main(String[] args) {
        
        Regex_Test tester = new Regex_Test();
        tester.checker("__________"); // paste your regex pattern between the quotes
    
    }
}

class Regex_Test {

    public void checker(String Regex_Pattern){
    
        Scanner Input = new Scanner(System.in);
        String Test_String = Input.nextLine();
        Pattern p = Pattern.compile(Regex_Pattern);
        Matcher m = p.matcher(Test_String);
        int Count = 0;
        while(m.find()){
            Count += 1;
        }
        System.out.format("Number of matches : %d",Count);
    }   
    
}

Matching {x} Repetitions

The {x} token matches exactly x repetitions of the preceding character, character class, or group. Tokens like this are known as quantifiers.
Pattern : \w{2}
Test_String: do

Pattern : [abc]{2}
Test_String: abcabc (3 matches: ab, ca, bc)
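
To see which substrings an exact quantifier actually picks up, here is a minimal sketch. The class name ExactRepetitionDemo and the helper findAll are my own illustrative names, not part of the test program above. Because find() continues after the end of the previous match, [abc]{2} finds three non-overlapping pairs in abcabc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactRepetitionDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\w{2}", "do"));       // [do]
        System.out.println(findAll("[abc]{2}", "abcabc")); // [ab, ca, bc]
    }
}
```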

Matching {x,y} Repetitions

The {x,y} token matches between x and y repetitions (both inclusive) of the preceding character, character class, or group. Leaving out y, as in {x,}, means x or more repetitions. These are also quantifiers.
Pattern : [xyz]{5,} (matches the characters x, y, or z five or more times in a row)
Test_String: xyzxyz (1 match)

Pattern : \w{1,4}\d{4,}
Test_String: abcb45677777abcd
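
The range quantifiers can be checked with a small sketch like the one below. RangeRepetitionDemo and findAll are illustrative names I made up, and the a{2,3} example is my own addition to show that a bounded quantifier is greedy: it takes as many repetitions as it can, up to the upper limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeRepetitionDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // {5,} has no upper bound, so all six characters are taken in one match.
        System.out.println(findAll("[xyz]{5,}", "xyzxyz"));                   // [xyzxyz]
        // The trailing "abcd" has no digits after it, so it does not match.
        System.out.println(findAll("\\w{1,4}\\d{4,}", "abcb45677777abcd"));   // [abcb45677777]
        // Greedy: first takes three a's, then the remaining two.
        System.out.println(findAll("a{2,3}", "aaaaa"));                       // [aaa, aa]
    }
}
```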

Matching Zero Or More Repetitions

The * symbol matches 0 or more repetitions of the preceding token, which can be a character, a character class, or a group.

Pattern : \w{1,4}\d*
Test_String: abcb45677777
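
The "zero" part is easy to overlook, so here is a small sketch of it. The class and method names are hypothetical. It uses Pattern.matches, which requires the whole input to match, unlike find() in the test program above.

```java
import java.util.regex.Pattern;

class ZeroOrMoreDemo {

    // Hypothetical helper: true if the WHOLE input matches \w{1,4}\d*
    static boolean wordThenOptionalDigits(String input) {
        return Pattern.matches("\\w{1,4}\\d*", input);
    }

    public static void main(String[] args) {
        System.out.println(wordThenOptionalDigits("abcb45677777")); // true
        System.out.println(wordThenOptionalDigits("abcb"));         // true, \d* matched zero digits
        System.out.println(wordThenOptionalDigits("abcde"));        // false, five word characters and no digits
        System.out.println(wordThenOptionalDigits(""));             // false, \w{1,4} needs at least one character
    }
}
```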

Matching One Or More Repetitions

The + symbol matches 1 or more repetitions of the preceding token, which can be a character, a character class, or a group.

Pattern : Ab+s
Test_String: As Abbbbss
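
A quick sketch of the difference between + and *, using made-up names (OneOrMoreDemo, firstMatch). With Ab+s the leading "As" is skipped because it has no b, while Ab*s accepts it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OneOrMoreDemo {

    // Hypothetical helper: returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("Ab+s", "As Abbbbss")); // Abbbbs
        System.out.println(firstMatch("Ab+s", "As"));         // null, + needs at least one b
        System.out.println(firstMatch("Ab*s", "As"));         // As, * allows zero b's
    }
}
```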

Matching Word Boundaries

\b
\b matches a word boundary: a position between a word character and a non-word character, or the start or end of the string when a word character sits there. It matches a position, not a character.
Three different positions qualify as word boundaries:
  • Before the first character in the string, if the first character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.
  • After the last character in the string, if the last character is a word character.

\B
The opposite of \b. It matches any position that is not a word boundary.

Pattern : \bcat\b
Test_String: A cat (1 match)
Test_String: Acat (0 matches)

Pattern : \Bcat\b
Test_String: Acat (1 match)
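
The following sketch counts matches for the boundary patterns above. WordBoundaryDemo and countMatches are illustrative names of my own; the counting loop is the same idea as in the test program at the top of the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {

    // Hypothetical helper: counts the non-overlapping matches of regex in input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("\\bcat\\b", "A cat")); // 1
        System.out.println(countMatches("\\bcat\\b", "Acat"));  // 0, no boundary between A and c
        System.out.println(countMatches("\\Bcat\\b", "Acat"));  // 1, \B succeeds between A and c
        System.out.println(countMatches("\\Bcat\\b", "A cat")); // 0, the space creates a boundary before c
    }
}
```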

Capturing & Non-Capturing Groups

()
Groups multiple tokens together so that a quantifier can be applied to the whole group.
The parentheses also create a numbered capturing group, which stores the part of the string matched by the regex inside the parentheses. The stored text can be extracted later or reused through a backreference.

(?:)
Groups multiple tokens together without creating a capture group, so nothing is stored.

Pattern : abc(.*)ij
Test_String: abcefgijklmn (1 match: abcefgij, group 1 captures efg)

Pattern : (?:ha)+
Test_String: hahaha haa hah! (3 matches)
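
To make the difference between the two kinds of groups concrete, here is a minimal sketch. GroupDemo, firstCapture and groupCount are hypothetical names; Matcher.group(int) and Matcher.groupCount() are the standard java.util.regex methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Hypothetical helper: text captured by group 1 of the first match, or null if there is no match.
    // Only call this with a pattern that has at least one capturing group.
    static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: number of capturing groups the pattern defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(firstCapture("abc(.*)ij", "abcefgijklmn")); // efg
        System.out.println(groupCount("abc(.*)ij"));                   // 1
        System.out.println(groupCount("(?:ha)+"));                     // 0, (?:) does not capture
    }
}
```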

Alternative Matching

|
The | symbol acts like a boolean OR: it matches the expression before or after the |. This is known as alternation.
Inside a character class, | has no special meaning and is just a literal | character. Outside a character class, it separates entire expressions (everything to the left or everything to the right of the vertical bar), so we must use parentheses to limit the scope of the alternation.

Pattern : b(a|e|i)d
Test_String: bad bud bod bed bid (3 matches)
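
Here is a small sketch of why the parentheses matter. AlternationDemo and findAll are names I made up for illustration, and the second and third patterns are my own variations on the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "bad bud bod bed bid";
        // Grouped: b, then one of a/e/i, then d.
        System.out.println(findAll("b(a|e|i)d", text)); // [bad, bed, bid]
        // Ungrouped: the alternatives are "ba", "e" and "id".
        System.out.println(findAll("ba|e|id", text));   // [ba, e, id]
        // Inside a character class, | is a literal character.
        System.out.println(findAll("[a|e]", "a|e"));    // [a, |, e]
    }
}
```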



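Note: In a Java string literal, write \\ instead of \, because \ is also the escape character for Java strings. For example, the pattern \w{2} is written as "\\w{2}" in Java source code.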
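
A short sketch of what goes wrong with a single backslash. EscapingDemo and countMatches are illustrative names. In Java source, "\b" is the backspace character (code 8), so "\bcat\b" compiles without error but searches for backspace characters around cat instead of word boundaries.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapingDemo {

    // Hypothetical helper: counts the non-overlapping matches of regex in input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The Java literal "\\d{2}" holds the five characters \d{2}.
        System.out.println("\\d{2}".length());                  // 5
        System.out.println(Pattern.matches("\\d{2}", "42"));    // true
        // Double backslash: a real word boundary.
        System.out.println(countMatches("\\bcat\\b", "A cat")); // 1
        // Single backslash: backspace characters, which the input does not contain.
        System.out.println(countMatches("\bcat\b", "A cat"));   // 0
        // Matching one literal backslash takes four backslashes in Java source.
        System.out.println(Pattern.matches("\\\\", "\\"));      // true
    }
}
```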

In the next post, I will discuss backreferences and assertions.