Definition
Regex is short for regular expression. In Python, a regular expression lets the user search a string for a pattern, for example to check whether the pattern occurs or to find every substring that matches it. To work with regular expressions, the user must first import the built-in re module.
Syntax:
import re
To build a regular expression, the user may use metacharacters, special sequences, and sets.
Metacharacters
Metacharacters are characters that have a special meaning inside a regular expression. Some of the commonly used metacharacters are:
Metacharacter | Description | Example |
---|---|---|
[] | Defines a set of characters; matches any one character from the set. | [a-j], [0-5] |
\ | Introduces a special sequence, or escapes a metacharacter so that it is matched literally. | "\d" |
. | Matches any single character except a newline. | py...n |
^ | Placed at the start of the pattern; the match must begin at the start of the string. | ^the |
$ | Placed at the end of the pattern; the match must finish at the end of the string. | python$ |
* | Placed after an element; matches zero or more occurrences of that element (in "oo*", the second o). | "oo*" |
+ | Placed after an element; matches one or more occurrences of that element. | "oo+" |
{} | Placed after an element; matches exactly the number of occurrences given inside the braces. | "oo{1}" |
\| | Alternation; matches either the pattern on its left or the pattern on its right. | practice\|study |
Example1
import re
txt = "Learning python is easy"
x = re.findall("^Learning.*easy$", txt)
print(x)
Output
['Learning python is easy']
Example2
import re
txt = "Learning python is easy"
x = re.findall("py...n", txt)
print(x)
Output
['python']
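The two examples above cover ^, $ and the dot. As a minimal sketch of the remaining quantifiers and of alternation, the snippet below runs *, +, {} and | against one string. The sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text, chosen so each quantifier gives a different result
txt = "so soo sooo practice makes perfect"

# "*" : an "s" followed by zero or more "o" (the lone "s" in "makes" also matches)
zero_or_more = re.findall(r"so*", txt)
# "+" : an "s" followed by one or more "o"
one_or_more = re.findall(r"so+", txt)
# "{2}" : an "s" followed by exactly two "o"
exactly_two = re.findall(r"so{2}", txt)
# "|" : either "practice" or "study"
either = re.findall(r"practice|study", txt)

print(zero_or_more)  # ['so', 'soo', 'sooo', 's']
print(one_or_more)   # ['so', 'soo', 'sooo']
print(exactly_two)   # ['soo', 'soo']
print(either)        # ['practice']
```

Note that each quantifier applies only to the element directly before it, which here is the single character "o".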
Special sequences
A special sequence consists of a backslash (\) followed by a character and has a special meaning in the pattern. It is safest to write such patterns as raw strings, for example r"\d", so that Python does not try to interpret the backslash itself. Some of the commonly used special sequences are:
- \A – Matches only at the start of the string. Example: \AHello matches Hello only when the string begins with it.
- \b – Matches at a word boundary, i.e. at the beginning or end of a word. Example: r"\bello" matches ello only at the beginning of a word; r"ello\b" matches ello only at the end of a word.
- \B – The opposite of \b. It matches only at a position that is not a word boundary. Example: r"\Bello" matches ello only when it is not at the beginning of a word.
- \d – Matches any single decimal digit, such as 0 to 9.
- \D – The opposite of \d. It matches any single character that is not a digit.
Example1
import re
test_string = "Hello world"
# raw string, so that "\A" reaches the regex engine unchanged
x = re.findall(r"\AHello", test_string)
print(x)
Output
['Hello']
Example2
import re
test_string = "Hello world"
x = re.findall(r"ello\b", test_string)
print(x)
Output
['ello']
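The sequences \d, \D and \B are not covered by the examples above. The following sketch tries them on a slightly extended test string; the string and the variable names are assumptions made for illustration.

```python
import re

# Hypothetical test string with a few digits appended
test_string = "Hello world 2024"

# \d matches each digit separately
digits = re.findall(r"\d", test_string)
# \D matches every character that is not a digit (letters and spaces here)
non_digits = re.findall(r"\D", test_string)
# \B requires that "ello" does NOT start at a word boundary (it follows "H")
inside_word = re.findall(r"\Bello", test_string)
# \b requires that "ello" starts a word, which never happens in this string
word_start = re.findall(r"\bello", test_string)

print(digits)               # ['2', '0', '2', '4']
print("".join(non_digits))  # Hello world
print(inside_word)          # ['ello']
print(word_start)           # []
```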
Sets
A set is a group of characters written inside square brackets []; it matches any one character from the group. With findall(), every matching character is returned in the order it appears in the string. Some of the common usages of sets are:
- [xyz] – Matches any one of the characters x, y, or z.
- [a-j] – Matches any one lowercase character in the range a to j. Matches are returned in the order they appear in the string, not in alphabetical order.
- [^xyz] – Matches any one character except x, y, or z.
- [123] – Matches any one of the digits 1, 2, or 3.
- [0-9] – Matches any one digit from 0 to 9.
- [0-4][0-4] – Matches two consecutive digits that are each between 0 and 4, such as 00, 13, or 44 (but not 09 or 27).
- [*+] – Inside a set, metacharacters such as * and + lose their special meaning, so this matches a literal * or +.
Example1
import re
txt = "Happy learning..."
# sets are case-sensitive, so the uppercase "H" is not matched by "h"
x = re.findall("[hpyz]", txt)
print(x)
Output
['p', 'p', 'y']
Example2
import re
txt = "Happy learning..."
# inside a set, "." matches a literal dot rather than any character
x = re.findall("[...]", txt)
print(x)
Output
['.', '.', '.']
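The remaining set forms from the list above can be sketched in the same way. The sample strings and variable names below are invented for illustration; the results show that matches come back in order of appearance and that [0-4][0-4] checks each digit separately.

```python
import re

# Hypothetical sample strings
txt = "badge 2019 44"

# [a-j] : lowercase letters from a to j, returned in order of appearance
in_range = re.findall(r"[a-j]", txt)
# [0-4][0-4] : two consecutive digits, each between 0 and 4 ("19" does not qualify)
two_digits = re.findall(r"[0-4][0-4]", txt)
# [^xyz] : every character except x, y and z
negated = re.findall(r"[^xyz]", "xylophone")
# [*+] : literal "*" and "+" characters
symbols = re.findall(r"[*+]", "2*3+4")

print(in_range)    # ['b', 'a', 'd', 'g', 'e']
print(two_digits)  # ['20', '44']
print(negated)     # ['l', 'o', 'p', 'h', 'o', 'n', 'e']
print(symbols)     # ['*', '+']
```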
Functions
The re module provides several functions. Four of the most commonly used are:
- findall()
- search()
- split()
- sub()
findall()
The findall() function finds all non-overlapping matches of the pattern and returns them as a list. If there is no match, it returns an empty list.
Example
import re
txt = "Happy learning..."
x = re.findall("earn", txt)
print(x)
Output
['earn']
search()
The search() function scans the string for the first location where the pattern matches. It returns a Match object if a match is found and None otherwise.
Example1
import re
txt = "Happy learning..."
x = re.search("earn", txt)
if x:
    print('Match found')
else:
    print('No match found')
Output
Match found
Example2
import re
txt = "Happy learning..."
x = re.search("python", txt)
if x:
    print('Match found')
else:
    print('No match found')
Output
No match found
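The examples above only test whether search() found something. As a small sketch, the Match object it returns can also report what was matched and where, through its group() and span() methods. The variable names are chosen for illustration.

```python
import re

txt = "Happy learning..."

# search() returns a Match object for the first match
match = re.search(r"earn", txt)
if match:
    print(match.group())  # the matched text: earn
    print(match.span())   # (start, end) indices of the match: (7, 11)

# When nothing matches, search() returns None
missing = re.search(r"python", txt)
print(missing)  # None
```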
split()
The split() function splits the string at each match of the pattern and returns the pieces as a list.
Example
import re
txt = "Happy learning..."
# split the text at each whitespace character
x = re.split(r"\s", txt)
print(x)
Output
['Happy', 'learning...']
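Two details of split() are easy to miss: consecutive whitespace characters produce empty strings when the pattern is "\s", and the optional maxsplit parameter limits the number of splits. The following sketch shows both, using a made-up string that contains a double space.

```python
import re

# Hypothetical text with two spaces between the first two words
txt = "Happy  python learning"

# "\s" splits at every single whitespace character, leaving an empty string
single = re.split(r"\s", txt)
# "\s+" treats a run of whitespace as one separator
runs = re.split(r"\s+", txt)
# maxsplit=1 stops after the first split
limited = re.split(r"\s+", txt, maxsplit=1)

print(single)   # ['Happy', '', 'python', 'learning']
print(runs)     # ['Happy', 'python', 'learning']
print(limited)  # ['Happy', 'python learning']
```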
sub()
The sub() function replaces every match of the pattern with the given replacement text and returns the new string.
Example
import re
txt = "Happy python learning..."
# replace each whitespace character with a hyphen
x = re.sub(r"\s", "-", txt)
print(x)
Output
Happy-python-learning...
Note: sub() also accepts an optional count parameter that limits the number of replacements made.
Example
import re
txt = "Happy python learning..."
# count=1 replaces only the first whitespace character
x = re.sub(r"\s", "-", txt, count=1)
print(x)
Output
Happy-python learning...