
Python regular expressions

A regular expression is a special sequence of characters that helps you check whether a string matches a pattern.

Python has included the re module since version 1.5. It provides Perl-style regular expression patterns.

The re module gives the Python language full regular expression functionality.

The compile function produces a regular expression object from a pattern string and optional flag arguments. This object has a set of methods for regular expression matching and substitution.

The re module also provides module-level functions that do the same work as these methods, taking a pattern string as their first argument.
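As a rough illustration of the two styles described above, the following sketch runs the same search once through a compiled pattern object and once through the module-level function. The sample text and the variable names are made up for this illustration.

```python
import re

text = "order 66 shipped"  # hypothetical sample text

# Style 1: compile once, then call methods on the pattern object.
digits = re.compile(r"\d+")
compiled_result = digits.search(text).group()

# Style 2: module-level function, with the pattern string as the first argument.
module_result = re.search(r"\d+", text).group()

print(compiled_result, module_result)
```

Both calls find the same text. Compiling first is mainly worthwhile when the same pattern is reused many times.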

This chapter focuses on the regular expression functions commonly used in Python.


re.match function

re.match attempts to match a pattern at the beginning of the string. If the pattern does not match at the start position, match() returns None.

Function syntax:

re.match(pattern, string, flags=0)

Function parameter description:

Parameter    Description
pattern      The regular expression to match.
string       The string to match against.
flags        Flags that control how the regular expression is matched, for example whether matching is case-sensitive or multi-line. See: Regular Expression Modifiers - Optional Flags

On success, re.match returns a match object; otherwise it returns None.

We can use the match object's group(num) or groups() methods to get the matched text.

Match object method    Description
group(num=0)           Returns the string matched by the whole expression. group() can take several group numbers at once, in which case it returns a tuple containing the values of those groups.
groups()               Returns a tuple containing all the group strings, from group 1 up to the last group in the pattern.
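The two methods described above can be sketched as follows. The date string and the three-group pattern are invented for this illustration and do not come from the original examples.

```python
import re

# Hypothetical sample: a date prefix captured in three groups.
m = re.match(r"(\d+)-(\d+)-(\d+)", "2018-07-15 release")

whole = m.group()         # whole match, same as m.group(0)
year = m.group(1)         # a single group
pair = m.group(1, 3)      # several group numbers at once give a tuple
all_groups = m.groups()   # every group, starting from group 1

print(whole)
print(year)
print(pair)
print(all_groups)
```

Note that groups() never includes the whole match; use group() or group(0) for that.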

Example

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import re

print(re.match('www', 'www.welookups.com').span())  # Matches at the starting position
print(re.match('com', 'www.welookups.com'))         # Does not match at the starting position

The above example produces the following output:

(0, 3)
None

Example

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)

if matchObj:
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1) : ", matchObj.group(1))
    print("matchObj.group(2) : ", matchObj.group(2))
else:
    print("No match!!")

The above example produces the following output:

matchObj.group() :  Cats are smarter than dogs
matchObj.group(1) :  Cats
matchObj.group(2) :  smarter

re.search method

re.search scans the entire string and returns the first successful match.

Function syntax:

re.search(pattern, string, flags=0)

Function parameter description:

Parameter    Description
pattern      The regular expression to match.
string       The string to search.
flags        Flags that control how the regular expression is matched, for example whether matching is case-sensitive or multi-line.

On success, re.search returns a match object; otherwise it returns None.

We can use the match object's group(num) or groups() methods to get the matched text.

Match object method    Description
group(num=0)           Returns the string matched by the whole expression. group() can take several group numbers at once, in which case it returns a tuple containing the values of those groups.
groups()               Returns a tuple containing all the group strings, from group 1 up to the last group in the pattern.

Example

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import re

print(re.search('www', 'www.welookups.com').span())  # Match found at the starting position
print(re.search('com', 'www.welookups.com').span())  # Match found later in the string

The above example produces the following output:

(0, 3)
(14, 17)

Example

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

searchObj = re.search(r'(.*) are (.*?) .*', line, re.M | re.I)

if searchObj:
    print("searchObj.group() : ", searchObj.group())
    print("searchObj.group(1) : ", searchObj.group(1))
    print("searchObj.group(2) : ", searchObj.group(2))
else:
    print("Nothing found!!")

The above example produces the following output:

searchObj.group() :  Cats are smarter than dogs
searchObj.group(1) :  Cats
searchObj.group(2) :  smarter

The difference between re.match and re.search

re.match only matches at the beginning of the string: if the pattern does not match there, the match fails and the function returns None. re.search scans the whole string until it finds a match.

Example

#!/usr/bin/python
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'dogs', line, re.M | re.I)
if matchObj:
    print("match --> matchObj.group() : ", matchObj.group())
else:
    print("No match!!")

matchObj = re.search(r'dogs', line, re.M | re.I)
if matchObj:
    print("search --> matchObj.group() : ", matchObj.group())
else:
    print("No match!!")

The above example produces the following output:

No match!!
search --> matchObj.group() :  dogs

Search and Replace

Python's re module provides re.sub for replacing matches in strings.

Syntax:

re.sub(pattern, repl, string, count=0, flags=0)

Parameters:

  • pattern : The regular expression pattern string.
  • repl : The replacement string, which can also be a function.
  • string : The original string in which matches are searched for and replaced.
  • count : The maximum number of replacements to make. The default, 0, replaces all matches.

Example

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import re

phone = "2004-959-559 # This is a foreign phone number"

# Remove the Python-style comment from the string
num = re.sub(r'#.*$', "", phone)
print("phone number is: ", num)

# Remove everything that is not a digit
num = re.sub(r'\D', "", phone)
print("phone number is : ", num)

The above example produces the following output:

phone number is:  2004-959-559
phone number is :  2004959559

The repl argument as a function

The following example multiplies each number in the string by 2:

Example

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import re

# Multiply the matched number by 2
def double(matched):
    value = int(matched.group('value'))
    return str(value * 2)

s = 'A23G4HFD567'
print(re.sub(r'(?P<value>\d+)', double, s))

Execution output is:

A46G8HFD1134

re.compile function

The compile function compiles a regular expression and produces a regular expression (Pattern) object, whose match() and search() methods can then be used.

The syntax is:

re.compile(pattern, flags=0)

Parameters:

  • pattern : A regular expression in the form of a string

  • flags : Optional, indicating matching mode, such as ignoring case, multi-line mode, etc. The specific parameters are:

    1. re.I ignores case
    2. re.L makes the special character classes \w, \W, \b, \B, \s, \S depend on the current locale
    3. re.M enables multi-line mode
    4. re.S makes . match any character, including line breaks (by default . does not match line breaks)
    5. re.U makes the special character classes \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character property database
    6. re.X ignores whitespace and comments after # in the pattern, for readability
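A short sketch of how some of the flags listed above change matching. The sample text and the patterns are assumptions chosen for this illustration.

```python
import re

text = "First line\nsecond LINE"  # hypothetical two-line sample

# re.I: case is ignored, so both 'line' and 'LINE' are found.
ignore_case = re.compile(r"line", re.I).findall(text)

# re.M: ^ also matches at the start of each line.
line_starts = re.compile(r"^\w+", re.M).findall(text)

# re.S: without it . stops at the line break, with it . crosses the line break.
dot_default = re.compile(r"First.*second").search(text)
dot_all = re.compile(r"First.*second", re.S).search(text)

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    (\w+)   # a word
    \s+     # some whitespace
    (\w+)   # another word
""", re.X)
first_two = verbose.match(text).groups()

print(ignore_case, line_starts, dot_default, dot_all.group(), first_two)
```

Several flags can be combined with the | operator, as in re.M | re.I.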

Example

>>> import re
>>> pattern = re.compile(r'\d+')                     # Matches one or more digits
>>> m = pattern.match('one12twothree34four')         # Match from the start: no match
>>> print(m)
None
>>> m = pattern.match('one12twothree34four', 2, 10)  # Match from the 'e': no match
>>> print(m)
None
>>> m = pattern.match('one12twothree34four', 3, 10)  # Match from the '1': succeeds
>>> print(m)                                         # A Match object is returned
<re.Match object; span=(3, 5), match='12'>
>>> m.group(0)   # The 0 can be omitted
'12'
>>> m.start(0)   # The 0 can be omitted
3
>>> m.end(0)     # The 0 can be omitted
5
>>> m.span(0)    # The 0 can be omitted
(3, 5)

In the above, a Match object is returned when the match is successful, where:

  • The group([group1, ...]) method returns the string matched by one or more groups. To get the whole matched substring, use group() or group(0).
  • The start([group]) method returns the starting position, within the whole string, of the substring matched by the group (the index of the substring's first character). The argument defaults to 0.
  • The end([group]) method returns the end position, within the whole string, of the substring matched by the group (the index of the substring's last character, plus 1). The argument defaults to 0.
  • The span([group]) method returns (start(group), end(group)).

Look at an example:

Example

>>> import re
>>> pattern = re.compile(r'([a-z]+) ([a-z]+)', re.I)  # re.I ignores case
>>> m = pattern.match('Hello World Wide Web')
>>> print(m)                                          # Match succeeds, a Match object is returned
<re.Match object; span=(0, 11), match='Hello World'>
>>> m.group(0)   # The whole matched substring
'Hello World'
>>> m.span(0)    # The index span of the whole matched substring
(0, 11)
>>> m.group(1)   # The substring matched by the first group
'Hello'
>>> m.span(1)    # The index span of the first group
(0, 5)
>>> m.group(2)   # The substring matched by the second group
'World'
>>> m.span(2)    # The index span of the second group
(6, 11)
>>> m.groups()   # Equivalent to (m.group(1), m.group(2), ...)
('Hello', 'World')
>>> m.group(3)   # There is no third group
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group

findall

Find all substrings matched by the regular expression in the string and return a list, or an empty list if no match is found.

Note: match and search return a single match, while findall returns all matches.
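The difference can be sketched on one string. The sample text is invented for this illustration, and the module-level functions are used here for brevity.

```python
import re

text = "a1b22c333"  # hypothetical sample string

at_start = re.match(r"\d+", text)            # None: the string does not start with a digit
first_found = re.search(r"\d+", text).group()  # only the first run of digits
every_found = re.findall(r"\d+", text)       # every run of digits, as a list of strings

print(at_start)
print(first_found)
print(every_found)
```

Unlike match and search, findall returns plain strings rather than match objects.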

The syntax is:

findall(string[, pos[, endpos]])

Parameters:

  • string : The string to be matched.
  • pos : An optional parameter that specifies the position in the string where the search starts. The default is 0.
  • endpos : An optional parameter that specifies the position in the string where the search stops. The default is the length of the string.

Find all the numbers in the string:

Example

# -*- coding:UTF8 -*-

import re

pattern = re.compile(r'\d+')  # Find numbers
result1 = pattern.findall('welookups 123 google 456')
result2 = pattern.findall('we88lookups123google456', 0, 10)

print(result1)
print(result2)

Output results:

['123', '456']
['88']

re.finditer

Like findall, finditer finds all substrings that the regular expression matches in the string, but returns them as an iterator of match objects.

re.finditer(pattern, string, flags=0)
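A minimal sketch of iterating over the results. The sample string is an assumption made for this illustration.

```python
import re

text = "12a32bc43jf3"  # hypothetical sample string

spans = []
for m in re.finditer(r"\d+", text):
    # Each item is a match object, so group() and span() are both available.
    print(m.group(), m.span())
    spans.append((m.group(), m.span()))
```

Because each item is a match object, finditer is a reasonable choice when the positions of the matches are needed as well as their text.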




