Commit e74c71c

author

yangwenqiang.ywq

committed

re and regex

Change-Id: I3ac924a221db7d2109ede7d2c984d173fe55747f

1 parent 9719aa1 commit e74c71c

File tree

3 files changed

+116

-2

lines changed

README.md
content
- re.md
- regex.md


`‎README.md‎`

Lines changed: 2 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -131,6 +131,8 @@ much of Python's power comes from its wide variety of very powerful`
`131`	`131`
`132`	`132`	`## [re](content/re.md)`
`133`	`133`
	`134`	`+## [regex](content/regex.md)`
	`135`	`+`
`134`	`136`	`## [colorama](content/colorama.md)`
`135`	`137`
`136`	`138`	`## [termcolor](content/termcolor.md)`

`‎content/re.md‎`

Lines changed: 44 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -69,7 +69,8 @@ match matches from the start of the string, search matches anywhere in the string,`
`69`	`69`	`>>> e = re.findall(r"\w","hello , world")`
`70`	`70`	`>>> e`
`71`	`71`	`['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']`
`72`		`-`
	`72`	`+>>> re.findall(r"\d+", "2333abc3uio890da123")`
	`73`	`+['2333', '3', '890', '123']`
`73`	`74`	```
`74`	`75`
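`75`	`76`	Besides searching, regular expressions have two other important uses: replacing and splitting, provided here by sub and split. Their signatures are `sub(pattern, repl, string, count=0, flags=0)` and `split(pattern, string, maxsplit=0, flags=0)`. sub returns the modified string and split returns a list of the pieces; the string passed in is left unchanged.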
`@@ -95,6 +96,8 @@ match matches from the start of the string, search matches anywhere in the string,`
`95`	`96`	`'hello , world'`
`96`	`97`	`>>> '%s , %s' % (re.match(r'(hello) , (world)', a).group(2), re.match(r'(hello) , (world)', a).group(1))`
`97`	`98`	`'world , hello'`
	`99`	`+>>> re.search(r"(.+?)\1+", 'dxabcabcyyyydxycxcxz').group()`
	`100`	`+'abcabc'`
`98`	`101`	```
`99`	`102`
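`100`	`103`	`Alongside sub there is also subn, which is used the same way but returns a tuple made up of the modified string and the number of substitutions`
`@@ -108,6 +111,16 @@ match matches from the start of the string, search matches anywhere in the string,`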
`108`	`111`	`'hello , world'`
`109`	`112`	```
`110`	`113`
	`114`	+Find consecutively repeated substrings; `\1` refers back to whatever the first group has already matched
	`115`	`+`
	`116`	+```
	`117`	`+In [4]: re.search(r"(.+?)\1+", 'dxabcabcyyyydxycxcxz').group()`
	`118`	`+Out[4]: 'abcabc'`
	`119`	`+`
	`120`	`+In [5]: re.search(r"(.+?)\1+", 'dxabcabcyyyydxycxcxz').groups()`
	`121`	`+Out[5]: ('abc',)`
	`122`	+```
	`123`	`+`
`111`	`124`	`## Summary`
`112`	`125`
`113`	`126`	`- Use match to match from the beginning of the string; use search to match anywhere in it`
`@@ -222,6 +235,35 @@ print m.group()`
`222`	`235`	`<html>`
`223`	`236`	```
`224`	`237`
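	`238`	`+## Grouping and Capturing`
	`239`	`+`
	`240`	`+\|Code / Syntax \|Description\|`
	`241`	`+\|-- \|--- \|`
	`242`	`+\|(?:) \| Groups for matching only and is left out of the results; without it the results fill up with useless groups \|`
	`243`	`+\|(?P<name>) \| Names the group so that its result can be retrieved by name \|`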
	`244`	`+`
	`245`	+```
	`246`	`+In [1]: import re`
	`247`	`+`
	`248`	`+In [2]: re.match(r"(?P<key>\w+):(?P<value>\d+)", "haha:1").groups()`
	`249`	`+Out[2]: ('haha', '1')`
	`250`	`+`
	`251`	`+In [3]: re.match(r"(?P<key>\w+):(?P<value>\d+)", "haha:1").groupdict()`
	`252`	`+Out[3]: {'key': 'haha', 'value': '1'}`
	`253`	`+`
	`254`	`+In [4]: re.search(r"((?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groups()`
	`255`	`+Out[4]: ('laal:2;', 'laal', '2')`
	`256`	`+`
	`257`	`+In [5]: re.search(r"((?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groupdict()`
	`258`	`+Out[5]: {'key': 'laal', 'value': '2'}`
	`259`	`+`
	`260`	`+In [6]: re.search(r"(?:(?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groups()`
	`261`	`+Out[6]: ('laal', '2')`
	`262`	+```
	`263`	`+`
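	`264`	`+One catch: each group yields only a single captured substring. When a group matches repeatedly, the earlier captures are not returned, only the last one; getting every capture requires regex.`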
	`265`	`+`
	`266`	`+`
`225`	`267`	`## References`
`226`	`268`
`227`		`-[正则表达式30分钟入门教程](http://deerchao.net/tutorials/regex/regex.htm)`
	`269`	`+[正则表达式30分钟入门教程](http://deerchao.net/tutorials/regex/regex.htm)`
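
A note on the limitation described in the added `content/re.md` text: the standard `re` module keeps only the last capture of a repeated group, but when the repeated unit can be matched on its own, `re.finditer` can collect every occurrence. The following is a minimal sketch under that assumption and is not part of the commit; the names `PAIR` and `parse_pairs` are hypothetical.

```python
import re

# Hypothetical helper (not part of the commit): match the repeated unit once
# per iteration instead of wrapping it in `(...)*`, so no capture is lost.
PAIR = re.compile(r"(?P<key>\w+):(?P<value>\d+);")


def parse_pairs(text):
    """Return a (key, value) tuple for every `key:value;` unit found in text."""
    return [(m.group("key"), m.group("value")) for m in PAIR.finditer(text)]


pairs = parse_pairs("haha:1;laal:2;")
print(pairs)
```

Unlike the anchored `(...)*` pattern, `finditer` silently skips any text between units, so this only substitutes for `regex` captures when that behaviour is acceptable.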

`‎content/regex.md‎`

Lines changed: 70 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,70 @@`
	`1`	`+## regex`
	`2`	`+`
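	`3`	`+A more powerful regular expression library`
	`4`	`+`
	`5`	`+A typical use case is repeated captures: with re, a group that matches repeatedly keeps only its last capture.`
	`6`	`+`
	`7`	`+For example, getting every capture of a group:`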
	`8`	`+`
	`9`	+```
	`10`	`+`
	`11`	`+In [1]: import re`
	`12`	`+`
	`13`	`+In [2]: re.match(r"(?P<key>\w+):(?P<value>\d+)", "haha:1").groups()`
	`14`	`+Out[2]: ('haha', '1')`
	`15`	`+`
	`16`	`+In [3]: re.match(r"(?P<key>\w+):(?P<value>\d+)", "haha:1").groupdict()`
	`17`	`+Out[3]: {'key': 'haha', 'value': '1'}`
	`18`	`+`
	`19`	`+In [4]: re.search(r"((?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groups()`
	`20`	`+Out[4]: ('laal:2;', 'laal', '2')`
	`21`	`+`
	`22`	`+In [5]: re.search(r"((?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groupdict()`
	`23`	`+Out[5]: {'key': 'laal', 'value': '2'}`
	`24`	`+`
	`25`	`+In [6]: re.search(r"(?:(?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groups()`
	`26`	`+Out[6]: ('laal', '2')`
	`27`	`+`
	`28`	`+In [7]: import regex`
	`29`	`+`
	`30`	`+In [8]: regex.search(r"(?:(?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").groups()`
	`31`	`+Out[8]: ('laal', '2')`
	`32`	`+`
	`33`	`+In [9]: regex.search(r"(?:(?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").capturesdict()`
	`34`	`+Out[9]: {'key': ['haha', 'laal'], 'value': ['1', '2']}`
	`35`	`+`
	`36`	`+In [10]: regex.search(r"(?:(?P<key>\w+):(?P<value>\d+);)*", "haha:1;laal:2;").captures("key")`
	`37`	`+Out[10]: ['haha', 'laal']`
	`38`	+```
	`39`	`+`
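	`40`	+Another use is finding all repeated substrings in a string, or all substrings of a given kind, since re does not appear to support `\g` in patterns for detecting repetition (in re, `\g<...>` is only available in replacement strings)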
	`41`	`+`
	`42`	+```
	`43`	`+In [1]: import regex`
	`44`	`+`
	`45`	`+In [2]: # find all numbers in the string`
	`46`	`+`
	`47`	`+In [3]: regex.match(r"((?P<rep>(\d+)\3*)[a-zA-Z]*)+", "2333abc3uio890da123").capturesdict()`
	`48`	`+Out[3]: {'rep': ['2333', '3', '890', '123']}`
	`49`	`+`
	`50`	`+In [2]: # find all runs of a repeated digit in the string`
	`51`	`+`
	`52`	`+In [3]: regex.match(r"((?P<rep>(\d)\3*)[a-zA-Z]*)+", "2333abc3uio890da1112233").capturesdict()`
	`53`	`+Out[3]: {'rep': ['2', '333', '3', '8', '9', '0', '111', '22', '33']}`
	`54`	`+`
	`55`	`+In [4]: # find all runs of a repeated character in the string`
	`56`	`+`
	`57`	`+In [5]: regex.match(r"(?P<rep>(\w)\2*)+", "aaabbcccdddd").capturesdict()`
	`58`	`+Out[5]: {'rep': ['aaa', 'bb', 'ccc', 'dddd']}`
	`59`	`+`
	`60`	`+In [6]: # with re you can only find all consecutive runs, or the first repeated substring`
	`61`	`+`
	`62`	`+In [8]: import re`
	`63`	`+`
	`64`	`+In [9]: re.search(r"(.+?)\1+", 'dxabcabcyyyydxycxcxz').group()`
	`65`	`+Out[9]: 'abcabc'`
	`66`	`+`
	`67`	`+In [9]: re.findall(r"\d+", "2333abc3uio890da123")`
	`68`	`+Out[9]: ['2333', '3', '890', '123']`
	`69`	+```
	`70`	`+`
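
For comparison with the `regex` examples in the added file, runs of a repeated character can also be collected with the standard library by iterating over the matches of a backreference pattern. This is a sketch using `re.finditer`, not something the commit includes; the function name `repeated_runs` and its `pattern` parameter are assumptions made for illustration.

```python
import re


def repeated_runs(text, pattern=r"(\w)\1*"):
    """Return every non-overlapping match of pattern, scanning left to right.

    With the default pattern each match is one character followed by as many
    copies of itself as possible, i.e. a maximal run of that character.
    """
    return [m.group() for m in re.finditer(pattern, text)]


print(repeated_runs("aaabbcccdddd"))
# Restrict the runs to digits; letters between them are simply skipped.
print(repeated_runs("2333abc3uio890da1112233", r"(\d)\1*"))
```

The results line up with the `capturesdict()` output shown for `regex`, but they come from separate matches rather than from the captures of a single match, so the surrounding context of each run is not checked.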

0 commit comments