Examples of using Python's re.match()

Scripting column 2024/11/1 Anonymous


圆月山庄资源网 Design By www.vgjia.com

While learning to write Python crawlers I ran into a problem with an example from a book, which looked like this:

import re

line = 'Cats are smarter than dogs'
matchObj = re.match(r'(.*)are(.*?)(.*)', line)

if matchObj:
    print("matchObj.group():", matchObj.group())
    print("matchObj.group(1):", matchObj.group(1))
    print("matchObj.group(2):", matchObj.group(2))
    print("matchObj.group(3):", matchObj.group(3))
else:
    print('No match!\n')




The output is:

matchObj.group(): Cats are smarter than dogs

matchObj.group(1): Cats 

matchObj.group(2): 

matchObj.group(3):  smarter than dogs



The second group came out empty. After removing the ? from the second group, the output becomes:

matchObj.group(): Cats are smarter than dogs

matchObj.group(1): Cats 

matchObj.group(2):  smarter than dogs

matchObj.group(3): 
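The two outputs above can be reproduced side by side. The following is a minimal sketch that reuses the line from the article; the helper name show_groups is made up for illustration and is not part of the re module.

```python
import re

line = 'Cats are smarter than dogs'

def show_groups(pattern):
    """Return the captured groups for pattern, or None if it does not match."""
    m = re.match(pattern, line)
    return m.groups() if m else None

# (.*?) is non-greedy: it takes as little as possible, here nothing,
# so the greedy third group receives the rest of the line.
lazy = show_groups(r'(.*)are(.*?)(.*)')

# (.*) is greedy: it takes everything that is left,
# so the third group has nothing to capture.
greedy = show_groups(r'(.*)are(.*)(.*)')

print(lazy)    # ('Cats ', '', ' smarter than dogs')
print(greedy)  # ('Cats ', ' smarter than dogs', '')
```

In both cases the whole line is matched; the only thing that changes is how the text after "are" is divided between groups 2 and 3.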



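Does this mean that ? simply defaults to zero repetitions? And if so, what is the ? actually for?
On reflection, that description is not quite accurate: a ? placed directly after * makes the * non-greedy, so .*? matches as few characters as possible, and when nothing after it forces it to grow, it matches nothing at all. Next I tried putting the ? after the closing parenthesis instead: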

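import re

line = 'Cats are smarter than dogs'
# The ? follows the closing parenthesis, so it applies to the whole group.
matchObj = re.match(r'(.*) are(.*)?', line)

if matchObj:
    print("matchObj.group():", matchObj.group())
    print("matchObj.group(1):", matchObj.group(1))
    print("matchObj.group(2):", matchObj.group(2))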




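With the ? outside the parentheses, group 2 still captures the text after are. But move the ? inside the second pair of parentheses and nothing is captured, and the text matched by group(0) stops at "Cats are". The second group has not failed to match; it has matched the empty string, which is all a non-greedy .*? needs when nothing follows it.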
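The effect of where the ? sits can be checked directly. This is a small sketch using the same line; the third pattern, with a trailing $, is an addition for illustration and does not appear in the original text.

```python
import re

line = 'Cats are smarter than dogs'

# ? after the group: the group as a whole is optional,
# but the .* inside it is still greedy.
optional = re.match(r'(.*) are(.*)?', line)

# ? after the *: the quantifier is non-greedy and nothing follows it,
# so zero characters are enough to finish the match.
lazy = re.match(r'(.*) are(.*?)', line)

# Same non-greedy group, but $ forces it to grow to the end of the line.
lazy_anchored = re.match(r'(.*) are(.*?)$', line)

print(repr(optional.group(0)), repr(optional.group(2)))
print(repr(lazy.group(0)), repr(lazy.group(2)))
print(repr(lazy_anchored.group(0)), repr(lazy_anchored.group(2)))
```

The anchored variant suggests that a non-greedy group is only useful when the pattern after it says where the group should stop.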
Oddly, if the pattern is changed as follows:


import re

line = 'Cats are smarter than dogs'
# The + follows the closing parenthesis, so the whole group is repeated.
matchObj = re.match(r'(.*) are (.*)+', line)

if matchObj:
    print("matchObj.group():", matchObj.group())
    print("matchObj.group(1):", matchObj.group(1))
    print("matchObj.group(2):", matchObj.group(2))




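That is, with the ? changed to +, the whole line still matches, but group(2) is empty. A repeated group keeps only its last repetition, which here is presumably an empty match at the end of the line.
Moving the + inside the second pair of parentheses, giving (.*+), raises an error instead of matching. On Python versions before 3.11 this is re.error: multiple repeat; from 3.11 onward *+ is accepted as a possessive quantifier.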
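Both observations about + can be checked with a short script. This is a hedged sketch: the outcome of compiling (.*+) depends on the interpreter version, so the code only records what happens rather than assuming either result.

```python
import re

line = 'Cats are smarter than dogs'

# + outside the group: the group may repeat, and group(2) reports
# only the text captured by its last repetition.
m = re.match(r'(.*) are (.*)+', line)
matched_all = m is not None and m.group(0) == line
print(matched_all)
print(repr(m.group(2)))

# + inside the group: older interpreters reject ".*+" at compile time,
# newer ones treat "*+" as a possessive quantifier.
try:
    re.compile(r'(.*) are (.*+)')
    plus_inside = 'compiled'
except re.error as exc:
    plus_inside = f'error: {exc}'
print(plus_inside)
```

Wrapping re.compile in try/except keeps the script runnable on either side of the version change.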
To capture just smarter, I then ended the second group with a literal r:

import re

line = 'Cats are smarter than dogs'
# The second group must end with r; the trailing .* consumes the rest.
matchObj = re.match(r'(.*) are (.*r).*', line)

if matchObj:
    print("matchObj.group():", matchObj.group())
    print("matchObj.group(1):", matchObj.group(1))
    print("matchObj.group(2):", matchObj.group(2))
else:
    print('No match!\n')




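For generality I tried replacing the r with a space ' ', but the result was 'smarter than ', because . also matches spaces and the greedy .* runs to the last space in the line. So I replaced . with [a-zA-Z], which matches any single letter, and this captured the single word smarter (followed by the space, which is inside the group). The code is: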


import re

line = 'Cats are smarter than dogs'
# [a-zA-Z]* cannot cross a space, so the group stops after the first word.
matchObj = re.match(r'(.*) are ([a-zA-Z]* ).*', line)

if matchObj:
    print("matchObj.group():", matchObj.group())
    print("matchObj.group(1):", matchObj.group(1))
    print("matchObj.group(2):", matchObj.group(2))
else:
    print('No match!\n')
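Two small variations on this pattern avoid capturing the trailing space. The sketch below is offered as an illustration rather than as the book's solution: one variant keeps the space outside the group, the other uses a non-greedy .*? followed by a literal space.

```python
import re

line = 'Cats are smarter than dogs'

# The pattern from the article: the space is inside the group.
original = re.match(r'(.*) are ([a-zA-Z]* ).*', line)

# Letters only, with the space moved outside the group.
letters = re.match(r'(.*) are ([a-zA-Z]+) .*', line)

# Non-greedy group followed by a literal space: the space forces
# .*? to stop at the first space after "are ".
lazy = re.match(r'(.*) are (.*?) .*', line)

print(repr(original.group(2)))  # 'smarter '
print(repr(letters.group(2)))   # 'smarter'
print(repr(lazy.group(2)))      # 'smarter'
```

The character class is stricter (it rejects digits and punctuation), while the non-greedy form accepts any characters up to the first space.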
