How to make a regex consider a comma before falling back to the start of the line when using the ^ operator

import re

# Example 1: with a "," before the capture group
input_text = "Hello how are you?, dfdfdfd fdfdfdf other text. hghhg"

# Example 2: without a "," (or \.|,|;|\n) before the capture group
input_text = "dfdfdfd fdfdfdf other text. hghhg"

# No matter where I place ^ within the alternatives, it always seems to be considered first, ignoring the others.
fears_and_panics_match = re.search(
    r"(?:\.|,|;|\n|^)\s*(?:(?:for|by)\s*me|)\s*(.+?)\s*(?:other\s*text)\s*(?:\.|,|;|\n)",
    # r"(?:\.|,|;|\n)\s*(?:(?:for|by)\s*me|)\s*(.+?)\s*(?:other\s*text)\s*(?:\.|,|;|\n|$)",
    input_text, flags=re.IGNORECASE)

if fears_and_panics_match:
    print(fears_and_panics_match.group(1))

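Why does the pattern r"(?:\.|,|;|\n|^)\s*(?:(?:for|by)\s*me|)\s*(.+?)\s*(?:other\s*text)\s*(?:\.|,|;|\n)" capture Hello how are you?, dfdfdfd fdfdfdf no matter where I place the ^? I need it to consider the possibility of finding a comma , first, and fall back to the start of the line only when there is none.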

The correct output in each case:

#for example 1
"dfdfdfd fdfdfdf"

#for example 2
"dfdfdfd fdfdfdf"

Answer

You can change the regex so that it optionally matches a run of characters up to a ., , or ;, and then captures from there up to other text:

^(?:.*?[.,;])?\s*(?:(?:for|by)\s*me\s*)?(\w.*?)(?=\s*other\s*text)

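It matches:

^ - the start of the line
(?:.*?[.,;])? - an optional run of characters ending in ., , or ;
\s* - any amount of whitespace
(?:(?:for|by)\s*me\s*)? - the optional phrase for me or by me
(\w.*?) - the capture group: a minimal number of characters, starting with a word character
(?=\s*other\s*text) - a lookahead asserting that what follows is other text, optionally preceded by whitespace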
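
The breakdown above can also be written directly into the pattern with re.VERBOSE, which may make it easier to maintain. This is only a sketch: the names PATTERN and extract are illustrative and not part of the original answer, re.IGNORECASE is carried over from the question's code, and returning None on a failed match is an assumption about how you want to handle inputs without other text.

```python
import re
from typing import Optional

# Same pattern as in the answer, laid out with one component per line.
# The leading ^ is omitted because re.match already anchors at the start.
PATTERN = re.compile(
    r"""
    (?:.*?[.,;])?              # optional prefix ending in '.', ',' or ';'
    \s*                        # any amount of whitespace
    (?:(?:for|by)\s*me\s*)?    # optional phrase: for me / by me
    (\w.*?)                    # capture: as few characters as possible, starting with a word character
    (?=\s*other\s*text)        # lookahead: optional whitespace, then other text
    """,
    re.VERBOSE | re.IGNORECASE,
)


def extract(text: str) -> Optional[str]:
    """Return the captured words before 'other text', or None if there is no match."""
    m = PATTERN.match(text)
    return m.group(1) if m else None


print(extract("Hello how are you?, dfdfdfd fdfdfdf other text. hghhg"))  # dfdfdfd fdfdfdf
print(extract("dfdfdfd fdfdfdf other text. hghhg"))                      # dfdfdfd fdfdfdf
print(extract("no marker here"))                                         # None
```

Checking for None avoids the AttributeError that calling .group(1) directly would raise on a string that does not contain other text.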

Demo on regex101

In Python (note that by using re.match, we don't need the ^ in the regex):

import re

strs = [
  'dfdfdfd fdfdfdf other text. hghhg',
  'Hello how are you?, dfdfdfd fdfdfdf other text.hghhg',
  'for me a word other text',
  'A semicolon first; then some words before other text'
]
regex = r'(?:.*?[.,;])?\s*(?:(?:for|by)\s*me\s*)?(\w.*?)(?=\s*other\s*text)'
for s in strs:
    print(re.match(regex, s).group(1))

Output:

dfdfdfd fdfdfdf
dfdfdfd fdfdfdf
a word
then some words before
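
As for why the pattern in the question captures too much: re.search tries each starting position from left to right and returns the first position at which the whole pattern can succeed. The order of the alternatives only matters among alternatives tried at the same position. Since ^ matches at position 0 and the lazy (.+?) can then stretch all the way to other text, the search succeeds at position 0 and never gets as far as the comma. A small sketch to illustrate this, where the variable names tail, with_caret and without_caret are only for this example:

```python
import re

text = "Hello how are you?, dfdfdfd fdfdfdf other text. hghhg"

# Everything after the leading alternation, copied from the question's pattern.
tail = r"\s*(?:(?:for|by)\s*me|)\s*(.+?)\s*(?:other\s*text)\s*(?:\.|,|;|\n)"

# With ^ among the alternatives, the search already succeeds at position 0.
with_caret = re.search(r"(?:\.|,|;|\n|^)" + tail, text, flags=re.IGNORECASE)

# Without ^, positions 0 to 17 fail, and the first success is at the comma.
without_caret = re.search(r"(?:\.|,|;|\n)" + tail, text, flags=re.IGNORECASE)

print(with_caret.start(), repr(with_caret.group(1)))
# 0 'Hello how are you?, dfdfdfd fdfdfdf'
print(without_caret.start(), repr(without_caret.group(1)))
# 18 'dfdfdfd fdfdfdf'
```

Dropping ^ is not a fix on its own, because example 2 has no leading delimiter and would then not match at all. That is why the answer makes the delimiter prefix optional instead.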
