

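I want to use the .replace function to replace multiple strings. What I currently have does not feel like good syntax. What is the proper way to do this? I am thinking of something like what you can do in grep/regex.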


import re

rep = {"condition1": "", "condition2": "text"}  # define desired replacements here

# use these three lines to do the replacement
# (Python 3 syntax; on Python 2 use rep.iteritems() instead of rep.items())
rep = dict((re.escape(k), v) for k, v in rep.items())
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)
>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'
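
The re.escape call in the snippet above matters whenever a key contains regex metacharacters. Here is a minimal sketch of the difference, assuming a made-up key "a.c" (the names naive and escaped are illustrative and not part of the answer):

```python
import re

rep = {"a.c": "X"}  # hypothetical key containing the regex metacharacter "."

# Without escaping, "." matches any character, so "abc" is matched too.
naive = re.compile("|".join(rep.keys()))
# With escaping, only the literal text "a.c" is matched.
escaped = re.compile("|".join(re.escape(k) for k in rep))

print(naive.findall("abc a.c"))    # ['abc', 'a.c']
print(escaped.findall("abc a.c"))  # ['a.c']
```

This is also why the lookup in the answer escapes m.group(0) again: the dictionary keys were replaced by their escaped forms.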
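
def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text
Here text is the complete string and dic is a dictionary; each definition is a string that will replace a match of the term. Note: the code above is written for Python 3; in Python 2, use dic.iteritems() instead of dic.items().

Careful: before Python 3.7, dictionaries had no guaranteed iteration order. This solution only solves your problem if the order of replacements is irrelevant and it is fine for a replacement to change the results of previous replacements. For instance: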
d = {"cat": "dog", "dog": "pig"}
mySentence = "This is my cat and this is my dog."
Possible output #1: "This is my pig and this is my pig."
Possible output #2: "This is my dog and this is my pig."

One possible fix is to use an OrderedDict.
from collections import OrderedDict

def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

od = OrderedDict([("cat", "dog"), ("dog", "pig")])
mySentence = "This is my cat and this is my dog."
>>> replace_all(mySentence, od)
'This is my pig and this is my pig.'
This is inefficient if the text string is very large or there are many pairs in the dictionary.

Why not a solution like this?
s = "The quick brown fox jumps over the lazy dog"
for r in (("brown", "red"), ("lazy", "quick")):
    s = s.replace(*r)

# output will be: The quick red fox jumps over the quick dog
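
A loop like this applies the pairs one after another, so later pairs see the output of earlier ones. A small sketch of that order dependence, reusing the cat/dog pairs from the earlier answer (the helper name replace_all is borrowed from that answer; cat_first and dog_first are illustrative names):

```python
def replace_all(text, pairs):
    # Apply each (old, new) pair in sequence; later pairs see earlier results.
    for old, new in pairs:
        text = text.replace(old, new)
    return text

sentence = "This is my cat and this is my dog."

cat_first = replace_all(sentence, [("cat", "dog"), ("dog", "pig")])
dog_first = replace_all(sentence, [("dog", "pig"), ("cat", "dog")])

print(cat_first)  # This is my pig and this is my pig.
print(dog_first)  # This is my dog and this is my pig.
```

If neither result is what you want, a single-pass approach such as the regex answers in this thread avoids the chaining altogether.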

Here is a variant of the first solution using reduce, in case you like being functional. :)
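from functools import reduce  # reduce is not a builtin in Python 3
repls = {'hello': 'goodbye', 'world': 'earth'}
s = 'hello,world'
reduce(lambda a, kv: a.replace(*kv), repls.items(), s)

# the same idea with a tuple of pairs instead of a dictionary
repls = ('hello', 'goodbye'), ('world', 'earth')
s = 'hello,world'
reduce(lambda a, kv: a.replace(*kv), repls, s)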

def multiple_replace(string, rep_dict):
    pattern = re.compile("|".join([re.escape(k) for k in sorted(rep_dict, key=len, reverse=True)]), flags=re.DOTALL)
    return pattern.sub(lambda x: rep_dict[x.group(0)], string)
>>> multiple_replace("Do you like cafe? No,I prefer tea.", {'cafe': 'tea', 'tea': 'cafe', 'like': 'prefer'})
'Do you prefer tea? No,I prefer cafe.'
If you like, you can build your own dedicated replacement functions starting from this simple one.

I built this upon F.J.'s excellent answer:
import re

def multiple_replacer(*key_values):
    replace_dict = dict(key_values)
    replacement_function = lambda match: replace_dict[match.group(0)]
    pattern = re.compile("|".join([re.escape(k) for k, v in key_values]), re.M)
    return lambda string: pattern.sub(replacement_function, string)

def multiple_replace(string, *key_values):
    return multiple_replacer(*key_values)(string)
>>> replacements = (u"café", u"tea"), (u"tea", u"café"), (u"like", u"love")
>>> print(multiple_replace(u"Do you like café? No,I prefer tea.", *replacements))
Do you love tea? No,I prefer café.
Note that since the replacement is done in a single pass, "café" changes to "tea", but it does not change back to "café". If you need to do the same replacement many times, you can easily create a replacement function:
>>> my_escaper = multiple_replacer(('"', '\\"'), ('\t', '\\t'))
>>> many_many_strings = (u'This text will be escaped by "my_escaper"', u'Does this work?\tYes it does', u'And can we span\nmultiple lines?\t"Yes\twe\tcan!"')
>>> for line in many_many_strings:
...     print(my_escaper(line))
This text will be escaped by \"my_escaper\"
Does this work?\tYes it does
And can we span
multiple lines?\t\"Yes\twe\tcan!\"
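
To see why the single pass matters for swaps, here is a small sketch that contrasts it with chained str.replace calls. The sample sentence and the names chained and single_pass are assumptions made for illustration:

```python
import re

pairs = [("café", "tea"), ("tea", "café")]
text = "I like café, not tea."

# Chained replace: the second step rewrites the output of the first.
chained = text
for old, new in pairs:
    chained = chained.replace(old, new)

# Single regex pass: each position of the original text is replaced at most once.
mapping = dict(pairs)
pattern = re.compile("|".join(re.escape(k) for k in mapping))
single_pass = pattern.sub(lambda m: mapping[m.group(0)], text)

print(chained)      # I like café, not café.
print(single_pass)  # I like tea, not café.
```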
Improvements: turned the code into a function, added multiline support, fixed a bug in escaping, and made it easy to create a function for a specific multiple replacement. Enjoy! :-)

I would like to propose the usage of string templates. Just place the strings to be replaced in a dictionary and you are all set! Example from docs.python.org:
>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
>>> d = dict(who='tim')
>>> Template('Give $who $100').substitute(d)
Traceback (most recent call last):
ValueError: Invalid placeholder in string: line 1, col 10
>>> Template('$who likes $what').substitute(d)
Traceback (most recent call last):
KeyError: 'what'
>>> Template('$who likes $what').safe_substitute(d)
'tim likes $what'

a = 'This is a test string.'
b = {'i': 'I', 's': 'S'}
for x, y in b.items():
    a = a.replace(x, y)
>>> a
'ThIS IS a teSt StrIng.'
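
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572) (the := operator), we can apply the replacements within a list comprehension:
# text = "The quick brown fox jumps over the lazy dog"
# replacements = [("brown", "red"), ("lazy", "quick")]
[text := text.replace(a, b) for a, b in replacements]
# text = 'The quick red fox jumps over the quick dog'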
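
A comprehension used this way runs only for its side effect on text, and it also builds a list of every intermediate string. A hedged sketch that makes both results visible (apply_replacements and steps are hypothetical names; this needs Python 3.8 or later):

```python
def apply_replacements(text, replacements):
    # The walrus operator rebinds `text` in the enclosing function scope on
    # every iteration; `steps` collects each intermediate result.
    steps = [text := text.replace(a, b) for a, b in replacements]
    return text, steps

final, steps = apply_replacements(
    "The quick brown fox jumps over the lazy dog",
    [("brown", "red"), ("lazy", "quick")],
)
print(final)       # The quick red fox jumps over the quick dog
print(len(steps))  # 2
```

If the intermediate strings are not needed, a plain for loop does the same work without the throwaway list.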
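
Here is my $0.02. It is based on Andrew Clark's answer, just a little clearer, and it also covers the case where a string to replace is a substring of another string to replace (the longer string wins).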
import re

def multireplace(string, replacements):
    """
    Given a string and a replacement map, it returns the replaced string.

    :param str string: string to execute replacements on
    :param dict replacements: replacement dictionary {value to find: value to replace}
    :rtype: str
    """
    # Place longer ones first to keep shorter substrings from matching
    # where the longer ones should take place
    # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against
    # the string 'hey abc', it should produce 'hey ABC' and not 'hey ABc'
    substrs = sorted(replacements, key=len, reverse=True)

    # Create a big OR regex that matches any of the substrings to replace
    regexp = re.compile('|'.join(map(re.escape, substrs)))

    # For each match, look up the new string in the replacements
    return regexp.sub(lambda match: replacements[match.group(0)], string)
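
The reason for sorting is that a regex alternation tries its branches from left to right and takes the first one that matches. A minimal sketch of the effect, using the 'ab' / 'abc' example from the comments above (substitute is a hypothetical helper, not part of the answer):

```python
import re

text = "hey abc"
replacements = {"ab": "AB", "abc": "ABC"}

def substitute(keys):
    # Build the alternation in the given key order and replace in one pass.
    pattern = re.compile("|".join(map(re.escape, keys)))
    return pattern.sub(lambda m: replacements[m.group(0)], text)

short_first = substitute(["ab", "abc"])
long_first = substitute(sorted(replacements, key=len, reverse=True))

print(short_first)  # hey ABc
print(long_first)   # hey ABC
```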
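It is available in this gist; feel free to modify it if you have any suggestions.

I needed a solution where the strings to be replaced can be regular expressions, for example to help normalize a long text by replacing runs of whitespace characters with a single one. Building on answers from others, including MiniQuark and mmj, this is what I came up with: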
def multiple_replace(string, reps, re_flags=0):
    """ Transforms string, replacing keys from reps with values.
    reps: dictionary, or list of key-value pairs (to enforce ordering;
          earlier items have higher priority).
          Keys are used as regular expressions.
    re_flags: interpretation of regular expressions, such as re.DOTALL
    """
    if isinstance(reps, dict):
        reps = reps.items()
    reps = list(reps)  # make the pairs indexable (dict.items() is a view in Python 3)
    pattern = re.compile("|".join("(?P<_%d>%s)" % (i, re_str[0])
                                  for i, re_str in enumerate(reps)), re_flags)
    return pattern.sub(lambda x: reps[int(x.lastgroup[1:])][1], string)
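
The lookup in the last line of the function relies on named groups: each key is wrapped in a group called _0, _1, and so on, and the match object's lastgroup attribute reports which one matched. A small sketch of that mechanism in isolation, with two made-up patterns:

```python
import re

# Two hypothetical alternatives, each wrapped in a named group (_0, _1).
pattern = re.compile(r"(?P<_0>\d+)|(?P<_1>[a-z]+)")

# lastgroup is the name of the group that matched; dropping the leading "_"
# gives back the index of the alternative.
matches = [(m.group(0), m.lastgroup, int(m.lastgroup[1:]))
           for m in pattern.finditer("abc 123")]

print(matches)  # [('abc', '_1', 1), ('123', '_0', 0)]
```

Because the matched text itself is never used as a dictionary key, the keys can be arbitrary regular expressions.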
>>> multiple_replace("(condition1) and --condition2--",
...                  {"condition1": "", "condition2": "text"})
'() and --text--'

>>> multiple_replace('hello,world', {'hello': 'goodbye', 'world': 'earth'})
'goodbye,earth'

>>> multiple_replace("Do you like cafe? No,I prefer tea.",
...                  {'cafe': 'tea', 'tea': 'cafe', 'like': 'prefer'})
'Do you prefer tea? No,I prefer cafe.'
>>> s = "I don't want to change this name:\n  Philip II of Spain"
>>> re_str_dict = {r'\bI\b': 'You', r'[\n\t ]+': ' '}
>>> multiple_replace(s, re_str_dict)
"You don't want to change this name: Philip II of Spain"
If you want to use the dictionary keys as plain strings, you can escape them before calling multiple_replace, for example with this function:
def escape_keys(d):
    """ transform dictionary d by applying re.escape to the keys """
    return dict((re.escape(k), v) for k, v in d.items())

>>> multiple_replace(s, escape_keys(re_str_dict))
"I don't want to change this name:\n  Philip II of Spain"
def check_re_list(re_list):
    """ Checks if each regular expression in list is well-formed. """
    for i, e in enumerate(re_list):
        try:
            re.compile(e)
        except (TypeError, re.error):
            print("Invalid regular expression string "
                  "at position {}: '{}'".format(i, e))

>>> check_re_list(re_str_dict.keys())
>>> multiple_replace("button", {"but": "mut", "mutton": "lamb"})
'mutton'
>>> multiple_replace("button", [("button", "lamb"),
...                             ("but", "mut"), ("mutton", "lamb")])
'lamb'

import re

source = "Here is foo,it does moo!"

replacements = {
    'is': 'was',  # replace 'is' with 'was'
    'does': 'did',
    '!': '?'
}

def replace(source, replacements):
    finder = re.compile("|".join(re.escape(k) for k in replacements.keys()))  # matches every string we want replaced
    result = []
    pos = 0
    while True:
        match = finder.search(source, pos)
        if match:
            # cut off the part up until match
            result.append(source[pos : match.start()])
            # cut off the matched part and replace it in place
            result.append(replacements[source[match.start() : match.end()]])
            pos = match.end()
        else:
            # the rest after the last match
            result.append(source[pos:])
            break
    return "".join(result)

print(replace(source, replacements))
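
The same fragment-and-join idea can also be written with finditer, which removes the explicit search loop. This is a sketch under the same assumptions as the code above (literal keys, non-empty dictionary); replace_fragments is a hypothetical name:

```python
import re

def replace_fragments(source, replacements):
    finder = re.compile("|".join(re.escape(k) for k in replacements))
    parts = []
    pos = 0
    for match in finder.finditer(source):
        parts.append(source[pos:match.start()])     # untouched text before the match
        parts.append(replacements[match.group(0)])  # replacement for the match
        pos = match.end()
    parts.append(source[pos:])                      # tail after the last match
    return "".join(parts)

result = replace_fragments("Here is foo,it does moo!",
                           {"is": "was", "does": "did", "!": "?"})
print(result)  # Here was foo,it did moo?
```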
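The key is to avoid many concatenations of long strings. We chop the source string into fragments, replace some of the fragments as we build the list, and then join the whole thing back into a string.

You really should not do it this way, but I find it way too cool:
>>> replacements = {'cond1': 'text1', 'cond2': 'text2'}
>>> cmd = 'answer = s'
>>> for k, v in replacements.items():
...     cmd += ".replace(%r, %r)" % (k, v)
...
>>> exec(cmd)
Now answer is the result of all the replacements in turn. Again, this is very hacky and is not something you should use regularly. But it is nice to know that you could do something like this if you ever needed to.

I don't know about speed, but this is my workaday quick fix: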
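from functools import reduce  # needed on Python 3

reduce(lambda a, b: a.replace(*b),
       [('o', 'W'), ('t', 'X')],  # iterable of pairs: (oldval, newval)
       'tomato')  # the string from which to replace values
...but I like the #1 regex answer above. Note: if one new value is a substring of another one, the operation is not commutative.

You can use pandas: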
df = pd.DataFrame({'text': ['Billy is going to visit Rome in November', 'I was born in 10/10/2010', 'I will be there at 20:00']})


0    name is going to visit city in month
1                      I was born in date
2                 I will be there at time
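You can find the example here. Note that the replacements on the text are done in the order in which they appear in the lists.

I was struggling with this problem as well. With many substitutions, regular expressions struggle and are about four times slower than a plain loop of replace calls (in my experimental conditions). You should absolutely try the Flashtext library (blog post here, GitHub here). In my case it was a bit over two orders of magnitude faster, from 1.8 s to 0.015 s per document (regular expressions took 7.7 s). It is easy to find usage examples in the links above, but this is a working example: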
    from flashtext import KeywordProcessor
    self.processor = KeywordProcessor(case_sensitive=False)
    for k, v in self.my_dict.items():
        self.processor.add_keyword(k, v)
    new_string = self.processor.replace_keywords(string)
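Note that Flashtext makes the substitutions in a single pass (to avoid a -> b and b -> c turning 'a' into 'c'). Flashtext also looks for whole words (so 'is' will not match 'this'). It works fine if your target consists of several words (replacing 'This is' with 'Hello').

Starting from Andrew's precious answer, I developed a script that loads the dictionary from a file and processes all the files in the opened folder to do the replacements. The script loads the mappings from an external file in which you can set the separator. I am a beginner, but I found this script very useful when doing multiple substitutions in multiple files. It loaded a dictionary with more than 1000 entries in seconds. It is not elegant, but it worked for me.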
import glob
import re

mapfile = input("Enter map file name with extension eg. codifica.txt: ")
sep = input("Enter map file column separator eg. |: ")
mask = input("Enter search mask with extension eg. 2010*txt for all files to be processed: ")
suff = input("Enter suffix with extension eg. _NEW.txt for newly generated files: ")

rep = {}  # creation of empty dictionary

with open(mapfile) as temprep:  # load the definitions into the dictionary from the input file, using the prompted separator
    for line in temprep:
        (key, val) = line.strip('\n').split(sep)
        rep[key] = val

for filename in glob.iglob(mask):  # loop over all the files matching the prompted mask

    with open(filename, "r") as textfile:  # load each file in the variable text
        text = textfile.read()

        # start replacement
        #rep = dict((re.escape(k), v) for k, v in rep.items()) commented to enable the use in the mapping of re reserved characters
        pattern = re.compile("|".join(rep.keys()))
        text = pattern.sub(lambda m: rep[m.group(0)], text)

        # write the output file using the prompted suffix
        with open(filename[:-4] + suff, "w") as target:
            target.write(text)

def mass_replace(text, dct):
    new_string = ""
    old_string = text
    while len(old_string) > 0:
        s = ""
        sk = ""
        for k in dct.keys():
            if old_string.startswith(k):
                s = dct[k]
                sk = k
        if sk:
            # a key matches at the current position: emit its replacement
            new_string += s
            old_string = old_string[len(sk):]
        else:
            # no key matches here: copy one character unchanged
            new_string += old_string[0]
            old_string = old_string[1:]
    return new_string

print(mass_replace("The dog hunts the cat", {"dog": "cat", "cat": "dog"}))
The cat hunts the dog

Another example. Input list:
error_list = ['[br]', '[ex]', 'Something']
words = ['how', 'much[ex]', 'is[br]', 'the', 'fish[br]', 'noSomething', 'really']
Desired output:
words = ['how', 'much', 'is', 'the', 'fish', 'no', 'really']
Code:
[n[0][0] if len(n[0]) else n[1] for n in [[[w.replace(e, "") for e in error_list if e in w], w] for w in words]]
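
>>> mrep = lambda s, d: s if not d else mrep(s.replace(*d.popitem()), d)
>>> mrep('abcabc', {'a': '1', 'c': '2'})
'1b21b2'
Notes: this consumes the input dictionary. Python dictionaries preserve insertion order as of 3.6 (guaranteed by the language since 3.7), so the corresponding caveats in other answers no longer apply. For backward compatibility, one can resort to a tuple-based version:
>>> mrep = lambda s, d: s if not d else mrep(s.replace(*d.pop()), d)
>>> mrep('abcabc', [('a', '1'), ('c', '2')])
'1b21b2'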
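
If consuming the input or hitting the recursion limit is a concern, an iterative variant is one option. This is a sketch, not the answer's own code; mrep_iter is a hypothetical name, and it walks the pairs from last to first to mirror the order in which repeated pop() calls would consume them:

```python
def mrep_iter(s, pairs):
    # Iterate over a reversed copy so the caller's list is left untouched.
    for old, new in reversed(list(pairs)):
        s = s.replace(old, new)
    return s

pairs = [("a", "1"), ("c", "2")]
replaced = mrep_iter("abcabc", pairs)

print(replaced)  # 1b21b2
print(pairs)     # [('a', '1'), ('c', '2')]
```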
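Note: as with all recursive functions in Python, too large a recursion depth (i.e. too large a replacement dictionary) will result in an error. See e.g. here.

Or just for a fast hack: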
for line in to_read:
    read_buffer = line
    stripped_buffer1 = read_buffer.replace("term1", " ")
    stripped_buffer2 = stripped_buffer1.replace("term2", " ")
    write_to_file = to_write.write(stripped_buffer2)

listA = "The cat jumped over the house".split()
modify = {word: word for number, word in enumerate(listA)}
print(" ".join(modify[x] for x in listA))

z = "My name is Ahmed,and I like coding "
print(z.replace(" Ahmed", " Dauda").replace(" like", " Love"))

