How can I expand repeat counts in Python and write 7895 7895 7895 7895 instead of 4*7895?
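I am a basic Python user and I have a large text data file (OUT2.txt) in which many values are written in the form 2*150, meaning two values of 150 (150 150), or 4*7895, meaning four values of 7895 (7895 7895 7895 7895). I want to change all values of this kind into values written next to each other, i.e. 7895 7895 7895 7895 instead of 4*7895.
I tried the code below but got the following error: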
parts = fl.split()
AttributeError: 'list' object has no attribute 'split'
fl = open('OUT2.txt','r').readlines()
parts = fl.split()
lst = []
for part in parts:
    _parts = part.split('*')
    if len(_parts) == 1:
        lst.append(_parts[0])
    else:
        times = int(_parts[0])
        for i in range(times):
            lst.append(_parts[1])
open('OUT.3.txt','w+').writelines(lst)
Any suggestions are welcome. Thanks.
From this sample of the text data file:
2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821
I want to get:
8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 and so on...
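Solution
Split the string on whitespace, split each item on *, expand the repeated values, and join everything back into a string: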
s = "2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821"
# split the string
l = s.split()
# split on "*"
l = [x.split('*') for x in l]
# multiply recurring values, keep the single ones
l = [x[0] if len(x) == 1 else " ".join([x[1]] * int(x[0])) for x in l]
# join back to a string
result = " ".join(l)
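If an item contains no *, it is simply kept as a string (x[0], because split("*") returns a single-element list). If it does contain one, split("*") returns two values: the first, x[0], has to be parsed as an int, and [x[1]] * i is a list of i copies of the value, which is then joined on whitespace: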
>>> ["11.883"] * 4
['11.883', '11.883', '11.883', '11.883']
>>> " ".join(["11.883"] * 4)
'11.883 11.883 11.883 11.883'
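The steps described above can also be wrapped in a small helper. This is only a minimal sketch: the name `expand_repeats` is hypothetical (it does not appear in the original answer), and it assumes every token is either a plain value or `count*value` with an integer count.

```python
def expand_repeats(line):
    """Expand tokens of the form 'count*value' into 'value' repeated count times."""
    expanded = []
    for token in line.split():
        count, sep, value = token.partition('*')
        if sep:
            # 'count*value': repeat the value count times
            expanded.extend([value] * int(count))
        else:
            # plain token: partition() leaves the whole token in the first slot
            expanded.append(count)
    return ' '.join(expanded)


print(expand_repeats("2*8.17997 723.188 4*33.33"))
# 8.17997 8.17997 723.188 33.33 33.33 33.33 33.33
```

`str.partition` is used here instead of `split('*')` so that the "has a star" check and the unpacking happen in one step.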
The following should work:
with open('in.txt') as f:
    out_lines = []
    lines = [l.strip() for l in f.readlines()]
    for l in lines:
        parts = l.split()
        lst = []
        for part in parts:
            _parts = part.split('*')
            if len(_parts) == 1:
                lst.append(_parts[0])
            else:
                times = int(_parts[0])
                for i in range(times):
                    lst.append(_parts[1])
        out_lines.append(' '.join(lst))
with open('out.txt','w') as f1:
    for line in out_lines:
        f1.write(line + '\n')
in.txt
2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821
10*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821
out.txt
8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 380.811 149.985 13.9643 13.9643 13.9643 13.9643 13.9643 22.8987 76.2205 24.7059 24.7059 64.821
8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 380.811 149.985 13.9643 13.9643 13.9643 13.9643 13.9643 22.8987 76.2205 24.7059 24.7059 64.821
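Since the question mentions a large file, it may be worth writing each expanded line as soon as it is read rather than collecting every line in a list first. The sketch below is one way to do that under the same assumptions about the input format; the function names `expand_line` and `expand_file` are hypothetical and not part of the original answer.

```python
def expand_line(line):
    """Expand every 'count*value' token on one line."""
    parts = []
    for token in line.split():
        if '*' in token:
            count, value = token.split('*', 1)
            parts.extend([value] * int(count))
        else:
            parts.append(token)
    return ' '.join(parts)


def expand_file(src_path, dst_path):
    """Copy src_path to dst_path, expanding one line at a time."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:  # iterating the file object reads it lazily
            dst.write(expand_line(line) + '\n')
```

It would be called as `expand_file('in.txt', 'out.txt')`. Only one line is held in memory at a time, at the cost of keeping both files open for the duration of the loop.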
Try using a regular expression:
import re
# this is what you'll have after you read the file, for example
text = "2*8.17997 723.188 4*33.33 3*11.0524"
# the fractional part is optional so that integer values such as 4*7895 also match
matches = re.findall(r'(\d+\*)?(\d+(?:\.\d+)?)', text)
# matches = [('2*', '8.17997'), ('', '723.188'), ('4*', '33.33'), ('3*', '11.0524')]
output = []
for match in matches:
    if match[0]:
        times = int(match[0][:-1])  # remove the `*`
    else:
        times = 1  # no `x*` prefix means y appears once
    for _ in range(times):
        output.append(match[1])
output_str = ' '.join(output)
# output_str = '8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524'
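This code is not very polished; it is only meant to give you the idea. The interesting part here is the regular expression. You can explore it in more detail here: https://regex101.com/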
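As a variation on the regular-expression idea, `re.sub` accepts a function as the replacement, so the expansion can be done in a single pass. This is a sketch rather than the answerer's method: the pattern and the name `expand_with_regex` are assumptions, and the pattern treats any whitespace-delimited token of the form integer, `*`, non-whitespace text as a repeat.

```python
import re

# Assumed token shape: at the start of a token, an integer count, '*',
# then a run of non-whitespace characters.
REPEAT = re.compile(r'(?<!\S)(\d+)\*(\S+)')


def expand_with_regex(text):
    # The callback receives each 'count*value' match and returns its expansion;
    # everything else, including the original whitespace, is left untouched.
    return REPEAT.sub(lambda m: ' '.join([m.group(2)] * int(m.group(1))), text)


print(expand_with_regex("2*8.17997 723.188 4*33.33 3*11.0524"))
# 8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524
```

Because only the matching tokens are rewritten, line breaks in the input are preserved, so the function can be applied to the whole file contents at once.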