Using Python to expand repeat counts and write 7895 7895 7895 7895 instead of 4*7895


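I'm a basic Python user and I have a large text data file (OUT2.txt) containing many values written like 2*150, meaning two values of 150 (150 150), or 4*7895, meaning four values of 7895 (7895 7895 7895 7895). I want to change all values of this kind into values written next to each other, i.e. 7895 7895 7895 7895 instead of 4*7895.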

I tried this code but got the following error:

**parts = fl.split()
AttributeError: 'list' object has no attribute 'split'**

fl = open('OUT2.txt','r').readlines()
parts = fl.split()
lst = []
for part in parts:
    _parts = part.split('*')
    if len(_parts) == 1:
        lst.append(_parts[0])
    else:
        times = int(_parts[0])
        for i in range(times):
            lst.append(_parts[1])
open('OUT.3.txt','w+').writelines(lst)

Any suggestions are welcome. Thanks.

From this sample of the text data file

2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821

I want to get

8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 and so on...

Solution

Split the string on whitespace, split each item on *, and join everything back into a string

s = "2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821"

# split the string
l = s.split()

# split on "*"
l = [x.split('*') for x in l]

# multiply recurring values, keep the single ones
l = [x[0] if len(x) == 1 else " ".join([x[1]] * int(x[0])) for x in l]

# join back to a string
result = " ".join(l)

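If an item has no *, it is simply kept as a string (x[0], because split("*") returns a single-element list). If it does have one, split("*") returns 2 values: the first, x[0], has to be parsed as an int, and [x[1]] * i is a list of i repetitions of the value, which is then joined with spaces: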

>>> ["11.883"] * 4
['11.883', '11.883', '11.883', '11.883']
>>> " ".join(["11.883"] * 4)
'11.883 11.883 11.883 11.883'
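
The per-item rule described above can also be wrapped in small helpers. This is only a sketch: the names `expand_token` and `expand_line` are made up for illustration, and it assumes every count before the * is a plain integer.

```python
def expand_token(token):
    """Expand 'N*value' into N copies of value; return other tokens unchanged."""
    count, sep, value = token.partition("*")
    if not sep:
        # no '*' in the token, so keep it as a single item
        return [token]
    return [value] * int(count)


def expand_line(line):
    """Expand every whitespace-separated token in one line of text."""
    expanded = []
    for token in line.split():
        expanded.extend(expand_token(token))
    return " ".join(expanded)


print(expand_line("2*8.17997 723.188 4*33.33"))
# 8.17997 8.17997 723.188 33.33 33.33 33.33 33.33
```

Using `str.partition` instead of `split('*')` avoids indexing into a list whose length depends on the input.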

The following should work

with open('in.txt') as f:
    out_lines = []
    lines = [l.strip() for l in f.readlines()]
    for l in lines:
        parts = l.split()
        lst = []
        for part in parts:
            _parts = part.split('*')
            if len(_parts) == 1:
                lst.append(_parts[0])
            else:
                times = int(_parts[0])
                for i in range(times):
                    lst.append(_parts[1])
        out_lines.append(' '.join(lst))
with open('out.txt','w') as f1:
    for line in out_lines:
        f1.write(line + '\n')

in.txt

2*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821
10*8.17997 723.188 4*33.33 3*11.0524 380.811 149.985 5*13.9643 22.8987 76.2205 2*24.7059 64.821

out.txt

8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 380.811 149.985 13.9643 13.9643 13.9643 13.9643 13.9643 22.8987 76.2205 24.7059 24.7059 64.821
8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524 380.811 149.985 13.9643 13.9643 13.9643 13.9643 13.9643 22.8987 76.2205 24.7059 24.7059 64.821
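
For reference, the AttributeError in the question arises because readlines() returns a list of lines, and a list has no split() method; each line has to be split on its own. Since the input file is described as large, a variant that streams it line by line instead of collecting every line first may be worth considering. The following is a rough sketch, not a tested drop-in: `expand_file` is a hypothetical name, and it assumes the counts are plain integers.

```python
def expand_file(src_path, dst_path):
    """Copy src_path to dst_path, expanding every 'N*value' token."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:  # iterate lazily, one line at a time
            out = []
            for token in line.split():
                count, sep, value = token.partition("*")
                # repeat the value if a '*' was found, else keep the token
                out.extend([value] * int(count) if sep else [token])
            dst.write(" ".join(out) + "\n")
```

It would be called as `expand_file('in.txt', 'out.txt')`, and keeps one output line per input line, like the code above.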

Try using a regular expression:

import re

# this is what you'll have after you read the file, for example
text = "2*8.17997 723.188 4*33.33 3*11.0524"

matches = re.findall(r'(\d+\*)?(\d+\.\d+)',text)
# matches = [('2*','8.17997'),('','723.188'),('4*','33.33'),('3*','11.0524')]

output = []
for match in matches:
    if match[0]:
        times = int(match[0][:-1])  # remove the `*`
    else:
        times = 1  # no `x*y` means one time y
    for _ in range(times):
        output.append(match[1])

output_str = ' '.join(output)
# output_str = '8.17997 8.17997 723.188 33.33 33.33 33.33 33.33 11.0524 11.0524 11.0524'

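This code isn't very polished; it is only meant to give you the idea. The interesting part here is the regular expression. You can explore it in more detail here: https://regex101.com/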
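
Note that the pattern above requires a decimal point in the value, so integer items such as 4*7895 from the question would not match. One possible variant, sketched here under the assumption that a repeated item is always an integer count, a *, and then a run of non-space characters, uses re.sub with a replacement function. The names `REPEAT` and `expand_repeats` are made up for illustration.

```python
import re

# Assumed token shape: integer count, '*', then any non-space characters.
# (?<!\S) makes sure the count starts at the beginning of a token.
REPEAT = re.compile(r"(?<!\S)(\d+)\*(\S+)")


def expand_repeats(text):
    """Replace each 'N*value' in text with N space-separated copies of value."""
    return REPEAT.sub(lambda m: " ".join([m.group(2)] * int(m.group(1))), text)


print(expand_repeats("4*7895 2*150 723.188"))
# 7895 7895 7895 7895 150 150 723.188
```

Because re.sub only touches the matched items, plain values, spacing and line breaks in the text are left as they are.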
