How do I find a partial string match and then delete that string from a reference file using awk?

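I have a problem I've been trying to solve but haven't figured out how. I have a reference file that lists every device in inventory by barcode.

Reference file: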

PTR10001,PRINTER,SN A
PTR10002,SN B
PTR10003,SN C 
MON10001,MONITOR,SN A
MON10002,SN B
MON10003,SN C
CPU10001,COMPUTER,SN A
CPU10002,SN B
CPU10003,SN C

What I'd like to do is create a file in which I only write the abbreviations of the items I need. File 2 looks like this:

PTR
CPU
MON
MON

The desired output is a file that tells me which barcoded items I need to take off the shelf.
Desired output file:

PTR10001
CPU10001
MON10001
MON10002

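As the output shows, I can't have two identical barcodes, so I need it to go through the reference file and find the first match. Once a number has been copied to the output file, I want to delete it from the reference file so that it isn't used again.

I've tried multiple iterations of awk but couldn't get the desired output.
The closest I've got is the following code: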

awk -F'/' '{ key = substr($1,1,3) } NR==FNR {id[key]=$1; next} key in id { $1=id[key] } { print }' $file1 $file2 > $file3

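I'm writing this in ksh and would like to use awk, as I think it is the best fit for this problem. Thanks for your help.

Solution

Could you please try the following, written and tested with the shown samples in GNU awk.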

awk '
FNR==NR{
  iniVal[$0]++
  next
}
{
  counter=substr($0,1,3)
}
counter in iniVal{
  if(++currVal[counter]<=iniVal[counter]){
     print $1
     if(currVal[counter]==iniVal[counter]){ delete iniVal[counter] }
  }
}
' Input_file2  FS="," Input_file1

Explanation: a detailed, line-by-line explanation of the above.

awk '                                           ##Start of the awk program.
FNR==NR{                                        ##FNR==NR is true only while the first file (Input_file2) is being read.
  iniVal[$0]++                                  ##Count how many times each abbreviation is requested.
  next                                          ##Skip the remaining statements for this line.
}
{
  counter=substr($0,1,3)                        ##For Input_file1, store the first 3 characters of the line in counter.
}
counter in iniVal{                              ##If counter is one of the requested abbreviations, do the following.
  if(++currVal[counter]<=iniVal[counter]){      ##Increment currVal for this abbreviation; continue only while it does not exceed the requested count.
     print $1                                   ##Print the 1st field (the barcode) of the current line.
     if(currVal[counter]==iniVal[counter]){     ##If the requested count has now been reached,
       delete iniVal[counter]                   ##delete the abbreviation from iniVal so later lines no longer match.
     }
  }
}
' Input_file2  FS="," Input_file1               ##Input files: the requests first, then the comma-separated reference file.
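
The FNR==NR test used above is easier to follow once the two counters are printed side by side. The following is only a small sketch; the function name show_counters and the two temporary files are made up for illustration and are not part of the answer.

```shell
#!/bin/sh
# Illustration only: the file names and contents below are made up.
tmp=$(mktemp -d)
printf 'PTR\nCPU\n' > "$tmp/requests.txt"
printf 'PTR10001,SN A\nCPU10001,SN A\nMON10001,SN A\n' > "$tmp/inventory.txt"

show_counters() {
  # NR counts lines across all input files; FNR restarts at 1 for each file,
  # so FNR==NR holds only while the first file is being read.
  awk '{
    which = (FNR == NR) ? "1st" : "2nd"
    print which, NR, FNR, $0
  }' "$1" "$2"
}

show_counters "$tmp/requests.txt" "$tmp/inventory.txt"
```

Each output line shows which file awk considers itself to be in, followed by NR, FNR and the line itself. One caveat worth keeping in mind: if the first file is empty, FNR==NR is also true for the second file, so the idiom assumes a non-empty first file.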

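First solution:

Based on your description, I assume the order doesn't matter, since you just want to know what to take off the shelf. So you can do it the other way round: read file2 first and count the items, then go to the shelf (file1) and pick them up.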

awk -F, 'FNR==NR{c[$0]++; next} c[substr($1,1,3)]-->0{print $1}' file2 file1

Output:

PTR10001
MON10001
MON10002
CPU10001
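
The one-liner above only prints the barcodes to pick. If the reference file should also shrink, as the question asks, one possible extension is to send every line that was not picked to a second file. This is a sketch under assumptions: the function name pick_and_update and the output file name remaining are hypothetical, and the sample data is copied from the question.

```shell
#!/bin/sh
# Sketch only: pick_and_update and the file name "remaining" are hypothetical.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
PTR10001,PRINTER,SN A
PTR10002,SN B
PTR10003,SN C
MON10001,MONITOR,SN A
MON10002,SN B
MON10003,SN C
CPU10001,COMPUTER,SN A
CPU10002,SN B
CPU10003,SN C
EOF
printf 'PTR\nCPU\nMON\nMON\n' > "$tmp/file2"

pick_and_update() {
  # $1 = requests (one abbreviation per line), $2 = reference file,
  # $3 = file that receives the reference lines left on the shelf
  awk -F, -v rest="$3" '
    FNR == NR { want[$0]++; next }       # count the requests per abbreviation
    {
      key = substr($1, 1, 3)
      if (want[key] > 0) {               # still needed: pick this barcode
        want[key]--
        print $1
      } else {                           # not needed: it stays in the inventory
        print > rest
      }
    }
  ' "$1" "$2"
}

pick_and_update "$tmp/file2" "$tmp/file1" "$tmp/remaining"
```

The picked barcodes go to standard output in reference-file order, and the untouched lines land in the file given as the third argument. Once the result looks right, that file could replace the reference file, for example with mv.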

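Second solution:

Your awk was quite close to what you wanted, but you need a second dimension in the array so that existing IDs are not overwritten. We'll do this with a pseudo two-dimensional array (BTW, GNU awk has true 2D arrays): the IDs are stored as a string such as PTR10001,PTR10002,PTR10003, retrieved with split, and the one that was taken is then removed from the shelf with substr.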

> cat tst.awk
BEGIN { FS="," }

NR==FNR {
    key=substr($1,1,3)
    ids[key] = (ids[key]? ids[key] "," $1: $1) #append new id.
    next
}

$0 in ids {
    split(ids[$0],tmp,",")
    print(tmp[1])
    ids[$0]=substr(ids[$0],length(tmp[1])+2) #remove from shelf
}

Output:

awk -f tst.awk file1 file2
PTR10001
CPU10001
MON10001
MON10002

Here we preserve the order of file2, since this is based on the idea you tried.
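
The same "queue per abbreviation" idea can also be written without building and splitting strings, using awk's comma-separated subscripts (joined internally with SUBSEP), which work in any POSIX awk. This is a minimal sketch rather than part of the original answer; the function name pick_in_request_order and the array names shelf, head and tail are made up here.

```shell
#!/bin/sh
# Sketch only: pick_in_request_order and the array names are hypothetical.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
PTR10001,PRINTER,SN A
PTR10002,SN B
PTR10003,SN C
MON10001,MONITOR,SN A
MON10002,SN B
MON10003,SN C
CPU10001,COMPUTER,SN A
CPU10002,SN B
CPU10003,SN C
EOF
printf 'PTR\nCPU\nMON\nMON\n' > "$tmp/file2"

pick_in_request_order() {
  # $1 = reference file, $2 = requests (one abbreviation per line)
  awk -F, '
    NR == FNR {
      key = substr($1, 1, 3)
      shelf[key, ++tail[key]] = $1     # append the barcode to the queue for this abbreviation
      next
    }
    head[$0] < tail[$0] {              # is anything left for this abbreviation?
      print shelf[$0, ++head[$0]]      # take the oldest barcode from the queue
    }
  ' "$1" "$2"
}

pick_in_request_order "$tmp/file1" "$tmp/file2"
```

With the sample files this prints PTR10001, CPU10001, MON10001 and MON10002, one per line, in the order of file2. A request for an abbreviation whose queue is already empty simply prints nothing.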
