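How to join a line to the previous line in a UNIX shell if it does not start with a timestamp
I have a tool that outputs logs prefixed with a timestamp, but a log entry may contain newlines. I want to merge any line that has no timestamp into the previous line.
Example: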
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two
lines [13]"
[ 2020/08/12 11:40] Success with "two lines with a twist
[19] to confuse you"
[ 2020/08/12 11:41] Failure with "one line again"
Using awk, I can do the following to merge lines that do not start with an opening square bracket ([):
awk -v RS="[" 'NR>1{$1=$1; print RS,$0}'
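However, you can see where this fails on the "twist" entry above. Its continuation line starts with a [ that is not part of a timestamp.
Is there a way to use a regular expression for the timestamp prefix? Or is there a better command-line tool for this task?
Solution
Could you please try the following? It was written and tested with the shown samples at https://ideone.com/PXVCh2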
awk '
{
printf("%s%s",$0~/^\[ [0-9]{4}\/[0-9]{2}\/[0-9]{2}/\
?(FNR!=1?ORS:""):OFS,$0)
}
END{ print "" }
' Input_file
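Following Ed's comment, I added the END block, which prints a final newline. The printf above only emits ORS before a timestamped line, so without it the output would not end with a newline.
Note: I wrote this on my phone. Sorry, I cannot tell how it looks on a large screen, so I split the single printf statement across two lines here.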
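The same idea can also be written by buffering each entry and flushing it when the next timestamp appears. The following is a minimal sketch, not the answerer's code: the function name join_by_timestamp is hypothetical, and the pattern assumes the exact prefix format from the question, "[ YYYY/MM/DD HH:MM]". It spells out the digit classes instead of using {4} intervals, since some older awk implementations do not support interval expressions.

```shell
# Hypothetical helper: join continuation lines onto the preceding timestamped line.
join_by_timestamp() {
  awk '
    # A new entry starts with a prefix such as "[ 2020/08/12 11:40]".
    /^\[ [0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9] [0-9][0-9]:[0-9][0-9]\]/ {
      if (have) print entry   # flush the previous entry
      entry = $0
      have = 1
      next
    }
    # Any other line is a continuation of the current entry.
    { entry = (have ? entry " " : "") $0; have = 1 }
    END { if (have) print entry }
  ' "$@"
}

# Example: the "twist" entry from the question.
printf '%s\n' \
  '[ 2020/08/12 11:40] Success with "two lines with a twist' \
  '[19] to confuse you"' \
  '[ 2020/08/12 11:41] Failure with "one line again"' |
  join_by_timestamp
```

Because the pattern requires the full timestamp and the closing bracket, the "[19]" continuation line is not mistaken for a new entry. A continuation line that itself begins with a complete timestamp would still be treated as a new entry.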
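It seems to me that your real problem is that your quoted strings can contain newlines, so this GNU awk solution (GNU awk is needed for multi-character RS and for RT), which looks for quoted strings, may be more robust than looking for a timestamp at the start of a line: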
$ awk -v RS='"[^"]*"' '{gsub("\n"," ",RT); ORS=RT} 1' file
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two lines [13]"
[ 2020/08/12 11:40] Success with "two lines with a twist [19] to confuse you"
[ 2020/08/12 11:41] Failure with "one line again"
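This works better than checking for lines that start with a timestamp if your quoted string can contain a timestamp that happens to fall at the start of a line (note the timestamp inside the "four lines with a twist..." block):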
$ cat file
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two
lines [13]"
[ 2020/08/12 11:40] Success with "four lines with a twist
[ 2020/08/12 11:40] to confuse you
repeatedly and
in ""horrible"" ways"
[ 2020/08/12 11:41] Failure with "one line again"
$ awk -v RS='"[^"]*"' '{ORS=gensub(/\n/," ","g",RT)} 1' file
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two lines [13]"
[ 2020/08/12 11:40] Success with "four lines with a twist [ 2020/08/12 11:40] to confuse you repeatedly and in ""horrible"" ways"
[ 2020/08/12 11:41] Failure with "one line again"
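To see why the RS/RT approach works, it can help to print what gawk puts into each record and into RT. This is an illustrative sketch only: the function name show_records and the two-entry sample input are made up for the demonstration, and it requires GNU awk because RT is a gawk extension.

```shell
# Hypothetical helper: show how RS='"[^"]*"' splits the input (GNU awk only).
show_records() {
  gawk -v RS='"[^"]*"' '{
    text = $0; term = RT
    # Make newlines visible as the two characters \n.
    gsub(/\n/, "\\n", text)
    gsub(/\n/, "\\n", term)
    printf "%d: text=<%s> RT=<%s>\n", NR, text, term
  }' "$@"
}

# Two made-up entries; the first quoted string contains a newline.
printf '[a] ok "x\ny"\n[b] ok "z"\n' | show_records
```

Each quoted string, including any newline inside it, ends up in RT, while $0 holds the text between quoted strings. For the first entry this prints 1: text=<[a] ok > RT=<"x\ny">. Replacing the newlines in RT and using the result as ORS is therefore enough to rebuild each entry on one line.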
Assuming the file log contains your sample input:
$ cat log
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two
lines [13]"
[ 2020/08/12 11:40] Success with "two lines with a twist
[19] to confuse you"
[ 2020/08/12 11:41] Failure with "one line again"
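The following code counts the double quotes (") on each line. If it finds exactly one, it joins that line with the next one, so it only handles entries that are split across two lines: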
$ gawk 'gsub("\"","\"") == 1 {x=$0; getline; print x " " $0;} gsub("\"","\"") == 2 {print}' log
[ 2020/08/12 11:40] Success with "one line [42]"
[ 2020/08/12 11:40] Success with "two lines [13]"
[ 2020/08/12 11:40] Success with "two lines with a twist [19] to confuse you"
[ 2020/08/12 11:41] Failure with "one line again"
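The getline approach above joins exactly one following line, so an entry that spans three or more lines is not handled. A sketch of a more general variant keeps appending lines until the double quotes in the entry balance. It assumes that double quotes appear only as string delimiters or as doubled ("") escapes inside a string; the function name join_by_quotes is hypothetical.

```shell
# Hypothetical helper: join lines until the double quotes in an entry balance.
join_by_quotes() {
  awk '
    {
      entry = (inq ? entry " " : "") $0
      # gsub returns the number of replacements, i.e. the number of quotes
      # on this line; "&" puts each quote back unchanged.
      if (gsub(/"/, "&") % 2) inq = !inq
      if (!inq) print entry
    }
    END { if (inq) print entry }   # unterminated final entry
  ' "$@"
}

# Example: an entry split across four lines, one of them starting with a timestamp.
printf '%s\n' \
  '[ 2020/08/12 11:40] Success with "four lines with a twist' \
  '[ 2020/08/12 11:40] to confuse you' \
  'repeatedly and' \
  'in ""horrible"" ways"' \
  '[ 2020/08/12 11:41] Failure with "one line again"' |
  join_by_quotes
```

Tracking only the parity of the quote count means doubled quotes do not disturb the state, and the sketch needs no GNU-specific features.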