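How to get rid of the _grokparsefailure tag on every parsed log when using multiple grok filters
I'm trying to parse Minecraft logs with the Elastic Stack, but I've run into a very strange problem (strange to me, at least!).
Every line of my log is parsed correctly, but every one of them carries the _grokparsefailure tag.
My Logstash pipeline configuration looks like this: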
input {
  file {
    path => [ "/path/to/my/log" ]
    #start_position => "beginning"
    tags => ["minecraft"]
  }
}
filter {
  if "minecraft" in [tags] {
    # mutate {
    #   gsub => [
    #     "message","\\n",""
    #   ]
    # }
    #############################
    #           Num 1           #
    #############################
    grok {
      match => [ "message","\[%{TIME:timestamp}] \[(?<originator>[^\/]+)?/%{LOGLEVEL:level}]: %{GREEDYDATA:message}" ]
      overwrite => [ "message" ]
      break_on_match => false
    }
    #############################
    #           Num 2           #
    #############################
    grok {
      match => [ "message","UUID of player %{USERNAME} is %{UUID}" ]
      add_tag => [ "player","uuid" ]
      break_on_match => true
    }
    #############################
    #           Num 3           #
    #############################
    grok {
      match => [ "message","\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z" ]
      add_tag => [ "player","join" ]
      break_on_match => true
    }
    #
    # grok {
    #   match => [ "message","^(?<player>[a-zA-Z0-9_]+) has just earned the achievement \[(?<achievement>[^\[]+)\]$" ]
    #   add_tag => [ "player","achievement" ]
    # }
    #
    # grok {
    #   match => [ "message","^(?<player>[a-zA-Z0-9_]+) left the game$" ]
    #   add_tag => [ "player","part" ]
    # }
    #
    # grok {
    #   match => [ "message","^<(?<player>[a-zA-Z0-9_]+)> .*$" ]
    #   add_tag => [ "player","chat" ]
    # }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:xxxx"]
    user => "xxxx"
    password => "xxxxxx"
    index => "minecraft_s1v15_%{+YYYY.MM.dd}"
  }
}
Here is a sample of my log:
[11:21:46] [User Authenticator #7/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[11:21:46] [Server thread/INFO]: MyAwsomeUsername joined the game
[11:21:46] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45140] logged in with entity id 6868 at ([world]61.45686149445207,70.9375,-175.44700729217607)
[11:21:49] [Server thread/INFO]: MyAwsomeUsername issued server command: //efererg
[11:21:52] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> egerg
[11:21:54] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> ef
[12:00:19] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:19] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:21] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:21] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:21] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45470] logged in with entity id 11767 at ([world]61.45686149445207,-175.44700729217607)
[12:00:27] [Server thread/INFO]: MyAwsomeUsername issued server command: /wgergerger
[12:00:29] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerg
[12:00:33] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerger
[12:00:35] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> rerg
[12:00:37] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:37] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:38] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:38] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:38] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45476] logged in with entity id 11793 at ([world]62.97573252632079,71.0,-179.01739415148737)
[12:00:40] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:40] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:51] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:51] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:51] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45486] logged in with entity id 11805 at ([world]62.97573252632079,-179.01739415148737)
[12:00:55] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:55] [Server thread/INFO]: MyAwsomeUsername left the game
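Explanation:
I commented out the other groks to explain the problem more simply (the exact same problem occurs when they are uncommented).
I tested 3 scenarios:
- Groks 2 and 3 and the others commented out, only grok 1 active. In this case every log line is parsed and no record has _grokparsefailure.
- Only the others commented out, with groks 1 and 2 active. In this case the log lines that match grok number 2 are parsed without _grokparsefailure, and the other lines get _grokparsefailure. This still makes sense!
- In the last scenario I uncommented all 3 groks (1, 2 and 3 active), and every log line is parsed BUT carries _grokparsefailure! This happens even though break_on_match is true by default, so a line that matches grok 2 should not be tested against grok 3.
I read other Stack Overflow questions similar to mine (Similar Question 1) and added a mutate block before the grok filters (because every log line ends with \n), but nothing changed and the problem persists!
One more thing I'd like to mention: I know that adding more groks next to grok 2 (grok 3 and the others) will produce this tag, because some logs do not match grok 2 at all, so I will have to wrap them in conditionals. But for now, the logs that do match grok 2 should be fine (no _grokparsefailure), and they are not! (I read about this in the Stack Overflow question Similar Question 2.)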
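Solution
This is actually the expected behavior; you are mixing up how Logstash and grok work.
First, all filters are independent of one another. Setting break_on_match in a grok only affects that grok; it has no effect on the other grok filters that come after it in your pipeline. break_on_match only makes sense when a single grok has multiple patterns, which is not your case.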
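To make the scope of break_on_match concrete, here is a minimal sketch that reuses two patterns from the question. Putting them together in one grok is an illustration only, not your current setup:

```
grok {
  # Two patterns inside ONE grok: the only situation where break_on_match matters.
  match => {
    "message" => [
      "UUID of player %{USERNAME} is %{UUID}",
      "^(?<player>[a-zA-Z0-9_]+) left the game$"
    ]
  }
  # true (the default): stop at the first pattern in this list that matches.
  # false: keep trying the remaining patterns in this list as well.
  break_on_match => true
}
grok {
  # A separate filter: it runs on every event that reaches it,
  # no matter what the grok above matched.
  match => { "message" => "^<(?<player>[a-zA-Z0-9_]+)> .*$" }
}
```

The second grok is not skipped when the first one matches, which is why each of your separate groks tags the lines it cannot parse.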
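Second, since Logstash is serial and you are not using any conditionals, every grok filter is applied to every message in the pipeline, regardless of whether it has already been parsed. This is what makes your lines get the _grokparsefailure tag. To fix it you need to use conditionals.
You don't need a conditional on the first two grok filters: the first just splits the log line into fields and overwrites the message field, and the second is your first test. For every grok after the second one, you will need the following configuration.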
if "_grokparsefailure" in [tags] {
  grok {
    match => "your pattern"
    add_tag => "your tags"
    remove_tag => ["_grokparsefailure"]
  }
}
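This grok is only applied when the message has _grokparsefailure in its tags field. If the message matches your pattern, the tag is removed; if it does not match, the tag stays and the message can be tested by the next grok.
In the end, your grok configuration should look like this.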
grok {
  "your first grok"
}
grok {
  "your second grok,can be any of the others"
}
if "_grokparsefailure" in [tags] {
  grok {
    "your grok N"
    remove_tag => ["_grokparsefailure"]
  }
}
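Filled in with the patterns from the question, the skeleton above could look roughly like the sketch below. It is untested and assumes the patterns are copied unchanged from the question; the "left the game" block is one of the groks that is commented out there. The break_on_match lines are left out because each grok has a single pattern.

```
filter {
  if "minecraft" in [tags] {
    # Grok 1: split the line into timestamp, originator, level and message.
    grok {
      match => [ "message","\[%{TIME:timestamp}] \[(?<originator>[^\/]+)?/%{LOGLEVEL:level}]: %{GREEDYDATA:message}" ]
      overwrite => [ "message" ]
    }
    # Grok 2: the first test. A line that does not match gets _grokparsefailure.
    grok {
      match => [ "message","UUID of player %{USERNAME} is %{UUID}" ]
      add_tag => [ "player","uuid" ]
    }
    # Grok 3: only tried when the previous test failed.
    if "_grokparsefailure" in [tags] {
      grok {
        match => [ "message","\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z" ]
        add_tag => [ "player","join" ]
        remove_tag => [ "_grokparsefailure" ]
      }
    }
    # Grok N: same wrapper for each remaining pattern.
    if "_grokparsefailure" in [tags] {
      grok {
        match => [ "message","^(?<player>[a-zA-Z0-9_]+) left the game$" ]
        add_tag => [ "player","part" ]
        remove_tag => [ "_grokparsefailure" ]
      }
    }
  }
}
```

With this chain, a line that matches none of the patterns (for example "joined the game") still ends up with _grokparsefailure, which is a useful signal that a pattern is missing.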
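This is only necessary because you add different tags for each kind of message. If you moved that logic to a mutate filter, for example, you could use just two grok filters, the second one being a multi-pattern grok with break_on_match set to true.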
grok {
  match => {
    "message" => [
      "pattern from grok 2","pattern from grok 3","pattern from grok N"
    ]
  }
  break_on_match => true
}
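One possible way to move the tagging into mutate filters is sketched below. It is an untested illustration under two assumptions that are not in the question: the UUID pattern is given the capture names player and uuid, and the tags are chosen by checking which captured field exists.

```
grok {
  match => {
    "message" => [
      # Assumption: capture names "player" and "uuid" added for illustration.
      "UUID of player %{USERNAME:player} is %{UUID:uuid}",
      "\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z"
    ]
  }
  break_on_match => true
}
# Tag each event according to the fields the matching pattern produced.
if [uuid] {
  mutate { add_tag => [ "player","uuid" ] }
} else if [entity_id] {
  mutate { add_tag => [ "player","join" ] }
}
```

Lines that match none of the listed patterns are still tagged _grokparsefailure once. If that tag is not wanted at all, the grok option tag_on_failure can be set to an empty array.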