How can we aggregate SUM on a text-type field in Elasticsearch?
I have been trying to apply aggregation functions, such as sum, to a text-type field. My index mapping is as follows:
{
  "my_elastic_search_index" : {
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "doc" : {
          "properties" : {
            "_id" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "last_updated_on" : {
              "type" : "long"
            },
            "sample_ids" : {
              "type" : "nested",
              "properties" : {
                "name" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "value" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "status" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "filter_id" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "query" : {
          "properties" : {
            "match" : {
              "properties" : {
                "doc" : {
                  "properties" : {
                    "filter_id" : {
                      "type" : "text",
                      "fields" : {
                        "keyword" : {
                          "type" : "keyword",
                          "ignore_above" : 256
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
My data is:
{
"took" : 23,"timed_out" : false,"_shards" : {
"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0
},"hits" : {
"total" : {
"value" : 4,"relation" : "eq"
},"max_score" : 1.0,"hits" : [
{
"_index" : "my_elastic_search_index","_type" : "_doc","_id" : "zzz-yyy-xxx-a9e8","_score" : 1.0,"_source" : {
"@version" : "1","@timestamp" : "2019-11-14T14:30:56.261Z","doc" : {
"status" : "SENT","sample_ids" : [
{
"value" : """"20"""","name" : "8f4abde123d"
},{
"value" : """"25.52"""","name" : "d92c4732bc9fb91"
},{
"value" : """"0"""","name" : "4b91bdee68b6e"
},{
"value" : """"xyz"""","name" : "bd0a944a292d5a"
},{
"value" : """"someothervlue"""","name" : "8ee9932060d5bf"
},{
"value" : """"30..01"""","name" : "229eed093fa0d85"
},],"filter_id" : "a1357913-cf99650f51d","_id" : "zzz-yyy-xxx-a9e81",}
}
},{
"_index" : "my_elastic_search_index","_id" : "zzz-yyy-xxx-a9e82","@timestamp" : "2019-11-14T14:30:56.731Z","sample_ids" : [
{
"value" : """"40"""","name" : "d92c47372bc9fb91"
},"name" : "4b91bdc6ee68b6e"
},"name" : "bccf07c19cfe12c"
}
],"_id" : "zzz-yyy-xxx-a9e84",},}
},"_id" : "zzz-yyy-xxx-a9e85","@timestamp" : "2019-11-14T08:23:36.998Z","sample_ids" : [
{
"value" : """"17.8"""",{
"value" : """"35.6"""","name" : "d92c473132bc9fb91"
},"name" : "4b91bd5c6ee68b6e"
},"name" : "bd0a944c2a292d5a"
},"name" : "8ee9934dce9e2060d5bf"
},"name" : "229eed48xsscd3fa0d85"
},{
"value" : """"30"""","name" : "4381f1bddffc4265129"
},"name" : "94cafdd1c78fc355b00"
},{
"value" : """"HVDC"""","name" : "bccf024ac19cfe12c"
}
],"_id" : "zzz-yyy-xxx-a9e87","@timestamp" : "2019-11-14T08:24:01.272Z","doc" : {
"sample_ids" : [
{
"value" : """"11.08"""","name" : "d92c4737a132bc9fb91"
},"name" : "4b91bd5028c6ee68b6e"
},"name" : "bd0a9445e19c2a292d5a"
},"name" : "8ee9934dd002060d5bf"
},"name" : "229eed48e2093fa0d85"
}
],}
}
}
]
}
}
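I want the sum of doc.sample_ids.value to come back as 88.88, where doc.sample_ids.name = 8f4abde123d and filter_id = a1357913-cf99650f51d. I already tried converting doc.sample_ids.name to a Number, which gave me an error. Is there any way I can get the sum, avg, and count? Here is the query I tried: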
GET /my_elastic_search_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "doc.filter_id.keyword": [
              "a1357913-cf99650f51d"
            ]
          }
        }
      ]
    }
  },
  "aggs": {
    "sum_values": {
      "sum": {
        "script": {
          "lang": "painless",
          "inline": "Double.parseDouble(doc['sample_ids.value'])"
        }
      }
    }
  }
}
Solution
You are using a nested field, so you have to use a nested aggregation.
Try an aggregation like this:
{
  "aggs": {
    "resellers": {
      "nested": {
        "path": "doc.sample_ids"
      },
      "aggs": {
        "sum_values": {
          "sum": {
            "script": {
              "lang": "painless",
              "source": "Float.parseFloat(doc['doc.sample_ids.value.keyword'].value)"
            }
          }
        }
      }
    }
  }
}
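The question also asks for avg and count, restricted to one sample name and one filter_id. As a rough sketch of how the nested aggregation above could be extended, the Python helper below only builds the request body as a dictionary; it does not contact a cluster. The aggregation names `samples` and `by_name` and the function `build_stats_request` are hypothetical. The field `doc.sample_ids.name.keyword` is taken from the mapping in the question. The Painless source is passed in as a parameter, because the parsing it needs depends on how the values are actually stored.

```python
import json


def build_stats_request(filter_id, sample_name, value_script):
    """Build a search body with sum, avg and count inside the nested path.

    All aggregation names here are illustrative, not required by Elasticsearch.
    """
    script = {"lang": "painless", "source": value_script}
    return {
        "size": 0,  # only the aggregation results are of interest
        "query": {
            "bool": {
                "must": [
                    {"terms": {"doc.filter_id.keyword": [filter_id]}}
                ]
            }
        },
        "aggs": {
            "samples": {
                # step into the nested documents under doc.sample_ids
                "nested": {"path": "doc.sample_ids"},
                "aggs": {
                    "by_name": {
                        # keep only the nested entries with the requested name
                        "filter": {
                            "term": {"doc.sample_ids.name.keyword": sample_name}
                        },
                        "aggs": {
                            "sum_values": {"sum": {"script": script}},
                            "avg_values": {"avg": {"script": script}},
                            "count_values": {"value_count": {"script": script}},
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_stats_request(
        "a1357913-cf99650f51d",
        "8f4abde123d",
        "Float.parseFloat(doc['doc.sample_ids.value.keyword'].value)",
    )
    # The printed JSON can be pasted after GET /my_elastic_search_index/_search
    print(json.dumps(body, indent=2))
```

The name filter sits inside the nested aggregation, so it applies to individual sample_ids entries rather than to whole documents.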
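With data like """"35.6"""", you should first clean your data. This is not a valid numeric format, so it should not be stored that way. You can fix it by reindexing the data, by using Logstash, or with an update by query.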
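To make the cleanup concrete, here is a minimal Python sketch of the kind of normalization a reindexing or ingestion step might apply before the values are stored. The helpers `clean_value` and `summarize` are hypothetical names, and the rule of skipping non-numeric entries such as xyz or 30..01 is an assumption; you may prefer to reject or log them instead.

```python
import math


def clean_value(raw):
    """Strip the literal double quotes and parse the rest as a number.

    Returns None when the text is not a finite number (for example "xyz").
    """
    text = raw.replace('"', "").strip()
    try:
        number = float(text)
    except ValueError:
        return None
    # float() also accepts "nan" and "inf"; treat those as non-numeric here
    return number if math.isfinite(number) else None


def summarize(raw_values):
    """Compute sum, avg and count over the values that parse as numbers."""
    numbers = [n for n in (clean_value(v) for v in raw_values) if n is not None]
    count = len(numbers)
    total = math.fsum(numbers)
    return {
        "sum": total,
        "avg": total / count if count else None,
        "count": count,
    }


if __name__ == "__main__":
    # Raw values as they appear in the first hit of the question's data
    first_hit = ['"20"', '"25.52"', '"0"', '"xyz"', '"someothervlue"', '"30..01"']
    print(summarize(first_hit))
```

Once the values are stored as real numbers, the sum, avg, and value_count aggregations can work on the field directly and no script is needed.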
But if you really cannot fix the data, you will have to enable the regex feature in elasticsearch.yml.
Add this line:
script.painless.regex.enabled: true
Then you will be able to do something like this:
"aggs": {
  "sum_values": {
    "sum": {
      "script": {
        "lang": "painless",
        "source": """
          String content = /[\"]/.matcher(doc['doc.sample_ids.value.keyword'].value).replaceAll('');
          Double.parseDouble(content);
        """
      }
    }
  }
}
Use this query with caution: performance will be poor. You really should fix the ingestion problem and clean up your data.