国外海报设计网站营销团队外包
基础概念
bucket
数据分组,一些数据按照某个字段进行bucket划分,这个字段值相同的数据放到一个bucket中。可以理解成Java中的Map<String, List>结构,类似于Mysql中的group by后的查询结果。
metric:
对一个数据分组执行的统计,比如计算最大值,最小值,平均值等 类似于Mysql中的max(),min(),avg()函数的值,都是在group by后使用的。
案例
以如下文档结构为例:
{"_index" : "zb_notice","_type" : "_doc","_id" : "4451224572914342308301065","_score" : 1.0,"_source" : {"_class" : "NoticeEntity","id" : "111","url" : "https://xxxxxx/purchaseNotice/view/111?","owner" : "河管养所","procurementName" : "工程建筑","procurementNameText" : "应急抢险配套工程建筑","intermediaryServiceMatters" : "无(属于非行政管理的中介服务项目采购)","investmentApprovalProject" : "是","code" : "789456","scale" : 3.167183E8,"scaleText" : "投资额(¥316,718,300.00元)","area" : "","requiredServices" : "工程建筑","typeCodes" : ["021"],"context" : "是一座具有灌溉 、供水 、排洪 、交通和挡潮蓄淡等多功能的大(2)型水闸工程,承担黄冈河下游 8.65 万亩农田的灌溉任务并","timeLimit" : "具体时限以合同条款约定为准。","amount" : 0.0,"amountText" : "暂不做评估与测算","amountDescription" : "","selectIntermediaryType" : "直接选取","isChooseIntermediary" : "否","isAvoidance" : "否","endTime" : "2023-09-04 09:30:00","startTime" : "2023-08-31","files" : [{"fileName" : "东溪水闸初设批复(1).pdf","url" : "/aa/bb/file/downloadfile/PjAttachment/123456"}]}
}
统计服务类型最多公告
GET zb_notice/_search
{"size": 0,"aggs": {"song_qty_by_language": {"terms": {"field": "requiredServices"}}}
}
语法解释:
- size:0 表示只要统计后的结果,原始数据不展现
- aggs:固定语法 ,聚合分析都要声明aggs
- song_qty_by_language:聚合的名称,可以随便写,建议规范命名
- terms:按什么字段进行分组
- field:具体的字段名称
响应结果如下:
{"took": 2,"timed_out": false,"_shards": {"total": 5,"successful": 5,"skipped": 0,"failed": 0},"hits": {"total": 5,"max_score": 0,"hits": []},"aggregations": {"song_qty_by_language": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"doc_count": 5}]}}
}
语法解释:
- hits: 由于请求时设置了size:0,hits就是空的
- aggregations:聚合查询的结果
- song_qty_by_language:请求时声明的名称
- buckets:根据指定字段查询后得到的数据分组集合,[]内的是每一个数据分组,其中key为每个bucket的对应指定字段的值,doc_count为统计的数量。
默认按doc_count降序排序。
按服务分类的平均服务价格
GET zb_notice/_search
{"size": 0,"aggs": {"lang": {"terms": {"field": "requiredServices"},"aggs": {"length_avg": {"avg": {"field": "amount"}}}}}
}
这里为两层aggs聚合查询,先按服务类型统计,得到数据分组,再在数据分组里算平均价格。
多个aggs嵌套语法也是如此,aggs代码块的位置即可。
统计最多服务费、最少服务费等的公告
最常用的统计:count,avg,max,min,sum,语法含义与mysql相同。
GET zb_notice/_search
{"size": 0,"aggs": {"color": {"terms": {"field": "requiredServices"},"aggs": {"length_avg": {"avg": {"field": "amount"}},"length_max": {"max": {"field": "amount"}},"length_min": {"min": {"field": "amount"}},"length_sum": {"sum": {"field": "amount"}}}}}
}
按上架日期分段统计服务类型数量
按月统计
date histogram与histogram语法类似,搭配date interval指定区间间隔 extended_bounds表示最大的时间范围。
复制代码GET zb_notice/_search
{"size": 0,"aggs": {"sales": {"date_histogram": {"field": "publishTime","interval": "month","format": "yyyy-MM-dd","min_doc_count": 0,"extended_bounds": {"min": "2023-01-01","max": "2023-12-31"}}}}
}
interval的值可以天、周、月、季度、年等。可以延伸一下
GET zb_notice/_search
{"size": 0,"aggs": {"sales": {"date_histogram": {"field": "publishTime","interval": "quarter","format": "yyyy-MM-dd","min_doc_count": 0,"extended_bounds": {"min": "2019-01-01","max": "2019-12-31"}},"aggs": {"lang_qty": {"terms": {"field": "requiredServices"},"aggs": {"like_sum": {"sum": {"field": "amount"}}}},"total" :{"sum": {"field": "amount"}}}}}
}
带上过滤条件
聚合查询可以和query搭配使用,相当于mysql中where与group by联合使用
查询条件
GET zb_notice/_search
{"size": 0,"query": {"match": {"requiredServices": "工程咨询"}},"aggs": {"sales": {"terms": {"field": "requiredServices"}}}
}
过滤条件
GET zb_notice/_search
{"size": 0,"query": {"constant_score": {"filter": {"term": {"requiredServices": "工程咨询"}}}},"aggs": {"sales": {"terms": {"field": "requiredServices"}}}
}