Last Updated: February 25, 2016 · 12.14K · banjer

Multiple level term aggregation in elasticsearch

If you're looking to generate a cross-tabulation (cross-frequency table) of terms in elasticsearch, nest terms aggregations inside one another as sub-aggregations.

Here's an example of a three-level aggregation that will produce a "table" of
hostname x login error code x username. This is a query I used to generate a daily report of OpenLDAP login failures.

curl -XGET 'http://localhost:9200/logstash-*/_search?pretty=true' -d '
{
  "aggs": {
    "hostname_by_login_result": {
      "terms": {
        "field": "hostname.raw"
      },
      "aggs": {
        "result_by_user": {
          "terms": {
            "field": "login_code",
            "size": 0,
            "order": { "_term": "desc" }
          },
          "aggs": {
            "username": {
              "terms": {
                "field": "username.raw",
                "size": 0
              }
            }
          }
        }
      }
    }
  }
}
'
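The response nests its buckets the same way the request nests its aggs, so flattening it into the "table" takes one loop per level. Here is a minimal Python sketch, assuming the response has already been parsed into a dict. The sample data is made up for illustration, and `flatten_buckets` is a hypothetical helper, not part of any Elasticsearch client.

```python
def flatten_buckets(response):
    """Yield (hostname, login_code, username, count) rows from the nested buckets."""
    hosts = response["aggregations"]["hostname_by_login_result"]["buckets"]
    for host in hosts:
        for code in host["result_by_user"]["buckets"]:
            for user in code["username"]["buckets"]:
                yield host["key"], code["key"], user["key"], user["doc_count"]


# Made-up sample shaped like a terms-aggregation response (illustrative values only).
sample_response = {
    "aggregations": {
        "hostname_by_login_result": {
            "buckets": [
                {
                    "key": "ldap01",
                    "doc_count": 3,
                    "result_by_user": {
                        "buckets": [
                            {
                                "key": 49,
                                "doc_count": 3,
                                "username": {
                                    "buckets": [
                                        {"key": "alice", "doc_count": 2},
                                        {"key": "bob", "doc_count": 1},
                                    ]
                                },
                            }
                        ]
                    },
                }
            ]
        }
    }
}

rows = list(flatten_buckets(sample_response))
for row in rows:
    # One tab-separated line per hostname / code / username combination.
    print(*row, sep="\t")
```

Each innermost bucket becomes one row, which is easy to write out as CSV or paste into a daily report.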

Aggregating on the .raw version of a field uses its "not analyzed" copy, so each value is treated as a single term instead of being split into tokens on delimiters.
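To see why this matters for a terms aggregation, here is a rough Python illustration. `crude_analyze` is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters. It is not Elasticsearch's actual analyzer, and the hostnames are invented.

```python
import re
from collections import Counter


def crude_analyze(value):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", value.lower()) if token]


# Invented example values, one per "document".
hostnames = ["ldap-01.example.com", "ldap-02.example.com", "ldap-01.example.com"]

# Analyzed field: every token is its own term, so buckets are word fragments.
# set() counts each token once per document, like a bucket's document count.
analyzed_terms = Counter(
    token for name in hostnames for token in set(crude_analyze(name))
)

# Not-analyzed (.raw) field: the whole value is a single term.
raw_terms = Counter(hostnames)

print(analyzed_terms.most_common())
print(raw_terms.most_common())
```

With the analyzed copy, the buckets would be fragments such as "ldap" and "com". With the raw copy, each bucket is a complete hostname.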

I also want the output sorted by login error code in descending order, hence the order option:

...
  "terms": {
    "field": "login_code",
    "size": 0,
    "order": { "_term": "desc" }
  },
...

By default, buckets are sorted by descending document count, or _count. A few other built-in sort keys are available, such as the _term key used above, depending on what type of aggregation you're running.
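If you build the query in code rather than by hand, the order option is just one more key in the terms clause. Below is a small Python sketch. `terms_agg` is a hypothetical helper written for this example, not a client-library function. The `_term` key and `"size": 0` follow the syntax used in this post; later Elasticsearch releases changed both, so check the documentation for your version.

```python
import json


def terms_agg(field, order=None, size=None, sub_aggs=None):
    """Build one terms-aggregation clause; optional keys are added only when given."""
    terms = {"field": field}
    if size is not None:
        terms["size"] = size
    if order is not None:
        terms["order"] = order
    clause = {"terms": terms}
    if sub_aggs:
        clause["aggs"] = sub_aggs
    return clause


# No "order" key: buckets come back in the default order (by document count).
by_count = terms_agg("login_code")

# Explicit order on the term itself, matching the query in this post.
by_term_desc = terms_agg("login_code", order={"_term": "desc"}, size=0)

# Nest the clauses to rebuild the three-level request body.
body = {
    "aggs": {
        "hostname_by_login_result": terms_agg(
            "hostname.raw",
            sub_aggs={
                "result_by_user": terms_agg(
                    "login_code",
                    order={"_term": "desc"},
                    size=0,
                    sub_aggs={"username": terms_agg("username.raw", size=0)},
                )
            },
        )
    }
}

print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the request body of the same _search call.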
