ElasticGraph Query API: Aggregation Grouping
When aggregating documents, the groupings are defined by groupedBy. Here’s an example:
query ArtistCountsByYearFormedAndHomeCountry {
  artistAggregations {
    nodes {
      groupedBy {
        bio {
          yearFormed
          homeCountry
        }
      }
      count
    }
  }
}
In this case, we’re grouping by multiple fields; a grouping will be returned for each
combination of Artist.bio.yearFormed and Artist.bio.homeCountry found in the data.
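To make "one grouping per combination" concrete, here is a minimal Python sketch of the same counting logic applied to in-memory data. The sample documents and the `count_by_year_and_country` helper are hypothetical and only illustrate the semantics; they are not how ElasticGraph computes the result.

```python
from collections import Counter

# Hypothetical sample documents; the field names mirror the query above.
artists = [
    {"bio": {"yearFormed": 1994, "homeCountry": "USA"}},
    {"bio": {"yearFormed": 1994, "homeCountry": "USA"}},
    {"bio": {"yearFormed": 1994, "homeCountry": "GBR"}},
    {"bio": {"yearFormed": 2001, "homeCountry": "USA"}},
]


def count_by_year_and_country(docs):
    """Count documents per distinct (yearFormed, homeCountry) pair."""
    return Counter(
        (doc["bio"]["yearFormed"], doc["bio"]["homeCountry"]) for doc in docs
    )


counts = count_by_year_and_country(artists)
for (year_formed, home_country), count in sorted(counts.items()):
    print(year_formed, home_country, count)
```

Four documents produce three groupings here, because two of them share the same year and country.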
Date Grouping
In the example above, the grouping was performed on the raw values of the groupedBy fields.
However, for Date fields it’s generally more useful to group by truncated values.
Here’s an example:
query AlbumSalesByReleaseYear {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDate(truncationUnit: YEAR)
              }
            }
            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}
In this case, we’re truncating the Album.releasedOn dates to the year to give us one grouping per
year rather than one grouping per distinct date. The truncationUnit argument supports DAY, WEEK,
MONTH, QUARTER and YEAR values. In addition, an offset argument is supported, which can be used
to shift which grouping a Date falls into. This is particularly useful when using WEEK:
query AlbumSalesByReleaseWeek {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDate(truncationUnit: WEEK, offset: {amount: -1, unit: DAY})
              }
            }
            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}
Without an offset, grouped weeks run Monday to Sunday. Here, the offset of -1 day shifts the weeks
to run Sunday to Saturday.
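The truncation and offset behavior can be sketched in Python as follows. This is an illustration of the semantics only: the `truncate` and `truncate_with_offset` helpers are hypothetical, and the sketch assumes the offset moves the bucket boundaries (so a -1 day offset makes weeks start on Sunday), which is consistent with the Sunday-to-Saturday result described above.

```python
from datetime import date, timedelta


def truncate(d, unit):
    """Return the first date of the DAY/WEEK/MONTH/QUARTER/YEAR containing d."""
    if unit == "DAY":
        return d
    if unit == "WEEK":
        return d - timedelta(days=d.weekday())  # Monday is weekday 0
    if unit == "MONTH":
        return d.replace(day=1)
    if unit == "QUARTER":
        first_month = 3 * ((d.month - 1) // 3) + 1
        return d.replace(month=first_month, day=1)
    if unit == "YEAR":
        return d.replace(month=1, day=1)
    raise ValueError(f"unsupported unit: {unit}")


def truncate_with_offset(d, unit, offset=timedelta(0)):
    """Assumed semantics: shift the bucket boundaries by offset, then bucket d."""
    return truncate(d - offset, unit) + offset


saturday = date(2024, 1, 6)
sunday = date(2024, 1, 7)
minus_one_day = timedelta(days=-1)

# Without an offset, both dates fall in the week starting Monday 2024-01-01.
print(truncate(saturday, "WEEK"), truncate(sunday, "WEEK"))

# With a -1 day offset, the Sunday starts a new week of its own.
print(
    truncate_with_offset(saturday, "WEEK", minus_one_day),
    truncate_with_offset(sunday, "WEEK", minus_one_day),
)
```

With the offset applied, Saturday 2024-01-06 lands in the week starting Sunday 2023-12-31, while Sunday 2024-01-07 starts the next week.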
Finally, we can also group Date fields by what day of week they fall into using asDayOfWeek instead of asDate:
query AlbumSalesByReleaseDayOfWeek {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDayOfWeek
              }
            }
            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}
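As a rough illustration of day-of-week grouping, the Python sketch below sums sold units per weekday regardless of which week each date falls in. The sample albums, the `sold_units_by_day_of_week` helper, and the upper-case day labels are assumptions made for the example, not values taken from ElasticGraph's schema.

```python
from collections import defaultdict
from datetime import date

# Illustrative labels, indexed by date.weekday() (Monday is 0).
DAY_NAMES = [
    "MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
    "FRIDAY", "SATURDAY", "SUNDAY",
]

# Hypothetical sample data.
albums = [
    {"releasedOn": date(2024, 1, 5), "soldUnits": 100},   # a Friday
    {"releasedOn": date(2024, 1, 12), "soldUnits": 250},  # the next Friday
    {"releasedOn": date(2024, 1, 9), "soldUnits": 40},    # a Tuesday
]


def sold_units_by_day_of_week(docs):
    """Sum soldUnits per weekday, ignoring which week each date is in."""
    totals = defaultdict(int)
    for doc in docs:
        totals[DAY_NAMES[doc["releasedOn"].weekday()]] += doc["soldUnits"]
    return dict(totals)


totals = sold_units_by_day_of_week(albums)
print(totals)
```

The two Friday releases end up in the same grouping even though they are a week apart, which is the difference from truncating with asDate.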
DateTime Grouping
DateTime fields offer a similar grouping API. asDate and asDayOfWeek work the same way, and they also accept an optional
timeZone argument (default is “UTC”):
query TourAttendanceByYear {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asDate(
                        truncationUnit: YEAR
                        timeZone: "America/Los_Angeles"
                      )
                    }
                  }
                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
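The time zone matters because the same instant can fall on different dates, and even different years, depending on where it is observed. The Python sketch below shows this for a single timestamp. The `year_bucket` helper is hypothetical, and a fixed UTC-8 offset stands in for "America/Los_Angeles" in winter so that the example does not depend on a time zone database; `zoneinfo.ZoneInfo("America/Los_Angeles")` would also handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-8 stand-in for "America/Los_Angeles" outside of DST.
LOS_ANGELES_WINTER = timezone(timedelta(hours=-8))


def year_bucket(instant, tz=timezone.utc):
    """Return the calendar year of instant as observed in the given zone."""
    return instant.astimezone(tz).year


# A hypothetical show starting at 03:00 UTC on New Year's Day.
started_at = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)

print(year_bucket(started_at))                      # grouped under UTC
print(year_bucket(started_at, LOS_ANGELES_WINTER))  # grouped under UTC-8
```

Under UTC the show counts toward 2024, but in Los Angeles it is still the evening of 2023-12-31, so it counts toward 2023.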
Sub-day granularities (HOUR, MINUTE, SECOND) are supported when you use asDateTime instead of asDate:
query TourAttendanceByHour {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asDateTime(truncationUnit: HOUR)
                    }
                  }
                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Finally, you can group by the time of day (while ignoring the date) by using asTimeOfDay:
query TourAttendanceByHourOfDay {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asTimeOfDay(truncationUnit: HOUR)
                    }
                  }
                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
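The difference between truncating a DateTime to the hour and grouping by time of day can be sketched in Python. The `truncate_to_hour` and `hour_of_day` helpers and the sample timestamps are hypothetical; they only model the described behavior.

```python
from datetime import datetime, timezone


def truncate_to_hour(instant):
    """Keep the date and hour; zero out minutes, seconds and microseconds."""
    return instant.replace(minute=0, second=0, microsecond=0)


def hour_of_day(instant):
    """Keep only the hour-truncated time of day, discarding the date."""
    return truncate_to_hour(instant).time()


# Two hypothetical shows on consecutive evenings.
shows = [
    datetime(2024, 3, 1, 20, 15, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 20, 45, tzinfo=timezone.utc),
]

date_time_groups = {truncate_to_hour(s) for s in shows}
time_of_day_groups = {hour_of_day(s) for s in shows}

print(len(date_time_groups))    # one group per distinct date and hour
print(len(time_of_day_groups))  # one group per distinct hour of the day
```

Both shows start in the 20:00 hour, so they share a time-of-day grouping, while hour truncation keeps them apart because they fall on different dates.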