ElasticGraph Query API: Aggregation Grouping

When aggregating documents, the groupings are defined by groupedBy. Here’s an example:

query ArtistCountsByYearFormedAndHomeCountry {
  artistAggregations {
    nodes {
      groupedBy {
        bio {
          yearFormed
          homeCountry
        }
      }

      count
    }
  }
}

In this case, we’re grouping by multiple fields; a grouping will be returned for each combination of Artist.bio.yearFormed and Artist.bio.homeCountry found in the data.

Date Grouping

In the example above, the grouping was performed on the raw values of the groupedBy fields. However, for Date fields it’s generally more useful to group by truncated values. Here’s an example:

query AlbumSalesByReleaseYear {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDate(truncationUnit: YEAR)
              }
            }

            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}

In this case, we’re truncating the Album.releaseOn dates to the year to give us one grouping per year rather than one grouping per distinct date. The truncationUnit argument supports DAY, MONTH, QUARTER, WEEK and YEAR values. In addition, an offset argument is supported, which can be used to shift what grouping a Date falls into. This is particularly useful when using WEEK:

query AlbumSalesByReleaseWeek {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDate(truncationUnit: WEEK, offset: {amount: -1, unit: DAY})
              }
            }

            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}

With no offset, grouped weeks run Monday to Sunday, but we can shift it using offset. In this case, the weeks have been shifted to run Sunday to Saturday.

Finally, we can also group Date fields by what day of week they fall into using asDayOfWeek instead of asDate:

query AlbumSalesByReleaseDayOfWeek {
  artistAggregations {
    nodes {
      subAggregations {
        albums {
          nodes {
            groupedBy {
              releasedOn {
                asDayOfWeek
              }
            }

            aggregatedValues {
              soldUnits {
                exactSum
              }
            }
          }
        }
      }
    }
  }
}

DateTime Grouping

DateTime fields offer a similar grouping API. asDate and asDayOfWeek work the same, but they accept an optional timeZone argument (default is “UTC”):

query TourAttendanceByYear {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asDate(
                        truncationUnit: YEAR
                        timeZone: "America/Los_Angeles"
                      )
                    }
                  }

                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Sub-day granualarities (HOUR, MINUTE, SECOND) are supported when you use asDateTime instead of asDate:

query TourAttendanceByHour {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asDateTime(truncationUnit: HOUR)
                    }
                  }

                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Finally, you can group by the time of day (while ignoring the date) by using asTimeOfDay:

query TourAttendanceByHourOfDay {
  artistAggregations {
    nodes {
      subAggregations {
        tours {
          nodes {
            subAggregations {
              shows {
                nodes {
                  groupedBy {
                    startedAt {
                      asTimeOfDay(truncationUnit: HOUR)
                    }
                  }

                  aggregatedValues {
                    attendance {
                      exactSum
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}