Skip to main content

Defining pre-aggregates

Pre-aggregates are defined under the pre_aggregates key in your model configuration.
models:
  - name: orders
    meta:
      pre_aggregates:
        - name: orders_daily_by_status
          dimensions:
            - status
          metrics:
            - total_order_amount
            - average_order_size
          time_dimension: order_date
          granularity: day

Configuration reference

PropertyRequiredDescription
nameYesUnique identifier for the pre-aggregate. Must contain only letters, numbers, and underscores.
dimensionsYesList of dimension names to include. Must contain at least one dimension.
metricsYesList of metric names to include. Must contain at least one metric.
time_dimensionNoA time-based dimension for date grouping. Must be paired with granularity.
granularityNoTime granularity for the time_dimension. Valid values: hour, day, week, month, quarter, year. Must be paired with time_dimension.
max_rowsNoMaximum number of rows to store in the materialization. If the aggregation exceeds this limit, the result is truncated. Must be a positive integer.
refreshNoSchedule configuration for automatic re-materialization. See Scheduling refreshes.
If you specify time_dimension, you must also specify granularity, and vice versa.

Multiple pre-aggregates per model

You can define multiple pre-aggregates on the same model, each targeting different query patterns. For example, you might want a fine-grained daily pre-aggregate for detailed dashboards and a coarser monthly one for summary views:
models:
  - name: orders
    config:
      meta:
        pre_aggregates:
          - name: orders_daily_by_status
            dimensions:
              - status
            metrics:
              - total_order_amount
              - order_count
            time_dimension: order_date
            granularity: day
          - name: orders_monthly_summary
            dimensions:
              - status
            metrics:
              - total_order_amount
            time_dimension: order_date
            granularity: month
            max_rows: 1000000
When a query matches multiple pre-aggregates, Lightdash picks the smallest one.

Scheduling refreshes

By default, pre-aggregates are materialized when your dbt project compiles. You can also schedule automatic refreshes using cron expressions, using your project’s configured timezone (defaults to UTC):
pre_aggregates:
  - name: orders_daily_by_status
    dimensions:
      - status
    metrics:
      - total_order_amount
    time_dimension: order_date
    granularity: day
    refresh:
      cron: "0 6 * * *"  # Every day at 6:00 AM UTC

Materialization triggers

Pre-aggregates can be materialized through four different triggers:
TriggerWhen it happens
CompileAutomatically when your dbt project is compiled
CronOn the schedule you define in refresh.cron
ManualWhen you trigger a refresh from the Lightdash UI

Row limits

You can set max_rows to cap the size of a materialization. If the aggregation produces more rows than the limit, the result is truncated.
When max_rows is applied, some data is excluded from the materialization. Queries that match the pre-aggregate may return incomplete results. Use this setting carefully and monitor for the “max rows applied” warning in the monitoring UI.

Complete example

Here’s a full model definition with a pre-aggregate, including joins, scheduling, and row limits:
models:
  - name: orders
    meta:
      joins:
        - join: customers
          sql_on: ${customers.customer_id} = ${orders.customer_id}
      pre_aggregates:
        - name: orders_daily_by_status
          dimensions:
            - status
            - customers.country
          metrics:
            - total_order_amount
            - average_order_size
          time_dimension: order_date
          granularity: day
          max_rows: 5000000
          refresh:
            cron: "0 6 * * *"
    columns:
      - name: order_date
        meta:
          dimension:
            type: date
      - name: status
        meta:
          dimension:
            type: string
      - name: amount
        meta:
          metrics:
            total_order_amount:
              type: sum
            average_order_size:
              type: average
With this pre-aggregate, the following queries would be served from materialized data:
  • Total order amount by status, grouped by day, week, month, or year
  • Average order size by status, grouped by month
  • Total order amount filtered by status or customer country
  • Order amount by customer country, grouped by quarter
These queries would not match and would query the warehouse directly:
  • Queries including count_distinct metrics
  • Queries grouped by a dimension not in the pre-aggregate (for example, customer_id)
  • Queries with hourly granularity (finer than the pre-aggregate’s day)
  • Queries with custom dimensions, custom metrics, or table calculations