A Pivot cell takes a dataframe as input, aggregates it based on your configurations, and returns a dataframe-like object that can be displayed in an app or used downstream in your Hex project.
Using Pivot cells
- Create a Pivot cell by clicking on the "Transform" dropdown, and then selecting "Pivot".
- From "Pivot Source", select a dataframe or query result to be aggregated. After selecting your dataframe, all of its columns will appear in the field picker on the left.
- To render data in the Pivot cell, drag a column from the field picker into the Pivot cell configuration.
- Rows: The field(s) you want to aggregate by, rendered vertically.
- Columns: The field(s) you want to aggregated by, rendered horizontally
- Values: The field(s) you want to aggregate. The aggregate type and formatting of the value can be changed per value.
As you configure the cell, the pivot table will update with the results. This resulting table can be added directly to an app, or used in downstream cells (see below).
- Date and time fields can be truncated to a time period (hour, day, week, month, quarter or year) when added as a row or column.
- Fields can be renamed once they are added as rows, columns, or values by clicking into the field name.
Considerations when pivoting a query result
Pivot cells work seamlessly with query results (i.e. the result a SQL cell that returns data in query mode), running the compiled SQL against your warehouse rather than filtering a dataframe in memory. This allows you to work with large datasets in Hex by leveraging the compute engine of your data warehouse.
When pivoting a query result, the cell needs to be rerun after changing a configuration via the Run button, or using
cmd + enter. This helps avoid unnecessary queries being executed against your data warehouse.
Using pivot objects in downstream cells
Pivot cells return a custom pivot object (
pivot_result by default.
While the custom pivot object is not a dataframe itself, it contains two different dataframe representations of the data,
HexPivotTable.pivotedreturns the pivoted object as a dataframe that uses the same indexing and pivoting that is rendered in your original Pivot cell.
.pivoteddataframe is useful if:
- You’re creating variables from the resulting aggregated values (to be used in a dynamic markdown cell, for example)
- You want to preserve the pivoting that’s performed by the pivot cell
HexPivotTable.aggregatedreturns the aggregated data without any pivoting. For those familiar with SQL, this is the result of writing a statement that groups by all the fields rendered as rows and columns, before the data is pivoted.
.aggregatedis useful if:
- You plan on querying the dataframe in a downstream SQL cell
- You want to work with a “flattened”, un-pivoted dataframe
Duplicate Pivot cells as Python + SQL
The Pivot cell’s “Duplicate as Python + SQL cell” feature is a great way to visualize the aggregating + pivoting steps mentioned in the above section. If you’d ever like to understand exactly what’s going on behind the scenes of a Pivot cell, you can eject cell as Python + SQL via the three-dot menu.
The ejected SQL cell will aggregate the data using the “summarize by” aggregation you chose for the Pivot cell’s values. The resulting dataframe (
dataframe_aggregated) is identical to what’s returned by
The subsequent Python cell chains from the SQL cell, and pivots the aggregated data. Within this code,
dataframe_pivoted_tmp is identical to what’s returned by