Skip to main content

Data connections

Data connections are secure configurations to supported data sources.

Hex makes it easy to securely connect to your data warehouse or cloud storage provider.

Supported data sources

AlloyDB
Athena
BigQuery
ClickHouse
Cloud SQL
Databricks
Dremio
MS SQL Server
MariaDB
Materialize
MotherDuck
MySQL
Postgres
Presto
Redshift
Snowflake
Starburst
Trino

Project and workspace data connections

info
  • Users will need the Admin workspace role to create workspace data connections. Workspace data connections are available on the Professional, Team and Enterprise plans.
  • Users will need Can Edit or higher project permissions to create project data connections.

Hex has two types of data connections:

  • Workspace data connections are created and managed by Admins, from the Data Sources page of Settings. Workspace data connections can be used across multiple projects.
  • Project data connections are created by Editors, from the the Data browser sidebar of the Notebook view. Project data connections are scoped to the project they are created in, and cannot be used across projects.

In general, workspace data connections are recommended over project data connections as they can be used across projects rather than configured for each new project. Since they are centrally managed by admins, they also remove the need for Editors to have access to data warehouse credentials. Further, workspace data connections offer additional configuration features including automated schema refreshes and data connection permissions.

Add a workspace data connection

To create a workspace data connection, an Admin should head to SettingsData sources, and select the + Connection button.

When you add a new connection, you will be prompted to select a data source. Fill in the relevant connection details and make sure that Hex is allowed to connect to your data source.

Workspace data connections can also be created from within a project (as described below) by enabling the Share with workspace setting — this option will be enabled by default if an Admin is creating the connection.

Add a project data connection

Navigate to the Data sources tab within a project in the left sidebar, and select "Add".

Follow the same steps as above to create the data connection.

Example data connection form

Required connection details

The data connection form fields differ depending on the data connection type. For most data connections (PostgreSQL, MySQL, Redshift, MS SQL Server, ClickHouse, AlloyDB, MariaDB, Materialize & Cloud SQL) you'll need the following connection details to complete the data connection form:

  • Database host url
  • Database port
  • Database name
  • Username & password

BigQuery

To connect to BigQuery, you'll need:

Note, by default the Use BigQuery Storage API option is enabled when creating a BQ connection. This connection method significantly improves BigQuery query performance in Hex. However it requires two additional permissions be granted to the service account: bigquery.readsessions.create and bigquery.readsessions.getData. Additionally, there may be some BigQuery costs associated when using this method. Pricing details from Google is available here.

Snowflake

To connect to Snowflake, you'll need:

  • Account name
  • Warehouse
  • Database
  • Key Pair or OAuth
tip

If connecting to Snowflake using Key Pair authentication, paste the entire multi-line key, including "-----BEGIN PRIVATE KEY-----" and "-----END PRIVATE KEY-----", in the Private Key box.

caution

As of October 2024, Snowflake is transitioning away from password-only authentication.

Existing data connections authenticated using username & password authentication will continue to function, but the username & password fields will be uneditable. Connections will fail if the Snowflake password is changed or if MFA is enabled for the account used to authenticate the data connection. It is highly recommended to switch to Key Pair authentication to avoid issues going forward.

Databricks

For Databricks, you'll need:

Dremio

For Dremio, you'll need:

  • JDBC URL (retrieve this from the Project Settings page under General information in your Dremio account)
  • Access token (instructions to generate)

Trino & Starburst

For Trino & Starburst, you'll need:

  • Database host url
  • Database port
  • Username & password

Presto

For Presto, you'll need:

  • Database host url
  • Database port
  • Catalog name
  • Username & password

Athena

For Athena, you'll need:

  • AWS access key ID
  • AWS secret access key
  • S3 bucket to write query results to
  • The default port is 443 and the host is athena.[region].amazonaws.com (if you're using a VPC to connect then the host address is [vpc-specific-url].athena.[region].amazonaws.com).

Note: only AWS users who do not use MFA authentication are supported.

tip

AWS uses the AWSQuicksightAthenaAccess policy as an example policy for JDBC connections. The IAM user will also need permissions to utilize prepared statements.

Motherduck

For MotherDuck, you'll need:

Using data connections in projects

Any available project or workspace data connections are listed in the Data sources tab. When a workspace connection is first used in a project, it will be imported into the project.

Removing data connections from projects

To remove a data connection from a project, use the three-dot menu next to the data connection's name to remove it. You also have the option of replacing any existing references to the connection before removing it.

Manage data connection schemas

When a new data connection is added, Hex retrieves information about the objects that can be queried with these credentials, populating the data browser with information about the schemas, tables, columns and columns, collectively referred to as a "schema".

This information can be refreshed any time from the refresh button of the data browser (see below for information on restricting this action), or by scheduling a refresh.

Schema filtering

Admins can filter databases, schemas and/or tables from a data connection to display specific data in the Data browser. On refresh, only the non-filtered databases, schemas and/or tables will be synced, allowing Admins to curate what is viewable when exploring a connection.

Below are some scenarios where you'd want to consider filtering your connection:

  • You have a large warehouse by default but expect that most folks will only use a small subset of the warehouse. By filtering, you will greatly improve the viewer experience of data browsing for your end users by providing a focused view of relevant data.
  • What's helpful for end viewers is also helpful for Hex Magic. By filtering, you will reduce the scope of tables Magic has access to which can result in better Magic suggestions.
caution

Filtering your connection does not affect the underlying permissions set in your data connection. Even if a database, schema, and/or table is hidden from the Data browser, it can still be queried from if the user has proper permissions to do so.

Schedule schema refreshes

info

Scheduled refreshes are only available for workspace data connections

Workspace data connections can be configured to refresh the data connection's schema on a schedule. On refresh, Hex updates the data browser with your latest data connection schema, making any new schema data available as well as removing anything that was deleted.

caution

Automated refreshes query your data connection's entire schema. Consider query costs and your data warehouse load when deciding on a cadence schedule.

Workspace data connection permissions

Workspace data connections offer additional permissions that can be used to restrict access to sensitive information.

Can query

This permission determines who can use the connection to author queries. By default, all members of a workspace are granted Can query access to a data connection.

This permission can be set to a list of groups to explicitly define which users can use this connection to author queries.

If this connection is imported in a project where a user with Can edit project access does not have Can query access to the data connection, the user will be prevented from authoring queries by being downgraded to Can view project access.

Admins always have Can query access to data connections.

Can view results

info

Can view results data permissions are only available on the Enterprise plan.

This permission determines who can view the results of projects where this connection is imported, and can effectively be used to lock users out from projects.

By default, all users are granted Can view results access to a data connection, including any anonymous users if the project is shared with the web.

This permission can be set to a list of groups to explicitly define which users can view results when this data connection is used.

If this connection is imported in a project where a user with access to the project does not have Can view results access to the data connection, the user will be prevented from accessing the project.

Admins, and any groups that are granted Can query access always have Can view results access to a data connection.

tip

Data connection permission rules are determined based on which connections are imported into a project — a connection may not be actively used by any cells, but might still be imported. Follow these steps to remove a connection from a project completely.

Can refresh schema

By default, all users that have can query access can manually refresh the data connection schema via the Refresh button in the data browser. To restrict this ability and only allow Admins to manually refresh the data browser, update the "Can refresh schema" setting in your data connection configuration to "Admins only".

Use data connections to query your database

Once you have configured a data connection, you can query it using a SQL cell. See the SQL cells section for more details.

caution

Some Hex-native functionality, including Query Mode, Pivot cells, and Table display cells, rely on modern CTE functionality when operating on data directly in your warehouse.

If you are using an older version of some data sources (such as PostgreSQL<8.4, MySQL<8.0, or MS Sql Server<2005), you will need to query your data into Hex and utilize these features locally, rather than pointing them directly to your warehouse.

Database security

For each data connection, it's possible to configure SSH or SSL/TLS for a more secure connection and encryption in transit.

Using Secure Socket Layer (SSL) or Transport Layer Security (TLS) provides an additional layer of security by encrypting data in transit that moves from Hex to your database instance. Some data connection types have SSL/TLS configured by default.

In the table below, Enabled by default indicates that creating a data connection with that database will have encryption in transit enabled by default, and no further action is required. If Configurable, please reference the documentation linked in the first column for instructions on how to configure SSL/TLS for that specific database.

Data ConnectionSSL/TLS
AthenaEnabled by default
BigQueryEnabled by default
DatabricksEnabled by default
MaterializeEnabled by default
PrestoEnabled by default
RedshiftEnabled by default
SnowflakeEnabled by default
MotherDuckEnabled by default
AlloyDBConfigurable
ClickhouseConfigurable
Cloud SQL (MySQL)Configurable
Cloud SQL (PostgreSQL)Configurable
Cloud SQL (SQL Server)Configurable
DremioConfigurable
MariaDBConfigurable
MS SQL ServerConfigurable
MySQLConfigurable
PostgresConfigurable
TrinoConfigurable
StarburstConfigurable