Connect a project

How you connect your dbt project depends on where it is hosted.

  1. If your dbt project is version controlled, take a look at the GitHub or GitLab connection options below.
  2. If you want to connect a local dbt project, check out the local dbt project section below.

Once you've set up the connection to your dbt project, you'll need to continue on to set up the connection to your warehouse (it's a short step, we promise 🤞).

We currently support:

  1. BigQuery
  2. Postgres
  3. Redshift
  4. Snowflake
  5. Databricks

If we don't support the warehouse you're using, don't be afraid to reach out to us on GitHub! :)

dbt connection options


GitHub

Personal access token

This is used to access your repo. See GitHub's instructions for creating a personal access token.

Select the repo scope when you're creating the token.

Repository

This should be in the format my-org/my-repo, e.g. lightdash/lightdash-analytics

Branch

This is the branch in your GitHub repo that Lightdash should sync to, e.g. main, master or dev.

By default, we've set this to main, but you can change it to whatever you'd like.

Project directory path

This is the folder where your dbt_project.yml file is found in the GitHub repository you entered above.

  • Put / if your dbt_project.yml file is in the main folder of your repo (e.g. lightdash/lightdash-analytics/dbt_project.yml)
  • If your dbt project is in a sub-folder of your repo, include the path to that sub-folder. For example, if your project was at lightdash/lightdash-analytics/dbt/dbt_project.yml, you'd write /dbt in this field.
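For illustration, here are the two layouts side by side (folder names are hypothetical):

```text
# dbt_project.yml at the repo root → Project directory path is /
lightdash-analytics/
├── dbt_project.yml
└── models/

# dbt_project.yml in a sub-folder → Project directory path is /dbt
lightdash-analytics/
├── README.md
└── dbt/
    ├── dbt_project.yml
    └── models/
```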

GitLab

Personal access token

This is used to access your repo. See GitLab's instructions for creating a personal access token.

Select the read_repository scope when you're creating the token.

Repository

This should be in the format my-org/my-repo, e.g. lightdash/lightdash-analytics

Branch

This is the branch in your GitLab repo that Lightdash should sync to, e.g. main, master or dev.

By default, we've set this to main, but you can change it to whatever you'd like.

Project directory path

This is the folder where your dbt_project.yml file is found in the GitLab repository you entered above.

If your dbt_project.yml file is in the main folder of your repo (e.g. lightdash/lightdash-analytics/dbt_project.yml), then you don't need to change anything here. You can just leave the default value we've put in.

If your dbt project is in a sub-folder of your repo (e.g. lightdash/lightdash-analytics/dbt/dbt_project.yml), then you'll need to include the path to that sub-folder (e.g. /dbt).


Local dbt project

Prerequisite

When you're using a local dbt project, you should provide the absolute path to your dbt project during the install process.


Warehouse connection

BigQuery

You can see more details in the dbt documentation.

Project

This is the GCP project ID.

Data set

This is the name of your dbt dataset: the dataset in your warehouse where the output of your dbt models is written to. If you're not sure what this is, check out the dataset value you've set in your dbt profiles.yml file.

Location

The location of your BigQuery datasets. You can see more details in the dbt documentation.

Key File

This is the JSON key file for the service account used to connect to BigQuery. You can see how to create a key in the BigQuery documentation.

Threads

This is the number of threads dbt uses when running queries against the warehouse.

Timeout in seconds

If a dbt model takes longer than this timeout to complete, then BigQuery may cancel the query. You can see more details in the dbt documentation.

Priority

The priority for the BigQuery jobs that dbt executes. You can see more details in the dbt documentation.

Retries

The number of times dbt should retry queries that result in unhandled server errors. You can see more details in the dbt documentation.

Maximum bytes billed

When a value is configured, queries executed by dbt will fail if they exceed the configured maximum bytes threshold. You can see more details in the dbt documentation.
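To see how these fields map onto a dbt profile, here's a sketch of a profiles.yml target with the same settings (project, dataset, key file path, and all values below are placeholders):

```yaml
# profiles.yml — hypothetical BigQuery target; names and values are placeholders
my_project:
  target: prod
  outputs:
    prod:
      type: bigquery
      method: service-account
      project: my-gcp-project           # Project (GCP project ID)
      dataset: dbt_output               # Data set
      keyfile: /path/to/keyfile.json    # Key File (JSON service account key)
      location: US                      # Location
      threads: 8                        # Threads
      timeout_seconds: 300              # Timeout in seconds
      priority: interactive             # Priority
      retries: 1                        # Retries
      maximum_bytes_billed: 1000000000  # Maximum bytes billed
```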


Postgres

You can see more details in the dbt documentation.

Host

This is the host where the database is running.

User

This is the database user name.

Password

This is the database user password.

DB name

This is the database name.

Schema

This is the schema name.

Port

This is the port where the database is running.

Threads

This is the number of threads dbt uses when running queries against the warehouse.

Keep alive idle (seconds)

This specifies the number of seconds with no network activity after which the operating system should send a TCP keepalive message to the client. You can see more details in the postgresqlco documentation.

Search path

This controls the Postgres "search path". You can see more details in the dbt documentation.

SSL mode

This controls how dbt connects to Postgres databases using SSL. You can see more details in the dbt documentation.
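The same fields in a dbt profile look like this (a minimal sketch; the host, credentials, and names below are placeholders):

```yaml
# profiles.yml — hypothetical Postgres target; all values are placeholders
my_project:
  target: prod
  outputs:
    prod:
      type: postgres
      host: db.example.com       # Host
      user: dbt_user             # User
      password: "<password>"     # Password
      dbname: analytics          # DB name
      schema: dbt                # Schema
      port: 5432                 # Port
      threads: 4                 # Threads
      keepalives_idle: 0         # Keep alive idle (seconds); 0 = system default
      search_path: public        # Search path (optional)
      sslmode: prefer            # SSL mode (optional)
```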


Redshift

You can see more details in the dbt documentation.

Host

This is the host where the database is running.

User

This is the database user name.

Password

This is the database user password.

DB name

This is the database name.

Schema

This is the schema name.

Port

This is the port where the database is running.

Threads

This is the number of threads dbt uses when running queries against the warehouse.

Keep alive idle (seconds)

This specifies the number of seconds with no network activity after which the operating system should send a TCP keepalive message to the client.

SSL mode

This controls how dbt connects to Postgres databases using SSL.
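As a dbt profile, this looks almost identical to the Postgres case, with the Redshift type and its default port (all values below are placeholders):

```yaml
# profiles.yml — hypothetical Redshift target; all values are placeholders
my_project:
  target: prod
  outputs:
    prod:
      type: redshift
      host: my-cluster.abc123.us-east-1.redshift.amazonaws.com  # Host
      user: dbt_user             # User
      password: "<password>"     # Password
      dbname: analytics          # DB name
      schema: dbt                # Schema
      port: 5439                 # Port (Redshift default)
      threads: 4                 # Threads
      keepalives_idle: 0         # Keep alive idle (seconds)
      sslmode: prefer            # SSL mode (optional)
```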


Snowflake

You can see more details in the dbt documentation.

Account

This is the account to connect to.

User

This is the database user name.

Password

This is the database user password.

Role

This is the role to assume when running queries as the specified user.

Database

This is the database name.

Warehouse

This is the warehouse name.

Schema

This is the schema name.

Threads

This is the number of threads dbt uses when running queries against the warehouse.

Keep client session alive

This is intended to keep Snowflake sessions alive beyond the typical 4 hour timeout limit. You can see more details in the dbt documentation.

Query tag

This is Snowflake's query tags parameter. You can see more details in the dbt documentation.
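Put together as a dbt profile, the Snowflake settings above would look something like this (account identifier, credentials, and names are placeholders):

```yaml
# profiles.yml — hypothetical Snowflake target; all values are placeholders
my_project:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: abc12345.us-east-1          # Account
      user: dbt_user                       # User
      password: "<password>"               # Password
      role: TRANSFORMER                    # Role
      database: ANALYTICS                  # Database
      warehouse: TRANSFORMING              # Warehouse
      schema: dbt                          # Schema
      threads: 4                           # Threads
      client_session_keep_alive: False     # Keep client session alive
      query_tag: lightdash                 # Query tag (optional)
```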


Databricks

The credentials needed to connect to your cluster can be found in the ODBC options in your Databricks account:

  1. Go to the Compute tab in the sidebar.
  2. Click the configuration tab for the cluster that you're connecting to Lightdash.
  3. Expand the Advanced options tab.
  4. Open the JDBC/ODBC tab.

Server hostname

Follow the instructions above to find your ODBC connection details.

HTTP Path

Follow the instructions above to find your ODBC connection details.

Port

Follow the instructions above to find your ODBC connection details.

Personal Access Token

Your personal access token can be found in your user settings in Databricks:

  1. Open Settings by clicking the cog ⚙️ in the sidebar and select User settings.
  2. Click Generate token. You'll be asked to enter a name and expiry.
  3. Copy the token.

Database

The default database name used by dbt for this connection. In Databricks/Spark, the database is also the schema.
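For reference, here's a sketch of the equivalent dbt profile, assuming the dbt-spark adapter with the ODBC connection method (the hostname, cluster ID, driver path, and token below are placeholders):

```yaml
# profiles.yml — hypothetical Databricks target via dbt-spark ODBC;
# all values are placeholders
my_project:
  target: prod
  outputs:
    prod:
      type: spark
      method: odbc
      driver: /opt/simba/spark/lib/64/libsparkodbc_sb64.so  # path to the Simba ODBC driver
      host: dbc-a1b2c3d4-e5f6.cloud.databricks.com  # Server hostname
      port: 443                                     # Port
      token: "<personal access token>"              # Personal Access Token
      cluster: 1234-567890-abcde123                 # cluster ID, from the HTTP Path
      schema: analytics                             # Database (== schema on Databricks)
```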