
Integrate Databricks with Cascade

Sync Cloud Data with Cascade Metrics and Measures

Overview

The Databricks integration allows you to automatically sync KPI and performance data from Databricks into Cascade, ensuring your strategy dashboards always reflect the latest data from your data warehouse.

This integration is designed to be:

  • Read-only — no changes are made to your Databricks data
  • Stable — built on curated views with controlled schemas
  • Secure — uses least-privilege access and governed data access controls
  • Automated — eliminates manual exports or spreadsheet uploads

What This Integration Supports

Typical use cases include:

  • Updating Cascade Metrics from warehouse-driven KPIs (e.g., Revenue, OEE, FPY, On-time Delivery)
  • Syncing time-series data (daily, weekly, monthly performance)
  • Connecting Cascade directly to your source-of-truth analytics layer in Databricks
  • Replacing manual reporting workflows with automated data syncs

Recommended Data Approach

Use a Curated View (Strongly Recommended). This approach ensures:

  • Business logic remains controlled within Databricks
  • Schema changes do not break the integration
  • Only approved data is exposed to Cascade
  • Integration maintenance is minimized

Authentication Setup

Option A — Personal Access Token (POC Only)

For quick proofs of concept, a Databricks Personal Access Token (PAT) can be used. PATs are user-scoped and not recommended for long-term production integrations.

Refer to the following link to set up a Personal Access Token:
https://docs.databricks.com/aws/en/dev-tools/auth/pat

Customer steps

  1. In your Databricks workspace, click your username in the top bar and select Settings.
  2. Click Developer.
  3. Next to Access tokens, click Manage.
  4. Click Generate new token and enter a comment that helps you identify this token.
  5. Set the token's lifetime in days (use the maximum allowed lifetime).
  6. Click Generate and then click Done.

What the customer provides to Cascade

  • A Databricks PAT
  • Confirmation of warehouse and dataset access
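For a quick connectivity check, the PAT is sent as a bearer token on Databricks REST API calls such as the SQL Statement Execution API. The sketch below only assembles the request; the workspace URL, warehouse ID, token, and query are placeholders, and actually sending the request (e.g., with `requests.post`) is left out.

```python
# Sketch: how a PAT authenticates a call to the Databricks SQL Statement
# Execution API (POST /api/2.0/sql/statements). All identifiers are
# placeholders; this builds the request but does not send it.

def build_statement_request(workspace_url: str, warehouse_id: str,
                            pat: str, sql: str) -> dict:
    """Assemble the URL, headers, and body for a statement submission."""
    return {
        "url": f"{workspace_url}/api/2.0/sql/statements",
        "headers": {
            "Authorization": f"Bearer {pat}",  # the PAT is sent as a bearer token
            "Content-Type": "application/json",
        },
        "body": {
            "warehouse_id": warehouse_id,
            "statement": sql,
            "wait_timeout": "30s",  # wait synchronously up to 30 seconds
        },
    }
```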

Option B — OAuth (Recommended for Production)

Databricks supports machine-to-machine OAuth using a service principal.

Refer to the following link to authorize service principal access to Databricks with OAuth: https://docs.databricks.com/aws/en/dev-tools/auth/oauth-m2m

Refer to the following link to add a Workspace Level Service Principal: https://docs.databricks.com/aws/en/admin/users-groups/manage-service-principals

Customer steps (Databricks Admin)

  1. Create or identify a workspace service principal
  2. Grant the service principal:
    • Permission to use the target SQL Warehouse (see section 1 below)
    • Read access to the required catalogs, schemas, and tables/views (see section 2 below)
  3. Enable OAuth access for the service principal

What the customer provides to Cascade

  • OAuth credentials for the Databricks service principal
  • Confirmation of the SQL Warehouse and datasets the service principal can access
  • Databricks Workspace URL

Cascade uses the provided credentials, together with the Databricks workspace URL, to make an API call that retrieves a bearer token.
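The token retrieval follows the standard OAuth 2.0 client-credentials flow. A minimal sketch, assuming the `/oidc/v1/token` endpoint from the Databricks OAuth M2M documentation and placeholder credentials (the request is built but not sent):

```python
import base64

# Sketch of the machine-to-machine OAuth exchange: the service principal's
# client ID and secret are exchanged for a short-lived bearer token.
# Endpoint path per the Databricks OAuth M2M documentation; the
# credentials used here are placeholders.

def build_token_request(workspace_url: str, client_id: str,
                        client_secret: str) -> dict:
    """Assemble the client-credentials request for /oidc/v1/token."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return {
        "url": f"{workspace_url}/oidc/v1/token",
        "headers": {"Authorization": f"Basic {basic}"},
        "data": {
            "grant_type": "client_credentials",
            "scope": "all-apis",  # request access to the Databricks REST APIs
        },
    }
```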


1) SQL Warehouse Access

All queries are executed against a Databricks SQL Warehouse.

Cascade can automatically identify available warehouses once access is granted.

Customer steps

  • Grant the service principal CAN_USE access to at least one SQL Warehouse
  • (Optional) Specify a preferred warehouse if multiple are available

What the customer provides

  • Confirmation that warehouse access is configured
  • (Optional) Preferred SQL Warehouse name

2) Data Access (Unity Catalog Permissions)

Access to the SQL Warehouse alone is not sufficient.
The service principal must also be granted read access to the underlying data.

Required permissions

  • USE CATALOG
  • USE SCHEMA
  • SELECT on the required tables or views

Example

GRANT USE CATALOG ON CATALOG main TO <service_principal>;
GRANT USE SCHEMA ON SCHEMA main.reporting TO <service_principal>;
GRANT SELECT ON VIEW main.reporting.cascade_kpis TO <service_principal>;


3) Define the Data to Sync

Cascade retrieves data using SQL queries defined by the customer.

Create Curated Views in Databricks (Recommended)

Customers should create views that return only the data required for Cascade.

Customer steps

  1. Open Databricks SQL
  2. Navigate to a reporting or analytics schema
  3. Create a view with the required output

Note: Cascade executes the SQL query exactly as defined and does not automatically add WHERE clauses, date filters, or offsets.

It is highly recommended to add incremental behavior (a WHERE clause with a date restriction), implemented directly in the SQL view or query provided to Cascade.
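Because Cascade runs the SQL verbatim, a rolling date window must live in the view or query itself. A sketch of rendering such a query (the view name, column names, and lookback window are illustrative):

```python
# Sketch: rendering an incremental query with a rolling date window.
# Cascade does not inject WHERE clauses, so the restriction must be part
# of the SQL itself. View and column names are placeholders.

def incremental_query(view: str, date_col: str = "metric_date",
                      lookback_days: int = 7) -> str:
    """Build a SELECT restricted to the last `lookback_days` days."""
    return (
        f"SELECT metric_name, metric_value, {date_col} "
        f"FROM {view} "
        f"WHERE {date_col} >= current_date() - INTERVAL {lookback_days} DAYS"
    )
```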

Grant Cascade read-only access to the view

Once the view is created, grant the service principal SELECT access. Cascade will only be able to read from the explicitly shared view.

What the Customer Provides to Cascade

  • Catalog, Schema and View name(s)
  • Confirmation that the service principal has SELECT access
  • Expected refresh cadence (daily, hourly, etc.)

4) How Data is Retrieved

Once access is configured, Cascade automatically retrieves data from Databricks on a scheduled basis.

  • Cascade runs a query against your Databricks view
  • Databricks executes the query in a SQL Warehouse
  • Cascade retrieves the results via API
  • The data is mapped and synced into Cascade Metrics

This process is fully automated and requires no manual intervention.
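The retrieval flow above can be sketched on the response side: the Statement Execution API returns column metadata under `manifest.schema.columns` and row values under `result.data_array`, which a client can zip into row dictionaries. The sample response is trimmed to the relevant fields.

```python
# Sketch: turning a Statement Execution API response into row dicts.
# Only the fields used here are shown; real responses also include
# status, chunk info, and more.

def rows_from_statement_result(response: dict) -> list[dict]:
    """Zip column names from the manifest with each row of data_array."""
    cols = [c["name"] for c in response["manifest"]["schema"]["columns"]]
    return [dict(zip(cols, row)) for row in response["result"]["data_array"]]
```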


5) Mapping Databricks Data to Cascade

Recommended Output Schema

To map cleanly into Cascade, your view should return:

Field          Type     Description
metric_name    string   Unique metric name (e.g., revenue, oee)
metric_value   number   Value to sync
metric_date    date     Date for the time-series data point

Mapping Logic

  • metric_name → Cascade Metric

  • metric_date + metric_value → Data point update
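The mapping logic above can be sketched as grouping rows by metric name into lists of (date, value) data points. The function name is illustrative; the field names follow the recommended output schema.

```python
# Sketch: grouping rows from the curated view into per-metric data
# points. Field names follow the recommended output schema.

def map_rows_to_updates(rows: list[dict]) -> dict[str, list[tuple]]:
    """Group (metric_date, metric_value) pairs under each metric_name."""
    updates: dict[str, list[tuple]] = {}
    for row in rows:
        updates.setdefault(row["metric_name"], []).append(
            (row["metric_date"], float(row["metric_value"]))
        )
    return updates
```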


6) Common Troubleshooting Scenarios

Permission errors

  • Confirm service principal has:
    • CAN_USE on SQL Warehouse
    • SELECT on views
    • USE CATALOG and USE SCHEMA

Query failures

  • Confirm view exists and is accessible
  • Confirm schema and catalog are correctly referenced

Data issues

  • Ensure numeric fields are properly formatted
  • Avoid schema changes without coordination

Summary

The Databricks integration provides a secure, scalable, and automated way to connect your data warehouse to Cascade.

By using curated views and controlled access, customers can ensure:

  • Reliable data syncing
  • Minimal maintenance
  • Full control over business logic