Connect Databricks to Digital Tap AI
1 Prerequisites
Before connecting, you'll need:
- Databricks workspace URL (e.g., https://adb-1234567890.1.azuredatabricks.net or https://dbc-abc123.cloud.databricks.com)
- Personal Access Token (PAT) with cluster management permissions
- Digital Tap AI account — sign up free if you haven't
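A copied workspace URL often carries a trailing slash or a page path, which matters later because the host value is expected without a trailing slash. The sketch below shows one way to clean it up before use. The helper name `normalize_host` is hypothetical and not part of any Digital Tap or Databricks tooling.

```python
from urllib.parse import urlsplit


def normalize_host(url: str) -> str:
    """Return the workspace URL as https://<host>, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url.strip())
    # Require HTTPS and a non-empty host name.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https:// workspace URL, got {url!r}")
    # Reject anything beyond the bare host (page paths, query strings, fragments).
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        raise ValueError(f"workspace URL should not include a path or query: {url!r}")
    # Rebuild without the trailing slash.
    return f"https://{parts.netloc}"


if __name__ == "__main__":
    print(normalize_host("https://dbc-abc123.cloud.databricks.com/"))
```

Rejecting URLs with a path is a deliberately strict choice. Relax it if you would rather strip the path silently.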
Creating a Databricks Personal Access Token
- In your Databricks workspace, click your username (top-right) → User Settings
- Go to the Developer tab → Access tokens
- Click Generate new token
- Set a comment like "Digital Tap AI" and an expiration (90 days recommended)
- Copy the token — you won't see it again
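The same token can also be created programmatically. The sketch below assumes the Databricks Token API endpoint `POST /api/2.0/token/create`, which takes `comment` and `lifetime_seconds` and returns the new token in `token_value`. It also assumes you already hold a valid credential to authenticate the call, so it is most useful for rotation rather than first-time setup. The function names are hypothetical.

```python
import json
import urllib.request

SECONDS_PER_DAY = 86_400


def build_token_request(host, auth_token, comment="Digital Tap AI", days=90):
    """Build (but do not send) a request that creates a PAT valid for `days` days."""
    body = json.dumps(
        {"comment": comment, "lifetime_seconds": days * SECONDS_PER_DAY}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/token/create",
        data=body,
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_token(host, auth_token, **kwargs):
    """Send the request and return the new token value (only returned once)."""
    request = build_token_request(host, auth_token, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["token_value"]
```

Store the returned value in a secret manager right away, since it cannot be retrieved again afterwards.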
2 Required Permissions
The Databricks user or service principal associated with the token needs:
- Cluster management — create, edit, delete, restart clusters
- Job viewing — list and view job runs
- Workspace read — view notebook usage patterns
💡 Tip: For production, we recommend creating a dedicated service principal with scoped permissions rather than using a personal token.
3 Install the Agent
Option A: Docker
docker run -d \
--name digitaltap-agent \
--restart unless-stopped \
-e DT_API_KEY="your-digital-tap-api-key" \
-e DT_PLATFORM="databricks" \
-e DATABRICKS_HOST="https://your-workspace.cloud.databricks.com" \
-e DATABRICKS_TOKEN="dapi-xxxxxxxxxxxxx" \
ghcr.io/digital-tap/agent:latest
Option B: Helm (Kubernetes)
helm repo add digitaltap https://charts.digitaltap.ai
helm repo update
helm install digitaltap-agent digitaltap/agent \
--set apiKey="your-digital-tap-api-key" \
--set platform="databricks" \
--set databricks.host="https://your-workspace.cloud.databricks.com" \
--set databricks.token="dapi-xxxxxxxxxxxxx" \
--namespace digitaltap --create-namespace
Option C: Environment Variables
# .env file for the Digital Tap agent
DT_API_KEY=your-digital-tap-api-key
DT_PLATFORM=databricks
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi-xxxxxxxxxxxxx
# Optional: scan interval (default: 180 seconds)
DT_SCAN_INTERVAL=180
# Optional: enable dry-run mode first
DT_DRY_RUN=true
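Before starting the agent, it can help to sanity-check the .env file. The following is a minimal sketch that parses simple KEY=VALUE lines and flags common mistakes. Treating the four connection variables as required and accepting only true or false for DT_DRY_RUN are assumptions made for illustration. The agent's actual validation rules are not documented here.

```python
# Keys assumed to be mandatory, based on the examples in this guide.
REQUIRED_KEYS = ("DT_API_KEY", "DT_PLATFORM", "DATABRICKS_HOST", "DATABRICKS_TOKEN")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip().strip('"')
    return config


def validate_config(config):
    """Return a list of human-readable problems (empty when none are found)."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not config.get(key)]
    if config.get("DATABRICKS_HOST", "").endswith("/"):
        problems.append("DATABRICKS_HOST has a trailing slash")
    interval = config.get("DT_SCAN_INTERVAL", "180")  # documented default: 180 seconds
    if not interval.isdigit() or int(interval) <= 0:
        problems.append("DT_SCAN_INTERVAL must be a positive integer (seconds)")
    # Assumption: the flag takes the literal strings "true" or "false".
    if config.get("DT_DRY_RUN", "false").lower() not in ("true", "false"):
        problems.append("DT_DRY_RUN must be true or false")
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for problem in validate_config(parse_env(handle.read())):
            print("config problem:", problem)
```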
4 Verify Connection
- Open your Digital Tap AI dashboard
- Navigate to Integrations → Connected Platforms
- Your Databricks workspace should appear with a green status indicator
- Wait 3-5 minutes for the first scan to complete
💡 First scan: The agent will discover all clusters, analyze usage patterns, and generate initial savings recommendations within 5 minutes.
5 What Agents Do for Databricks
Once connected, Digital Tap AI activates these agents for your Databricks workspace:
- Idle Detection — Finds clusters with no active notebooks, jobs, or queries
- Auto-Hibernation — Hibernates idle clusters while preserving state for instant resume
- Right-Sizing — Detects over-provisioned clusters and recommends exact instance types
- Spot Optimization — Migrates fault-tolerant workloads to spot instances with automatic fallback
- Job Optimization — Analyzes recurring Databricks jobs and tunes Spark configurations
- Query Performance — Detects slow queries, bad joins, and full table scans
- Schedule Optimization — Learns usage patterns and creates optimal hibernate/wake schedules
- Cost Forecasting — Projects end-of-month Databricks spend with trend analysis
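To give a feel for what an end-of-month projection means, here is a deliberately naive run-rate calculation. It is an illustration only and is not the forecasting model Digital Tap AI uses, which is not described in this guide. The function name is hypothetical.

```python
import calendar
from datetime import date


def project_month_end(spend_to_date, today):
    """Extrapolate month-end spend from the average daily spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = spend_to_date / today.day  # average spend per elapsed day
    return daily_rate * days_in_month


if __name__ == "__main__":
    # 300 spent in the first 10 days of a 30-day month.
    print(project_month_end(300.0, date(2024, 4, 10)))
```

A run-rate estimate ignores weekday and weekend patterns and recent trend changes, which is the gap that trend analysis is meant to close.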
6 Troubleshooting
Agent shows "Disconnected"
- Verify your DATABRICKS_HOST URL is correct (no trailing slash)
- Ensure the PAT hasn't expired
- Check network connectivity:
curl -H "Authorization: Bearer $DATABRICKS_TOKEN" "$DATABRICKS_HOST/api/2.0/clusters/list"
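The same connectivity check can be scripted so that the failure mode is spelled out. This is a sketch using only the Python standard library against the clusters list endpoint. The status-to-hint mapping follows generic HTTP semantics and is an assumption, so the exact codes Databricks returns for a given failure may differ. Function names are hypothetical.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a troubleshooting hint (generic HTTP semantics)."""
    if status == 200:
        return "ok: host reachable and token accepted"
    if status in (401, 403):
        return "auth failed: check that the token is valid, unexpired, and has the required permissions"
    if status == 404:
        return "not found: check DATABRICKS_HOST for typos or extra path segments"
    return f"unexpected HTTP status {status}"


def check_connection(host, token, timeout=15):
    """Call the clusters list endpoint and return a hint string."""
    request = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:  # 4xx/5xx responses; must precede URLError
        return classify_status(err.code)
    except urllib.error.URLError as err:  # DNS, TLS, or routing failures
        return f"network error: {err.reason}"
```

Call it as `check_connection(os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"])` after importing `os`, from the same network location where the agent runs.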
No clusters discovered
- Confirm the token user has Can Manage or Can Restart permissions on clusters
- Wait a full scan cycle (3 minutes by default, or your DT_SCAN_INTERVAL value) after connecting
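To confirm which permission level a principal holds on a cluster, one option is to inspect the response of the Databricks Permissions API (`GET /api/2.0/permissions/clusters/{cluster_id}`). The sketch below assumes that response contains an `access_control_list` whose entries carry `user_name`, `group_name`, or `service_principal_name` plus an `all_permissions` list. It only checks direct entries and does not resolve group membership. Function names are hypothetical.

```python
# Levels named in the checklist above.
SUFFICIENT_LEVELS = {"CAN_MANAGE", "CAN_RESTART"}


def principal_levels(acl_response, principal):
    """Collect permission levels granted directly to `principal`."""
    levels = set()
    for entry in acl_response.get("access_control_list", []):
        names = {
            entry.get("user_name"),
            entry.get("group_name"),
            entry.get("service_principal_name"),
        } - {None}
        if principal in names:
            levels.update(
                perm["permission_level"] for perm in entry.get("all_permissions", [])
            )
    return levels


def has_cluster_access(acl_response, principal):
    """True when the principal holds at least one sufficient level."""
    return bool(principal_levels(acl_response, principal) & SUFFICIENT_LEVELS)


if __name__ == "__main__":
    sample = {
        "access_control_list": [
            {
                "user_name": "agent@example.com",
                "all_permissions": [{"permission_level": "CAN_RESTART", "inherited": False}],
            }
        ]
    }
    print(has_cluster_access(sample, "agent@example.com"))
```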
Permission errors
- The token needs cluster management permissions — admin or workspace-level token recommended
- For Unity Catalog workspaces, ensure the service principal is added to the workspace
⚠️ Dry-run first: We recommend starting with DT_DRY_RUN=true to preview what the agents would do before enabling automated actions.