Connect Amazon EMR to Digital Tap AI

⏱️ Estimated time: 5 minutes | 📋 Difficulty: Easy | 📅 Last updated: March 2026

1 Prerequisites

  • AWS credentials — Access Key + Secret Key, or IAM Role (recommended)
  • AWS Region where your EMR clusters run
  • Digital Tap AI account (sign up free)

2 Required IAM Permissions

Create an IAM policy with the permissions below. This is the minimum set Digital Tap AI needs to discover and optimize your EMR clusters:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DigitalTapEMRRead",
      "Effect": "Allow",
      "Action": [
        "elasticmapreduce:ListClusters",
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:ListInstances",
        "elasticmapreduce:ListInstanceGroups",
        "elasticmapreduce:ListSteps",
        "elasticmapreduce:DescribeStep",
        "elasticmapreduce:ListBootstrapActions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DigitalTapEMRManage",
      "Effect": "Allow",
      "Action": [
        "elasticmapreduce:SetTerminationProtection",
        "elasticmapreduce:TerminateJobFlows",
        "elasticmapreduce:ModifyInstanceGroups",
        "elasticmapreduce:PutAutoScalingPolicy"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DigitalTapCloudWatch",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DigitalTapCostExplorer",
      "Effect": "Allow",
      "Action": [
        "ce:GetCostAndUsage",
        "ce:GetCostForecast"
      ],
      "Resource": "*"
    }
  ]
}

💡 Least privilege: For monitor-only mode, remove the DigitalTapEMRManage statement. The agent will detect idle clusters and generate recommendations without taking action.
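
One way to create and attach the policy is with the AWS CLI. This is a minimal sketch, assuming the JSON above is saved to a local file and the IAM role already exists. The helper name create_digitaltap_policy and the file and role names in the usage line are hypothetical; the aws iam subcommands are standard.

```shell
# Hypothetical helper: create the policy from a local JSON file and attach it to a role.
# Usage: create_digitaltap_policy POLICY_NAME POLICY_FILE ROLE_NAME
create_digitaltap_policy() {
  # create-policy returns the new policy's ARN; --query extracts just that field.
  policy_arn="$(aws iam create-policy \
    --policy-name "$1" \
    --policy-document "file://$2" \
    --query 'Policy.Arn' --output text)" || return 1
  # Attach the policy to the role the agent will use.
  aws iam attach-role-policy \
    --role-name "$3" \
    --policy-arn "$policy_arn" >/dev/null || return 1
  printf '%s\n' "$policy_arn"
}
```

Example call: create_digitaltap_policy DigitalTapEMR digitaltap-emr-policy.json digitaltap-role. For monitor-only mode, delete the DigitalTapEMRManage statement from the file before running it.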

3 Install the Agent

Option A: Docker (with Access Keys)

docker run -d \
  --name digitaltap-agent \
  --restart unless-stopped \
  -e DT_API_KEY="your-digital-tap-api-key" \
  -e DT_PLATFORM="emr" \
  -e AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE" \
  -e AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLE" \
  -e AWS_DEFAULT_REGION="us-east-1" \
  ghcr.io/digital-tap/agent:latest
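
Passing secrets with -e leaves them in shell history and in docker inspect output. A possible alternative, sketched here under the assumption that the agent reads the same variables however Docker supplies them, is Docker's standard --env-file flag. The helper names and the file path are hypothetical.

```shell
# Hypothetical helper: write the agent's settings to a file only the owner can read.
# Usage: write_agent_env_file PATH API_KEY ACCESS_KEY_ID SECRET_ACCESS_KEY REGION
write_agent_env_file() {
  env_file="$1"
  # Create (or truncate) the file and restrict it before any secret is written.
  : > "$env_file" && chmod 600 "$env_file" || return 1
  {
    # Docker takes env-file values literally, so they are written without quotes.
    printf 'DT_API_KEY=%s\n' "$2"
    printf 'DT_PLATFORM=emr\n'
    printf 'AWS_ACCESS_KEY_ID=%s\n' "$3"
    printf 'AWS_SECRET_ACCESS_KEY=%s\n' "$4"
    printf 'AWS_DEFAULT_REGION=%s\n' "$5"
  } >> "$env_file"
}

# Start the agent with the same image and options as Option A, reading variables from the file.
# Usage: run_agent_with_env_file PATH
run_agent_with_env_file() {
  docker run -d \
    --name digitaltap-agent \
    --restart unless-stopped \
    --env-file "$1" \
    ghcr.io/digital-tap/agent:latest
}
```

Call write_agent_env_file once with your own values, then run_agent_with_env_file with the same path.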

Option B: Docker (with IAM Role — recommended for EC2)

# Attach the IAM role to your EC2 instance, then:
docker run -d \
  --name digitaltap-agent \
  --restart unless-stopped \
  -e DT_API_KEY="your-digital-tap-api-key" \
  -e DT_PLATFORM="emr" \
  -e AWS_DEFAULT_REGION="us-east-1" \
  ghcr.io/digital-tap/agent:latest

Option C: Helm (Kubernetes with IRSA)

helm repo add digitaltap https://charts.digitaltap.ai
helm repo update

helm install digitaltap-agent digitaltap/agent \
  --set apiKey="your-digital-tap-api-key" \
  --set platform="emr" \
  --set aws.region="us-east-1" \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:aws:iam::123456789012:role/digitaltap-role" \
  --namespace digitaltap --create-namespace

4 Verify Connection

  1. Open your Digital Tap AI dashboard
  2. Navigate to Integrations → Connected Platforms
  3. Your EMR clusters should appear within 3-5 minutes
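
If nothing appears in the dashboard, it can help to confirm locally that the agent container is up before digging further. The following is a minimal sketch for the Docker options, using only standard docker subcommands; the helper name check_agent_container is hypothetical, and what the agent writes to its logs is not documented here.

```shell
# Hypothetical helper: confirm the agent container is running, then show recent log lines.
# Usage: check_agent_container [CONTAINER_NAME]
check_agent_container() {
  name="${1:-digitaltap-agent}"
  # List running containers whose name matches; empty output means it is not running.
  running="$(docker ps --filter "name=$name" --filter "status=running" --format '{{.Names}}')"
  if [ -z "$running" ]; then
    echo "container $name is not running" >&2
    return 1
  fi
  docker logs --tail 20 "$name"
}
```

For the Helm install, the rough equivalent is kubectl get pods -n digitaltap.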

5 EMR-Specific Features

  • Idle Cluster Detection — Finds EMR clusters with no running steps and low HDFS/YARN utilization
  • Auto-Termination — Terminates truly idle transient clusters to stop the meter
  • Instance Group Optimization — Right-sizes core and task instance groups based on actual usage
  • Spot Fleet Management — Optimizes spot vs on-demand mix in task groups
  • Step Optimization — Analyzes Spark step performance and recommends config improvements
  • Bootstrap Action Audit — Flags slow or redundant bootstrap actions adding startup time
  • Cost Forecasting — Projects EMR spend by cluster, team, and workload type
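
This page does not state which signals the agent uses for idle detection. As a manual cross-check, EMR itself publishes an IsIdle metric to CloudWatch (namespace AWS/ElasticMapReduce, dimension JobFlowId), which can be read with the cloudwatch permissions granted above. A minimal sketch, with a hypothetical helper name and caller-supplied timestamps:

```shell
# Hypothetical helper: print the 5-minute averages of EMR's IsIdle metric for one cluster.
# A value of 1 means no jobs or tasks were running in that interval.
# Usage: emr_idle_datapoints CLUSTER_ID START_ISO8601 END_ISO8601
emr_idle_datapoints() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ElasticMapReduce \
    --metric-name IsIdle \
    --dimensions "Name=JobFlowId,Value=$1" \
    --start-time "$2" --end-time "$3" \
    --period 300 --statistics Average \
    --query 'Datapoints[].Average' --output text
}
```

Example call: emr_idle_datapoints j-EXAMPLE 2026-03-01T00:00:00Z 2026-03-01T01:00:00Z, where j-EXAMPLE stands in for a real cluster ID.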

6 Troubleshooting

No clusters found

  • Verify AWS_DEFAULT_REGION matches where your clusters run
  • For multi-region setups, set DT_AWS_REGIONS="us-east-1,us-west-2,eu-west-1"
  • Ensure IAM permissions include elasticmapreduce:ListClusters
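
The region and permission checks above can be combined into one quick test with the AWS CLI, run with the same credentials the agent uses. This is a sketch, not part of the agent; the helper name count_clusters_by_region is hypothetical.

```shell
# Hypothetical helper: print "REGION COUNT" of active EMR clusters for each listed region.
# Usage: count_clusters_by_region "us-east-1,us-west-2"
count_clusters_by_region() {
  # Split the comma-separated list the same way DT_AWS_REGIONS is written.
  for region in $(printf '%s' "$1" | tr ',' ' '); do
    # Fails here if the credentials lack elasticmapreduce:ListClusters.
    ids="$(aws emr list-clusters --active --region "$region" \
      --query 'Clusters[].Id' --output text)" || return 1
    # Text output separates cluster IDs with whitespace, so count the words.
    count="$(printf '%s' "$ids" | wc -w | tr -d ' ')"
    printf '%s %s\n' "$region" "$count"
  done
}
```

A region that reports 0 has no active clusters visible to these credentials; an access-denied error points at the IAM policy rather than the region setting.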

Authentication errors

  • Verify credentials: aws sts get-caller-identity
  • If using IAM roles, ensure the instance profile is attached
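
Both checks can be scripted. The sketch below assumes it runs on the host where the agent is installed; the second function applies only to EC2 and uses the instance metadata service (IMDSv2). The helper names are hypothetical.

```shell
# Print the ARN that the current credentials resolve to.
show_caller_arn() {
  aws sts get-caller-identity --query 'Arn' --output text
}

# On EC2, print the name of the IAM role attached through the instance profile.
# Fails (non-zero exit) if no role is attached or the metadata service is unreachable.
show_instance_role() {
  # IMDSv2 requires a short-lived session token before any metadata read.
  token="$(curl -s -f -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 1
  curl -s -f -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
}
```

If show_instance_role works on the host but the agent container still cannot authenticate, one thing worth checking is the instance's metadata hop limit, since a limit of 1 can block bridged containers from reaching the metadata service.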