EMR SaaS Installation
Summary
This guide shows how to grant DataFlint read-only access to Amazon EMR metadata via cross-account IAM role assumption. It applies to EMR on EC2, EMR Serverless, and EMR on EKS (which IAM calls "EMR Containers").
For the broader SaaS threat model and stability notes, see SaaS Security & Stability.
You will:
Create a dedicated IAM role in your AWS account.
Add a trust policy that allows the DataFlint service role to assume it.
Attach a minimal read-only policy for EMR / EMR Containers APIs.
Share the role ARN + regions with DataFlint.
The entire process should take a few minutes.
Repeat the process for each AWS account you want DataFlint to read from.
Cross-account roles must use an External ID. Ask DataFlint for your CUSTOMER_EXTERNAL_ID and the DataFlint AWS account details.
What DataFlint needs
Send DataFlint:
Role ARN you created (one per AWS account).
Region(s) where you run EMR (and/or EMR on EKS / EMR Containers).
The role name (optional, helps troubleshooting).
How it works
DataFlint assumes the role you create and calls read-only EMR APIs to:
Discover clusters / virtual clusters.
List and describe job runs and steps.
Fetch application UI links when applicable (read-only).
AWS classifies a few EMR "UI helper" APIs as write actions (for example, elasticmapreduce:CreatePersistentAppUI). DataFlint uses them only to generate read-only UI access links.
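To make the access pattern concrete, here is a minimal sketch of a cross-account read using the AWS CLI. It only illustrates the mechanism described above: the function name and session name are hypothetical, and the calls would have to run as the trusted principal, so this is not something you run yourself and not necessarily how DataFlint implements it.

```shell
# Illustrative sketch of the assume-role-then-read pattern.
# assume_role_and_list_clusters is a hypothetical helper name.
assume_role_and_list_clusters() {
  local role_arn="$1" external_id="$2" region="$3" creds

  # Exchange the caller's credentials for temporary role credentials.
  # The External ID must match the one in the role's trust policy.
  creds="$(aws sts assume-role \
    --role-arn "$role_arn" \
    --role-session-name "emr-metadata-read" \
    --external-id "$external_id" \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text)" || return 1

  # Split the tab-separated credentials into positional parameters.
  set -- $creds

  # Use the temporary credentials for a single read-only EMR call.
  AWS_ACCESS_KEY_ID="$1" AWS_SECRET_ACCESS_KEY="$2" AWS_SESSION_TOKEN="$3" \
    aws emr list-clusters --region "$region" --query 'Clusters[].Id' --output text
}
```

The temporary credentials are scoped to the role's attached policy, so the caller can do nothing in your account beyond what that policy allows.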
Required IAM permissions (minimal)
Use a dedicated policy attached to the role. This is the minimal set we currently require for EMR + EMR Containers read access:
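As a rough sketch of what such a policy might look like, the heredoc below writes a policy document to dataflint-emr-permissions.json (a hypothetical file name). The action list is an assumption inferred from the capabilities listed under How it works. It is not DataFlint's published policy, so confirm the authoritative list with DataFlint before use.

```shell
# Illustrative policy only: the action list is an assumption, not the
# official DataFlint policy. The output file name is hypothetical.
cat > dataflint-emr-permissions.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EmrRead",
      "Effect": "Allow",
      "Action": [
        "elasticmapreduce:ListClusters",
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:ListSteps",
        "elasticmapreduce:DescribeStep",
        "elasticmapreduce:CreatePersistentAppUI",
        "elasticmapreduce:DescribePersistentAppUI",
        "elasticmapreduce:GetPersistentAppUIPresignedURL"
      ],
      "Resource": "*"
    },
    {
      "Sid": "EmrContainersRead",
      "Effect": "Allow",
      "Action": [
        "emr-containers:ListVirtualClusters",
        "emr-containers:DescribeVirtualCluster",
        "emr-containers:ListJobRuns",
        "emr-containers:DescribeJobRun"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```

The JSON between the EOF markers can be pasted into the console's JSON editor, or the file can be passed to the AWS CLI.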
If you want to scope down further (by region, tags, or resource ARNs), tell us your constraints and we'll help tighten it.
Installation
Pick one method. All methods create the same resources.
Step 1: Create the IAM policy
Open IAM → Policies.
Click Create policy.
Choose JSON.
Paste the minimal policy from Required IAM permissions.
Click Next.
Policy name: DataflintEmrContainersReadOnly
Create the policy.
Step 2: Create the IAM role
Open IAM → Roles.
Click Create role.
Trusted entity type: AWS account.
Select Another AWS account.
Account ID: DATAFLINT_ACCOUNT_ID (get it from DataFlint).
Enable Require external ID.
External ID: CUSTOMER_EXTERNAL_ID (get it from DataFlint).
In Add permissions, attach DataflintEmrContainersReadOnly.
Role name: dataflint-emr-read-only-role
Create the role.
Step 3: Update the trust policy (Principal role)
In some accounts, the IAM console creates the trust policy with the DataFlint account's root principal. We require the DataFlint service role as the principal instead.
Open the new role.
Go to Trust relationships → Edit trust policy.
Use this trust policy (replace placeholders):
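A minimal sketch of such a trust policy follows, written as a heredoc that saves it to dataflint-trust-policy.json (a hypothetical file name). DATAFLINT_ACCOUNT_ID and CUSTOMER_EXTERNAL_ID are the placeholders used in this guide. DATAFLINT_SERVICE_ROLE_NAME is an assumed placeholder for the service role name, which you would also need to get from DataFlint.

```shell
# Sketch of a cross-account trust policy with an External ID condition.
# DATAFLINT_SERVICE_ROLE_NAME is an assumed placeholder; replace all three
# placeholders with the values DataFlint gives you.
cat > dataflint-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::DATAFLINT_ACCOUNT_ID:role/DATAFLINT_SERVICE_ROLE_NAME"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "CUSTOMER_EXTERNAL_ID"
        }
      }
    }
  ]
}
EOF
```

Naming a specific role as the principal, rather than the account root, limits who in the trusted account can assume your role. The External ID condition ensures the role can only be assumed on your behalf.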
Step 4: Copy the role ARN
Open the role summary and copy the ARN. You'll share it with DataFlint.
Prerequisites
AWS CLI v2 installed
Logged in to the target AWS account
Create the trust policy file
Create the permissions policy file
Create role + policy and attach
Replace placeholders before running:
DATAFLINT_ACCOUNT_ID
CUSTOMER_EXTERNAL_ID
If your organization mandates a permissions boundary or specific tags, add them to the create-role call (--permissions-boundary, --tags).
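The create-and-attach sequence could be sketched as the function below. The role and policy names come from this guide. The function name is hypothetical, and the two arguments are paths to the trust policy and permissions policy files you prepared, whatever you named them. Wrapping the calls in a function keeps anything from running until the placeholders are filled in.

```shell
# Names taken from this guide.
ROLE_NAME="dataflint-emr-read-only-role"
POLICY_NAME="DataflintEmrContainersReadOnly"

# create_dataflint_role is a hypothetical helper name.
# Usage: create_dataflint_role <trust-policy-file> <permissions-policy-file>
create_dataflint_role() {
  local trust_file="$1" permissions_file="$2" policy_arn

  # Create the customer-managed policy and capture its ARN.
  policy_arn="$(aws iam create-policy \
    --policy-name "$POLICY_NAME" \
    --policy-document "file://${permissions_file}" \
    --query 'Policy.Arn' --output text)" || return 1

  # Create the role with the cross-account trust policy.
  aws iam create-role \
    --role-name "$ROLE_NAME" \
    --assume-role-policy-document "file://${trust_file}" || return 1

  # Attach the read-only policy to the role.
  aws iam attach-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-arn "$policy_arn" || return 1

  # Print the role ARN to share with DataFlint.
  aws iam get-role --role-name "$ROLE_NAME" --query 'Role.Arn' --output text
}
```

After loading the function, call it with your two file paths. The last line it prints is the role ARN.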
Validate the setup (recommended)
You can validate that the role exists and has the expected policies attached.
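One way to run this check with the AWS CLI is sketched below. The function name is hypothetical, and the default role name is the one suggested in this guide.

```shell
# check_dataflint_role is a hypothetical helper name.
# Usage: check_dataflint_role [role-name]
check_dataflint_role() {
  local role_name="${1:-dataflint-emr-read-only-role}"

  # Show the trust policy so you can confirm the principal and External ID.
  aws iam get-role \
    --role-name "$role_name" \
    --query 'Role.AssumeRolePolicyDocument' \
    --output json || return 1

  # List the managed policies attached to the role.
  aws iam list-attached-role-policies \
    --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyName' \
    --output text
}
```

The first command should show the DataFlint service role as the principal with an sts:ExternalId condition. The second should list DataflintEmrContainersReadOnly.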
You can also validate permissions using IAM simulation:
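A simulation could be run along these lines. The function name is hypothetical, and the actions you pass should be the ones in the policy you actually attached. Your own identity needs iam:SimulatePrincipalPolicy to run it.

```shell
# simulate_dataflint_role is a hypothetical helper name.
# Usage: simulate_dataflint_role <role-arn> <action> [<action> ...]
simulate_dataflint_role() {
  local role_arn="$1"
  shift

  # Evaluate each action against the policies attached to the role and
  # print one "action decision" pair per line.
  aws iam simulate-principal-policy \
    --policy-source-arn "$role_arn" \
    --action-names "$@" \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text
}
```

For example, passing the role ARN followed by elasticmapreduce:ListClusters and emr-containers:ListVirtualClusters should report a decision of allowed for each action the policy covers.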
Send the details to DataFlint
Share over your approved secure channel:
Role ARN
Regions for EMR / EMR Containers