
Automating GCP IAM with Terraform: Custom Roles and Workload Identity

Managing hierarchical custom IAM roles, service account bindings, and Workload Identity for a multi-environment AI platform — with a Terraform pipeline that requires manual approval before applying.

IAM at Scale: The Challenge

A production AI platform can span multiple GCP projects (dev, UAT, prod) with dozens of service accounts, each needing precisely scoped permissions. Some services need Cloud SQL access, others need Pub/Sub publishing, and a Knowledge Base service might need Storage, Document AI, Vertex AI, and Secret Manager, but never more than what's required. Managing this by hand in the GCP console invites drift and audit failures. Defining IAM in Terraform modules keeps every grant versioned, reviewable, and reproducible across environments.
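
As a minimal sketch of what "precisely scoped" can look like in Terraform, the block below defines a project-level custom role and binds it to one service account. The role ID, account ID, and permission list are illustrative assumptions, not the platform's actual configuration; a real Knowledge Base service would need its own audited permission set.

```hcl
# Hypothetical variable; one value per environment (dev, UAT, prod).
variable "project_id" {
  type        = string
  description = "Target GCP project for this environment."
}

# Custom role holding only the permissions one service needs.
# The role_id and permissions below are illustrative assumptions.
resource "google_project_iam_custom_role" "kb_reader" {
  project     = var.project_id
  role_id     = "knowledgeBaseReader"
  title       = "Knowledge Base Reader"
  description = "Read-only access for the Knowledge Base service."
  permissions = [
    "storage.objects.get",
    "storage.objects.list",
    "secretmanager.versions.access",
  ]
}

# Dedicated service account for the service (hypothetical name).
resource "google_service_account" "kb" {
  project      = var.project_id
  account_id   = "knowledge-base"
  display_name = "Knowledge Base service"
}

# Non-authoritative binding: adds this one member to the role
# without touching other members that already hold it.
resource "google_project_iam_member" "kb_custom_role" {
  project = var.project_id
  role    = google_project_iam_custom_role.kb_reader.id
  member  = "serviceAccount:${google_service_account.kb.email}"
}
```

Using `google_project_iam_member` rather than the authoritative `google_project_iam_binding` is a design choice in this sketch: it avoids one module accidentally removing grants managed elsewhere.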

Terraform IAM Pipeline with Approval Gate

IAM changes flow best through a dedicated Jenkins pipeline triggered by PR merges to an IAM Terraform repository. The pipeline runs terraform plan and then pauses for manual approval in the Jenkins UI before executing terraform apply. This two-step process ensures no IAM change reaches production without human review — critical in regulated banking and enterprise environments where security audits are mandatory.

IAM Terraform Pipeline
sequenceDiagram
    participant Dev as Developer
    participant Repo as IAM Terraform Repo
    participant Jenkins as Jenkins Pipeline
    participant GCP as GCP IAM
    Dev->>Repo: Create PR (update .tfvars)
    Dev->>Repo: Merge PR
    Repo->>Jenkins: Webhook Trigger
    Jenkins->>Jenkins: git diff (detect env)
    Jenkins->>GCP: terraform plan
    Jenkins-->>Dev: Request Approval
    Dev-->>Jenkins: Approve
    Jenkins->>GCP: terraform apply
Workload Identity: No More JSON Keys

Workload Identity lets Kubernetes Service Accounts (KSAs) act as Google Service Accounts (GSAs) without managing JSON key files. The Terraform grants roles/iam.workloadIdentityUser on the GSA to a KSA in a specific namespace; the KSA must also carry the iam.gke.io/gcp-service-account annotation naming that GSA. For Cloud Run, the service account is specified directly in the service.yaml. Either way, there are no long-lived keys to rotate or leak.

hcl
# Workload Identity binding — KSA acts as GSA (no JSON keys needed)
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = "projects/${var.project_id}/serviceAccounts/${var.gsa_email}"
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[${var.namespace}/${var.ksa_name}]"
}

# Cloud Run — service account specified declaratively in service.yaml
# serviceAccountName: my-service@my-project.iam.gserviceaccount.com
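
The IAM binding above covers the Google side of Workload Identity. On the Kubernetes side, the KSA needs an annotation pointing at the GSA. The sketch below assumes the cluster's service accounts are also managed in Terraform through the Kubernetes provider, which the post does not confirm; the resource name `app` is hypothetical, and the variables mirror the ones used in the binding.

```hcl
# Variables mirroring the binding above (declarations are assumed).
variable "gsa_email" { type = string }
variable "namespace" { type = string }
variable "ksa_name" { type = string }

# Requires a configured "kubernetes" provider pointing at the GKE cluster.
resource "kubernetes_service_account" "app" {
  metadata {
    name      = var.ksa_name
    namespace = var.namespace

    # Tells GKE which Google Service Account this KSA impersonates.
    annotations = {
      "iam.gke.io/gcp-service-account" = var.gsa_email
    }
  }
}
```

If the KSA is created by a Helm chart or a manifest instead, the same annotation applies there and this resource is unnecessary.
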
Lessons Learned: Custom Role Permission Limits

A major gotcha with GCP custom roles is that certain permissions (such as those for Vertex AI model deployment or specific BigQuery analytical functions) cannot be added to custom roles. Including one causes terraform apply to fail when it creates or updates the role. When this happens, bind the service account to a predefined role that contains the permission instead, and narrow that binding, either by granting it on the individual resource or by attaching an IAM Condition, to preserve the principle of least privilege.
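
One way to sketch the fallback described above is a predefined role bound with an IAM Condition. Cloud Storage is used here purely for illustration because its resource-name format is well documented; the bucket variable, condition title, and choice of role are assumptions, and the same pattern would need a service-specific expression for Vertex AI or BigQuery.

```hcl
# Hypothetical inputs for the illustration.
variable "project_id" { type = string }
variable "gsa_email" { type = string }
variable "bucket_name" { type = string }

# Predefined role, restricted by condition to a single bucket
# and the objects inside it.
resource "google_project_iam_member" "kb_bucket_reader" {
  project = var.project_id
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:${var.gsa_email}"

  condition {
    title       = "only-kb-bucket"
    description = "Limit the predefined role to one bucket."
    expression  = "resource.name == \"projects/_/buckets/${var.bucket_name}\" || resource.name.startsWith(\"projects/_/buckets/${var.bucket_name}/objects/\")"
  }
}
```

Where the resource type has its own IAM resource in the provider (for example `google_storage_bucket_iam_member`), granting the role directly on the resource is often simpler than a project-level binding with a condition.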
