Fix EFS dynamic provision on EKS

Probably it’s an obvious thing for people with more experience, but I spent an evening trying to figure out what’s wrong.

I have an EKS configured with terraform module terraform-aws-eks and IRSA configured like this:

module "efs_csi_irsa_role" {
  source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
 
  role_name             = "efs-csi"
  attach_efs_csi_policy = true
 
  oidc_providers = {
    ex = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:efs-csi-controller-sa"]
    }
  }

At some point it started to work with static provisioning, but when I tried to use dynamic it stopped with the next errors in efs-csi-controller pod:

I1204 23:55:08.556870       1 controller.go:61] CreateVolume: called with args {Name:pvc-f725e33d-b1e5-44ff-a400-1f9ff8388296 CapacityRange:required_bytes:5368709120  VolumeCapabilities:[mount:<> access_mode: ] Parameters:map[basePath:/dynamic_provisioning csi.storage.k8s.io/pv/name:pvc-f725e33d-b1e5-44ff-a400-1f9ff8388296 csi.storage.k8s.io/pvc/name:efs-claim2 csi.storage.k8s.io/pvc/namespace:kva-prod directoryPerms:700 fileSystemId:fs-031e4372b15a36d5a gidRangeEnd:2000 gidRangeStart:1000 provisioningMode:efs-ap] Secrets:map[] VolumeContentSource: AccessibilityRequirements: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1204 23:55:08.556934       1 cloud.go:238] Calling DescribeFileSystems with input: {
  FileSystemId: "fs-031e4372b15a36d5a"
}
E1204 23:55:08.597320       1 driver.go:103] GRPC error: rpc error: code = Unauthenticated desc = Access Denied. Please ensure you have the right AWS permissions: Access denied

And here is what I missed, official documentation uses eksctl for IRSA:

eksctl create iamserviceaccount \
    --cluster my-cluster \
    --namespace kube-system \
    --name efs-csi-controller-sa \
    --attach-policy-arn arn:aws:iam::111122223333:policy/AmazonEKS_EFS_CSI_Driver_Policy \
    --approve \
    --region region-code

SA creation is disabled with helm:

helm upgrade -i aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
    --namespace kube-system \
    --set image.repository=602401143452.dkr.ecr.region-code.amazonaws.com/eks/aws-efs-csi-driver \
    --set controller.serviceAccount.create=false \
    --set controller.serviceAccount.name=efs-csi-controller-sa

So I missed service annotation. The thing which have helped me to figure out what’s wrong (no it wasn’t careful reading of the documentation) was CloudTrail:

    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "EKYQJEOBHPAS7L:i-deadbeede490d57b1",
        "arn": "arn:aws:sts::111122223333:assumed-role/default_node_group-eks-node-group-20220727213424437600000003/i-deadbeede490d57b1",
        "accountId": "111122223333",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "EKYQJEOBHPAS7L",
                "arn": "arn:aws:iam::111122223333:role/default_node_group-eks-node-group-20220727213424437600000003",
                "accountId": "111122223333",
                "userName": "default_node_group-eks-node-group-20220727213424437600000003"
            },
            "webIdFederationData": {},
            "attributes": {
                "creationDate": "2022-12-04T23:20:40Z",
                "mfaAuthenticated": "false"
            },
            "ec2RoleDelivery": "2.0"
        }
    },
    "errorMessage": "User: arn:aws:sts::111122223333:assumed-role/default_node_group-eks-node-group-20220727213424437600000003/i-deadbeede490d57b1 is not authorized to perform: elasticfilesystem:DescribeFileSystems on the specified resource",

Assuming role as a node differently not what I expected.

If I have been more thoughtful I may ask myself what comment “## Enable if EKS IAM for SA is used” was doing in aws-efs-csi-driver’s values.yaml but I hadn’t.
Evening spent, lesson learned.

PS

And  that update of service account doesn’t lead to magical appear of  AWS_WEB_IDENTITY_TOKEN_FILE env in container is a thing that worth to remember.

PPS

Looks like static provisioning will work even with broken IRSA for EFS, since NFS which is under the hood of EFS not be bothered by IAM existence in any sense.