Set up EFS filesystem and CSI driver on EKS cluster

The application relies on network storage to temporarily store data while processing jobs. The instructions below explain how to set up AWS's flavor of NFS, EFS, within the cluster. To get started, set up a few variables from the command line. These instructions are adapted from the EFS CSI Driver documentation.

Tip

We're using the AWS CLI with named profiles. If you aren't using profiles, you can leave off the --profile=$profile part of each command.
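
Alternatively, the AWS CLI also reads the AWS_PROFILE environment variable, so you can export it once and drop --profile from the commands below:

$ export AWS_PROFILE=YOUR-AWS-NAMED-PROFILE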

$ cluster_name=YOUR-CLUSTER
$ region=YOUR-CLUSTER-REGION
$ profile=YOUR-AWS-NAMED-PROFILE

Use the variables and the AWS CLI to get the cluster's VPC ID

$ vpc_id=$(aws eks describe-cluster \
    --name $cluster_name \
    --query "cluster.resourcesVpcConfig.vpcId" \
    --output text \
    --region=$region \
    --profile=$profile)

Tip

You can echo these variables if you're curious what was returned here e.g. echo $vpc_id should show a VPC ID for your cluster.

Set up a network security group that allows NFS traffic from the cluster

  1. Get CIDR range for the VPC

    $ cidr_range=$(aws ec2 describe-vpcs \
        --vpc-ids $vpc_id \
        --query "Vpcs[].CidrBlock" \
        --output text \
        --region $region \
        --profile $profile)
    
  2. Create a security group

    $ security_group_id=$(aws ec2 create-security-group \
        --group-name EFS-EKS-Access \
        --description "EFS Access for EKS" \
        --vpc-id $vpc_id \
        --output text \
        --region $region \
        --profile $profile)
    
  3. Allow NFS traffic from the VPC CIDR range

    $ aws ec2 authorize-security-group-ingress \
        --group-id $security_group_id \
        --protocol tcp \
        --port 2049 \
        --cidr $cidr_range \
        --region $region \
        --profile $profile
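
    Optionally, you can sanity-check that the rule landed by describing the security group:

    $ aws ec2 describe-security-groups \
        --group-ids $security_group_id \
        --query "SecurityGroups[].IpPermissions" \
        --output json \
        --region $region \
        --profile $profile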
    

Determine which subnets the cluster belongs to

  1. Get the subnets in the VPC

    $ aws ec2 describe-subnets \
        --filters "Name=vpc-id,Values=$vpc_id" \
        --query 'Subnets[*].{SubnetId: SubnetId,AvailabilityZone: AvailabilityZone,CidrBlock: CidrBlock}' \
        --output table \
        --profile=$profile \
        --region=$region
    

    Example output:

    -------------------------------------------------------------------
    |                         DescribeSubnets                         |
    +------------------+-----------------+----------------------------+
    | AvailabilityZone |    CidrBlock    |         SubnetId           |
    +------------------+-----------------+----------------------------+
    |  us-west-1a      |  172.16.4.0/24  |  subnet-123456789041ec429  |
    |  us-west-1a      |  172.16.6.0/24  |  subnet-1234567890810e520  |
    |  us-west-1c      |  172.16.2.0/24  |  subnet-1234567890fa0f076  |
    |  us-west-1c      |  172.16.5.0/24  |  subnet-12345678900f6af4a  |
    +------------------+-----------------+----------------------------+
    
  2. Get node IP addresses

    The name of each node contains its IP address.

    $ kubectl get nodes
    NAME                                         STATUS   ROLES    AGE   VERSION
    ip-172-16-2-100.us-west-1.compute.internal   Ready    <none>   60d   v1.xx.xx-eks-1234
    ip-172-16-6-35.us-west-1.compute.internal    Ready    <none>   60d   v1.xx.xx-eks-1234
    

    In the above example, the two nodes fall within the ranges 172.16.2.0/24 and 172.16.6.0/24 respectively. This means the subnets in use are:

    • subnet-1234567890fa0f076
    • subnet-1234567890810e520

    We'll use these next when creating mount targets in our EFS filesystem.
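
    If you'd rather read the IPs directly than parse them out of the node names, a jsonpath query along these lines prints each node's name and internal IP:

    $ kubectl get nodes \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'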

Create Filesystem

  1. Create a filesystem in EFS.

    $ file_system_id=$(aws efs create-file-system \
        --region $region \
        --performance-mode generalPurpose \
        --query 'FileSystemId' \
        --output text \
        --profile $profile)
    
  2. Create a mount target for each subnet from the earlier step

    Replace subnet-11111222222 with your subnet ID, and run the command once for each subnet your cluster is using (a loop version is sketched below).

    $ aws efs create-mount-target \
        --file-system-id $file_system_id \
        --subnet-id subnet-11111222222 \
        --security-groups $security_group_id \
        --region $region \
        --profile $profile
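
    If you have several subnets, a loop like this sketch (substitute your own subnet IDs for the example ones) saves retyping, and describe-mount-targets confirms the mount targets were created:

    $ for subnet in subnet-1234567890fa0f076 subnet-1234567890810e520; do
          aws efs create-mount-target \
              --file-system-id $file_system_id \
              --subnet-id $subnet \
              --security-groups $security_group_id \
              --region $region \
              --profile $profile
      done

    $ aws efs describe-mount-targets \
        --file-system-id $file_system_id \
        --region $region \
        --profile $profile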
    

Configure the EKS cluster to use the EFS CSI add-on

Follow the documentation to set up the necessary ServiceAccounts and install the EFS CSI add-on.
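
As a rough sketch of that setup (adapted from the AWS docs; AmazonEKS_EFS_CSI_DriverRole is a placeholder role name and 111122223333 stands in for your AWS account ID), one common path creates an IAM role for the driver's controller service account with eksctl and then installs the managed add-on:

$ eksctl create iamserviceaccount \
    --name efs-csi-controller-sa \
    --namespace kube-system \
    --cluster $cluster_name \
    --role-name AmazonEKS_EFS_CSI_DriverRole \
    --role-only \
    --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy \
    --approve \
    --region $region \
    --profile $profile

$ aws eks create-addon \
    --cluster-name $cluster_name \
    --addon-name aws-efs-csi-driver \
    --service-account-role-arn arn:aws:iam::111122223333:role/AmazonEKS_EFS_CSI_DriverRole \
    --region $region \
    --profile $profile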

Set up storageclass

  1. Get the filesystem ID you created earlier:

    $ echo $file_system_id
    fs-094e712345abcde
    
  2. Create a YAML file for the storageclass. The efs-ap provisioning mode tells the driver to dynamically create an EFS access point for each PersistentVolumeClaim.

    efs-storageclass.yaml:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: efs-sc
    provisioner: efs.csi.aws.com
    parameters:
      provisioningMode: efs-ap
      fileSystemId: fs-094e712345abcde # Replace with your filesystem ID
      directoryPerms: "700"
      gidRangeStart: "1000" # optional
      gidRangeEnd: "2000" # optional
      basePath: "/dynamic_provisioning" # optional
    
  3. Apply the YAML file to create the storageclass.

    $ kubectl apply -f efs-storageclass.yaml
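
    Optionally, confirm the storageclass exists:

    $ kubectl get storageclass efs-sc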
    

Install a test app to confirm the setup

  1. Download a sample manifest that creates a pod with a PersistentVolumeClaim:

    $ curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/examples/kubernetes/dynamic_provisioning/specs/pod.yaml
    
  2. Deploy the pod and PersistentVolumeClaim:

    $ kubectl apply -f pod.yaml
    
  3. Confirm the pod was created:

    $ kubectl get pods
    

    After a few seconds you should see a pod called efs-app in the Running state:

    NAME      READY   STATUS    RESTARTS   AGE
    efs-app   1/1     Running   0          46s
    
  4. Confirm that the PersistentVolumeClaim was created:

    $ kubectl get pvc
    NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    efs-claim   Bound    pvc-d2b633be-34e1-4415-ad21-26e540705373   5Gi        RWX            efs-sc         116s
    
  5. Confirm that the test data is being written to the volume

    $ kubectl exec efs-app -- bash -c "cat data/out"
    Thu Jun 1 04:28:59 UTC 2023
    Thu Jun 1 04:29:04 UTC 2023
    Thu Jun 1 04:29:09 UTC 2023
    Thu Jun 1 04:29:14 UTC 2023
    Thu Jun 1 04:29:19 UTC 2023
    Thu Jun 1 04:29:24 UTC 2023
    Thu Jun 1 04:29:29 UTC 2023
    Thu Jun 1 04:29:34 UTC 2023
    Thu Jun 1 04:29:39 UTC 2023
    Thu Jun 1 04:29:44 UTC 2023
    Thu Jun 1 04:29:49 UTC 2023
    Thu Jun 1 04:29:54 UTC 2023
    
  6. Remove the test pod and PersistentVolumeClaim:

    $ kubectl delete -f pod.yaml
    