When it comes to managing workloads in a cluster, Kubernetes is often the tool of choice, thanks to its open-source nature and ever-expanding user base. As a container orchestrator, it handles the tedium of managing the numerous ephemeral containers that host the various parts of an application, grouping them together via Pods. Each of these containers has its own independent storage and life cycle, and this distinction from running traditional virtual machines (VMs) introduces new challenges for applications. One such challenge is file storage. To address it, Kubernetes has the concept of volumes, which gives pods durable storage space. In this blog, we will take a quick look at leveraging AWS EBS for Kubernetes Persistent Volumes.
Kubernetes Persistent Volumes
As of this writing, there are two categories of volumes in Kubernetes: normal volumes and persistent volumes. Persistent volumes exist independently of the pod they are attached to, making them completely independent of the pod’s life cycle. They are also more flexible than standard volumes, supporting user-specified sizes and performance needs. Kubernetes volumes additionally come in a multitude of types to fit a user’s needs. One such persistent volume type is the AWSElasticBlockStore, which is the type this blog will focus on.
Why go to the cloud?
Great question! It may seem a bold move to suddenly trust a third party with your cluster’s data, especially if that data is confidential. However, this decision has a lot of merit despite the initial hesitation. By utilizing another service, the cluster’s infrastructure is greatly simplified; as we will see shortly, connecting a cloud provider’s volume to your cluster is fairly straightforward. It also cuts the cost of maintaining an in-house server to host a comparable solution. What’s more, a cloud provider handles reliability, security, and high availability in the background, so all the end user needs to worry about is using the service in their applications. This separation of responsibilities will prove its worth in the long run.
Now that we have addressed the why, let’s take a quick dive into the how.
Prerequisites
To properly utilize a cloud provider’s storage for persistent volumes, one must have the following:
- A working Kubernetes cluster hosted on AWS. This can be done either on EC2 instances (which is what this blog post was written with in mind) or with the AWS EKS service. The cluster also needs the --cloud-provider=aws flag enabled on the kubelet, api-server, and controller-manager during the cluster’s creation. One way to incorporate this flag is by running kubeadm init --config config.yaml when creating a new cluster. An example config.yaml looks like this:
config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
controllerManager:
  extraArgs:
    cloud-provider: aws
    address: 0.0.0.0
networking:
  podSubnet: <the-value-you-put-for-the-pod-address-cidr-flag>
scheduler:
  extraArgs:
    address: 0.0.0.0
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: aws
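With a file like the one above in place, initializing the control plane is then just a matter of passing it to kubeadm, as mentioned earlier:

kubeadm init --config config.yaml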
As a best practice, it is recommended to host your cluster in the same environment that your volumes will reside in. Otherwise, you will run into issues with data transfer and upload speeds.
- The instances in the cluster must have their hostname set to match their private DNS entry. The quickest way to get this done is by running the following commands on your EC2 instances:
sudo sed -i "s/$(hostname)/$(curl http://169.254.169.254/latest/meta-data/hostname)/g" /etc/hosts
sudo sed -i "s/$(hostname)/$(curl http://169.254.169.254/latest/meta-data/hostname)/g" /etc/hostname
sudo reboot
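After the reboot, you can sanity-check that the change took: the two commands below (the second uses the same EC2 metadata endpoint as above) should print identical values.

hostname
curl http://169.254.169.254/latest/meta-data/hostname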
Walkthrough
- Create the AWS Elastic Block Store (EBS) volume in the same region as your cluster. If you have the aws cli installed and configured, this command will create one for you:
aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
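Since the volume ID is needed in the following steps, it can be handy to capture it in one go. A small sketch using the aws cli’s standard --query and --output flags:

# Create the volume and keep just its ID for later steps
VOLUME_ID=$(aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2 --query VolumeId --output text)
echo $VOLUME_ID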
- With this new volume, attach it to the master node in your cluster. If you have the aws cli installed and configured, this command will perform the attachment for you:
aws ec2 attach-volume --device /dev/xvdf --instance-id <MASTER NODE ID> --volume-id <YOUR VOLUME ID>
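To confirm the attachment from the AWS side, a quick check with describe-volumes works (the --query path below follows the standard response shape):

# Prints "attached" once the volume is fully attached
aws ec2 describe-volumes --volume-ids <YOUR VOLUME ID> --query 'Volumes[0].Attachments[0].State' --output text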
- In the master node, check that your device is attached to your instance by running lsblk. If the last step worked, you should see your volume at the bottom of the list. In this case, the volume I made earlier is called nvme1n1.
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 17.9M 1 loop /snap/amazon-ssm-agent/1068
loop1 7:1 0 89.3M 1 loop /snap/core/6673
nvme0n1 259:0 0 25G 0 disk
└─nvme0n1p1 259:1 0 25G 0 part /
nvme1n1 259:2 0 10G 0 disk
- With the name of the volume, create a filesystem on it. This only needs to be done once per volume.
sudo mkfs -t xfs /dev/<NAME OF VOLUME FROM PREV STEP>
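To double-check that the filesystem was written, the standard file utility can inspect the device (lsblk -f works as well):

# Expect output mentioning "SGI XFS filesystem data"
sudo file -s /dev/<NAME OF VOLUME FROM PREV STEP>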
- Create a Persistent Volume that associates the EBS volume you made with the cluster. An example of said volume looks like this:
pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: aws-pv
  labels:
    type: aws-pv
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: <YOUR EBS VOLUME ID HERE>
    fsType: xfs
- Create the Persistent Volume Claim that will bind to the Persistent Volume we just made. An example of said claim looks like this:
pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: aws-pvc
  labels:
    type: aws-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      type: <THE type LABEL OF THE PV YOU MADE EARLIER>
- Create a Pod that takes in the Persistent Volume Claim we just made and mounts it into the Pod. An example of said pod looks like this:
redis-cloud.yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis-cloud
spec:
  volumes:
    - name: cloud-storage
      persistentVolumeClaim:
        claimName: <NAME OF CLAIM YOU MADE EARLIER>
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: cloud-storage
          mountPath: /cloud/data
- Run the following kubectl commands on your cluster:
kubectl create -f pv.yaml
kubectl create -f pvc.yaml
To verify that your volume and claim are bound together, run kubectl get pvc and look for the name of the PVC you made.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
aws-pvc Bound aws-pv 3Gi RWO 3s
If its STATUS says Bound, everything is working!
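You can also confirm the same from the PV side:

kubectl get pv aws-pv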
With your PVC bound to the PV, now run:
kubectl create -f redis-cloud.yaml
Once it is up, verify that the volume has been properly mounted into the pod by running kubectl describe pod redis-cloud. If the Events section looks like the following, the volume mounted successfully!
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17s default-scheduler Successfully assigned default/redis-cloud-2 to ip-172-31-23-218.us-west-1.compute.internal
Normal SuccessfulAttachVolume 15s attachdetach-controller AttachVolume.Attach succeeded for volume "aws-pv"
Normal Pulling 7s kubelet, ip-172-31-23-218.us-west-1.compute.internal Pulling image "redis"
Normal Pulled 2s kubelet, ip-172-31-23-218.us-west-1.compute.internal Successfully pulled image "redis"
Normal Created 2s kubelet, ip-172-31-23-218.us-west-1.compute.internal Created container redis
Normal Started 2s kubelet, ip-172-31-23-218.us-west-1.compute.internal Started container redis
Perform an exec into the pod using kubectl exec -it nameOfPod -- /bin/bash and verify that the volume is at the mount point we specified (in this case, it should be at /cloud/data). You’re done! Feel free to add files to that directory. Even if the pod is deleted, when a pod is spun back up, whether it uses the exact same yaml we provided or is a brand new pod, those files should still be there.
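If you want to see that persistence in action, here is a minimal round trip, assuming the pod and file names from the examples above (the test file name is just an illustration):

# Write a file into the mounted volume
kubectl exec -it redis-cloud -- /bin/bash -c "echo hello > /cloud/data/test.txt"

# Delete the pod and spin it back up
kubectl delete pod redis-cloud
kubectl create -f redis-cloud.yaml

# The file survives the pod's deletion
kubectl exec -it redis-cloud -- cat /cloud/data/test.txt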
NOTE: As of this blog post, the EBS volume integration with Kubernetes PVs only works on one node at a time; two nodes cannot mount the same EBS volume at once. Thus, when making deployments that use PVs backed by EBS, be sure to schedule the pods onto the instance that has the volume attached to it.
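One way to handle that scheduling is a nodeSelector on the pod, keyed on the standard kubernetes.io/hostname node label. A sketch of the earlier pod with that addition (the hostname value is a placeholder):

apiVersion: v1
kind: Pod
metadata:
  name: redis-cloud
spec:
  # Pin the pod to the node that has the EBS volume attached
  nodeSelector:
    kubernetes.io/hostname: <PRIVATE DNS OF THE NODE WITH THE VOLUME>
  volumes:
    - name: cloud-storage
      persistentVolumeClaim:
        claimName: aws-pvc
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: cloud-storage
          mountPath: /cloud/data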
But what about Dynamic Storage Provisioning?
Another good question! One downfall of the method above is that an operator needs to create the storage resource itself on a cloud provider and then link it to a Persistent Volume. Once that is done, the developer can create a Persistent Volume Claim to use the deployed PV. However, there is a way to provision storage resources on the fly through the use of Storage Classes. The way this works is that the Storage Class provisions the needed storage resource onto the cloud using the specified provisioner. For this to work, the cluster must have been granted the proper IAM permissions to deploy those resources.
These Storage Class objects are declared like the following (note that this is the format for a Storage Class utilizing EBS; for more details on other cloud providers, refer to this link):
sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-storage-class
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: '10'
  fsType: xfs
By deploying sc.yaml into the cluster, all an operator needs to do when provisioning volumes for their developers is create a Persistent Volume with one additional parameter:
pvWithSC.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: aws-pv-sc
  labels:
    type: sc
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-storage-class # NEW PARAMETER
Then, a developer who needs to utilize a Persistent Volume creates and deploys the following Persistent Volume Claim for their own use:
pvcWithSC.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: aws-pvc-sc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  storageClassName: ebs-storage-class # NEW PARAMETER
  selector:
    matchLabels:
      type: sc
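Deploying and verifying these follows the same kubectl flow as before, assuming the file names used above:

kubectl create -f sc.yaml
kubectl create -f pvWithSC.yaml
kubectl create -f pvcWithSC.yaml

# The claim should report Bound with the ebs-storage-class storage class
kubectl get pvc aws-pvc-sc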
Wrap Up
Volumes make applications running in Kubernetes pods far more reliable and usable. Operations teams no longer need to worry about keeping data safe from deletion or loss. By leveraging a cloud provider like AWS for your Kubernetes persistent volumes, your cluster will stay reliable in both performance and operation.