Sai Umesh

Persisting storage using Kubernetes

This post explores persistent storage in Kubernetes. When services such as databases run in containers, their data can be lost whenever the service restarts, because a container's filesystem is ephemeral by default.

In this post, we will walk through an example of how to achieve persistent storage using Kubernetes, with Docker Desktop as our Kubernetes engine. Each Kubernetes engine has its own storage provisioners, but the underlying concepts remain the same.

By the end of this post, you will have a comprehensive understanding of how to leverage Kubernetes to persist data and prevent data loss, ensuring the longevity of your services. So, let’s get started!

Storage class, PV, and PVC

In Kubernetes, a storage class defines the properties of the storage behind our persistent volumes. By creating a storage class, we can describe the kind of storage our application needs: the provisioner, the reclaim policy, the volume binding mode, and whether volumes can be expanded. Access modes, by contrast, are set on the Persistent Volume (PV) and the Persistent Volume Claim (PVC).

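When we create a Persistent Volume Claim (PVC) to request storage for our application, we can specify the storage class that should be used. With dynamic provisioning, Kubernetes then provisions the storage according to the parameters defined in the storage class. In this post's example we instead create the Persistent Volume (PV) ourselves and bind the claim to it with volumeName, so the storage class name only has to match on both objects.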
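To see which storage classes a cluster already offers before defining a new one, something like the following can help. It is only a minimal sketch: `list_storage_classes` is a hypothetical helper name, and the column selection is an illustrative choice, not part of the original setup.

```shell
#!/usr/bin/env bash

# list_storage_classes (hypothetical helper name): print every storage class
# in the cluster together with the properties a storage class defines.
list_storage_classes() {
  kubectl get storageclass \
    -o custom-columns='NAME:.metadata.name,PROVISIONER:.provisioner,RECLAIM:.reclaimPolicy,BINDING:.volumeBindingMode,EXPANSION:.allowVolumeExpansion'
}
```

Running `list_storage_classes` prints one row per storage class, which makes it easy to compare provisioners and reclaim policies at a glance.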

Configs

Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: postgres

Storage class

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # StorageClass is cluster-scoped, so it takes no namespace
  name: postgres-storage-class
provisioner: docker.io/hostpath
reclaimPolicy: Retain
allowVolumeExpansion: true

Deployment, ConfigMap, PV, PVC, and Service

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-cm
  namespace: postgres
  labels:
    app: postgres
data:
  POSTGRES_DB: saiumesh
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: postgres
  PGDATA: /var/lib/postgresql/data/pgdata
---
kind: PersistentVolume
apiVersion: v1
metadata:
  # PersistentVolume is cluster-scoped, so it takes no namespace
  name: postgres-volume
  labels:
    app: postgres
spec:
  storageClassName: postgres-storage-class
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/<PATH_TO_DATA>"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-volume-claim
  namespace: postgres
  labels:
    app: postgres
spec:
  storageClassName: postgres-storage-class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  volumeName: postgres-volume
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:latest
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-cm
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: postgres-volume-claim
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
      nodePort: 31423
  selector:
    app: postgres

Creating Resources

Run the following commands to create the namespace, the storage class, and the deployment:

kubectl apply -f ./namespace.yaml
kubectl apply -f ./storage-class.yaml
kubectl apply -f ./postgres-deployment.yaml
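Before moving on, it can be worth confirming that the claim actually bound to the volume. The sketch below assumes the resource names used in the manifests in this post; `show_storage_status` is a hypothetical helper name introduced only for illustration.

```shell
#!/usr/bin/env bash

# Namespace used by the manifests in this post.
NAMESPACE="postgres"

# show_storage_status (hypothetical helper name): print the storage class,
# the PV, the PVC, and the postgres pod so their status can be checked.
show_storage_status() {
  # StorageClass and PV are cluster-scoped, so no namespace flag is needed.
  kubectl get storageclass postgres-storage-class
  kubectl get pv postgres-volume
  # PVC and pods live in the postgres namespace.
  kubectl get pvc postgres-volume-claim -n "$NAMESPACE"
  kubectl get pods -n "$NAMESPACE" -l app=postgres
}
```

After calling `show_storage_status`, the PV and the PVC should both report a STATUS of Bound once the claim has matched the volume.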

After deploying PostgreSQL with the postgres-deployment.yaml file, you can check that the pgdata folder has been created under the path set in the PersistentVolume:

hostPath:
  path: "/<PATH_TO_DATA>"
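If the host path is awkward to reach directly, listing the data directory from inside the pod is another way to check. This is a sketch under the assumption that the PGDATA value from the ConfigMap is in effect; `list_pgdata` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# list_pgdata (hypothetical helper name): list the PostgreSQL data directory
# as seen from inside the running pod. The path is the PGDATA value from the
# ConfigMap in this post.
list_pgdata() {
  kubectl exec -n postgres deploy/postgres -- \
    ls /var/lib/postgresql/data/pgdata
}
```

Calling `list_pgdata` once the pod is running lists the files PostgreSQL created on the mounted volume.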

Connect DB

To connect to the PostgreSQL database, run the command below. In our case the service is exposed on NodePort 31423, and the database name, user, and password come from the ConfigMap in the postgres-deployment.yaml file.

psql -U postgres -d saiumesh -h localhost -p 31423

Queries to create a table and insert data for testing

CREATE TABLE employee (
    emp_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    gender CHAR(1),
    birthdate DATE,
    email VARCHAR(100) UNIQUE,
    salary INT
);

INSERT INTO employee VALUES(1, 'Annie', 'Smith', 'F', DATE '1988-01-09', 'ani@email.com', 5000);

-- Read data from table
select * from employee;

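Restart the container by running the following command. The deployment lives in the postgres namespace, so the namespace flag is required:

kubectl rollout restart deployment/postgres -n postgres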

After restarting the deployment, you can connect to the database again and verify that the data is still present.
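The check can also be scripted. The sketch below waits for the restarted pod and then runs the query inside the container instead of through the NodePort. `verify_employee_rows` is a hypothetical helper name, the 120s timeout is an arbitrary choice, and the assumption that no password prompt appears relies on the official postgres image allowing local connections inside the container.

```shell
#!/usr/bin/env bash

# verify_employee_rows (hypothetical helper name): wait for the restarted
# deployment, then read the test table back from inside the pod.
verify_employee_rows() {
  # Block until the new pod is ready; stop here if the rollout fails.
  kubectl rollout status deployment/postgres -n postgres --timeout=120s || return 1

  # Run psql inside the container, using the user and database names
  # from the ConfigMap in this post.
  kubectl exec -n postgres deploy/postgres -- \
    psql -U postgres -d saiumesh -c 'SELECT * FROM employee;'
}
```

If the volume is persisting data as intended, `verify_employee_rows` should print the row inserted before the restart.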

Delete resources

kubectl delete -f ./postgres-deployment.yaml
kubectl delete -f ./storage-class.yaml
kubectl delete -f ./namespace.yaml

I’m Sai Umesh, a software engineer based in India, working as a DevOps engineer.