
Kubernetes in Production

Storage & Persistence

Managing persistent data in Kubernetes: Volumes, PersistentVolumes, StatefulSets, and storage best practices.

Storage Concepts

Volumes

Directory accessible to containers in a pod

  • Pod-scoped: Lifecycle tied to pod
  • Shared between containers in same pod
  • Many types: emptyDir, hostPath, configMap, secret, etc.
  • Ephemeral or persistent depending on type
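To illustrate pod-scoped sharing, here is a minimal sketch that builds a Pod manifest in which two containers mount the same emptyDir volume. The pod name, container names, image tag, and mount path are illustrative assumptions, not values from this guide. The manifest is printed as JSON, which kubectl accepts alongside YAML.

```python
import json


def shared_scratch_pod(name: str) -> dict:
    """Build a Pod manifest where two containers share one emptyDir volume."""
    mount = {"name": "scratch", "mountPath": "/scratch"}  # hypothetical path
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "writer",
                    "image": "busybox:1.36",  # assumed image
                    "command": ["sh", "-c", "date > /scratch/started; sleep 3600"],
                    "volumeMounts": [mount],
                },
                {
                    "name": "reader",
                    "image": "busybox:1.36",
                    "command": ["sh", "-c", "sleep 3600"],
                    "volumeMounts": [mount],
                },
            ],
            # An emptyDir is created with the pod and deleted with it.
            "volumes": [{"name": "scratch", "emptyDir": {}}],
        },
    }


pod = shared_scratch_pod("scratch-demo")
print(json.dumps(pod, indent=2))
```

Because the volume is an emptyDir, anything written under /scratch is lost when the pod is removed; switching the volume entry to a persistentVolumeClaim is what makes the data outlive the pod.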

PersistentVolume (PV)

Cluster-wide storage resource, provisioned by an administrator or dynamically through a StorageClass

  • Independent lifecycle from pods
  • Can be provisioned statically or dynamically
  • Supports various storage backends (NFS, iSCSI, cloud providers)
  • Defines capacity, access modes, reclaim policy
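As a sketch of static provisioning, the following builds a PersistentVolume manifest that sets the capacity, access modes, and reclaim policy listed above. It assumes an NFS backend; the server name and export path are hypothetical placeholders.

```python
import json


def nfs_persistent_volume(name: str, server: str, path: str, size: str = "10Gi") -> dict:
    """Build a statically provisioned PersistentVolume backed by NFS."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            # Retain keeps the underlying data after the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


# Hypothetical NFS server and export path.
pv = nfs_persistent_volume("shared-data", "nfs.example.internal", "/exports/data")
print(json.dumps(pv, indent=2))
```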

PersistentVolumeClaim (PVC)

Request for storage by a user

  • Abstracts storage details from pod
  • Binds to appropriate PV based on requirements
  • Specifies size and access mode needed
  • Used by pods to mount persistent storage
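The claim-then-mount flow can be sketched as follows: a PVC states the size and access mode, and a pod refers to the claim by name without knowing which PV backs it. The names, image, size, and mount path here are assumptions chosen for illustration.

```python
import json
from typing import Optional


def claim(name: str, size: str, mode: str = "ReadWriteOnce",
          storage_class: Optional[str] = None) -> dict:
    """Build a PersistentVolumeClaim requesting `size` with one access mode."""
    spec = {
        "accessModes": [mode],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def pod_using_claim(pod_name: str, claim_name: str, image: str) -> dict:
    """Build a Pod that mounts the claim at /var/lib/data (assumed path)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/data"}],
            }],
            # The pod only knows the claim name, not the backing PV.
            "volumes": [{"name": "data",
                         "persistentVolumeClaim": {"claimName": claim_name}}],
        },
    }


pvc = claim("app-data", "5Gi")
app_pod = pod_using_claim("app", "app-data", "nginx:1.27")  # assumed image
print(json.dumps([pvc, app_pod], indent=2))
```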

StorageClass

Describes classes of storage (quality of service, backup policies)

  • Enables dynamic provisioning
  • Defines provisioner (AWS EBS, GCE PD, etc.)
  • Sets parameters (type, IOPS, replication)
  • Allows different tiers (fast SSD, slow HDD)
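A possible way to express two tiers is shown below. This sketch assumes the AWS EBS CSI driver (provisioner `ebs.csi.aws.com`) and its `type` and `iops` parameters; the class names and the IOPS figure are made up for the example, and other provisioners use different parameter keys.

```python
import json


def storage_class(name: str, provisioner: str, parameters: dict,
                  reclaim_policy: str = "Delete") -> dict:
    """Build a StorageClass manifest for dynamic provisioning."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Parameter values must be strings, even when they look numeric.
        "parameters": parameters,
        "reclaimPolicy": reclaim_policy,
        # Delay provisioning until a pod using the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
    }


# Assumed tiers: gp3 SSD with extra IOPS, and sc1 cold HDD.
fast = storage_class("fast-ssd", "ebs.csi.aws.com", {"type": "gp3", "iops": "6000"})
slow = storage_class("slow-hdd", "ebs.csi.aws.com", {"type": "sc1"})
print(json.dumps([fast, slow], indent=2))
```

A PVC then selects a tier by setting `storageClassName` to one of these names.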

Access Modes

📝

ReadWriteOnce (RWO)

Mounted as read-write by a single node; multiple pods on that node can share the volume

Use: Databases, single-replica apps

📖

ReadOnlyMany (ROX)

Mounted as read-only by many nodes

Use: Shared configuration, static content

📚

ReadWriteMany (RWX)

Mounted as read-write by many nodes

Use: Shared file systems, multi-replica stateful apps

Note: Not all volume types support all access modes. Cloud provider block storage (EBS, PD) typically supports only RWO for read-write access, while network file systems (NFS, CephFS) also support RWX. Check the storage driver's documentation before requesting a mode.
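The note above can be turned into a simple pre-flight check. This is a deliberately simplified sketch: the two backend categories and their supported modes are taken from the note, not from any driver's actual capability list, and the function name is hypothetical.

```python
# Simplified capability table derived from the note above.
SUPPORTED_MODES = {
    "block": {"ReadWriteOnce"},  # e.g. EBS, PD
    "network-fs": {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"},  # e.g. NFS, CephFS
}


def unsupported_modes(backend_kind: str, requested: list) -> list:
    """Return the requested access modes the backend category cannot serve."""
    return sorted(set(requested) - SUPPORTED_MODES[backend_kind])


print(unsupported_modes("block", ["ReadWriteOnce", "ReadWriteMany"]))
print(unsupported_modes("network-fs", ["ReadWriteMany"]))
```

An empty list means the request is plausible for that category; a real check should consult the specific driver's documentation.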

StatefulSets

StatefulSets manage stateful applications, providing guarantees about ordering, uniqueness, and persistent storage.

Key Features

  • Stable network identity: Each pod gets predictable hostname (app-0, app-1, app-2)
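  • Ordered deployment: Pods are created in ordinal order (0, 1, 2); scale-down and rolling updates proceed in reverse order
  • Stable storage: A PVC is created for each pod from volumeClaimTemplates and, by default, kept when the pod is rescheduled or the set is scaled down
  • Headless service: A headless Service (clusterIP: None) gives each pod its own DNS record for direct access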
  • Graceful scaling: Controlled scaling up/down
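The identity and storage features above can be sketched together. The example builds a minimal StatefulSet manifest and derives the names Kubernetes assigns: pods are named `<statefulset>-<ordinal>`, per-pod DNS records follow `<pod>.<service>.<namespace>.svc.<cluster-domain>`, and PVCs are named `<template>-<pod>`. The application name, image, volume size, and the `cluster.local` domain (the usual default) are assumptions.

```python
import json


def stateful_set(name: str, image: str, replicas: int, size: str) -> dict:
    """Build a minimal StatefulSet with one volumeClaimTemplate named 'data'."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # must match a headless Service
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }]},
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def pod_dns_names(sts: dict, namespace: str, domain: str = "cluster.local") -> list:
    """Per-pod DNS names provided through the headless Service."""
    name, svc = sts["metadata"]["name"], sts["spec"]["serviceName"]
    return [f"{name}-{i}.{svc}.{namespace}.svc.{domain}"
            for i in range(sts["spec"]["replicas"])]


def pvc_names(sts: dict) -> list:
    """PVC names created from each volumeClaimTemplate, one per pod."""
    name = sts["metadata"]["name"]
    return [f"{t['metadata']['name']}-{name}-{i}"
            for t in sts["spec"]["volumeClaimTemplates"]
            for i in range(sts["spec"]["replicas"])]


sts = stateful_set("app", "postgres:16", replicas=3, size="20Gi")  # assumed image
print(json.dumps(sts, indent=2))
print(pod_dns_names(sts, "default"))
print(pvc_names(sts))
```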

Common Use Cases

✓ Databases

MySQL, PostgreSQL, MongoDB, Cassandra

✓ Message Queues

Kafka, RabbitMQ, NATS

✓ Distributed Systems

ZooKeeper, etcd, Consul

✓ Search Engines

Elasticsearch, Solr

Storage Best Practices

Choose the Right Storage Type

  • Use emptyDir for temporary data that can be lost when the pod is removed
  • Use PV/PVC for persistent data
  • Consider performance requirements (SSD vs HDD)
  • Evaluate cost vs performance tradeoffs

Backup and Disaster Recovery

  • Implement regular backup schedules
  • Test restore procedures
  • Use volume snapshots where available
  • Consider cross-region replication for critical data
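Where the storage driver supports CSI snapshots, a backup and a test restore can be sketched as below. The example assumes the snapshot CRDs and controller are installed and that a VolumeSnapshotClass exists; the class name `csi-snapclass` and all resource names are hypothetical.

```python
import json


def volume_snapshot(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Build a VolumeSnapshot of an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def restore_claim(name: str, snapshot_name: str, size: str) -> dict:
    """Build a new PVC pre-populated from a snapshot, for restore testing."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


snap = volume_snapshot("app-data-snap", "app-data", "csi-snapclass")
restored = restore_claim("app-data-restore", "app-data-snap", "5Gi")
print(json.dumps([snap, restored], indent=2))
```

Mounting the restored claim in a throwaway pod and verifying the data is one way to exercise the restore path regularly. A snapshot stored in the same storage system is not a substitute for an off-cluster backup.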

Resource Management

  • Set explicit storage requests in PVCs and cap per-namespace usage with a ResourceQuota
  • Monitor storage usage and capacity
  • Implement data retention policies
  • Clean up orphaned volumes, such as PVs left in the Released phase after their claim was deleted
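One way to spot cleanup candidates is to scan the PV list for volumes that are no longer bound. The sketch below works on the JSON shape returned by `kubectl get pv -o json`; the sample data is invented for illustration, and treating Released and Failed as "orphaned" is an assumption that may not fit every retention policy.

```python
import json

# Invented sample in the shape of `kubectl get pv -o json`.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"name": "pv-a"}, "spec": {"capacity": {"storage": "10Gi"}},
   "status": {"phase": "Bound"}},
  {"metadata": {"name": "pv-b"}, "spec": {"capacity": {"storage": "50Gi"}},
   "status": {"phase": "Released"}},
  {"metadata": {"name": "pv-c"}, "spec": {"capacity": {"storage": "5Gi"}},
   "status": {"phase": "Failed"}}
]}
""")


def orphan_candidates(pv_list: dict) -> list:
    """Return (name, capacity, phase) for PVs in the Released or Failed phase."""
    return [
        (pv["metadata"]["name"],
         pv["spec"]["capacity"]["storage"],
         pv["status"]["phase"])
        for pv in pv_list["items"]
        if pv["status"]["phase"] in ("Released", "Failed")
    ]


candidates = orphan_candidates(SAMPLE)
for name, size, phase in candidates:
    print(f"{name}\t{size}\t{phase}")
```

The output is only a list of candidates; confirm that the data is no longer needed before deleting any volume.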

Performance Optimization

  • Provision IOPS and throughput to match the workload's measured I/O profile
  • Consider volume provisioning time
  • Implement caching where appropriate
  • Monitor I/O performance metrics

Key Takeaways

  • PersistentVolumes provide durable storage independent of pod lifecycle
  • StorageClasses enable dynamic provisioning for easier storage management
  • StatefulSets are essential for stateful applications requiring stable identities
  • Choose access modes based on application requirements and storage backend capabilities
  • Always implement backup strategies and test disaster recovery procedures