Storage & Persistence
Managing persistent data in Kubernetes: Volumes, PersistentVolumes, StatefulSets, and storage best practices.
Storage Concepts
Volumes
Directory accessible to containers in a pod
- Pod-scoped: Lifecycle tied to pod
- Shared between containers in same pod
- Many types: emptyDir, hostPath, configMap, secret, etc.
- Ephemeral or persistent depending on type
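As a minimal sketch of a pod-scoped volume shared between containers, the snippet below builds a Pod manifest with one emptyDir volume mounted by two containers and prints it as JSON, which kubectl accepts alongside YAML. The pod name, container names, image tag, and mount path are placeholders chosen for illustration.

```python
import json


def shared_emptydir_pod(name="shared-scratch-demo"):
    """Build a Pod manifest in which two containers mount the same emptyDir."""
    # Both containers reference the volume by name; the data lives only as
    # long as the pod does.
    mount = {"name": "scratch", "mountPath": "/scratch"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [{"name": "scratch", "emptyDir": {}}],
            "containers": [
                {
                    "name": "writer",  # placeholder container
                    "image": "busybox:1.36",
                    "command": ["sh", "-c",
                                "while true; do date >> /scratch/log; sleep 5; done"],
                    "volumeMounts": [mount],
                },
                {
                    "name": "reader",  # placeholder container
                    "image": "busybox:1.36",
                    "command": ["sh", "-c",
                                "while true; do cat /scratch/log 2>/dev/null; sleep 5; done"],
                    "volumeMounts": [mount],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(shared_emptydir_pod(), indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it should create the pod; deleting the pod discards the emptyDir contents.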
PersistentVolume (PV)
Cluster-wide storage resource, provisioned by an administrator or dynamically through a StorageClass
- Independent lifecycle from pods
- Can be provisioned statically or dynamically
- Supports various storage backends (NFS, iSCSI, cloud providers)
- Defines capacity, access modes, reclaim policy
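To illustrate static provisioning, here is a hedged sketch of an NFS-backed PersistentVolume built as a Python dict. The volume name, server address, export path, and the "manual" storage class label are hypothetical values; the field names follow the core v1 PersistentVolume schema.

```python
import json


def nfs_persistent_volume(name, server, path, size="10Gi"):
    """Build a statically provisioned PersistentVolume backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            # Retain keeps the underlying data after the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            # Claims must request the same class name to bind to this volume.
            "storageClassName": "manual",
            "nfs": {"server": server, "path": path},
        },
    }


if __name__ == "__main__":
    # Server and path are placeholders for illustration.
    pv = nfs_persistent_volume("shared-pv", "nfs.example.internal", "/exports/shared")
    print(json.dumps(pv, indent=2))
```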
PersistentVolumeClaim (PVC)
Request for storage by a user
- Abstracts storage details from pod
- Binds to appropriate PV based on requirements
- Specifies size and access mode needed
- Used by pods to mount persistent storage
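The claim-then-mount flow can be sketched as two manifests: a PersistentVolumeClaim that states size and access mode, and a Pod that refers to the claim by name only. All names, the image, and the mount path below are assumptions made for the example.

```python
import json


def claim(name, size, storage_class=None, access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim requesting `size` with one access mode."""
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    # Omitting storageClassName lets the cluster's default class apply, if any.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def pod_using_claim(pod_name, claim_name, image, mount_path):
    """Build a Pod that mounts the claim; the pod never names the backing PV."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
            "containers": [{
                "name": "app",
                "image": image,
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
        },
    }


if __name__ == "__main__":
    # Placeholder names and image for illustration.
    pvc = claim("app-data", "5Gi")
    pod = pod_using_claim("app", "app-data", "example.com/app:1.0", "/var/lib/app")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}, indent=2))
```

Because the pod references only the claim name, the backing volume can change (a different backend or storage class) without editing the pod spec.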
StorageClass
Describes classes of storage (quality of service, backup policies)
- Enables dynamic provisioning
- Defines provisioner (AWS EBS, GCE PD, etc.)
- Sets parameters (type, IOPS, replication)
- Allows different tiers (fast SSD, slow HDD)
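A sketch of two storage tiers follows, assuming the AWS EBS CSI driver (provisioner `ebs.csi.aws.com`) is installed. The class names and parameter values are illustrative; parameters are driver-specific, so check the driver's documentation before relying on them.

```python
import json


def storage_class(name, provisioner, parameters):
    """Build a StorageClass; `parameters` is passed to the provisioner as-is."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Parameter values must be strings.
        "parameters": parameters,
        "reclaimPolicy": "Delete",
        # Delay provisioning until a pod is scheduled, so the volume is
        # created in the same zone as the node that will use it.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
    }


if __name__ == "__main__":
    # Assumed tiers: an SSD class with provisioned IOPS and an HDD class.
    fast = storage_class("fast-ssd", "ebs.csi.aws.com", {"type": "gp3", "iops": "4000"})
    slow = storage_class("slow-hdd", "ebs.csi.aws.com", {"type": "st1"})
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [fast, slow]}, indent=2))
```

A PVC selects a tier by setting `storageClassName` to one of these names.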
Access Modes
ReadWriteOnce (RWO)
Mounted as read-write by a single node (multiple pods on that node can still share the volume)
Use: Databases, single-replica apps
ReadOnlyMany (ROX)
Mounted as read-only by many nodes
Use: Shared configuration, static content
ReadWriteMany (RWX)
Mounted as read-write by many nodes
Use: Shared file systems, multi-replica stateful apps
Note: Not all volume types support all access modes. Cloud provider block storage (EBS, PD) typically only supports RWO, while network file systems (NFS, CephFS) support RWX.
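The rule of thumb in the note can be turned into a small pre-flight check. The table below is an illustrative simplification that encodes only the note's two families (block and file); it is not the capability matrix of any specific driver.

```python
# Illustrative table only: it mirrors the rule of thumb in the note above.
# Real drivers differ, so consult the driver's documentation.
BACKEND_MODES = {
    "block": {"ReadWriteOnce"},                                  # e.g. EBS, PD
    "file": {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"},  # e.g. NFS, CephFS
}


def supports(backend_kind, access_mode):
    """Return True if the (hypothetical) backend family allows the access mode."""
    return access_mode in BACKEND_MODES[backend_kind]


def unsupported_modes(backend_kind, access_modes):
    """Return the requested modes that the backend family cannot satisfy."""
    return [m for m in access_modes if not supports(backend_kind, m)]


if __name__ == "__main__":
    print(unsupported_modes("block", ["ReadWriteMany"]))  # ['ReadWriteMany']
    print(unsupported_modes("file", ["ReadWriteMany"]))   # []
```

Running a check like this before creating a claim surfaces a mismatch early, instead of leaving the PVC stuck in Pending.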
StatefulSets
StatefulSets manage stateful applications, providing guarantees about ordering, uniqueness, and persistent storage.
Key Features
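- Stable network identity: Each pod gets a predictable hostname (app-0, app-1, app-2) that survives rescheduling
- Ordered deployment: Pods are created in ordinal order (0, 1, 2, ...) and terminated in reverse order by default
- Stable storage: A PVC is created for each pod from the volumeClaimTemplates and, by default, is kept when the pod or StatefulSet is deleted
- Headless service: A Service with clusterIP set to None gives each pod its own DNS record for direct access
- Graceful scaling: Scaling up and down follows the same ordering guarantees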
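The features above can be sketched as a headless Service plus a StatefulSet with a volume claim template, together with a helper that predicts the stable pod names, DNS records, and PVC names. The application name, image, port, mount path, namespace, and the `cluster.local` cluster domain are assumptions for the example.

```python
import json


def headless_service(name, port):
    """Build a headless Service (clusterIP: None) that governs the StatefulSet."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",
            "selector": {"app": name},
            "ports": [{"port": port}],
        },
    }


def stateful_set(name, image, port, replicas=3, size="10Gi"):
    """Build a StatefulSet whose pods each get a PVC from the claim template."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # must match the headless Service name
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }]},
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def stable_identities(sts, namespace="default", cluster_domain="cluster.local"):
    """Predict (pod name, DNS name, PVC names) for every replica."""
    name = sts["metadata"]["name"]
    service = sts["spec"]["serviceName"]
    result = []
    for ordinal in range(sts["spec"]["replicas"]):
        pod = f"{name}-{ordinal}"
        dns = f"{pod}.{service}.{namespace}.svc.{cluster_domain}"
        # PVC names follow <template name>-<pod name>.
        pvcs = [f"{t['metadata']['name']}-{pod}"
                for t in sts["spec"]["volumeClaimTemplates"]]
        result.append((pod, dns, pvcs))
    return result


if __name__ == "__main__":
    # "app", the image, and the port are placeholders.
    sts = stateful_set("app", "example.com/app:1.0", 8080)
    manifests = [headless_service("app", 8080), sts]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
    for pod, dns, pvcs in stable_identities(sts):
        print(pod, dns, pvcs)
```

With three replicas named `app`, the helper yields `app-0` through `app-2`, each with a DNS name of the form `app-0.app.default.svc.cluster.local` and a claim named `data-app-0`.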
Common Use Cases
- Databases: MySQL, PostgreSQL, MongoDB, Cassandra
- Message Queues: Kafka, RabbitMQ, NATS
- Distributed Systems: ZooKeeper, etcd, Consul
- Search Engines: Elasticsearch, Solr
Storage Best Practices
Choose the Right Storage Type
- Use emptyDir for temporary data that can be lost when the pod is removed
- Use PV/PVC for persistent data
- Consider performance requirements (SSD vs HDD)
- Evaluate cost vs performance tradeoffs
Backup and Disaster Recovery
- Implement regular backup schedules
- Test restore procedures
- Use volume snapshots where available
- Consider cross-region replication for critical data
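Where the CSI driver supports snapshots and the snapshot CRDs and controller are installed, a snapshot-and-restore cycle could look roughly like the sketch below. The snapshot, claim, snapshot class, and storage class names are hypothetical.

```python
import json


def volume_snapshot(name, pvc_name, snapshot_class):
    """Build a VolumeSnapshot that captures the current state of a PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def claim_from_snapshot(name, snapshot_name, size, storage_class):
    """Build a PVC that is pre-populated from an existing VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            # The restored claim should be at least as large as the source.
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All names are placeholders for illustration.
    snap = volume_snapshot("app-data-snap", "app-data", "csi-snapclass")
    restored = claim_from_snapshot("app-data-restored", "app-data-snap", "5Gi", "fast-ssd")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [snap, restored]}, indent=2))
```

Restoring into a fresh claim and mounting it in a throwaway pod is one way to test restore procedures without touching the live volume.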
Resource Management
- Set explicit storage requests in PVCs and cap per-namespace usage with a ResourceQuota
- Monitor storage usage and capacity
- Implement data retention policies
- Clean up orphaned volumes
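Two of these practices lend themselves to a short sketch: a ResourceQuota that caps total requested storage in a namespace, and a helper that lists PersistentVolumes in the Released phase (their claim was deleted but the volume remains) from the parsed output of `kubectl get pv -o json`. The quota name, namespace, and sample data are made up for the example.

```python
import json


def storage_quota(namespace, total, max_claims):
    """Build a ResourceQuota limiting requested storage and PVC count."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "storage-quota", "namespace": namespace},
        "spec": {"hard": {
            "requests.storage": total,
            "persistentvolumeclaims": str(max_claims),
        }},
    }


def released_volumes(pv_list):
    """Return (name, capacity, former claim) for PVs whose phase is Released."""
    found = []
    for pv in pv_list.get("items", []):
        if pv.get("status", {}).get("phase") != "Released":
            continue
        ref = pv["spec"].get("claimRef", {})
        former = f"{ref.get('namespace', '?')}/{ref.get('name', '?')}"
        found.append((pv["metadata"]["name"],
                      pv["spec"]["capacity"]["storage"],
                      former))
    return found


if __name__ == "__main__":
    # Hypothetical sample shaped like `kubectl get pv -o json` output.
    sample = {"items": [
        {"metadata": {"name": "pv-a"},
         "spec": {"capacity": {"storage": "10Gi"},
                  "claimRef": {"namespace": "prod", "name": "db-data"}},
         "status": {"phase": "Bound"}},
        {"metadata": {"name": "pv-b"},
         "spec": {"capacity": {"storage": "20Gi"},
                  "claimRef": {"namespace": "dev", "name": "old-data"}},
         "status": {"phase": "Released"}},
    ]}
    print(released_volumes(sample))
    print(json.dumps(storage_quota("dev", "100Gi", 10), indent=2))
```

Released volumes are candidates for review rather than automatic deletion, since a Retain reclaim policy usually means the data was kept on purpose.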
Performance Optimization
- Use appropriate IOPS for workload
- Consider volume provisioning time
- Implement caching where appropriate
- Monitor I/O performance metrics
Key Takeaways
- PersistentVolumes provide durable storage independent of pod lifecycle
- StorageClasses enable dynamic provisioning for easier storage management
- StatefulSets are essential for stateful applications requiring stable identities
- Choose access modes based on application requirements and storage backend capabilities
- Always implement backup strategies and test disaster recovery procedures