Kubernetes cluster auto-scaling is the process of automatically adding or removing computing resources, such as nodes (the servers that run workloads) or pods, based on the workload. This lets the cluster keep up with rising demand without staying over-provisioned when demand falls. For instance, if a large number of users suddenly start accessing an application, the system scales up by adding more pods, and more nodes when the existing ones run out of capacity. Similarly, when demand for the application decreases, the system scales down by removing pods and underused nodes to conserve resources. As a result, capacity tracks demand, within whatever limits are configured for the cluster, rather than being fixed at a peak estimate.
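
To make the scale-up and scale-down decision concrete, here is a minimal Python sketch of the replica calculation that the Kubernetes documentation describes for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name, the parameter names, and the `min_replicas`/`max_replicas` defaults are assumptions made for this illustration. It is not the autoscaler's actual code, and it leaves out details such as stabilization windows and pod readiness.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return how many pod replicas to run for the observed load.

    Hypothetical helper for illustration only. The bounds and the
    tolerance default are assumptions chosen for this sketch.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Ignore small deviations from the target to avoid constant resizing.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally to the load, rounding up so capacity is not short.
    desired = math.ceil(current_replicas * ratio)

    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 50% target: scale up.
    print(desired_replicas(4, 90, 50))
    # 4 replicas at 20% average CPU against a 50% target: scale down.
    print(desired_replicas(4, 20, 50))
```

With these inputs, the first call asks for more replicas than are currently running and the second asks for fewer, which mirrors the scale-up and scale-down behavior described above. Node-level scaling follows the same idea one layer down: nodes are added when pods cannot be scheduled on existing capacity and removed when they sit underused.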