One of the most important features of Kubernetes is the ability to easily scale our containerized applications. This allows administrators to deal with increased traffic by adding more replicas that can handle the uptick in activity. Kubernetes can handle the load intelligently by distributing the work evenly to pods in the cluster, ensuring that none of them become overwhelmed.
Having multiple replicas also means that we can perform rolling updates without any downtime, as replicas can be updated one by one and there will always be enough available to serve the incoming traffic. In this tutorial, we will show you how to use the kubectl scale command to scale applications by configuring additional replicas on a Linux system. Check out some of the example kubectl commands below to get started.
In this tutorial you will learn:
- How to scale a deployment up or down in Kubernetes
- How to get information on deployment replicas and pods
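Under the hood, scaling a deployment adjusts the spec.replicas field of the Deployment object. For reference, a minimal manifest for a deployment like the one used in this tutorial might look as follows (the labels and image tag are assumptions here, since the original manifest is not shown):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-server
spec:
  replicas: 1            # the field that kubectl scale modifies
  selector:
    matchLabels:
      app: nginx         # assumed label; must match the pod template below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
```

Whether you scale imperatively with kubectl scale or edit this field and re-apply the manifest, the end result is the same replica count.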
| Category | Requirements, Conventions or Software Version Used |
|---|---|
| System | Any Linux distro |
| Other | Privileged access to your Linux system as root or via the sudo command |

\# – requires given linux commands to be executed with root privileges either directly as a root user or by use of the sudo command
$ – requires given linux commands to be executed as a regular non-privileged user
Use Scale Command in Kubernetes
These steps assume that you already have your Kubernetes cluster up and running, and have access to the kubectl command.
- Let’s start by checking our current deployments. In this example, we have a single Nginx container running:
$ kubectl get deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
nginx-server   1/1     1            1           55s
- NAME shows the name of our deployment(s).
- READY is the number of replicas available for the deployment, out of the total desired.
- UP-TO-DATE is how many replicas match the latest version of the deployment.
- AVAILABLE is the number of replicas currently available to serve traffic.
- AGE is how long the deployment has been up since it was created.
DID YOU KNOW?
Having multiple replicas will not only help your cluster to serve increased traffic demands, but also provide fault tolerance and better availability, which can be very useful if another instance goes down or when performing rolling updates.
- Using the kubectl scale command, we can scale our deployments up or down and specify the number of replicas we wish for the deployment to use. In this example, we will scale up our nginx-server deployment by taking it from one replica up to five.
$ kubectl scale deployments/nginx-server --replicas=5
- Immediately after scaling, we can check on the status of the deployment by executing the following command:
$ kubectl get deployments
At first, the command returns output that indicates 1/5 replicas are ready. A few moments later, when we execute the command again, we confirm that all five of our replicas are now ready.
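The READY column condenses two numbers into one field: ready replicas and desired replicas, separated by a slash. If you ever need to script around this output, the pair can be split with standard shell tools. The snippet below parses a sample line (hard-coded here for illustration; on a live cluster you would pipe the real kubectl output instead):

```shell
# Sample line as printed by `kubectl get deployments --no-headers`
# (hard-coded for illustration, not fetched from a live cluster).
line="nginx-server   5/5   5   5   2m"

# The READY column is field 2; split it on "/" into ready and desired counts.
ready=$(echo "$line" | awk '{split($2, a, "/"); print a[1]}')
desired=$(echo "$line" | awk '{split($2, a, "/"); print a[2]}')

if [ "$ready" -eq "$desired" ]; then
    echo "all replicas ready"
else
    echo "waiting: $ready of $desired ready"
fi
```

A check like this is handy in deployment scripts that need to wait for scaling to finish before moving on.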
- Let’s get some more information from our replicas by executing:
$ kubectl get pods -o wide
Among other details, this command reveals the IP address of each pod and the node it is running on.
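If you only need the pod IP addresses, the wide output can be trimmed down with awk. The snippet below works on a hard-coded sample of the output (the pod names and addresses are made up for illustration); on a live cluster you would pipe `kubectl get pods -o wide --no-headers` in its place:

```shell
# Sample `kubectl get pods -o wide --no-headers` output (illustrative values).
pods='nginx-server-585449566-2bkqm   1/1   Running   0   2m   10.244.1.12   node1
nginx-server-585449566-9tzxw   1/1   Running   0   2m   10.244.1.13   node1
nginx-server-585449566-qq5pv   1/1   Running   0   2m   10.244.2.8    node2'

# In this layout, the IP address is the 6th whitespace-separated column.
ips=$(echo "$pods" | awk '{print $6}')
echo "$ips"
```

This kind of one-liner is useful when feeding pod addresses into other tooling, though for anything robust, `kubectl get pods -o jsonpath` is the better-supported option.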
- We can also get some relevant replica information for our deployment with this command:
$ kubectl describe deployments/nginx-server | grep Replicas
Replicas:    5 desired | 5 updated | 5 total | 5 available | 0 unavailable
- The syntax to scale a deployment down is the same. With this command, we will take our Nginx server replicas from five down to three.
$ kubectl scale deployments/nginx-server --replicas=3
In this tutorial, we saw how to use the kubectl scale command in Kubernetes on a Linux system. This command is used to increase or decrease the number of replicas running for a deployment in our Kubernetes cluster. By controlling the number of replicas, we can scale an application up to meet increased demand, or scale it down when the extra replicas are no longer needed. Having multiple replicas also gives us the ability to perform rolling updates.