One of the most important features of Kubernetes is the ability to easily scale our containerized applications. This allows administrators to deal with increased traffic by adding more replicas that can handle the uptick in activity. Kubernetes can handle the load intelligently by distributing the work evenly to pods in the cluster, ensuring that none of them become overwhelmed.
Having multiple replicas also means that we can perform rolling updates without any downtime, as replicas can be updated one by one and there will always be enough available to serve the incoming traffic. In this tutorial, we will show you how to use the kubectl scale command to scale applications by configuring additional replicas on a Linux system. Check out some of the example kubectl commands below to get started.
In this tutorial you will learn:
- How to scale a deployment up or down in Kubernetes
- How to get information on deployment replicas and pods

Category | Requirements, Conventions or Software Version Used |
---|---|
System | Any Linux distro |
Software | Kubernetes |
Other | Privileged access to your Linux system as root or via the sudo command. |
Conventions | # – requires given Linux commands to be executed with root privileges either directly as a root user or by use of sudo command<br>$ – requires given Linux commands to be executed as a regular non-privileged user |
Use Scale Command in Kubernetes
These steps assume that you already have your Kubernetes cluster up and running, and have access to the kubectl command.
- Let’s start by checking our current deployments. In this example, we have a single Nginx container running:
$ kubectl get deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
nginx-server   1/1     1            1           55s
NAME shows the name of our deployment(s).
READY is the number of replicas available for the deployment, out of the total.
UP-TO-DATE is how many replicas match the latest version of the deployment.
AVAILABLE is the number of replicas that are available to serve users.
AGE is how long the deployment has been up since it was created.

DID YOU KNOW?
Having multiple replicas will not only help your cluster to serve increased traffic demands, but also provide fault tolerance and better availability, which can be very useful if another instance goes down or when performing rolling updates.

- Using the scale argument with kubectl, we can scale our deployments up or down and specify the number of replicas we wish for the deployment to use. In this example, we will scale up our nginx-server deployment by taking it from one replica up to five.
$ kubectl scale deployments/nginx-server --replicas=5
Scaling up our deployment to five replicas and then checking how many are ready
- In the screenshot above, you can see that we execute the following command immediately after our scale command:
$ kubectl get deployments
At first, the command returns output that indicates 1/5 replicas are ready. A few moments later, when we execute the command again, we confirm that all five of our replicas are now ready.
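Rather than re-running kubectl get deployments by hand until every replica is ready, the scale step can be paired with kubectl rollout status, which should block until the deployment settles. The following is a minimal sketch, not a definitive script: the function name scale_and_wait and the 120s default timeout are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: scale a deployment, then wait for the rollout to finish.
# scale_and_wait is a hypothetical helper name; 120s is an assumed default.
scale_and_wait() {
    deployment="$1"        # e.g. nginx-server
    replicas="$2"          # e.g. 5
    timeout="${3:-120s}"   # optional third argument

    # Same command as in the step above, with the values parameterized.
    kubectl scale "deployments/${deployment}" --replicas="${replicas}" || return 1

    # Blocks until the rollout completes or the timeout expires.
    kubectl rollout status "deployments/${deployment}" --timeout="${timeout}"
}

# Usage (needs a running cluster):
#   scale_and_wait nginx-server 5
```

The function returns a non-zero status if either command fails, so it can be used in a larger script to stop early when the scale-up does not complete in time.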
- Let’s get some more information from our replicas by executing:
$ kubectl get pods -o wide
Checking the IP addresses of our Nginx replicas
As you can see in the screenshot above, this command reveals the IP address of each pod.
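If we only need the pod names and IP addresses, for example to feed them into another script, the wide table can be replaced with a jsonpath query. This is a rough sketch under the assumption that all pods of interest live in the current namespace; pod_ips and count_pod_ips are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: print "<pod name><TAB><pod IP>" for every pod in the current namespace.
# pod_ips is a hypothetical helper name.
pod_ips() {
    kubectl get pods \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
}

# Sketch: print how many distinct pod IPs are currently assigned.
# Pods that have no IP yet (empty second column) are skipped.
count_pod_ips() {
    pod_ips | awk -F '\t' '$2 != "" { seen[$2] = 1 }
        END { n = 0; for (ip in seen) n++; print n }'
}

# Usage (needs a running cluster):
#   pod_ips
#   count_pod_ips
```

After scaling to five replicas, count_pod_ips would be expected to print 5 once every pod has been assigned an address, assuming no other pods are running in the namespace.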
- We can also get some relevant replica information for our deployment with this command:
$ kubectl describe deployments/nginx-server | grep Replicas
Replicas:               5 desired | 5 updated | 5 total | 5 available | 0 unavailable
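The Replicas line is easy to read by eye, but a script may want a single number from it. As a small sketch, the function below reads kubectl describe output on standard input and prints only the available count; available_replicas is a hypothetical helper name, and the parsing assumes the line format shown above.

```shell
#!/bin/sh
# Sketch: read "kubectl describe" output on stdin and print the number of
# available replicas taken from the "Replicas:" line.
# available_replicas is a hypothetical helper name.
available_replicas() {
    awk -F '|' '/^Replicas:/ {
        for (i = 1; i <= NF; i++) {
            # Matches "5 available" but not "0 unavailable".
            if ($i ~ /[0-9]+ available/) {
                split($i, parts, " ")
                print parts[1]
            }
        }
    }'
}

# Usage (needs a running cluster):
#   kubectl describe deployments/nginx-server | available_replicas
```

Parsing human-readable output can break if the format changes between kubectl versions. Querying the field directly, for instance with kubectl get deployments/nginx-server -o jsonpath='{.status.availableReplicas}', is likely the more robust choice for automation.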
- The syntax to scale a deployment down is the same. With this command, we will take our Nginx server replicas from five down to three.
$ kubectl scale deployments/nginx-server --replicas=3
Our Nginx server deployment has been scaled down to using three replicas
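When scaling from a script, it can be useful to scale only if the deployment is still at the size we expect. kubectl scale offers the --current-replicas flag as a precondition for this. The sketch below wraps it in a function; scale_if_current is a hypothetical helper name, and the example values mirror the five-to-three scale-down above.

```shell
#!/bin/sh
# Sketch: scale only if the deployment currently has the expected replica count.
# scale_if_current is a hypothetical helper name.
scale_if_current() {
    deployment="$1"   # e.g. nginx-server
    expected="$2"     # replica count we believe is currently set, e.g. 5
    target="$3"       # replica count we want, e.g. 3

    # --current-replicas is a precondition: kubectl refuses to scale
    # when the current size of the deployment does not match this value.
    kubectl scale "deployments/${deployment}" \
        --current-replicas="${expected}" --replicas="${target}"
}

# Usage (needs a running cluster):
#   scale_if_current nginx-server 5 3
```

If someone else has already changed the replica count, the command should fail instead of silently overriding their change.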
Closing Thoughts
In this tutorial, we saw how to use the kubectl scale command in Kubernetes on a Linux system. This command is used to increase or decrease the number of replicas that are running for a deployment in our Kubernetes cluster. By controlling the number of replicas, we can scale an application to meet increased demand, or scale it down when the number of replicas becomes excessive. By having multiple replicas, we also have the ability to perform rolling updates.