Part 2 - K3s Zero To Hero: Scaling Up - Multi-Node K3s Clusters Made Easy

Ready to graduate from your cozy single-node K3s cluster to something that could survive a zombie apocalypse (or at least a server failure)? Welcome to Part 2 of our K3s journey, where we'll transform your lonely little cluster into a bustling metropolis of nodes. In this post, we'll dive deep into creating a proper high-availability setup with three server nodes and three agent nodes, because redundancy is the name of the game, and nobody likes having all their eggs in one computational basket.
Understanding the High Availability Architecture
Before we start spinning up nodes like a DJ at a rave, let's understand what we're building. A high-availability K3s cluster isn't just about having more computers to make you feel important; it's about creating a system that can keep running even when things go sideways.
In K3s land, we have two main types of nodes that play very different roles. Server nodes are the control freaks of the cluster - they run the Kubernetes API server, manage the cluster state, and host the embedded etcd datastore. Think of them as the managers who never stop talking about synergy and make all the important decisions. Agent nodes, on the other hand, are the workhorses that actually run your applications and services. They're like the employees who just want to do their job without attending another meeting about meetings.
For our high-availability setup, we need at least three server nodes, and here's where mathematics meets reality in a slightly annoying way. An HA embedded etcd cluster must be composed of an odd number of server nodes for etcd to maintain quorum. The magic formula is quorum = (n/2) + 1, rounded down, which means with three servers, you need at least two to agree on anything. It's like a democratic process, but with more TCP packets and fewer campaign promises.
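To put actual numbers on that formula, here's how quorum and fault tolerance scale with the size of the control plane, and why even counts buy you nothing:

3 servers: quorum = 2, survives 1 server failure
4 servers: quorum = 3, still survives only 1 failure
5 servers: quorum = 3, survives 2 failures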
The beauty of K3s lies in its embedded etcd approach. Unlike traditional Kubernetes setups that require you to manage etcd separately (and probably question your life choices), K3s bundles everything together. However, there's a caveat that might make Raspberry Pi enthusiasts slightly nervous: embedded etcd may have performance issues on slower disks such as Raspberry Pis running with SD cards. So if you're planning to build a cluster on a collection of Pi devices, maybe invest in some proper storage or prepare for the occasional performance hiccup.
Initializing Your First Server Node
Now comes the moment of truth - creating the genesis node of your cluster empire. The first server node is special because it needs to bootstrap the entire etcd cluster and establish itself as the founding member of this digital democracy.
The initialization process starts with the magical cluster-init flag, which tells K3s to start a new cluster rather than trying to join an existing one. Here's how to birth your first server node:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.1+k3s1 K3S_TOKEN=your-super-secret-token sh -s - server \
--cluster-init \
--tls-san=your-load-balancer-ip
Let's break down this command because details matter when you're building something that hopefully won't fall apart at 3 AM. The K3S_TOKEN is your cluster's shared secret - think of it as the password to your exclusive Kubernetes club. Make it something memorable but secure, because you'll need it again when adding more nodes. The tls-san parameter adds your load balancer IP as a Subject Alternative Name to the TLS certificate, which is fancy talk for "making sure your certificates don't throw a tantrum when accessed through a load balancer".
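If you'd rather not invent a token yourself, one simple way to generate a suitably random one (assuming openssl is available on the machine) is:

openssl rand -hex 16

Whatever string comes out becomes the value you pass as K3S_TOKEN on every node, so stash it somewhere safe.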
After running the install command, you'll have your first server node up and running with embedded etcd initialized. You can verify everything is working by checking the node status:
kubectl get nodes
You should see something like:
NAME       STATUS   ROLES                       AGE   VERSION
server-1   Ready    control-plane,etcd,master   2m    v1.33.1+k3s1
Notice how your node proudly displays multiple roles - it's the control plane, etcd member, and master all rolled into one. It's like a Swiss Army knife, but for container orchestration.
Adding Additional Server Nodes
With your first server established and presumably not on fire, it's time to add the second and third server nodes to complete your high-availability triumvirate. This process is where the K3s simplicity really shines compared to traditional Kubernetes setups that might require a PhD in distributed systems.
Adding the second server node is refreshingly straightforward. You'll use the same installation script, but instead of the cluster-init flag, you'll point it to your existing server:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.1+k3s1 K3S_TOKEN=your-super-secret-token sh -s - server \
--server https://server-1-ip:6443 \
--tls-san=your-load-balancer-ip
The server parameter tells this new node to join an existing cluster rather than starting fresh. The magic happens automatically - K3s handles the etcd membership, certificate distribution, and all the other tedious bits that usually require multiple cups of coffee and a deep understanding of distributed consensus algorithms.
Repeat this exact process for your third server node, and you'll have a proper three-node control plane. The beauty is that each server knows about the others and can take over if one decides to take an unscheduled vacation (also known as crashing). You can verify your cluster is properly formed by running:
kubectl get nodes
Which should now show all three servers:
NAME       STATUS   ROLES                       AGE   VERSION
server-1   Ready    control-plane,etcd,master   10m   v1.33.1+k3s1
server-2   Ready    control-plane,etcd,master   5m    v1.33.1+k3s1
server-3   Ready    control-plane,etcd,master   2m    v1.33.1+k3s1
At this point, you have a cluster that can survive the loss of one server node and keep running. It's like having backup dancers for your backup dancers.
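If you want to double-check that all three machines really did join the embedded etcd cluster, and not just the control plane, one quick sanity check, relying on the role labels K3s applies to server nodes (the same labels behind the ROLES column above), is:

kubectl get nodes -l node-role.kubernetes.io/etcd=true

This should list exactly your three servers.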
Welcoming Agent Nodes to the Party
Now that you have a robust control plane, it's time to add some muscle to your cluster in the form of agent nodes. These are the nodes that will actually run your workloads, leaving the server nodes free to focus on their management duties and looking important.
Adding agent nodes is delightfully simple and follows the same pattern you used in Part 1, but now you're joining a proper HA cluster instead of a single server. The process requires two pieces of information: the cluster token and the URL of any server node.
First, grab the node token from any of your server nodes:
sudo cat /var/lib/rancher/k3s/server/node-token
This will output a token that looks something like a cryptographic word salad:
K107123a456b789c012345678901234567890abcdef1234567890abcdef123456::server:abcdef1234567890abcdef1234567890abcdef12
K3s tokens come in different formats, but the one you'll see is the secure format that includes a cluster CA hash for extra security. This hash allows the joining node to verify it's connecting to the legitimate cluster and not some imposter trying to infiltrate your container kingdom.
Now, on each agent node, run the installation command:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.1+k3s1 K3S_URL=https://server-1-ip:6443 K3S_TOKEN="your-long-cryptographic-token" sh -
The K3S_URL parameter tells the installer to configure this as an agent rather than a server, and it will register with the K3s server at the specified URL. The agent will establish a websocket connection to the server and maintain it using a client-side load balancer. This means if one server goes down, the agent can automatically connect to another server in your cluster.
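Before moving on to the next machine, it's worth a quick check that the agent service actually came up cleanly (assuming the default systemd unit name):

sudo systemctl status k3s-agent

An active (running) status means the agent has registered and is waiting for work.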
Repeat this process for all three agent nodes. After each installation completes, the k3s-agent service will start automatically and register with your cluster. You can watch the nodes join by running:
kubectl get nodes -w
The -w flag will watch for changes, so you can see nodes appearing in real-time like magic. Your final cluster should look something like this:
NAME       STATUS   ROLES                       AGE   VERSION
server-1   Ready    control-plane,etcd,master   20m   v1.33.1+k3s1
server-2   Ready    control-plane,etcd,master   15m   v1.33.1+k3s1
server-3   Ready    control-plane,etcd,master   12m   v1.33.1+k3s1
agent-1    Ready    <none>                      5m    v1.33.1+k3s1
agent-2    Ready    <none>                      3m    v1.33.1+k3s1
agent-3    Ready    <none>                      1m    v1.33.1+k3s1
Notice how the agent nodes show <none> in the ROLES column, which isn't an insult to their importance but simply indicates they're dedicated worker nodes without control plane responsibilities.
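If that empty ROLES column offends your sense of order, you can give the agents a cosmetic worker role. This is purely a labeling convention, not a functional change, and might look like:

kubectl label node agent-1 node-role.kubernetes.io/worker=worker

After that, kubectl get nodes will show worker under ROLES for the labeled node.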
Verification and Troubleshooting Your New Cluster
Congratulations! You now have a six-node K3s cluster that's more resilient than your average superhero team. But before you start deploying applications willy-nilly, let's make sure everything is actually working as intended.
First, verify that all nodes are healthy and communicating properly:
kubectl get nodes -o wide
This will show you detailed information about each node, including their IP addresses, Kubernetes versions, and container runtimes. All nodes should show a Ready status. If any node shows NotReady, it might be having networking issues or the k3s service might not be running properly.
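When a node is stuck in NotReady, a couple of quick checks usually point at the culprit. For example, substituting your own node name:

kubectl describe node agent-2
sudo systemctl status k3s-agent   # run on the affected agent (use k3s on a server)

The describe output lists node conditions and recent events, which tend to name the problem more politely than the logs do.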
You can also run kubectl directly on any server node, through the bundled k3s binary, to confirm the API server and its embedded etcd backend are responding and that your high availability is actually available:
sudo k3s kubectl get nodes
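For something a bit more targeted than a node listing, the Kubernetes API server also exposes per-component health checks you can query (a quick probe, assuming your kubeconfig is set up as in Part 1):

kubectl get --raw=/healthz/etcd

This should simply return ok while the embedded etcd cluster is healthy.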
To verify your cluster can survive a server failure, try shutting down one of your server nodes:
sudo systemctl stop k3s
Your cluster should continue operating normally with the remaining two servers. The etcd quorum will be maintained, and agent nodes will automatically reconnect to healthy servers. This is the moment where you can feel smug about building something that doesn't completely fall apart when one component fails.
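Once you've had your fun, bring the stopped server back online and confirm it rejoins the cluster:

sudo systemctl start k3s   # on the server you stopped
kubectl get nodes

After a minute or so, the node should flip back to Ready and quietly resume its etcd duties.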
If you encounter issues during node joining, the most common culprits are network connectivity problems or token mismatches. You can check the k3s logs on any server node (or the k3s-agent logs on an agent node) using:
sudo journalctl -u k3s -f
The -f flag will follow the logs in real-time, which is useful for debugging connection issues. Common problems include firewall rules blocking port 6443 (the Kubernetes API port) or copying tokens incorrectly (those long cryptographic strings are surprisingly easy to mangle).
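If a firewall turns out to be the problem, these are the ports K3s typically needs open between nodes. The exact commands depend on your distribution, but with ufw, for example, it might look like:

sudo ufw allow 6443/tcp        # Kubernetes API server (agents and kubectl to servers)
sudo ufw allow 2379:2380/tcp   # embedded etcd client and peer traffic (server to server)
sudo ufw allow 8472/udp        # flannel VXLAN overlay network (all nodes)
sudo ufw allow 10250/tcp       # kubelet metrics (all nodes)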
Load Balancing for Production Readiness
While your cluster is now technically highly available, there's one more piece to consider for a truly production-ready setup: load balancing. Right now, your agent nodes are connecting directly to specific server IPs, which means if that particular server goes down, they'll need to reconnect to another one.
For a more robust setup, you can place a load balancer in front of your server nodes. This provides a single, stable endpoint that agent nodes and external clients can use to access the cluster. Popular options include HAProxy, Nginx, or cloud provider load balancers.
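As a rough illustration rather than a hardened config, a minimal haproxy.cfg excerpt that passes TCP traffic on port 6443 through to the three servers might look something like this, with server-1-ip and friends standing in for your actual addresses:

# minimal TCP passthrough for the K3s API
defaults
    mode tcp
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend k3s-apiserver
    bind *:6443
    default_backend k3s-servers

backend k3s-servers
    balance roundrobin
    option tcp-check
    server server-1 server-1-ip:6443 check
    server server-2 server-2-ip:6443 check
    server server-3 server-3-ip:6443 check

The health checks mean a dead server simply drops out of rotation instead of taking your API traffic down with it.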
With a load balancer, your agent join command would look like:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.33.1+k3s1 K3S_URL=https://your-load-balancer:6443 K3S_TOKEN="your-token" sh -
This setup ensures that even if individual servers fail, your agents maintain connectivity through the load balancer's health checking and automatic failover capabilities.
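The same trick helps with kubectl access from outside the cluster. K3s writes its admin kubeconfig to /etc/rancher/k3s/k3s.yaml on each server with the API address set to 127.0.0.1, so one common approach, assuming you want the load balancer as your stable endpoint, is to copy that file and swap in the balancer's address:

sudo cat /etc/rancher/k3s/k3s.yaml | sed 's/127.0.0.1/your-load-balancer/' > ~/.kube/config

Run that on a server node, then move the resulting file to ~/.kube/config on your workstation.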
Your Cluster is Ready for Action
You've successfully transformed your humble single-node K3s cluster into a robust, high-availability system capable of withstanding server failures while maintaining service availability. Your six-node cluster with three servers and three agents represents a solid foundation for running production workloads with confidence.
The embedded etcd approach means you don't need to manage a separate etcd cluster, while the automatic node registration and load balancing capabilities ensure your cluster can adapt to changing conditions. Whether you're running this on bare metal servers, virtual machines, or even a collection of Raspberry Pis (with proper storage considerations), you now have a Kubernetes cluster that can handle real-world demands.
In Part 3, we'll dive into the configuration rabbit hole, exploring how to customize your cluster's behavior through YAML files, CLI flags, and environment variables. Because while a default K3s installation is remarkably capable, the real fun begins when you start tailoring it to your specific needs and requirements.