Kubespray: Zero to Hero - Part 1: Laying the Groundwork - Infrastructure Preparation & Initial Setup

Deploying a production-ready Kubernetes cluster requires meticulous planning and preparation. As we embark on this journey through the Kubespray automation landscape, this initial phase serves as the foundational layer upon which our entire deployment architecture will rest. The precision with which we execute these preparatory steps directly correlates with the stability, performance, and maintainability of our final Kubernetes environment.

Understanding the Kubespray Architecture

Kubespray represents a sophisticated orchestration tool that leverages the power of Ansible automation to deploy production-grade Kubernetes clusters across diverse infrastructure platforms. Unlike traditional deployment methods that require manual intervention at every step, Kubespray provides a comprehensive framework that supports multiple Linux distributions, networking plugins, and container runtimes, making it an ideal choice for organizations seeking both flexibility and reliability.

The tool's architecture is built around the principle of composability, allowing administrators to select from a wide range of configuration options including network plugins like Calico, Flannel, and Cilium, container runtimes such as containerd and CRI-O, and various Linux distributions from Ubuntu to CentOS. This modularity ensures that your deployment can be tailored to meet specific organizational requirements while maintaining consistency across environments.
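
To make this concrete, here is a minimal sketch of how these choices are expressed once the repository is cloned (cloning is covered later in this article): Kubespray reads them as Ansible variables from your inventory's group_vars, with kube_network_plugin and container_manager among the main switches. The paths follow recent releases, and "mycluster" is just an example inventory name:

# Copy the sample inventory, then select the network plugin and container runtime
cp -rfp inventory/sample inventory/mycluster
sed -i 's/^kube_network_plugin:.*/kube_network_plugin: calico/' \
    inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
sed -i 's/^container_manager:.*/container_manager: containerd/' \
    inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml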

Hardware Requirements and Specifications

Minimum System Requirements for Production Deployment

The foundation of any robust Kubernetes cluster begins with properly provisioned hardware. For production deployments, the minimum specifications vary significantly based on the intended role of each node within the cluster:

Control Plane Nodes (Master Nodes):

  • Memory: Minimum 1500 MB RAM, recommended 8GB for production environments
  • CPU: Minimum 2 cores, recommended 4-6 cores for high-availability deployments
  • Storage: 20GB minimum, 100GB recommended for production workloads

Worker Nodes:

  • Memory: Minimum 1024 MB RAM, recommended 8GB for containerized workloads
  • CPU: Minimum 2 cores, recommended 4+ cores depending on workload requirements
  • Storage: 20GB minimum, 100GB+ recommended for persistent storage and container images

Control Node (Ansible/Kubespray Host):

  • Memory: Minimum 1024 MB RAM
  • CPU: Single core sufficient for automation tasks
  • Storage: 20GB for Kubespray repository and temporary files
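
A small sanity-check script can compare a node against these figures before deployment. This is a minimal sketch using the control-plane minimums from the list above; adjust the thresholds for the role of the node being checked:

# Compare this node against the control-plane minimums (GNU coreutils assumed)
min_mem_mb=1500; min_cpus=2; min_disk_gb=20
mem_mb=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
cpus=$(nproc)
disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
[ "$mem_mb" -ge "$min_mem_mb" ] || echo "RAM below minimum: ${mem_mb} MB"
[ "$cpus" -ge "$min_cpus" ] || echo "CPU cores below minimum: $cpus"
[ "$disk_gb" -ge "$min_disk_gb" ] || echo "Root disk below minimum: ${disk_gb} GB"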

Bare Metal Considerations

When deploying on bare metal infrastructure, additional considerations become paramount. Bare metal deployments offer distinct advantages including improved performance through direct hardware access, cost efficiency by eliminating hypervisor overhead, and enhanced security through reduced attack surfaces.

The elimination of the hypervisor layer provides direct access to CPU, memory, and I/O resources, resulting in significant performance improvements particularly for resource-intensive workloads. Organizations have reported performance gains of 20-40% compared to virtualized environments, with even more substantial improvements for applications requiring high I/O throughput.

For high-availability control planes, a minimum of three nodes is required to establish proper etcd quorum and prevent split-brain scenarios. These nodes should be distributed across different failure domains when possible, ensuring cluster resilience against hardware failures or network partitions.
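
The quorum arithmetic explains why three is the magic number: an etcd cluster of n members needs floor(n/2)+1 votes to agree, so it tolerates floor((n-1)/2) member failures. A one-liner makes the trade-off concrete, showing why even member counts buy no extra fault tolerance:

# Quorum size and tolerated failures for common etcd cluster sizes
for n in 1 3 5 7; do
    echo "members=$n quorum=$(( n/2 + 1 )) tolerated_failures=$(( (n-1)/2 ))"
done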

Operating System Preparation

Supported Linux Distributions

Kubespray maintains extensive compatibility with major Linux distributions, ensuring flexibility in base operating system selection. Note that the exact set of supported versions depends on the Kubespray release you deploy; always check the README of the branch you check out before committing to a base image. Distributions that have been supported include:

Ubuntu Variants:

  • Ubuntu 16.04, 18.04, 20.04, 22.04, 24.04 LTS
  • All variants support both server and desktop installations

Red Hat Enterprise Linux and Derivatives:

  • CentOS 7, 8, 9
  • RHEL 7, 8, 9
  • Rocky Linux 8, 9
  • Alma Linux 8, 9
  • Oracle Linux 7, 8, 9

Debian Family:

  • Debian Bullseye, Buster, Jessie, Stretch

Specialized Distributions:

  • Fedora 35, 36
  • Fedora CoreOS
  • openSUSE Leap 15.x/Tumbleweed
  • Flatcar Container Linux by Kinvolk
  • Amazon Linux 2

Essential System Configuration

Before proceeding with Kubespray deployment, several critical system-level configurations must be implemented across all nodes.

SELinux Configuration:
On Red Hat-based systems, SELinux must be configured to permissive mode to prevent interference with Kubernetes operations:

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Swap Deactivation:
Kubernetes requires swap to be disabled on all nodes to ensure proper resource management and scheduling:

swapoff -a
# Comment out (rather than delete) swap entries, keeping a backup of fstab
sed -i.bak -E '/\sswap\s/ s/^/#/' /etc/fstab

Kernel Parameter Optimization:
Critical kernel parameters must be configured to support containerized workloads and networking:

cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system

Required Kernel Modules:
The bridge netfilter module must be loaded to support Kubernetes networking:

modprobe br_netfilter
echo 'br_netfilter' >> /etc/modules-load.d/k8s.conf
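
After loading the module and applying the sysctl settings, a quick check confirms that both took effect:

# Verify the module is loaded and the parameters are active
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward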

Network Infrastructure Configuration

IPv4 Forwarding and Routing

Proper network configuration forms the backbone of any Kubernetes deployment. IPv4 forwarding must be enabled across all nodes to facilitate pod-to-pod communication and service routing.

The kernel parameter net.ipv4.ip_forward controls whether the system can forward packets between network interfaces, which is essential for Kubernetes networking functionality. Kubespray and the container runtime will typically set this parameter during deployment, but configuring it explicitly during the preparation phase ensures it is in place from the start and survives reboots.

To enable IPv4 forwarding permanently:

echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
sysctl -p

Firewall Configuration

Firewall configuration represents one of the most critical yet complex aspects of Kubernetes infrastructure preparation. The distributed nature of Kubernetes requires numerous ports to be accessible between cluster components.

Control Plane Node Ports:

  • 6443: Kubernetes API server (primary access point)
  • 2379-2380: etcd server client API
  • 10250: Kubelet API
  • 10259: kube-scheduler (10251 on older Kubernetes releases, before the insecure port was removed)
  • 10257: kube-controller-manager (10252 on older releases)
  • 10255: Read-only Kubelet API (deprecated but sometimes required)

Worker Node Ports:

  • 10250: Kubelet API
  • 10255: Read-only Kubelet API
  • 30000-32767: NodePort Services range

Example firewall configuration for CentOS/RHEL systems:

# Control plane nodes
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=10259/tcp
firewall-cmd --permanent --add-port=10257/tcp
firewall-cmd --reload

# Worker nodes
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=30000-32767/tcp
firewall-cmd --reload
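
Once the rules are reloaded, verify from a peer node that the ports actually answer. A minimal probe sketch, assuming nc is installed and using k8s-master01 as a placeholder hostname:

# Probe the control-plane ports opened above from another node
cp_host=k8s-master01
for port in 6443 2379 2380 10250 10257 10259; do
    nc -z -w 2 "$cp_host" "$port" && echo "open: $port" || echo "closed: $port"
done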

For development environments or controlled network segments, administrators may choose to disable firewalls entirely, though this approach is not recommended for production deployments.

SSH Access Configuration

Passwordless SSH Implementation

Ansible relies on SSH connectivity to manage remote nodes, making proper SSH configuration absolutely critical for successful Kubespray deployment. The implementation of passwordless SSH authentication using public-key cryptography provides both security and automation benefits.

SSH Key Generation:
Begin by generating an RSA key pair on the control node:

ssh-keygen -t rsa -b 4096

When prompted, specify a secure passphrase or leave empty for automation purposes. The key generation process creates two files:

  • ~/.ssh/id_rsa (private key)
  • ~/.ssh/id_rsa.pub (public key)

Public Key Distribution:
Copy the public key to all target nodes using the ssh-copy-id utility:

ssh-copy-id username@target-node-ip

This command automatically appends the public key to the target node's ~/.ssh/authorized_keys file, enabling passwordless authentication.
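
For clusters of any size, it is worth scripting the distribution step. A sketch using the example addresses from the DNS section later in this article (substitute your own user and IPs):

# Push the key to every node in one pass (addresses are illustrative)
for host in 192.168.1.10 192.168.1.11 192.168.1.12 192.168.1.20 192.168.1.21; do
    ssh-copy-id "username@$host"
done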

SSH Configuration Optimization:
To improve connection reliability and performance, configure SSH client settings. Note that disabling strict host-key checking, as below, trades security for convenience and is appropriate only on trusted networks:

cat > ~/.ssh/config << 'EOF'
Host *
    ControlMaster auto
    ControlPath ~/.ssh/ansible-%r@%h:%p
    ControlPersist 30m
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
EOF

User Privilege Configuration

The deployment user must possess sudo privileges without password prompts to enable automated system configuration. Configure passwordless sudo access:

echo "username ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/username

This configuration allows the deployment user to execute privileged commands without interrupting the automation process.
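
It pays to validate the drop-in before relying on it, since a syntax error in a sudoers file can lock out privileged access. On each node:

# Check sudoers syntax, then confirm sudo works without a password prompt
visudo -cf /etc/sudoers.d/username
sudo -n true && echo "passwordless sudo OK"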

Dependency Management and Tool Installation

Ansible Installation and Configuration

Kubespray requires specific versions of Ansible to ensure compatibility and proper functionality. Recent releases require at least Ansible 2.14, with Python 3.8+ as a prerequisite; the exact pins are recorded in the requirements.txt of each release branch.

System-Level Installation:
For Ubuntu/Debian systems:

sudo apt update
sudo apt install python3-pip
pip3 install ansible

For CentOS/RHEL systems:

sudo yum install epel-release
sudo yum install python3-pip
pip3 install ansible

Virtual Environment Approach:
For isolated dependency management, utilize Python virtual environments:

python3 -m venv kubespray-env
source kubespray-env/bin/activate
pip install ansible
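
Whichever installation route you choose, confirm the versions before continuing:

# Confirm the interpreter and Ansible meet the prerequisites above
python3 --version
ansible --version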

Python Environment Preparation

Kubespray depends on specific Python packages and versions for proper operation. The control node requires Python 3.8+ with several additional libraries.

Required Python Packages:
Install essential Python dependencies:

pip install -r requirements.txt

The requirements.txt file in the Kubespray repository (run the command above from within the cloned repository, covered in the next section) pins all necessary Python packages; depending on the release, these typically include:

  • ansible (core automation engine)
  • jinja2 (template engine)
  • netaddr (network address manipulation)
  • pbr (Python build tools)
  • hvac (HashiCorp Vault client)
  • kubernetes (Kubernetes Python client)

Kubespray Repository Setup

Repository Cloning:
Clone the official Kubespray repository and checkout the appropriate release branch:

git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
git checkout release-2.28

Version Compatibility:
Ensure compatibility between Kubespray, Ansible, and Kubernetes versions by consulting the compatibility matrix in the repository documentation. Each Kubespray release supports specific ranges of Ansible and Kubernetes versions, making version selection critical for deployment success.

Network Connectivity Verification

Inter-Node Communication Testing

Before proceeding with the deployment, verify that all nodes can communicate with each other over the required ports. This testing phase prevents deployment failures due to network connectivity issues.

Basic Connectivity Test:

# Test basic ICMP connectivity
ping -c 4 target-node-ip

# Test SSH connectivity
ssh username@target-node-ip "echo 'Connection successful'"

# Test specific port connectivity (nc works where telnet is not installed)
nc -zv target-node-ip 6443

Network Performance Validation:
For production deployments, validate network performance characteristics:

# Bandwidth testing between nodes
iperf3 -s  # On target node
iperf3 -c target-node-ip  # On source node

# Latency measurement
ping -c 100 target-node-ip | tail -1

DNS Resolution Configuration

Ensure proper DNS resolution for all cluster nodes, as Kubernetes relies heavily on DNS for service discovery and inter-component communication. Configure /etc/hosts entries for all nodes if DNS infrastructure is not available:

cat >> /etc/hosts << EOF
192.168.1.10 k8s-master01
192.168.1.11 k8s-master02
192.168.1.12 k8s-master03
192.168.1.20 k8s-worker01
192.168.1.21 k8s-worker02
EOF
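
Then verify that every name resolves on every node, regardless of whether resolution comes from DNS or /etc/hosts:

# getent uses the same NSS resolution path as most applications
for h in k8s-master01 k8s-master02 k8s-master03 k8s-worker01 k8s-worker02; do
    getent hosts "$h" || echo "unresolved: $h"
done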

Storage Preparation

Disk Space Requirements

Adequate storage provisioning is crucial for cluster stability and performance. Beyond the minimum requirements, consider the following storage needs:

Container Image Storage:

  • Container images can consume significant disk space
  • Plan for 50-100GB per worker node for image storage
  • Consider implementing image garbage collection policies

etcd Storage:

  • etcd requires fast, reliable storage for cluster state
  • SSD storage strongly recommended for production deployments
  • Plan for 8-10GB minimum, with room for growth
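
Disk write latency is the usual etcd bottleneck, so it is worth benchmarking candidate storage before deployment. This sketch is based on the fio benchmark recommended in the etcd documentation (the parameters approximate etcd's WAL write pattern; confirm the exact guidance against the current etcd docs):

# Measure fdatasync latency as etcd would experience it (requires the fio package)
mkdir -p /var/lib/etcd-test
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/etcd-test --size=22m --bs=2300 --name=etcd-probe
# The etcd docs suggest 99th-percentile fdatasync latency should stay under ~10ms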

Log Storage:

  • Kubernetes and application logs require dedicated storage
  • Implement log rotation policies to prevent disk exhaustion
  • Consider centralized logging solutions for production environments

Filesystem Optimization

Configure filesystems with appropriate mount options for Kubernetes workloads:

# Example fstab entry for container storage
/dev/sdb1 /var/lib/containers ext4 defaults,noatime,nobarrier 0 2

The noatime option reduces filesystem overhead by skipping access-time updates on reads. The nobarrier option can improve performance on storage with battery-backed write caches, but it is deprecated on recent kernels and risks data loss on power failure; omit it unless your storage vendor explicitly recommends it.

Time Synchronization

NTP Configuration

Accurate time synchronization across all cluster nodes is essential for proper certificate validation, log correlation, and distributed system coordination.

NTP Service Installation:
On current releases, chrony has replaced the classic ntp daemon as the default time-synchronization client (the ntp package is no longer shipped with RHEL 8 and later):

# Ubuntu/Debian
sudo apt install chrony

# CentOS/RHEL
sudo yum install chrony

NTP Configuration:
Configure reliable NTP servers in /etc/chrony.conf (Red Hat family) or /etc/chrony/chrony.conf (Debian family):

server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
server 3.pool.ntp.org iburst

Time Sync Verification:

# Check synchronization status
chronyc sources

# Verify system time
timedatectl status

Security Hardening Preparation

Initial Security Configuration

While comprehensive security hardening will be addressed in Part 4 of this series, several foundational security measures should be implemented during the preparation phase.

Disable Unnecessary Services:

# Disable common unnecessary services
systemctl disable cups
systemctl disable bluetooth
systemctl disable avahi-daemon

Update System Packages:

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y

# CentOS/RHEL
sudo yum update -y

Configure Audit Logging:
Enable system audit logging for security monitoring:

sudo systemctl enable auditd
sudo systemctl start auditd

Validation and Testing

Pre-Deployment Validation

Before proceeding to the actual Kubespray deployment, conduct comprehensive validation of all preparation steps:

System Requirements Validation:

# Check available memory
free -h

# Verify CPU information
lscpu

# Check disk space
df -h

# Validate kernel version
uname -r

Network Connectivity Matrix:
Create a comprehensive connectivity test matrix to verify all required ports are accessible between all nodes. This testing phase prevents deployment failures and reduces troubleshooting time.
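
A minimal sketch of such a matrix, probing the kubelet port between every pair of the example nodes (hostnames and user are placeholders; extend the port list to match the tables earlier in this article):

# Every node probes every other node's kubelet port over SSH
nodes="k8s-master01 k8s-master02 k8s-master03 k8s-worker01 k8s-worker02"
for src in $nodes; do
    for dst in $nodes; do
        [ "$src" = "$dst" ] && continue
        ssh "username@$src" "nc -z -w 2 $dst 10250" \
            && echo "$src -> $dst:10250 ok" || echo "$src -> $dst:10250 FAILED"
    done
done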

SSH Access Verification:

# Test SSH access to all nodes (nodes.txt: one hostname or IP per line;
# an Ansible INI inventory cannot be cat'ed directly because of group headers)
for host in $(cat nodes.txt); do
    ssh "$host" "hostname && date"
done

Looking ahead to Part 2 of this series, we'll delve deep into establishing your Kubernetes Root Certificate Authority, building upon the solid foundation we've established here. The meticulous preparation we've completed in this initial phase will prove invaluable as we progress through the more complex aspects of certificate management and trust establishment.

The infrastructure groundwork we've laid provides the essential platform for a successful Kubernetes deployment. Each configuration step, from hardware provisioning to network optimization, contributes to the overall reliability and performance of your final cluster. As we transition to the certificate authority establishment phase, these foundational elements will support the sophisticated security architecture that modern Kubernetes deployments require.

By following these comprehensive preparation steps, you've established a robust foundation that will support not only the initial deployment but also the long-term operation and maintenance of your Kubernetes cluster. The attention to detail invested in this preparatory phase will pay dividends throughout the lifecycle of your infrastructure, ensuring smooth operations and simplified troubleshooting when issues arise.