Multi-Master Kubernetes Lab Cluster

A high-availability Kubernetes cluster running on KVM/libvirt virtual machines, featuring 3 control plane nodes, 2 worker nodes, and an HAProxy load balancer - all automated with Vagrant and Ansible.

Architecture

Cluster Topology

                    ┌──────────────────┐
                    │   HAProxy LB     │
                    │ 192.168.100.10   │
                    └────────┬─────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
    ┌────▼────┐         ┌────▼────┐        ┌────▼────┐
    │  cp1    │         │  cp2    │        │  cp3    │
    │ .100.11 │         │ .100.12 │        │ .100.13 │
    │ Master  │         │ Master  │        │ Master  │
    └─────────┘         └─────────┘        └─────────┘
    
         ┌─────────────────────────────────────┐
         │                                     │
    ┌────▼────┐                           ┌────▼────┐
    │   w1    │                           │   w2    │
    │ .100.21 │                           │ .100.22 │
    │ Worker  │                           │ Worker  │
    └─────────┘                           └─────────┘

Node Specifications

Node Type      Hostname  IP              vCPUs  RAM    Disk  Role
Load Balancer  lb        192.168.100.10  1      512MB  10GB  HAProxy (API endpoint)
Control Plane  cp1       192.168.100.11  2      3GB    20GB  Kubernetes master
Control Plane  cp2       192.168.100.12  2      3GB    20GB  Kubernetes master
Control Plane  cp3       192.168.100.13  2      3GB    20GB  Kubernetes master
Worker         w1        192.168.100.21  2      4GB    30GB  Application workloads
Worker         w2        192.168.100.22  2      4GB    30GB  Application workloads

Total Resources: 11 vCPUs, 17.5GB RAM, 130GB disk (thin-provisioned)

Network Configuration

  • Virtual Network: 192.168.100.0/24 (isolated, NAT-ed)
  • Control Plane Endpoint: 192.168.100.10:6443 (HAProxy VIP)
  • Pod Network CIDR: 192.168.0.0/16 (Calico VXLAN)
  • Service CIDR: 10.96.0.0/12 (default)

Technology Stack

Infrastructure

  • Hypervisor: KVM/libvirt
  • Provisioning: Vagrant (vagrant-libvirt plugin)
  • Base OS: Rocky Linux 9.3
  • Automation: Ansible

Kubernetes

  • Version: 1.34.3 (configurable in ansible/inventory/group_vars/all.yml)
  • Container Runtime: containerd 2.2.0
  • CNI: Calico v3.29.1 (VXLAN mode, no BGP)
  • Load Balancer: HAProxy 2.4+

High Availability Features

  • 3-node etcd cluster (quorum-based, can lose 1 node)
  • 3 API servers (load balanced via HAProxy)
  • Stacked control plane (etcd runs on control plane nodes)
  • Automatic certificate distribution (via kubeadm --upload-certs)
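
The certificates uploaded with --upload-certs expire after roughly two hours. The playbooks handle distribution automatically, but if a control-plane join ever has to be redone by hand later, the standard kubeadm commands (run against cp1) look like this; the <key> placeholder is whatever the first command prints:

# Re-upload control plane certificates and print a fresh certificate key
vagrant ssh cp1 -c "sudo kubeadm init phase upload-certs --upload-certs"

# Print a worker join command; append --control-plane --certificate-key <key>
# when running it on a control plane node
vagrant ssh cp1 -c "sudo kubeadm token create --print-join-command"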

Prerequisites

Host System Requirements

  • OS: Linux (tested on Fedora/Ultramarine)
  • CPU: x86_64 with virtualization support (Intel VT-x or AMD-V)
  • RAM: 20GB+ available (32GB recommended)
  • Disk: 120GB+ free space
  • KVM: Loaded and functional

Software Dependencies

Install on your host machine:

# Fedora/RHEL/Rocky/Ultramarine
sudo dnf install -y \
  libvirt \
  libvirt-devel \
  qemu-kvm \
  ruby-devel \
  gcc \
  make \
  ansible

# Enable and start libvirt
sudo systemctl enable --now libvirtd

# Install Vagrant
# Download from https://developer.hashicorp.com/vagrant/downloads
# Or use package manager

# Install vagrant-libvirt plugin
vagrant plugin install vagrant-libvirt

Verify installation:

virsh list --all                    # Should work without errors
kvm-ok || grep -E 'vmx|svm' /proc/cpuinfo  # Verify virtualization
vagrant --version                   # Should show version
ansible --version                   # Should show version

Quick Start

1. Provision Virtual Machines

cd /path/to/this/repo
vagrant up

This creates all 6 VMs (takes 5-10 minutes). Vagrant will:

  • Download Rocky Linux 9 base image (first time only)
  • Create VMs with specified resources
  • Configure static IPs on isolated network
  • Set up SSH keys for password-less access

Generate SSH config for Ansible:

vagrant ssh-config > .vagrant/ssh-config
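
Before running any playbooks, you can confirm Ansible reaches every VM (this assumes the repo's ansible.cfg wires in the generated SSH config; the -i flag just makes the inventory explicit):

ansible -i ansible/inventory/hosts.ini all -m ping   # every node should answer "pong"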

2. Configure Nodes

Run the common setup playbook (installs containerd, Kubernetes packages, configures firewall):

ansible-playbook ansible/playbooks/common.yml

Configure the load balancer:

ansible-playbook ansible/playbooks/loadbalancer.yml
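
A quick sanity check that HAProxy is up and listening on the API endpoint (standard tools, nothing repo-specific):

vagrant ssh lb -c "sudo systemctl is-active haproxy"
vagrant ssh lb -c "sudo ss -tlnp | grep 6443"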

3. Bootstrap Kubernetes Cluster

Initialize the first control plane node:

ansible-playbook ansible/playbooks/kubeadm-init.yml

Join remaining control planes and workers:

ansible-playbook ansible/playbooks/join-nodes.yml

This automatically:

  • Joins cp2 and cp3 as additional control planes
  • Joins w1 and w2 as worker nodes
  • Fetches the kubeconfig to .kube/config locally

4. Install CNI (Calico)

ansible-playbook ansible/playbooks/cni-calico.yml

This installs Calico in VXLAN mode (no BGP complexity).
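
To watch the rollout yourself (the verify playbook in the next step also covers this):

export KUBECONFIG=$(pwd)/.kube/config
kubectl rollout status daemonset/calico-node -n kube-system --timeout=5m
kubectl get pods -n kube-system -l k8s-app=calico-node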

5. Verify Cluster

ansible-playbook ansible/playbooks/verify-cluster.yml

Or manually:

export KUBECONFIG=$(pwd)/.kube/config
kubectl get nodes
kubectl get pods -A

Expected output:

NAME   STATUS   ROLES           AGE   VERSION
cp1    Ready    control-plane   1h    v1.34.3
cp2    Ready    control-plane   1h    v1.34.3
cp3    Ready    control-plane   1h    v1.34.3
w1     Ready    <none>          1h    v1.34.3
w2     Ready    <none>          1h    v1.34.3

Project Structure

.
├── Vagrantfile                      # VM definitions
├── ansible.cfg                      # Ansible configuration
├── ansible/
│   ├── inventory/
│   │   ├── hosts.ini               # Inventory (groups: control_planes, workers, loadbalancer)
│   │   └── group_vars/
│   │       └── all.yml             # Global variables (versions, CIDRs)
│   └── playbooks/
│       ├── common.yml              # Node setup (containerd, k8s packages, firewall)
│       ├── loadbalancer.yml        # HAProxy configuration
│       ├── kubeadm-init.yml        # Initialize first control plane
│       ├── join-nodes.yml          # Join additional nodes
│       ├── remove-calico.yml       # Remove Calico CNI
│       ├── cni-calico.yml          # Install Calico (VXLAN mode)
│       └── verify-cluster.yml      # Comprehensive health checks
├── templates/
│   ├── containerd-config.toml.j2   # Containerd config (systemd cgroup)
│   ├── haproxy.cfg.j2              # HAProxy config (roundrobin to API servers)
│   ├── hosts-block.j2              # /etc/hosts entries for all nodes
│   └── kubeadm-config.yaml.j2      # Cluster configuration
└── README.md

How It Works

1. VM Provisioning (Vagrant)

Vagrant creates 6 VMs on an isolated libvirt network with NAT for internet access. Each VM:

  • Gets a static IP assignment
  • Has SSH keys pre-configured
  • Runs Rocky Linux 9 base OS
  • Uses thin-provisioned qcow2 disks

2. System Configuration (Ansible)

common.yml prepares all Kubernetes nodes:

  • Disables swap (required by kubelet)
  • Loads kernel modules (overlay, br_netfilter, vxlan)
  • Configures sysctl for networking (IP forwarding, bridge netfilter)
  • Installs containerd with systemd cgroup driver
  • Installs kubelet, kubeadm, kubectl
  • Opens firewall ports:
    • All nodes: 10250 (kubelet), 4789 (VXLAN)
    • Control planes: 6443 (API), 2379-2381 (etcd), 179 (BGP)
    • Workers: 30000-32767 (NodePort)
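
For reference, the node preparation above corresponds roughly to these manual steps (a sketch of the standard kubeadm prerequisites; the exact tasks live in common.yml):

sudo swapoff -a && sudo sed -i '/ swap / s/^/#/' /etc/fstab   # disable swap now and on reboot
sudo modprobe overlay && sudo modprobe br_netfilter
cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system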

loadbalancer.yml configures HAProxy:

  • Listens on 192.168.100.10:6443
  • Load balances to all 3 API servers (roundrobin)
  • Health checks: TCP connection to port 6443
  • SELinux configuration for port binding
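
To inspect and validate the rendered configuration on the lb VM (assuming the standard /etc/haproxy/haproxy.cfg path):

vagrant ssh lb -c "sudo haproxy -c -f /etc/haproxy/haproxy.cfg"         # syntax check
vagrant ssh lb -c "sudo grep -E 'bind|server' /etc/haproxy/haproxy.cfg" # frontend bind and backend servers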

3. Cluster Initialization

kubeadm-init.yml on cp1:

  • Generates kubeadm config with:
    • Control plane endpoint: 192.168.100.10:6443
    • Pod subnet: 192.168.0.0/16 (Calico)
    • Kubernetes version: 1.34.3
  • Initializes cluster with --upload-certs (distributes certs for joining)
  • Creates join commands for control planes and workers
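
The playbook drives this through the rendered templates/kubeadm-config.yaml.j2 rather than command-line flags; a roughly equivalent flag-based invocation on cp1, for illustration only, would be:

sudo kubeadm init \
  --control-plane-endpoint "192.168.100.10:6443" \
  --pod-network-cidr "192.168.0.0/16" \
  --kubernetes-version "1.34.3" \
  --upload-certs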

join-nodes.yml:

  • Retrieves join tokens from cp1
  • Joins cp2, cp3 as control planes (with --control-plane flag)
  • Joins w1, w2 as workers
  • Fetches kubeconfig to local machine

4. CNI Installation

cni-calico.yml:

  • Downloads Calico v3.29.1 manifest
  • Configures VXLAN mode (no BGP):
    • CALICO_IPV4POOL_IPIP: Never
    • CALICO_IPV4POOL_VXLAN: Always
  • Patches readiness probe (felix-only, no BIRD check)
  • Waits for Calico DaemonSet rollout
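
Once the rollout finishes, the VXLAN settings can be confirmed directly on the calico-node DaemonSet:

kubectl get ds calico-node -n kube-system -o yaml | grep -E -A1 'CALICO_IPV4POOL_(IPIP|VXLAN)'
# Expect IPIP=Never and VXLAN=Always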

5. High Availability Mechanics

etcd Quorum:

  • 3 members = can lose 1 and maintain quorum
  • etcd-cp1: 192.168.100.11:2380
  • etcd-cp2: 192.168.100.12:2380
  • etcd-cp3: 192.168.100.13:2380

API Server Load Balancing:

  • HAProxy distributes requests across 3 API servers
  • If one control plane fails, API remains accessible
  • Clients connect to VIP (192.168.100.10:6443)

Scheduler & Controller Manager:

  • Run on all 3 control planes
  • Leader election ensures only one is active
  • Automatic failover if leader fails
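
You can see which control plane currently holds each lease:

kubectl -n kube-system get lease kube-scheduler kube-controller-manager \
  -o custom-columns='NAME:.metadata.name,HOLDER:.spec.holderIdentity'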

Configuration

Changing Kubernetes Version

Edit ansible/inventory/group_vars/all.yml:

kubernetes_version: "1.31.4"  # Change this

Then re-run the common playbook so the nodes pick up the new package versions (upgrading an already-initialized cluster additionally requires kubeadm upgrade, or re-creating the cluster):

ansible-playbook ansible/playbooks/common.yml

Changing Pod Network CIDR

Edit ansible/inventory/group_vars/all.yml:

pod_network_cidr: "10.244.0.0/16"  # For Flannel

Note: This must be set BEFORE cluster initialization; changing it afterwards requires re-creating the cluster.

Changing VM Resources

Edit Vagrantfile:

NODES = {
  "cp1" => { ip: "192.168.100.11", memory: 4096, cpus: 4, disk: "30G" },
  # ...
}

Destroy and recreate VMs:

vagrant destroy -f
vagrant up

Maintenance

Remove and Reinstall Calico

If CNI has issues:

ansible-playbook ansible/playbooks/remove-calico.yml
ansible-playbook ansible/playbooks/cni-calico.yml

Check Cluster Health

# All nodes should be Ready
kubectl get nodes

# All system pods should be Running
kubectl get pods -n kube-system

# Check etcd cluster
kubectl exec -n kube-system etcd-cp1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/peer.crt \
  --key=/etc/kubernetes/pki/etcd/peer.key \
  member list -w table

# Check HAProxy stats
vagrant ssh lb -c "sudo systemctl status haproxy"

Adding More Worker Nodes

  1. Edit Vagrantfile to add new worker definition
  2. Run vagrant up <new-worker-name>
  3. Add to ansible/inventory/hosts.ini under [workers]
  4. Run ansible-playbook ansible/playbooks/common.yml --limit=<new-worker>
  5. Generate join command on cp1:
    vagrant ssh cp1 -c "sudo kubeadm token create --print-join-command"
  6. SSH to new worker and run join command
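
For example, with a hypothetical new worker named w3, step 6 looks like this (token and hash come from the command printed in step 5):

vagrant ssh w3 -c "sudo kubeadm join 192.168.100.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>"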

Troubleshooting

Nodes Not Ready

Check Calico pods:

kubectl get pods -n kube-system -l k8s-app=calico-node

If not ready, check logs:

kubectl logs -n kube-system <calico-node-pod> -c calico-node

Common fix: Remove and reinstall Calico

ansible-playbook ansible/playbooks/remove-calico.yml
ansible-playbook ansible/playbooks/cni-calico.yml

API Server Unreachable

Check HAProxy:

vagrant ssh lb -c "sudo systemctl status haproxy"
vagrant ssh lb -c "sudo tail -f /var/log/haproxy.log"

Check the control plane API servers (kubeadm runs them as static pods under the kubelet, not as systemd services):

for i in 1 2 3; do
  vagrant ssh cp$i -c "sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps --name kube-apiserver"
done

Test connection:

curl -k https://192.168.100.10:6443/healthz

Pods Can't Communicate

Check Calico:

kubectl get pods -n kube-system -l k8s-app=calico-node -o wide

Verify VXLAN mode:

kubectl get cm -n kube-system calico-config -o yaml | grep calico_backend
# Should show: vxlan (not bird)

Check firewall on nodes:

vagrant ssh cp1 -c "sudo firewall-cmd --list-all"
# Should include: 4789/udp (VXLAN)

etcd Cluster Issues

Check etcd members:

kubectl exec -n kube-system etcd-cp1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/peer.crt \
  --key=/etc/kubernetes/pki/etcd/peer.key \
  endpoint status --cluster -w table

Should show 3 healthy members.

Firewall Blocking Traffic

Re-run common playbook to ensure all ports are open:

ansible-playbook ansible/playbooks/common.yml

Cleanup

Destroy All VMs

vagrant destroy -f

This removes all VMs but keeps:

  • Downloaded base box (for faster re-creation)
  • Vagrantfile and Ansible playbooks
  • Any saved kubeconfig in .kube/

Complete Cleanup

vagrant destroy -f
rm -rf .vagrant .kube
vagrant box remove generic/rocky9

Common Use Cases

Testing Application Deployments

# Deploy sample app
kubectl create deployment nginx --image=nginx:alpine --replicas=3
kubectl expose deployment nginx --port=80 --type=NodePort

# Get NodePort
kubectl get svc nginx

# Access from host
NODE_PORT=$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}')
curl http://192.168.100.21:$NODE_PORT  # w1's IP

Testing HA Failover

# Watch nodes
watch kubectl get nodes

# In another terminal, power off a control plane
vagrant halt cp2

# Cluster should remain operational with cp1 and cp3
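
# Verify from the host while cp2 is still down: the API should keep answering
# through the HAProxy VIP and etcd retains quorum with 2 of 3 members
curl -k https://192.168.100.10:6443/healthz
kubectl get nodes   # cp2 shows NotReady; the other nodes stay Ready
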
# Bring it back
vagrant up cp2

Experimenting with Kubernetes Features

Since this is a full multi-master cluster, you can test:

  • Leader election: Deploy apps with leader-election
  • etcd operations: Backup, restore, member management
  • HA configurations: Disruption budgets, rolling updates across AZs
  • Network policies: Calico supports advanced network policies
  • Storage: Add persistent volumes via NFS or local storage
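
As an example of the network-policy item above, a minimal default-deny ingress policy (plain Kubernetes API, enforced by Calico) can be applied straight from the shell:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}      # selects every pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules are listed, so all ingress is denied
EOF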

Performance Tuning

Host Machine

Pass the host CPU model through to the guests and give them more vCPUs:

# Edit Vagrantfile, add to provider block:
lv.cpu_mode = 'host-passthrough'
lv.cpus = 4  # Increase from 2

Kubernetes

Reduce logging verbosity:

# Edit /var/lib/kubelet/config.yaml on the nodes, then restart the kubelet
# (sudo systemctl restart kubelet) for the change to take effect
logging:
  verbosity: 2  # Default is 4

Security Considerations

This is a LAB environment. For production:

  1. TLS certificates: Default certs are auto-generated. Use proper PKI
  2. RBAC: Configure proper role-based access control
  3. Network policies: Implement strict pod network policies
  4. Secrets encryption: Enable encryption at rest for secrets
  5. Audit logging: Enable and configure audit logs
  6. kubeconfig: Protect the admin kubeconfig (stored in .kube/config)
  7. Host firewall: The VMs' firewall is configured but test thoroughly
  8. Updates: Keep Kubernetes and OS packages updated

License

This is a learning/lab environment. Use at your own risk.

Contributing

This is a personal lab setup. Feel free to fork and modify for your needs.


Questions? Check the troubleshooting section or review the playbooks in ansible/playbooks/ for implementation details.
