Powering Through: Automating Home Infrastructure for Bay Area’s Blackouts
Introduction
Being based in the Bay Area and managing a complex home infrastructure – including Kubernetes clusters, VMs, and Proxmox nodes – I’ve faced the significant challenge of rolling blackouts. Here’s my journey, including the lessons learned, the solutions implemented, and my plans for the future.
The 2 AM Wake-Up Call
My UPS alarm went off at 2 AM. With only 15 minutes of battery left, I failed to turn off my nodes in time, and they went down in an unsafe shutdown. I knew I needed a planned shutdown procedure instead of a scramble.
Manual Solution: The First Success
I wrote an Ansible playbook that drained the Kubernetes nodes and safely shut them down. During the next blackout, this manual method worked seamlessly, shutting down everything in a controlled manner. But I wanted something more robust.
Transition to Low-Power Mode: Essentials Stay On
Building upon my initial success, I added a low-power mode that keeps essential services, such as Home Assistant and Omada Controller, running on a low-powered Celeron node. Additionally, a Proxmox VM running OPNsense serves as my backup firewall/router, providing critical connectivity. This lets me restore WiFi quickly and keep control of the lights during an outage, making the setup more resilient.
Custom Draining Explained
Here’s why I used specific options in the draining script:
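kubectl drain "$node" --force --delete-emptydir-data --ignore-daemonsets --pod-selector='essential!=true'
The --force option lets the drain proceed even when the node runs pods that no controller manages, and --delete-emptydir-data (the current name for the deprecated --delete-local-data) lets it proceed when pods use emptyDir volumes, whose data is then deleted. Together they ensure the node drains without getting stuck. The --ignore-daemonsets option skips DaemonSet-managed pods, which would otherwise block the drain. The --pod-selector='essential!=true' option limits eviction to pods that are not labeled essential=true, so essential pods stay where they are. Note that --selector filters nodes rather than pods and cannot be combined with a node name, which is why the pod filter uses --pod-selector.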
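For illustration, here is a minimal sketch of what a custom-drain.sh could look like. It is not my exact script: the drain_node function name, the KUBECTL and DRAIN_TIMEOUT variables, and the 120s default are assumptions made for the example. It also assumes a kubectl recent enough to offer --delete-emptydir-data, and it uses --pod-selector to leave pods labeled essential=true in place.

```shell
#!/bin/sh
# Sketch of a custom drain script. drain_node, KUBECTL and DRAIN_TIMEOUT
# are hypothetical names chosen for this example.

# Drain one node, evicting only pods that are not labeled essential=true.
drain_node() {
    node="$1"
    # KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
    "${KUBECTL:-kubectl}" drain "$node" \
        --force \
        --delete-emptydir-data \
        --ignore-daemonsets \
        --pod-selector='essential!=true' \
        --timeout="${DRAIN_TIMEOUT:-120s}"
}

# When saved as a script, the entry point would be:
#   drain_node "$1"
```

The timeout matters during a blackout: with limited battery time, a drain that gives up after a bounded wait is safer than one that blocks forever on a stubborn pod. Running with KUBECTL=echo prints the arguments instead of touching the cluster.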
Low-Power Playbook Example
Here’s an excerpt from the playbook designed for low-power mode:
- hosts: workers
  become: true  # shutdown requires root privileges
  tasks:
    - name: Custom Drain worker nodes
      command: /path/to/custom-drain.sh {{ inventory_hostname }}
    - name: Stop unnecessary worker nodes
      command: shutdown -h now
Future Plans: Full Automation
- Monitor Power Status: Create a script that interfaces with the NUT server to keep an eye on power changes.
- Trigger Playbooks: Automate the execution of playbooks during a power outage.
- Reversing the Process: Develop a playbook to restore the system when power is back.
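None of this is built yet, but the three steps above could start from something like the following sketch. It assumes the NUT client tool upsc is installed and that ups.status reports the usual tokens (OL for online, OB for on battery, LB for low battery). The function names, the UPS_NAME default of ups@localhost, and the low-power.yml playbook name are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: UPS_NAME, PLAYBOOK, KUBECTL and all function names are
# assumptions made for this example.

# Classify a NUT ups.status string such as "OL CHRG" or "OB LB".
power_state() {
    case " $1 " in
        *" LB "*) echo "low-battery" ;;
        *" OB "*) echo "on-battery" ;;
        *" OL "*) echo "online" ;;
        *)        echo "unknown" ;;
    esac
}

# Poll the UPS once and run the low-power playbook if it is on battery.
check_power_once() {
    status="$(upsc "${UPS_NAME:-ups@localhost}" ups.status 2>/dev/null)"
    case "$(power_state "$status")" in
        on-battery|low-battery)
            ansible-playbook "${PLAYBOOK:-low-power.yml}"
            ;;
    esac
}

# Reverse the drain once the given nodes have booted again.
restore_nodes() {
    for node in "$@"; do
        "${KUBECTL:-kubectl}" uncordon "$node"
    done
}
```

A simple loop such as `while true; do check_power_once; sleep 30; done` would do the polling. A real version should also remember that the playbook has already been triggered, so it does not run again on every poll while the outage lasts.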
Conclusion
From an initial struggle to an elegant manual solution, and finally to a plan for full automation, I’ve turned the challenge of Bay Area’s blackouts into an opportunity to innovate my home infrastructure. For those facing similar challenges, this journey proves that with ingenuity and the right technical approach, you can build a resilient system that thrives in uncertainty.
automation homelab blackouts Kubernetes Proxmox Ansible low-power-mode resilience
2023-08-09 17:16