October 17, 2024 in HPC, OpenCHAMI, Booting by Alex Lovell-Troy, 2 minutes
One of the key strengths of OpenCHAMI is its flexibility. The software is fully containerized and can be deployed using a variety of methods. Our goal was to ensure sysadmins have the freedom to deploy it in a way that best fits their infrastructure, whether that’s through Docker, Podman, or another container management system.
For those looking to get started quickly, we recommend our quickstart, which uses docker-compose to spin up the services and infrastructure needed for OpenCHAMI.
On the Badger cluster, however, we took a different approach. Our sysadmins already have a set of procedures for installing and managing systems with Ansible and systemd services. Rather than asking them to learn our development technology, we approached deployment of the microservices by integrating with what they were already used to. We used Podman Quadlets, which integrate with systemd. Quadlets allow you to manage containers as systemd unit files, providing an easy way to orchestrate services while keeping system-level control.
The following unit file describes the postgres container that holds all the state for OpenCHAMI. Many of the directives should be familiar from the corresponding docker-compose file in the quickstart.
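As a minimal sketch of such a Quadlet unit, assume a file named `postgres.container` placed in `/etc/containers/systemd/`. The image tag, volume name, and environment file path below are illustrative assumptions, not the values used on Badger.

```ini
# /etc/containers/systemd/postgres.container (hypothetical file name)
[Unit]
Description=PostgreSQL state store for OpenCHAMI

[Container]
# Assumed image tag; pin whichever version your site has validated.
Image=docker.io/library/postgres:16
ContainerName=postgres
# Named volume (assumed name) so the database survives container restarts.
Volume=postgres-data:/var/lib/postgresql/data
# Assumed path; would hold POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB.
EnvironmentFile=/etc/openchami/postgres.env

[Service]
Restart=always

[Install]
# Start the generated postgres.service at boot.
WantedBy=multi-user.target
```

After a `systemctl daemon-reload`, Quadlet generates a `postgres.service` unit from a file like this, which can then be started with `systemctl start postgres.service`. The `Image=`, `ContainerName=`, `Volume=`, and `EnvironmentFile=` keys play the same roles as `image`, `container_name`, `volumes`, and `env_file` in a compose file.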
This approach allowed us to take advantage of systemd’s service management while still using containers. Sysadmins can control and monitor the containers as they would any other systemd service, simplifying operations and improving reliability.
At LANL, we leverage Ansible for a lot of our sysadmin tasks. In order for our sysadmins to deploy OpenCHAMI without developer support, we needed to meet them where they were, not force them to learn a new technology. We built on our work with Quadlets and created a set of Ansible roles, using the podman container module, that set up each of the microservices in the right order. A simple Ansible command creates and starts the unit files.
Once created and started, the units behave like any others on the system. Our admins could troubleshoot them with tools they understand and even trace dependencies as they would for any other system in the datacenter.
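Because each Quadlet becomes an ordinary unit, startup order can also be declared with standard systemd directives. The sketch below assumes a hypothetical `smd.container` for an OpenCHAMI microservice that depends on a `postgres.service` unit. The file name, image reference, and port are assumptions for illustration, not the Badger configuration.

```ini
# /etc/containers/systemd/smd.container (hypothetical file name)
[Unit]
Description=OpenCHAMI SMD (illustrative)
# Standard systemd dependency handling: pull in the database unit
# and start this service only after it.
Requires=postgres.service
After=postgres.service

[Container]
# Assumed image reference; substitute the one your deployment uses.
Image=ghcr.io/openchami/smd:latest
ContainerName=smd
# Assumed port mapping.
PublishPort=27779:27779

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
```

With dependencies declared this way, `systemctl list-dependencies smd.service` shows the chain back to the database, and `journalctl -u smd.service` shows the unit's logs.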
In the next post, we’ll explore how to interact with OpenCHAMI via the CLI and API to manage large clusters efficiently.