If you’re familiar with Helm and use several Helm charts to deploy everything in your stack, you have certainly already felt the lack of lifecycle management. By default, Helm provides Hooks to manage lifecycles. This is excellent when you’re the chart owner, as you control them.
But something is missing. How do you manage lifecycles when you’re using a community chart? You have to fork the original chart, add your hooks, and maintain them over time (more or less work depending on how customized your Hooks are). Quite boring, right?
Also, Hooks require a container to run your code as a Job, so you have to build a container just for this purpose, store it in a registry, and so on.
Finally, how do you handle exceptions and fallbacks, and ensure your app works as expected (beyond Kubernetes lifecycles)? There is no built-in way to do that with Helm.
That’s why we decided to build something on top of Helm directly in the Engine, to add a common lifecycle mechanism.
Based on the Terraform Helm provider
In another article, I explained why we removed Helm from Terraform. Even though the move was necessary, the way the Helm provider declared chart configuration was pretty good. So we decided to use something close to it, with a struct.
Compared to the Terraform Helm provider’s chart configuration, you can note some additions we support:
Direct YAML content in yaml_files_content, which is sometimes super convenient.
last_breaking_version_requiring_restart: allows us to uninstall a chart before installing it again when a community chart introduces major breaking changes (and, of course, only when no data is associated with it)
We then decided to create default values, as it’s very common to share the same ones:
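The configuration ends up looking roughly like the struct below. This is a simplified sketch based only on the fields mentioned in this article and in the code further down; the exact field types, any extra fields, and the default values are assumptions, not the Engine’s real definition.
RUST
// Simplified, illustrative sketch of the chart configuration struct.
pub enum HelmAction {
    Deploy,
    Destroy,
    Skip,
}

pub struct ChartInfo {
    pub name: String,
    pub namespace: String, // simplified: the real code wraps namespaces in a dedicated type
    pub action: HelmAction,
    pub values_files: Vec<String>,
    pub yaml_files_content: Vec<String>,
    pub last_breaking_version_requiring_restart: Option<String>,
}

// Common defaults so most charts only need to set a name and a values file.
impl Default for ChartInfo {
    fn default() -> Self {
        ChartInfo {
            name: "undefined".to_string(),
            namespace: "kube-system".to_string(),
            action: HelmAction::Deploy,
            values_files: vec![],
            yaml_files_content: vec![],
            last_breaking_version_requiring_restart: None,
        }
    }
}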
Here starts the exciting part. We’re using an interface (called a trait in Rust):
RUST
pub trait HelmChart: Send {
    fn run(
        &self,
        kubernetes_config: &Path,
        envs: &[(String, String)],
    ) -> Result<Option<ChartPayload>, SimpleError> {
        info!("prepare and deploy chart {}", &self.get_chart_info().name);
        let payload = self.check_prerequisites()?;
        let payload = self.pre_exec(&kubernetes_config, &envs, payload)?;
        let payload = match self.exec(&kubernetes_config, &envs, payload.clone()) {
            Ok(payload) => payload,
            Err(e) => {
                error!(
                    "Error while deploying chart: {:?}",
                    e.message.clone().expect("no error message provided")
                );
                self.on_deploy_failure(&kubernetes_config, &envs, payload)?;
                return Err(e);
            }
        };
        let payload = self.post_exec(&kubernetes_config, &envs, payload)?;
        let payload = self.validate(&kubernetes_config, &envs, payload)?;
        Ok(payload)
    }
}
As you can see, there are several steps (a minimal usage sketch follows the list):
check_prerequisites: ensure everything is OK before performing any action
pre_exec: run code before any action is applied to a chart
exec: perform an action (deploy/delete) on a chart
on_deploy_failure: run code when an action fails
post_exec: run code after the Helm action
validate: ensure deployed applications are working as expected
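To make the flow concrete, here is a minimal, hypothetical sketch of a chart that keeps every default step, plus a tiny driver that runs a list of charts in order. The CommonChart and deploy_charts names are illustrative, not the Engine’s actual API, and this assumes every lifecycle method above has a default implementation (only get_chart_info must be provided).
RUST
// Illustrative only: a chart that relies entirely on the default lifecycle steps.
pub struct CommonChart {
    pub chart_info: ChartInfo,
}

impl HelmChart for CommonChart {
    fn get_chart_info(&self) -> &ChartInfo {
        &self.chart_info
    }
}

// Illustrative driver: run each chart in order and stop at the first failure.
pub fn deploy_charts(
    kubernetes_config: &Path,
    envs: &[(String, String)],
    charts: Vec<Box<dyn HelmChart>>,
) -> Result<(), SimpleError> {
    for chart in charts {
        chart.run(kubernetes_config, envs)?;
    }
    Ok(())
}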
Lifecycles
Let’s dig into what those lifecycles contain.
check_prerequisites
By default, we simply check the prerequisites, like the file permissions on helm values override files:
RUST
fn check_prerequisites(&self) -> Result<Option<ChartPayload>, SimpleError> {
    let chart = self.get_chart_info();
    for file in chart.values_files.iter() {
        match fs::metadata(file) {
            Ok(_) => {}
            Err(e) => {
                return Err(SimpleError {
                    kind: SimpleErrorKind::Other,
                    message: Some(format!(
                        "Can't access helm chart override file {} for chart {}. {:?}",
                        file, chart.name, e
                    )),
                })
            }
        }
    }
    Ok(None)
}
pre_exec
Pre exec is really useful for some charts: it lets you pre-check, validate, or update things before going further. It is super useful, for example, for applications that were already deployed without Helm, when you want to give Helm ownership by updating annotations (like the AWS CNI). By default, nothing is done:
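As a minimal sketch, a no-op pre_exec that simply passes the payload through can look like this, reusing the signature from the run method above (the exact body is an assumption):
RUST
// Sketch of a no-op default pre_exec: nothing is done, the payload is passed through.
// Charts like the AWS CNI one override this, for example to patch annotations so Helm takes ownership.
fn pre_exec(
    &self,
    _kubernetes_config: &Path,
    _envs: &[(String, String)],
    payload: Option<ChartPayload>,
) -> Result<Option<ChartPayload>, SimpleError> {
    Ok(payload)
}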
Obviously, this has to be adapted for any deployed solution.
Example of usage
Let’s try with a real use case. Here, it’s the Prometheus Operator, where we need to override the exec method to manage the lifecycle of CRDs (during the uninstall phase):
RUST
#[derive(Default)]
pub struct PrometheusOperatorConfigChart {
    pub chart_info: ChartInfo,
}

impl HelmChart for PrometheusOperatorConfigChart {
    fn get_chart_info(&self) -> &ChartInfo {
        &self.chart_info
    }

    fn exec(
        &self,
        kubernetes_config: &Path,
        envs: &[(String, String)],
        payload: Option<ChartPayload>,
    ) -> Result<Option<ChartPayload>, SimpleError> {
        let environment_variables: Vec<(&str, &str)> =
            envs.iter().map(|x| (x.0.as_str(), x.1.as_str())).collect();
        let chart_info = self.get_chart_info();
        match chart_info.action {
            HelmAction::Deploy => {
                if let Err(e) = helm_destroy_chart_if_breaking_changes_version_detected(
                    kubernetes_config,
                    &environment_variables,
                    chart_info,
                ) {
                    warn!(
                        "error while trying to destroy chart if breaking change is detected: {:?}",
                        e.message
                    );
                }
                helm_exec_upgrade_with_chart_info(kubernetes_config, &environment_variables, chart_info)?
            }
            HelmAction::Destroy => {
                let chart_info = self.get_chart_info();
                match is_chart_deployed(
                    kubernetes_config,
                    environment_variables.clone(),
                    Some(get_chart_namespace(chart_info.namespace.clone()).as_str()),
                    chart_info.name.clone(),
                ) {
                    Ok(deployed) => {
                        if deployed {
                            let prometheus_crds = [
                                "prometheuses.monitoring.coreos.com",
                                "prometheusrules.monitoring.coreos.com",
                                "servicemonitors.monitoring.coreos.com",
                                "podmonitors.monitoring.coreos.com",
                                "alertmanagers.monitoring.coreos.com",
                                "thanosrulers.monitoring.coreos.com",
                            ];
                            helm_exec_uninstall_with_chart_info(
                                kubernetes_config,
                                &environment_variables,
                                chart_info,
                            )?;
                            for crd in &prometheus_crds {
                                kubectl_exec_delete_crd(kubernetes_config, crd, environment_variables.clone())?;
                            }
                        }
                    }
                    Err(e) => return Err(e),
                };
            }
            HelmAction::Skip => {}
        }
        Ok(payload)
    }
}
We’ve been using this in production at Qovery for more than five months now. From an experienced Kubernetes point of view (6+ years of experience with the Kubernetes ecosystem), I finally feel confident about Helm chart deployments.
We don’t know yet whether we’ll move this out into a dedicated library. If we receive requests, we’ll consider it.