If you have deployed anything with an Infrastructure as Code framework (Terraform, Pulumi, etc.) recently, then you have interacted with a state file, and you may not have even known it! So, what is the state file? Why is it important? What should you do with it? These are some of the most frequently asked questions when it comes to Infrastructure as Code management. So, let’s get into it!
What is a State File?
The state file is an artifact that you’re left with once an Infrastructure as Code framework finishes a deployment. It is the “single source of truth” about what was deployed, where it was deployed, and all the configuration needed to deploy it. As you can imagine, it can contain a lot of sensitive data, such as resource identifiers and configuration values. This isn’t the kind of file you want to leave on an open filesystem somewhere. If this file is compromised, you could have a very bad time.
The state file isn’t a one-and-done file for a deployment, either. The framework reads it during every update or redeployment so that it can look for drift, which is a disparity between what the state file says is deployed and what is actually deployed. It also helps performance: if the IaC file says that it needs 3 instances deployed, and there are already 2, the framework will only deploy 1. This way it doesn’t have to redeploy all the resources every time an update is made.
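To make the "3 instances wanted, 2 already there" example concrete, here is a minimal Python sketch of the comparison. It is not how any particular framework implements planning; the `Plan` class and `plan_instances` function are hypothetical names invented for this illustration, and it only handles one group of identical instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    to_create: int   # instances the tool would add
    to_destroy: int  # instances the tool would remove
    drifted: bool    # True when the state file disagrees with reality


def plan_instances(desired: int, recorded: int, actual: int) -> Plan:
    """Compare the IaC file (desired), the state file (recorded), and the
    real environment (actual) for one group of identical instances."""
    # Drift: the state file says one thing, the environment says another.
    drifted = recorded != actual
    # Assumption for this sketch: reconcile against what really exists,
    # so any drift is corrected by the same run.
    delta = desired - actual
    return Plan(
        to_create=max(delta, 0),
        to_destroy=max(-delta, 0),
        drifted=drifted,
    )


if __name__ == "__main__":
    # The example from the text: the config wants 3, and 2 already exist.
    print(plan_instances(desired=3, recorded=2, actual=2))
    # Someone deleted an instance by hand: state says 2, only 1 is running.
    print(plan_instances(desired=3, recorded=2, actual=1))
```

In the first call only one instance is created, which is the time saving described above. In the second call the mismatch between `recorded` and `actual` is flagged as drift, and two instances are needed to get back to three.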
State Management Options
When you deploy resources with Infrastructure as Code, you need to know what you’re going to do with this super important file. By default, some frameworks save a copy locally, wherever you executed that deployment from. For example, if you use the CLI for Terraform on your laptop, it saves the state file on your local machine. On the other hand, by default, Pulumi’s CLI asks you for an access token so it can save the state on the Pulumi SaaS platform. The default behavior changes from framework to framework, but in the end, your state management options boil down to two: A) Local, or B) Remote.
Local isn’t the best option, because if something happens to that machine, or if someone else tries to make a deployment update from another machine, the state may not be available anymore. This can cause big problems with deployments. The remote option is the most widely accepted choice for state management, but even within the remote option, there are a few choices to make.
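The problem with local state is easier to see in a small simulation. The Python sketch below is only an illustration under stated assumptions: `FileStateBackend` and `deploy` are hypothetical helpers, temporary directories stand in for two laptops and for a shared remote location, and the "deployment" only counts instances. Real backends also handle things this sketch ignores, such as locking and encryption.

```python
import json
import tempfile
from pathlib import Path


class FileStateBackend:
    """Stores state as a JSON file under a root directory (hypothetical)."""

    def __init__(self, root: Path) -> None:
        self.path = Path(root) / "state.json"

    def load(self) -> dict:
        # A missing file means this backend has never seen a deployment.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())

    def save(self, state: dict) -> None:
        self.path.write_text(json.dumps(state, indent=2))


def deploy(backend: FileStateBackend, desired_instances: int) -> int:
    """Pretend to deploy; return how many instances this run would create.
    Simplification: this sketch only ever scales up."""
    existing = backend.load().get("instances", 0)
    to_create = max(desired_instances - existing, 0)
    backend.save({"instances": existing + to_create})
    return to_create


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as laptop_a, \
            tempfile.TemporaryDirectory() as laptop_b, \
            tempfile.TemporaryDirectory() as shared:
        # Local state: each machine keeps its own copy of the state file.
        deploy(FileStateBackend(Path(laptop_a)), 2)
        print("local, second machine creates:",
              deploy(FileStateBackend(Path(laptop_b)), 3))

        # Remote state: both runs read and write the same location.
        deploy(FileStateBackend(Path(shared)), 2)
        print("remote, second run creates:",
              deploy(FileStateBackend(Path(shared)), 3))
```

With local state, the second machine has no record of the first deployment, so it plans all 3 instances from scratch. With the shared location, the second run sees the 2 existing instances and only plans 1 more.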
Remote State Management Options
When it comes to remote state management, you’ll need to decide where you’re going to keep those files for safekeeping. There are many third-party platforms that offer a remote state backend option. Usually, these are the same platforms that you might use to plan and execute your deployments. The other option is to create a storage bucket in your own cloud environment, under your control, and configure your Infrastructure as Code framework to store the state file there.
So, which one to choose?
Unfortunately, the answer to this question is the standard engineering answer of “it depends.” There is no right or wrong answer here, only opinions. If you’ve read this far, then you’re about to get my opinion, and how I like to think about it. There are pros and cons to using a third-party remote state backend vs. managing your own remote state backend in your cloud. But I like to think of my state files the way I think about my money. My money is very important to me. If I have a choice of putting my money into my own bank account, under my own control, vs. putting it in somebody else's bank account, under their control, I am going to side with myself every time. This is not to say that the third-party remote state backend providers are insecure in any way; it just comes down to “who touched it last.” If it’s in your cloud, with your security, then you have direct visibility into what’s going on with it at all times. Also, a third-party provider is a more attractive target for attackers, because if they compromise that one platform, they get access to many customers' confidential information at once.
In the end, the choice is yours to make. Hopefully this helped shed some light on some of the pitfalls that come along with using Infrastructure as Code, and how to avoid them.
Want to learn more? Check out this Quick Tech video by env0 about remote backends, and our guide to managing Terraform remote state using a remote backend.