It turns out that in the last three years, more than five software distribution channels have been hacked by the same group, most likely Chinese, commonly known as Barium, ShadowPad, or Wicked Panda. Their modus operandi is to plant malware in the code before the software is signed with the company’s digital signature. As a result, the malware rides unnoticed through all the software distribution channels and gets installed on hundreds of thousands of machines. It typically lies dormant, waiting to be invoked when the bad guys launch their attack. So far, the attackers seem more inclined towards spying and are using only a small subset of the machines they have compromised. The Wired article provides more details on these attackers and their methods, and I encourage you to read it.
CI/CD pipelines
To get malware inserted before the production image is built, hackers are targeting the CI/CD pipeline where software is built. CI/CD, which stands for Continuous Integration and Continuous Delivery, refers to the automation that links the steps from pushing code into a repository, running tests on it, and building artifacts (images or containers), to storing them in a container repository (e.g. Docker Hub) and deploying them on a cluster via Kubernetes. The picture below highlights the major systems and the flow between them.
In the old days, a separate operations group was the only one with access to production systems. There are drawbacks to that model which I won’t go into here, but the modern way is to blur the line between development and ops so that developers have more insight into and control over operations. However, it is not always a good idea to give everyone access to all the systems above. You want to ensure that your image repository and production systems are secure. Most developers should only engage with these systems through scripts that are stored in separate repositories, with access controls that ensure only a few privileged individuals can change them. In other words, you are limiting the attack surface available to a rogue employee or attacker who wants to insert malware into your CI/CD pipeline.
It is important to secure the laptops and devices through which developers access source code and to make sure that “credential access” scenarios (like ours :-) are not able to expose them. A best practice is to require multi-factor authentication for all developer accounts. Another common requirement (e.g. SOC 2) is to have an audit trail so you know what changes were made by whom in your CI/CD pipeline. This way, if you detect that something is wrong, you can quickly revert to the old state.
GitOps - using git for development and production
GitOps, which builds on the ideas behind DevOps and Site Reliability Engineering, is emerging as the holy grail of CI/CD development processes. Git was released in 2005 by Linus Torvalds because he needed a version control system for the Linux kernel. Since then, it has expanded in scope and is now the de facto standard for storing and managing most of the world’s software. GitOps extends this by using Git to also describe and manage the state of production systems. When your desired production state is stored in Git, you can automatically roll out changes to production without having to share cluster credentials.
Troubleshooting and fixing production systems is also handled using the Git workflow. The operator issues a pull request that triggers a review/discussion, and then when the fix is pushed out to the repo, the changes are automatically rolled to production. Furthermore, there is a clear audit trail, as all changes are recorded as commits in a version control system.
Finally, since the desired state of a production system is stored in Git as a declarative model of the cluster, you can define agents that run periodically to compare the observable state of your system with the desired state and issue an alert when things don’t match. If there is any change that is directly introduced into the production cluster by attackers, these agents will quickly detect the divergence and alert operators so corrective action can be taken immediately.
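To make the comparison step concrete, here is a minimal Python sketch of what such an agent might do once it has both states in hand. It assumes the desired state (from Git) and the observed state (from the cluster) have already been loaded into nested dictionaries; the function name diff_state and the sample service names are hypothetical and not taken from any particular GitOps tool.

```python
from typing import Any


def diff_state(desired: dict[str, Any], observed: dict[str, Any],
               prefix: str = "") -> list[str]:
    """Return a list of differences between desired and observed state."""
    drift: list[str] = []
    # Walk the union of keys so we catch both missing and unexpected entries.
    for key in sorted(set(desired) | set(observed)):
        path = f"{prefix}.{key}" if prefix else key
        if key not in observed:
            drift.append(f"missing in cluster: {path}")
        elif key not in desired:
            drift.append(f"not declared in Git: {path}")
        elif isinstance(desired[key], dict) and isinstance(observed[key], dict):
            # Recurse into nested sections of the declarative model.
            drift.extend(diff_state(desired[key], observed[key], path))
        elif desired[key] != observed[key]:
            drift.append(
                f"changed: {path} (want {desired[key]!r}, got {observed[key]!r})"
            )
    return drift


if __name__ == "__main__":
    # Hypothetical example data: what Git declares vs. what the cluster reports.
    desired = {"web": {"image": "shop/web:1.4.2", "replicas": 3}}
    observed = {
        "web": {"image": "shop/web:1.4.2-patched", "replicas": 3},
        "miner": {"image": "unknown/miner:latest", "replicas": 1},
    }
    for line in diff_state(desired, observed):
        print("ALERT:", line)
```

A real agent would run this on a schedule and send the resulting list to whatever alerting channel the operators already use. An empty list means the cluster matches what is in Git.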
While Git is a great store for source code and operations state, you do not want to store unencrypted credentials to the production system in Git. A quick search for “removed passwords” in GitHub returns close to 400,000 commits. There are ways to encrypt passwords and store them in Git, or to use tools like git-secret, but I will save that discussion for another day.
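In the meantime, a simple check that runs before a commit can catch the most obvious mistakes. The Python sketch below is only an illustration: the pattern, the function name find_plaintext_secrets, and the rule for skipping placeholders are all my own assumptions, and a crude regular expression like this will miss plenty of real secrets and flag some harmless lines.

```python
import re

# Hypothetical pattern: a credential-like key followed by ":" or "=" and a value.
SUSPICIOUS = re.compile(
    r"""(?i)(password|passwd|secret|api[_-]?key|token)\s*[:=]\s*['"]?([^\s'"]+)"""
)


def find_plaintext_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched key) for lines that look like plaintext secrets."""
    findings: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SUSPICIOUS.search(line)
        if match is None:
            continue
        value = match.group(2)
        # Values such as ${DB_PASSWORD} or <redacted> are treated as placeholders.
        if value.startswith(("$", "<", "{")):
            continue
        findings.append((lineno, match.group(1).lower()))
    return findings


if __name__ == "__main__":
    # Hypothetical config content to scan.
    sample = (
        "user: admin\n"
        "DB_PASSWORD=hunter2\n"
        "password: ${DB_PASSWORD}\n"
        "api_key = 'abc123'\n"
    )
    for lineno, key in find_plaintext_secrets(sample):
        print(f"line {lineno}: possible plaintext {key}")
```

Hooking something like this into the commit or review step keeps the check in the same Git workflow as everything else, but it is a safety net, not a substitute for keeping credentials out of the repository in the first place.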
Happy Cinco de Mayo to all!