Continuous Integration/Continuous Deployment with FOSS Tools
Up your DevOps game! Get the fundamentals of CI/CD with FOSS tools now!
One of the hottest topics within the DevOps space is Continuous Integration and Continuous Deployment (CI/CD). This attention has drawn lots of investment dollars, and a vast array of proprietary Software as a Service (SaaS) tools have been created in the CI/CD space, which traditionally has been dominated by free and open-source software (FOSS) tools. With many of these SaaS options priced so low, is FOSS still the right choice?
It depends. In many cases, the cost of self-hosting these FOSS tools will be greater than the cost to use a non-FOSS SaaS option. However, even in today's cloud-centric and SaaS-saturated world, you may have good reasons to self-host FOSS. Whatever those reasons may be, just don't forget that "Free" isn't free when it comes to keeping a service running reliably 24/7/365. If you're looking at FOSS as a means to save money, make sure you account for those costs.
Even with those costs accounted for, FOSS still delivers a lot of value, especially to small and medium-sized organizations that are taking their first steps into DevOps and CI/CD. Starting with a commercialized FOSS product is a great middle ground. It gives a smooth growth path into the more advanced proprietary features, allowing you to pay for those only once you need them. Often called Open Core, this approach isn't universally loved, but when applied well, it has allowed for a lot of value to be created for everyone involved.
An Embarrassment of Riches
The DevOps concept exploded in the past several years. The term quickly saturated the mainstream technology industry. With this increased mindshare comes a corresponding increase in the number of tools available to accomplish DevOps-related tasks. For a DevOps practitioner, that's both a blessing and a curse. Thanks to the endless buffet of options, you're sure to find something that meets your needs, but to a newcomer, the multitude of choices is overwhelming. Combine that with the vast scope of tasks that fall under the DevOps umbrella and the competing claims of "best" from all sides, and you have a recipe for paralysis. A good place for finding tools and filtering by a variety of criteria is DevOpsBookmarks.com. The content is all open source, and the maintainers are diligent about merging contributions, but it hasn't seen a lot of updates lately. Despite that, it makes a great jumping-off point. If you find something noteworthy that should be included, a pull request would be appreciated!
Narrowing the Field
When talking with clients or peers about DevOps concepts, it's useful to break things into "lanes" to help simplify the conversation and provide rough boundaries for defining where tasks fall or how tools might be applied. At the highest levels, you have the "infrastructure", "code" and "visibility" lanes. CI/CD is primarily in the code lane, with some bits getting into infrastructure and visibility. The topic of CI/CD breaks down into lanes of "Source Code Management", "Build/Package/Deployment Automation" and "Test Automation".
Most organizations focus their DevOps journey on CI/CD because it has the largest perceived return on investment and is the one most obviously related to the goal of "get good code out faster". By and large, they are right, but they ignore the other lanes at their peril. Some organizations pour hundreds of thousands of dollars into implementing CI/CD tools and processes, only to have the whole effort stymied by shortcomings in the infrastructure lane. Perhaps even worse, multi-month deployment and training projects bear no fruit, because no one bothers to make sure the tools actually are getting used. This is where paying attention to your visibility lane comes into play. When doing DevOps, it's important to measure and report on as many metrics as you can that are relevant to your goals. Process and tool adoption metrics are critical to include.
CI/CD Put Simply
CI/CD aims to reduce the time in between when a code change is made and when it is deployed and in use. The Holy Grail that many on the path of CI/CD are pursuing is to reduce the time from commit to production down to minutes, without the need for human intervention along the way. To do this, many types of automation are employed to test, build, package and deploy code changes. To really get there though, your application architecture has to be amenable to this potential rate of change.
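One way to make "time from commit to production" concrete is to measure it. The following is a minimal sketch, assuming you can export a commit timestamp and a deployment timestamp for each change; the function name and the sample data are hypothetical and not taken from any particular tool.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes):
    """Return the median commit-to-deploy time as a timedelta.

    `changes` is a list of (committed_at, deployed_at) pairs.
    Changes that have not been deployed yet (deployed_at is None)
    are ignored. Returns None if nothing has been deployed.
    """
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
        if deployed is not None
    ]
    if not seconds:
        return None
    return timedelta(seconds=median(seconds))


# Hypothetical sample data: three deployed changes and one still waiting.
SAMPLE_CHANGES = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 12)),
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 45)),
    (datetime(2024, 1, 8, 11, 0), datetime(2024, 1, 8, 11, 30)),
    (datetime(2024, 1, 8, 12, 0), None),
]

if __name__ == "__main__":
    print(median_lead_time(SAMPLE_CHANGES))
```

Tracking a number like this over time shows whether pipeline work is actually shortening the path to production.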
Microservices and serverless architectures are two design patterns that can handle it well, but if your application is a single monolithic service, odds are you won't get there without either changing that design first or having remarkably mature test automation. There will be times though, even in the most mature organizations, when you actually may not want to deploy a change right when it's made. For this reason, some people like to differentiate between "delivery" and "deployment", calling the process "CI/CD/CD".
Focus on What Matters Most
When adopting DevOps practices, the tools are the easy part. That isn't to say that selecting and implementing them is objectively "easy", only that it's a lot easier than the accompanying task of making sure that an organization's culture and processes are supportive of DevOps practices. When selecting tools, it's easy to get wrapped around the axle worrying about doing things "right". You'll hear claims like "You're not doing it right if you don't have unit tests!" or "You're not doing it right if you don't have your infrastructure defined in code!"
Don't be overly worried about "right" until your organization has a fairly mature DevOps culture in place. Focus on the tools and practices that will give you the shortest time to value and provide the most quality of life improvement for your developers and ops people. Writing and maintaining unit tests takes a ton of effort, and the value it provides often lies in the far future. If you have only a few servers deployed behind a simple load balancer and that's not likely to change soon, automating your infrastructure may not pay. Go for the quickest wins possible in the beginning. Nothing encourages support like success. Just make sure you do plan to come back to fill in those gaps. They become more important as your organization matures. Don't let perfect get in the way of better.
The biggest payoff is usually found in automating build and deployment, so that's the best place for most folks to start. Those are tasks that need to be done over and over as you iterate through the development process, often many times a day. The sooner the pain of these tasks can be reduced as near as possible to zero, the happier everyone will be. This is the core of a CI/CD pipeline.
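Stripped to its essentials, that core is a list of stages run in order, stopping at the first failure so a broken build never reaches deployment. The sketch below illustrates the idea only; the stage names and commands are hypothetical placeholders, and a real tool such as Jenkins or GitLab CI adds triggers, logs, agents and much more.

```python
import subprocess
import sys


def run_pipeline(stages):
    """Run (name, command) stages in order and stop at the first failure.

    Returns a list of (name, passed) tuples for the stages that ran.
    """
    results = []
    for name, command in stages:
        completed = subprocess.run(command)
        passed = completed.returncode == 0
        results.append((name, passed))
        if not passed:
            break  # later stages (such as deploy) are skipped
    return results


# Hypothetical stages. Each command is a stand-in for a real build,
# package or deploy script; the "package" stage fails on purpose.
SAMPLE_STAGES = [
    ("build", [sys.executable, "-c", "print('building')"]),
    ("package", [sys.executable, "-c", "raise SystemExit(1)"]),
    ("deploy", [sys.executable, "-c", "print('deploying')"]),
]

if __name__ == "__main__":
    for name, passed in run_pipeline(SAMPLE_STAGES):
        print(name, "ok" if passed else "FAILED")
```

Because the sample "package" stage exits with a non-zero status, the "deploy" stage never runs.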
Source Code Management
Many Source Code Management (SCM) systems exist. Mercurial, Microsoft Team Foundation Server and Perforce all come to mind. However, Git has become the de facto standard SCM, and GitHub is the dominant management layer people use on top of it. GitHub is not FOSS though, so let's turn to its worthy competitor GitLab CE, also known as GitLab Core.
The rate at which GitLab has matured and added features is staggering. GitLab is licensed under an Open Core model, which means many of those features exist only in its commercial offering, which is a shame, but an understandable one. The FOSS offering is still robust enough to be quite compelling though. It approaches feature parity with GitHub as a Git management tool, and it even surpasses it by offering a suite of additional DevOps-enabling features, such as CI/CD orchestration, Slack-like messaging, artifact repositories, tight Kubernetes integration and even a Function as a Service (FaaS) or "serverless" offering. But for SCM, it offers everything you need to perform the core code development management tasks of branching, reviewing, approving and merging code changes and much more. A full feature comparison matrix of the various editions of self-hosted and GitLab-hosted products is available in the GitLab Features Matrix (see Resources).
Other FOSS options exist, but GitLab is probably the place to start since it is mature and fully featured. One that I'm aware of that is also quite nice is Gitea, a very lightweight Git hosting service written in Go with a nice management interface. It's likely most useful if GitLab's admittedly large system requirements are too much for your use case.
Build/Package/Deployment Automation
This is where the rubber really meets the road, and where people working on CI/CD tasks likely will spend the bulk of their energy. The most well known tool in this space also happens to be FOSS, and that is Jenkins. Thanks in large part to its vast library of plugins, Jenkins can be much more than a CI/CD tool. It really is a Swiss Army knife of automation orchestration. The extensibility and flexibility of Jenkins can't be overstated. It's so flexible in fact, that CloudBees, a company that is a significant contributor to the FOSS project, uses it as the foundation for its primary commercial offerings. These offerings address some of the shortcomings of Jenkins FOSS, making it more appealing for very large, enterprise-class deployments.
Recently, complaints have started surfacing about Jenkins being "not modern" and "too hard to manage", especially when compared to very focused SaaS offerings like CircleCI or Shippable. Those arguments have some merit. High availability (HA) isn't easily possible without moving to CloudBees, large Jenkins deployments can become unwieldy, the UI is dated, and its old-school Java roots do show from time to time. However, much of that can be ameliorated by running the Blue Ocean interface and running smaller, team-focused deployments in containers. Moving to competing SaaS tools also would lose a lot of the power that Jenkins brings to the table as a general automation orchestration tool, which is a role those options don't fill as well.
GitLab appears in this lane too. GitLab first introduced CI features in 2015, and they have matured rapidly since then. It has become a well regarded tool in this space and is a particularly easy choice if you've already deployed GitLab for source code management, as the CI tool is included.
There are several other notables in this space, each with their own particular take on CI/CD and a different set of strengths and weaknesses. One that is particularly interesting is Drone, because it aims to be "container native". It defines pipelines using YAML very similar to Docker Compose, which should make it accessible to anyone comfortable using Docker for local development. Like Gitea above, it is written in Go and has a very light footprint, and so it would be an appropriate choice for resource-constrained environments.
Test Automation
Test automation is a cornerstone of a true CI/CD pipeline; however, it's a very complicated topic. The tools vary by the language in which the application is written, the nature of the application itself, and even the composition of the team or teams writing the software. Of all the problems in the CI/CD space, this is the most challenging one. Not only is it a challenge to decide what to test, it's difficult to determine how best to test it. There are unit tests, integration tests, functional tests, system tests, validation tests, regression tests, black box testing, white box testing, static code analysis, dynamic code analysis and even open-source license, compliance and security analysis. The list could continue, opening this space up to become another seemingly endless array of choices. In the end though, it is best to stay focused on answering the question, "Is this code ready to be used?" and then come up with your own organization's definition of "ready". That will help you make decisions about what kinds of testing you should be doing now, later and perhaps not at all. As you journey down the road of test automation, your definition of "ready" likely will become more and more strict, and you'll iteratively bring on additional tools to meet the evolving criteria. The most common classes of automated testing (unit testing, system testing and functional testing) are all great places to start, and each has lots of good FOSS options available.
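An organization's definition of "ready" can be written down as an explicit check whose thresholds tighten over time. The sketch below is one hypothetical way to do that; the fields, including the use of code coverage as a criterion, and the threshold values are illustrative assumptions rather than recommendations.

```python
from dataclasses import dataclass


@dataclass
class BuildSummary:
    """Hypothetical summary of one pipeline run."""
    tests_run: int
    tests_failed: int
    coverage_percent: float


def readiness_problems(summary, max_failed=0, min_coverage=0.0):
    """Return the reasons a build is not ready; an empty list means ready."""
    problems = []
    if summary.tests_run == 0:
        problems.append("no tests were run")
    if summary.tests_failed > max_failed:
        problems.append(
            f"{summary.tests_failed} failed tests (limit {max_failed})"
        )
    if summary.coverage_percent < min_coverage:
        problems.append(
            f"coverage {summary.coverage_percent:.1f}% "
            f"is below {min_coverage:.1f}%"
        )
    return problems


if __name__ == "__main__":
    build = BuildSummary(tests_run=120, tests_failed=0, coverage_percent=62.5)
    print(readiness_problems(build))                     # early, lenient definition
    print(readiness_problems(build, min_coverage=80.0))  # later, stricter definition
```

The same build passes the lenient definition and fails the stricter one, which mirrors how the criteria are expected to evolve.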
Hundreds of different unit test frameworks are available, with at least a handful for nearly every language that has seen any amount of real world use. There is an incredible list of these frameworks available at Wikipedia. Start your search for a tool there, or search for "unit testing for $my_language" in your search engine of choice, and choose one that seems to be actively used and developed, and one that can be made to integrate with the other tools you intend to use. Many of them are "xUnit" style, which is a very common model for unit testing. If you choose one of that type, it's more likely that your developers will be comfortable writing tests for it, and there's a good chance it will create a results report in a JUnit-compatible XML file. JUnit XML reports are the lingua franca of the unit test world, and having reports in that format makes it far more likely that whatever tool you want to record your results in will be able to parse the report.
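To show why the JUnit-style XML format is so easy for other tools to consume, here is a minimal sketch that summarizes a report using only the Python standard library. The sample report is hand-written for illustration; real reports vary somewhat between frameworks (some wrap everything in a testsuites element, for example), so treat the element and attribute names as the commonly seen subset rather than a formal specification.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the commonly used JUnit-style layout.
SAMPLE_REPORT = """\
<testsuite name="example" tests="3" failures="1" errors="0" skipped="1">
  <testcase classname="example.MathTest" name="test_add" time="0.001"/>
  <testcase classname="example.MathTest" name="test_divide" time="0.002">
    <failure message="expected 2 but got 3">details</failure>
  </testcase>
  <testcase classname="example.MathTest" name="test_slow" time="0.000">
    <skipped/>
  </testcase>
</testsuite>
"""


def summarize_junit(xml_text):
    """Count test outcomes by walking every <testcase> element."""
    root = ET.fromstring(xml_text)
    summary = {"total": 0, "passed": 0, "failed": 0, "errors": 0, "skipped": 0}
    # iter() finds test cases whether the root is <testsuite> or <testsuites>.
    for case in root.iter("testcase"):
        summary["total"] += 1
        if case.find("failure") is not None:
            summary["failed"] += 1
        elif case.find("error") is not None:
            summary["errors"] += 1
        elif case.find("skipped") is not None:
            summary["skipped"] += 1
        else:
            summary["passed"] += 1
    return summary


if __name__ == "__main__":
    print(summarize_junit(SAMPLE_REPORT))
```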
System testing isn't quite so tightly defined. Here again is a wide-open problem space with a multitude of possible solutions that will be heavily influenced by your particular situation. My preferred starting approach is fairly repeatable and broadly applicable. Deploy a disposable instance of your application (usually in containers) and run a load test with a tool like Gatling or Postman to exercise the core functionality of your app quickly. If those tests fail, there's a good chance you have a system issue. Postman itself isn't a FOSS tool, but it is free for most uses, and the Postman folks release a lot of supporting tools as FOSS and generally seem to be good FOSS community members.
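The shape of that approach can be sketched in a few lines: request a handful of core URLs from the disposable instance and report any that do not answer with HTTP 200. This is a simplified stand-in for what Gatling or Postman would do, not a load test, and the base URL and paths shown are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, DNS failure, timeout and so on


def smoke_test(base_url, paths, fetch=fetch_status):
    """Return a list of (path, status) for every path that did not return 200."""
    failures = []
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status != 200:
            failures.append((path, status))
    return failures


if __name__ == "__main__":
    # Hypothetical disposable instance and endpoints.
    problems = smoke_test("http://localhost:8080", ["/health", "/login"])
    print("system looks healthy" if not problems else problems)
```

The fetch parameter exists so the check can be exercised without a running server, for example by passing in a function that returns canned status codes.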
It's also worth noting that system testing is often mistakenly called "integration testing". Integration testing is a step that would traditionally come between unit testing and system testing, but one should consider skipping it early in the adoption of automated testing practices, as it usually provides tangible value only in very complicated software written by very large teams or composed of the work from several separate teams.
For functional testing, the standout FOSS tool is Selenium, which forms the core of many other testing automation tools, both FOSS and commercial. If your application exposes anything through a webpage, Selenium or something like it should be in your toolkit.
And finally, all the testing in the world doesn't mean much if you can't view the results. Jenkins can display test results itself, but running SonarQube adds a lot of value. Not only does it give you a great view of how your test results have changed over time, it can perform various kinds of static analysis on your code if you are using a supported language. It's another Open-Core-licensed tool, and some very useful features have been moved into the commercial version recently, perhaps most notably the ability to track multiple branches of a single codebase easily.
Conclusion
A selection of tools from each of the lanes described above can provide the framework for an effective CI/CD pipeline. The options here are only a few of the possible choices; however, they are ones with a proven record of delivering value. And ultimately, that's the point: getting to value. That's what a CI/CD pipeline is for: delivering value to your users as quickly and smoothly as possible by reducing friction within your development and deployment process. And that is an effective early step in embracing DevOps.
Resources
- Open Source Business Models Considered Harmful
- DevOps Bookmarks
- GitLab Core
- GitLab Features Matrix
- Gitea
- Jenkins
- CloudBees
- Jenkins Blue Ocean
- Drone
- List of Unit Testing Frameworks (Wikipedia)
- Gatling
- Postman
- Overview of Integration Testing (Wikipedia)
- Selenium
- SonarQube
Image credit: Brian Ho on Unsplash.