How I Tuned My CI/CD Pipeline To Be Done in 60 Seconds
By MzFit
- 11 minutes read - 2172 words

A CI/CD pipeline is one of the key tools software engineers have for producing high-quality software. CI/CD stands for Continuous Integration (CI) and Continuous Delivery (CD): instead of making a bunch of changes to your software and pulling them all together to test at the end, you continuously integrate (test) and deliver (deploy) your software to find bugs faster.
Like many people, I store my software source code on GitHub. A few years ago I set up a simple CI/CD pipeline in GitHub to build, analyze, and test my web app/web services. It worked fine and since it was the first time I had set up a CI/CD pipeline in GitHub I kept it simple with essentially only one step.
- build (and deploy)
Over time, though, I found myself shying away from making changes to my software. As a developer with ADHD, I sometimes have trouble getting things done when there are multiple hurdles involved, and I realized that one of the things causing me problems was that my CI/CD pipeline took 5 minutes to run. Every time I wanted to make a change, I would code it up and then go make a cup of coffee while I waited for the pipeline to test and deploy the code. I wouldn't always make it back; often I would get distracted.
For reference, when I started I was doing these things in 5.5 minutes:
- building
- purgecss
- stylelint (css)
- html-validate
- yamllint
- SCA vulnerability scanning (go vuln)
- 2 go linters (staticcheck and golangci-lint)
- packaging the app, including nginx configs, into a deployable zip
- running nearly 200 unit and integration tests
I decided that 1 minute was the maximum I was willing to wait for my code to test and deploy.
Here’s what I did to optimize my CI/CD pipeline:
- split the action into multiple parallel jobs
- use GitHub caching
- optimize my linting
- tweak the jobs to fit together
Even though my app is a golang app, I feel these techniques should work for any programming language.
Parallel Jobs
My first attempt at parallel jobs went a little overboard. I decided to take every step in my Makefile and separate it out into its own job. Linting the GitHub YAML? Let’s put it in its own job. Linting my CSS for the web site? Yep, let’s put that in its own job. Etc., etc.
This worked fine, but I ended up blowing through my GitHub billable minutes. Some of the jobs ran in as little as 9s, but GitHub still bills each job rounded up to the nearest whole minute. I wanted to do something a little more sane, so I combined many of the short targets into fewer jobs. Since GitHub provisions dual-core VMs, my first approach was to combine items and run them in parallel using make -j2.
This worked OK, but it made failures a bit difficult to debug, since the log messages from the parallel targets would be interspersed. It was also hard to tell how long each subcomponent ran.
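As a sketch of that approach (the target names here are hypothetical, not from my actual Makefile), an umbrella target whose prerequisites are independent lets make -j2 run them concurrently:

```
# Umbrella target: "make -j2 quick-checks" runs the prerequisites in parallel,
# but their log output gets interleaved.
quick-checks: yaml-lint css-lint html-lint

yaml-lint:
	yamllint .

css-lint:
	stylelint "**/*.css"

html-lint:
	html-validate "site/**/*.html"
```

The downside is exactly what I hit: when one prerequisite fails, its error messages are mixed in with the others' output.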
Where I eventually landed was five jobs; if they all succeed, they kick off the CD pipeline to deploy to the development server.
I felt like this was a pretty good tradeoff of cost and performance. But to get to the 5 jobs I had to do some other things first.
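Roughly, the workflow skeleton looked like this (job bodies elided; the names match the jobs described in the sections below, with deploy gating on all five via needs):

```yaml
jobs:
  build:        # compile and package the app
    # ...
  lint:         # golangci-lint
    # ...
  markup-lint:  # yamllint, html-validate, purgecss, stylelint
    # ...
  scan:         # go vulnerability scanning
    # ...
  test:         # unit + integration tests
    # ...
  deploy:       # CD: runs only if all five succeed
    needs: [build, lint, markup-lint, scan, test]
    # ...
```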
GitHub Caching
The biggest thing I did was enable caching. Every time I ran my build, GitHub would launch it inside a Docker container and have to download all the Go packages. Since this happens in serial fashion, it took over a minute just to download the dependencies for my project.
Dependency Caching
Fortunately, GitHub has a nice, well-documented caching feature.
build:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/cache@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-go-
    - name: Run make build
      run: |
        make build
By adding the actions/cache@v4 step before my build, GitHub automatically caches the dependencies keyed on my go.sum file. As long as I don’t change the dependencies, they are restored from cache; if I do change them, the first run is just slower while it rebuilds the cache.
And with caching it’s super fast: it restores 419 MB in 6 seconds.
One of the tricks, though, is that I also use Go tools for linting and vulnerability scanning, and these have different dependencies. So I ended up tweaking the cache keys to be slightly different when I run golangci-lint (for example):
lint:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/cache@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: ${{ runner.os }}-golint-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-golint-
    - name: Run make ci-lint
      run: |
        make ci-lint
My Makefile looks like this; each target both installs its prerequisites and runs the task. I use the Makefile extensively since it’s portable: it lets me bootstrap a machine for local development and then run the same commands in the CI pipeline.
(I develop on both a mac and within WSL on Windows.)
install-golang-ci: install-golang
	go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest

ci-lint: install-golang-ci
	golangci-lint run
A side benefit is that I can do most of my CI development locally, which is faster. Then I just add a well-tested one-line command to the GitHub YAML file, so I don’t spend time debugging YAML.
(I’ll repeat this later, but in my experience, developing locally is 4x faster than testing out changes in GitHub runners.)
Data Caching
Another slowdown was that my unit and integration tests bootstrap themselves by fetching many GBs of data from YouTube. I decided to tar up the database and cache, and to run a job every night at midnight that builds the next day’s cache.
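A nightly job like this can be driven by GitHub's schedule trigger; here is a sketch (the workflow and job names are my assumptions, with the Makefile target from the steps below):

```yaml
on:
  schedule:
    # run at 00:00 UTC every night
    - cron: '0 0 * * *'

jobs:
  nightly-cache:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the next day's data cache
        run: |
          make cache-archive
      - uses: actions/cache/save@v4
        with:
          path: |
            cache.tar.gz
          key: ${{ runner.os }}-cache-${{ hashFiles('cache.tar.gz') }}
```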
I shaved several minutes off the test time by loading those files into the GitHub cache using the actions/cache/save@v4 action:
- name: Run make test
  run: |
    make test
- name: Run make cache-archive
  run: |
    make cache-archive
- uses: actions/cache/save@v4
  with:
    path: |
      cache.tar.gz
    key: ${{ runner.os }}-cache-${{ hashFiles('cache.tar.gz') }}
- uses: actions/cache/save@v4
  with:
    path: |
      bleve.tar.gz
    key: ${{ runner.os }}-bleve-${{ hashFiles('bleve.tar.gz') }}
And then restore them using actions/cache/restore@v4:
test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/cache/restore@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-go-
    - uses: actions/cache/restore@v4
      with:
        path: |
          cache.tar.gz
        key: ${{ runner.os }}-cache-${{ hashFiles('cache.tar.gz') }}
        restore-keys: |
          ${{ runner.os }}-cache-
    - uses: actions/cache/restore@v4
      with:
        path: |
          bleve.tar.gz
        key: ${{ runner.os }}-bleve-${{ hashFiles('bleve.tar.gz') }}
        restore-keys: |
          ${{ runner.os }}-bleve-
    - name: Run make cache-install
      run: |
        make cache-install
    - name: Run make test
      run: |
        make test
Now my builds and unit tests run in less than a minute.
Linting
I also spent some time performance-tuning my linting. Again, because my Makefile lets me run everything locally, I could rapidly tweak the jobs for performance on my laptop.
Markup Linting
I ended up with a markup-lint job where I bundled together similar checks:
- yamllint
- html-validate
- purgecss
- stylelint
Technically I could have used GitHub caching for the NPM packages behind the HTML and CSS tools. But I didn’t want to go down the rabbit hole of trying to understand NPM caching. No doubt someone reading this will say “but it’s easy,” and yes, it probably is. But I got the 4 checks above running in about 20s on GitHub and that was good enough for me.
Not everything requires eking out the last bit of performance. Sometimes good enough is good enough. I did make one change to speed up the NPM install: adding a few performance flags and installing all the packages at once, which seemed to shave 20s off versus installing each package in turn:
install-weblint: install-npm
	npm --no-audit --progress=false i -g html-validate@latest purgecss@latest stylelint@latest stylelint-config-standard@latest
Golang Linting
On the golang side, I had been using golangci-lint for a while but had never spent much time thinking about what it did, so I had also been running staticcheck separately: I had noticed that the rules in standalone staticcheck were more stringent than those in the version bundled with golangci-lint. So my initial impression of golangci-lint was not terribly favorable.
However golangci-lint has one big advantage: it can be made to bundle virtually everything together for speed.
So I ended up adding a .golangci.yml file as follows, and removing any separate linters I had been using:
---
linters:
  enable:
    - errcheck
    - gosimple
    - govet
    - ineffassign
    - staticcheck
    - unused
    - bodyclose
    - exhaustive
    - gocheckcompilerdirectives
    - godox
    - gofmt
    - goimports
    - gosec
    - whitespace
    - usestdlibvars
linters-settings:
  staticcheck:
    checks: ["all"]
I’m pretty happy with this. It runs all the linters that seem sensible to me to run (a much bigger list than golangci-lint’s defaults), and the whole list is cacheable and runs in less than a minute after caching.
Side rant: it makes no sense to me that golangci-lint does not enable gosec by default.
markup-lint:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Run make markup-lint
      run: |
        make -j2 markup-lint

lint:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/cache@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: ${{ runner.os }}-golint-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-golint-
    - name: Run make ci-lint
      run: |
        make ci-lint
Reminder: one of the reasons I really like using Makefiles is that my MacBook M2 is much faster than GitHub runners.
Running golangci-lint on my laptop takes just shy of 13 seconds.
time make ci-lint
scripts/checkGoVersion.sh
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
golangci-lint run
make ci-lint 2.11s user 4.72s system 53% cpu 12.738 total
Running it on GitHub is about 4x slower (51 seconds).
If you’re reading this and putting your logic inside your GitHub YAML file, you’re committing to spending something like 4x longer waiting on every single change you test.
I highly recommend that each job in your GitHub YAML be a single command you tested locally first.
Tweak The Jobs
However, looking at the run times of the markup-lint and scan jobs, I realized I could actually combine them back into the build job.
If I make a custom Makefile target that packages up my build and archives it, call it “package”, add markup-lint and vuln as prerequisites, and run it with make -j2 so it takes advantage of both CPU cores in the GitHub runner…
package: install-golang arm64 package-nginx vuln tidy markup-lint
	scripts/build_and_package.sh
I get a simple GitHub action consisting of build/lint/test jobs, none of which goes over a minute (at the slight cognitive cost of having some linting run in the build job).
NOTE: I decided to call the GitHub job “build” but my Makefile target “package.” This is for purely aesthetic reasons of my own. Feel free to make your own aesthetic decisions on your pipeline.
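Sketching the resulting build job (it mirrors the earlier cache setup; the only real change is the make target and the -j2 flag):

```yaml
build:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/cache@v4
      with:
        path: |
          ~/.cache/go-build
          ~/go/pkg/mod
        key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-go-
    - name: Run make package
      run: |
        make -j2 package
```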
Conclusion
So now I have a CI pipeline with build/test/lint jobs.
Each of the stages costs a billable minute. The deploy stage also costs a minute even though it runs for 2 seconds; behind the scenes, in another repo, I have a deploy job that actually does the deploy:
deploy:
  runs-on: ubuntu-latest
  needs: [
    lint,
    test,
    build,
  ]
  if: github.event_name == 'push' && github.ref_name == 'main'
  steps:
    - name: deploy
      run: |
        export GH_TOKEN=${{ secrets.GH_DEPLOY_TOKEN }}
        gh workflow run github-actions-deploy.yml -f \
          env=DEV -f version=${{ github.sha }} \
          --repo myrepo/deploy
So total cost of each build & deploy is 5 billable minutes.
At 2000 billable minutes per month in GitHub’s free tier, this gives me 400 deploys a month, or 13.3 deploys a day.
However, because of my cache job that packages up my database and cache files nightly, it eats up 4 billable minutes a night, or 120 minutes a month.
This makes the math (2000 - 120) / 5 = 376 deploys a month, or about 12 a day.
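As a quick sanity check of that arithmetic, in Go:

```go
package main

import "fmt"

func main() {
	freeMinutes := 2000   // GitHub free tier: billable minutes per month
	nightlyCost := 4 * 30 // nightly cache job: 4 minutes x ~30 nights
	perDeploy := 5        // billable minutes per build & deploy

	deploys := (freeMinutes - nightlyCost) / perDeploy
	fmt.Println(deploys)      // 376 deploys a month
	fmt.Println(deploys / 30) // ~12 a day
}
```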
Plenty for me as a solo developer.
All in all, this has been a fun experience. I feel like, between GitHub caching and optimizing jobs locally first, most CI/CD pipelines can become much faster, at least nearly as fast as local development.
I was able to compress all of the following into less than a minute:
- building
- purgecss
- stylelint (css)
- html-validate
- yamllint
- SCA vulnerability scanning (go vuln)
- SAST vulnerability scanning (gosec) (added)
- 14 other golang linters (7 added)
- packaging the app, including nginx configs, into a deployable zip
- running nearly 200 unit and integration tests
Unfortunately, if you have an app that takes a long time to build and install locally (bad memories of circa-2005 Java apps), you may not be able to improve much on that. I chose golang for its fast compilation time, and I feel that has paid off well for me.
EDIT: I uploaded most of my scripts and config files to GitHub here. Please read the README with the disclaimers. There are many reasons not to use what I uploaded directly. It’s there for reference to help understand this post, not to blindly copy.