Tutorial: How to improve development speed by running GitHub workflows on your local machine

Orfeas Kypris
5 min read · Mar 10, 2022


Do you rely on test workflows to uphold a high level of code quality? Have you ever been frustrated that every time you want to change your GitHub workflow .yml you have to commit the changes, which may lead to failed builds, again and again? Have you ever been frustrated that this leads to "commit pollution"? I have. And this is why I decided to use act, a superb tool that spawns a Docker container to run the workflow locally. It is user-friendly, but it still requires some configuration to get up and running, especially if you want fancy things like authenticating and connecting to external services.

The objective of this short tutorial is to run a GitHub test workflow on our local machine to speed up development iterations. This is of particular importance when you don’t want to pollute the Git history by committing a lot of code whose only purpose is to get the test workflow to pass. In our case, this is especially important, as our tests are data-intensive and incur a large bandwidth overhead on our Azure instance, which translates to higher cost.

We are going to use act, which spawns a Docker container to run the workflow locally.

NOTE: If you find this tool useful, please consider supporting the developer here.

For this to work we will need to do the following:

  1. Install Docker Engine. Please make sure that the Docker daemon is running and functional before proceeding with this tutorial.
  2. Create an .actrc config file, which tells act which base image to use. We can use a pre-built base image, or we can build our own.
  3. Create a Dockerfile and build it.
  4. Run the Docker image, passing in the environment variables needed for the workflow to execute.
  5. (Optional) Create an act.vault file, which stores the authentication credentials for connecting to our external service.

At this point your local directory structure should contain nothing but your repository directory, my-repo:

Directory structure

Let’s start! 🚀

Instantiating the .actrc config file

The .actrc config file should live in the top-level directory of your repository.

Contents of .actrc:
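(The original gist is not shown here; assuming the Ubuntu 20.04 runner image by @catthehacker that is referenced later in this post, a typical one-line .actrc would be:)

```
-P ubuntu-latest=catthehacker/ubuntu:act-20.04
```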

This tells act which Ubuntu runner image to use. For more info on the Docker images available for act, have a look here.

A typical workflow .yml file (i.e. tests.yml) may look like this:
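(The original gist is not reproduced here; the sketch below is an illustrative minimal example, assuming a Python project that logs into Azure with the azure/login action, pulls data with DVC and then runs pytest. Step names and file names such as requirements.txt are placeholders, not the author's exact workflow.)

```yaml
name: tests

on: [push, pull_request]

jobs:
  run-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      # authenticate against Azure using the secret configured in the repository settings
      - name: Log in to Azure
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      # install the project and its test dependencies (placeholder)
      - name: Install dependencies
        run: pip install -r requirements.txt

      # fetch the versioned test data from Azure Blob Storage
      - name: Pull data with DVC
        run: dvc pull

      # run the test suite
      - name: Run tests
        run: pytest
```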

NOTE: In our case we are using Azure Blob Storage to store the data, DVC to version it, and pytest to run the tests.

The line below specifies upon which event (push, pull_request) the workflow should run. The event name is what we will pass to the container at runtime via the $ACTION variable.
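(The original snippet is not reproduced; for a workflow triggered by both events it would typically be:)

```yaml
on: [push, pull_request]
```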

At this point your local directory structure should look something like this:

Prerequisites
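(The screenshot is not reproduced here; the layout at this point is likely something along these lines, with the workflow file living under .github/workflows/:)

```
my-repo/
├── .actrc
└── .github/
    └── workflows/
        └── tests.yml
```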

Let us create a Docker-in-Docker Dockerfile:

Contents of Dockerfile
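(The original gist is not reproduced here. The sketch below is a plausible reconstruction, assuming a docker:dind base image, the official act install script, and an ACTION environment variable selecting the event to run; the /my-repo path, the 10-second wait and the /ci-logs log location are assumptions, not the author's exact file.)

```dockerfile
# Docker-in-Docker image that runs act against the repository copied into it
FROM docker:dind

# act needs git; the install script needs bash and curl
RUN apk add --no-cache bash curl git

# install act (official install script from nektos/act) into /usr/local/bin
RUN curl -sSL https://raw.githubusercontent.com/nektos/act/master/install.sh \
    | bash -s -- -b /usr/local/bin

# copy the repository (including .actrc and .github/workflows/) into the image
COPY . /my-repo
WORKDIR /my-repo

# start the inner Docker daemon, give it a moment to come up, then run act
# for the event passed in via $ACTION, teeing the output to a log file that
# can be volume-mounted back to the host
CMD dockerd-entrypoint.sh & \
    sleep 10 && \
    mkdir -p /ci-logs && \
    act "$ACTION" 2>&1 | tee /ci-logs/run.log
```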

At this point your local directory structure should look something like this:

Build time
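(Again, the screenshot is not shown; the tree presumably now also contains the Dockerfile:)

```
my-repo/
├── .actrc
├── Dockerfile
└── .github/
    └── workflows/
        └── tests.yml
```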

Let’s build the Dockerfile via
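(A sketch of the build command; my-repo-dind is a hypothetical tag name:)

```bash
docker build -t my-repo-dind .
```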

NOTE: If you haven’t enabled rootless mode, you may have to use sudo.

Now you can run docker images (or sudo docker images) and see the newly built image.

Now we can run our image and look at the logs.
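(A sketch of the run command, assuming the hypothetical my-repo-dind tag from above; --privileged is required for Docker-in-Docker, ACTION selects the workflow event, and ci-logs/ is mounted so the log written inside the container ends up on the host:)

```bash
docker run --privileged -d \
    -e ACTION=push \
    -v "$(pwd)/ci-logs":/ci-logs \
    my-repo-dind
```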

Hopefully, this should now run your workflow, if you don’t require any kind of authentication to access your external service.

At this point your local directory structure should look something like this:
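(Screenshot not reproduced; the mounted ci-logs/ directory should now have appeared:)

```
my-repo/
├── .actrc
├── Dockerfile
├── ci-logs/
│   └── run.log
└── .github/
    └── workflows/
        └── tests.yml
```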

To observe the logs, run
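(presumably something along the lines of)

```bash
tail -f ci-logs/run.log
```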

Next step: passing auth credentials for connecting to external services

Sometimes we connect to external services (e.g. Azure Blob Storage) in order to fetch data or do other things. To understand how to set up an Azure AD application and service principal, have a look at this tutorial. In our case, we have registered our GitHub workflow as an app on Azure and obtained an Azure secret credential, which is passed to the workflow as a GitHub secret. It happens to be called secrets.AZURE_CREDENTIALS. On GitHub, this can be set via the repository settings menu, available to the administrator.

Creating the secret file

Once you have set up your app on Azure and obtained your secret key, you can also use this key locally. We can use the --secret-file $PATH_TO_SECRET flag to tell act to look inside a file where we have stored our secret credential, i.e. act.vault. We have to be careful about how we store the secret key inside this file, especially if it is JSON (check out this for more details).

Contents of act.vault, which in this case is formatted in yaml:

act.vault
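(The original gist is not reproduced here. The sketch below uses the KEY=value style that act's --secret-file flag accepts, with the secret name matching secrets.AZURE_CREDENTIALS from the workflow; the JSON fields are placeholders in the shape produced by az ad sp create-for-rbac.)

```
AZURE_CREDENTIALS={"clientId": "<client-id>", "clientSecret": "<client-secret>", "subscriptionId": "<subscription-id>", "tenantId": "<tenant-id>"}
```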

(…make sure there are no newlines in your JSON!)

We have to put our secret file act.vault inside a directory secret/. Our directory structure should now look something like this:
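(Screenshot not shown; roughly:)

```
my-repo/
├── .actrc
├── Dockerfile
├── ci-logs/
│   └── run.log
├── secret/
│   └── act.vault
└── .github/
    └── workflows/
        └── tests.yml
```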

Update the base image

At the time of writing, the Ubuntu 20.04 image kindly provided by @catthehacker does not come with the Azure CLI preinstalled, so we will have to use it as a base and install az on top of it. The Dockerfile for our new base image will look like this:

Dockerfile for new base image
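(A plausible reconstruction, assuming Microsoft's documented one-line installer for the Azure CLI on Ubuntu; the file name used for it, e.g. Dockerfile.act, is an assumption:)

```dockerfile
# start from the @catthehacker runner image and add the Azure CLI on top
FROM catthehacker/ubuntu:act-20.04

# Microsoft's documented install script for az on Debian/Ubuntu
RUN curl -sL https://aka.ms/InstallAzureCLIDeb | bash
```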

We have to first build the act image:
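(A sketch, with hypothetical file and tag names; tag it with your own Docker Hub user so you can push it later if you wish:)

```bash
docker build -t my-dockerhub-user/my-act-image:act-ubuntu-20.04 -f Dockerfile.act .
```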

Update .actrc

Once we decide which act image to use (pre-built or our own), we also have to change the contents of our .actrc to use the new act image in our dind container:

Contents of .actrc:
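(Again a sketch, pointing ubuntu-latest at whatever you tagged the new image as above:)

```
-P ubuntu-latest=my-dockerhub-user/my-act-image:act-ubuntu-20.04
```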

I have provided a prebuilt image in my Docker Hub repo. If you want to use that instead, you can replace the above with -P ubuntu-latest=orphefs/orphefs:act-ubuntu-20.04 in your .actrc.

Update Docker-in-Docker build

Now, let’s include the new argument inside the Dockerfile:

Contents of Dockerfile:
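(The original gist is not reproduced here. A plausible reconstruction, assuming the "new argument" is the --secret-file flag passed to act so the workflow can authenticate against Azure; everything else is the same sketch as before:)

```dockerfile
FROM docker:dind

RUN apk add --no-cache bash curl git
RUN curl -sSL https://raw.githubusercontent.com/nektos/act/master/install.sh \
    | bash -s -- -b /usr/local/bin

COPY . /my-repo
WORKDIR /my-repo

# act now also receives the secret file copied in under secret/
CMD dockerd-entrypoint.sh & \
    sleep 10 && \
    mkdir -p /ci-logs && \
    act "$ACTION" --secret-file secret/act.vault 2>&1 | tee /ci-logs/run.log
```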

Now, let’s build the dind image again:
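(same hypothetical tag as before:)

```bash
docker build -t my-repo-dind .
```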

Now we can run the dind container using
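(same sketch as before:)

```bash
docker run --privileged -d \
    -e ACTION=push \
    -v "$(pwd)/ci-logs":/ci-logs \
    my-repo-dind
```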

Hopefully the above runs smoothly and updates the ci-logs/run.log file, so we can view the output on stdout via
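(again, presumably)

```bash
tail -f ci-logs/run.log
```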

which should look something like this:

Happy workflowing 👍

Final Notes

You can find the template files used for this tutorial in this repo.

Acknowledgements

This tutorial was inspired by:

