Jonathan Windgassen

Python CI Template
A template for a Python project with basic support for GitLab's Continuous Integration (CI/CD) feature. The pipeline gives a quick overview of the tidiness and structural completeness of your own code and of all Merge Requests.

How it works
Whenever you push a change to GitLab or someone creates a Merge Request, a GitLab Runner (here the Shared Runner opensuse-test) runs a variety of checking tools over the Python files in the ./src folder of the repo. Each check, called a job, exits with 0 when the code has no major flaws or mistakes, and then the next job begins. Once all jobs have finished, the pipeline is done and you can see which jobs passed and which did not. The runner uses a Docker image that was uploaded to the Container Registry of this project. This image contains all the packages needed to check the code. When configured correctly, the project page on GitLab also shows badges, which give a quick overview of which tests the code passed and which it failed. The central controller for all of this testing is the .gitlab-ci.yml file in the root directory. Here you can specify what the jobs do (in this case they install the required libraries and then call the .sh scripts in ci/), which runner to use, and other miscellaneous configuration.
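The pass/fail rule described above (exit code 0 means the job passed) can be illustrated with a minimal sketch. The two commands below are hypothetical stand-ins for real tools, and run_job is an illustrative helper, not part of this template:

```python
import subprocess
import sys


def run_job(command):
    """Run one check and report whether it passed (exit code 0)."""
    result = subprocess.run(command)
    return result.returncode == 0


# Hypothetical stand-ins for real tools: one check that succeeds, one that fails.
passing_check = [sys.executable, "-c", "import sys; sys.exit(0)"]
failing_check = [sys.executable, "-c", "import sys; sys.exit(1)"]

print("passed" if run_job(passing_check) else "failed")
print("passed" if run_job(failing_check) else "failed")
```

The runner applies the same idea to every job: any non-zero exit code from the job's script marks the job as failed.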

What is the code inspected with?
Currently, the code is inspected with the following 5 packages:


Prospector - Combines static analysis tools such as Pylint, pep8 and Pyflakes. Checks if your code is structured properly

Bandit - Checks if your code has potential security issues

mypy - Static type checker. Identifies type issues in the code and finds common bugs

pytest - Unit testing

coverage - Uses pytest to identify how much of your code is covered by test routines


Quick Start Guide

Fork or clone the required files into your project. You don't need src/, ci/Docker/ and README.md

Add all the files and folders you want checked to directories.txt

Build a Docker image with the Dockerfile and add any packages here that you might need for unit testing, etc. You can find the instructions under Packages & Registries - Container Registry

Replace gitlab.version.fz-juelich.de:5555/windgassen1/python-ci-template at the top of the .gitlab-ci.yml file with the link to your image (use the copy symbol next to the image to get the URL).

Add badges to your project. The instructions can be found below.


How to use
Initially, the pipeline only tests the Python files in the src/ folder. To insert your existing project or to begin a new one, just move or create all .py files in the src/ folder. All tools the code is tested with get this folder as an argument and will check all Python files in it. If you want to copy the files over to your own repo, you need everything except README.md and src/ for the pipeline to work correctly.
The files that are checked are listed in directories.txt in the root directory. For each additional folder or file you want the pipeline to check, add its path to this file, one line per directory or file. You can also remove the src/ entry if you so desire.
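As an illustration of the one-path-per-line format, here is a minimal sketch of how such a listing could be read. The paths other than src/ are hypothetical, and skipping blank lines is an assumption; the scripts in ci/ may handle the file differently:

```python
def read_targets(text):
    """Return the non-empty lines of a directories.txt-style listing."""
    targets = []
    for line in text.splitlines():
        entry = line.strip()
        if entry:  # assumption: blank lines are ignored
            targets.append(entry)
    return targets


# Hypothetical file content: src/ plus two made-up extra entries.
example = "src/\ntools/helpers.py\n\nscripts/\n"
print(read_targets(example))  # ['src/', 'tools/helpers.py', 'scripts/']
```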
Because the Shared Runner that comes with this server is quite limited, we opted to upload our own Docker image for the runner to use. Note that if you want the pipeline to work in your project, you also need to upload this image to your own repository. You can find a reference to the Dockerfile here. The image is very basic and does not need much customization, but depending on the dependencies of your project you might need to add packages to it. Otherwise the jobs might fail because packages are missing from the image.
You can find the instructions on how to build and upload an image under Packages & Registries - Container Registry. To specify a Dockerfile as the source for the build, add -f ./path/to/Dockerfile to the command. After pushing the image to GitLab, you can specify which image to use in the first row of the .gitlab-ci.yml file by replacing gitlab.version.fz-juelich.de:5555/windgassen1/python-ci-template with the path to your own image (use the copy button next to the image in the Registry).
If you want to use the badges, you also have to configure them under Settings - General in your project. For each job you can get a badge on the front page of your project beneath the title.
Each badge has to be configured as follows:


Name: The name of the Badge. This can theoretically be anything, but for an easy overview it should be what the job does (check the code, tell the coverage, etc.)

Link: This is the link you are forwarded to when clicking on the badge. The default solution is https://gitlab.version.fz-juelich.de/%{project_path}/, but you could also put the URL of the website of the tool the badge represents there, for example.

Badge Image URL: This is the really interesting and crucial part of the badge. The given URL is the path to the .svg image the badge is supposed to use. In this case, the project makes use of the artifacts from the pipeline (more on that below), so the link will only be available after you have run the pipeline for the first time. For a correctly working image you have to use the following link: https://gitlab.version.fz-juelich.de/%{project_path}/-/jobs/artifacts/%{default_branch}/raw/name.svg?job=name where name is the name of the job whose results you want the badge to show. For Prospector you would insert prospector, for Bandit bandit, and so on. The names of the jobs can be found in the .gitlab-ci.yml file or on the pipeline page when you hover over or click the corresponding job. GitLab also shows you a quick preview of the badge, so you can see if you entered everything correctly.


A quick note on the URLs: The %{something} placeholders in the link are variables from GitLab and will be automatically replaced with the correct paths to your code. So %{project_path} would be replaced with windgassen1/python-ci-template here. A more detailed explanation can also be found in Settings - Badges
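To make the placeholder substitution concrete, the following sketch expands the badge image URL by hand. GitLab does this replacement itself; the JOB marker and the branch name master are assumptions made only for this illustration:

```python
# The badge image URL from above, with JOB as a hypothetical marker
# standing in for "name".
TEMPLATE = (
    "https://gitlab.version.fz-juelich.de/%{project_path}"
    "/-/jobs/artifacts/%{default_branch}/raw/JOB.svg?job=JOB"
)


def expand_badge_url(template, job, project_path, default_branch):
    """Fill in the job name and the two GitLab placeholders."""
    return (
        template.replace("JOB", job)
        .replace("%{project_path}", project_path)
        .replace("%{default_branch}", default_branch)
    )


# "master" is an assumed default branch name.
print(expand_badge_url(TEMPLATE, "prospector",
                       "windgassen1/python-ci-template", "master"))
```

The printed URL is what the browser would actually request for the Prospector badge once the placeholders are resolved.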


Help! A job has failed
Whenever a job has failed (and even if it didn't), the pipeline stores the log files that the tool created. To access them, go to CI/CD -> Pipelines. In the rightmost column there should be a download button where you can choose the job whose log you want. You can also click on the job directly and select Download or Browse on the right. Spoiler: you will also find the badge for the job here, but more on that later.

Note that a job always creates a log, even if it didn't fail. That's because a job only fails if the tool reports an error, but not on warnings. You can nonetheless open a log and try to fix some warnings or other problems the pipeline complains about.


But how does it really work?
The central part of the pipeline is the .gitlab-ci.yml file, which acts as a script that tells the runner what to do. At the top, we define the stages in which the jobs are ordered. Normally the pipeline would run all jobs in the same stage in parallel, and only when all of them succeed would it start the jobs in the next stage. In this case, I told the pipeline to continue even if a job fails, because otherwise the jobs after a failed stage would never run and their scripts could not generate the .svg files for the badges. With the before_script tag we can specify which bash commands are executed at the start of each job. Each job gets its own shell, so the before_script is run at the beginning of every job. Here we install pip, because it's not included on the Runner, and the packages we need for all jobs.
There are a lot more things you can specify, and you can find a detailed overview here.
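The stage ordering described above can be modelled with a small simulation. This is only a sketch: the grouping of jobs into stages is hypothetical, and a single allow_failure flag stands in for what GitLab configures per job:

```python
def run_pipeline(stages, results, allow_failure=True):
    """Return the jobs that get executed.

    stages:  list of stages, each a list of job names
    results: dict mapping job name -> True (passed) or False (failed)
    """
    executed = []
    for jobs in stages:
        executed.extend(jobs)  # all jobs of one stage run together
        stage_ok = all(results[job] for job in jobs)
        if not stage_ok and not allow_failure:
            break  # later stages are skipped after a failure
    return executed


# Hypothetical stage layout and outcomes: bandit fails, the rest pass.
stages = [["prospector", "bandit", "mypy"], ["pytest"], ["coverage"]]
results = {"prospector": True, "bandit": False, "mypy": True,
           "pytest": True, "coverage": True}

print(run_pipeline(stages, results, allow_failure=True))
print(run_pipeline(stages, results, allow_failure=False))
```

With failures allowed, all five jobs run and can each produce a badge. Without it, the failed first stage stops the pipeline before pytest and coverage start.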
After the initialization, we can begin to define the jobs we want the pipeline to run. After setting the name for the job each one is configured using the following tags:


stage: Here you define in which stage the job runs. For a better overview, I'd recommend also sorting the jobs in the YAML file in stage order.

tags: Specify the properties of the runner. By adding linux you specify that the runner for this job needs to be a Linux system. If you have a variety of runners available, this lets you specify which properties your runner needs to fulfil.

allow_failure: As mentioned earlier, this tells the pipeline to continue even if this job fails. We need this so that all badges are generated. The downside is, of course, that the pipeline always does a full run, which might consume more resources. In a small example like this it doesn't really matter, but be careful with bigger projects.

script: This is the juicy part of each job. Here you can script the commands that are executed in the bash shell for this job. We could run all of the tests directly here, but this has a few downsides. When calling a .sh script, you have the advantage of being able to run it on your local machine to test it, whereas you can only run the YAML file on a GitLab runner, which is much more tedious. Putting the commands in a .sh script also helps to organize the code and keeps everything nice and tidy.

artifacts: As already mentioned, each job gets its own shell. Additionally, a script can't edit files or directories in the repo. This is nice because it prevents you from accidentally destroying your repository, but it is also annoying when you do want to change files in your repository (generating .svg badges, for example). With the artifacts tag, you can specify files which aren't deleted when the job finishes but are instead uploaded to GitLab. By accessing the artifacts via a URL, we can retrieve the .svg for the badge from the job.
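To show what kind of file a job could hand over as an artifact, here is a minimal sketch that renders a two-part .svg badge and writes it to disk. The layout, colours, width estimate and the example value are all assumptions; the scripts in ci/ may generate their badges in a different way:

```python
import os
import tempfile
from xml.sax.saxutils import escape


def render_badge(label, value, color="#4c1"):
    """Return a minimal two-part SVG badge as a string."""
    label_width = 10 + 7 * len(label)  # rough estimate, not real text metrics
    value_width = 10 + 7 * len(value)
    total = label_width + value_width
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{total}" height="20">'
        f'<rect width="{label_width}" height="20" fill="#555"/>'
        f'<rect x="{label_width}" width="{value_width}" height="20" fill="{color}"/>'
        f'<g fill="#fff" font-family="sans-serif" font-size="11">'
        f'<text x="5" y="14">{escape(label)}</text>'
        f'<text x="{label_width + 5}" y="14">{escape(value)}</text>'
        f'</g></svg>'
    )


# Hypothetical example value; a real job would insert the tool's result.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "coverage.svg")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_badge("coverage", "87%"))
    print(os.path.getsize(path) > 0)
```

In a real job the file would be written into the job's working directory and listed under artifacts, so that GitLab keeps it after the job ends and the badge URL can serve it.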

More configuration options can be found on the GitLab wiki.