PoorMan's CI in Emacs
16 Dec 2022
I have been working on the Deep Learning System course. It is the hardest course I have taken since university. I never thought I would need CI for a personal study project, which shows how complex this course is.
Here is the setup: the goal is to develop a PyTorch-like DL library that supports ndarray ops and autograd, and to implement DL models, LSTM for example, from scratch. That's the exciting math part. The tricky part is that the library has to support both CPU devices with C++11 and GPU devices with CUDA. On the user front, the interface is written in Python. I worked on my M1 laptop most of the time, and switched to my Debian desktop for the CUDA implementation.
It was a fine Saturday afternoon. After a couple of hours of tinkering in a local coffee shop, I made a breakthrough in implementing the gradient of the convolution op in Python. I rushed home and booted up Debian to test the CUDA backend, only to find an "illegal memory access" error!
It took me a few cycles of rolling back to previous changes in git to find where the problems were. It made me think about the need for CI. Ideally, I would have a CI that automatically runs the tests on both CPU and CUDA devices to ensure that a bug fix on the CPU side doesn't introduce new bugs on the CUDA side, and vice versa. But I don't have this setup at home.
Two Components of PoorMan CI
So I implemented what I call PoorMan CI. It is a semi-automated process that gives me some benefits of the full CI. I tried hard to refrain from doing anything fancy because I don't have time. The final homework is due in a few days. The outcome is simple yet powerful.
The PoorMan CI consists of two parts:
- a bunch of bash functions that I can call to run the tests, capture the outputs, save them in a file, and put that file under version control. For example, wrap the snippet below in a single function.
- a log file where I keep track of the code changes, and whether each new change fixes or breaks anything.
In the example below, each change committed to git gets a bullet point with a short summary and a link to the test results. fce5edb and f43d7ab are the git commit hashes.
- fix grid setup, from (M, N) to (P, M)! [[file:test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log]]
- ensure all data/parameters are in the right device. cpu and cuda, all pass! milestone. [[file:test_results/2022_12_11_13_51_22__f43d7ab__fast_and_cuda.log]]
As you can see, it is very simple!
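The two parts above could be sketched roughly like this. It is only a minimal sketch, not the exact functions I use: the names run_and_log, log_entry, RESULTS_DIR and LAST_LOG_FILE are made up for illustration, and the file name pattern is simply copied from the log links above.

```shell
# Hypothetical helper: run any test command, show its output, and keep a copy.
# Usage: run_and_log <label> <command> [args...]
run_and_log() {
    local label="$1"; shift
    local results_dir="${RESULTS_DIR:-test_results}"   # directory name taken from the log links
    local stamp commit

    stamp="$(date +%Y_%m_%d_%H_%M_%S)"
    # Short commit hash; "nogit" is a placeholder used outside a git repository.
    commit="$(git rev-parse --verify --short HEAD 2>/dev/null || echo nogit)"
    # Deliberately global, so the log can be linked or committed afterwards.
    LAST_LOG_FILE="${results_dir}/${stamp}__${commit}__${label}.log"

    mkdir -p "$results_dir"
    # tee prints stdout and stderr to the terminal and writes them to the log.
    # Note: the pipeline's exit status is tee's, not the test command's.
    "$@" 2>&1 | tee "$LAST_LOG_FILE"
}

# Hypothetical helper: print an Org-style change-log bullet linking to the last log.
log_entry() {
    printf '%s %s [[file:%s]]\n' '-' "$1" "$LAST_LOG_FILE"
}
```

A run could then look like `run_and_log fast_and_cuda <test command>`, followed by `git add "$LAST_LOG_FILE"`, a commit, and pasting the output of `log_entry "short summary"` into the log file.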
Benefits
It changed my development cycle a bit: before I claim something is done or fixed, I run this process, which takes about 2 minutes for the two fast runs. I use this time to reflect on what I've done so far, write down a short summary of what got fixed and what broke, check the test results into git, update the test log file, etc.
It sounds tedious, but I found myself enjoying it. It gives me confidence and reassurance about the progress I'm making. The time spent reflecting also gives my brain a break and provides clarity on where to go next.
In my first few hours of using it, I was amazed at how easy it is to introduce new issues while fixing existing ones.
Implement in Org-mode
I don't have to use Org-mode for this, but I don't want to leave Emacs :) Plus, Org-mode shines in literate programming where code and documentation are put together.
This is actually how I implemented it in the first place. This section is dedicated to showing how to do it in Org-mode. I'm sure I will come back to this shortly, so it serves as documentation for myself.
Here is what I did: I have a file called poorman_ci.org; a full example can be found at this gist. An extract is shown below.
I group all the tests logically into "fast and cpu", "fast and cuda", "slow and cpu", and "slow and cuda". I have a top-level header named grouped tests, and each group has its own second-level header.
The top header has a property drawer where I specify the shell session within which the tests are run so that
* grouped tests
:PROPERTIES:
:CREATED: [2022-12-10 Sat 11:32]
:header-args:sh: :session *hw4_test_runner* :async :results output :eval no
:END:
- it is persistent: I can switch to the shell buffer named *hw4_test_runner* and do something there if needed
- it runs asynchronously in the background
All the shell code blocks under the grouped tests header inherit these attributes.
The first code block defines the variables that are used to create a run id from the timestamp and the git commit hash. The run id is shared by all the code blocks.
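A minimal sketch of such a block, assuming the run id follows the pattern seen in the log file names above (for example 2022_12_11_12_48_44__fce5edb). The variable names timestamp, commit and run_id are illustrative, and the "nogit" fallback is my own addition for runs outside a repository.

```shell
# Timestamp in the same layout as the log file names, e.g. 2022_12_11_12_48_44
timestamp="$(date +%Y_%m_%d_%H_%M_%S)"

# Short hash of the current commit; falls back to "nogit" if there is no commit to read.
commit="$(git rev-parse --verify --short HEAD 2>/dev/null || echo nogit)"

# The run id ties a test log to both a point in time and a commit.
run_id="${timestamp}__${commit}"
echo "$run_id"
```

Because the session is persistent, a variable like run_id stays available to the blocks that run later in the same shell.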
To run the code block, move the cursor inside it and hit C-c C-c (control-c control-c).
Then I define the first test code block, which runs all the tests on CPU except language model training. I name this batch of tests "fast and cpu".
- It creates the full path of the test results. The fname variable is set in the code block header; this is a nice feature of Org-mode.
- pytest provides an intuitive interface for filtering tests; here I use "not language_training and cpu".
- The tee program is used to show the outputs and errors and at the same time save them to a file.
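Put together, the body of the "fast and cpu" block might look roughly like the sketch below. Several things here are assumptions: I wrap the body in a function called run_fast_cpu so it can be tried outside Org-mode, TEST_RUNNER and RESULTS_DIR are hypothetical overrides, and I assume the filter is passed through pytest's -k option, which selects tests by keyword expression. In Org-mode, fname would come from the block header, for example `:var fname="fast_and_cpu"`.

```shell
# Sketch of the "fast and cpu" block body, wrapped in a function for reuse.
run_fast_cpu() {
    local group="${fname:-fast_and_cpu}"                       # set via the block header in Org-mode
    local id="${run_id:-$(date +%Y_%m_%d_%H_%M_%S)__unknown}"  # normally set by the run id block
    local runner="${TEST_RUNNER:-python3 -m pytest}"           # overridable, handy for dry runs
    local log_file="${RESULTS_DIR:-test_results}/${id}__${group}.log"

    mkdir -p "$(dirname "$log_file")"
    # -k keeps only tests matching the expression (assumed flag, see above).
    # 2>&1 merges errors into the output so that tee shows and saves both.
    $runner -k "not language_training and cpu" 2>&1 | tee "$log_file"
}
```

Setting `TEST_RUNNER=echo` before calling run_fast_cpu gives a dry run that only records the arguments pytest would receive.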
Similarly, I define code blocks for "fast and cuda", "slow and cpu", "slow and cuda".
So at the end of the development cycle, I open the poorman_ci.org file, run the code blocks sequentially, and manually update the change log. That's all.