I have been working on the Deep Learning Systems course. It is the
hardest course I have studied since university. I would never have
thought that I'd need CI for a personal study project. It just shows
how complex this course is.
Here is the setup: the goal is to develop a PyTorch-like DL library
that supports ndarray ops and autograd, and to implement DL models,
LSTM for example, from scratch. That's the exciting math part. The
tricky part is that it supports both CPU devices with C++11 and GPU
devices with CUDA. On the user-facing side, the interface is written
in Python. I worked on my M1 laptop most of the time, and switched to
my Debian desktop for the CUDA implementation.
It was a fine Saturday afternoon. I had made a breakthrough in
implementing the gradient of the convolution ops in Python after a
couple of hours of tinkering in a local coffee shop. I rushed home and
booted up Debian to test the CUDA backend, only to be greeted by an
"illegal memory access" error!
It took me a few cycles of rolling back to previous changes in git to
find where the problem was. That made me think about the need for
CI. Ideally, I would have a CI that automatically runs the tests on
both CPU and CUDA devices to ensure a bug fix on the CPU side doesn't
introduce new bugs on the CUDA side, and vice versa. But I don't have
this setup at home.
Two Components of PoorMan CI
So I implemented what I call PoorMan CI. It is a semi-automated
process that gives me some of the benefits of full CI. I tried hard to
refrain from doing anything fancy because I don't have time: the final
homework is due in a few days. The outcome is simple yet powerful.
The PoorMan CI consists of two parts:
- a bunch of bash functions that I can call to run the tests, capture
  the outputs, save them in a file, and version-control them.
  For example, wrap the snippet below in a single function:

  pytest -l -v -k "not training and cuda" > test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log
  git add test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log
- a log file where I keep track of the code changes, noting whether
  each new change fixes anything or breaks anything.
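As a sketch, one such function might look like the following. The
function name and the mkdir -p call are my additions to keep it
self-contained; the pytest and git lines mirror the snippet above.

```shell
# Hypothetical wrapper for the "fast and cuda" batch: run the tests,
# save the log under a timestamp + commit-hash run id, and stage it.
run_fast_and_cuda() {
  local wd="test_results"
  local ts git_hash log
  ts=$(date +"%Y_%m_%d_%H_%M_%S")
  git_hash=$(git rev-parse --verify --short HEAD)
  log="${wd}/${ts}__${git_hash}__fast_and_cuda.log"
  mkdir -p "${wd}"
  # show the output on screen and save it to the log at the same time
  pytest -l -v -k "not training and cuda" 2>&1 | tee "${log}"
  git add "${log}"
}
```

One function per test group keeps each run a single command away.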
In the example below, I have a bullet point for each change committed
to git, with a short summary and a link to the test results;
fce5edb and f43d7ab are the git commit hashes.
- fix grid setup, from (M, N) to (P, M)!
[[file:test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log]]
- ensure all data/parameters are in the right device. cpu and cuda, all pass! milestone.
[[file:test_results/2022_12_11_13_51_22__f43d7ab__fast_and_cuda.log]]
As you can see, it is very simple!
Benefits
It changed my development cycle a bit: each time, before I can claim
something is done or fixed, I run this process, which takes about 2
minutes for two fast runs. I use this time to reflect on what I've
done so far, write a short summary of what got fixed and what broke,
check the test results into git, update the test log file, etc.
It sounds tedious, but I found myself enjoying it: it gives me
confidence and reassurance about the progress I'm making. The time
spent reflecting also gives my brain a break and provides clarity on
where to go next.
During my few hours of using it, it amazed me how easy it is to
introduce new issues while fixing existing ones.
Implementation in Org-mode
I don't have to use Org-mode for this, but I don't want to leave Emacs
:) Plus, Org-mode shines at literate programming, where code and
documentation live together.
This is actually how I implemented it in the first place. This section
is dedicated to showing how to do it in Org-mode. I'm sure I will come
back to this shortly, so it serves as documentation for myself.
Here is what I did: I have a file called poorman_ci.org; a full
example can be found in this gist. An extract is shown below.
I group all the tests logically into "fast and cpu", "fast and cuda",
"slow and cpu", and "slow and cuda". I have a top-level header named
grouped tests, and each group has its own second-level header.
The top header has a property drawer where I specify the shell session
within which the tests are run:
* grouped tests
:PROPERTIES:
:CREATED: [2022-12-10 Sat 11:32]
:header-args:sh: :session *hw4_test_runner* :async :results output :eval no
:END:
This gives me two things: the session is persistent, so I can switch
to the shell buffer named hw4_test_runner and poke around if needed;
and the blocks run asynchronously in the background.
All the shell code blocks under grouped tests inherit those
attributes.
The first code block defines the variables used to create a run id,
based on the timestamp and the git commit hash. The run id is shared
by all the code blocks.
#+begin_src sh :eval no
wd="./test_results/"
ts=$(date +"%Y_%m_%d_%H_%M_%S")
git_hash=$(git rev-parse --verify --short HEAD)
echo "run id: "${ts}__${git_hash}
#+end_src
To run the code block, move the cursor inside the code block, and hit C-c
C-c (control c control c).
Then I define the code block that runs all the tests on CPU except
language model training. I name this batch of tests "fast and cpu".
#+begin_src sh :var fname="fast_and_cpu.log"
fname_full=${wd}/${ts}__${git_hash}__${fname}
pytest -l -v -k "not language_training and cpu" \
  2>&1 | tee ${fname_full}
#+end_src
- It creates the full path of the test results. The fname variable is
  set in the code block header, which is a nice feature of Org-mode.
- pytest provides an intuitive interface for filtering tests; here I
  use "not language_training and cpu".
- The tee program shows the outputs and errors and at the same time
  saves them to a file.
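tee's behavior is easy to check in isolation; here is a throwaway demo
(the /tmp path and the fake pytest-style lines are just for
illustration):

```shell
# tee copies its stdin to stdout AND to the named file.
printf 'collected 2 items\n2 passed\n' | tee /tmp/tee_demo.log
# the file now holds the same two lines
cat /tmp/tee_demo.log
```

This is why the test output stays visible in the shell buffer while a
complete copy lands in test_results/.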
Similarly, I define code blocks for "fast and cuda", "slow and cpu",
"slow and cuda".
So at the end of the development cycle, I open the poorman_ci.org
file, run the code blocks sequentially, and manually update the change
log. That's all.