<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Yi Tang</title>
 <link href="http://yitang.uk/atom.xml" rel="self"/>
 <link href="http://yitang.uk/"/>
 <updated>2026-04-25T22:11:54+01:00</updated>
 <id>http://yitang.uk</id>
 <author>
   <name>Yi Tang</name>
   <email>yi.tang.uni@gmail.com</email>
 </author>

 
 <entry>
   <title>Giving Qwen 3.6 35B Vision</title>
   <link href="http://yitang.uk/ai/2026/04/25/giving-qwen-36-35b-vision/"/>
   <updated>2026-04-25T00:00:00+01:00</updated>
   <id>http://yitang.uk/ai/2026/04/25/giving-qwen-36-35b-vision</id>
   <content type="html">&lt;p&gt;Qwen 3.6 35b has been a fantastic thinking companion for me, anything
that I don’t know, I am not comfortable with, or having doubts
with, I would check with it. I found Qwen 3.6 + DeerFlow 2.0 is much
better than the paid version of Grok, and miles better than
Perplexity.&lt;/p&gt;

&lt;p&gt;Today, I made it even better by giving it vision. Earlier I uploaded
an image of my staircase and asked it to check its condition while I
plan the staircase renovation project.&lt;/p&gt;

&lt;p&gt;This blog post highlights the key steps of how I did it.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Firstly, Qwen 3.6 already has a vision encoder built in, but it
requires an additional &lt;strong&gt;mmproj&lt;/strong&gt; component to make it work.
Honestly, I have no idea what that means at the moment; I just
think of it as the eyes of the LLM.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Download the mmproj file from the Unsloth Qwen 3.6 repo&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, add the
path to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--mmproj&lt;/code&gt; argument of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llama-server&lt;/code&gt; command, and restart
llama-server. That’s it.&lt;/p&gt;

    &lt;p&gt;The vision component requires an additional 1-2GB of VRAM, so to make
everything fit on an RTX 3090, I had to quantize the mmproj component from
BF16 to Q4:&lt;/p&gt;

    &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;llama-quantize mmproj-BF16.gguf mmproj-Q4_K_M.gguf Q4_K_M
llama-server &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; Qwen3.6-35B-A3B-UD-Q4_K_M.gguf &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
             &lt;span class=&quot;nt&quot;&gt;--mmproj&lt;/span&gt; mmproj-Q4_K_M.gguf &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
             ...  &lt;span class=&quot;c&quot;&gt;# rest of the llama-server arguments&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;To test it,
    &lt;ol&gt;
      &lt;li&gt;
        &lt;p&gt;Check that the mmproj is loaded successfully in the llama.cpp log:&lt;/p&gt;

        &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;alloc_compute_meta: graph splits = 1, nodes = 823
warmup: flash attention is enabled
srv    load_model: loaded multimodal model, &apos;mmproj-BF16.gguf&apos;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;        &lt;/div&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Ask the Qwen 3.6 35B model to describe a small image, using
this snippet:&lt;/p&gt;

        &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-X&lt;/span&gt; POST http://192.168.1.34:8000/v1/chat/completions &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;-H&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Content-Type: application/json&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;{
    &quot;model&quot;: &quot;Qwen3.6-35B-A3B&quot;,
    &quot;messages&quot;: [{
      &quot;role&quot;: &quot;user&quot;,
      &quot;content&quot;: [
        {&quot;type&quot;: &quot;image_url&quot;, &quot;image_url&quot;: {&quot;url&quot;: &quot;https://picsum.photos/512/512&quot;}},
        {&quot;type&quot;: &quot;text&quot;, &quot;text&quot;: &quot;Describe this image&quot;}
      ]
    }],
    &quot;max_tokens&quot;: 100
  }&apos;&lt;/span&gt; | jq
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;        &lt;/div&gt;

        &lt;p&gt;This is the response I got, which confirms it works. The URL
serves a different image from time to time, so the response will
vary.&lt;/p&gt;

        &lt;blockquote&gt;
          &lt;p&gt;The image is a scenic landscape photograph, likely taken in late autumn or winter. It features a vast mountain range in the background, rolling hills in the mid-ground covered in snow and trees, and a foreground of dry, grassy terrain. The sky is dramatic with a mix of blue and warm sunset/sunrise colors.\n\n**2. Breaking down the image into layers&lt;/p&gt;
        &lt;/blockquote&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;If step 1 succeeds but step 2 fails, grep the log file for vision or
image. For example, this is what I got when I misspelled mmproj in the
llama-server command at one point:&lt;/p&gt;

        &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;print_info: PAD token             = 248055 &apos;&amp;lt;|vision_pad|&amp;gt;&apos;
srv    operator(): got exception: {&quot;error&quot;:{&quot;code&quot;:500,&quot;message&quot;:&quot;image input is not supported - hint: if this is unexpected, you may need to provide the mmproj&quot;,&quot;type&quot;:&quot;server_error&quot;}}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;        &lt;/div&gt;
      &lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The model is now equipped for vision tasks. The next step is to enable vision
in DeerFlow 2.0: all I need is to set &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;supports_vision&lt;/code&gt; to true in the
config. The full model spec is listed below to avoid ambiguity:&lt;/p&gt;

    &lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;models&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;                                                                                
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Qwen3.6-35B&lt;/span&gt;                                                                    
  &lt;span class=&quot;na&quot;&gt;display_name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Qwen 3.6 35B (RTX 3090)&lt;/span&gt;                                                
  &lt;span class=&quot;na&quot;&gt;use&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;langchain_openai:ChatOpenAI&lt;/span&gt;                                                     
  &lt;span class=&quot;na&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Qwen3.6-35B&lt;/span&gt;                                                                   
  &lt;span class=&quot;na&quot;&gt;base_url&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;http://192.168.1.34:8000/v1&lt;/span&gt;                                                
  &lt;span class=&quot;na&quot;&gt;api_key&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;dummy_key&lt;/span&gt;                                                                   
  &lt;span class=&quot;na&quot;&gt;supports_thinking&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;                                                              
  &lt;span class=&quot;na&quot;&gt;supports_reasoning_effort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;                                                      
  &lt;span class=&quot;na&quot;&gt;supports_vision&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;                                                                
  &lt;span class=&quot;na&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;600&lt;/span&gt;    
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;I had to increase the timeout to 10 minutes because the vision
component is a lot slower than text generation. With the default
value, DeerFlow throws errors, thinking the LLM is not
responding. The vision component can be optimised later to reduce
the runtime, but so far so good.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Now test DeerFlow 2.0. Restart the services (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make docker-stop &amp;amp;&amp;amp;
   make docker-start&lt;/code&gt;), open a new chat, upload a PNG file, ask it
to describe the image, wait for a bit, then boom!&lt;/p&gt;

    &lt;p&gt;I can also copy an image and paste it into DeerFlow, which is a very
nice interface.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
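
&lt;p&gt;The curl test above points at a remote URL. For a local photo, one option
is to send the file inline as a base64 data URI. The snippet below is only a
sketch: it assumes the server accepts data URIs in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;image_url&lt;/code&gt; (as OpenAI-compatible
endpoints generally do), that the file is a PNG, and that the prompt contains
no double quotes. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_payload&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;staircase.png&lt;/code&gt; are hypothetical names.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Build a chat request for a local PNG file (hypothetical helper).
# $1: path to the image, $2: prompt text without double quotes.
build_payload() {
  img_b64=$(base64 &lt; &quot;$1&quot; | tr -d &apos;\n&apos;)   # strip the line wraps base64 adds
  printf &apos;{&quot;model&quot;:&quot;Qwen3.6-35B-A3B&quot;,&quot;messages&quot;:[{&quot;role&quot;:&quot;user&quot;,&quot;content&quot;:[&apos;
  printf &apos;{&quot;type&quot;:&quot;image_url&quot;,&quot;image_url&quot;:{&quot;url&quot;:&quot;data:image/png;base64,%s&quot;}},&apos; &quot;$img_b64&quot;
  printf &apos;{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;%s&quot;}]}],&quot;max_tokens&quot;:100}\n&apos; &quot;$2&quot;
}

# Usage (same endpoint as the curl test above):
# build_payload staircase.png &quot;Describe this image&quot; |
#   curl -s http://192.168.1.34:8000/v1/chat/completions \
#     -H &quot;Content-Type: application/json&quot; -d @- | jq
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Building the JSON in a function keeps the network call separate, so the
payload can be inspected before anything is sent.&lt;/p&gt;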

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Qwen 3.6 describes an uploaded image in DeerFlow 2.0 &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20260425215703screenshot.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;600&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF&quot;&gt;https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Multiple Working Emacs</title>
   <link href="http://yitang.uk/emacs/2025/08/22/multiple-working-emacs/"/>
   <updated>2025-08-22T00:00:00+01:00</updated>
   <id>http://yitang.uk/emacs/2025/08/22/multiple-working-emacs</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;I work solely inside of Emacs, so when Emacs is down, I cannot do any
work. Emacs itself is very reliable, but there might be some risks
of downtime when upgrading Emacs or any of the 3rd party libraries
that I use.&lt;/p&gt;

&lt;p&gt;The downtime can be minimised by always having multiple Emacs versions
and their 3rd party libraries available. This blog post documents
how I implement it.&lt;/p&gt;

&lt;h2 id=&quot;installation&quot;&gt;Installation&lt;/h2&gt;

&lt;p&gt;Firstly, install each Emacs into its separate folder, e.g. on my
Debian box, I have ~/bin/emacs30.0.92/ installed 8 months ago and
~/bin/emacs30.2/ installed yesterday. This is easy to achieve by adding
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prefix&lt;/code&gt; option when building Emacs from source, e.g.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
./configure &lt;span class=&quot;nt&quot;&gt;--with-tree-sitter&lt;/span&gt;  &lt;span class=&quot;nt&quot;&gt;--prefix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$HOME&lt;/span&gt;/bin/emacs30.2&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h2 id=&quot;daemon&quot;&gt;Daemon&lt;/h2&gt;

&lt;p&gt;Then have a separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;systemd&lt;/code&gt; service for each Emacs version. Taking
version 30.2 as an example, its unit file is saved as
~/.config/systemd/user/emacs30.2.service.&lt;/p&gt;

&lt;p&gt;In that unit file, the Emacs executable is specified with the full path to
wherever it is installed:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[Unit]
Description=Emacs text editor
Documentation=info:emacs man:emacs(1) https://gnu.org/software/emacs/
After=graphical-session.target


[Service]
Type=simple
ExecStart=%h/bin/emacs30.2/bin/emacs --fg-daemon=work --init-directory=%h/.config/emacs/emacs.d_30.2
ExecStop=%h/bin/emacs30.2/bin/emacsclient -s work --eval &quot;(kill-emacs)&quot;
Environment=SSH_AUTH_SOCK=%t/keyring/ssh
Restart=on-failure

[Install]
WantedBy=graphical-session.target
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the unit file, I also added the &lt;a href=&quot;https://www.gnu.org/software/emacs/manual/html_node/emacs/Initial-Options.html&quot;&gt;initial option&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init-directory&lt;/code&gt; so
it has its own .emacs.d directory. It ensures the 3rd party packages
will be installed there.&lt;/p&gt;

&lt;p&gt;Note that if there is an init.el file in that directory, Emacs will use
that instead of the ancient ~/.emacs file.&lt;/p&gt;
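
&lt;p&gt;With the unit file saved, the service still has to be registered with
systemd. A minimal sketch, assuming the standard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;systemctl --user&lt;/code&gt; subcommands and
the emacsVERSION.service naming used above; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;emacs_service_cmds&lt;/code&gt; is a
hypothetical helper that only prints the commands, so nothing runs until
they are piped to a shell.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# Print the systemctl commands for one Emacs version (dry run).
emacs_service_cmds() {
  unit=emacs$1.service
  echo systemctl --user daemon-reload            # pick up the new unit file
  echo systemctl --user enable --now $unit       # start now and at login
  echo systemctl --user status $unit             # check that it is running
}

emacs_service_cmds 30.2
# To actually run them: emacs_service_cmds 30.2 | sh&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;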

&lt;h2 id=&quot;gui&quot;&gt;GUI&lt;/h2&gt;

&lt;p&gt;Finally, to open an Emacs GUI that connects to the Emacs 30.2 daemon,
run&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
~/bin/emacs30.2/bin/emacsclient &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; work &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; .&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;from the command line.&lt;/p&gt;

&lt;p&gt;I sometimes find it more natural to have a desktop application for the
GUI, so I have a ~/.local/share/applications/emacsclient-30.2.desktop
file with the following content:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[Desktop Entry]
Name=Emacs 30.2 (Client)
GenericName=Text Editor
Comment=Edit text
MimeType=text/english;text/plain;text/x-makefile;text/x-c++hdr;text/x-c++src;text/x-chdr;text/x-csrc;text/x-java;text/x-moc;text/x-pascal;text/x-tcl;text/x-tex;application/x-shellscript;text/x-c;text/x-c++;x-scheme-handler/org-protocol;
Exec=~/bin/emacs30.2/bin/emacsclient --create-frame -s work %F
Icon=emacs
Type=Application
Terminal=false
Categories=Development;TextEditor;
StartupNotify=true
StartupWMClass=Emacs
Keywords=emacsclient;
Actions=new-window;new-instance;

[Desktop Action new-window]
Name=New Window
Exec=~/bin/emacs30.2/bin/emacsclient --create-frame -s work %F

[Desktop Action new-instance]
Name=New Instance
Exec=~/bin/emacs30.2/bin/emacsclient --create-frame -s work %F
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;not-perfect-but-close&quot;&gt;Not Perfect But Close&lt;/h2&gt;

&lt;p&gt;There could still be some risks of downtime due to conflicts between
Emacs/package versions, or caused by updating the OS/other
programs. These cases are rare, so this setup is good enough for me.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Rebate Architrave</title>
   <link href="http://yitang.uk/diy/2025/08/16/rebate-architrave/"/>
   <updated>2025-08-16T00:00:00+01:00</updated>
   <id>http://yitang.uk/diy/2025/08/16/rebate-architrave</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;After a chill day walking along the canal path in London, at 6:30 pm I
was keen to continue with the home office project.&lt;/p&gt;

&lt;p&gt;The existing wall is not plumb, so when I put the architrave on, there’s
a gap. This is a typical issue, and a small gap (less than 3mm) can be
filled with decorator’s caulk. In my case, the bottom has a 10mm gap, which I
have to address.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; 10mm gap at the bottom between the architrave and the wall &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20250816140134screenshot.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;450&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;In general, there are two ways: either add a small piece to the door
lining to fill the gap, or rebate the architrave to accommodate the
wall protrusion. I jumped to the rebate approach as I didn’t have any
additional strips of wood for the first approach (happy skip days).&lt;/p&gt;

&lt;p&gt;A quick measurement told me the architrave needs a rebate 45mm
wide, with a varying depth: it starts at a height of 1300mm and reaches
10mm deep at the bottom.&lt;/p&gt;

&lt;p&gt;The easiest way to do this in this scenario is to cut 10mm deep across
the board, as it is okay to have some voids behind the architrave, and
there is still 25mm for the architrave to be fixed on.&lt;/p&gt;

&lt;p&gt;My first attempt was with a track saw: the first cut was at the 45mm line,
from the bottom all the way to the 1300mm mark. The next cut was right
next to the previous one to widen the rebate area, and this
process is repeated many times to cover the whole 45mm width. The groove in the
photos below was made from 3-4 passes.&lt;/p&gt;

&lt;p&gt;With a blade kerf of 1.8mm, I figured it would take 25 cuts to cover
45mm. My efficiency-seeking brain took over and said: there must be
a better way.&lt;/p&gt;
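
&lt;p&gt;The 25-cut figure is just the rebate width divided by the kerf, rounded
up. A quick sketch of that arithmetic, assuming every pass removes exactly
one kerf width with no overlap; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;passes_needed&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Number of side-by-side saw passes needed to clear a rebate.
# Works in tenths of a millimetre because shell arithmetic is integer-only.
passes_needed() {
  width=$1   # rebate width in tenths of a mm
  kerf=$2    # blade kerf in tenths of a mm
  echo $(( (width + kerf - 1) / kerf ))   # ceiling division
}

passes_needed 450 18   # 45mm rebate, 1.8mm kerf: prints 25
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;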

&lt;p&gt;So I pulled out the Dewalt router from the drawer, set the depth, and
clamped the architrave down to the table. The immediate problem I
faced was that it didn’t cut in a straight line: it went like 45-60
degrees for some reason, so I couldn’t cut a long groove like I had
done with a track saw.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Rebate using a track saw and a router &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20250816135923screenshot.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;450&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;So I turned the router 90 degrees and cut small, short chunks
instead. It worked well: the grooves I had cut with the track saw served as
a stopping line so I wouldn’t overcut. It was not perfect because
tons of dust came out of the router, and it made so much
noise.&lt;/p&gt;

&lt;p&gt;I put my headphones on and made a few more passes. I started seeing
how it could be done for the whole 1300mm length. Then I saw my neighbour
over the fence, asking what I was doing. Well, it turned out to be
7:30 pm already, so I had to stop and leave it for tomorrow.&lt;/p&gt;

&lt;p&gt;In hindsight, the track saw can do a much better job because I
realised only 5-10 passes would be enough. The small pieces between
grooves can be knocked off rather easily using a chisel. The track saw
has better dust collection, and the noise is much lower.&lt;/p&gt;

&lt;p&gt;Another completely different approach is to remove the protrusion on
the wall using a multi-tool: placing the blade on the door lining so
the cuts will be flush with the door lining, and pre-cutting the 45mm
line to have a neat finish.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Terminating Ethernet Cable At Height For CCTV Cameras</title>
   <link href="http://yitang.uk/diy/2025/08/06/better-way-of-terminating-cat6-ethernet-cable/"/>
   <updated>2025-08-06T00:00:00+01:00</updated>
   <id>http://yitang.uk/diy/2025/08/06/better-way-of-terminating-cat6-ethernet-cable</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;on-the-ground&quot;&gt;On the Ground&lt;/h2&gt;

&lt;p&gt;The standard Cat 6 plug is a pain to work with: I have to untwist the 4
pairs, make them perfectly straight, lay the 8 wires side by side with
no gaps, and then insert all of them into the RJ45 plug in one go.&lt;/p&gt;

&lt;p&gt;It sounds easy, but since the wires are flexible, it is actually very
hard: very often the wires move around and become misaligned or
misplaced during the fitting. If that happened or any other part of it
went wrong, I would have to pull out the whole lot and restart again.&lt;/p&gt;

&lt;p&gt;I have succeeded before, usually after a couple of attempts, often
accompanied by frustration in between. It requires me to activate the
fight mode, give it 100% focus while sitting in an “Orz”
position&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, so there’s quite a lot of energy poured into it.&lt;/p&gt;

&lt;h2 id=&quot;at-height&quot;&gt;At Height&lt;/h2&gt;

&lt;p&gt;However, even if I want to, it becomes physically impossible when it
comes to fitting a plug in the air for the CCTV cameras: the ladder is
a bit wobbly with uneven ground underneath it, and it is windy and
raining due to a summer storm.&lt;/p&gt;

&lt;p&gt;Since I wasn’t happy with the normal Cat 6 plug, I was keen to try new
products. So when I first saw the &lt;a href=&quot;https://www.kenable.co.uk/en/networking/network-accessories/network-plugs-couplers/12934-idc-punch-down-to-rj45-plug-for-cat6a-solid-ethernet-cable-connector-4-pack--5054338129341.html&quot;&gt;IDC Punch Down to RJ45 Plug from
Kenable&lt;/a&gt; &lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, I ordered a few.  It turned out to be a smart little
move (this time).&lt;/p&gt;

&lt;p&gt;This product has a built-in RJ45 plug that is already wired up, so I
can skip that difficult part. All I have to do is punch down the wires
into the IDC terminal. Punching down itself is very easy; I can do it
with one hand without much thought.&lt;/p&gt;

&lt;p&gt;Another benefit is that I can split the fitting into multiple steps,
and I can take mini breaks for my arms between steps. Once one or two
wires are inserted into the IDC terminal, it binds the cable to the
plug. The binding is strong, so it hangs in the air and swings a bit
with the wind with no issues. Then I take my time for the rest of the
wires. If you don’t appreciate how important it is, trust me, your
arms become rather fatigued when working with your hands overhead.&lt;/p&gt;

&lt;p&gt;The only flaw with this product is that the punch-down slots are too
wide for the &lt;a href=&quot;https://www.kenable.co.uk/en/networking/network-accessories/network-tools-testers/6212-newlink-newlink-adjustable-impact-punch-push-down-tool-for-idc-terminals-006212-5055383462124.html&quot;&gt;adjustable impact punch-down tool&lt;/a&gt;; I was lucky to have a
&lt;a href=&quot;https://www.kenable.co.uk/en/networking/network-accessories/network-tools-testers/2391-punch-down-utp-cable-cutter-stripper-tool-idc-network-yellow-002391-5055383423910.html&quot;&gt;tiny punch-down tool&lt;/a&gt; at hand to use.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; For people who don’t know what “Orz” stands for, “O” is head, “z”
is legs and hips, and “r” is arms.&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; This post is not affiliated with Kenable.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Extend Ethernet Cable</title>
   <link href="http://yitang.uk/diy/2025/08/05/extend-ethernet-cable/"/>
   <updated>2025-08-05T00:00:00+01:00</updated>
   <id>http://yitang.uk/diy/2025/08/05/extend-ethernet-cable</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;I had to extend the main Ethernet cable that connects the main router
in the living room to the secondary router in the new
office. Technically, the cable length was spot on, but I had to cut it back
2-3 times because of a combination of my lack of experience and the rubbish LAP
data module from Screwfix.&lt;/p&gt;

&lt;p&gt;It seems like an unusual task, given that there are only a few products
available on the market. I tested two, and I am happy with the
results, so I am documenting them here for people who might find it useful.&lt;/p&gt;

&lt;h2 id=&quot;jelly-crimps&quot;&gt;Jelly Crimps&lt;/h2&gt;

&lt;p&gt;The first product I tested was from my electrician. It took me a while
to find out that its name is Jelly Crimps.  You can get it from &lt;a href=&quot;https://www.tlc-direct.co.uk/Main_Index/Cable_Accessories_Index/Jelly_Crimp/index.html&quot;&gt;TLC&lt;/a&gt; or &lt;a href=&quot;https://www.amazon.co.uk/Telephone-Connectors-Waterproof-Connector-Network/dp/B0F7HKL2CS/ref=sr_1_30?dib=eyJ2IjoiMSJ9.W9chBkMOovw0A1UQQcTibL0RLeXhlHIY06AtAiLVqmzefgzZg5lUvIRAA8PxTZVG1EiE1bIUw10OOG_eJ4PY4PkOp7lpuHpckdZ8gIjUr1Tc2unDzpeo4F2HFd97ogjs1vQXgiGcdTf2KUX7rTjN4enJINzQPjciopDeyQdLdETypdFQTJSBt5xy0bt1qEyevdjFMKZCCfI3nM_02iiuHv_RslkfrE1tgoFypeAxtyuctMzMtKRh0N1QVxakdCfBMCrTG8sQOh7PAq5FPfXrQb3GazGXnJIz1JwJqEWOkCI.BnL2tcZpRKIWR8B1C0gHCSeJQsi2IZOkkScciGJyKrI&amp;amp;dib_tag=se&amp;amp;keywords=jelly+crimps&amp;amp;qid=1754373965&amp;amp;sr=8-30&quot;&gt;Amazon&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The little connector has two long sleeves that host two wires. It has
a button in the middle; press it very hard, and it will release the
gel. I highly recommend using a pair of pliers unless you have super strong
fingers.&lt;/p&gt;

&lt;p&gt;The process is simple: insert the wires, press with pliers to release
the gel, and repeat for each of the 8 wires.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Jelly Crimps in Use &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20250805071027shopping.jpeg&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;250&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;It costs about £0.2 to extend one cable, so it is very
cost-effective. I wasn’t sure it would work, but it does, and my
electrician vouches for it.&lt;/p&gt;

&lt;p&gt;The only problem with this product is that it is not
maintenance-free. According to my electrician, I will have to put
these connectors into a back box and put a front cover over it, which
turns it into a much bigger job.&lt;/p&gt;

&lt;h2 id=&quot;inline-coupler-from-kenable&quot;&gt;Inline Coupler from Kenable&lt;/h2&gt;

&lt;p&gt;So I decided to look for a better solution, and I found this &lt;a href=&quot;https://www.kenable.co.uk/en/networking/network-accessories/network-plugs-couplers/2232-inline-punch-down-coupler-for-lan-cables-cat6-white-002232-5055383422326.html&quot;&gt;Cat 6
Inline Coupler from Kenable&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It has two termination blocks built in, one for the incoming cable and
one for the outgoing. There is a diagram of the Type B (T568B) wiring
printed on the product, so I don’t have to look it up on my phone. All
I have to do is punch down the 16 wires one by one. With a quality
punch-down tool, it is a lot easier and quicker than I thought.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Inline Coupler In Use &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20250805082349screenshot.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;250&quot; align=&quot;center&quot; /&gt;
  &lt;!-- &lt;figcaption --&gt;
  &lt;!--   style=&quot;text-align: center;&quot; --&gt;
  &lt;!--   &gt; --&gt;
  &lt;!--   &lt;sup&gt;&lt;em&gt; Inline Coupler In Use &lt;/em&gt;&lt;/sup&gt; --&gt;
  &lt;!-- &lt;/figcaption&gt; --&gt;
&lt;/p&gt;
&lt;!-- &lt;/figure&gt; --&gt;

&lt;p&gt;The product itself is solid, much better quality than the LAP data
module. I didn’t have to worry about damaging the terminal or face
plate when pushing it against the wall while using a punch-down tool.&lt;/p&gt;

&lt;p&gt;The size hits a sweet spot: at about 24mm deep, it is just small enough
to tuck into the 25mm service void. I am not sure if it is maintenance-free or
not, but I am comfortable leaving it in the service void as it has an
enclosing cover on it.&lt;/p&gt;

&lt;p&gt;Kenable is the only place that sells it at a reasonable price, about
£2 each, while the other sellers are asking for £5, so thank you,
Kenable, for making it affordable.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Full 500 Mbps Speed in the Office &lt;/em&gt;
  
  &lt;img src=&quot;/assets/20250805074806screenshot.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;250&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Rediscovering the Ancient Info Documentation System in the Age of LLMs</title>
   <link href="http://yitang.uk/2025/04/21/rediscover-the-info-documentation-system/"/>
   <updated>2025-04-21T00:00:00+01:00</updated>
   <id>http://yitang.uk/2025/04/21/rediscover-the-info-documentation-system</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;


&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Travel with Info&quot;&gt;Travel with Info&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Why Info is not Popular?&quot;&gt;Why Info is not Popular?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#=dir= the Index File&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; the Index File&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Setup Info in MacOS&quot;&gt;Setup Info in MacOS&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Travel with Info&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;travel-with-info&quot;&gt;Travel with Info&lt;/h2&gt;

&lt;p&gt;The best time to learn a difficult thing is while travelling. During my
recent two-week trip to Singapore and Malaysia, I was casually reading about
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ledger Cli&lt;/code&gt;. Without much effort, it clicked. It
suddenly started to make sense to me.&lt;/p&gt;

&lt;p&gt;The more I learnt, the more I wanted to learn. I could not wait for the
next opportunity to open Emacs and dive into Ledger’s brilliant
documentation. This is me with my Emacs in Changi Airport next to the
Jewel.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;br /&gt;
  &lt;em&gt; Emacsing next to the Rain Vortex in Changi Airport &lt;/em&gt;
  
  &lt;img src=&quot;/assets/092ecff4ffeb4f21b38d1bfa6f6fbea71105c.jpeg&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;450&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;I was able to apply what I had learnt and came up with the project rule to
keep my data clean (I will blog about it next). The positive feedback energised
me. The flight to London was ready for boarding, but I didn’t want to stop
exploring during the 14-hour flight without Wi-Fi.&lt;/p&gt;

&lt;p&gt;That’s where I re-discovered the Info documentation system. I used it
to read the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger.el&lt;/code&gt; library between sleep sessions 8000 feet above
the ground. Reading plain text inside Emacs has great benefits:
no distractions and friction-free note-taking. It was a breeze.&lt;/p&gt;

&lt;p&gt;Then I moved on to learning the Info documentation system itself: how
to navigate, search the text and index, and all that. I was able to pick it up
quickly, as the concepts and shortcuts feel natural to me as an experienced
Emacs user.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Why Info is not Popular?&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;why-info-is-not-popular&quot;&gt;Why Info is not Popular?&lt;/h2&gt;

&lt;p&gt;I envisioned myself using it to read all my documentation,
e.g. that of the Pandas library in Python. That would be ideal, I told
myself. However, I soon realised that the Info documentation system is a
niche tool: it is mostly used in GNU projects and Emacs libraries.&lt;/p&gt;

&lt;p&gt;Why is it not popular? I wondered, and decided to have a go
myself. Well, just getting started is already full of hiccups. This is
a typical theme in learning a legacy system, and it could put many people
off.&lt;/p&gt;

&lt;p&gt;So here are a few notes that helped me learn Info. Hopefully they can
bring more new users to the Info system.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;=dir= the Index File&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;dir-the-index-file&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; the Index File&lt;/h2&gt;

&lt;p&gt;The first and most important thing I realised is that, in the context of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Info&lt;/code&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; is not a directory, but a plain text file. Once I
started calling it the index file, the rest became so much clearer.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt;/index file is the entry point of the Info program. It
lists the available Info manuals with their names, Info file
locations, and descriptions.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Setup Info in MacOS&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;setup-info-in-macos&quot;&gt;Setup Info in MacOS&lt;/h2&gt;

&lt;p&gt;Then there is a bug in emacs-plus: during the installation of Emacs,
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; file somehow gets deleted in the cleaning process. So the
manuals for the default libraries that come with Emacs are not
available. In my case, I only had a few Info manuals from the packages I
installed afterwards, such as orderless and org-roam.&lt;/p&gt;

&lt;p&gt;I took a slightly different approach to fix this problem: I kept the
system-level tools separate from the Emacs libraries, so I have two
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; files.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
&lt;span class=&quot;c&quot;&gt;# manuals of system level programs&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; /opt/homebrew/share/info
&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;file &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do &lt;/span&gt;install-info &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$file&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# manuals of Emacs and Emacs libraries&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; /opt/homebrew/share/info/emacs
&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;file &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do &lt;/span&gt;install-info &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$file&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
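
&lt;p&gt;Because the glob in the loops above matches every entry in the directory, the dir file itself and any sub-directory are also handed to install-info. A slightly more defensive variant is sketched below. It is only a sketch: the function name rebuild_info_dir and the INSTALL_INFO override are hypothetical additions of mine, and it assumes the manuals end in .info or .info.gz.&lt;/p&gt;

```shell
# Hypothetical helper: register every Info manual found in one directory
# into that directory's dir file, skipping anything that is not a manual.
rebuild_info_dir() {
  infodir="$1"
  # INSTALL_INFO is an assumed override so the loop can be dry-run with echo.
  cmd="${INSTALL_INFO:-install-info}"
  for file in "$infodir"/*.info "$infodir"/*.info.gz; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$file" ] || continue
    "$cmd" "$file" "$infodir/dir"
  done
}

# Example (not run here): rebuild_info_dir /opt/homebrew/share/info
```

&lt;p&gt;Setting INSTALL_INFO=echo before calling the function prints the commands instead of running them, which is a cheap way to check what would be registered.&lt;/p&gt;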

&lt;p&gt;Then tell Emacs the locations of those &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; files as below.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;setq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Info-directory-list&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/opt/homebrew/share/info&quot;&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;&quot;/opt/homebrew/share/info/emacs&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Note that the convention is that each directory in the list contains a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt; file. So in Emacs, we specify the index file by its directory,
and the file just happens to be called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dir&lt;/code&gt;. I feel the naming could be
improved to avoid such confusion!&lt;/p&gt;

&lt;p&gt;After restarting Emacs, Info shows that more than 500 manuals are
available, e.g. for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find&lt;/code&gt; tool, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mu4e&lt;/code&gt; library, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ledger3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Lastly, a quick note on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;install-info&lt;/code&gt;. As shown above, it is used to
install Info manuals. Taking &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger3.info&lt;/code&gt; as an example, installing
it requires:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
install-info ledger3.info /opt/homebrew/share/info/dir&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;After that, the following line is added to the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/opt/homebrew/share/info/dir&lt;/code&gt; file.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt; Ledger3: &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;ledger3&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;.&lt;/span&gt;           Command-Line Accounting&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;A bit of explanation:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;*:&lt;/strong&gt; marks the start of the entry&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Ledger3:&lt;/strong&gt; is the name of the manual as shown in the menu&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;ledger3:&lt;/strong&gt; inside the parentheses is the name of the Info
file without extension.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Command-Line Accounting:&lt;/strong&gt; is the description of the manual.&lt;/li&gt;
&lt;/ul&gt;
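
&lt;p&gt;The three parts of such an entry can also be pulled apart mechanically. Here is a minimal sketch, assuming the entry sits on a single line in exactly the shape shown above; parse_dir_entry is a hypothetical helper name, not part of Info or install-info.&lt;/p&gt;

```shell
# Hypothetical helper: split one dir menu entry into its three parts.
# Expected shape: "* Name: (file).   Description"
parse_dir_entry() {
  printf '%s\n' "$1" |
    sed -E 's/^\* ([^:]+): \(([^)]+)\)\.[[:space:]]*(.*)$/name=\1 file=\2 desc=\3/'
}

parse_dir_entry '* Ledger3: (ledger3).           Command-Line Accounting'
```

&lt;p&gt;For the Ledger3 entry this prints name=Ledger3 file=ledger3 desc=Command-Line Accounting.&lt;/p&gt;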

</content>
 </entry>
 
 <entry>
   <title>Filter Ledger Transactions using Tags</title>
   <link href="http://yitang.uk/2025/04/17/filter-ledger-transactions-using-tags/"/>
   <updated>2025-04-17T00:00:00+01:00</updated>
   <id>http://yitang.uk/2025/04/17/filter-ledger-transactions-using-tags</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;I have been testing &lt;a href=&quot;http://yitang.uk/2025/01/14/use-ledgercli-to-track-diy-project-expenses/&quot;&gt;Ledger-Cli to track my expenses&lt;/a&gt;, and so far I
have found the tagging system useful. In my ledger journal, each
transaction is associated with a project. For example, the
transaction below is assigned to the project “2024 Monitor Stand”:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2024-12-08 Screwfix
    ; project: 2024 Monitor Stand
    Expenses:HomeImprovement:Tools            £ 4.99 
    Expenses:HomeImprovement:PPE             £ 19.98 
    Expenses:HomeImprovement:PPE             £ 14.99 
    ; :refund:
    Assets:Amex
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This constraint I came up with helps avoid meaningless spending on new
shiny tools. Operationally, imposing this limitation on my books
provides flexible ways of querying the data.&lt;/p&gt;

&lt;p&gt;For example, to bring up the transactions that do not have a project
assigned:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
ledger reg exp and &lt;span class=&quot;s2&quot;&gt;&quot;expr&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;not has_meta(&apos;project&apos;)&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
       &lt;span class=&quot;nt&quot;&gt;--format&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;| %(date) | %P | %(amount) | %(note) |&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 1:&lt;/span&gt; posts without project&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Date&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Payee&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Amount&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Note&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;2024/12/22&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;Selco&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;£ 30.570&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;; CaberFloor p5 T&amp;amp;G 2400x600x18mm x 2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;There is only one posting where I forgot to add the project tag, so that
is pretty good.&lt;/p&gt;

&lt;p&gt;A bit of explanation of the ledger-cli query syntax:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;exp:&lt;/strong&gt; match only accounts whose names contain ‘exp’; in my
journal, that is all the expense accounts, i.e. Expenses:*&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;expr:&lt;/strong&gt; invoke filters using expressions&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;has_meta(‘project’):&lt;/strong&gt; check if the transactions have the metadata
key ‘project’&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;and, not:&lt;/strong&gt; logical operators&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;--format:&lt;/strong&gt; specify the output formatting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another use case is counting the number of transactions per project. I
use the number of purchased items as a proxy to gauge the project
size.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
ledger reg exp and &lt;span class=&quot;s2&quot;&gt;&quot;expr&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;has_meta(&apos;project&apos;)&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
       &lt;span class=&quot;nt&quot;&gt;--format&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;%(meta(&apos;project&apos;))&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;  &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
       | &lt;span class=&quot;nb&quot;&gt;sort&lt;/span&gt; |  &lt;span class=&quot;nb&quot;&gt;uniq&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;sort&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-bgr&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 2:&lt;/span&gt; Number of items purchased for each project&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;No. Items&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Project&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;39&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;2024 Loft Lights&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;34&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;2024 Loft Insulation&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;32&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;2025 Garage Conversion&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;8&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;2024 Monitor Stand&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;General&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
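
&lt;p&gt;The counting part of that pipeline is plain shell and can be tried without a journal. In the small sketch below, the sample project names fed through printf stand in for ledger’s output, and count_by_project is a hypothetical wrapper name.&lt;/p&gt;

```shell
# Hypothetical wrapper around the counting stage of the pipeline:
# sort groups identical lines, uniq -c counts each group,
# and the final sort puts the biggest count first.
count_by_project() {
  sort | uniq -c | sort -bgr
}

# Stand-in for ledger output: one project name per posting.
printf '%s\n' 'Loft Lights' 'Monitor Stand' 'Loft Lights' | count_by_project
```

&lt;p&gt;The first line of the output is the project with the most postings, here Loft Lights with a count of 2. The exact padding in front of the count depends on the uniq implementation.&lt;/p&gt;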

&lt;p&gt;The data shows the “2024 Loft Lights” project is the largest, with 39
items. That was a simple project by itself; however, since that was my
first electrical project, I had to purchase a lot of stuff: 1.5mm
cables, clamps, grommets, connectors, switches, sockets etc.&lt;/p&gt;

&lt;p&gt;Finally, I have the “refund” tag so I can flag items and remind
myself to check whether I have received the refund in full.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
ledger reg &lt;span class=&quot;s2&quot;&gt;&quot;expr&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;has_tag(&apos;refund&apos;)&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
        &lt;span class=&quot;nt&quot;&gt;--format&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;| %(date) | %P | %(amount) | %(note) |&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Date&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Payee&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Amount&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Note&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;2024/12/08&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;Screwfix&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;£ 14.990&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;Site Optimus Gel Knee Pads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;So far I have enjoyed plain text accounting using &lt;a href=&quot;https://ledger-cli.org&quot;&gt;ledger-cli&lt;/a&gt;. The
format and syntax are simple, and yet I can do complicated queries.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Setup ssh-agent Systemd Service for Emacs</title>
   <link href="http://yitang.uk/2025/01/26/setup-sshagent-systemd-service-for-emacs/"/>
   <updated>2025-01-26T00:00:00+00:00</updated>
   <id>http://yitang.uk/2025/01/26/setup-sshagent-systemd-service-for-emacs</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;problem-statement&quot;&gt;Problem Statement&lt;/h2&gt;

&lt;p&gt;My personal desktop is not booting (the motherboard is probably dead)
so I have been setting up my server so I can work while sorting things
out.&lt;/p&gt;

&lt;p&gt;I got stuck getting &lt;a href=&quot;https://magit.vc/&quot;&gt;magit&lt;/a&gt; working in emacsclient: I thought that
running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ssh-add&lt;/code&gt; inside Emacs would allow &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;magit&lt;/code&gt; to access my
git repos over ssh, but apparently that is not the case.&lt;/p&gt;

&lt;p&gt;After some digging, I learnt that the problem I have to solve is to
run one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ssh-agent&lt;/code&gt; in the background and then make Emacs/Magit and
any other program hook onto it. Then once I run ssh-add and type the
passphrase for the first time, either inside Emacs or in a bash
terminal, everything will work.&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;

&lt;p&gt;Drop the following unit file into
~/.config/systemd/user/ssh-agent.service.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[Unit]
Description=SSH key agent

[Service]
Type=simple
Environment=SSH_AUTH_SOCK=%t/ssh-agent.socket
ExecStart=/usr/bin/ssh-agent -D -a $SSH_AUTH_SOCK

[Install]
WantedBy=default.target
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The important things are&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The environment variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SSH_AUTH_SOCK&lt;/code&gt; is specified. It can be
anywhere as long as this environment variable in other programs
points to the same location.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ssh-agent&lt;/code&gt; is invoked with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-a&lt;/code&gt; option to provide an address
specified in the above step.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%t&lt;/code&gt; is a specifier&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; in systemd; for a user service it is equivalent to the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$XDG_RUNTIME_DIR&lt;/code&gt; variable in Debian. It points to the runtime
temporary directory, which apparently is safer&lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; than the /tmp
directory. The socket in the runtime directory is cleaned up after the
ssh-agent stops, so it is non-persistent.&lt;/p&gt;

&lt;p&gt;To start the ssh-agent service:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
systemctl &lt;span class=&quot;nb&quot;&gt;enable&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--user&lt;/span&gt; ssh-agent
systemctl start &lt;span class=&quot;nt&quot;&gt;--user&lt;/span&gt; ssh-agent&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;After that, update the unit file of Emacs to include this line (see
my blog post &lt;a href=&quot;http://yitang.uk/2021/06/18/managing-emacs-server-as-systemd-service/&quot;&gt;Managing Emacs Server as Systemd Service&lt;/a&gt; for the full
setup).&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Environment=SSH_AUTH_SOCK=%t/ssh-agent.socket
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To make it work for the bash shell and all programs launched from a
bash terminal, add this line to ~/.bashrc.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;SSH_AUTH_SOCK&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$XDG_RUNTIME_DIR&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;/ssh-agent.socket&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
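
&lt;p&gt;To confirm that a shell and the service agree on the socket, a quick check along these lines can help. It is a sketch under assumptions: agent_socket_path and check_agent are hypothetical helper names, and the fallback to /run/user/UID assumes the usual systemd layout.&lt;/p&gt;

```shell
# Hypothetical helper: the socket path used by the unit file above.
agent_socket_path() {
  printf '%s/ssh-agent.socket\n' "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
}

# Hypothetical helper: report whether the agent socket exists.
check_agent() {
  sock="$(agent_socket_path)"
  if [ -S "$sock" ]; then
    echo "agent socket found at $sock"
  else
    echo "no agent socket at $sock; is ssh-agent.service running?"
    return 1
  fi
}
```

&lt;p&gt;Running check_agent in a new terminal, followed by ssh-add -l, shows whether the socket is there and which keys are loaded.&lt;/p&gt;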

&lt;h2 id=&quot;alternatives&quot;&gt;Alternatives&lt;/h2&gt;

&lt;p&gt;There are programs developed to solve this specific problem (see &lt;a href=&quot;https://wiki.debian.org/SSH#keychain&quot;&gt;Debian wiki&lt;/a&gt;). While using such a program seems like a simpler
alternative (e.g. &lt;a href=&quot;https://www.funtoo.org/Funtoo:Keychain&quot;&gt;keychain&lt;/a&gt;), I prefer to use systemd as the unified
approach for managing background services. I have been using it for
emacsclient, and I’m adding ssh-agent to it.&lt;/p&gt;

&lt;p&gt;What is your preference? How do you solve this problem?&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; All the specifiers are listed &lt;a href=&quot;https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html#Specifiers&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; I am not a security expert but &lt;a href=&quot;https://unix.stackexchange.com/questions/316161/whats-the-difference-between-tmp-and-run&quot;&gt;the StackExchange comments&lt;/a&gt; seem to make sense.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Retiring Raspberry Pi 4 as Home Server and NAS</title>
   <link href="http://yitang.uk/2025/01/20/retiring-raspberry-pi-4-as-home-server-and-nas/"/>
   <updated>2025-01-20T00:00:00+00:00</updated>
   <id>http://yitang.uk/2025/01/20/retiring-raspberry-pi-4-as-home-server-and-nas</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Good Start for Self-Hosting&quot;&gt;Good Start for Self-Hosting&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Lack of NAS Capacity&quot;&gt;Lack of NAS Capacity&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Looking for a Successor&quot;&gt;Looking for a Successor&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Unexpected&quot;&gt;Unexpected&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Setting up z170a&quot;&gt;Setting up z170a&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Power Consumption&quot;&gt;Power Consumption&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Good Start for Self-Hosting&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;good-start-for-self-hosting&quot;&gt;Good Start for Self-Hosting&lt;/h2&gt;

&lt;p&gt;The little Raspberry Pi 4 (RP4) has served me well over the last two
years. I used it to host NextCloud/Syncthing for syncing files between
devices, to scrape financial data from Yahoo Finance, and to run &lt;a href=&quot;http://yitang.uk/2022/03/11/wireless-backup-solution-using-raspberry-pi-for-macos/&quot;&gt;TimeMachine
for MacOS backup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The latest addition to the service stack is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;paperless-ngx&lt;/code&gt;. It allows
my Canon printer/scanner to send digital copies of documents directly
to the RP4 or Gmail.&lt;/p&gt;

&lt;p&gt;The RP4 handles all the demands without showing any signs of struggle.
It draws as little as 6W, while the Xbox One S draws 11W
while sleeping. Thanks to the energy crisis in the UK, I started to
appreciate the energy efficiency of the RP4. The ARM chip in it really
impressed me.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Lack of NAS Capacity&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;lack-of-nas-capacity&quot;&gt;Lack of NAS Capacity&lt;/h2&gt;

&lt;p&gt;A 3TB portable hard drive (WD My Passport) was attached to the RP4 to
store media data. The USB 3.0 connection is surprisingly stable and
fast. With both ends connected by ethernet cables, the file transfer
speed can reach up to 100 MB/s. When my MacBook Pro uses Wi-Fi, the
speed drops to about 40-50 MB/s, but it is still great because of the
convenience.&lt;/p&gt;

&lt;p&gt;Later I started using it as a NAS to store the Final Cut Pro
library. The 4k home gym videos I shot using iPhone 12 Pro are
numerous &lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;!  The hard drive keeps getting filled up.&lt;/p&gt;

&lt;p&gt;I could get another portable hard drive, but then it would get filled up
again, say in less than a month? So it occurred to me that I need a
proper home server with full NAS capacity.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Looking for a Successor&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;looking-for-a-successor&quot;&gt;Looking for a Successor&lt;/h2&gt;

&lt;p&gt;I did a bit of research but was not able to find a good product. I
suspect the reason is that NAS builds are a niche area while the PC
industry is gaming-centric, focusing on faster, bigger, and
fancier hardware with unnecessary RGB lights. That is where the
profits are, I presume.&lt;/p&gt;

&lt;p&gt;I came across some innovative products on AliExpress from China, such
as the &lt;a href=&quot;https://www.aliexpress.us/item/1005005347605550.html&quot;&gt;TopTon N5105 board&lt;/a&gt;. It is more powerful, consumes
slightly more electricity, and it has 6 SATA ports! It would be a
perfect successor for my RP4.&lt;/p&gt;

&lt;p&gt;But I am not comfortable ordering electronics from
AliExpress: returning an item or sending it back for repair would be a nightmare.&lt;/p&gt;

&lt;p&gt;PS: The company is growing fast. It has continued to innovate, and the product
line has extended to the &lt;a href=&quot;https://www.aliexpress.us/item/1005006313023975.html&quot;&gt;Intel N100&lt;/a&gt; with an additional NVMe drive and a USB-C port. Their website
and marketing materials have stepped up quite a bit. I kind of regret not
taking the risk back then.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Unexpected&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;unexpected&quot;&gt;Unexpected&lt;/h2&gt;

&lt;p&gt;The other day, I was re-organising (again) my home office, so I had to
move a bookshelf. I started moving it without taking everything off,
and a motherboard fell off. It was the z170a with an i5-6600k and a
heat sink attached to it. The motherboard was in my first desktop, which
I purchased 10 years ago when I started participating in Kaggle competitions
in 2014.&lt;/p&gt;

&lt;p&gt;After a quick inspection, I saw some pins were bent.  I felt ashamed and
sorry for the motherboard that I had not taken care of it.  So I made a
promise: if it survived the fall, I would use it for my NAS.&lt;/p&gt;

&lt;p&gt;Well, it did, so I found my NAS.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Setting up z170a&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;setting-up-z170a&quot;&gt;Setting up z170a&lt;/h2&gt;

&lt;p&gt;While putting it together, one SATA port snapped and came off, but the
rest are still fine. Apart from that, everything else went smoothly.
Debian 12 has become much easier to install with the &lt;a href=&quot;https://www.debian.org/releases/stable/amd64/ch04s03.en.html&quot;&gt;isohybrid technology&lt;/a&gt;,
and the non-free firmware is now part of the installation image
itself.&lt;/p&gt;

&lt;p&gt;The server setup scripts and configuration are saved in a
selfhosted-services git repository, so restoring the services took
little effort.&lt;/p&gt;

&lt;p&gt;I had one little trick: I assigned the IP address of the RP4 to the new
z170a server so that on the client side I didn’t have to change
anything. This was achieved rather easily: a few clicks in the &lt;a href=&quot;https://www.asus.com/uk/support/faq/1000906/&quot;&gt;ASUS
router web UI&lt;/a&gt; and then a reboot.&lt;/p&gt;

&lt;p&gt;While setting it up, I noticed the z170a system is much more
responsive, thanks to the 3.5 GHz i5-6600k CPU and an SSD that is much
faster than the SD card. I was able to run multiple processes at the same
time.&lt;/p&gt;

&lt;p&gt;The longest part was copying files from the 3TB portable hard drive to the
z170a’s internal HDD, which took about 20 hours.&lt;/p&gt;

&lt;p&gt;It has great extensibility: there are 3 free SATA ports for HDDs and two
PCIe slots.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Power Consumption&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;power-consumption&quot;&gt;Power Consumption&lt;/h2&gt;

&lt;p&gt;The only downside is that it consumes a lot more electricity. When tested
barebone, it drew only 10W. After putting everything together with
additional HDDs, fans, and an ethernet cable, the power meter reading jumped to
45W. I removed the hard drives one by one to see where the power was going.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;No HDD: 27W.&lt;/li&gt;
  &lt;li&gt;IronWolf alone: 32W, a 5W increase.&lt;/li&gt;
  &lt;li&gt;IronWolf + Seagate: 37W, another 5W increase.&lt;/li&gt;
  &lt;li&gt;IronWolf + Seagate + Toshiba: 45W, another 8W increase.&lt;/li&gt;
&lt;/ul&gt;
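
&lt;p&gt;The arithmetic behind these readings can be sketched in a few lines of
Python. This is only an illustration: it assumes the server draws the same
power around the clock, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readings&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;annual_kwh&lt;/code&gt; are made up for this example.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;
```python
# Power meter readings in watts, copied from the list above.
readings = [
    ("No HDD", 27),
    ("IronWolf", 32),
    ("IronWolf + Seagate", 37),
    ("IronWolf + Seagate + Toshiba", 45),
]

HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy used in one year at a constant draw, in kWh (assumes 24/7 uptime)."""
    return watts * HOURS_PER_YEAR / 1000


# Extra watts added by each configuration relative to the previous one.
increments = [after[1] - before[1] for before, after in zip(readings, readings[1:])]

for label, watts in readings:
    print(f"{label}: {watts} W, about {annual_kwh(watts):.0f} kWh per year")
print("Increase per added drive (W):", increments)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Under that always-on assumption, the gap between 27W and 45W works out to
roughly 158 kWh a year.&lt;/p&gt;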

&lt;p&gt;So I kept only the IronWolf, which is a 3TB NAS-grade HDD.&lt;/p&gt;

&lt;p&gt;I also tried tweaking the BIOS and Linux kernel to get to &lt;a href=&quot;https://edc.intel.com/content/www/br/pt/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/001/package-c-states/&quot;&gt;C-states&lt;/a&gt; but
I felt it was over-engineering so I am happily settled down with 27W.&lt;/p&gt;

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; I record weightlifting to correct and improve my
techniques.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Use Ledger-Cli to Track DIY Project Expenses</title>
   <link href="http://yitang.uk/2025/01/14/use-ledgercli-to-track-diy-project-expenses/"/>
   <updated>2025-01-14T00:00:00+00:00</updated>
   <id>http://yitang.uk/2025/01/14/use-ledgercli-to-track-diy-project-expenses</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Personal Technical Challenge&quot;&gt;Personal Technical Challenge&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Baby Steps&quot;&gt;Baby Steps&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Why? - Effort Estimation&quot;&gt;Why? - Effort Estimation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Personal Technical Challenge&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;personal-technical-challenge&quot;&gt;Personal Technical Challenge&lt;/h2&gt;

&lt;p&gt;I used &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt;&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; before and it was a painful experience. The
problem was not rooted in the tool but in how I intended to use it: I
wanted to track all my expenses, from buying a cup of coffee to
booking a holiday package. When I started this journey, there was a
massive jump from knowing little to nothing about personal finance to
doing double-entry accounting in plain text.&lt;/p&gt;

&lt;p&gt;Though I gave up, it introduced me to the idea of owning my bank
transaction data in text files on my personal computer. So over the
years, I manually curated about 8 years of historical transaction
data.&lt;/p&gt;

&lt;p&gt;If you haven’t done so, I strongly recommend you go to your banks’
website and download the transaction data manually, going as far back as
you can. You will notice that the banks only give access to 3-5 years
of data&lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. It’s a shame that banks use outdated technologies but
it is better than having nothing.&lt;/p&gt;

&lt;p&gt;Since I had the data, I did some analysis and charts in Python/R. But
I kept wondering what &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt; can offer. I occasionally saw blog
posts on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt; in the Emacs communities, so there must be
something out there.&lt;/p&gt;

&lt;p&gt;It has also become a personal challenge. I decided not to give up, but to
put it aside and tackle it again when I was older.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Baby Steps&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;baby-steps&quot;&gt;Baby Steps&lt;/h2&gt;

&lt;p&gt;Hopefully, I have become smarter as well. This time, to ensure I can
successfully adopt the tool, I am going to limit the scope
to tracking DIY project expenses only.&lt;/p&gt;

&lt;p&gt;I love DIY and I wish I had more days for DIY projects. It is usually
labour-intensive, and I feel hyped and extremely confident after a couple
of DIY projects. Pairing it with learning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt;, a cognitively intensive
activity, would make them a nice bundle&lt;sup&gt;&lt;a id=&quot;fnr.3&quot; class=&quot;footref&quot; href=&quot;#fn.3&quot; role=&quot;doc-backlink&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Though the usage is simple, the question it can answer is important. I
want to know, during or after the DIY project, exactly how much it
costs. I could use a much simpler tool, like a spreadsheet or a
pen and notebook, but I want it to be a stepping stone to learning
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt; properly in the future.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Why? - Effort Estimation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;why---effort-estimation&quot;&gt;Why? - Effort Estimation&lt;/h2&gt;

&lt;p&gt;I need an accurate answer to the actual costs so that I can use the
data to train myself in cost estimation. This is a very important
skill to have as a homeowner: it would put me in a much better
position when negotiating with tradesmen. A lot of people
in the UK complain that they or their relatives got ripped off by
tradesmen.&lt;sup&gt;&lt;a id=&quot;fnr.4&quot; class=&quot;footref&quot; href=&quot;#fn.4&quot; role=&quot;doc-backlink&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;In general, house repairs and improvements are getting much more
expensive every year, due to the shortage of labourers, inflation and
Brexit etc. To give an example using my last two quotes, adding an
electrical socket costs £240 and replacing a small section of water
pipes costs £500.&lt;/p&gt;

&lt;p&gt;I have a good habit of using org-mode to track time, and my goal is to add
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ledger-cli&lt;/code&gt; to my system to track the expenses. After that, I would
know whether it is really worth doing the DIY myself or finding a proper
tradesman. The total cost is not the only metric that matters,
but it is an essential one to have.&lt;/p&gt;

&lt;h2 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h2&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://ledger-cli.org/&quot;&gt;https://ledger-cli.org/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://money.stackexchange.com/questions/37480/why-dont-banks-give-access-to-all-your-transaction-activity&quot;&gt;Why don’t banks give access to all your transaction activity?&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.3&quot; href=&quot;#fnr.3&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; I might have picked this up from Atomic Habits.&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.4&quot; href=&quot;#fnr.4&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://www.reddit.com/r/DIYUK/comments/1g8ljja/how_many_of_you_have_been_ripped_off_by_builders/&quot;&gt;How many of you have been ripped off by builders / tradesmen? (or know someone closely that has)&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Finding Highly Correlated Features</title>
   <link href="http://yitang.uk/2024/10/20/finding-highly-correlated-features/"/>
   <updated>2024-10-20T00:00:00+01:00</updated>
   <id>http://yitang.uk/2024/10/20/finding-highly-correlated-features</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Motivation&quot;&gt;Motivation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Implementation&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Parameterisation&quot;&gt;Parameterisation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Motivation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;/h2&gt;

&lt;p&gt;From a modelling perspective, it is not a big problem to have highly
correlated features in the dataset. We have regularised Lasso/Ridge
regressions that are designed to deal with this kind of dataset.
Ensemble trees are robust enough to be almost immune to this. Of
course, all models require proper hyperparameter tuning with proper
cross-validation.&lt;/p&gt;

&lt;p&gt;The problem arises in understanding the feature contributions: if
there are 5 features that are highly correlated, their individual
contributions could be tiny, but their true contribution should be
aggregated by adding the contributions together and considering them as
a group, e.g. adding their coefficients in Ridge, or adding their feature
importances in LightGBM.&lt;/p&gt;

&lt;p&gt;If their aggregated feature importance turns out to be small indeed,
I can remove them from the model to have a simpler model. A mistake I
used to make is removing the correlated features based on their
individual feature importance, which leads to less performant models.&lt;/p&gt;

&lt;p&gt;A better and cleaner approach is to clean up the correlated features
to begin with, so that I won’t need to do the feature importance
aggregation. It also speeds up the model development cycle: there
are fewer features to look at, to train the model on, and to verify for data
quality. When the model goes live in production, it translates
to less data to source and maintain.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Implementation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;

&lt;p&gt;So I need to enrich my tool set to identify highly correlated
features. I couldn’t find an existing library that does that, so I
implemented it myself.&lt;/p&gt;

&lt;p&gt;The key steps are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Based on the correlation matrix, create a correlation long
table. Each row stands for the correlation between feature $X_1$
and feature $X_2$. Assuming there are three features in the
dataset, the table looks like this.&lt;/p&gt;

    &lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
    
    
&lt;colgroup&gt;
&lt;col class=&quot;org-right&quot; /&gt;
    
&lt;col class=&quot;org-left&quot; /&gt;
    
&lt;col class=&quot;org-left&quot; /&gt;
    
&lt;col class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Row&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;X1&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;X2&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Corr&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
    
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;1&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;A&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;B&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.99&lt;/td&gt;
&lt;/tr&gt;
    
    
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;A&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;C&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.80&lt;/td&gt;
&lt;/tr&gt;
    
    
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;3&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;B&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;C&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.95&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Remove rows if the correlation is less than the threshold $T$. It
significantly reduces the input to Step 3.&lt;/p&gt;

    &lt;p&gt;If the threshold is 0.9, then the Row 2 will be removed.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Treat the correlation table as a directed graph,&lt;/p&gt;
    &lt;ol&gt;
      &lt;li&gt;
        &lt;p&gt;Let $E$ be the unexplored nodes, filled with all the features
$X_1$ in the start, $R$ is the result.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;For each node in $E$,&lt;/p&gt;
        &lt;ol&gt;
          &lt;li&gt;
            &lt;p&gt;Continue to traverse the graph in depth-first fashion until
there are no connections left, and add the connected node to
the result $R$ at each visit.&lt;/p&gt;
          &lt;/li&gt;
          &lt;li&gt;
            &lt;p&gt;Remove the connected nodes in $R$ from the remaining nodes to
explore in $E$.&lt;/p&gt;
          &lt;/li&gt;
        &lt;/ol&gt;
      &lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
&lt;/ol&gt;
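
&lt;p&gt;Steps 1 and 2 can be sketched with pandas as follows. This is a minimal
sketch rather than the exact code used for this post: the function name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;correlation_long_table&lt;/code&gt; is hypothetical, and taking the absolute value of
the correlation (so that strongly negative pairs count as well) is an
assumption.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;
```python
import numpy as np
import pandas as pd


def correlation_long_table(df, threshold, method="pearson"):
    """Return a DataFrame of feature pairs, multi-indexed by (X1, X2)."""
    # Assumption: use the absolute correlation so negative pairs are kept too.
    corr = df.corr(method=method).abs()
    # Step 1: keep the strict upper triangle so each pair appears only once.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    long = corr.where(upper).stack().rename_axis(["X1", "X2"])
    # Step 2: drop the pairs below the threshold (this also drops any NaN).
    return long[long >= threshold].to_frame("Corr")


# Toy data: B is a multiple of A, while C is only weakly related to both.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [5, 1, 4, 2, 3],
})
ds = correlation_long_table(df, threshold=0.9)
print(ds)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The resulting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ds&lt;/code&gt; is multi-indexed by $X_1$ and $X_2$, which is the
shape that Step 3 works on. In this toy example only the pair (A, B)
survives the threshold.&lt;/p&gt;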

&lt;p&gt;The vanilla Python code corresponding to Step 3 is listed below. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ds&lt;/code&gt; object is a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pandas.DataFrame&lt;/code&gt;, multi-indexed by $X_1$ and $X_2$,
so &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ds.loc[&apos;A&apos;].index&lt;/code&gt; gives all the connected features from $A$ whose
correlation with $A$ is larger than the provided threshold.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt; 
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;collections&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;defaultdict&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pandas&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;find_neighbors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;recursively find the nodes connected with root in graph ds.
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tolist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;find_neighbors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;find_correlated_groups&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    The ds object is a pandas.DataFrame, multi-indexed by X1 and X2.
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;defaultdict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# continue until all nodes are visited.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;cols&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_level_values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tolist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cols&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# always start from the root as ds is directed graph.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;col&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cols&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;find_neighbors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;col&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# remove connected nodes from the remaining.
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;col&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cols&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;cols&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The result is a collection of mutually exclusive groups. Each group
contains a set of highly correlated features, for example&lt;/p&gt;

&lt;center&gt;Group A: {A, B, C}&lt;/center&gt;
&lt;center&gt;Group D: {D, K, Z}&lt;/center&gt;

&lt;p&gt;The next step is to decide which feature to keep and remove the rest
within each group. The deciding factors can be data availability
(e.g. choose the one feature with less missingness), costs in data
sourcing (e.g. free to download from the internet) or familiarity
(e.g. the feature is well understood by people) etc.&lt;/p&gt;
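
&lt;p&gt;As an illustration of the first deciding factor, the sketch below keeps
the feature with the fewest missing values in each group and returns the
rest as candidates to drop. The function name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;features_to_drop&lt;/code&gt; and the alphabetical tie-break are assumptions made
for this example. It is not part of the original implementation.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;
```python
import pandas as pd


def features_to_drop(df, groups):
    """Keep the least-missing feature of each group; return the others."""
    missing = df.isna().sum()
    to_drop = []
    for members in groups.values():
        # Sort by missing count, then by name so that ties are deterministic.
        ordered = sorted(members, key=lambda name: (missing[name], name))
        to_drop.extend(ordered[1:])
    return to_drop


nan = float("nan")
df = pd.DataFrame({
    "A": [1.0, 2.0, nan, 4.0],   # one missing value
    "B": [1.0, 2.0, 3.0, 4.0],   # complete
    "C": [nan, nan, 3.0, 4.0],   # two missing values
})
groups = {"A": {"A", "B", "C"}}  # same shape as the result of Step 3
dropped = features_to_drop(df, groups)
print(dropped)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here B is kept because it has no missing values, so A and C are the ones
flagged for removal. Other criteria, such as sourcing cost, would need their
own scoring.&lt;/p&gt;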

&lt;p&gt;&lt;a id=&quot;Parameterisation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;parameterisation&quot;&gt;Parameterisation&lt;/h2&gt;

&lt;p&gt;There are two hyperparameters:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;The correlation type:&lt;/strong&gt; It can be Pearson for numerical data and Spearman
for ordinal/categorical data. For a large dataset, it would take
some time to calculate the correlation matrix.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;The correlation threshold $T$:&lt;/strong&gt; The higher the threshold, the
fewer features are removed, so it is less effective. However, if
the threshold is set too low, it leads to a high false positive
rate, e.g. two features can be correlated, but they can still
complement each other in the model.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I would test a range of values from 0.9 to 1, and review the
results. The graph below shows the number of features to remove
with varying thresholds.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When $T=0.9$, there are about 95 groups, and in total 153 features to remove.&lt;/li&gt;
  &lt;li&gt;When $T=1$, there are 29 groups, and in total 37 features to
remove.&lt;/li&gt;
&lt;/ul&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;/assets/num-features-to-remove.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;450&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Proper end-to-end test runs are required to identify the best
hyperparameters. As a quick rule of thumb, those 37 duplicated
features identified with $T=1$ can be dropped without further testing.&lt;/p&gt;
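
&lt;p&gt;The threshold sweep described above could look roughly like the sketch
below. Note the assumptions: it groups features as undirected connected
components, which is simpler than the directed traversal listed earlier and
may group features slightly differently, it uses the absolute Pearson
correlation, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;connected_groups&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sweep_thresholds&lt;/code&gt; are hypothetical.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;
```python
from collections import defaultdict

import pandas as pd


def connected_groups(pairs):
    """Group features linked by any chain of pairs (undirected components)."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


def sweep_thresholds(df, thresholds):
    """Count groups and removable features (group size minus one) per threshold."""
    corr = df.corr().abs()
    cols = list(corr.columns)
    rows = []
    for t in thresholds:
        pairs = [(a, b) for i, a in enumerate(cols) for b in cols[i + 1:]
                 if corr.loc[a, b] >= t]
        groups = connected_groups(pairs)
        rows.append({"T": t, "groups": len(groups),
                     "to_remove": sum(len(g) - 1 for g in groups)})
    return pd.DataFrame(rows)


# Toy data: A and B are perfectly correlated, C correlates 0.9 with both.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [1, 2, 3, 5, 4],
    "D": [5, 1, 4, 2, 3],
})
summary = sweep_thresholds(df, [0.8, 0.95])
print(summary)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;On the toy data, lowering the threshold from 0.95 to 0.8 pulls C into the
group with A and B, so the number of features to remove goes from one to
two.&lt;/p&gt;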

&lt;p&gt;The group sizes with varying thresholds $T$ provide an interesting
insight into the data. The 75th percentile of the group sizes is plotted,
which suggests that apart from the 37 duplicated features, there are a
large number of paired features (i.e. group size is 2) whose
correlation is large, more than 92%.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;/assets/group-size-quantile.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;450&quot; align=&quot;center&quot; /&gt;
&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Less Excel, More R/Python in Emacs</title>
   <link href="http://yitang.uk/2024/04/10/2024-04-10-less-excel-more-rpython-in-emacs/"/>
   <updated>2024-04-10T00:00:00+01:00</updated>
   <id>http://yitang.uk/2024/04/10/2024-04-10-less-excel-more-rpython-in-emacs</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Excel Is Great&quot;&gt;Excel Is Great&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#But&quot;&gt;But&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Emacs Has More To Offer&quot;&gt;Emacs Has More To Offer&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Excel Is Great&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;excel-is-great&quot;&gt;Excel Is Great&lt;/h2&gt;

&lt;p&gt;Regardless of how powerful and convenient the R/Python data ecosystem
becomes, there is still value in looking at the data in Excel,
especially when exploring the data together with less technical
people.&lt;/p&gt;

&lt;p&gt;Thanks to its simple interface, Excel is widely used in data
analysis: hover the mouse to select columns, apply filters, then
calculate some statistics. Most of the time that is all it takes to get
the answers the clients are seeking.&lt;/p&gt;

&lt;p&gt;I recently realised that being transparent and working with the
tools that clients use plays a crucial role in strengthening trust
and delivering impact to the business. Sometimes I think I should
do more in Excel.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;But&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;but&quot;&gt;But&lt;/h2&gt;

&lt;p&gt;The problem with Excel is reproducibility - I’m not able to codify the
clicks done in Excel and integrate them into the automated data
pipeline. It is rather foolish to have quality control procedures,
including code reviews, automated testing, CI etc. in the system but at
the very end drop all those gatekeepers and go for error-prone manual steps.&lt;/p&gt;

&lt;p&gt;Plus it is far more efficient to have everything done in one place,
which gives a smooth process with no friction. It is a key factor in
enabling quick turnaround.&lt;/p&gt;

&lt;p&gt;So I had the motive to limit my usage of Excel to delivering data to the
business and pair data analysis. Again I have been looking into how
much can be done without leaving Emacs.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Emacs Has More To Offer&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-has-more-to-offer&quot;&gt;Emacs Has More To Offer&lt;/h2&gt;

&lt;p&gt;I was pleased to discover the &lt;a href=&quot;https://github.com/ShuguangSun/ess-view-data&quot;&gt;ess-view-data&lt;/a&gt; package and its Python
counterpart &lt;a href=&quot;https://github.com/ShuguangSun/python-view-data&quot;&gt;python-view-data&lt;/a&gt;. They interact with an active R/Python
session in Emacs and print out data.frame objects in plain text,
a.k.a. view data. What’s more, they can process the data before viewing,
for example, subset the data row/column-wise, summarise the dataset
etc.&lt;/p&gt;

&lt;p&gt;The package keeps a record of the data processing pipeline so in the
end I would have a copy of the R/Python code that generates the
output. I can then effortlessly transfer the code to a script to
ensure reproducibility in the future.&lt;/p&gt;

&lt;p&gt;Another benefit derives from having a plain text buffer for the data.
It is handy for exploring large datasets with an excessive number of
columns. For example, the dataset I work on daily has about 300
columns. It contains different flavours of the financials: the raw
values, imputed, ranked, smoothed etc.&lt;/p&gt;

&lt;p&gt;It’s not possible to remember all the column names, or to be sure the
correct columns are referred to, even after spending extra time on
giving them meaningful names. Having a persistent plain text buffer that I
can search makes finding the right column names a lot easier. It
also helps to check what’s in and not in the data.&lt;/p&gt;

&lt;p&gt;That’s my first impression of Shuguang Sun’s packages; they look
promising.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Blog in Emacs - Use Jekyll's Draft Mode</title>
   <link href="http://yitang.uk/2024/02/12/blog-in-emacs-use-jekylls-draft-mode/"/>
   <updated>2024-02-12T00:00:00+00:00</updated>
   <id>http://yitang.uk/2024/02/12/blog-in-emacs--use-jekylls-draft-mode</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h1 id=&quot;why&quot;&gt;Why?&lt;/h1&gt;

&lt;p&gt;I wasn’t aware of Jekyll’s &lt;a href=&quot;https://jekyllrb.com/docs/posts/#drafts&quot;&gt;draft mode&lt;/a&gt;. My workaround was manually
changing the &lt;a href=&quot;https://jekyllrb.com/docs/front-matter/#predefined-global-variables&quot;&gt;published field&lt;/a&gt; in the front matter to true when the post
is ready to publish. It works fine. However, with native support from
Jekyll, there are more benefits to using the draft mode.&lt;/p&gt;

&lt;p&gt;To start with, I like the drafts being saved in the _drafts folder, not
mixed with the published posts in the _posts folder. It is much
cleaner and easier to manage. At a glance, I can see which
posts I am drafting.&lt;/p&gt;

&lt;p&gt;It also gives peace of mind: only posts under the _posts folder are
exported and shown in my blog. It ensures I don’t accidentally publish
a post in draft.&lt;/p&gt;

&lt;p&gt;Once there are files in the _drafts folder, adding the --drafts argument to
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jekyll serve&lt;/code&gt; command is all I need to be able to see the drafts
locally.&lt;/p&gt;

&lt;p&gt;Of course, I also need to write a bit of Lisp code to integrate the
draft mode into my blogging workflow. That is what the remainder of this
post is about.&lt;/p&gt;

&lt;h1 id=&quot;implementation&quot;&gt;Implementation&lt;/h1&gt;

&lt;p&gt;For a blog post, I have the source file in org-mode and its exported
file in Markdown. Now there is a new location dimension: they can be
either in the _drafts or _posts folder.&lt;/p&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;mode&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;source file (org mode)&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;exported md in Jekyll&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;draft&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;org/_drafts/on_image.org&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;jekyll/_drafts/on_image.md&lt;/td&gt;
&lt;/tr&gt;


&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;publish&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;org/_posts/2027_02_08_on_image.org&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;jekyll/_posts/2027_02_08_on_image.md&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;In terms of content, the published post and its final draft, as well as their
exported counterparts, are the same, only in different locations. Their
content could differ to allow some flexibility, e.g. the published post
could have higher-resolution screenshots. This feature could be
implemented in the future. For now, I follow the simple “same but in
different places” rule.&lt;/p&gt;

&lt;p&gt;The new process looks like this: when I publish a post, it moves the
org file from the _drafts to the _posts folder, adds a date to the filename
(which I have already), and then triggers the exporting process. To
avoid duplication, it removes the original org file and its exported
draft in Markdown.&lt;/p&gt;

&lt;p&gt;To achieve that, the main missing piece from &lt;a href=&quot;https://github.com/yitang/.emacs.d/blob/main/config.org#jekyll-in-emacs&quot;&gt;my current Emacs
configuration&lt;/a&gt; is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/jekyll-find-export&lt;/code&gt; function (see
below). For a post in _drafts or _posts, it finds the full path of the
corresponding exported markdown file. I can then delete it or start
the exporting process.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-is-draft-p&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;if the file is inside of the draft directory, it is a draft.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;draft-dir&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-truename&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-source-drafts-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;filepath&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-truename&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-file-name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;string-prefix-p&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;draft-dir&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;filepath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;


&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-find-export&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;find the full path to the exported file of the current post.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;src-file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-nondirectory&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-file-name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-with-extension&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;src-file&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;.md&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;yt/jekyll-is-draft-p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-concat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-site-draft-dir&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-concat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-site-post-dir&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
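
&lt;p&gt;A minimal sketch of how the publish step described above could be built on
these two helpers. It is an illustration only, not the code from the linked
configuration: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/jekyll-publish-draft&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/jekyll-source-posts-dir&lt;/code&gt; are
hypothetical names, the date prefix format is an assumption, and the export
itself is left as a comment because it depends on the rest of the setup.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt;
;; Hypothetical variable: where the org sources of published posts live.
(defvar my/jekyll-source-posts-dir &quot;~/blog/org/_posts/&quot;
  &quot;Assumed location of the org sources of published posts.&quot;)

(defun my/jekyll-publish-draft ()
  &quot;Move the current draft to the posts folder and delete its stale export.&quot;
  (interactive)
  (unless (yt/jekyll-is-draft-p)
    (user-error &quot;Current file is not a draft&quot;))
  (let* ((old-src (buffer-file-name))
         ;; Look up the exported draft while the buffer is still in _drafts.
         (old-export (yt/jekyll-find-export))
         ;; Assumed date prefix; adjust to the naming scheme in use.
         (new-name (concat (format-time-string &quot;%Y-%m-%d-&quot;)
                           (file-name-nondirectory old-src)))
         (new-src (expand-file-name new-name my/jekyll-source-posts-dir)))
    (save-buffer)
    ;; Move the org source and point the buffer at its new location.
    (rename-file old-src new-src)
    (set-visited-file-name new-src t t)
    ;; Remove the exported draft so the post is not duplicated.
    (when (file-exists-p old-export)
      (delete-file old-export))
    ;; The export of NEW-SRC to Markdown would be triggered here.
    new-src))
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;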

</content>
 </entry>
 
 <entry>
   <title>Blog in Emacs - Work with Images</title>
   <link href="http://yitang.uk/2024/02/03/blog-in-emacs-work-with-images/"/>
   <updated>2024-02-03T00:00:00+00:00</updated>
   <id>http://yitang.uk/2024/02/03/blog-in-emacs--work-with-images</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;I do my best to keep my blog simple: I would not use images/videos
unless I can’t demonstrate well enough in plain text, for example, to
demonstrate a mobile app using a screenshot (&lt;a href=&quot;http://yitang.uk/2024/01/28/learn-in-emacs-building-up-vocabulary/&quot;&gt;Learn in Emacs - Building
Up Vocabulary&lt;/a&gt;) or to show how to represent stock price charts for a neural
network (&lt;a href=&quot;http://yitang.uk/2023/01/05/low-latency-sparse-boolean-array/&quot;&gt;Speed Up Sparse Boolean Data&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Even when I do, I keep the usage to the bare minimum: all I do is
insert the image, centre it, and put a caption on top of it.&lt;/p&gt;

&lt;p&gt;Org-mode supports images well: with a few additional HTML
attributes for each inserted image, I can finely control the images’
position, alignment, size etc.&lt;/p&gt;

&lt;p&gt;However, I can’t get the benefits because I migrated my blog posts
from HTML to the Markdown format for its simplicity. Plus Jekyll comes
with its little quirks when it comes to Markdown images. So I have to
write something for myself.&lt;/p&gt;

&lt;p&gt;I managed to achieve a satisfactory workflow for my simple usage of
images in Emacs. It works well for the Jekyll site. Here’s the code
and explanation.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-download&lt;/code&gt;:&lt;/strong&gt; is the package that I use to create the images
for blogging from various sources.&lt;/p&gt;

    &lt;p&gt;I can drag images from external applications to Emacs, including
browsers, Preview, or iPhoto.  The images will be saved in the
/project/assets/org-download folder per my matrix/project setup.&lt;/p&gt;

    &lt;p&gt;For applications that I can’t drag images from, I take a
screenshot inside Emacs by calling the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-download-screenshot&lt;/code&gt;
function.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/jekyll-copy-org-download-to-assets&lt;/code&gt;:&lt;/strong&gt; is a little helper function
that transfers the files under the org-download folder to the
/assets folder in a Jekyll site.&lt;/p&gt;

    &lt;p&gt;It lists the files in the source org-download folder and provides
them as a selection list. It comes with auto-completion and fuzzy
matches to help me choose the file.&lt;/p&gt;

    &lt;p&gt;It also strips out the special characters in the filename; otherwise
the URL will be broken in Jekyll.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/jekyll-insert-image&lt;/code&gt;:&lt;/strong&gt; lists the files in the /assets folder so
I can choose easily which image to use.&lt;/p&gt;

    &lt;p&gt;It brings up the Liquid template for the image so I don’t have to
remember its syntax. It ensures the file path is in the correct
format (starts with /assets/); I just fill in the caption and size
after selecting the file.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An extract of the code is listed below for demonstration
purposes. Future updates will be reflected in &lt;a href=&quot;https://github.com/yitang/.emacs.d/blob/main/config.org#image-hyperlink&quot;&gt;my .emacs.d git repo&lt;/a&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-insert-image&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;caption&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;read-file-name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;images to include: &quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-assets-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                     &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;read-string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Caption: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-insert-image-liquid-template&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-nondirectory&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;caption&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-copy-org-download-to-assets&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;copy file from project org-download folder to the blog assets folder.
it ensures there&apos;s no underscore(_) in the file name.
&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;read-file-name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;file to copy: &quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-download-image-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ext&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-extension&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;base&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-base&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;dest-base&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;jekyll-make-slug&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;base&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;expand-file-name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-with-extension&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dest-base&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jekyll-site-assets-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;copy-file&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;dest-file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
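
&lt;p&gt;The extract calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jekyll-make-slug&lt;/code&gt;, which is not shown here. A minimal sketch
of what such a helper might do, assuming it lower-cases the name and replaces
every run of characters other than ASCII letters and digits with a hyphen.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/make-slug&lt;/code&gt; is a hypothetical name and the real function may behave
differently.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt;
(require &apos;subr-x) ; provides string-trim on older Emacs versions

(defun my/make-slug (name)
  &quot;Return NAME as a URL-safe slug (hypothetical helper).
Lower-case NAME, replace each run of characters other than
a-z and 0-9 with one hyphen, then trim hyphens at both ends.&quot;
  (let* ((lower (downcase name))
         (hyphenated (replace-regexp-in-string &quot;[^a-z0-9]+&quot; &quot;-&quot; lower)))
    (string-trim hyphenated &quot;-+&quot; &quot;-+&quot;)))

;; (my/make-slug &quot;Screen Shot_2024 (1)&quot;) returns &quot;screen-shot-2024-1&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;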

</content>
 </entry>
 
 <entry>
   <title>Learn in Emacs - Building Up Vocabulary</title>
   <link href="http://yitang.uk/2024/01/28/learn-in-emacs-building-up-vocabulary/"/>
   <updated>2024-01-28T00:00:00+00:00</updated>
   <id>http://yitang.uk/2024/01/28/learn-in-emacs--building-up-vocabulary</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#WHY?&quot;&gt;WHY?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Workflow for Building Vocabulary&quot;&gt;Workflow for Building Vocabulary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Revision on Mobile Devices&quot;&gt;Revision on Mobile Devices&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Org-mode Based Simple Study Strategies&quot;&gt;Org-mode Based Simple Study Strategies&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Emacs Lisp Implementation&quot;&gt;Emacs Lisp Implementation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;WHY?&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;why&quot;&gt;WHY?&lt;/h2&gt;

&lt;p&gt;Research shows having effective and rapid communication can boost
creativity and spark joy&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. I believe in it from my personal
experience in conversing, reading a book in my native language or
trying to understand a large codebase.&lt;/p&gt;

&lt;p&gt;I wasn’t able to achieve similar results when it came to using
English. In the past I have been trying to improve my English language
skills to boost my productivity in reading books and to make it more
enjoyable. The approach was practising more in reading and writing.
However, I started questioning the effectiveness. This year I decided
to take one step back to focus on the basics and improve my
vocabulary.&lt;/p&gt;

&lt;p&gt;I want to use the “slip-box” method&lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, which proved to be effective for
learning the Emacs Lisp language. It is a bottom-up approach, so I would
have one note for each word with the explanation in it, links to other
similar words, or words I got confused with.&lt;/p&gt;

&lt;p&gt;One advantage is that I can also leverage my existing setup.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Workflow for Building Vocabulary&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;workflow-for-building-vocabulary&quot;&gt;Workflow for Building Vocabulary&lt;/h2&gt;

&lt;p&gt;When I come across a new word whose meaning I’m not sure about, I
will&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;move the cursor to the word,&lt;/li&gt;
  &lt;li&gt;press &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;F1 d&lt;/code&gt; to look it up in the dictionary; the result will be shown in
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osx-dictionary&lt;/code&gt; buffer,&lt;/li&gt;
  &lt;li&gt;read its meaning and try to understand it,&lt;/li&gt;
  &lt;li&gt;press &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r&lt;/code&gt; to listen to the pronunciation and repeat after it. I usually
repeat it a couple of times to deepen the memory,&lt;/li&gt;
  &lt;li&gt;press &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; to create an atomic note, which has the dictionary meaning
in it for future reference,&lt;/li&gt;
  &lt;li&gt;edit the note to add my understanding and copy the
sentence/paragraph that contains the new word,&lt;/li&gt;
  &lt;li&gt;press &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C-c C-c&lt;/code&gt; to save it to my vocabulary database, which is just
a folder with flat org-mode files.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There’s quite a lot of automation so I can focus on understanding the word
(Step 3) and writing a good note (Step 6) in my own words.&lt;/p&gt;

&lt;p&gt;This workflow depends on two Emacs packages:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;osx-dictionary:&lt;/strong&gt; it interfaces with macOS’s dictionary app. It
displays the meaning and says the pronunciation.&lt;/p&gt;

    &lt;p&gt;The package is well written and easy to work with; I managed to
extend it to add Steps 5-7 with little effort.&lt;/p&gt;

    &lt;p&gt;It has limitations: it works only in macOS and it only outputs one
dictionary. Adding the meaning in Chinese requires a few more manual
steps: 1) press &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;o&lt;/code&gt; to open the Dictionary.app, 2) go to the Chinese
dictionary tab and copy the meaning, and 3) paste it to the note in
Emacs.&lt;/p&gt;

    &lt;p&gt;I personally find the Oxford dictionary that macOS uses hard to
follow. From time to time I have to visit
&lt;a href=&quot;https://dictionary.cambridge.org/&quot;&gt;https://dictionary.cambridge.org/&lt;/a&gt; to find an explanation that I
can understand. In transforming my old vocabulary notes to the new
format, I found the explanations from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vocabulary.com&lt;/code&gt; to be the best. I
might have to resurrect my voca-builder&lt;sup&gt;&lt;a id=&quot;fnr.3&quot; class=&quot;footref&quot; href=&quot;#fn.3&quot; role=&quot;doc-backlink&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; package.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;org-roam:&lt;/strong&gt; it interfaces with org-mode for creating atomic
notes. It avoids duplication: if a note for the word already
exists, it opens that note, so I can have a look and enrich
it.&lt;/p&gt;

    &lt;p&gt;I can link notes/words in my vocabulary database which is very
useful because for me learning by comparing is super effective.&lt;/p&gt;

    &lt;p&gt;Org-mode provides a lot of functionality that might be useful
to facilitate learning in the future.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a id=&quot;Revision on Mobile Devices&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;revision-on-mobile-devices&quot;&gt;Revision on Mobile Devices&lt;/h2&gt;

&lt;p&gt;Once I have a fleet of notes, the next step is to revise them
regularly. The routine I’m trying to get myself into is rereading the
notes I created over the last few days while waiting for the tube/bus.
I call it a revision break.&lt;/p&gt;

&lt;p&gt;So far I have an Emacs Lisp program&lt;sup&gt;&lt;a id=&quot;fnr.4&quot; class=&quot;footref&quot; href=&quot;#fn.4&quot; role=&quot;doc-backlink&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; that filters all my notes by time
so I have last_24_hours.org, last_3_days.org and
last_7_days.org. These files are synced with iCloud so they are
available to review on my iPad and iPhone using the &lt;a href=&quot;https://beorgapp.com/&quot;&gt;beorg App&lt;/a&gt;.&lt;/p&gt;
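
&lt;p&gt;A rough sketch of the filtering idea, and not the program referenced in the
footnote. It assumes that a note counts as recent when its file modification
time falls within the window, and that a digest is simply the recent notes
concatenated into one file. The function names and the paths in the usage
comment are hypothetical.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt;
(require &apos;seq)

(defun my/notes-modified-within (dir days)
  &quot;Return the org files in DIR modified within the last DAYS days.&quot;
  (let ((cutoff (time-subtract (current-time) (days-to-time days))))
    (seq-filter
     (lambda (file)
       ;; Keep the file when the cutoff is earlier than its last change.
       (time-less-p cutoff
                    (file-attribute-modification-time
                     (file-attributes file))))
     (directory-files dir t &quot;\\.org\\&apos;&quot;))))

(defun my/notes-write-digest (files output)
  &quot;Concatenate the contents of FILES into the file OUTPUT.&quot;
  (with-temp-file output
    (dolist (file files)
      (insert-file-contents file)
      ;; Point stays before the inserted text, so jump to the end.
      (goto-char (point-max))
      (insert &quot;\n&quot;))))

;; Hypothetical usage; keep OUTPUT outside the notes folder so the
;; digest is not picked up as a note on the next run:
;; (my/notes-write-digest
;;  (my/notes-modified-within &quot;~/notes/&quot; 3)
;;  &quot;~/notes-digest/last_3_days.org&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;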

&lt;!-- &lt;figure class=&quot;image&quot;&gt; --&gt;
&lt;!--   &lt;figcaption&gt;&lt;/figcaption&gt; --&gt;
&lt;!--   &lt;img src=&quot;&quot; alt=&quot;&quot; align=&quot;center&quot;&gt; --&gt;

&lt;!-- &lt;/figure&gt; --&gt;

&lt;!-- https://talk.jekyllrb.com/t/need-help-with-image-caption/6715/15 --&gt;
&lt;!-- &lt;figure --&gt;
&lt;p align=&quot;center&quot;&gt;
    &lt;!-- style=&quot; --&gt;
    &lt;!--         &lt;\!-- padding: 10px; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-top: 1px solid #999; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-right: 2px solid #555; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-bottom: 2px solid #555; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-left: 1px solid #999; -\-&gt; --&gt;
    &lt;!--       &quot; --&gt;

  &lt;br /&gt;
  &lt;em&gt; Reading my vocabulary notes on iPhone &lt;/em&gt;
  
  &lt;img src=&quot;/assets/2024012822433515c6563022e747a397f76e5c777e61be1102o.jpeg&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;400&quot; align=&quot;center&quot; /&gt;
  &lt;!-- &lt;figcaption --&gt;
  &lt;!--   style=&quot;text-align: center;&quot; --&gt;
  &lt;!--   &gt; --&gt;
  &lt;!--   &lt;sup&gt;&lt;em&gt; Reading my vocabulary notes on iPhone &lt;/em&gt;&lt;/sup&gt; --&gt;
  &lt;!-- &lt;/figcaption&gt; --&gt;
&lt;/p&gt;
&lt;!-- &lt;/figure&gt; --&gt;

&lt;p&gt;&lt;a id=&quot;Org-mode Based Simple Study Strategies&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;org-mode-based-simple-study-strategies&quot;&gt;Org-mode Based Simple Study Strategies&lt;/h2&gt;

&lt;p&gt;For the dedicated study sessions, I need a few strategies to shortlist
the notes. I think they will be based on the metadata of the
note. With org-mode’s API, they should be easy to implement.&lt;/p&gt;

&lt;p&gt;I haven’t done it yet, but the idea is to score the notes from 0 to 5:
5 marks the most important notes, which I would study first, and 0 marks
unimportant notes, which go to the bottom.&lt;/p&gt;

&lt;p&gt;There can be multiple scores, for example, one for pronunciation. A
word whose pronunciation I got completely wrong would get a 5, and
a word would get a 3 if sometimes I got it wrong and sometimes I
got it right.&lt;/p&gt;

&lt;p&gt;Another score is how many times I looked up the word. There are
words that I just keep forgetting, or keep confusing with
another similar word. So the ‘visited_at’ property gets a timestamp
appended at each visit, and the score is calculated from the
number of timestamps.&lt;/p&gt;
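
&lt;p&gt;A minimal sketch of the visit-count score, since it is not implemented yet.
It assumes the timestamps are stored as a space-separated, multi-valued
‘visited_at’ property on the entry at point, that the count is capped at 5 to
stay on the 0 to 5 scale, and that the timestamp format is free to choose. The
two function names are hypothetical; the property helpers are part of org-mode.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt;
(require &apos;org)

(defun my/voca-record-visit ()
  &quot;Append the current time to the visited_at property of the entry at point.&quot;
  (interactive)
  ;; The timestamp contains no spaces, so it is stored as a single value.
  (org-entry-add-to-multivalued-property
   (point) &quot;visited_at&quot; (format-time-string &quot;%Y%m%dT%H%M%S&quot;)))

(defun my/voca-visit-score ()
  &quot;Return a score from 0 to 5 based on the number of recorded visits.&quot;
  (min 5 (length (org-entry-get-multivalued-property
                  (point) &quot;visited_at&quot;))))
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;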

&lt;p&gt;&lt;a id=&quot;Emacs Lisp Implementation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-lisp-implementation&quot;&gt;Emacs Lisp Implementation&lt;/h2&gt;

&lt;p&gt;Adding an action to the header line in osx-dictionary’s buffer.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;osx-dictionary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;setq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;osx-dictionary-mode-header-line&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;append&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:propertize&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;a&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;face&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;mode-line-buffer-id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;s&quot;&gt;&quot;: Add to vocabulary&quot;&lt;/span&gt;
                &lt;span class=&quot;s&quot;&gt;&quot;    &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;nv&quot;&gt;osx-dictionary-mode-header-line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/add-to-vocabulary&lt;/code&gt; to key &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; in the osx-dictionary buffer. It
creates a note using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-roam&lt;/code&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defvar&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;vocabulary-repo-dir&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;~/matrix/learning/meta-leanring/vocabulary&quot;&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;where to save the vocabulary notes.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defvar&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/voca--roam-template&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;d&quot;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;default&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;plain&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;%?&quot;&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:target&lt;/span&gt;
     &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file+head&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;%&amp;lt;%Y%m%d%H%M%S&amp;gt;-${slug}.org&quot;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;#+title: ${title}

%?

#+begin_example
%(yt/osx-dict--get-meaning)
#+end_example

&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
     &lt;span class=&quot;ss&quot;&gt;:unnarrowed&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;roam template for vocabulary notes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/add-to-vocabulary&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Add a new vocabulary note for the highlighted region or word at
point.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-roam-directory&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;expand-file-name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;notes&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;vocabulary-repo-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-roam-db-location&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;expand-file-name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;org-roam.db&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-roam-directory&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-roam-capture-templates&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/voca--roam-template&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-roam-node-find&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;yt/osx-dict--get-word-and-pronounce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/osx-dict--get-word-and-pronounce&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;extract the word and its pronunciation from the *osx-dictionary* buffer&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;with-current-buffer&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;*osx-dictionary*&quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;goto-char&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;search-forward&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;|&quot;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-substring-no-properties&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/osx-dict--get-meaning&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Return the *osx-dictionary* buffer content as a string&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;with-current-buffer&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;*osx-dictionary*&quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-substring-no-properties&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;define-key&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;osx-dictionary-mode-map&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;a&quot;&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;yt/add-to-vocabulary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; reference is lost; it is somewhere in the book “The Second
Mountain”, the chapter on religions.&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; from the book “How to Take Smart Notes”. I plan to reread this
book in early 2024.&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.3&quot; href=&quot;#fnr.3&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; My first Emacs package in 2015,
&lt;a href=&quot;https://github.com/yitang/voca-builder&quot;&gt;https://github.com/yitang/voca-builder&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.4&quot; href=&quot;#fnr.4&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; next blog post is on my lisp programs&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Atomic Habit in Emacs - Keep Git Repos Clean</title>
   <link href="http://yitang.uk/2024/01/14/atomic-habit-in-emacs-keep-git-repos-clean/"/>
   <updated>2024-01-14T00:00:00+00:00</updated>
   <id>http://yitang.uk/2024/01/14/atomic-habit-in-emacs--keep-git-repos-clean</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Why?&quot;&gt;Why?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Emacs Lisp Helper&quot;&gt;Emacs Lisp Helper&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Practise&quot;&gt;Practise&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Why?&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;why&quot;&gt;Why?&lt;/h2&gt;

&lt;p&gt;I am having a hard time keeping my git repositories clean: there are
just too many of them (I counted 31 in total), and I work on them
across 5 computers.&lt;/p&gt;

&lt;p&gt;The consequence is that sometimes I get surprised at seeing a lot of
seemingly useful changes that are not committed to the git repo. I had
to stop whatever I was doing to just think about what to do with those
changes. It breaks the flow!&lt;/p&gt;

&lt;p&gt;There are other occasions where I thought I had fixed some bugs, but I
don’t have the patches on my laptop. It turns out I didn’t push them
to the cloud, so I have to log back in to the right server to run a couple
of git commands, or, if I don’t have access to the server, I have
to fix the bugs from scratch again. It is inefficient!&lt;/p&gt;

&lt;p&gt;It can happen a lot in active projects where I work on multiple
systems and multiple git repos or when I travel. I plan to revisit my
filesystem (which is inspired by Stephen Wolfram &lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;) and tech
setup to reduce the number of repos by merging them and keeping only
1 laptop, 1 workstation and 1 server. This is something for the summer; it
can reduce the severity of the problem but cannot eliminate it.&lt;/p&gt;

&lt;p&gt;At the moment, I just have to become more disciplined in managing
files, e.g. to have an atomic habit of checking my git repo regularly,
or at least do it once at the end of the day, or as part of the
shutdown ritual after finishing a task&lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Emacs Lisp Helper&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-lisp-helper&quot;&gt;Emacs Lisp Helper&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;The 3rd Law of Behavior Change is make it easy.&lt;/p&gt;

  &lt;p&gt;James Clear, Atomic Habits&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To facilitate the forming of this habit, I implemented a utility
function in Lisp that lists the dirty git repos and provides a clickable
link to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;magit-status&lt;/code&gt; buffer of each repo. With one click on
the hyperlink, I can start to run git commands via the mighty &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;magit&lt;/code&gt;
package. I bind this action to the keystroke &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;F9-G&lt;/code&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/git--find-unclean-repo&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Return the git repos under ROOT-DIR that have uncommitted changes.&quot;&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;;; Bind OUT locally so the function does not set a global variable.&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dolist&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;dir&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;directory-files-recursively&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;\\.git$&quot;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;message&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;checking repo %s&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;git-dir&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-parent-directory&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
             &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;default-directory&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;unless&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;string=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;shell-command-to-string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;git status --porcelain&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-dir&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;


&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/dirty-git-repos&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;&amp;amp;optional&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;list the dirty git repos, provides a clickable link to their
magit-status buffer.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;read-directory-name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Where&apos;s the root directory?&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;

  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;get-buffer-create&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;*test-git-clean*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;git-repos&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;yt/git--find-unclean-repo&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;default-directory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;with-current-buffer&lt;/span&gt;  &lt;span class=&quot;nv&quot;&gt;buffer&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;unless&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;eq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;major-mode&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;org-mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;goto-char&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Number of dirty git repos: %s &quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-repos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dolist&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;git-repo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-repos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;\n[[elisp:(magit-status \&quot;%s\&quot;)][%s]]&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-repo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;git-repo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The workhorse is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git status --porcelain&lt;/code&gt; command: if the git repo
is clean, it returns nothing; otherwise, it outputs the names of the files
whose changes are not checked in. In the example below, the first file is
modified (M), and the second file is untracked (??).&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; M config/Dev-R.el
?? snippets/org-mode/metric
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
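
&lt;p&gt;As a rough illustration of that format, the sketch below splits the
porcelain output into pairs of status code and file name. The function name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/git-porcelain-parse&lt;/code&gt; is hypothetical and not part of the helper above. It
assumes the simple layout shown in the sample (two status characters, a
space, then the path), so renamed files or quoted paths would need extra
handling.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
(defun my/git-porcelain-parse (output)
  &quot;Parse OUTPUT of `git status --porcelain&apos; into (STATUS . FILE) pairs.&quot;
  (mapcar (lambda (line)
            ;; Characters 0-1 are the status code, character 2 is a
            ;; space, and the rest of the line is the file name.
            (cons (substring line 0 2) (substring line 3)))
          ;; Split on newlines and drop the empty trailing string.
          (split-string output &quot;\n&quot; t)))

;; Using the two sample lines above:
(my/git-porcelain-parse &quot; M config/Dev-R.el\n?? snippets/org-mode/metric\n&quot;)
;; =&amp;gt; ((&quot; M&quot; . &quot;config/Dev-R.el&quot;) (&quot;??&quot; . &quot;snippets/org-mode/metric&quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;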

&lt;p&gt;The rest of the code checks that output and presents the dirty repos
in a user-friendly format in Org-mode. What’s interesting is that
Org-mode provides a kind of hyperlink that evaluates Lisp
expressions, as in the example below:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;nv&quot;&gt;[[elisp:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;magit-status&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/foo&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;][Git Status of Repo /foo]]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The description of the hyperlink is “Git Status of Repo /foo”. After
I click it, it runs the expression &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(magit-status &quot;/foo&quot;)&lt;/code&gt;, which shows
the git status of the /foo repo in a dedicated buffer.&lt;/p&gt;

&lt;p&gt;Before executing, it will ask for a confirmation. It can be a bit
annoying and inconvenient at first, which naturally leads to the
temptation of removing this behaviour by setting
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-link-elisp-confirm-function&lt;/code&gt; to nil. I discourage you from doing
so in case someone embeds malicious code (for example, something that runs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rm -rf ~/&lt;/code&gt;) in
a hyperlink, so make sure to check that variable’s documentation
before changing it&lt;sup&gt;&lt;a id=&quot;fnr.3&quot; class=&quot;footref&quot; href=&quot;#fn.3&quot; role=&quot;doc-backlink&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;!&lt;/p&gt;
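
&lt;p&gt;A possible middle ground, sketched below, is to keep the confirmation
in general and skip it only for links that call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;magit-status&lt;/code&gt; with a single
string argument. This assumes an Org version that provides the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-link-elisp-skip-confirm-regexp&lt;/code&gt; variable (I believe Org 9.3 or later), so
please check its documentation in your setup before relying on it.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
;; Assumption: `org-link-elisp-skip-confirm-regexp&apos; exists in your Org.
;; Elisp links whose code matches this regexp run without confirmation;
;; every other elisp link still goes through
;; `org-link-elisp-confirm-function&apos;.
(setq org-link-elisp-skip-confirm-regexp
      ;; Matches exactly (magit-status &quot;SOME-PATH&quot;) and nothing else.
      &quot;\\`(magit-status \&quot;[^\&quot;]*\&quot;)\\&apos;&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The regexp is anchored at both ends on purpose, so a link that merely
contains a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;magit-status&lt;/code&gt; call next to other code still asks for
confirmation.&lt;/p&gt;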

&lt;p&gt;&lt;a id=&quot;Practise&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;practise&quot;&gt;Practise&lt;/h2&gt;

&lt;p&gt;It was fun to write the Lisp functions. I learnt how to use the
optional function argument and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interactive&lt;/code&gt; so that the function can
be used both interactively and programmatically. I very much want to
spend more time coding, to enhance it with some ideas I got from
reading Xu Chunyang’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osx-dictionary&lt;/code&gt; package&lt;sup&gt;&lt;a id=&quot;fnr.4&quot; class=&quot;footref&quot; href=&quot;#fn.4&quot; role=&quot;doc-backlink&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
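
&lt;p&gt;For readers who have not met that pattern before, here is a minimal
sketch of it. The function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/count-entries&lt;/code&gt; is a hypothetical example made up
for illustration, not something from my configuration.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
(defun my/count-entries (&amp;amp;optional dir)
  &quot;Return the number of entries in DIR, or in `default-directory&apos; if nil.&quot;
  ;; Interactive calls prompt for DIR; Lisp callers may pass it or omit it.
  (interactive (list (read-directory-name &quot;Directory: &quot;)))
  (let ((n (length (directory-files (or dir default-directory) nil
                                    directory-files-no-dot-files-regexp))))
    ;; Only report in the echo area when invoked via M-x or a key binding.
    (when (called-interactively-p &apos;interactive)
      (message &quot;%d entries&quot; n))
    n))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(or dir default-directory)&lt;/code&gt; fallback is what makes the optional argument
safe to omit when the function is called from other Lisp code.&lt;/p&gt;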

&lt;p&gt;However, the effectiveness of those functions has little to do with
the extra features I had in mind but really depends on how I use
them. Solving the problems requires deliberate practice and changing
my behaviours so that cleaning git repos becomes a habit of mine, which
is always the hardest part.&lt;/p&gt;

&lt;p&gt;One key indicator for this habit&lt;sup&gt;&lt;a id=&quot;fnr.5&quot; class=&quot;footref&quot; href=&quot;#fn.5&quot; role=&quot;doc-backlink&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; can be the number of check-ins:
I will see whether there’s a substantial increase from today.&lt;/p&gt;

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; see Stephen Wolfram’s &lt;a href=&quot;https://writings.stephenwolfram.com/2019/02/seeking-the-productive-life-some-details-of-my-personal-infrastructure/&quot;&gt;blog posts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Cal Newport, &lt;a href=&quot;https://www.amazon.co.uk/Deep-Work-Focused-Success-Distracted/dp/1455586692&quot;&gt;Deep Work&lt;/a&gt;, Page 151&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.3&quot; href=&quot;#fnr.3&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://orgmode.org/manual/Code-Evaluation-Security.html&quot;&gt;https://orgmode.org/manual/Code-Evaluation-Security.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.4&quot; href=&quot;#fnr.4&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://github.com/xuchunyang/osx-dictionary.el&quot;&gt;https://github.com/xuchunyang/osx-dictionary.el&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.5&quot; href=&quot;#fnr.5&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; inspired by Andrew Grove’s book &lt;a href=&quot;https://www.amazon.co.uk/High-Output-Management-Andrew-Grove/dp/0679762884&quot;&gt;High Output Management&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>GPG in Emacs - Functions to Decrypt and Delete All</title>
   <link href="http://yitang.uk/2024/01/06/gpg-in-emacs-functions-to-decrypt-and-delete-all/"/>
   <updated>2024-01-06T00:00:00+00:00</updated>
   <id>http://yitang.uk/2024/01/06/gpg-in-emacs--functions-to-decrypt-and-delete-all</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Motivation&quot;&gt;Motivation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Emacs Lisp Implementation&quot;&gt;Emacs Lisp Implementation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Bash Implementation&quot;&gt;Bash Implementation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Motivation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;/h2&gt;

&lt;p&gt;Continuing from &lt;a href=&quot;http://yitang.uk/2023/12/28/gpg-in-emacs-first-step-towards-data-security/&quot;&gt;my last post&lt;/a&gt;, EPA provides a seamless interface when
working with GPG files in Emacs. But there are situations where I have
to work with GPG files using other programs (mostly Python), where EPA
cannot help.&lt;/p&gt;

&lt;p&gt;For those cases, I have to decrypt the GPG files first before using
them (for example, calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pandas.read_csv&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Obviously, there’s no point in encrypting a file if there is a
decrypted version next to it. So I also need a function to delete all
the decrypted files.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Emacs Lisp Implementation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-lisp-implementation&quot;&gt;Emacs Lisp Implementation&lt;/h2&gt;

&lt;p&gt;Of course, I run Python inside Emacs, so I wrote Lisp functions to
decrypt GPG files and delete all the decrypted files.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/gpg--decrypt-recursively&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;It decrypts all the files ending in .gpg under the root-dir. The decrypted files have the same filename but without the .gpg extension.

It stops if the decryption fails. 
&quot;&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;;; The &quot;D&quot; code prompts for a directory, so M-x can supply ROOT-DIR.&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;DRoot directory: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;;; \\&apos; anchors the match at the end of the file name.&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dolist&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;directory-files-recursively&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;\\.gpg\\&apos;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;;; the 2nd argument for epa-decrypt-file can only be the base filename without the directory.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;default-directory&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-directory&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;epa-decrypt-file&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-base&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/gpg--delete-decrypted-files&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;It deletes the decrypted files under the root-dir directory.

e.g. if there&apos;s a file foo.tar.gz.gpg, it attempts to remove the foo.tar.gz file.
&quot;&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;;; The &quot;D&quot; code prompts for a directory, so M-x can supply ROOT-DIR.&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;DRoot directory: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;;; \\&apos; anchors the match at the end of the file name.&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dolist&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;directory-files-recursively&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root-dir&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;\\.gpg\\&apos;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;delete-file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-sans-extension&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;A bit of explanation:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;directory-files-recursively:&lt;/strong&gt; searches for files whose names match a
pattern. Here, it returns all the files ending with &lt;em&gt;.gpg&lt;/em&gt; under the
given &lt;em&gt;root-dir&lt;/em&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;dolist:&lt;/strong&gt; loops over the GPG files to process them one by one.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;epa-decrypt-file:&lt;/strong&gt; decrypts a GPG file into a new file.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;delete-file:&lt;/strong&gt; deletes the file with the given name.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It seems the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epa-decrypt-file&lt;/code&gt; function does not like the new
filename with the directory in its path, so I have to set the default
directory (working directory) and use the base filename after removing
the directory as a workaround.&lt;/p&gt;
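
&lt;p&gt;Since a decrypted copy left behind defeats the point of encryption,
one possible way to pair the two functions is sketched below. The wrapper
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my/gpg-call-with-decrypted-files&lt;/code&gt; is a hypothetical name, and it assumes both
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/gpg--decrypt-recursively&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yt/gpg--delete-decrypted-files&lt;/code&gt; from above are
already loaded.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
(defun my/gpg-call-with-decrypted-files (root-dir fn)
  &quot;Decrypt the GPG files under ROOT-DIR, call FN, then clean up.
FN is a function of no arguments. The cleanup form runs even when
decryption or FN signals an error.&quot;
  (unwind-protect
      (progn
        (yt/gpg--decrypt-recursively root-dir)
        (funcall fn))
    ;; Cleanup form: always attempt to remove the decrypted copies.
    (yt/gpg--delete-decrypted-files root-dir)))

;; Usage sketch; the directory and the script name are made up.
;; (my/gpg-call-with-decrypted-files
;;  &quot;~/data&quot;
;;  (lambda () (shell-command &quot;python analysis.py&quot;)))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The design choice here is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwind-protect&lt;/code&gt;: the deletion is tied to leaving the
form, so it does not depend on me remembering to run it afterwards.&lt;/p&gt;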

&lt;p&gt;&lt;a id=&quot;Bash Implementation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;bash-implementation&quot;&gt;Bash Implementation&lt;/h2&gt;

&lt;p&gt;It would be useful to have this functionality outside Emacs,
so I implemented the counterparts in Bash.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;decrypt_recursively&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# PS: this function is equivalent to `gpg --decrypt-files $1/**/*.gpg`&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;fn &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;find &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-iname&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;*.gpg&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;do
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo &lt;/span&gt;decrypt &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; to &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;%.*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
        gpg &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;%.*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; 
    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;function &lt;/span&gt;remove_decrypted_files&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;fn &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;find &lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-iname&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;*.gpg&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;do
        &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo &lt;/span&gt;removing &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;%.*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;%.*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The interface is the same: given a root directory, it decrypts all the
GPG files or deletes the decrypted files.&lt;/p&gt;

&lt;p&gt;A little bit of Bash:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;$1:&lt;/strong&gt; refers to the first function argument, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$2&lt;/code&gt; refers to the
second function argument and so on. This is the Bash way. When the
function is called, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$1&lt;/code&gt; will be replaced with the actual argument,
here it means the root directory.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
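    &lt;p&gt;&lt;strong&gt;$(find …):&lt;/strong&gt; is command substitution: it expands to the list of files printed by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find&lt;/code&gt; program. In
this context, it stands for all the files under the root directory whose
filename ends with &lt;em&gt;.gpg&lt;/em&gt; (case-insensitively, because of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-iname&lt;/code&gt;).
Because the result is split on whitespace, filenames containing spaces
will break the loop.&lt;/p&gt;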

    &lt;p&gt;The same can be achieved with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ls&lt;/code&gt; program and a recursive glob, but it will be a lot
slower &lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and requires some configuration on macOS &lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;${fn%.*}:&lt;/strong&gt; removes the last file extension of the variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$fn&lt;/code&gt;;
for example, &lt;em&gt;foo.tar.gz.gpg&lt;/em&gt; becomes &lt;em&gt;foo.tar.gz&lt;/em&gt;.&lt;/p&gt;

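    &lt;p&gt;Another approach is using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$(basename $fn .gpg)&lt;/code&gt; to remove the
&lt;em&gt;.gpg&lt;/em&gt; extension explicitly, although &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;basename&lt;/code&gt; also strips the
directory part of the path.&lt;/p&gt;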
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;for, do, done:&lt;/strong&gt; loops through each file.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
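
&lt;p&gt;As a minimal sketch of how these pieces fit together, the function below
prints the decrypted target path for every GPG file under a root directory.
The name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list_decrypted_targets&lt;/code&gt; is hypothetical and used only for
illustration. It also swaps the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;for&lt;/code&gt; loop for a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while read&lt;/code&gt; loop, a common
Bash idiom that should keep filenames with spaces intact.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;
# Hypothetical helper, for illustration only: print the decrypted
# target path for every .gpg file under the root directory given as $1.
function list_decrypted_targets() {
    # find separates the names with NUL bytes (-print0) and read splits
    # on NUL bytes (-d &apos;&apos;), so spaces in filenames survive.
    find &quot;$1&quot; -iname &quot;*.gpg&quot; -print0 | while IFS= read -r -d &apos;&apos; fn
    do
        # ${fn%.*} strips the last extension: foo.tar.gz.gpg becomes foo.tar.gz
        printf &apos;%s\n&apos; &quot;${fn%.*}&quot;
    done
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Calling it with a root directory lists one target path per line, which is a
harmless way to preview what the decrypt and remove functions would touch.&lt;/p&gt;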

&lt;p&gt;The Bash functions have the advantage of being easy to incorporate
into the system: for example, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remove_decrypted_files&lt;/code&gt;
function can be called automatically before shutting down or after logging in.&lt;/p&gt;
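
&lt;p&gt;One possible way to wire this up, sketched under assumptions: register the
cleanup as an exit trap in a shell start-up file. The variable
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SECRETS_DIR&lt;/code&gt; and its default location are hypothetical placeholders, the
function is a whitespace-safe variant of the one above, and an exit trap
only covers the end of a shell session, not a system shutdown.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;
# Hypothetical location of the decrypted files; adjust to your own setup.
SECRETS_DIR=&quot;${SECRETS_DIR:-$HOME/secrets}&quot;

# Whitespace-safe variant: delete the decrypted twin of every .gpg file.
function remove_decrypted_files() {
    find &quot;$1&quot; -iname &quot;*.gpg&quot; -print0 | while IFS= read -r -d &apos;&apos; fn
    do
        # -f keeps rm quiet when the decrypted file does not exist
        rm -f -- &quot;${fn%.*}&quot;
    done
}

# Run the cleanup whenever this shell session exits.
trap &apos;remove_decrypted_files &quot;$SECRETS_DIR&quot;&apos; EXIT&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;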

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://unix.stackexchange.com/questions/12659/why-does-ls-take-so-much-longer-than-ls&quot;&gt;why glob is slow&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; href=&quot;#fnr.2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://apple.stackexchange.com/questions/291287/globstar-invalid-shell-option-name-on-macos-even-with-bash-4-x&quot;&gt;how to enable globstar option in MacOS&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>GPG in Emacs - First Step Towards Data Security</title>
   <link href="http://yitang.uk/2023/12/28/gpg-in-emacs-first-step-towards-data-security/"/>
   <updated>2023-12-28T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/12/28/gpg-in-emacs--first-step-towards-data-security</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h3 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#WHY?&quot;&gt;WHY?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#GNU Privacy Guard (GPG)&quot;&gt;GNU Privacy Guard (GPG)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#EPA - Emacs Interface to GPG&quot;&gt;EPA - Emacs Interface to GPG&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Org-Agenda and Dired&quot;&gt;Org-Agenda and Dired&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Lisp to Close all GPG Files&quot;&gt;Lisp to Close all GPG Files&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;WHY?&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;why&quot;&gt;WHY?&lt;/h3&gt;

&lt;p&gt;I have growing concerns about data security. It is not that I have
something to hide; it’s that I don’t like how my data is being
harvested by the big corporations for their own benefit,
which mostly means trying to sell me stuff that I don’t need or have
already purchased. Seeing advertisements that specifically target me
motivates me to do something.&lt;/p&gt;

&lt;p&gt;Setting up my own personal cloud seems a bit too extreme, and I don’t have
the time for it anyway. So I did a little “off-the-grid” experiment
in which I exclusively used an offline Debian laptop for
data-sensitive work (password management, personal finance, diary
etc). It is certainly secure, but the problem is
accessibility: I can only work when I have access to the physical
hardware.&lt;/p&gt;

&lt;p&gt;It becomes infeasible when I travel, and it gives me some headaches to
maintain one more system. Also, the laptop’s screen is only 720p, I
can literally see the pixels when I write; it feels criminal to not
use the MBP’s Retina display. Lastly, it cannot be off the grid
completely; at one point, I have to back it up to the cloud.&lt;/p&gt;

&lt;p&gt;So I spent some time researching and learning. I just need a data
protection layer so that I don’t have to worry about accidentally leaking
private data myself, or about the cloud storage provider getting hacked.&lt;/p&gt;

&lt;p&gt;The benefits include not only having peace of mind but also
encouraging myself to work on those types of projects with greater
convenience.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;GNU Privacy Guard (GPG)&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;gnu-privacy-guard-gpg&quot;&gt;GNU Privacy Guard (GPG)&lt;/h3&gt;

&lt;p&gt;GPG is the tool I settled on. It is a 24-year-old piece of software that enables
encrypting/decrypting files, emails or online communication in
general. It is part of the GNU project, which carries a lot of weight with me.&lt;/p&gt;

&lt;p&gt;There are two methods in GPG:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Symmetric method:&lt;/strong&gt; The same password is used to both encrypt and decrypt
the file, thus the symmetric in its name.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Asymmetric method:&lt;/strong&gt; It requires a public key to encrypt, and a
separate private key to decrypt.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There seems to be no clear winner between the two methods&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. I chose
the asymmetric method simply for its ease of use. The symmetric method
requires typing the password twice whenever I save/encrypt the file,
which seems too much.&lt;/p&gt;

&lt;p&gt;The GPG command line interface is simple. Take the below snippet as an
example,&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
gpg &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Bob&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; foo.org
gpg &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; foo2.org &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; foo.org.gpg&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The first line encrypts the foo.org file using the public key identified as
“Bob”. It results in a file named foo.org.gpg.&lt;/p&gt;

&lt;p&gt;The second line decrypts the foo.org.gpg file to foo2.org, which will
be identical to foo.org.&lt;/p&gt;
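
&lt;p&gt;To check such a round trip without writing a second file, something like the
sketch below could be used. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;decrypts_to_same_content&lt;/code&gt; is
hypothetical; it assumes the private key needed for decryption is available,
and it relies only on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gpg -d&lt;/code&gt; writing the plaintext to standard output.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;
# Hypothetical helper, for illustration only: succeed when decrypting
# the file given as $2 yields exactly the content of the file given as $1.
function decrypts_to_same_content() {
    # gpg -d writes the plaintext to standard output; cmp -s compares
    # that stream (-) with the original file and reports the result
    # through its exit status only.
    gpg --quiet -d &quot;$2&quot; | cmp -s - &quot;$1&quot;
}&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With the files from the snippet above, calling it with foo.org and
foo.org.gpg should return exit status 0 when the two match.&lt;/p&gt;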

&lt;p&gt;&lt;a id=&quot;EPA - Emacs Interface to GPG&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;epa---emacs-interface-to-gpg&quot;&gt;EPA - Emacs Interface to GPG&lt;/h3&gt;

&lt;p&gt;Emacs provides a better interface to GPG: its EPA (EasyPG Assistant) package
enables me to encrypt/decrypt files in place, so I don’t have to keep jumping
between the decrypted file (foo.org) and the encrypted file
(foo.org.gpg) while working on it.&lt;/p&gt;

&lt;p&gt;Below is the simple configuration that works well for me and its
explanation.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;epa-file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;epa-file-enable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;setq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;epa-file-encrypt-to&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;foo@bar.com&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;setq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;epg-pinentry-mode&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;loopback&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;epa-file-enable:&lt;/strong&gt; is called to add hooks to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find-file&lt;/code&gt; so that
decrypting starts after opening a file in Emacs. I believe it also ensures
that encrypting starts when saving a GPG file.&lt;/p&gt;

    &lt;p&gt;To stop this behaviour, call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(epa-file-disable)&lt;/code&gt; function.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;epa-file-encrypt-to:&lt;/strong&gt; chooses the default key for
encryption.&lt;/p&gt;

    &lt;p&gt;This variable can be file specific; for example, to use the key
belonging to foo2@bar.com, put the following at the top of the file:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;;; -*- epa-file-encrypt-to: (&quot;foo2@bar.com&quot;) -*-
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;epg-pinentry-mode:&lt;/strong&gt; should be set to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loopback&lt;/code&gt; so that GPG reads
the password from Emacs’ minibuffer, otherwise, an external program
(pinentry if installed) is used.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a id=&quot;Org-Agenda and Dired&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;org-agenda-and-dired&quot;&gt;Org-Agenda and Dired&lt;/h3&gt;

&lt;p&gt;There are more benefits that Emacs offers when working with GPG files. Once I
have EPA configured, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-agenda&lt;/code&gt; command works pretty well
with encrypted files with no extra effort.&lt;/p&gt;

&lt;p&gt;In the simplified example below, I have two GPG files as
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-agenda-files&lt;/code&gt;. When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-agenda&lt;/code&gt; is called, Emacs first tries
to decrypt the foo.org.gpg file. It requires me to type the passphrase
in the minibuffer.&lt;/p&gt;

&lt;p&gt;The passphrase will be cached by the GPG Agent and will be used to
decrypt bar.org.gpg, assuming the same key is used for both files. So I
only need to type the passphrase once.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;setq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-agenda-files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;foo.org.gpg&quot;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;bar.org.gpg&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-agenda&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;After that, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-agenda&lt;/code&gt; works as if these GPG files are normal
unencrypted files; I can extract TODO lists, view the clock summary
report, search text and check schedules/deadlines etc.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired&lt;/code&gt; mode provides commands to encrypt (shortcut “:e”) and decrypt
(shortcut “:d”) multiple marked files in a dired buffer. Under the
hood, they call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epa-encrypt-file&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;epa-decrypt-file&lt;/code&gt;
functions.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Lisp to Close all GPG Files&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;lisp-to-close-all-gpg-files&quot;&gt;Lisp to Close all GPG Files&lt;/h3&gt;

&lt;p&gt;It seems that once a buffer is decrypted upon opening (or its file is
encrypted upon saving) in Emacs, the buffer itself stays decrypted for as long
as it is open. So I need a utility
function to close all the GPG buffers in Emacs to avoid leakage.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/gpg--kill-gpg-buffers&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;It attempts to close all the file visiting buffers whose filename ends with .gpg.

It will ask for confirmation if the buffer is modified but unsaved.&quot;&lt;/span&gt;

  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kill-matching-buffers&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;\\.gpg$&quot;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Before I share my screen or start working in a coffee shop, I
call this function to ensure that all buffers with sensitive data are closed.&lt;/p&gt;

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;p&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; href=&quot;#fnr.1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://security.stackexchange.com/questions/54360/when-should-i-use-symmetric-encryption-instead-of-rsa&quot;&gt;stackexchange: symmetric vs asymmetric method&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Jekyll in Emacs - Align URL with Headline</title>
   <link href="http://yitang.uk/2023/12/19/jekyll-in-emacs-align-headline-with-url/"/>
   <updated>2023-12-19T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/12/19/jekyll-in-emacs--align-headline-with-url</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Problem&quot;&gt;Problem&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Solution&quot;&gt;Solution&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Implementation&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;Problem&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;problem&quot;&gt;Problem&lt;/h2&gt;

&lt;p&gt;While I was working on improving the URL in my last post, I noticed
that the URLs to individual sections are not readable, for example,&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://yitang.uk/2023/12/18/jekyll-in-emacs-update-blog-post-title-and-date/#org0238b9f&quot;&gt;http://yitang.uk/2023/12/18/jekyll-in-emacs-update-blog-post-title-and-date/#org0238b9f&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The URL links to the section called Code, so a much better URL should
be&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://yitang.uk/2023/12/18/jekyll-in-emacs-update-blog-post-title-and-date/#Code&quot;&gt;http://yitang.uk/2023/12/18/jekyll-in-emacs-update-blog-post-title-and-date/#Code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My notes show I have had this issue for 9 months. I made another
attempt, but still could not find a solution!&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Solution&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;solution&quot;&gt;Solution&lt;/h2&gt;

&lt;p&gt;I then switched to tidying up my Emacs configuration, and the variable
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-html-prefer-user-labels&lt;/code&gt; caught my eye.&lt;/p&gt;

&lt;p&gt;Its documentation says&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;By default, Org generates its own internal ID values during HTML
export.

When non-nil use user-defined names and ID over internal ones.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So “#org0238b9f” is generated by org-mode. Such IDs are randomly
generated and change whenever I re-export the file. It means that every
time I update a blog post, the URLs to its sections break. This was a problem I
wasn’t aware of.&lt;/p&gt;

&lt;p&gt;Anyway, what’s important is that, in the end, it says&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Independently of this variable, however, CUSTOM_ID are always
used as a reference.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s it, I just need to set CUSTOM_ID. That’s the solution to my
problem. It is hidden in the documentation of some variables…&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Implementation&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;

&lt;p&gt;So I need a function to loop through each node, and set the &lt;em&gt;CUSTOM_ID&lt;/em&gt;
property to its headline. The org-mode API provides three helpful
functions for working with org files:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-entry-get&lt;/code&gt;:&lt;/strong&gt; to get a textual property of a node. The headline
title is referenced as “ITEM”,&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-entry-put&lt;/code&gt;:&lt;/strong&gt; to set a property of a node,&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-map-entries&lt;/code&gt;:&lt;/strong&gt; to apply a function to each node.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an experiment, I changed the final function a bit so that it can be used as an export hook
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org-export-before-processing-functions&lt;/code&gt;). With this
setup, it runs automatically whenever I export a blog post from org-mode
to Markdown. Also, it works on the copy of the buffer used for the export, so it leaves the
original org file unchanged.&lt;/p&gt;

&lt;p&gt;The code is listed below. It can also be found at &lt;a href=&quot;https://github.com/yitang/.emacs.d/blob/main/config.org#next-update-custom_id-property&quot;&gt;my .emacs.d git repo&lt;/a&gt;
which includes many other useful Emacs configurations for Jekyll.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
 &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll--create-or-update-custom_id-field&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;so that the CUSTOM_ID property is the same as the headline and 
the URL reflects the headline.

by default, the URL to a section will be a random number.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-entry-put&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;CUSTOM_ID&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-entry-get&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;ITEM&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll--create-or-update-custom_id-field-buffer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;backend&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;eq&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;backend&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;jekyll-md&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-map-entries&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;yt/jekyll--create-or-update-custom_id-field&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;add-hook&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;org-export-before-processing-functions&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;yt/jekyll--create-or-update-custom_id-field-buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
 &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

</content>
 </entry>
 
 <entry>
   <title>Jekyll in Emacs - Update Blog Post Title and Date</title>
   <link href="http://yitang.uk/2023/12/18/jekyll-in-emacs-update-blog-post-title-and-date/"/>
   <updated>2023-12-18T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/12/18/jekyll-in-emacs--update-blog-post-title-and-date</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#Emacs Lisp Time&quot;&gt;Emacs Lisp Time&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#Code&quot;&gt;Code&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I’m the type of writer who writes first and comes up with the title
later. The title in the end is usually rather different to what I
started with. To change the title is straightforward - update the
title and date fields in the front matter.&lt;/p&gt;

&lt;p&gt;However, doing so leads to discrepancies between the title and date
fields in front matter and the filename. In Jekyll, the filename
consists of the original date and title when the post is first
created.&lt;/p&gt;

&lt;p&gt;This can make it confusing to find the file when I want to
update a post; I have to rely on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep/ack&lt;/code&gt; to find the right one. A
little bit of inefficiency is fine.&lt;/p&gt;

&lt;p&gt;Recently, I realised that readers sometimes can be confused as well
because the URL apparently also depends on the filename.&lt;/p&gt;

&lt;p&gt;For example, I have my previous post in a file named
2023-12-08-trx-3970x.md. It indicates that I started writing it on 08
Dec with the initial title “trx 3970x”. A couple of days later, on 13
Dec, I published the post with the title “How Much Does Threadripper
3970x Help in Training LightGBM Models?”.&lt;/p&gt;

&lt;p&gt;The URL, however, is &lt;a href=&quot;http://yitang.uk/2023/12/13/trx4-3970x/&quot;&gt;yitang.uk/2023/12/13/trx-3970x&lt;/a&gt;. It has the correct
updated publish date, but the title is still the old one. This is just
how Jekyll works.&lt;/p&gt;

&lt;p&gt;Anyways, the correct URL should be&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;http://yitang.uk/2023/12/13/how-much-does-threadripper-3970x-help-in-training-lightgbm-models/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From that point, I decided to write a bit of Emacs Lisp code to help the
readers.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Emacs Lisp Time&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-lisp-time&quot;&gt;Emacs Lisp Time&lt;/h2&gt;

&lt;p&gt;The core functionality is updating the filename and front matter to
have the same publish date and title. It can be broken down into three
parts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;When called, it prompts for a new title. The publish date is set to
the date on which the function is called.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It renames the current blog post file with the new date and title.
It also updates the title and date fields in the front matter
accordingly.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It deletes the old file, closes the related buffer, and opens the
new file so I can continue to work on it.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;My Emacs Lisp coding skill is rusty but I managed to get it working in
less than 2 hours.  I won’t say it looks beautiful, but it does the
job!&lt;/p&gt;

&lt;p&gt;I spent a bit of time debugging. It turns out that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(org-show-all)&lt;/code&gt;
needs to be called first to unfold the org file; otherwise, editing
with some parts of the content hidden can lead to unexpected results.&lt;/p&gt;

&lt;p&gt;I have always found working with filenames/directories in vanilla Emacs
Lisp cumbersome. I wonder if there is any modern Lisp library with a
better API, something like Python’s &lt;em&gt;pathlib&lt;/em&gt; module?&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;Code&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;code&quot;&gt;Code&lt;/h2&gt;

&lt;p&gt;Here are the main functions in case someone needs something similar.
They are extracted from &lt;a href=&quot;https://github.com/yitang/.emacs.d/blob/main/config.org#update-post-title-and-date&quot;&gt;my Emacs configuration&lt;/a&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt; 
 &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-update-post-name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Update the post filename with a new title and today&apos;s date.

Also update the front matter.&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;title&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;read-string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;new title: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ext&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-extension&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-file-name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;;; as of now, the ext is always .org.&lt;/span&gt;

         &lt;span class=&quot;c1&quot;&gt;;; the new filename is in the format of {date}-{new-title}.org&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;filename&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;concat&lt;/span&gt;
                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;format-time-string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;%Y-%m-%d-&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;file-name-with-extension&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;jekyll-make-slug&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;

         &lt;span class=&quot;c1&quot;&gt;;; normalise the filename. &lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;filename&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;expand-file-name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

         &lt;span class=&quot;c1&quot;&gt;;; keep the current point which we will go back to after editing.&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;old-point&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;rename-file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;buffer-file-name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;;; update the filename&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kill-buffer&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;;; kill the current buffer, i.e. the old file.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;find-file&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;;; open the new file.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;set-window-point&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;selected-window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;old-point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;;; set the cursor to where i was in the old file.&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; update the title field. &lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;;; note jekyll-yaml-escape is called to ensure the title field is yaml friendly.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;yt/jekyll-update-frontmatter--title&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;jekyll-yaml-escape&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;    
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;defun&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;yt/jekyll-update-frontmatter--title&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Update the title field in the front matter.

The new title is converted to title case.
&quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let*&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;old-point&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; expand all the code blocks/headers/drawers before editing.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-show-all&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; go to the first occurrence of &apos;title:&apos;.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;goto-char&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;point-min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;search-forward&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;title: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; update the title field with the new title.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;beginning-of-line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kill-line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;title: %s&quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; ensure the title is in title case&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;xah-title-case-region-or-line&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;line-beginning-position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;line-end-position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;;; save and reset cursor back to where it started.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;save-buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;    
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;goto-char&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;old-point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
 &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

</content>
 </entry>
 
 <entry>
   <title>How Much Does Threadripper 3970x Help in Training LightGBM Models?</title>
   <link href="http://yitang.uk/2023/12/13/trx4-3970x/"/>
   <updated>2023-12-13T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/12/13/trx4-3970x</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h1 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h1&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#org397676a&quot;&gt;Experiment Set up&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#orgb7ddd20&quot;&gt;i5-13600k - Efficient Cores Count&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org0238b9f&quot;&gt;3970x - Disappointing Results&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org59a1ed6&quot;&gt;i5 vs. 3970x - Training in Parallel&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org6917deb&quot;&gt;CPU vs. GPU  - Impressive Performance&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org76dedfb&quot;&gt;Is the 3970x worth it?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Back in my Kaggle days, I always wondered how much my ranking could
improve with a better computer. I finally pulled the trigger (twice)
and got myself a 32-core Threadripper 3970x workstation.&lt;/p&gt;

&lt;p&gt;Before I can tell whether it helps in Kaggle competitions, I thought
it would be interesting to quantify how much benefit I get from
upgrading from the i5-13600k to the 3970x when training LightGBM models.&lt;/p&gt;

&lt;p&gt;The TLDR is:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;On the CPU, the 3970x is about 3 times faster than the i5-13600k
when training several LightGBM models in parallel.&lt;/li&gt;
  &lt;li&gt;To my surprise, training on a GTX 1080Ti GPU is about 2 times faster than
on the i5-13600k.&lt;/li&gt;
  &lt;li&gt;There are no obvious gains from GTX 1080Ti to RTX 3080.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a id=&quot;org397676a&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;experiment-set-up&quot;&gt;Experiment Set up&lt;/h1&gt;

&lt;p&gt;I use the data in the &lt;a href=&quot;https://www.kaggle.com/competitions/optiver-trading-at-the-close&quot;&gt;Optiver - Trading At The Close&lt;/a&gt;
competition. There are about 500,000 rows and 100 features. I train a
3-fold (expanding window) LightGBM model. I repeat the same process
with a varying number of cores to get a performance graph like
this:&lt;/p&gt;

&lt;!-- &lt;figure class=&quot;image&quot;&gt; --&gt;
&lt;!--   &lt;figcaption&gt;&lt;/figcaption&gt; --&gt;
&lt;!--   &lt;img src=&quot;&quot; alt=&quot;&quot; align=&quot;center&quot;&gt; --&gt;

&lt;!-- &lt;/figure&gt; --&gt;

&lt;!-- https://talk.jekyllrb.com/t/need-help-with-image-caption/6715/15 --&gt;
&lt;!-- &lt;figure --&gt;
&lt;p align=&quot;center&quot;&gt;
    &lt;!-- style=&quot; --&gt;
    &lt;!--         &lt;\!-- padding: 10px; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-top: 1px solid #999; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-right: 2px solid #555; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-bottom: 2px solid #555; -\-&gt; --&gt;
    &lt;!--         &lt;\!-- border-left: 1px solid #999; -\-&gt; --&gt;
    &lt;!--       &quot; --&gt;

  &lt;br /&gt;
  &lt;em&gt; Threadripper 3970x vs i5-13600k: Train LightGBM Models on CPU &lt;/em&gt;
  
  &lt;img src=&quot;/assets/temp.png&quot; alt=&quot;&quot; class=&quot;img-class&quot; width=&quot;&quot; align=&quot;center&quot; /&gt;
  &lt;!-- &lt;figcaption --&gt;
  &lt;!--   style=&quot;text-align: center;&quot; --&gt;
  &lt;!--   &gt; --&gt;
  &lt;!--   &lt;sup&gt;&lt;em&gt; Threadripper 3970x vs i5-13600k: Train LightGBM Models on CPU &lt;/em&gt;&lt;/sup&gt; --&gt;
  &lt;!-- &lt;/figcaption&gt; --&gt;
&lt;/p&gt;
&lt;!-- &lt;/figure&gt; --&gt;

&lt;p&gt;&lt;a id=&quot;orgb7ddd20&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;i5-13600k---efficient-cores-count&quot;&gt;i5-13600k - Efficient Cores Count&lt;/h1&gt;

&lt;p&gt;The i5-13600k has 6 performance cores and 8 efficient cores. In
practice, I never use more than 6 cores in training ML models. My
theory is mixing fast performance and slow efficient cores leads to a
worse performance than using the performance cores alone. By
specifying 6 cores, I assume the OS uses only performance cores.&lt;/p&gt;

&lt;p&gt;The result shows that I was wrong: using more than 6 cores can give a
considerable performance gain. Going from 6 to 14 cores reduces the
runtime by 10 minutes.&lt;/p&gt;

&lt;p&gt;The only plausible explanation is that when training LightGBM with 6
cores, the OS already mixes in efficient cores. Therefore I see an
increase in performance when adding more cores.&lt;/p&gt;

&lt;p&gt;Regardless, I will start using 12 cores in practice.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org0238b9f&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;3970x---disappointing-results&quot;&gt;3970x - Disappointing Results&lt;/h1&gt;

&lt;p&gt;I know the performance gain will not scale linearly with the number of
cores but I wasn’t expecting that adding more cores can slow down the
model training.&lt;/p&gt;

&lt;p&gt;The graph shows the 3970x achieves its best performance when using 12
cores. After that, adding more cores increases the runtime.&lt;/p&gt;

&lt;p&gt;This type of behaviour is usually observed in simple tasks where the
overhead of coordinating between cores outweighs the benefits the extra
cores bring.&lt;/p&gt;

&lt;p&gt;But training thousands of decision trees with half a million data
points is definitely not in this simple task category. So I don’t
understand why this is happening.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org59a1ed6&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;i5-vs-3970x---training-in-parallel&quot;&gt;i5 vs. 3970x - Training in Parallel&lt;/h1&gt;

&lt;p&gt;For 6 cores, it took the i5 51 minutes and the 3970x 42 minutes, a
speedup of about 1.2 times, which is not bad. A similar speed boost is
also observed when using 10 and 12 cores.&lt;/p&gt;

&lt;p&gt;I found this consistent speedup confusing because there’s a mix of
performance and efficient cores in the i5, so in theory every performance
core I add in the 3970x should widen its performance margin over the
i5.&lt;/p&gt;

&lt;p&gt;In general, because of the poor scalability with respect to the number
of cores, the best performance is achieved when training each model with
a small number of cores and running multiple training jobs in parallel. This is
the trick I use to get the extra performance boost for CPU-bound
tasks.&lt;/p&gt;

&lt;p&gt;Here’s the setup for each computer:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;i5-13600k:&lt;/strong&gt; use 6 cores to train each model, and train 2 models in
parallel. There are 2 cores left for OS background activities.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;3970x:&lt;/strong&gt; also use 6 cores to train each model, but train 5 models in
parallel! It also leaves 2 cores for OS background activities.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After a little bit of maths, it takes 14 hours for the 3970x to train 100
models, and 42.8 hours for the i5, so the speedup is 3 times. This is just
an estimate that assumes a model trains as fast in parallel as it does
alone. It would be good to actually run the experiment and see the
actual numbers.&lt;/p&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 1:&lt;/span&gt; Training 100 models in parallel setting.&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col class=&quot;org-left&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;CPU&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Runtime of 1 Model (s)&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;No. Models in Parallel&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;No. Batches&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Total Runtime (h)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;13600k&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;3083&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;50&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;42.8&lt;/td&gt;
&lt;/tr&gt;


&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;3970x&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2523&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;5&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;20&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;14.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
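
&lt;p&gt;The arithmetic behind Table 1 can be sketched in a few lines of
Python. This is only a rough sketch, not a benchmark: the function name
total_hours is made up for illustration, and it assumes each model takes
the single-model 6-core runtime even when several models are trained side
by side.&lt;/p&gt;

```python
import math


def total_hours(runtime_s, n_parallel, n_models=100):
    """Estimated wall-clock hours to train n_models, n_parallel at a time.

    Assumption: running models side by side does not slow any of them down.
    """
    n_batches = math.ceil(n_models / n_parallel)
    return n_batches * runtime_s / 3600


i5_hours = total_hours(3083, n_parallel=2)  # 50 batches on the i5-13600k
tr_hours = total_hours(2523, n_parallel=5)  # 20 batches on the 3970x

print(f"i5-13600k: {i5_hours:.1f} h")
print(f"3970x:     {tr_hours:.1f} h")
print(f"speedup:   {i5_hours / tr_hours:.2f}x")
```

&lt;p&gt;If the parallel runs compete for memory bandwidth or cache, the real
totals would be higher than this estimate.&lt;/p&gt;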

&lt;p&gt;So the most benefit I can get from 3970x is in running multiple
experiments in parallel!&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org6917deb&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;cpu-vs-gpu----impressive-performance&quot;&gt;CPU vs. GPU  - Impressive Performance&lt;/h1&gt;

&lt;p&gt;I have a GTX 1080Ti in my i5 PC for running deep learning models and
CUDA code.  I never use it for LightGBM because the GPU implementation
was slower than the CPU in 2019 when I tried it.&lt;/p&gt;

&lt;p&gt;In the summer, Guolin Ke, the author of LightGBM, promised a significant
improvement in GPU performance when he was looking for volunteers to
work on improving LightGBM’s GPU algorithm.&lt;/p&gt;

&lt;p&gt;Since I have the experiments set up already, it took me little time to
repeat the same experiments using the GPU trainer. All I did was add
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;device_type=&apos;gpu&apos;&lt;/code&gt; to the configuration files.&lt;/p&gt;
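
&lt;p&gt;As a minimal sketch of that one-line change, assuming the training
parameters live in a Python dict: the helper with_gpu and the placeholder
settings below are hypothetical and not taken from my configuration files.
Only the device_type parameter comes from the change described above.&lt;/p&gt;

```python
def with_gpu(params):
    """Return a copy of a LightGBM parameter dict that requests the GPU trainer."""
    gpu_params = dict(params)  # copy, so the CPU settings stay untouched
    gpu_params["device_type"] = "gpu"
    return gpu_params


# Placeholder settings for illustration; not the ones used in this experiment.
cpu_params = {"objective": "regression", "num_threads": 6}
gpu_params = with_gpu(cpu_params)

print(gpu_params)
```

&lt;p&gt;The resulting dict would then be passed to the trainer in place of the
CPU one. Depending on how LightGBM was installed, a GPU-enabled build may
be needed for this setting to take effect.&lt;/p&gt;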

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 2:&lt;/span&gt; Runtime of training a single model (in seconds)&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;# CPU Cores&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;i5-13600k&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;tr-3970x&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;GTX 1080ti&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;RTX 3080&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;6&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;3083&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2523&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1435&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1256&lt;/td&gt;
&lt;/tr&gt;


&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;10&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2695&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1940&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1269&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1147&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The result shocks me: I get a 2 times speedup just by switching from
the i5 to the 1080Ti with one additional line in the config, and it
outperforms the 3970x by a big margin when training a single model!&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org76dedfb&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;is-the-3970x-worth-it&quot;&gt;Is the 3970x worth it?&lt;/h1&gt;

&lt;p&gt;I found myself asking this question after seeing the results. In the
context of this experiment, no, it makes no sense to spend £2,000 to
get a 3 times speedup when I can simply switch to the 1080Ti and get a
2 times speedup at no cost.&lt;/p&gt;

&lt;p&gt;However, the reason I went for the Threadripper and the TRX40 platform
is the 64 PCIe 4.0 lanes from the CPU. The workstation is capable of running 4
GPUs at the same time at full capability, whereas the i5 can only run 1
GPU.&lt;/p&gt;

&lt;p&gt;If I had 4 RTX 3080s installed, it would finish training 100 models in
just under 8 hours! That’s more than a 5 times speedup over the i5 and
about a 1.75 times speedup over the 3970x in the parallel setting.&lt;/p&gt;
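
&lt;p&gt;This estimate can be reproduced with a rough sketch. It assumes one
model per GPU with four GPUs running in parallel, and it takes the
1147-second RTX 3080 runtime from the 10-core row of Table 2. Which row
the estimate is based on is an assumption here, and the variable names are
illustrative only.&lt;/p&gt;

```python
import math

N_MODELS = 100
N_GPUS = 4                 # one model per GPU (assumption)
RTX_3080_RUNTIME_S = 1147  # Table 2, 10-core row (assumed basis of the estimate)

n_batches = math.ceil(N_MODELS / N_GPUS)
gpu_hours = n_batches * RTX_3080_RUNTIME_S / 3600

print(f"4 x RTX 3080:      {gpu_hours:.2f} h")
print(f"vs i5 (42.8 h):    {42.8 / gpu_hours:.2f}x")
print(f"vs 3970x (14.0 h): {14.0 / gpu_hours:.2f}x")
```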

&lt;p&gt;This calculation is not just for entertainment. It turns out that
utilising multiple GPUs to train gradient-boosted trees can be a really
big thing!&lt;/p&gt;

&lt;p&gt;I just found another reason to buy more GPUs! :)&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>State of This Blog</title>
   <link href="http://yitang.uk/2023/03/28/state-of-this-blog/"/>
   <updated>2023-03-28T00:00:00+01:00</updated>
   <id>http://yitang.uk/2023/03/28/state-of-this-blog</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h3 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#orge6c6bf5&quot;&gt;Technical Debt&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org12d0b47&quot;&gt;Revisit the Tech Stack&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#orgea394ab&quot;&gt;Long Ride with Jekyll&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This static blog was built using &lt;a href=&quot;https://jekyllrb.com/&quot;&gt;Jekyll&lt;/a&gt; in 2014. It has survived
for 7 years, which is a success when it comes to personal blogging. Part of
the reason is having a good blogging workflow: write posts in Org
Mode, export to HTML with a front matter, build the site using Jekyll,
send the folder to an Amazon S3 bucket, and that’s it. All done in
Emacs of course.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;orge6c6bf5&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;technical-debt&quot;&gt;Technical Debt&lt;/h3&gt;

&lt;p&gt;I added a few things to the workflow to enhance the reading experience,
including code highlighting, centred images with captions, a table of
contents, etc. There are more features I want to add but at the same
time, I want to be able to just write.&lt;/p&gt;

&lt;p&gt;With that mindset, whenever there are issues, I apply quick fixes
without a deep understanding of the actual causes. It seemed efficient
until recently, when some fixes became counter-productive.&lt;/p&gt;

&lt;p&gt;I started seeing underscores (_) exported as \_ and &amp;lt;p&amp;gt; tags
appearing in code snippets. These all sound like quick fixes, but I just
couldn’t get them right after a few hours. For the last few posts, I had
to fix them manually in each read-edit-export-fix iteration.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org12d0b47&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;revisit-the-tech-stack&quot;&gt;Revisit the Tech Stack&lt;/h3&gt;

&lt;p&gt;I have an ambitious goal for this blog, so it is time to clean up
properly. I studied the technologies used for this blog: Jekyll, AWS and
Org Mode exporting. It was a good chance to practise Org-roam for
taking atomic notes. The time was well spent as I learnt a lot.&lt;/p&gt;

&lt;p&gt;I was impressed I got the whole thing up and running 7 years ago. I
don’t think I have the willpower to do it now.&lt;/p&gt;

&lt;p&gt;Still, there are a lot of things that I do not understand well,
e.g. the Liquid templates, HTML and CSS tags etc. The syntax just
puts me off.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;orgea394ab&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;long-ride-with-jekyll&quot;&gt;Long Ride with Jekyll&lt;/h3&gt;

&lt;p&gt;I prefer a simple format like Org Mode or Markdown and would rather not
deal with HTML/CSS at all. There have been a couple of occasions when I
could not resist the temptation to look for an alternative to
Jekyll. I had no luck in the search. It seems HTML is the only way
because it is native to the web.&lt;/p&gt;

&lt;p&gt;So the plan is to stick with Jekyll for at least a few years. In the
next couple of weeks, I’ll try to fix all the issues and, after that,
gradually add more features to enhance the writing and reading
experience.&lt;/p&gt;

&lt;p&gt;I hope people who use a similar tech stack (Org-mode, Emacs,
Jekyll, AWS) can benefit from my work.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Setup Emacs Servers in MacOS</title>
   <link href="http://yitang.uk/2023/02/09/emacs-as-service-in-macos/"/>
   <updated>2023-02-09T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/02/09/emacs-as-service-in-macos</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;#org8dff1c7&quot;&gt;Emacs Server Configuration&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org503e229&quot;&gt;Launch Emacs GUI in Terminal&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#orged355bc&quot;&gt;Launch Emacs GUI in Spotlight&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I switched to MacOS last year for editing home gym videos. I was and
am still amazed by how fast the M1 chip is at exporting 4K
videos. MacOS also enriched the Emacs experience, which deserves a
blog post of its own.&lt;/p&gt;

&lt;p&gt;So I have been slowly adapting my Emacs configuration and workflow to
MacOS. One of the changes is the Emacs server.&lt;/p&gt;

&lt;p&gt;The goal is to have fully loaded Emacs instances running all the time
so I can use them at any time and anywhere, in Terminal or Spotlight. They are
started upon login. If Emacs crashes (it is rare but more
often than I like) or I have to stop the servers because I messed up the
configuration, they restart automatically.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org8dff1c7&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;emacs-server-configuration&quot;&gt;Emacs Server Configuration&lt;/h2&gt;

&lt;p&gt;I have this setup in Linux using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;systemd&lt;/code&gt;, as in my &lt;a href=&quot;http://yitang.uk/2021/06/18/managing-emacs-server-as-systemd-service/&quot;&gt;previous blog
post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In MacOS, launchd is the service manager and &lt;a href=&quot;https://ss64.com/osx/launchctl.html&quot;&gt;launchctl&lt;/a&gt; is its command-line
interface to list, start and stop services.&lt;/p&gt;

&lt;p&gt;To build an Emacs server, create a plist file in the ~/Library/LaunchAgents
folder. In my case, I named it emacs_work.plist.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; 1  # cat ~/Library/LaunchAgents/emacs_work.plist
 2  &amp;lt;plist version=&quot;1.0&quot;&amp;gt;
 3    &amp;lt;dict&amp;gt;
 4      &amp;lt;key&amp;gt;Label&amp;lt;/key&amp;gt;
 5      &amp;lt;string&amp;gt;emacs_work&amp;lt;/string&amp;gt;
 6      &amp;lt;key&amp;gt;ProgramArguments&amp;lt;/key&amp;gt;
 7      &amp;lt;array&amp;gt;
 8        &amp;lt;string&amp;gt;/opt/homebrew/opt/emacs-plus@31/bin/emacs&amp;lt;/string&amp;gt;
 9        &amp;lt;string&amp;gt;--fg-daemon=work&amp;lt;/string&amp;gt;
10        &amp;lt;string&amp;gt;--init-directory=~/.config/emacs/emacs.d_v31&amp;lt;/string&amp;gt;
11      &amp;lt;/array&amp;gt;
12      &amp;lt;key&amp;gt;RunAtLoad&amp;lt;/key&amp;gt;
13      &amp;lt;true/&amp;gt;
14      &amp;lt;key&amp;gt;KeepAlive&amp;lt;/key&amp;gt;
15      &amp;lt;true/&amp;gt;    
16      &amp;lt;key&amp;gt;StandardOutPath&amp;lt;/key&amp;gt;
17      &amp;lt;string&amp;gt;/tmp/emacs_work.stdout.log&amp;lt;/string&amp;gt;
18      &amp;lt;key&amp;gt;StandardErrorPath&amp;lt;/key&amp;gt;
19      &amp;lt;string&amp;gt;/tmp/emacs_work.stderr.log&amp;lt;/string&amp;gt;
20    &amp;lt;/dict&amp;gt;
21  &amp;lt;/plist&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It is an extension of &lt;a href=&quot;https://github.com/d12frosted/homebrew-emacs-plus&quot;&gt;Emacs Plus’&lt;/a&gt; plist file. I made a few changes for
running two Emacs servers: one for work (data sciences, research) and
one for personal usage (GTD, books). Taking the “work” server as an
example, the important attributes of the plist configuration file are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Line 5:&lt;/strong&gt; The unique service name to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;launchctl&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Line 8:&lt;/strong&gt; The full path to the Emacs program. In my case, it is
/opt/homebrew/opt/emacs-plus@31/bin/emacs&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Line 9:&lt;/strong&gt; The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--fg-daemon&lt;/code&gt; option sets the Emacs server name to
“work”. Later I can connect to this server by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-s work&lt;/code&gt;
option to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;emacsclient&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Line 14:&lt;/strong&gt; &lt;em&gt;KeepAlive&lt;/em&gt; is set to true so the server is
restarted in case of failures&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Line 16 and 18:&lt;/strong&gt; The location of standard output and error
files. They are used to debug. Occasionally I have to check those
files to see why Emacs servers stopped working, usually because of
me introducing bugs in &lt;a href=&quot;https://github.com/yitang/.emacs.d&quot;&gt;my .emacs.d&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
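
&lt;p&gt;Since the two servers differ only in their names, the same structure
could also be generated with a script. The following is a minimal sketch
using Python’s standard plistlib module. The helper build_plist is
hypothetical and not part of my setup, and it only returns the plist
content without writing any file. Unlike the listing above, plistlib also
emits the XML header and DOCTYPE lines.&lt;/p&gt;

```python
import plistlib


def build_plist(name, emacs_bin, init_dir):
    """Return launchd plist bytes for an Emacs server called `name`."""
    label = f"emacs_{name}"
    config = {
        "Label": label,
        "ProgramArguments": [
            emacs_bin,
            f"--fg-daemon={name}",
            f"--init-directory={init_dir}",
        ],
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": f"/tmp/{label}.stdout.log",
        "StandardErrorPath": f"/tmp/{label}.stderr.log",
    }
    return plistlib.dumps(config)


work_plist = build_plist(
    "work",
    "/opt/homebrew/opt/emacs-plus@31/bin/emacs",
    "~/.config/emacs/emacs.d_v31",
)
print(work_plist.decode())
```

&lt;p&gt;Calling the helper with a different name would give the content for a
second server; the init directory to pass for it is up to you.&lt;/p&gt;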

&lt;p&gt;With the updated plist files in place, I start the Emacs servers with&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
launchctl load &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; ~/Library/LaunchAgents/emacs_work.plist
launchctl load &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; ~/Library/LaunchAgents/emacs_org.plist&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;launchctl list | grep -i emacs&lt;/code&gt; is a handy snippet that lists the
status of the services whose name includes “emacs”. The output I have
right now is&lt;/p&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-right&quot; /&gt;

&lt;col class=&quot;org-left&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;PID&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Exit Code&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Server ID&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;1757&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;emacs_org&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-right&quot;&gt;56696&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;emacs_work&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;It shows both Emacs servers are running fine with exit code 0.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org503e229&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;launch-emacs-gui-in-terminal&quot;&gt;Launch Emacs GUI in Terminal&lt;/h2&gt;

&lt;p&gt;I can now open an Emacs GUI and connect it to the “work” Emacs server
by running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;emacsclient -c -s work &amp;amp;&lt;/code&gt;. The &lt;em&gt;-c&lt;/em&gt; option
tells &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;emacsclient&lt;/code&gt; to create a new frame.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;orged355bc&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;launch-emacs-gui-in-spotlight&quot;&gt;Launch Emacs GUI in Spotlight&lt;/h2&gt;

&lt;p&gt;In MacOS, I find it natural to open applications using Spotlight:
press ⌘ + space to invoke Spotlight, type “work” in the
search bar, which narrows the search down to the “emacsclient_work” application,
and hit return to launch it. It achieves the same thing as
the command above but can be used anywhere.&lt;/p&gt;

&lt;p&gt;I uploaded a &lt;a href=&quot;https://www.youtube.com/watch?v=N456qZWymQU&quot;&gt;demo video&lt;/a&gt; on YouTube to show it in action. You might want
to watch it at 0.5x speed because I typed so fast…&lt;/p&gt;

&lt;p&gt;To implement this shortcut, open the “Automator” application, start a new
“Application”, select “Run Shell Script”, and paste the following bash
code&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt; 
/opt/homebrew/opt/emacs-plus@31/bin/emacsclient &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--no-wait&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--quiet&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--suppress-output&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--create-frame&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; work &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$@&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;and save it as &lt;em&gt;emacsclient_work&lt;/em&gt; in the ~/Applications
folder.&lt;/p&gt;

&lt;p&gt;Essentially, the bash script above is wrapped up as a MacOS
application named &lt;em&gt;emacsclient_work&lt;/em&gt;, and Spotlight searches the
Applications folder by default.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Speed Up Sparse Boolean Data</title>
   <link href="http://yitang.uk/2023/01/05/low-latency-sparse-boolean-array/"/>
   <updated>2023-01-05T00:00:00+00:00</updated>
   <id>http://yitang.uk/2023/01/05/low-latency-sparse-boolean-array</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#org2bb7dca&quot;&gt;Imaging On-the-fly&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org2b75ad0&quot;&gt;Scipy Sparse Matrix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org1e75759&quot;&gt;Numpy Bites&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org1c0f8fd&quot;&gt;Problems of Having Millions of Files&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;
I’m working on replicating the &lt;a href=&quot;https://deliverypdf.ssrn.com/delivery.php?ID=353112008124002077094005080117000024116009029087084092125004002022016102006095106098020124039040057048105001001001110100020000122033066059014117091120117101078117010053032060068104112097090097091095012089117085024111093117095004119031100008079065071099&amp;amp;EXT=pdf&amp;amp;INDEX=TRUE&quot;&gt;(Re-)Imag(in)ing Price Trends paper&lt;/a&gt; -
the idea is to train a Convolutional Neural Network (CNN) &quot;trader&quot; to
predict stock returns. What makes this paper interesting is that the
model uses images of the pricing data rather than the traditional
time-series format. It takes financial charts like the one below
and tries to mimic traders&apos; behaviour, buying and selling stocks to
optimise future returns.
&lt;/p&gt;


&lt;p&gt;
  &lt;p align=&quot;center&quot;&gt;
  &lt;br&gt;
  &lt;em&gt; Alphabet 5-Day Bar Chart Showing OHLC Price and Volume Data &lt;/em&gt;
  
  &lt;img
    src=&quot;/assets/dfa367394d6ccf04f018e3eb806297ab1e9a89aa.png&quot;
    alt=&quot;&quot;
    class=&quot;img-class&quot;
    width=&quot;&quot;
    align=&quot;center&quot;
  /&gt;
&lt;/p&gt;
&lt;/p&gt;


&lt;p&gt;
I like this idea, so it became my final assignment for the &lt;a href=&quot;https://dlsyscourse.org/&quot;&gt;Deep Learning
Systems: Algorithms and Implementation&lt;/a&gt; course.
&lt;/p&gt;

&lt;div id=&quot;outline-container-org2bb7dca&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org2bb7dca&quot;&gt;Imaging On-the-fly&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org2bb7dca&quot;&gt;
&lt;p&gt;
To train the model, the price and volume data are transformed into
black-and-white images, each of which is just a 2D matrix of 0s and 1s.
The pricing history of only around 100 stocks already yields around
1.2 million images in total.
&lt;/p&gt;

&lt;p&gt;
I used an on-the-fly imaging process during training: for each image in a
batch, it loads the pricing data for a given stock, samples one day in the
history, slices a chunk of pricing data, and then converts it to an image. It
takes about 2 milliseconds (ms) to do all that, so in total it takes about 40
minutes to loop through all 1.2 million images.
&lt;/p&gt;

&lt;p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;o&quot;&gt;%%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt; 
&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MarketData&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DATA_DIR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;GOOGL&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;imager&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ImagingOHLCV&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img_resolution&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;price_prop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;price_prop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;imager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
1.92 ms ± 26.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
&lt;/pre&gt;


&lt;p&gt;
To train 10 epochs, that&apos;s about 400 minutes spent loading data. To train one
epoch on the full dataset with 5,000 stocks, 50 times as many, that&apos;s around
2,000 minutes in loading data alone!
&lt;/p&gt;
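
&lt;p&gt;
The back-of-the-envelope estimate can be reproduced in a few lines. This is
only a sketch: the 1.92 ms figure is the timing measured above, while the
helper name loading_minutes and the assumption that loading time scales
linearly with the number of images are mine.
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
```python
def loading_minutes(n_images, ms_per_image, epochs=1):
    """Estimate data-loading time in minutes, assuming a constant cost per image."""
    return n_images * ms_per_image / 1000 / 60 * epochs


# 1.2 million images at the measured 1.92 ms each, for one epoch.
per_epoch = loading_minutes(1_200_000, 1.92)
print(f"one epoch: {per_epoch:.1f} min")
print(f"ten epochs: {loading_minutes(1_200_000, 1.92, epochs=10):.0f} min")
```
&lt;/pre&gt;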

&lt;p&gt;
PyTorch uses multiple worker processes to load data on the CPU
while training on the GPU, so the problem is less severe there. But I&apos;m using
&lt;a href=&quot;https://github.com/dlsyscourse/hw4/tree/main/python/needle&quot;&gt;needle&lt;/a&gt;, the deep learning framework we developed during the
course, and it does not have this functionality yet.
&lt;/p&gt;

&lt;p&gt;
During training with needle, the GPU utilisation is only around
50%. Now that the components of the end-to-end pipeline are almost complete,
it is time to train with more data, go deeper (a larger or more complicated
model), try hyper-parameter tuning etc.
&lt;/p&gt;

&lt;p&gt;
But before moving to the next stage, I need to improve the IO.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;div id=&quot;outline-container-org2b75ad0&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org2b75ad0&quot;&gt;Scipy Sparse Matrix&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org2b75ad0&quot;&gt;
&lt;p&gt;
In the image above, there are a lot of black pixels or zeros in the data
matrix. In general only 5%-10% of pixels are white in this dataset.
&lt;/p&gt;

&lt;p&gt;
So my first attempt was to use scipy&apos;s sparse matrix instead of numpy&apos;s
dense matrix: I saved the sparse matrix, loaded it, and then converted it
back to a dense matrix for training the CNN model.
&lt;/p&gt;

&lt;p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;o&quot;&gt;%%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img_sparse&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;csr_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;save_npz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/tmp/sparse_matrix.npz&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img_sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img_sparse_2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_npz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/tmp/sparse_matrix.npz&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img_sparse_2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
967 µs ± 4.99 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
&lt;/pre&gt;


&lt;p&gt;
It reduces the IO time to about 1 ms, roughly half. Not bad,
but I was expecting a lot more given how sparse the data is.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org1e75759&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org1e75759&quot;&gt;Numpy Bites&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org1e75759&quot;&gt;
&lt;p&gt;
Then I realised the data behind the images is just 0s and 1s, so each
pixel carries only one bit of information. There is no need to spend a
whole byte or more on every pixel: eight pixels fit into a single byte.
&lt;/p&gt;

&lt;p&gt;
It is so simple that numpy has functions for this type of data
processing already.  The &lt;a href=&quot;https://numpy.org/doc/stable/reference/generated/numpy.packbits.html&quot;&gt;numpy.packbits&lt;/a&gt; function packs the image
matrix of 0s and 1s into a 1D array of bytes, eight pixels per byte.
Then &lt;a href=&quot;https://numpy.org/doc/stable/reference/generated/numpy.unpackbits.html&quot;&gt;numpy.unpackbits&lt;/a&gt; does the inverse: it expands each byte
back into eight 0/1 values, which are then reshaped into the image matrix.
&lt;/p&gt;
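
&lt;p&gt;
To make the packing concrete, here is a pure-Python sketch of what packing
and unpacking bits amounts to. The helpers pack_bits and unpack_bits are
hypothetical names written for illustration, not numpy functions; they
assume the same conventions as numpy&apos;s defaults, namely big-endian bit
order and zero-padding of the last byte.
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
```python
def pack_bits(bits):
    """Pack a flat sequence of 0/1 values into bytes, eight bits per byte.

    Uses big-endian bit order and zero-pads the final byte.
    """
    padded = list(bits) + [0] * (-len(bits) % 8)
    return bytes(
        int("".join(str(b) for b in padded[i:i + 8]), 2)
        for i in range(0, len(padded), 8)
    )


def unpack_bits(packed, count):
    """Expand bytes back into a list of 0/1 values and drop the padding."""
    bits = []
    for byte in packed:
        bits.extend(int(ch) for ch in format(byte, "08b"))
    return bits[:count]


pixels = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]  # 11 pixels fit into 2 bytes
packed = pack_bits(pixels)
restored = unpack_bits(packed, count=len(pixels))
```
&lt;/pre&gt;

&lt;p&gt;
Because the packed form is padded to a whole number of bytes, the original
length has to be known when unpacking. Reshaping the unpacked array directly
works when the pixel count is a multiple of 8; otherwise numpy.unpackbits
accepts a count argument to trim the padding.
&lt;/p&gt;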

&lt;p&gt;
This process reduces the time of loading one image to 0.2
milliseconds. That&apos;s 10 times faster than the on-the-fly method, with
only a few lines of code.
&lt;/p&gt;

&lt;p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;o&quot;&gt;%%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeit&lt;/span&gt; 
&lt;span class=&quot;n&quot;&gt;temp_file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;/tmp/img_np_bites.npy&quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;packbits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;astype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uint8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;save&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temp_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temp_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unpackbits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reshape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img_np_bites&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
194 µs ± 3.95 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
&lt;/pre&gt;


&lt;p&gt;
Another benefit is that the file size is much smaller: 188 bytes
compared to 1104 bytes for the sparse matrix. So it takes only about 226MB of
disk space to save 1.2 million images!
&lt;/p&gt;

&lt;p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/tmp/img_np_bites.npy&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;st_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;/tmp/sparse_matrix.npz&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;st_size&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
188, 1104
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;div id=&quot;outline-container-org1c0f8fd&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org1c0f8fd&quot;&gt;Problems of Having Millions of Files&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org1c0f8fd&quot;&gt;
&lt;p&gt;
It takes a couple of minutes to generate 1.2 million files on my Debian
machine. It is so quick! But then I realised this approach is not
scalable without modification because the filesystem can hold only a
limited number of files. Each file uses an &lt;a href=&quot;https://en.wikipedia.org/wiki/Inode&quot;&gt;inode&lt;/a&gt;, and according
to &lt;a href=&quot;https://unix.stackexchange.com/questions/26598/how-can-i-increase-the-number-of-inodes-in-an-ext4-filesystem&quot;&gt;this StackExchange question&lt;/a&gt;, once an ext4 filesystem is created, one
cannot increase the inode limit (yes, I was there).
&lt;/p&gt;
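
&lt;p&gt;
Before generating millions of small files, it may be worth checking how many
inodes are left. Below is a minimal sketch using os.statvfs from the Python
standard library, which is available on Unix-like systems only. The helper
name inode_usage is mine, and some filesystems may report 0 for these
fields.
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
```python
import os


def inode_usage(path="/"):
    """Return (total, free) inode counts for the filesystem containing path."""
    stats = os.statvfs(path)  # Unix-only call
    return stats.f_files, stats.f_ffree


total, free = inode_usage("/")
print(f"inodes: {total} total, {free} free")
```
&lt;/pre&gt;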

&lt;p&gt;
Without going down the database route, one quick workaround is to
bundle the images together, for example, 256 images in one file. Later,
during training, I load 256 images in one go and then split them into
chunks. I just need to ensure the number of images per file is a multiple of
the batch size used in training so I don&apos;t have to deal with unequal batch
sizes. Since bundled images are always trained together, bundling reduces the
randomness of &lt;a href=&quot;https://en.wikipedia.org/wiki/Stochastic_gradient_descent&quot;&gt;SGD&lt;/a&gt;, so I won&apos;t bundle too many images together; 256
sounds about right.
&lt;/p&gt;
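
&lt;p&gt;
The bundling idea can be sketched as follows. This is not the code used in
the project: the helper names, the raw concatenated file layout, and the
60-byte record size are all assumptions made for illustration. It relies on
every packed image having the same size in bytes.
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
```python
import tempfile
from pathlib import Path


def write_bundle(path, records):
    """Concatenate equally sized byte records (e.g. packed images) into one file."""
    if len({len(r) for r in records}) != 1:
        raise ValueError("records must be non-empty and all the same size")
    Path(path).write_bytes(b"".join(records))


def read_bundle(path, record_size):
    """Read a bundle back and split it into records of record_size bytes."""
    data = Path(path).read_bytes()
    if len(data) % record_size:
        raise ValueError("file size is not a multiple of record_size")
    return [data[i:i + record_size] for i in range(0, len(data), record_size)]


# 256 fake 60-byte "images" stored in a single file instead of 256 files.
records = [bytes([i]) * 60 for i in range(256)]
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "bundle_000.bin"
    write_bundle(bundle, records)
    restored = read_bundle(bundle, record_size=60)
```
&lt;/pre&gt;

&lt;p&gt;
With 256 images per file, 1.2 million images need fewer than 5,000 files.
&lt;/p&gt;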

&lt;p&gt;
The &lt;a href=&quot;https://en.wikipedia.org/wiki/Language_Server_Protocol&quot;&gt;LSP&lt;/a&gt; and other tools can cause problems when they are monitoring
folders with a large number of files. Moving them out of the project
folder is the way to go so Emacs won&apos;t complain or freeze.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>PoorMan's CI in Emacs</title>
   <link href="http://yitang.uk/2022/12/16/poor-mans-ci-in-emacs/"/>
   <updated>2022-12-16T00:00:00+00:00</updated>
   <id>http://yitang.uk/2022/12/16/poor-mans-ci-in-emacs</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;
I have been working on the &lt;a href=&quot;https://dlsyscourse.org&quot;&gt;Deep Learning System course&lt;/a&gt;. It is the
hardest course I have studied since university. I never thought
that I would need CI for a personal study project. It just shows how
complex this course is.
&lt;/p&gt;

&lt;p&gt;
Here is the setup: the goal is to develop a PyTorch-like DL library
that supports ndarray ops and autograd, and to implement DL models, LSTM
for example, from scratch. That&apos;s the exciting math part. The tricky
part is that it supports both CPU devices with C++11 and GPU devices with
CUDA. On the user front, the interface is written in Python. I worked
on my M1 laptop most of the time, and switched to my Debian desktop for
the CUDA implementation.
&lt;/p&gt;

&lt;p&gt;
It was a fine Saturday afternoon. I made a breakthrough in implementing
the gradient of Convolution Ops in Python after a couple of hours of
tinkering in a local coffee shop. I rushed home, booted up Debian
to test the CUDA backend, only to find an &quot;illegal memory access&quot;
error!
&lt;/p&gt;

&lt;p&gt;
It took me a few cycles of rolling back to previous changes in git to
find where the problems were.  It made me think about the need for
CI. In the ideal scenario, I would have a CI that automatically runs
the tests on the CPU and CUDA devices to ensure a bug fix on the CPU
side doesn&apos;t introduce new bugs on the CUDA side, and vice versa. But I
don&apos;t have this setup at home.
&lt;/p&gt;

&lt;div id=&quot;outline-container-org654a604&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org654a604&quot;&gt;Two Components of PoorMan CI&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org654a604&quot;&gt;
&lt;p&gt;
So I implemented what I call PoorMan CI. It is a semi-automated
process that gives me some benefits of the full CI. I tried hard to
refrain from doing anything fancy because I don&apos;t have
time. The final homework is due in a few days. The outcome is simple yet
powerful.
&lt;/p&gt;

&lt;p&gt;
The PoorMan CI consists of two parts:
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;&lt;p&gt;
a bunch of bash functions that I can call to run the tests, capture
the outputs, save them in a file, and version control it
&lt;/p&gt;

&lt;p&gt;
For example, wrap the below snippet in a single function
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;pytest &lt;span class=&quot;nt&quot;&gt;-l&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-v&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;not training and cuda&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
       &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log
git add test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;&lt;p&gt;
a log file where I keep track of the code changes, and if the new
change fixes anything, or breaks anything.
&lt;/p&gt;

&lt;p&gt;
In the example below, I have a bullet point for each change committed
to git with a short summary, and a link to the test results. The
&lt;i&gt;fce5edb&lt;/i&gt; and &lt;i&gt;f43d7ab&lt;/i&gt; are the git commit hash values. 
&lt;/p&gt;
&lt;pre class=&quot;example&quot; id=&quot;org520bf91&quot;&gt;
- fix grid setup, from (M, N) to (P, M)!
[[file:test_results/2022_12_11_12_48_44__fce5edb__fast_and_cuda.log]]

- ensure all data/parameters are in the right device. cpu and cuda, all pass! milestone.
[[file:test_results/2022_12_11_13_51_22__f43d7ab__fast_and_cuda.log]]
&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
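
&lt;p&gt;
The log file names above follow a timestamp, commit hash, test group
pattern. As a small illustration of that naming scheme, here is a sketch in
Python; the function name result_log_name is hypothetical, and the hash is
passed in rather than read from git to keep the example self-contained.
&lt;/p&gt;

&lt;pre class=&quot;example&quot;&gt;
```python
from datetime import datetime


def result_log_name(git_hash, group, now=None):
    """Build a 'TIMESTAMP__HASH__GROUP.log' file name for one test run."""
    now = now or datetime.now()
    stamp = now.strftime("%Y_%m_%d_%H_%M_%S")
    return f"{stamp}__{git_hash}__{group}.log"


name = result_log_name("fce5edb", "fast_and_cuda", datetime(2022, 12, 11, 12, 48, 44))
print(name)
```
&lt;/pre&gt;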


&lt;p&gt;
As you can see, it is very simple!
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org0466ec2&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org0466ec2&quot;&gt;Benefits&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org0466ec2&quot;&gt;
&lt;p&gt;
It changed my development cycle a bit: each time before I can claim
something is done or fixed, I run this process which takes about 2
mins for two fast runs. I would use this time to reflect on what I&apos;ve
done so far, write down a short summary about what&apos;s got fixed and
what&apos;s broken, check in the test results to git, update the test log
file etc.
&lt;/p&gt;

&lt;p&gt;
It sounds tedious, but I found myself enjoying doing it. It
gives me confidence and reassurance about the progress I&apos;m making. The
time spent reflecting also gives my brain a break and provides clarity on
where to go next.
&lt;/p&gt;

&lt;p&gt;
During my few hours of using it, it amazes me how easy it is to
introduce new issues while fixing existing ones.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgdc23fad&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgdc23fad&quot;&gt;Implement in Org-mode&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgdc23fad&quot;&gt;
&lt;p&gt;
I don&apos;t have to use Org-mode for this, but I don&apos;t want to leave Emacs
:) Plus, Org-mode shines in literate programming where code and
documentation are put together.
&lt;/p&gt;

&lt;p&gt;
This is actually how I implemented it in the first place. This section
is dedicated to showing how to do it in Org-mode. I&apos;m sure I will come
back to this shortly, so it serves as documentation for myself.
&lt;/p&gt;

&lt;p&gt;
Here is what I did: I have a file called poorman_ci.org, a full
example can be found at &lt;a href=&quot;https://gist.github.com/yitang/7894b6b6c4cf15686c55e6f970e9a9ea&quot;&gt;this gist&lt;/a&gt;. An extract is demonstrated below.
&lt;/p&gt;


&lt;p&gt;
I group all the tests logically into &quot;fast and cpu&quot;, &quot;fast
and cuda&quot;, &quot;slow and cpu&quot;, &quot;slow and cuda&quot;. I have a top-level header
named &lt;i&gt;grouped tests&lt;/i&gt;, and each group has its own 2nd-level header.
&lt;/p&gt;

&lt;p&gt;
The top header has a property drawer where I specify the shell session
within which the tests are run so that
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;org7ed5d70&quot;&gt;
* grouped tests
:PROPERTIES:
:CREATED:  [2022-12-10 Sat 11:32]
:header-args:sh:    :session *hw4_test_runner* :async :results output :eval no
:END:
&lt;/pre&gt;


&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;it is persistent. I can switch to the shell buffer named
&lt;i&gt;hw4_test_runner&lt;/i&gt; and do something if needed&lt;/li&gt;
&lt;li&gt;it runs asynchronously in the background&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
All the shell code blocks under the &lt;i&gt;grouped tests&lt;/i&gt; header inherit those
attributes.
&lt;/p&gt;

&lt;p&gt;
The first code block defines variables that are used to create a run
id. It uses the timestamp and the git commit hash value. The run id is
used for all the code blocks.
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;orgfa7efa7&quot;&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;#+begin_src sh :eval no&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;wd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;./test_results/&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;date&lt;/span&gt; +&lt;span class=&quot;s2&quot;&gt;&quot;%Y_%m_%d_%H_%M_%S&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;git_hash&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;git rev-parse &lt;span class=&quot;nt&quot;&gt;--verify&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--short&lt;/span&gt; HEAD&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;run id: &quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;__&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;git_hash&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#+end_src&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/pre&gt;

&lt;p&gt;
To run the code block, move the cursor inside the code block, and hit &lt;code&gt;C-c
C-c&lt;/code&gt; (control c control c).
&lt;/p&gt;

&lt;p&gt;
Then I define the first code block to run all the tests on CPU except
language model training. I name this batch of tests &quot;fast and cpu&quot;.
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;orga13bdee&quot;&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;c&quot;&gt;#+begin_src sh :var fname=&quot;fast_and_cpu.log&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;fname_full&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;wd&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ts&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;__&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;git_hash&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;__&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fname&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
pytest &lt;span class=&quot;nt&quot;&gt;-l&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-v&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;not language_training and cpu&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
     2&amp;gt;&amp;amp;1 | &lt;span class=&quot;nb&quot;&gt;tee&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;fname_full&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#+end_src&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/pre&gt;


&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;It creates the full path of the test results. The &lt;code&gt;fname&lt;/code&gt; variable
is set in the code block header; this is a nice feature of
Org-mode.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pytest&lt;/code&gt; provides an intuitive interface for filtering tests, here
I use &quot;not language_training and cpu&quot;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;tee&lt;/code&gt; program is used to show the outputs and errors and at the
same time save them to a file.&lt;/li&gt;
&lt;/ol&gt;



&lt;p&gt;
Similarly, I define code blocks for &quot;fast and cuda&quot;, &quot;slow and cpu&quot;,
&quot;slow and cuda&quot;.
&lt;/p&gt;

&lt;p&gt;
So at the end of the development cycle, I open the poorman_ci.org
file, run the code blocks sequentially, and manually update the change
log. That&apos;s all.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Machine Learning in Emacs - Copy Files from Remote Server to Local Machine</title>
   <link href="http://yitang.uk/2022/07/31/mle-copy-files-to-local-machine/"/>
   <updated>2022-07-31T00:00:00+01:00</updated>
   <id>http://yitang.uk/2022/07/31/mle--copy-files-to-local-machine</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt; is a great addition to my Machine Learning workflow in
Emacs&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#org6c487b1&quot;&gt;File Manager GUI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#orgfe31ac5&quot;&gt;Rsync Tool in CLI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#orgf0ebfb2&quot;&gt;Emacs’ Way: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org23d9482&quot;&gt;Setup for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org0566895&quot;&gt;Enhance &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt; with compilation mode&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For machine learning projects, I tweaked my workflow so that
interaction with the remote server is kept to a minimum.  I prefer
to do everything locally on my laptop (M1 Pro) where I have all the
tools for the job to do data analysis, visualisation, debugging etc
and I can do all those without lag or Wi-Fi.&lt;/p&gt;

&lt;p&gt;The only use of servers is running computationally intensive tasks like
recursive feature selection, hyperparameter tuning etc. For that I ssh
to the server, start &lt;a href=&quot;https://en.wikipedia.org/wiki/Tmux&quot;&gt;tmux&lt;/a&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git pull&lt;/code&gt; to update the codebase, and run a
bash script that I prepared locally to fire hundreds of
experiments. All done in Emacs of course thanks to Lukas Fürmetz’s
&lt;a href=&quot;https://github.com/akermu/emacs-libvterm&quot;&gt;vterm&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The only thing left is getting the experiment results back to my
laptop. I used two approaches for copying the data to local: file
manager GUI and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt; tool in CLI.&lt;/p&gt;

&lt;p&gt;Recently I discovered &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;, which works like a charm - it
combines the two approaches above, providing an interactive way of
running the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt; tool in Emacs. What’s more, it integrates
seamlessly into my current workflow.&lt;/p&gt;

&lt;p&gt;They all have their own use cases. In this post, I briefly describe those
three approaches for copying files, with a focus on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;:
how to use it, how to set it up, and my thoughts on how to
enhance it.&lt;/p&gt;

&lt;p&gt;Note that RL stands for remote location, i.e. a folder on a remote
server, and LL stands for local location, the RL’s counterpart. The
action under discussion is how to copy files efficiently from RL to
LL.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org6c487b1&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;file-manager-gui&quot;&gt;File Manager GUI&lt;/h1&gt;

&lt;p&gt;This is the simplest approach and requires little technical skill. The RL
is mounted in the file manager, which acts as an access point, so it can
be used just like a local folder.&lt;/p&gt;

&lt;p&gt;I usually have two tabs open side by side, one for RL and one for LL,
compare the differences, and then copy whatever is useful and exists in
RL but not in LL.&lt;/p&gt;

&lt;p&gt;I used this approach on my Windows work laptop where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt; is not
available so I have to copy files manually.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;orgfe31ac5&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;rsync-tool-in-cli&quot;&gt;Rsync Tool in CLI&lt;/h1&gt;

&lt;p&gt;The &lt;a href=&quot;https://linux.die.net/man/1/rsync&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt;&lt;/a&gt; tool is similar to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cp&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scp&lt;/code&gt; but it is much more
powerful:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It copies files incrementally, so it can be stopped at any time without
losing progress&lt;/li&gt;
  &lt;li&gt;The output shows which files have been copied, which remain, the copying
speed, overall progress etc&lt;/li&gt;
  &lt;li&gt;Files and folders can be included/excluded by specifying
patterns&lt;/li&gt;
&lt;/ol&gt;
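
&lt;p&gt;As a rough illustration of the last two points, a preview run could look like the sketch below. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-n&lt;/code&gt; (dry-run) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--exclude&lt;/code&gt; flags are standard rsync options; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*.tmp&lt;/code&gt; pattern and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;experiment&lt;/code&gt; folder are placeholders chosen for the example, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;debian&lt;/code&gt; is assumed to be an ssh host alias.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# -n: dry run, list what would be copied without transferring anything
# --exclude: skip files matching the (placeholder) pattern
rsync -avhn --exclude=&apos;*.tmp&apos; \
      debian:~/Projects/2022-May/experiment/ \
      ~/Projects/2022-May/experiment&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Dropping &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-n&lt;/code&gt; from the flags performs the actual copy.&lt;/p&gt;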

&lt;p&gt;I have a bash function in the project’s script folder as a shorthand
like this&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;copy_from_debian_to_laptop &lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# first argument to this function&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;folder_to_sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# define where the RL is &lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;remote_project_dir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;debian:~/Projects/2022-May
    &lt;span class=&quot;c&quot;&gt;# define where the LL is &lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;local_project_dir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;~/Projects/2022-May          
    rsync &lt;span class=&quot;nt&quot;&gt;-avh&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--progress&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	  &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;remote_project_dir&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;folder_to_sync&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;/ &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	  &lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;local_project_dir&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;/&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;folder_to_sync&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;To use it, I first &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cd&lt;/code&gt; (change directory) to the project directory
in a terminal, call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;copy_from_debian_to_laptop&lt;/code&gt; function, and use
TAB completion to quickly get the directory I want to copy, for
example&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;copy_from_debian_to_laptop experiment/2022-07-17-FE&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;More often, I call this function from an org-mode file where I keep
track of all the experiments.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;orgf0ebfb2&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;emacs-way-dired-rsync&quot;&gt;Emacs’ Way: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;&lt;/h1&gt;

&lt;p&gt;This approach is a blend of the previous two, letting the user enjoy the
benefits of a GUI for exploring and the power of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;What’s more, it integrates well into my current workflow: I simply
switch from calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-do-copy&lt;/code&gt; to calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;, or
press the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r&lt;/code&gt; key instead of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt; key with the configuration in this
post.&lt;/p&gt;

&lt;p&gt;For those who are not familiar with copying files using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired&lt;/code&gt; in
Emacs, here is the step-by-step process:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Open two &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired&lt;/code&gt; buffers, one at RL and one at LL, either manually
or using &lt;a href=&quot;https://www.gnu.org/software/emacs/manual/html_node/emacs/Bookmarks.html&quot;&gt;bookmarks&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Mark the files/folders to copy in the RL &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired&lt;/code&gt; buffer&lt;/li&gt;
  &lt;li&gt;Press the &lt;em&gt;r&lt;/em&gt; key to invoke &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;It asks where to copy to. The default destination is LL, so press
Enter to confirm.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After that, a unique process buffer, named *rsync with a timestamp
suffix, is created to show the rsync output. I can stop the copying by
killing the process buffer.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org23d9482&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;setup-for-diredrsync&quot;&gt;Setup for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt;&lt;/h1&gt;

&lt;p&gt;The &lt;em&gt;dired-rsync-options&lt;/em&gt; variable controls the output shown in the process
buffer. It defaults to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-az --info=progress2&lt;/code&gt;, which shows the overall
progress on one line, clean and neat (not on macOS though, see &lt;a href=&quot;https://github.com/stsquad/dired-rsync/issues/36&quot;&gt;Issue
36&lt;/a&gt;). Sometimes I prefer &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-azh --progress&lt;/code&gt; so I can see exactly which
files are copied.&lt;/p&gt;

&lt;p&gt;There are other options for showing progress in modeline
(&lt;em&gt;dired-rsync-modeline-status&lt;/em&gt;), hooks for sending notifications on
failure/success (&lt;em&gt;dired-rsync-failed-hook&lt;/em&gt; and
&lt;em&gt;dired-rsync-success-hook&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;Overall the library is well designed, and the default options work for
me, so I can get by with a bare-minimum configuration as below (borrowed from
&lt;a href=&quot;https://www.reddit.com/r/emacs/comments/g0jkkj/comment/fnc68iq/?utm_source=share&amp;amp;utm_medium=web2x&amp;amp;context=3&quot;&gt;ispinfx&lt;/a&gt;):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-elisp&quot; data-lang=&quot;elisp&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;use-package&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dired-rsync&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:demand&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;t&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:after&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dired&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:bind&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:map&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dired-mode-map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;r&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dired-rsync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:config&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;add-to-list&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;mode-line-misc-info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:eval&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dired-rsync-modeline-status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&apos;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There are two more things to do on the system side:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;On macOS, the default rsync is a 2010 version. It does not work
with the latest rsync I have on my Debian server, so I upgraded it using
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;brew install rsync&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is no way to type a password, which is a limitation of using a process
buffer, so I have to ensure I can rsync without the remote server asking
for one. It sounds complicated but fortunately it takes only a few
steps, as described in &lt;a href=&quot;https://fedingo.com/setup-rsync-between-two-servers-without-password/&quot;&gt;Setup Rsync Between Two Servers Without Password&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
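
&lt;p&gt;A minimal sketch of the password-less setup from the second point, assuming OpenSSH on both machines and that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;debian&lt;/code&gt; is the ssh host alias used in the bash function earlier. The key type is only an example choice, not something the linked guide prescribes.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# generate a key pair on the laptop (skip if one already exists)
ssh-keygen -t ed25519
# install the public key on the server; asks for the password one last time
ssh-copy-id debian
# check: BatchMode makes ssh fail instead of prompting for a password
ssh -o BatchMode=yes debian true&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;If the last command returns without asking for anything, rsync started from a process buffer should be able to connect in the same way.&lt;/p&gt;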

&lt;p&gt;&lt;a id=&quot;org0566895&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;enhance-dired-rsync-with-compilation-mode&quot;&gt;Enhance &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dired-rsync&lt;/code&gt; with compilation mode&lt;/h1&gt;

&lt;p&gt;It’s such a great library that makes my life much easier. It could be
improved further to provide a better user experience, for example by keeping
the process buffer alive as a log after the copying has finished, because
the user might want to have a look later.&lt;/p&gt;

&lt;p&gt;At the moment, there’s no easy way of changing the arguments sent to
rsync. I might want to test a dry-run (adding the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-n&lt;/code&gt; argument) so I can
see exactly what files are going to be copied before running, or I
might need to exclude certain files/folders, or rerun the copying if there
are new files generated on RL.&lt;/p&gt;

&lt;p&gt;If you have used a compilation buffer before, you know where I am
going. That’s right, I am thinking of turning the rsync process buffer
into &lt;a href=&quot;https://www.gnu.org/software/emacs/manual/html_node/emacs/Compilation-Mode.html&quot;&gt;compilation mode&lt;/a&gt;, so that it would inherit these two features:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Press &lt;em&gt;g&lt;/em&gt; to rerun the rsync command when I know there are new
files generated on the RL&lt;/li&gt;
  &lt;li&gt;Press &lt;em&gt;C-u g&lt;/em&gt; (g with prefix) to change the rsync arguments before
running it for dry-run, inclusion or exclusion&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I don’t have much experience in elisp, but I had a quick look at the source
code and it seems there’s no easy way of implementing this idea, so it is something
to add to my ever-growing Emacs wish-list.&lt;/p&gt;

&lt;p&gt;In fact, the limitation comes from using lower level elisp
functions. The &lt;a href=&quot;https://www.gnu.org/software/emacs/manual/html_node/elisp/Process-Buffers.html&quot;&gt;Emacs Lisp manual on Process Buffers&lt;/a&gt; states that&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Many applications of processes also use the buffer for editing input
to be sent to the process, but this is not built into Emacs Lisp.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What a pity. For now I enjoy using it and look for opportunities to
use it.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Move Between Windows in Emacs using windmove</title>
   <link href="http://yitang.uk/2022/07/05/move-between-window-using-builtin-package/"/>
   <updated>2022-07-05T00:00:00+01:00</updated>
   <id>http://yitang.uk/2022/07/05/move-between-window--using-builtin-package</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h1 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#org225f23b&quot;&gt;Started Seeing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org37e639e&quot;&gt;Ace-Window&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org71d4c11&quot;&gt;Windmove&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#org20d1569&quot;&gt;Back to where it started&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a id=&quot;org225f23b&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;started-seeing&quot;&gt;Started Seeing&lt;/h1&gt;

&lt;p&gt;The good thing about Emacs is that you can always tweak it to suit
your needs. For years I’ve been doing it for productivity reasons. Now
for the first time, I’m doing it for health reasons.&lt;/p&gt;

&lt;p&gt;Life can be sht sometimes. When I was in my mid 20s, I was reshaping
every aspect of my life for good. But an optician told me my vision could
only get worse. I wasn’t paying much attention, busy with my first
job and learning.&lt;/p&gt;

&lt;p&gt;Last month, I was told my right eye’s vision got a whole point worse,
whatever that means. Now I’m wearing a new pair of glasses, seeing the
world in 4K using both eyes, noticing so much detail. It makes the
world so vibrant and exciting. It comes with a price though: my eyes
get tired quickly, and it becomes so easy to get annoyed by little
things.&lt;/p&gt;

&lt;p&gt;One of them is switching windows in Emacs. Even though I am in the
period of calibrating to the new glasses, I decided to take some
actions.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org37e639e&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;ace-window&quot;&gt;Ace-Window&lt;/h1&gt;

&lt;p&gt;Depending on the complexity of the tasks, I usually have about 4-8
windows laid out on my 32-inch monitor. If that’s not enough, I would have
an additional frame with a similar window layout, doubling the number of
windows to 8-16.&lt;/p&gt;

&lt;p&gt;So I found myself switching between windows all the time. The action
itself is straightforward with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ace-window&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The process can be broken down into five steps:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Invoke &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ace-window&lt;/code&gt; command by pressing F2 key,&lt;/li&gt;
  &lt;li&gt;The Emacs buffers are dimmed,&lt;/li&gt;
  &lt;li&gt;A red number pops up at the top left corner of each window,&lt;/li&gt;
  &lt;li&gt;I press the number key to switch to the window it is associated with,&lt;/li&gt;
  &lt;li&gt;After that, the content of each Emacs buffer is brought back.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This &lt;a href=&quot;http://oremacs.com/download/ace-window.gif&quot;&gt;gif&lt;/a&gt; from &lt;a href=&quot;https://github.com/abo-abo/ace-window&quot;&gt;ace-window git repo&lt;/a&gt; demonstrates the process.
&lt;img src=&quot;http://oremacs.com/download/ace-window.gif&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This approach depends on visual feedback - I have to look at the
corner of the window to see the number. Also, the screen flashes
twice during the process.&lt;/p&gt;

&lt;p&gt;I tried removing the background dimming, increasing the font size of the
number to make it easier to see, and a bunch of other tweaks.&lt;/p&gt;

&lt;p&gt;In the end, my eyes were not satisfied.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org71d4c11&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;windmove&quot;&gt;Windmove&lt;/h1&gt;

&lt;p&gt;So I started looking for alternative approaches and found &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;windmove&lt;/code&gt;
which is built-in.&lt;/p&gt;

&lt;p&gt;The idea is simple - keep moving to the adjacent window on the left,
right, top, or bottom until I arrive at the window I want.&lt;/p&gt;

&lt;p&gt;So it uses the relative location between windows instead of assigning
each window a unique number and then using the number for switching.&lt;/p&gt;

&lt;p&gt;Is it really better? Well, with this approach, I use my eyes a lot less
as I do not have to look for the number. Plus, I feel this is more
natural as I do not need to work out the directions; somehow I just
know I need to move right twice or whatever to get to the destination.&lt;/p&gt;

&lt;p&gt;The only issue I have had so far is the conflict with org-mode’s
calendar. I like the keybindings in org-mode, so I disabled &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;windmove&lt;/code&gt;
in org-mode’s calendar with the help of &lt;a href=&quot;https://emacs.stackexchange.com/questions/22286/shiftarrow-to-change-window-does-not-work-in-org-mode&quot;&gt;this Emacs Stack Exchange question&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The following five lines of code are all I need to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;windmove&lt;/code&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs-lisp&quot; data-lang=&quot;emacs-lisp&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;windmove-default-keybindings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;define-key&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-read-date-minibuffer-local-map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&amp;lt;left&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-eval-in-calendar&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;calendar-backward-day&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;define-key&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-read-date-minibuffer-local-map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&amp;lt;right&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-eval-in-calendar&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;calendar-forward-day&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;define-key&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-read-date-minibuffer-local-map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&amp;lt;up&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-eval-in-calendar&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;calendar-backward-week&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;define-key&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;org-read-date-minibuffer-local-map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kbd&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&amp;lt;down&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;interactive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;org-eval-in-calendar&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;calendar-forward-week&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I created a git branch for switching from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ace-window&lt;/code&gt; to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;windmove&lt;/code&gt;. I will try it for a month before merging it into the master
branch.&lt;/p&gt;

&lt;p&gt;&lt;a id=&quot;org20d1569&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1 id=&quot;back-to-where-it-started&quot;&gt;Back to where it started&lt;/h1&gt;

&lt;p&gt;After using it for a few days, I realised this is the very package I
used for switching windows back in 2014 when I started learning Emacs.  I
later switched to ace-window because it looked pretty cool.&lt;/p&gt;

&lt;p&gt;Life is changing, my perspectives are changing, so is my Emacs
configuration. This time, it goes back to where I started 8 years ago.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Wireless Backup Solution Using Raspberry Pi for MacOS</title>
   <link href="http://yitang.uk/2022/03/11/wireless-backup-solution-using-raspberry-pi-for-macos/"/>
   <updated>2022-03-11T00:00:00+00:00</updated>
   <id>http://yitang.uk/2022/03/11/wireless-backup-solution-using-raspberry-pi-for-macos</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;blockquote&gt;
&lt;p&gt;
If you need automated backups for Time Machine and have a Raspberry
Pi, you will find this post useful.
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#orga8e5648&quot;&gt;Motivation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgb9bcbb8&quot;&gt;Wireless Backup Solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orga70f9b2&quot;&gt;Set Up Raspberry Pi&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org685a349&quot;&gt;Time Machine Backup frequency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org60e5ab9&quot;&gt;Backup for Backups&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;



&lt;div id=&quot;outline-container-orga8e5648&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orga8e5648&quot;&gt;Motivation&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orga8e5648&quot;&gt;

&lt;p&gt;
After three months of using my brand new &lt;a href=&quot;https://www.apple.com/uk/macbook-pro-14-and-16/&quot;&gt;MacBook Pro 14 M1 Pro&lt;/a&gt;, one of the
USB-C ports stopped working. I will have to send it back, and I am not sure what
Apple will do with it, but I can&apos;t bear the risk of losing data. So I
need a backup.
&lt;/p&gt;

&lt;p&gt;
In fact, I need to back up regularly for situations like this, which is
why I worked on it.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;



&lt;div id=&quot;outline-container-orgb9bcbb8&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgb9bcbb8&quot;&gt;Wireless Backup Solution&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgb9bcbb8&quot;&gt;

&lt;p&gt;
The easiest solution is to get a USB-C portable SSD, plug it into my
laptop and open &lt;a href=&quot;https://support.apple.com/en-gb/HT201250&quot;&gt;Time Machine&lt;/a&gt; to start a backup, do it once a week and
call it a day.
&lt;/p&gt;

&lt;p&gt;
But I&apos;m reluctant to add more devices to my already cluttered home
lab. There are a few hard drives in the drawers, it would be good to
utilise them.
&lt;/p&gt;

&lt;p&gt;
So I decided to set up a Time Machine backup solution using my
&lt;a href=&quot;https://www.raspberrypi.com/products/raspberry-pi-4-model-b/&quot;&gt;Raspberry Pi 4&lt;/a&gt;. The benefits are:
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;no additional cost, saving me about £50-£100&lt;/li&gt;
&lt;li&gt;no need to buy new stuff, so fewer things to take care of&lt;/li&gt;
&lt;li&gt;wireless backup to keep my desk clean&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;
Later I realised the benefits of having a wireless backup are easy to
overlook. It can back up anytime and anywhere in my house. Also,
because of the convenience, I can have more granular backups - instead of
a weekly backup, I have hourly backups without getting out the cables and
hard drives. I do less but get more value out of it.
&lt;/p&gt;

&lt;p&gt;
The only concern I had was the speed. It turns out that with the SMB 3
protocol (served by Samba), I can get 55 MB/s write speed and 40 MB/s read speed from
laptop to Raspberry Pi. So in theory, it would take around 2.5 hours
to back up my 500 GB laptop. That is a long time, but only for the first
backup; the subsequent incremental backups are much smaller and
faster. For example, just now Time Machine completed a new
backup within 3 minutes in the background without my noticing.
&lt;/p&gt;
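
&lt;p&gt;
The estimate above can be reproduced with a few lines of shell. This is only a back-of-the-envelope sketch: the function name &lt;code&gt;backup_minutes&lt;/code&gt; is made up for this example, it assumes 1 GB = 1000 MB and a constant write speed, and it ignores protocol overhead.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# estimate the duration of a full backup from size and throughput
backup_minutes () {
    size_gb=$1      # data to back up, in GB
    speed_mbs=$2    # sustained write speed, in MB/s
    # 1 GB = 1000 MB; integer division, then seconds to minutes
    echo $(( size_gb * 1000 / speed_mbs / 60 ))
}

backup_minutes 500 55&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
With 500 GB at 55 MB/s this prints 151, i.e. roughly two and a half hours.
&lt;/p&gt;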

&lt;p&gt;
A portable USB-C SSD can finish the backup within minutes, but it&apos;s
overkill for an ordinary user like me and it&apos;s inconvenient.
&lt;/p&gt;

&lt;p&gt;
So I&apos;m satisfied with the current solution.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orga70f9b2&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orga70f9b2&quot;&gt;Set Up Raspberry Pi&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orga70f9b2&quot;&gt;

&lt;p&gt;
I read a few guides on setting up a Raspberry Pi for Time Machine, and I
found &lt;a href=&quot;https://mudge.name/2019/11/12/using-a-raspberry-pi-for-time-machine/&quot;&gt;this guide&lt;/a&gt; the most accurate and useful.
&lt;/p&gt;

&lt;p&gt;
One thing I noticed is that AFP (Apple Filing Protocol) is deprecated, so
make sure you use SMB (served by Samba) as the protocol.
&lt;/p&gt;

&lt;p&gt;
Additionally, I followed this &lt;a href=&quot;https://superuser.com/questions/336665/how-to-automount-smb-shared-network-drives-in-mac-os-x-lion&quot;&gt;Super User answer&lt;/a&gt; to auto-mount the
Samba share so that every time I reboot my laptop, Time Machine
will be ready to back up.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org685a349&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org685a349&quot;&gt;Time Machine Backup frequency&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org685a349&quot;&gt;

&lt;p&gt;
By default, Time Machine does hourly backup. 
&lt;/p&gt;

&lt;p&gt;
If you feel hourly backup is not necessary, you can change it by
updating this file
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;/System/Library/LaunchDaemons/com.apple.backupd-helper.plist&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
For example, to change the frequency from hourly to daily backups,
change the interval value from 3600 (one hour, in seconds) to 86400 (24 hours).
&lt;/p&gt;

&lt;p&gt;
In the end, I left it at the default, so it does many
small backups hourly instead of one big backup daily.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org60e5ab9&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org60e5ab9&quot;&gt;Backup for Backups&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org60e5ab9&quot;&gt;

&lt;p&gt;
After a couple of hours of work, I managed to get a wireless backup
solution for my laptop so I won&apos;t have to worry about data loss. Plus
I can time-travel files at hourly intervals.
&lt;/p&gt;

&lt;p&gt;
One concern that occurred to me was that the backup sits on my local hard
drive. If the hard drive dies, I will lose all my backups.
&lt;/p&gt;

&lt;p&gt;
To solve that problem, I would have to go down the rabbit hole of
doing backups for backups, or backing up to a remote location or cloud, or
setting up a Raspberry Pi RAID.
&lt;/p&gt;

&lt;p&gt;
At the moment, I&apos;m not very concerned - I have Apple iCloud to back up
my photos, videos, notes etc and I use GitHub to host my org-files and
code. So having a backup for backups is not necessary for me for now.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Managing Emacs Server as Systemd Service</title>
   <link href="http://yitang.uk/2021/06/18/managing-emacs-server-as-systemd-service/"/>
   <updated>2021-06-18T00:00:00+01:00</updated>
   <id>http://yitang.uk/2021/06/18/managing-emacs-server-as-systemd-service</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#orgb5acc86&quot;&gt;Using Emacs Server Without Systemd&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orga98051c&quot;&gt;How to Implement As Systemd Service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org4436194&quot;&gt;Enhance User Experience&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#orgcee690d&quot;&gt;sudo Privilege&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org41b9132&quot;&gt;Environment Variables&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org8c51bf1&quot;&gt;Start Emacs Server Before Login?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgb5acc86&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgb5acc86&quot;&gt;Using Emacs Server Without Systemd&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgb5acc86&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
I live in Emacs almost entirely, apart from using a browser for googling. Having
an Emacs server running in the background makes Emacs available all
the time, so I don&apos;t have to worry about closing it accidentally.
&lt;/p&gt;

&lt;p&gt;
It is not hard to do that, just run 
&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;emacs &lt;span class=&quot;nt&quot;&gt;--daemon&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
on the command line to start the Emacs server. It loads the user
configuration file as usual. Then run
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;emacsclient &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; &amp;amp; &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
to open an Emacs GUI instance that uses the Emacs server. That&apos;s
how I have been doing it for a while.
&lt;/p&gt;

&lt;p&gt;
A better approach is to use systemd, the service manager on
Linux. Whenever my Debian 11 laptop boots up, systemd starts a
bunch of services in parallel; for example, the network manager
connects to Wi-Fi and Bluetooth connects to the wireless keyboard, so everything
is ready after I log in. I want Emacs to be ready as well.
&lt;/p&gt;

&lt;p&gt;
I could achieve that by simply having a shell script run
automatically after login. But there are benefits to using systemd: it
has a bunch of sub-commands for managing services, for example,
checking logs and status.
&lt;/p&gt;

&lt;p&gt;
It&apos;s a nice tool to know; I can also use it for other services, for
example a Jupyter Notebook server.
&lt;/p&gt;

&lt;p&gt;
That&apos;s why I pulled the trigger and spent two hours implementing
and testing it. Here&apos;s the technical bit.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orga98051c&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orga98051c&quot;&gt;How to Implement As Systemd Service&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orga98051c&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
In order to use systemd to manage the Emacs server, I first need a
configuration file (called a unit file). &lt;a href=&quot;https://wiki.debian.org/systemd/Services&quot;&gt;Debian Wiki&lt;/a&gt;
provides a short description of the syntax and parameters of unit
files.
&lt;/p&gt;

&lt;p&gt;
I found a simple one in &lt;a href=&quot;https://www.emacswiki.org/emacs/EmacsAsDaemon&quot;&gt;Emacs Wiki&lt;/a&gt;. It looks like this:
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;orgfaf60d1&quot;&gt;
[Unit]
Description=Emacs text editor
Documentation=info:emacs man:emacs(1) https://gnu.org/software/emacs/

[Service]
Type=forking
ExecStart=/usr/bin/emacs --daemon
ExecStop=/usr/bin/emacsclient --eval &quot;(kill-emacs)&quot;
Environment=SSH_AUTH_SOCK=%t/ssh-agent.socket
Restart=on-failure

[Install]
WantedBy=default.target
&lt;/pre&gt;


&lt;p&gt;
The important parameters are
&lt;/p&gt;
&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;ExecStart&lt;/dt&gt;&lt;dd&gt;It tells systemd what to do when starting the Emacs
service; in this case it runs the &lt;code&gt;/usr/bin/emacs --daemon&lt;/code&gt; command.&lt;/dd&gt;
&lt;dt&gt;ExecStop&lt;/dt&gt;&lt;dd&gt;It tells systemd what to do when shutting down the Emacs
service; in this case it runs the &lt;code&gt;/usr/bin/emacsclient --eval
&quot;(kill-emacs)&quot;&lt;/code&gt; command.&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;
If you are using an Emacs installed in a different directory, you have
to change &lt;i&gt;/usr/bin/emacs&lt;/i&gt; to wherever your Emacs is located.
&lt;/p&gt;

&lt;p&gt;
Then save the configuration file as
&lt;i&gt;~/.config/systemd/user/emacs.service&lt;/i&gt;.
&lt;/p&gt;
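
&lt;p&gt;
As a minimal sketch of this step, the unit file can also be written
from a script. The helper names &lt;code&gt;install_unit&lt;/code&gt; and
&lt;code&gt;read_unit&lt;/code&gt; and the trimmed-down unit text are assumptions for
illustration, not part of the Emacs Wiki recipe; adjust the paths to
your own setup.
&lt;/p&gt;

```python
import configparser
from pathlib import Path

# Trimmed copy of the unit file shown above (hypothetical minimal version).
UNIT_TEXT = """\
[Unit]
Description=Emacs text editor

[Service]
Type=forking
ExecStart=/usr/bin/emacs --daemon
ExecStop=/usr/bin/emacsclient --eval "(kill-emacs)"
Restart=on-failure

[Install]
WantedBy=default.target
"""


def install_unit(config_home=None, name="emacs.service"):
    """Write the unit file under CONFIG_HOME/systemd/user and return its path."""
    base = Path(config_home) if config_home else Path.home() / ".config"
    unit_dir = base / "systemd" / "user"
    unit_dir.mkdir(parents=True, exist_ok=True)
    unit_path = unit_dir / name
    unit_path.write_text(UNIT_TEXT)
    return unit_path


def read_unit(unit_path):
    """Parse the unit file as INI so its keys can be sanity-checked."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep systemd's case-sensitive key names
    parser.read(unit_path)
    return parser
```

&lt;p&gt;
Calling &lt;code&gt;install_unit()&lt;/code&gt; without arguments writes to the
per-user location under the home directory; passing another directory is handy
for a dry run.
&lt;/p&gt;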

&lt;p&gt;
After that run
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;systemctl &lt;span class=&quot;nb&quot;&gt;enable&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--user&lt;/span&gt; emacs&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
so systemd links the configuration file into its startup directories
and starts the Emacs service automatically from then on (for a user
service, that means when I log in).
&lt;/p&gt;

&lt;p&gt;
To start the Emacs service right now, use
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;systemctl start &lt;span class=&quot;nt&quot;&gt;--user&lt;/span&gt; emacs&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
This is what I see in my console when I check it with &lt;code&gt;systemctl status --user emacs&lt;/code&gt;:
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;orgf995d64&quot;&gt;
emacs.service - Emacs text editor
Loaded: loaded (/home/yitang/.config/systemd/user/emacs.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2021-06-14 09:12:26 BST; 24h ago
Docs: info:emacs
man:emacs(1)
https://gnu.org/software/emacs/
Main PID: 5222 (emacs)
Tasks: 5 (limit: 19027)
Memory: 154.7M
CPU: 3min 25.049s
CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/emacs.service
├─ 5222 /usr/bin/emacs --daemon
└─16086 /usr/bin/aspell -a -m -d en_GB -p /home/yitang/git/.emacs.d/local/ispell-dict --encoding=utf-8

Jun 14 09:11:57 7270 emacs[5222]: No event to add
Jun 14 09:11:57 7270 emacs[5222]: Package dash-functional is obsolete; use dash 2.18.0 instead
Jun 14 09:12:01 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/config/org-mode.el (source)...done
Jun 14 09:12:01 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/config/refile.el (source)...
Jun 14 09:12:01 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/config/refile.el (source)...done
Jun 14 09:12:01 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/config/scripting.el (source)...
Jun 14 09:12:26 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/config/scripting.el (source)...done
Jun 14 09:12:26 7270 emacs[5222]: Loading /home/yitang/git/.emacs.d/load_config.el (source)...done
Jun 14 09:12:26 7270 emacs[5222]: Starting Emacs daemon.
Jun 14 09:12:26 7270 systemd[4589]: Started Emacs text editor.
&lt;/pre&gt;
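
&lt;p&gt;
The same check can be scripted. This is only a rough sketch, assuming
&lt;code&gt;systemctl&lt;/code&gt; is on the PATH; the helper names
&lt;code&gt;user_service_command&lt;/code&gt; and &lt;code&gt;is_active&lt;/code&gt; are
hypothetical.
&lt;/p&gt;

```python
import subprocess


def user_service_command(action, unit="emacs"):
    """Build the systemctl command line for a user-level service."""
    return ["systemctl", "--user", action, unit]


def is_active(unit="emacs"):
    """Return True when systemd reports the user service as active."""
    result = subprocess.run(
        user_service_command("is-active", unit),
        capture_output=True,
        text=True,
        check=False,  # is-active exits non-zero for inactive units
    )
    return result.stdout.strip() == "active"


if __name__ == "__main__":
    # Show the command that would be run to inspect the service.
    print(" ".join(user_service_command("status")))
```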
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org4436194&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org4436194&quot;&gt;Enhance User Experience&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org4436194&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
So far, I have the following two tweaks to make using systemd
more pleasant.
&lt;/p&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgcee690d&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;orgcee690d&quot;&gt;sudo Privilege&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-orgcee690d&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The Emacs server is started under my own account, so it doesn&apos;t have
sudo privileges. To edit files that require sudo
permission, simply open the file in Emacs, or from the command line with
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;emacsclient &lt;span class=&quot;nt&quot;&gt;-c&lt;/span&gt; FILENAME&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
then type &lt;code&gt;M-x sudo&lt;/code&gt; inside Emacs and enter the sudo password. If the
password is correct, I can edit and save the file as the sudo user.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;div id=&quot;outline-container-org41b9132&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;org41b9132&quot;&gt;Environment Variables&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-org41b9132&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The customised shell configuration in &lt;i&gt;.bashrc&lt;/i&gt; is loaded only when
opening an interactive shell session. So the Emacs server managed by
systemd does not have the environment variables, aliases, functions or
whatever else is defined in &lt;i&gt;.bashrc&lt;/i&gt;.
&lt;/p&gt;

&lt;p&gt;
&lt;a href=&quot;https://stackoverflow.com/questions/49764993/using-a-users-bashrc-in-a-systemd-service&quot;&gt;This Stack Overflow post&lt;/a&gt; explains the rationale and how to tweak
the unit file so systemd loads &lt;i&gt;.bashrc&lt;/i&gt;.
&lt;/p&gt;

&lt;p&gt;
This problem can be solved much more easily on the Emacs side, by using the
&lt;a href=&quot;https://github.com/purcell/exec-path-from-shell&quot;&gt;exec-path-from-shell&lt;/a&gt; package. It ensures the environment
variables inside Emacs are the same as in the user&apos;s interactive
shell.
&lt;/p&gt;

&lt;p&gt;
Simply putting the following in your .emacs does the trick.
&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(exec-path-from-shell-initialize)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;div id=&quot;outline-container-org8c51bf1&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org8c51bf1&quot;&gt;Start Emacs Server Before Login?&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org8c51bf1&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The systemd services under my account only start after I
log in. Because I have tons of Emacs configuration, I still have to
wait a few seconds before the Emacs server is ready. So it would be awesome
to have the Emacs server start loading before I log in.
&lt;/p&gt;

&lt;p&gt;
This doesn&apos;t seem simple to implement because, technically,
it would require the Emacs server to be defined at the system level,
yet it would load files from my personal home directory without me being
logged in. It might still be okay since I&apos;m the sole user of my
laptop, but I would have to tweak the permissions and would probably end
up with insecure permission settings.
&lt;/p&gt;

&lt;p&gt;
So I leave this idea here.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Kaggle Avito Demand Prediction Challenge - 22nd Place Solution</title>
   <link href="http://yitang.uk/2018/07/01/Kaggle-Avito-Demand-Prediction-Challenge/"/>
   <updated>2018-07-01T00:00:00+01:00</updated>
   <id>http://yitang.uk/2018/07/01/Kaggle-Avito-Demand-Prediction-Challenge</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#org863dda2&quot;&gt;Final Ensemble Models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org75b07d7&quot;&gt;Collaboration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orge8275fc&quot;&gt;Some Kagglers to Avoid&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;a href=&quot;https://www.kaggle.com/c/avito-demand-prediction&quot;&gt;Avito Demand Prediction Challenge&lt;/a&gt; asks Kagglers to predict the
&quot;demand&quot; likelihood of an advertisement. If a second-hand iPhone
6 is listed for £20,000, then the &quot;demand&quot; is likely to be very low.
This is my first competition building models with tabular data,
text, and images.
&lt;/p&gt;

&lt;p&gt;
I teamed up with &lt;a href=&quot;https://www.kaggle.com/rashmibanthia&quot;&gt;Rashmi&lt;/a&gt;, &lt;a href=&quot;https://www.kaggle.com/abhimanyud&quot;&gt;Abhimanyu&lt;/a&gt;, &lt;a href=&quot;https://www.kaggle.com/peterzheng&quot;&gt;Yiang&lt;/a&gt;, and &lt;a href=&quot;https://www.kaggle.com/samratp&quot;&gt;Samrat&lt;/a&gt;, and we finished
22nd among 1,917 teams. So far, I have four silver medals and my rank is
542 among 83,588 Kagglers.
&lt;/p&gt;

&lt;p&gt;
This was an interesting competition for me. I was about to quit this
competition, and Kaggle, because of other commitments in life and work. Just
one day before the team merge deadline, Rashmi asked me to join. At that
time, my position was 880th, around the 50% mark, and Rashmi&apos;s team was around
82nd. So I decided to join and finish this competition, on which I
had already spent many hours.
&lt;/p&gt;
&lt;div id=&quot;outline-container-org863dda2&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org863dda2&quot;&gt;Final Ensemble Models&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org863dda2&quot;&gt;
&lt;p&gt;
As part of this team, I worked on the final ensemble models. Immediately
after joining, I completed 5 tasks:
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;make sure everyone uses the same agreed cross-validation scheme.
This is essential for building an ensemble model.&lt;/li&gt;
&lt;li&gt;provide a model_zoo.md document to keep track of all level 1
models, their train/valid/lb scores, features used, and file paths
to their oof/test predictions.&lt;/li&gt;
&lt;li&gt;write merge_oof.py to combine all oof/test predictions together.&lt;/li&gt;
&lt;li&gt;write R scripts for glmnet ensemble.&lt;/li&gt;
&lt;li&gt;write python scripts for LightGBM ensemble.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;
Once a new model is built, the other team members update model_zoo.md
and upload the data to a private GitHub repo. Then I update
merge_oof.py to include the new model&apos;s results, and run the glmnet and
LightGBM ensembles. We had this ensemble workflow automated, so it
took little effort to see the ensemble model&apos;s performance.
&lt;/p&gt;

&lt;p&gt;
I spent some time analysing the coefficients/weights of the L1 models
and tried to exclude models with negative or low weights. To my
surprise, it didn&apos;t help at all. The final submission is a glmnet
ensemble of 41 models (lgb + xgb + NN).
&lt;/p&gt;

&lt;p&gt;
Also, the LightGBM ensemble has a much better CV score but its LB score
is worse. I suspect this is because there is leakage in some L1 models
and glmnet is more robust to leakage since it&apos;s a linear model.
Unfortunately, there was not enough time to identify which models have
leakage.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org75b07d7&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org75b07d7&quot;&gt;Collaboration&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org75b07d7&quot;&gt;
&lt;p&gt;
This is my second time working in a team. There is a lot of room
for improvement in how we collaborate compared with a professional
data science team, but for a night/weekend project, we did a
really good job as a team.
&lt;/p&gt;

&lt;p&gt;
The setup for collaboration:
&lt;/p&gt;
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;Slack for discussion. We have channels for general,
final_ensemble, random for cat photos, etc.&lt;/li&gt;
&lt;li&gt;We also used Slack for sharing features, which I personally don&apos;t
like.&lt;/li&gt;
&lt;li&gt;Private GitHub repo for sharing code and oof/test predictions.&lt;/li&gt;
&lt;li&gt;Monday.com for managing tasks. It gives a nice overview of what
everyone&apos;s up to.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
We tried very hard to get a gold, but other teams worked even
harder. At one point we were at 17th, and we finished 22nd.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orge8275fc&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orge8275fc&quot;&gt;Some Kagglers to Avoid&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orge8275fc&quot;&gt;
&lt;p&gt;
Finally, while we waited out the last hour before the final deadline, we had a
lovely discussion about our past disqualification experiences. We
were all shocked to find that, while on different teams in the Toxic
competition, we had each teamed up with the same person. We shared that
person&apos;s multiple Kaggle accounts and added them to our personal
block-lists.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Build Notification Features</title>
   <link href="http://yitang.uk/2016/07/02/email-from-python-an-kiss-tutorial-for-sending-emails/"/>
   <updated>2016-07-02T00:00:00+01:00</updated>
   <id>http://yitang.uk/2016/07/02/email-from-python--an-kiss-tutorial-for-sending-emails</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;
Data processing takes longer and longer as data volumes
grow. Users may check in frequently to see whether
it has finished. Most of the time they will find it hasn&apos;t, and each
check is a context switch that breaks the flow of whatever the
user was doing.
&lt;/p&gt;

&lt;p&gt;
Sometimes the user can&apos;t stop checking, either because they are
impatient, or because they really have a deadline to meet. Also, though
less likely, they might find errors in the processing, either
because a QC check fails or because the job runs out of computational
resources.
&lt;/p&gt;

&lt;p&gt;
Given this, it really makes sense to have your program &lt;i&gt;actively&lt;/i&gt;
inform the user of its progress so that they don&apos;t need to check in
at all. Users will be &lt;i&gt;notified immediately&lt;/i&gt; whenever the
whole process is completed, or when there is an error that the user needs
to act upon.
&lt;/p&gt;

&lt;p&gt;
This blog post walks through the basics of composing and sending
Emails in Python. Each component is broken down into
small pieces, which helps you debug and test the Email program and personalise
your Emails. In the end, you should be able to build an Email robot.
&lt;/p&gt;

&lt;div id=&quot;outline-container-orga3720d6&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orga3720d6&quot;&gt;Prerequisite&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orga3720d6&quot;&gt;
&lt;p&gt;
Before going into the technical details, you have to check that you
are able to send out emails. You need the
&lt;/p&gt;

&lt;ul class=&quot;org-ul&quot;&gt;
&lt;li class=&quot;off&quot;&gt;&lt;code&gt;[&amp;#xa0;]&lt;/code&gt; SMTP server,&lt;/li&gt;
&lt;li class=&quot;off&quot;&gt;&lt;code&gt;[&amp;#xa0;]&lt;/code&gt; user name,&lt;/li&gt;
&lt;li class=&quot;off&quot;&gt;&lt;code&gt;[&amp;#xa0;]&lt;/code&gt; password,&lt;/li&gt;
&lt;li class=&quot;off&quot;&gt;&lt;code&gt;[&amp;#xa0;]&lt;/code&gt; port number, and&lt;/li&gt;
&lt;li class=&quot;off&quot;&gt;&lt;code&gt;[&amp;#xa0;]&lt;/code&gt; communication protocol.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
You can easily find this information from your Email service
provider. An example for Gmail is &lt;a href=&quot;https://support.google.com/a/answer/176600?hl=en&quot;&gt;here&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
To check that you have all the information correct, run the following
snippet. It will try to send a test Email with no subject to yourself. Make
sure you fill in the &lt;i&gt;username&lt;/i&gt; and &lt;i&gt;password&lt;/i&gt; before you hit go.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;  &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;smtplib&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;username&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fill&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;In&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;password&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fill&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;In&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;smtplib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SMTP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;smtp.gmail.com&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;587&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# port 587 is Gmail&apos;s STARTTLS port&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;starttls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# set connection to TLS mode
&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;login&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Log in to the remote server
&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendmail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;For testing&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Send emails
&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;quit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# close connection.
&lt;/span&gt;  &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
If you don&apos;t see error messages, that&apos;s great, you are all set.
There should be an Email in your Inbox. It has no subject and &lt;i&gt;for
testing&lt;/i&gt; in the main body.
&lt;/p&gt;

&lt;p&gt;
If you do see errors, double check the information and try
again. If you are sure that the information is correct but you still
can&apos;t send, check your network configuration; maybe a firewall
blocks the connection.
&lt;/p&gt;

&lt;p&gt;
Once you are able to send a test Email, the next step is to compose
a full Email.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org9f7a6e1&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org9f7a6e1&quot;&gt;Compose An Email in Python&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org9f7a6e1&quot;&gt;
&lt;p&gt;
An Email consists of multiple parts: the subject, body,
attachments, signature, and also some meta data including from, to,
and date.
&lt;/p&gt;

&lt;p&gt;
First, create an object of the &lt;code&gt;MIMEMultipart&lt;/code&gt; class. It will be the
building block of your Email. The approach described here is to
add each component to it.
&lt;/p&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org6f58a46&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;org6f58a46&quot;&gt;Email meta data&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-org6f58a46&quot;&gt;
&lt;p&gt;
Start with meta data.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEMultipart&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;From&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;your_email_address@somewhere.com&apos;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;To&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;email_friend_#28473@somewhere.com, email_friend_#122xs2212@somewhere.com&apos;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Date&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;formatdate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;localtime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# standard.
&lt;/span&gt;   &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgc8e9ef8&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;orgc8e9ef8&quot;&gt;Email body&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-orgc8e9ef8&quot;&gt;
&lt;p&gt;
For plain text, simply add 
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;   &lt;span class=&quot;n&quot;&gt;body_txt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&apos;&apos;
   Hello,

   Just to tell you Python is awesome.
   &apos;&apos;&apos;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;plain_body&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEText&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body_txt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;plain&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
You could also compose a complex HTML Email in Python, or
use a different tool to generate the HTML and import it into Python.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;   &lt;span class=&quot;n&quot;&gt;html_txt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&apos;&apos;\
   &amp;lt;html&amp;gt;
   &amp;lt;head&amp;gt;&amp;lt;/head&amp;gt;
   &amp;lt;body&amp;gt;
   &amp;lt;p&amp;gt;Hi!&amp;lt;br&amp;gt;
   How are you?&amp;lt;br&amp;gt;
   Here is the &amp;lt;a href=&quot;http://www.python.org&quot;&amp;gt;link&amp;lt;/a&amp;gt; you wanted.
   &amp;lt;/p&amp;gt;
   &amp;lt;/body&amp;gt;
   &amp;lt;/html&amp;gt;
   &apos;&apos;&apos;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;html_body&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEText&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;html_txt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;html&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

   &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
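
&lt;p&gt;
The snippets so far create the meta data and the body parts, but not
the subject line or the step that attaches a body to the message. A
minimal sketch of those two steps follows. The addresses and the
helper name &lt;code&gt;build_message&lt;/code&gt; are placeholders for illustration,
and only the plain-text body is attached to keep it short.
&lt;/p&gt;

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate


def build_message(sender, recipients, subject, body_txt):
    """Return a multipart Email with meta data and a plain-text body."""
    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)  # the header is one comma-separated string
    msg["Date"] = formatdate(localtime=True)
    msg["Subject"] = subject
    msg.attach(MIMEText(body_txt, "plain"))  # the body is just another part
    return msg


msg = build_message(
    "me@example.com",  # placeholder addresses
    ["friend1@example.com", "friend2@example.com"],
    "Job finished",
    "Hello,\n\nJust to tell you Python is awesome.\n",
)
print(msg["Subject"])
```

&lt;p&gt;
An HTML body would be attached the same way, with &lt;code&gt;&quot;html&quot;&lt;/code&gt; as
the second argument of &lt;code&gt;MIMEText&lt;/code&gt;.
&lt;/p&gt;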
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org1cc6ba4&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;org1cc6ba4&quot;&gt;Attachment&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-org1cc6ba4&quot;&gt;
&lt;p&gt;
You can attach files of any type to an Email. Specifically, for images,
you could use &lt;code&gt;MIMEImage&lt;/code&gt;, and for audio, you could use
&lt;code&gt;MIMEAudio&lt;/code&gt;.
&lt;/p&gt;

&lt;p&gt;
But you don&apos;t have to be specific. &lt;code&gt;MIMEApplication&lt;/code&gt; is
sufficient for all cases. It marks the attachment as generic binary
data, so it works whatever the file type. Use it as follows:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt; &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;rb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
     &lt;span class=&quot;n&quot;&gt;part&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEApplication&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# file content as bytes
&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Content-Disposition&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;attachment; filename=&quot;%s&quot;&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# attachment description.
&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;attach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# attach to the msg.
&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
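
&lt;p&gt;
The attachment snippet relies on &lt;code&gt;msg&lt;/code&gt;, &lt;code&gt;fpath&lt;/code&gt; and
&lt;code&gt;basename&lt;/code&gt; being defined elsewhere. A self-contained sketch of
the same idea could look like this; the helper name
&lt;code&gt;attach_file&lt;/code&gt; is hypothetical, and it uses
&lt;code&gt;add_header&lt;/code&gt; to set the file name instead of formatting the
header by hand.
&lt;/p&gt;

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from os.path import basename


def attach_file(msg, fpath):
    """Attach the file at FPATH to MSG as generic binary data."""
    with open(fpath, "rb") as fp:
        part = MIMEApplication(fp.read(), Name=basename(fpath))
    # add_header takes care of quoting the file name
    part.add_header("Content-Disposition", "attachment", filename=basename(fpath))
    msg.attach(part)
    return part
```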
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org1bf8b00&quot; class=&quot;outline-3&quot;&gt;
&lt;h3 id=&quot;org1bf8b00&quot;&gt;Put Everything Together&lt;/h3&gt;
&lt;div class=&quot;outline-text-3&quot; id=&quot;text-org1bf8b00&quot;&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;   &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os.path&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;
   &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;smtplib&lt;/span&gt;
   &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;email.mime.application&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEApplication&lt;/span&gt;
   &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;email.mime.multipart&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEMultipart&lt;/span&gt;
   &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;email.mime.text&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEText&lt;/span&gt;
   &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;email.utils&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;formatdate&lt;/span&gt;

   &lt;span class=&quot;c1&quot;&gt;# Configure your email
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;username&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fill&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;password&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fill&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;recipents&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fill&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;Hey&apos;&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;attachments&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# add attachment here.
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;body_txt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&apos;&apos;
      Hello,

      Just to tell you Python is awesome.
      &apos;&apos;&apos;&lt;/span&gt;

   &lt;span class=&quot;c1&quot;&gt;# Email - meta data
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEMultipart&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;From&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;To&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;, &apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recipents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Subject&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subject&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Date&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;formatdate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;localtime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# standard.
&lt;/span&gt;
   &lt;span class=&quot;c1&quot;&gt;# Email - main body
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;plain_body&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEText&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body_txt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;plain&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;attach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plain_body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

   &lt;span class=&quot;c1&quot;&gt;# attachments
&lt;/span&gt;   &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attachments&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]:&lt;/span&gt;
       &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;rb&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;part&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MIMEApplication&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# file content as string
&lt;/span&gt;           &lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;Content-Disposition&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;attachment; filename=&quot;%s&quot;&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# attachment description.
&lt;/span&gt;           &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;attach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# attach to the msg.
&lt;/span&gt;

   &lt;span class=&quot;c1&quot;&gt;# send out email
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;smtplib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SMTP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;smtp.gmail.com&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;starttls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# set connection to TLS mode
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;login&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Log in to the remote server
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendmail&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recipents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Send the email to the recipients
&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;quit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# close connection.
&lt;/span&gt;
   &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgb95831d&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgb95831d&quot;&gt;Wrap everything in a Class&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgb95831d&quot;&gt;
&lt;p&gt;
At JBARML, we are planning to send users emails with progress
updates. Many of the data-generating processes are glued together and
automated by a workflow manager. The whole program can take up to
weeks to complete. Actively sending progress updates to the user is
much more sensible than having the user log in to a remote machine and
check every now and then.
&lt;/p&gt;

&lt;p&gt;
In this case, an email notifies the user of the progress of the
data-generating workflow.
&lt;/p&gt;
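
&lt;p&gt;
A minimal sketch of what such a class could look like, reusing the steps
from the script above. The class name &lt;code&gt;ProgressMailer&lt;/code&gt; and its
method names are hypothetical, chosen here for illustration; this is not
the code used at JBARML.
&lt;/p&gt;

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate
from os.path import basename
import smtplib


class ProgressMailer:
    """Hypothetical wrapper around the steps shown earlier in this post."""

    def __init__(self, username, password, host="smtp.gmail.com"):
        self.username = username
        self.password = password
        self.host = host

    def build(self, recipients, subject, body_txt, attachments=None):
        """Assemble the message: meta data, plain-text body, attachments."""
        msg = MIMEMultipart()
        msg["From"] = self.username
        msg["To"] = ", ".join(recipients)
        msg["Subject"] = subject
        msg["Date"] = formatdate(localtime=True)
        msg.attach(MIMEText(body_txt, "plain"))
        for fpath in attachments or []:
            with open(fpath, "rb") as fp:
                part = MIMEApplication(fp.read(), Name=basename(fpath))
            part["Content-Disposition"] = 'attachment; filename="%s"' % basename(fpath)
            msg.attach(part)
        return msg

    def send(self, recipients, subject, body_txt, attachments=None):
        """Build the message and send it over a TLS connection."""
        msg = self.build(recipients, subject, body_txt, attachments)
        conn = smtplib.SMTP(self.host)
        try:
            conn.starttls()  # switch the connection to TLS mode
            conn.login(self.username, self.password)
            conn.sendmail(self.username, recipients, msg.as_string())
        finally:
            conn.quit()  # always close the connection
```

&lt;p&gt;
Keeping &lt;code&gt;build&lt;/code&gt; separate from &lt;code&gt;send&lt;/code&gt; means the
message can be inspected without a network connection, which is handy
when the workflow manager calls it many times over several weeks.
&lt;/p&gt;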
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>etags - Build a TAG for Multiple R Packages</title>
   <link href="http://yitang.uk/2016/05/04/etags-build-a-tag-for-multiple-r-packages/"/>
   <updated>2016-05-04T00:00:00+01:00</updated>
   <id>http://yitang.uk/2016/05/04/etags--build-a-tag-for-multiple-r-packages</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;
Here is what I tried in order to build a TAG file for multiple R packages. It
enables me to jump to the location where a function/variable is defined and
modify it if I want to.
&lt;/p&gt;

&lt;p&gt;
Useful variables and functions:
&lt;/p&gt;
&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;ess-r-package-library-path&lt;/dt&gt;&lt;dd&gt;default path to find packages, should
be a list&lt;/dd&gt;
&lt;dt&gt;ess-r-package-root-file&lt;/dt&gt;&lt;dd&gt;if the folder has a DESCRIPTION file, then
the folder is an R package.&lt;/dd&gt;
&lt;dt&gt;(ess-build-tags-for-directory DIR TAGFILE)&lt;/dt&gt;&lt;dd&gt;build tags for DIR and write them to TAGFILE.&lt;/dd&gt;
&lt;dt&gt;tags-table-list&lt;/dt&gt;&lt;dd&gt;List of file names of tags tables to search.&lt;/dd&gt;
&lt;dt&gt;(visit-tags-table FILE &amp;amp;optional LOCAL)&lt;/dt&gt;&lt;dd&gt;Tell tags commands to use
tags table file.&lt;/dd&gt;
&lt;/dl&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; new variable 
(defvar ess-r-package-library-tags nil
  &quot;A TAG file for multiple R packages.&quot;)

(setq ess-r-package-library-path &apos;(&quot;~/tmp/feather/R&quot; &quot;~/tmp/RPostgres/&quot;))
(setq ess-r-package-library-tags &quot;~/tmp/all_tags&quot;)

(dolist (pkg-path ess-r-package-library-path)
  (let ((pkg-name (ess-r-package--find-package-name pkg-path)))
    (unless (and pkg-name pkg-path
                 (file-exists-p (expand-file-name ess-r-package-root-file pkg-path)))
      (error &quot;Not a valid package. No &apos;%s&apos; found in `%s&apos;.&quot; ess-r-package-root-file pkg-path))
    (ess-build-tags-for-directory pkg-path ess-r-package-library-tags)
    ))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Note the workhorse is &lt;code&gt;ess-build-tags-for-directory&lt;/code&gt;, which does what
its name says. The core of this function uses the &lt;code&gt;find&lt;/code&gt; and &lt;code&gt;etags&lt;/code&gt; programs.
The &lt;code&gt;find&lt;/code&gt; program finds files with extensions such as .cpp, .R and .Rnw, and
pipes them to the &lt;code&gt;etags&lt;/code&gt; program, which generates a TAG
table. These two steps are demonstrated in the following snippet,
which is taken from the source code of
&lt;code&gt;ess-build-tags-for-directory&lt;/code&gt;.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(setq find-cmd (format &quot;find %s -type f -size 1M \\( -regex \&quot;.*\\.\\(cpp\\|jl\\|[RsrSch]\\(nw\\)?\\)$\&quot; \\)&quot; (car ess-r-package-library-path)))

(setq regs (delq nil (mapcar (lambda (l)
                               (if (string-match &quot;&apos;&quot; (cadr l))
                                   nil ;; remove for time being
                                 (format &quot;/%s/\\%d/&quot;
                                         (replace-regexp-in-string &quot;/&quot; &quot;\\/&quot; (nth 1 l) t)
                                         (nth 2 l))))
                             imenu-generic-expression)))
(setq tags-cmd (format &quot;etags -o %s --regex=&apos;%s&apos; -&quot; &quot;~/lala&quot;
                       (mapconcat &apos;identity regs &quot;&apos; --regex=&apos;&quot;)))

(setq sh-cmd (format &quot;%s | %s&quot; find-cmd tags-cmd))
(shell-command sh-cmd)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
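
&lt;p&gt;
To see which files the &lt;code&gt;-regex&lt;/code&gt; pattern in the &lt;code&gt;find&lt;/code&gt; command
selects, here is a small Python sketch. The pattern is my translation of the
one above into Python syntax, and the file names are made up for
illustration. The &lt;code&gt;-type&lt;/code&gt; and &lt;code&gt;-size&lt;/code&gt; filters are not modelled.
&lt;/p&gt;

```python
import re

# Python translation of the -regex pattern passed to find in the snippet above.
SOURCE_FILE = re.compile(r".*\.(cpp|jl|[RsrSch](nw)?)$")


def is_tagged_source(path):
    """Return True if path would be selected by the find pattern."""
    return SOURCE_FILE.match(path) is not None


# Hypothetical file names, for illustration only.
candidates = [
    "pkg/R/read.R",
    "pkg/src/io.cpp",
    "pkg/vignettes/intro.Rnw",
    "pkg/src/io.h",
    "pkg/DESCRIPTION",
    "pkg/README.md",
]
selected = [p for p in candidates if is_tagged_source(p)]
print(selected)
```

&lt;p&gt;
Only the R, C/C++ and noweb sources are kept; DESCRIPTION and README.md are
skipped because their names do not end in one of the listed extensions.
&lt;/p&gt;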

&lt;p&gt;
Note that when this is used in Emacs, the path to the new TAG table is
appended to the &lt;i&gt;tags-table-list&lt;/i&gt; variable, so that the user can use
&lt;code&gt;xref-find-definitions&lt;/code&gt; (&lt;code&gt;M-.&lt;/code&gt;) to jump directly (if point is on a word) or
select which function/variable to jump to. The user can then check the
function/variable definition, or modify it if necessary, and then
call &lt;code&gt;xref-pop-marker-stack&lt;/code&gt; (&lt;code&gt;M-,&lt;/code&gt;) to jump back.
&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Compare RPostgres and RPostgreSQL Package</title>
   <link href="http://yitang.uk/2016/04/14/compare-rpostgres-and-rpostgresql-package/"/>
   <updated>2016-04-14T00:00:00+01:00</updated>
   <id>http://yitang.uk/2016/04/14/compare-rpostgres-and-rpostgresql-package</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot;
    src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;
R is a great language for R&amp;amp;D. It&apos;s fast to write prototypes, and has great
visualisation tools. One of the constraints of R is that it stores data in
system memory. When the data becomes too big to fit in memory, we
ask the user to manually split the dataset and then aggregate
the output later. This process is inefficient and error-prone for a
non-technical user.
&lt;/p&gt;

&lt;p&gt;
I started an R development project to automate this split-aggregate
process. A viable solution is to store the whole dataset in PostgreSQL,
and let R fetch one small chunk of the data at a time, do the
calculation, and then save the output to PostgreSQL. This solution
requires frequent data transfer between the two systems,
which could be a performance bottleneck. So I did a comparison of
two R packages that interface R with PostgreSQL.
&lt;/p&gt;

&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;&lt;a href=&quot;https://cran.r-project.org/web/packages/RPostgreSQL/index.html&quot;&gt;RPostgreSQL&lt;/a&gt;&lt;/dt&gt;&lt;dd&gt;was supported and developed as part of the Google Summer of
Code 2008 program. It is currently out of development. The last
release was in 2013.&lt;/dd&gt;
&lt;dt&gt;&lt;a href=&quot;https://github.com/rstats-db/RPostgres&quot;&gt;RPostgres&lt;/a&gt;&lt;/dt&gt;&lt;dd&gt;is a new package which provides similar functionality
to RPostgreSQL but is rewritten in C++ using Rcpp. The development is
led by &lt;a href=&quot;https://github.com/krlmlr&quot;&gt;Kirill Müller&lt;/a&gt;.&lt;/dd&gt;
&lt;/dl&gt;


&lt;p&gt;
Based on my testing, the RPostgres package is about 30% faster than
 RPostgreSQL.
&lt;/p&gt;

&lt;p&gt;
The testing set-up is quite simple: I wrote an R script to send data to
and get data out of a remote PostgreSQL database. It logs how long
each task takes to complete in R. To reduce the effect of other factors that can
affect the speed, it repeats this process 20 times and uses the
minimum run-time as the final score. The dataset transferred between
R and PostgreSQL is a flat table with three columns, and the number of
rows varies from ten thousand to one million.
&lt;/p&gt;
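
&lt;p&gt;
The &quot;repeat and take the minimum&quot; idea can be sketched in a few lines of
Python. This is only an illustration of the scoring rule, not the R script
used for the test; &lt;code&gt;best_of&lt;/code&gt; and &lt;code&gt;fake_transfer&lt;/code&gt; are
hypothetical names, and the latter stands in for a real database read or write.
&lt;/p&gt;

```python
import time


def best_of(task, repeats=20):
    """Run task repeatedly and return the minimum wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return min(timings)


def fake_transfer():
    """Hypothetical stand-in for a database read or write."""
    sum(range(10000))


score = best_of(fake_transfer, repeats=20)
print("best of 20 runs: %.6f s" % score)
```

&lt;p&gt;
The minimum is used rather than the mean because network hiccups and other
background activity can only make a run slower, never faster.
&lt;/p&gt;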

&lt;p&gt;
The run-times in seconds are plotted against the number of rows for each
package and operation.
&lt;/p&gt;

&lt;p&gt;
&lt;img src=&quot;/assets/figure-1510lYU.png&quot; alt=&quot;nil&quot;/&gt;
&lt;/p&gt;

&lt;p&gt;
Here is a summary of what I observed: 
&lt;/p&gt;
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;RPostgreSQL is slower than RPostgres. For getting data out, it&apos;s 75%
slower, which is massive! For writing, the difference is smaller, it&apos;s
about 20%. When both scores are combined, it is about 33% slower.&lt;/li&gt;
&lt;li&gt;In particular, it&apos;s slower to read than to write for the RPostgreSQL
package, the ratio is about 1.5, whereas it&apos;s quicker to read than
to write for RPostgres, the ratio is about 0.8. This is an interesting
observation.&lt;/li&gt;
&lt;li&gt;Both packages have a nice feature - the reading/writing time
depends linearly on the number of rows. This makes the time
estimation reliable. I would be confident to say that for 2
million rows, it takes the RPostgres package about 6 seconds to
read.&lt;/li&gt;
&lt;/ol&gt;
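
&lt;p&gt;
The extrapolation in the last point is just a straight-line fit. Here is a
minimal Python sketch of it. The numbers below are illustrative only, not
the measurements from the plot: they are chosen to lie on a line of 3
seconds per million rows, which is an assumption implied by the 6-second
estimate. &lt;code&gt;fit_line&lt;/code&gt; is a hypothetical helper.
&lt;/p&gt;

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Illustrative numbers only, NOT the measured data: they lie exactly on a
# line of 3 seconds per million rows.
rows = [250000, 500000, 750000, 1000000]
seconds = [0.75, 1.5, 2.25, 3.0]

intercept, slope = fit_line(rows, seconds)
estimate = intercept + slope * 2000000  # extrapolate to 2 million rows
print("estimated read time for 2 million rows: %.2f s" % estimate)
```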

&lt;p&gt;
I don&apos;t know which part of the implementation makes RPostgres faster.
My guess is that it&apos;s the use of C++ and the magical Rcpp package.
&lt;/p&gt;

&lt;p&gt;
Here is the script, just in case you want to run your own tests.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;                     
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;microbenchmark&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RPostgreSQL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DBI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   
                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# config for PostgreSQL database&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host.name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;database.name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.user&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.passwd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.port&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary.table.name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# config for testing&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrows&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1e3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1e6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeats&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;


                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# open PostgreSQL connection&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.RPostgreSQL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbConnect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbDriver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;PostgreSQL&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;database.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.passwd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.port&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.RPostgres&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbConnect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RPostgres&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Postgres&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;database.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.passwd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;postgres.port&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadWriteWarpper&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.connection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# helper function &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbWriteTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.connection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary.table.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;overwrite&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbReadTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.connection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;temporary.table.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# create a dataset&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;LETTERS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# character&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rnorm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# double&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample.int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# integer&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# read and write once first.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# run and log run-time&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;microbenchmark&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                             &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                             &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;times&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# parse &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_row&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# aggregate and return&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rbindlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

                                        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# run&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.RPostgres&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pg.RPostgreSQL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;package&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;RPostgres&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;package&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;RPostgreSQL&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rbind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot.df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1e9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;package&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## generate plot&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot.df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gsub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\\(|\\)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot.df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;col&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;package&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;facet_wrap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme_bw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Number of rows&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Run time (sec)&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
</content>
 </entry>
 
 <entry>
   <title>How to Create a Screencast GIF in Emacs</title>
   <link href="http://yitang.uk/2015/09/24/how-to-create-a-screencast-gif-in-emacs/"/>
   <updated>2015-09-24T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/09/24/how-to-create-a-screencast-gif-in-emacs</id>
   <content type="html">&lt;p&gt;
&lt;img src=&quot;/assets/TDD_Final.gif&quot; alt=&quot;nil&quot;/&gt;
&lt;/p&gt;

&lt;p&gt;
I&apos;ve always wanted to create a GIF using Emacs to demonstrate some
features, because it just looks so cool. I finally got a chance after
attending the &lt;a href=&quot;http://leedscodedojo.github.io/&quot;&gt;Leeds Code Dojo&lt;/a&gt;. The final exercise was a bit unusual: we
had to write a basic expression evaluation program without using the
&lt;code&gt;eval&lt;/code&gt; function in whatever language we chose. The first problem we
had was to figure out the order in which to evaluate the sub-expressions. For
example, in the expression (5 * (2 + 1)), we know to first add 2 and 1
to get 3, and then multiply 3 by 5. It sounds trivial, but it is
actually hard to write a program to do that.
&lt;/p&gt;

&lt;p&gt;
I used a regular expression&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; to locate the innermost
expression, then replaced that expression with the result of
evaluating it, and repeated these two steps until there was no
expression left&lt;sup&gt;&lt;a id=&quot;fnr.2&quot; class=&quot;footref&quot; href=&quot;#fn.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.
&lt;/p&gt;

&lt;p&gt;
The GIF above shows each step of an expression evaluation program
written in Emacs Lisp.
&lt;/p&gt;

&lt;p&gt;
This post shows how to make a GIF in Emacs on an Ubuntu system.
&lt;/p&gt;


&lt;div id=&quot;outline-container-org0572864&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org0572864&quot;&gt;Dependencies&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org0572864&quot;&gt;

&lt;p&gt;
There are three packages to install first. We need &lt;code&gt;recordmydesktop&lt;/code&gt;
to capture the motion of the screen, &lt;code&gt;mplayer&lt;/code&gt; to view the video, and
&lt;code&gt;imagemagick&lt;/code&gt; to convert the recorded video into a GIF file. They can be
installed easily using the &lt;code&gt;apt-get&lt;/code&gt; command, as in the following
shell command:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;recordmydesktop mplayer imagemagick&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
On the Emacs side, I use the &lt;code&gt;camcorder&lt;/code&gt; package to control the
workflow. It is hosted in the MELPA repository, and can be installed by
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(package-install &apos;camcorder)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Then everything should work nicely together.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org0fbe074&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org0fbe074&quot;&gt;Workflow&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org0fbe074&quot;&gt;

&lt;p&gt;
After these packages are installed, creating a GIF is simple,
requiring only three steps.
&lt;/p&gt;

&lt;p&gt;
&lt;b&gt;1. Initiate the recording&lt;/b&gt; 
&lt;/p&gt;

&lt;p&gt;
In Emacs, 
&lt;/p&gt;
&lt;ul class=&quot;org-ul&quot;&gt;
&lt;li&gt;Switch to the buffer we want to record; let&apos;s call this buffer the
recording buffer.&lt;/li&gt;
&lt;li&gt;Initiate the recording with the &lt;code&gt;M-x camcorder-record&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;Choose where to save the video file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
A new frame with the recording buffer will pop up. It is wrapped inside
a white rectangular box. Everything inside the box will be recorded and
saved in the video file. Note that if we move the window or cover it
with other windows, we will probably get undesired results.
&lt;/p&gt;

&lt;p&gt;
&lt;b&gt;2. Record&lt;/b&gt;
Choose the recording buffer/frame, 
&lt;/p&gt;
&lt;ul class=&quot;org-ul&quot;&gt;
&lt;li&gt;Press &lt;code&gt;F11&lt;/code&gt; to pause/resume,&lt;/li&gt;
&lt;li&gt;Show some cool things,&lt;/li&gt;
&lt;li&gt;Press &lt;code&gt;F12&lt;/code&gt; to stop.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
Note that the demonstration must have a visible effect on the recording
buffer. We can use the &lt;code&gt;with-current-buffer&lt;/code&gt; macro to send output to a
particular buffer; for example,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(with-current-buffer &quot;Demo_Buffer&quot;
  (insert &quot;Start to demo: &quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
will insert &quot;Start to demo: &quot; into the Demo_Buffer. 
&lt;/p&gt;

&lt;p&gt;
I find it useful to wrap the demonstration in a function and bind it
to a key, because I will probably run it many times.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(defun yt/camcorder-show-off ()
  (interactive)
  (goto-char (point-min))
  (insert &quot;going to show you something cool, don&apos;t blink your eyes.&quot;)
  (sleep-for 2)
  ;;;; apply some functions
  (insert &quot;\nExciting isn&apos;t?&quot;))

(define-key camcorder-mode-map [f5] &apos;yt/camcorder-show-off)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Two functions are helpful for controlling the flow. Use the
&lt;code&gt;sleep-for&lt;/code&gt; function to make the program wait a few seconds, and use
&lt;code&gt;y-or-n-p&lt;/code&gt; to let us choose whether to proceed or switch flow.
&lt;/p&gt;

&lt;p&gt;
&lt;b&gt;3. Make gif&lt;/b&gt;
&lt;/p&gt;

&lt;p&gt;
After the demo is finished, 
&lt;/p&gt;

&lt;ul class=&quot;org-ul&quot;&gt;
&lt;li&gt;Type &lt;code&gt;M-x camcorder-convert&lt;/code&gt; to convert a video file to a GIF file,&lt;/li&gt;
&lt;li&gt;Choose a file name for the GIF file,&lt;/li&gt;
&lt;li&gt;Select the conversion method; I choose &lt;code&gt;mplayer&lt;/code&gt; with &lt;code&gt;imagemagick&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
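
&lt;p&gt;
The &lt;code&gt;mplayer&lt;/code&gt; with &lt;code&gt;imagemagick&lt;/code&gt; route can also be followed by hand from
a shell. The following is only a rough sketch of what such a conversion
involves, and the commands &lt;code&gt;camcorder&lt;/code&gt; actually runs may well differ. The
function name &lt;code&gt;video_to_gif&lt;/code&gt;, the &lt;code&gt;RUN&lt;/code&gt; variable, and the file names are
made up for illustration.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;#!/usr/bin/env bash
# Sketch only: convert a recorded video into a GIF by hand.
# video_to_gif, RUN and the file names are illustrative assumptions.
video_to_gif() {
  local video=$1 gif=$2 frames=${3:-frames}
  # 1. Dump every frame of the video as a JPEG file, without audio.
  $RUN mplayer -ao null &quot;$video&quot; -vo &quot;jpeg:outdir=$frames&quot;
  # 2. Combine the frames into a single optimised GIF.
  $RUN convert &quot;$frames&quot;/*.jpg -layers Optimize &quot;$gif&quot;
}

# RUN=echo prints the commands instead of running them (a dry run).
RUN=echo
video_to_gif demo.ogv demo.gif&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
With &lt;code&gt;RUN=echo&lt;/code&gt; the two commands are only printed; setting &lt;code&gt;RUN&lt;/code&gt; to an empty
value runs them for real.
&lt;/p&gt;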

&lt;p&gt;
We will probably repeat steps 1-3 several times until we are happy
with the GIF.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org28647c7&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org28647c7&quot;&gt;Reference&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org28647c7&quot;&gt;

&lt;p&gt;
&lt;a href=&quot;http://emacs.stackexchange.com/questions/798/recording-a-gif-screencast-of-emacs&quot;&gt;Recording a GIF screencast of Emacs&lt;/a&gt;
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;footnotes&quot;&gt;
&lt;h2 class=&quot;footnotes&quot;&gt;Footnotes: &lt;/h2&gt;
&lt;div id=&quot;text-footnotes&quot;&gt;

&lt;div class=&quot;footdef&quot;&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; class=&quot;footnum&quot; href=&quot;#fnr.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;div class=&quot;footpara&quot; role=&quot;doc-footnote&quot;&gt;&lt;p class=&quot;footpara&quot;&gt;Regular expressions might not be
the most suitable tool for this task, but they work.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;footdef&quot;&gt;&lt;sup&gt;&lt;a id=&quot;fn.2&quot; class=&quot;footnum&quot; href=&quot;#fnr.2&quot; role=&quot;doc-backlink&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; &lt;div class=&quot;footpara&quot; role=&quot;doc-footnote&quot;&gt;&lt;p class=&quot;footpara&quot;&gt;Everything is actually an expression&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;


&lt;/div&gt;
&lt;/div&gt;</content>
 </entry>
 
 <entry>
   <title>Migrate to Ubuntu</title>
   <link href="http://yitang.uk/2015/09/06/migrate-to-ubuntu/"/>
   <updated>2015-09-06T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/09/06/migrate-to-ubuntu</id>
   <content type="html">&lt;p&gt;
My MacBook Pro&apos;s hard drive stopped working last week, and I managed to
recover most of the data from a Time Machine backup made 6 months ago. But
I couldn&apos;t get mu4e and mu working. I got fed up with googling and
trial and error, and decided to migrate to Ubuntu. It should save me a
lot of frustration and time in making my Mac and office PC work the same
way.
&lt;/p&gt;

&lt;p&gt;
Ideally, I would build an Ubuntu system on the Mac that is exactly the same as the
one on my office PC, by just copying over everything &lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. As a minimalist, however, I
decided to build the system from scratch and install software one by
one, so that I can have a better understanding of what the
necessities are for me.
&lt;/p&gt;

&lt;p&gt;
In the last few days, I became extra mindful about what I use on the
Ubuntu system in the office and how I use it, and realised the things I need
can be grouped into three categories:
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;Configuration,
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;the &lt;i&gt;.ssh&lt;/i&gt; folder for the &lt;i&gt;ssh-agent&lt;/i&gt;,&lt;/li&gt;
&lt;li&gt;the &lt;i&gt;.fonts&lt;/i&gt; folder for new fonts,&lt;/li&gt;
&lt;li&gt;the &lt;i&gt;.mbsyncrc&lt;/i&gt; file for syncing emails,&lt;/li&gt;
&lt;li&gt;the &lt;i&gt;.ledgerrc&lt;/i&gt;.&lt;/li&gt;
&lt;/ol&gt;&lt;/li&gt;

&lt;li&gt;Software for
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;Development: like git, gcc, Emacs, and R.&lt;/li&gt;
&lt;li&gt;Writing: org-mode, LaTeX,&lt;/li&gt;
&lt;li&gt;Email: mu, mu4e, and mbsync.&lt;/li&gt;
&lt;li&gt;Finance: ledger.&lt;/li&gt;
&lt;/ol&gt;&lt;/li&gt;

&lt;li&gt;Personal git repositories
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;public repositories on GitHub,&lt;/li&gt;
&lt;li&gt;private repositories on BitBucket.&lt;/li&gt;
&lt;/ol&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
For 1), since they are small, I can zip them up and copy them over, or even
better, create a git repository so that syncing the two machines becomes
easier.
&lt;/p&gt;

&lt;p&gt;
For 2), I need to find each piece of software&apos;s package name in Ubuntu&apos;s
software repositories, and then install all of them with a script. The
dependencies should be resolved automatically.
&lt;/p&gt;
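
&lt;p&gt;
Such an install script could look like the sketch below. The package
names are my best guess at the Ubuntu names for the software listed
above (for example &lt;code&gt;maildir-utils&lt;/code&gt; for mu and &lt;code&gt;isync&lt;/code&gt; for mbsync), so each
should be checked with &lt;code&gt;apt-cache search&lt;/code&gt; first. The &lt;code&gt;install_packages&lt;/code&gt;
function and the &lt;code&gt;DRY_RUN&lt;/code&gt; variable are made up for illustration.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;#!/usr/bin/env bash
# Sketch only: install the software listed above in one go.
# The package names are assumptions; verify them with apt-cache search.
packages=(
  git gcc emacs r-base      # development
  texlive                   # writing (org-mode ships with Emacs)
  maildir-utils mu4e isync  # email: mu, mu4e and mbsync
  ledger                    # finance
)

# Print the command by default; set DRY_RUN=0 to really install.
install_packages() {
  if [ &quot;${DRY_RUN:-1}&quot; = &quot;1&quot; ]; then
    echo &quot;sudo apt-get install -y ${packages[*]}&quot;
  else
    sudo apt-get update
    sudo apt-get install -y &quot;${packages[@]}&quot;
  fi
}

install_packages&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
By default the script only prints the command it would run, which makes
it easy to review the list before committing to it with &lt;code&gt;DRY_RUN=0&lt;/code&gt;.
&lt;/p&gt;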

&lt;p&gt;
For 3), I need to create a shared folder between the host system and the
Ubuntu system, and then copy over the &lt;i&gt;~/git/&lt;/i&gt; folder. 
&lt;/p&gt;

&lt;p&gt;
It really sounds like a plan! I am going to download the Ubuntu
installation file now and hopefully the transition will be very smooth.
&lt;/p&gt;
&lt;div id=&quot;footnotes&quot;&gt;
&lt;h2 class=&quot;footnotes&quot;&gt;Footnotes: &lt;/h2&gt;
&lt;div id=&quot;text-footnotes&quot;&gt;

&lt;div class=&quot;footdef&quot;&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; class=&quot;footnum&quot; href=&quot;#fnr.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;div class=&quot;footpara&quot; role=&quot;doc-footnote&quot;&gt;&lt;p class=&quot;footpara&quot;&gt;
see &lt;a href=&quot;http://askubuntu.com/questions/111236/how-to-migrate-the-whole-system-to-a-new-machine&quot;&gt;http://askubuntu.com/questions/111236/how-to-migrate-the-whole-system-to-a-new-machine&lt;/a&gt;
&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;


&lt;/div&gt;
&lt;/div&gt;</content>
 </entry>
 
 <entry>
   <title>My Expeirence with Repetitive Strain Injury (RSI)</title>
   <link href="http://yitang.uk/2015/07/19/my-expeirence-with-repetitive-strain-injury-rsi/"/>
   <updated>2015-07-19T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/07/19/my-expeirence-with-repetitive-strain-injury-rsi</id>
   <content type="html">&lt;p&gt;
On some days I typed more than 80 thousand keystrokes in Emacs alone. That
looks pretty awesome at first sight, but it can cause serious health problems.
&lt;/p&gt;

&lt;p&gt;
Last month, I felt a burning pain in my forearms. It is a symptom
of &lt;a href=&quot;http://www.nhs.uk/conditions/Repetitive-strain-injury/Pages/Introduction.aspx&quot;&gt;Repetitive strain injury (RSI)&lt;/a&gt;. I realised that if I continue typing
like that, one day I will no longer be able to program, like the Emacs
celebrities in Xah Lee&apos;s &lt;a href=&quot;http://ergoemacs.org/emacs/emacs_hand_pain_celebrity.html&quot;&gt;article&lt;/a&gt; about RSI.
&lt;/p&gt;

&lt;p&gt;
Since then I&apos;ve deliberately tried to avoid aimless and unproductive
typing, take more typing breaks, think things through before typing,
and write more on paper.
&lt;/p&gt;

&lt;p&gt;
Things are getting better: I don&apos;t feel severe pain any more, only
occasional discomfort.
&lt;/p&gt;

&lt;p&gt;
But I need to find a better way to improve it, because sometimes I get
an idea but can&apos;t touch the keyboard. This feeling really sucks.
&lt;/p&gt;

&lt;p&gt;
So I investigated the &lt;a href=&quot;https://github.com/abo-abo/hydra&quot;&gt;Hydra package&lt;/a&gt; and used it to group related
commands together, so that only two keys are needed to perform
frequent tasks.
&lt;/p&gt;

&lt;p&gt;
For example, to search for something in the current project, instead of typing
&lt;code&gt;M-x helm proj grep&lt;/code&gt;, which is 16 keystrokes, I only need &lt;code&gt;F5 G&lt;/code&gt; with
Hydra. The implementation is listed in &lt;a href=&quot;http://blog.yitang.uk/2015/04/17/group-emacs-search-functions-using-hydra/&quot;&gt;this post&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
But calling functions/commands in Emacs accounts for only a small proportion
of my typing; most of the time, I write code and reports.
&lt;/p&gt;

&lt;p&gt;
This is where &lt;a href=&quot;https://github.com/capitaomorte/yasnippet&quot;&gt;Yasnippet&lt;/a&gt; kicks in: it enables me to type less without
losing quality. For example, I use this snippet quite often when
writing R code,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sapply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;seq_len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
That&apos;s more than 40 keystrokes. Yasnippet can shorten it to only 6, plus one &lt;code&gt;TAB&lt;/code&gt;!
After I type &lt;code&gt;sapply&lt;/code&gt; and then hit &lt;code&gt;TAB&lt;/code&gt;, it expands to the code
above.
&lt;/p&gt;

&lt;p&gt;
I will investigate the Yasnippet package next week. If you know any
good tutorials for Yasnippet or snippets for writing R code, please
share your resources.
&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Start Enjoying Regular Expression In Emacs</title>
   <link href="http://yitang.uk/2015/06/30/start-enjoying-regular-expression-in-emacs/"/>
   <updated>2015-06-30T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/06/30/start-enjoying-regular-expression-in-emacs</id>
   <content type="html">&lt;blockquote&gt;
&lt;p&gt;
The &lt;code&gt;search-forward-regexp&lt;/code&gt;, &lt;code&gt;replace-match&lt;/code&gt;, and &lt;code&gt;match-string&lt;/code&gt;
functions work together nicely, and make my job much easier and more enjoyable!
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;
I am writing release notes for a software update. Part of the
process is to associate important changes with their SVN revision
numbers, so that others can backtrack, review the code, and
see exactly what has been implemented.
&lt;/p&gt;

&lt;p&gt;
In &lt;a href=&quot;http://phabricator.org/&quot;&gt;Phabricator&lt;/a&gt;, the revision numbers are rendered as links automatically.
Clicking one takes me to the exact revision, showing the differences
from the previous version. But the documentation will eventually be built
by &lt;a href=&quot;http://sphinx-doc.org/index.html#&quot;&gt;Sphinx&lt;/a&gt; and hosted on a remote server, so I have to manually add the
URL to every SVN revision number. For example, rS1234 has to be replaced with
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;org6d0d84b&quot;&gt;
[[http://phabricator.domain.co.uk/rS1234][rS1234]]
&lt;/pre&gt;

&lt;p&gt;
There are 31 revision numbers in the whole document. I could do it
by hand, but in the long term it is more efficient to
write a function that processes them automatically, and maybe others can use it
as well.
&lt;/p&gt;

&lt;div id=&quot;outline-container-orgc83cd90&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgc83cd90&quot;&gt;Implementation&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgc83cd90&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The first thing I noticed is that each SVN revision number consists of two
letters (&lt;i&gt;rS&lt;/i&gt;) and a few digits. Because I don&apos;t know the digits
beforehand, I have to use a regular expression to do the pattern search.
&lt;/p&gt;

&lt;p&gt;
The tricky bit here is to retrieve the value that matched the
pattern, because it is needed to construct the URL that points to
the commit, and I also need to replace it with a different value.
&lt;/p&gt;

&lt;p&gt;
The procedure can be summarised as: 
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;Find the revision numbers that match the pattern described above. I
use &lt;code&gt;search-forward-regexp()&lt;/code&gt; to search for the pattern &quot;rS[0-9]+&quot;, which
means a string that starts with &lt;i&gt;rS&lt;/i&gt; followed by one or more digits.&lt;/li&gt;
&lt;li&gt;Retrieve the value that matched the pattern. This is done by &lt;code&gt;match-string()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Replace the revision number with the constructed URL. This is done by
&lt;code&gt;replace-match()&lt;/code&gt;, and I use &lt;code&gt;concat()&lt;/code&gt; to combine the IP address with the
revision number.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;
The following is a working implementation:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(defvar revision-pattern &quot;rS[0-9]+&quot;
  &quot;The RegExp pattern of the SVN revision number&quot;)

(defvar repo-url &quot;http://10.0.0.11/&quot;
  &quot;The IP address of the SVN repository&quot;)

(defun yt/add-link-to-SVN-revision-number ()
  &quot;add links to svn commits identifier&quot;
  (interactive)
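  ;; NOERROR = t makes the search return nil when nothing is left,
  ;; instead of signalling an error, so the loop exits cleanly.
  (while (search-forward-regexp revision-pattern nil t)
    (let* ((commit (match-string 0))
           (link (concat repo-url commit)))
      ;; Delete the matched text, then insert an Org link in its place.
      (replace-match &quot;&quot;)
      (org-insert-link nil link commit))))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;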

&lt;p&gt;
Note the last two lines of the function can be simplified as 
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(replace-match (concat &quot;[[&quot; link
                       &quot;][&quot; commit &quot;]]&quot;))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
You can easily adapt the code to your own case:
just modify the &lt;code&gt;revision-pattern&lt;/code&gt; and &lt;code&gt;repo-url&lt;/code&gt; variables. But
beware that you should not apply the function to the same buffer more than
once, otherwise you will get something crazy like this:
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;org128b034&quot;&gt;
[[http://10.0.0.11/[[http://10.0.0.11/rS1234][rS1234]]][[[http://10.0.0.11/rS1234][rS1234]]]]
&lt;/pre&gt;

&lt;p&gt;
One way to make it better is to test before replacing: if the
revision number is already associated with a URL, do nothing. If
you have figured out how to do it, please let me know and I&apos;ll be happy to
update this post.
&lt;/p&gt;

&lt;p&gt;
My posts published last year showed my frustration with regular expressions in
Emacs. But now I am looking forward to doing more text processing with
them, because it will be fun!
&lt;/p&gt;

&lt;p&gt;
The &lt;code&gt;search-forward-regexp&lt;/code&gt;, &lt;code&gt;replace-match&lt;/code&gt;, and &lt;code&gt;match-string&lt;/code&gt;
functions work together nicely and make my job much easier and more
enjoyable!
&lt;/p&gt;

&lt;p&gt;
What are your favourite regular expression functions? Do you have
something to recommend?
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Import Irregular Data Files Into R With Regular Expression - an BODC Example</title>
   <link href="http://yitang.uk/data%20science/2015/06/25/import-irregular-data-files-into-r-with-regular-expression-an-bodc-example/"/>
   <updated>2015-06-25T00:00:00+01:00</updated>
   <id>http://yitang.uk/data%20science/2015/06/25/import-irregular-data-files-into-r-with-regular-expression--an-bodc-example</id>
   <content type="html">&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#org8352591&quot;&gt;The Irregular Data Files&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgfe0c9d8&quot;&gt;First Attempt - Skip Lines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org89b7ad0&quot;&gt;Second Attempt - Remove Lines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org12355a8&quot;&gt;Third Attempt - Capture Lines (RegExp)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org6ed2929&quot;&gt;Code and Sample Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;
The first step in data analysis is to get the data into the modelling
platform. But it may not be as straightforward as it used to be, since
nowadays statisticians are more likely to face data files that are not
in CSV or another format that can be fed directly to the &lt;code&gt;read.table()&lt;/code&gt;
function in R. In such cases, we need to understand the structure of the
data files and pre-process them first. My general
strategy is to discard the unnecessary information in the data files
and hopefully leave regular data files behind.
&lt;/p&gt;

&lt;p&gt;
In last week&apos;s post, &lt;a href=&quot;http://blog.yitang.uk/2015/06/18/why-i-should-explore-regular-expression-and-why-i-havent/&quot;&gt;Why I Should Explore Regular Expression and
Why I Haven&apos;t&lt;/a&gt;, I expressed my interest in regular expressions, and luckily
I got a chance to use them for getting data into R. They give me a
different strategy: pick only what I am interested in.
&lt;/p&gt;

&lt;div id=&quot;outline-container-org8352591&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org8352591&quot;&gt;The Irregular Data Files&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org8352591&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The task is simple: I have about 1,800 &lt;code&gt;.text&lt;/code&gt; data files downloaded
from &lt;a href=&quot;http://www.bodc.ac.uk/about/what_is_bodc/&quot;&gt;British Oceanographic Data Centre (BODC)&lt;/a&gt;. They are the historical
tidal data and are separated by year and by port. I need to combine all
the data into one giant table in R, and save it later for modelling.
&lt;/p&gt;

&lt;p&gt;
One sample data file looks like this:
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;org22829ea&quot;&gt;
Port:              P035
Site:              Wick
Latitude:          58.44097
Longitude:         -3.08631
Start Date:        01JAN1985-00.00.00
End Date:          03OCT1985-19.00.00
Contributor:       National Oceanography Centre, Liverpool
Datum information: The data refer to Admiralty Chart Datum (ACD)
Parameter code:    ASLVZZ01 = Surface elevation (unspecified datum) of the water body                      
  Cycle    Date      Time      ASLVZZ01     Residual  
 Number yyyy mm dd hh mi ssf           f            f 
     1) 1985/01/01 00:00:00      1.0300      -0.3845  
     2) 1985/01/01 01:00:00      1.0400      -0.3884  
     3) 1985/01/01 02:00:00      1.2000      -0.3666
&lt;/pre&gt;

&lt;p&gt;
The first 9 lines are the metadata, which describe the port ID, the name
and location of the port, and other information about the data.
Lines 10 and 11 are the headers of the data matrix.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgfe0c9d8&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgfe0c9d8&quot;&gt;First Attempt - Skip Lines&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgfe0c9d8&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
After a glimpse of the data sample, my first thought was to
skip the first 11 lines (the metadata and the two header lines) and treat
the rest as a regular data file that uses spaces as the separator. This is
easily done by using &lt;code&gt;read.table()&lt;/code&gt; with the &lt;code&gt;skip = 11&lt;/code&gt; option.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;skip&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## error&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;p&gt;
It turned out that this approach won&apos;t work for some files: when
the way of measuring the tide changed, the date and port were
flagged, leaving a second chunk of the data matrix, again preceded by
metadata and a few other characters. It looks like this:
&lt;/p&gt;

&lt;pre class=&quot;example&quot; id=&quot;org406e04d&quot;&gt;
;; end of first chunk 

########################################
 Difference in instrument
########################################

Port: P035
;; other metadata 
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org89b7ad0&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org89b7ad0&quot;&gt;Second Attempt - Remove Lines&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org89b7ad0&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
Although the first attempt wasn&apos;t a success, I learnt a bit about the
structure of the data files. Based on that, I came up with a
second approach: read the data file into R as a vector of strings, one
element per line, and then remove all the lines that are metadata.
They start with &lt;i&gt;Port:&lt;/i&gt;, &lt;i&gt;Site:&lt;/i&gt;, &lt;i&gt;Longitude:&lt;/i&gt; etc., or belong to the &lt;i&gt;###&lt;/i&gt;
chunk. It can be done with the &lt;code&gt;grep&lt;/code&gt; function, which tells me exactly
which elements of the vector contain the metadata.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;readLines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metainfo.list&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Port:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Site:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Latitude:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Longitude:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Start Date:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;End Date:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Contributor:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Datum information:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Parameter code:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta.line.num&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sapply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metainfo.list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;grep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res.2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta.line.num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
This approach works well as long as &lt;i&gt;metainfo.list&lt;/i&gt; contains &lt;b&gt;all&lt;/b&gt;
the lines I&apos;d like to remove. The downside is that I won&apos;t be able to
know whether I&apos;ve included all of them until the whole process is finished. So
while I was waiting for the program to finish, I came up with a third
approach, a better one.
&lt;/p&gt;
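
&lt;p&gt;
As a minimal sketch of a more defensive variant of the second approach, the
prefixes can be collapsed into one anchored pattern and the lines filtered with
&lt;code&gt;grepl()&lt;/code&gt;. The sample lines, the shortened prefix list and the name
&lt;code&gt;meta.pattern&lt;/code&gt; are assumptions made for illustration. A logical mask
still works when a prefix matches a different number of lines than the others
(in which case &lt;code&gt;sapply()&lt;/code&gt; returns a list that cannot be negated) and
when nothing matches at all (in which case &lt;code&gt;s[-integer(0)]&lt;/code&gt; would
return an empty vector).
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;## Hypothetical sample lines standing in for readLines(data.file)
s &amp;lt;- c(&quot;Port:              P035&quot;,
       &quot;Site:              Wick&quot;,
       &quot;  Cycle    Date      Time      ASLVZZ01     Residual  &quot;,
       &quot;     1) 1985/01/01 00:00:00      1.0300      -0.3845  &quot;,
       &quot;########################################&quot;,
       &quot;Port: P035&quot;,
       &quot;     2) 1985/01/01 01:00:00      1.0400      -0.3884  &quot;)

## Shortened prefix list, for illustration only
metainfo.list &amp;lt;- c(&quot;Port:&quot;, &quot;Site:&quot;, &quot;Latitude:&quot;, &quot;Longitude:&quot;)

## One anchored pattern: a line is metadata if it starts with
## any of the prefixes or with a hash sign
meta.pattern &amp;lt;- paste0(&quot;^(&quot;, paste(metainfo.list, collapse = &quot;|&quot;), &quot;|#)&quot;)

## Logical mask with one element per line; keep the lines that are not metadata
is.meta &amp;lt;- grepl(meta.pattern, s)
res.2 &amp;lt;- s[!is.meta]&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Note that the column header line still survives here, so this sketch shares
the original limitation: every unwanted kind of line has to be listed.
&lt;/p&gt;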
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org12355a8&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org12355a8&quot;&gt;Third Attempt - Capture Lines (RegExp)&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org12355a8&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The two approaches above discard the unnecessary information,
but there may be other lines that should
be discarded which I haven&apos;t encountered yet, and then the process becomes
tedious trial and error and takes quite long.
&lt;/p&gt;

&lt;p&gt;
Equally, another approach is to select exactly what I am interested in
by using a regular expression. But first, I have to identify the pattern.
Each data point was recorded at a certain time, and therefore must be
associated with a timestamp; for example, the first data point is
recorded at &lt;i&gt;1926-01-01 00:00:00&lt;/i&gt;. Each also has an ID value followed by a
closing parenthesis, for example &lt;i&gt;1)&lt;/i&gt;.
&lt;/p&gt;


&lt;pre class=&quot;example&quot; id=&quot;org8c85054&quot;&gt;
1) 1985/01/01 00:00:00      1.0300      -0.3845  
&lt;/pre&gt;

&lt;p&gt;
So the lines I am interested in have a common pattern that can
be summarised as: lines that start with a number of spaces, and
also have
&lt;/p&gt;
&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;observation ID&lt;/dt&gt;&lt;dd&gt;a few digits and a closing parenthesis,&lt;/dd&gt;
&lt;dt&gt;observation date&lt;/dt&gt;&lt;dd&gt;a few digits separated by forward slashes,
meaning year, month and day, and then a space,&lt;/dd&gt;
&lt;dt&gt;observation time&lt;/dt&gt;&lt;dd&gt;a few digits separated by colons, meaning hours, minutes and
seconds.&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;
The patterns in RegExp can be formulated as the &lt;i&gt;roi.pattern&lt;/i&gt; variable
and the whole process can be implemented as:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;roi.pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;[[:space:]]+[[:digit:]]+\\) [[:digit:]]{4}/[[:digit:]]{2}/[[:digit:]]{2}&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.line.num&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;grep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res.3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.line.num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
To me, there isn&apos;t an absolute winner between the second and third
approaches, but I prefer the regular expression because it is more
fun; I am a statistician and like to spot patterns.
&lt;/p&gt;

&lt;p&gt;
Also, it is a direct approach and more flexible. Note that I can continue
to add components to the regular expression to increase the confidence
in selecting the right data matrix. For example, there are spaces and
then a few digits after the timestamp. But that will presumably increase
the run-time.
&lt;/p&gt;
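
&lt;p&gt;
Once the data lines are captured, they still have to become a table. A
minimal sketch of that step, assuming every captured line has exactly the five
whitespace-separated fields shown in the sample; the column names and the
&lt;code&gt;UTC&lt;/code&gt; time zone are my own assumptions and are not taken from the
BODC files:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;## Two captured lines standing in for res.3
res.3 &amp;lt;- c(&quot;     1) 1985/01/01 00:00:00      1.0300      -0.3845  &quot;,
           &quot;     2) 1985/01/01 01:00:00      1.0400      -0.3884  &quot;)

## Parse the character vector directly; whitespace is the default separator.
## The column names are hypothetical.
tide &amp;lt;- read.table(text = res.3, stringsAsFactors = FALSE,
                   col.names = c(&quot;cycle&quot;, &quot;date&quot;, &quot;time&quot;, &quot;level&quot;, &quot;residual&quot;))

## Drop the closing parenthesis from the ID and convert it to an integer
tide$cycle &amp;lt;- as.integer(sub(&quot;\\)$&quot;, &quot;&quot;, tide$cycle))

## Combine date and time into one timestamp (time zone is an assumption)
tide$timestamp &amp;lt;- as.POSIXct(paste(tide$date, tide$time),
                             format = &quot;%Y/%m/%d %H:%M:%S&quot;, tz = &quot;UTC&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
With one such data frame per file, the pieces could then be stacked with
&lt;code&gt;rbind()&lt;/code&gt; to form the single table needed for modelling.
&lt;/p&gt;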
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org6ed2929&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org6ed2929&quot;&gt;Code and Sample Data&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org6ed2929&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
You can download the &lt;a href=&quot;https://www.copy.com/s/t%253A602pmEv8mDBozscz%253Bp%253A%25252F1985WIC.txt&quot;&gt;example data&lt;/a&gt; and run the scripts listed below
in R to reproduce all the results.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;c1&quot;&gt;#### * Path&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.file&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;~/Downloads/1985WIC.txt&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## to the downloaded data file&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### * Approach 1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;skip&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## error&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### * Approach 2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;readLines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metainfo.list&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Port:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Site:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Latitude:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Longitude:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Start Date:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;End Date:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Contributor:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Datum information:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Parameter code:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta.line.num&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sapply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metainfo.list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;grep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res.2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta.line.num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### * Approach 3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;[[:space:]]+[[:digit:]]+\\) [[:digit:]]{4}/[[:digit:]]{2}/[[:digit:]]{2}&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.line.num&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;grep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res.3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;roi.line.num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Why I Should Explore Regular Expression and Why I Haven't</title>
   <link href="http://yitang.uk/data%20sciences/2015/06/18/why-i-should-explore-regular-expression-and-why-i-havent/"/>
   <updated>2015-06-18T00:00:00+01:00</updated>
   <id>http://yitang.uk/data%20sciences/2015/06/18/why-i-should-explore-regular-expression-and-why-i-havent</id>
   <content type="html">&lt;p&gt;
Like many R users who are not actually programmers, I am afraid of
regular expressions (RegExp). Whenever I saw something like
&lt;/p&gt;

&lt;p&gt;
I&apos;d tell myself I wouldn&apos;t be able to understand it and give up on
sight.
&lt;/p&gt;

&lt;p&gt;
But I&apos;ve collected a few RegExp patterns that do magical
jobs. My favourites are the dot (&lt;code&gt;.&lt;/code&gt;) and dollar (&lt;code&gt;$&lt;/code&gt;) signs, and I usually
use them with &lt;code&gt;list.files()&lt;/code&gt; to filter the file names in a directory. For
example,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;list.files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;.RData$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list.files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;.text$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
The first line returns all the R image files, which have file names
ending with &lt;i&gt;RData&lt;/i&gt;, and the second returns all the text files, which have
file names ending with &lt;i&gt;text&lt;/i&gt;. Basically, in a regular expression the dot
sign (&lt;code&gt;.&lt;/code&gt;) means any single character, and the dollar sign (&lt;code&gt;$&lt;/code&gt;) means the end of a
string. By combining these two, I am able to select multiple files
with a certain pattern, without manually picking them one by
one.
&lt;/p&gt;

&lt;p&gt;
How powerful is that! It is an inspirational example that motivates
me from time to time to look deeper and get my head around the topic
of regular expressions. But I just couldn&apos;t get a clear picture of how to
use them.
&lt;/p&gt;

&lt;p&gt;
I think the main problems for me to understand RegExp in R are
&lt;/p&gt;

&lt;div id=&quot;outline-container-org57668b3&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org57668b3&quot;&gt;The syntax is context-sensitive&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org57668b3&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
A subtle change can lead to unexpected results. For example, the above
pattern can also be written as &lt;code&gt;\\.RData$&lt;/code&gt;, which means file names ending with
&lt;code&gt;.RData&lt;/code&gt;. The dot (.) sign here literally means &quot;.&quot;. Adding two
backslashes &lt;code&gt;\\&lt;/code&gt; changes the meaning of the pattern completely, yet
both give the same results here. It was very frustrating to
extrapolate a pattern that works in one case to a similar case
and get seemingly random results.
&lt;/p&gt;
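
&lt;p&gt;
A small sketch of where the two patterns part ways, using made-up file names
(none of them come from a real directory). The unescaped dot accepts any
character before &lt;i&gt;RData&lt;/i&gt;, while the escaped dot insists on a literal
&quot;.&quot;:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;## Hypothetical file names, for illustration only
files &amp;lt;- c(&quot;model.RData&quot;, &quot;notes.text&quot;, &quot;myRData&quot;, &quot;backupXRData&quot;)

## Unescaped dot: any single character followed by RData at the end
loose &amp;lt;- grepl(&quot;.RData$&quot;, files)
## TRUE FALSE TRUE TRUE

## Escaped dot: a literal dot followed by RData at the end
strict &amp;lt;- grepl(&quot;\\.RData$&quot;, files)
## TRUE FALSE FALSE FALSE&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
The two patterns agree only when every file whose name ends in
&lt;i&gt;RData&lt;/i&gt; happens to have a dot right before it, which is why both appear to
give the same results in a tidy directory.
&lt;/p&gt;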
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org9645066&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org9645066&quot;&gt;The syntax is hard to read&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org9645066&quot;&gt;
&lt;p&gt;

&lt;/p&gt;

&lt;p&gt;
The RegExp patterns above are reasonably easy to understand if one
spends 10 minutes reading the manual, but the following is just crazy.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;regexec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
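
&lt;p&gt;
For what it is worth, this pattern breaks a URL into its components, one
parenthesised group per piece. A minimal sketch of how the match can be
unpacked with &lt;code&gt;regmatches()&lt;/code&gt;; the value of &lt;code&gt;x&lt;/code&gt; and the
names &lt;code&gt;parts&lt;/code&gt; and &lt;code&gt;url&lt;/code&gt; are assumptions, since the
original input is not shown:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;## Illustrative input; any URL of this shape would do
x &amp;lt;- &quot;http://stat.umn.edu:80/xyz&quot;
m &amp;lt;- regexec(pattern = &quot;^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)&quot;, x)

## Element 1 is the full match; elements 2 to 7 are the six
## parenthesised groups, in order of their opening parenthesis
parts &amp;lt;- regmatches(x, m)[[1]]

## Pick out the groups that hold the useful pieces
url &amp;lt;- c(protocol = parts[3], host = parts[4],
         port = parts[6], path = parts[7])&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;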

&lt;p&gt;
There are 12 parentheses, 6 square brackets and many other symbols.
Even the same symbol can have different meanings, and it&apos;s hard to find out
exactly what they mean.
&lt;/p&gt;
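
&lt;p&gt;
One way to make sense of such a pattern is to run it on a sample input and
label what each pair of parentheses captures. The sketch below assumes a
sample URL and uses group labels that are invented here for readability;
they are not part of the pattern itself:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;# Sample URL, chosen only for illustration
x &amp;lt;- &quot;http://stat.umn.edu:80/xyz&quot;
m &amp;lt;- regexec(pattern = &quot;^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)&quot;, x)

# regmatches() returns the full match first, then one entry per group,
# numbered by the position of each opening parenthesis
parts &amp;lt;- regmatches(x, m)[[1]]

# Hypothetical labels for the seven entries
names(parts) &amp;lt;- c(&quot;full&quot;, &quot;protocol_sep&quot;, &quot;protocol&quot;, &quot;host&quot;,
                  &quot;port_sep&quot;, &quot;port&quot;, &quot;path&quot;)
print(parts)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Here the parentheses play two roles: they capture a piece of the input, and
together with &lt;code&gt;?&lt;/code&gt; they mark the protocol and the port as optional. Inside
square brackets, &lt;code&gt;^&lt;/code&gt; negates the character set, while at the start of
the pattern it anchors the match to the beginning of the string.
&lt;/p&gt;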
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org840a77e&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org840a77e&quot;&gt;There isn&apos;t enough learning materials&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org840a77e&quot;&gt;

&lt;p&gt;
I&apos;ve never seen an R book that mentions regular expressions. The topic
is certainly not taught in university courses or training
workshops.
&lt;/p&gt;

&lt;p&gt;
Even Google fails to find any meaningful resource except for the &lt;a href=&quot;https://en.wikibooks.org/wiki/R_Programming/Text_Processing&quot;&gt;Text
Processing&lt;/a&gt; chapter on Wikibooks, which is the best I could find.
&lt;/p&gt;

&lt;p&gt;
Although there are related questions on StackOverflow, most of the
answers address a very specific situation. It&apos;s hard to apply them
to other situations or to learn the topic from scattered Q&amp;amp;As.
&lt;/p&gt;

&lt;p&gt;
This has created a mental barrier, at least for me, that statisticians
need neither teach nor learn RegExp at all. But my limited experience
suggests that it is such a powerful feature that I&apos;ve been missing out on a lot.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgc3c2c81&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgc3c2c81&quot;&gt;But&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgc3c2c81&quot;&gt;

&lt;p&gt;
I believe I will have more occasions to process text files, for example,
parsing the log files of this blog. RegExp can improve efficiency to
a great extent, so I am considering investing the time to learn it properly.
&lt;/p&gt;

&lt;p&gt;
Are you an R user? What&apos;s your experience with regular expressions? Do
you have good learning materials to recommend? If so, please share
your experience of this less-discussed area.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Use Emacs's Org-mode to Effectively Manage Small Projects</title>
   <link href="http://yitang.uk/productivity/2015/06/14/use-emacss-orgmode-to-effectively-manage-small-projects/"/>
   <updated>2015-06-14T00:00:00+01:00</updated>
   <id>http://yitang.uk/productivity/2015/06/14/use-emacss-orgmode-to-effectively-manage-small-projects</id>
   <content type="html">&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#org21b9575&quot;&gt;Organising&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgba3f15d&quot;&gt;Managing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgd1cc217&quot;&gt;Monitor&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;
Org-mode is a great knowledge management tool, and it has also helped
me increase my personal effectiveness. Recently I have been exploring
org-mode for managing small projects in a business environment,
in which collaboration happens occasionally between me and the project team members.
&lt;/p&gt;

&lt;p&gt;
In this post I summarise my workflow to organise, manage and monitor a
project. The implementation of this workflow revolves around
collaboration. I have been practising this workflow for a while and can
see my planning and managing skills grow.
&lt;/p&gt;

&lt;div id=&quot;outline-container-org21b9575&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org21b9575&quot;&gt;Organising&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org21b9575&quot;&gt;
&lt;p&gt;
I use a broad definition of project: as long as a task requires a series
of sub-tasks to be done, it is a project. Normally I categorise
any task that relates to a project into three groups:
&lt;/p&gt;

&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;Project Tasks&lt;/dt&gt;&lt;dd&gt;the major tasks that must be done in order to
deliver the project product.&lt;/dd&gt;
&lt;dt&gt;Tasks&lt;/dt&gt;&lt;dd&gt;administrative or miscellaneous tasks that keep the project
going, like sending out the invoice.&lt;/dd&gt;
&lt;dt&gt;Notes&lt;/dt&gt;&lt;dd&gt;anything that is important to the project and therefore
worth keeping a record of, like meeting notes or decisions
that impact the project progress.&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;
Each category has a corresponding top-level section or heading. Once this outline is set up, it
is very convenient to view the content under these categories,
regardless of what I am working on, whether reading emails,
coding, or writing a report. Org-mode can scan all the .org
files in a directory and create a tree structure, with the file name
being the root and the headings being the nodes.
&lt;/p&gt;

&lt;p&gt;
An intuitive way to locate any node is to start from the beginning;
the process is the same as finding a section in a textbook. It can be summarised as:
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;first, find the right book by its name,&lt;/li&gt;
&lt;li&gt;then find the right part,&lt;/li&gt;
&lt;li&gt;then narrow down to the right section,&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
and continue to the section I am interested in. A more pleasant way is
to use &lt;a href=&quot;http://en.wikipedia.org/wiki/Approximate_string_matching&quot;&gt;fuzzy match&lt;/a&gt; supported by the &lt;a href=&quot;https://github.com/emacs-helm/helm&quot;&gt;Helm package&lt;/a&gt; - I can narrow down the
selection by typing fragments of any node names. For example, as the image below shows, to
locate the headline of this article among 40 org files, I only need to search &quot;Small pro&quot;,
because only three headlines have &quot;Small&quot; in their names, &quot;small
changes&quot;, &quot;small talk&quot;, and &quot;small project&quot;, and &quot;pro&quot; narrows it down to
the unique headline.
&lt;/p&gt;

&lt;p&gt;
It saves me a lot of time remembering where I saved a note and
wandering around the files to find something. I have only explained a few of
Helm&apos;s features; if you want to try it out, you can find my
configuration &lt;a href=&quot;https://github.com/yitang/.emacs.d/blob/master/init.org#helm---fuzzy-match&quot;&gt;here&lt;/a&gt;. I recommend a good &lt;a href=&quot;http://tuhdo.github.io/helm-intro.html&quot;&gt;tutorial&lt;/a&gt; if you want to know
more.
&lt;/p&gt;

&lt;p&gt;
&lt;img src=&quot;/assets/Use_org_mode_to_manage_a_small_project.png&quot; alt=&quot;nil&quot;/&gt;
&lt;/p&gt;




&lt;p&gt;
We usually work on a couple of projects at the same time. Creating a new
task or note is also easy: &lt;code&gt;org-capture-mode&lt;/code&gt; creates a temporary
node which, by default, is saved as a subtree in refile.org, or I
can move the headline directly to this project using the
locating mechanism above.
&lt;/p&gt;

&lt;p&gt;
These two features are the most enjoyable to use. They keep me from
wandering through multiple directories trying to find the right files, and
therefore increase my productivity. Never underestimate how long you
will spend finding one file.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgba3f15d&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgba3f15d&quot;&gt;Managing&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgba3f15d&quot;&gt;
&lt;p&gt;
Projects usually come with hard deadlines for the product delivery.
Setting and changing deadlines in org-mode is pleasurable with &lt;code&gt;org-deadline&lt;/code&gt; (&lt;code&gt;C-c C-d&lt;/code&gt;).
&lt;/p&gt;

&lt;p&gt;
It brings up a mini-calendar buffer (shown below). I can use
 &lt;code&gt;shift+left&lt;/code&gt; and &lt;code&gt;shift+right&lt;/code&gt; to move backward and forward by a
 day, or &lt;code&gt;shift+up&lt;/code&gt; and &lt;code&gt;shift+down&lt;/code&gt; to move between weeks, and hit
 &lt;code&gt;RET&lt;/code&gt; to select a deadline. Apart from navigating, I can also
 type the exact date directly, like &quot;2015-07-25&quot;, and hit &lt;code&gt;RET&lt;/code&gt;.
&lt;/p&gt;

&lt;p width=&quot;800&quot;&gt;
&lt;img src=&quot;/assets/mini-calendar-buffer.png&quot; alt=&quot;nil&quot;/&gt;
&lt;/p&gt;

&lt;p&gt;
Once the deadline is set, it will show up in that day&apos;s calendar. I
don&apos;t want to suddenly realise there is a deadline on the very day it is due,
so it makes sense to have an early warning period that shows a task when
it is due within a given number of days. This behaviour is governed by the
&lt;code&gt;org-deadline-warning-days&lt;/code&gt; variable. In my Emacs configuration, I set
it to 30 days, which gives me plenty of time to do any task.
&lt;/p&gt;

&lt;p&gt;
I also set deadlines for sub-tasks since it is quite easy to do in
org-mode. But coming up with realistic deadlines is difficult. For me,
a deadline must give enough time to do the task properly; for the PM, it must
fit into the whole project plan and resources. We are likely to have
different opinions on how long it takes to implement new features with
documentation. Estimating is quite an important skill to have: for me, it
reflects my understanding of the problem and my own technical
capability; for the manager, it is part of their project plan.
&lt;/p&gt;

&lt;p&gt;
My initial estimate may be far from the actual effort, especially
when the problem domain is new to me, or I haven&apos;t done similar tasks
before. The more I do, the better I get at estimating. At this
stage, I practise this skill seriously, and like to have someone
more experienced review my estimates.
&lt;/p&gt;

&lt;p&gt;
To make this task easy for them, I present an overall view of the
project timeline, which clearly shows the period allocated to each
task. &lt;code&gt;org-timeline&lt;/code&gt; generates a time-sorted view of
all the tasks. The recent feedback I received is that I tend to
overlook the time spent on documentation and tests. Someone with more
than 10 years in software development says these two tasks together usually
take about three times as long as the actual coding.
&lt;/p&gt;

&lt;p&gt;
The timeline view also provides a benchmark for progress, and I check it
frequently to make sure I am on track. It gives the PM a reference for
swapping tasks if some become urgent.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgd1cc217&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgd1cc217&quot;&gt;Monitor&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgd1cc217&quot;&gt;
&lt;p&gt;
In addition to the early warning system that prevents sudden surprises,
org-mode provides another way of monitoring the project in terms of resources -
the actual time I spend on the project. This feature is quite useful
when I am given a fairly loose deadline but limited resources,
say 150 hours.
&lt;/p&gt;

&lt;p&gt;
Since the sub-tasks are mostly defined at an early stage, whenever I
start one, I clock in first with &lt;code&gt;org-clock-in&lt;/code&gt;. The clock
stops once I manually clock out, clock in to another task, or
complete the task (mark it as DONE). Each clock entry
shows the start time, end time and duration.
&lt;/p&gt;

&lt;p&gt;
Clock entries accumulate over time, and their durations can be added up
to tell me exactly how much time I spent on each task. The time is
summed per task and aggregated across the whole project by a single
function, &lt;code&gt;org-clock-report&lt;/code&gt; (&lt;code&gt;C-c C-x C-r&lt;/code&gt;).
&lt;/p&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 1:&lt;/span&gt; Clock summary at &lt;span class=&quot;timestamp-wrapper&quot;&gt;&lt;span class=&quot;timestamp&quot;&gt;[2015-06-14 Sun 11:17]&lt;/span&gt;&lt;/span&gt;&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Headline&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Time&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Effort&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&lt;b&gt;Total time&lt;/b&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&lt;b&gt;10:41&lt;/b&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;TODO Use Emacs&apos;s org-mode to Manage a Small Project&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;10:41&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; TODO Tasks&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:45&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp;&amp;emsp; DONE add example for org-refile&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:35&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:30&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp;&amp;emsp; NEXT add example for org-clock-report&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:13&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:15&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp;&amp;emsp; NEXT proof read&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:11&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:15&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp;&amp;emsp; NEXT proof read - 2&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:46&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;


&lt;p&gt;
It is normal to underestimate the complexity of a task and spend
too much time resolving it. Usually I can catch up at a
later stage; however, if I have the feeling that the overall progress has
been affected, I need to request more resources from the PM, and the figure
I quote is the extra hours I have spent relative to my initial estimate.
That allows a quick reaction.
&lt;/p&gt;

&lt;p&gt;
Also, the clock-report table tells me the difference between my effort
estimate and the actual time I spent on each task.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Control the Plotting Order in ggplot2</title>
   <link href="http://yitang.uk/2015/05/06/control-the-plotting-order-in-ggplot2/"/>
   <updated>2015-05-06T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/05/06/control-the-plotting-order-in-ggplot2</id>
   <content type="html">&lt;p&gt;
&lt;img src=&quot;/assets/ggplot_factor.png&quot; alt=&quot;nil&quot;/&gt;
&lt;/p&gt;

&lt;p&gt;
The two plots above show the same data (included below). If you were going to present one to summarise your findings, which would you choose? Very likely you would pick the one on the right, because
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;the steadily increasing bars are pleasant to see,&lt;/li&gt;
&lt;li&gt;it is easier to compare the categories, since the ones on the right have higher values than the ones on the left, and&lt;/li&gt;
&lt;li&gt;the categories with the lowest and highest values are clearly shown.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
In this article I explain how to set the plotting order in ggplot to whatever you want, and encourage R beginners to use ggplot2.
&lt;/p&gt;

&lt;p&gt;
Creating a bar plot is dead easy in R. Take this dataset as an example:
&lt;/p&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;mode&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;ssh-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2361&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;fundamental-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4626&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;git-commit-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4869&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;mu4e-compose-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4964&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;emacs-lisp-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;6205&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;shell-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;10046&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;minibuffer-inactive-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;12624&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;inferior-ess-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;25774&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;ess-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;47115&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;org-mode&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;78195&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;
To get the plot on the right, reorder the table by count (this has already been done), then
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;barplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;names.arg&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
will do the job. That&apos;s simple and easy: it plots exactly what you provide. This is completely different from the &lt;code&gt;ggplot()&lt;/code&gt; paradigm, which does a lot of computation behind the scenes.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
will give you the first plot; the categories are in alphabetical order. To get a pleasant increasing order that depends on the count or any other variable, or even a manually specified order, you have to explicitly change the &lt;i&gt;levels of the factor&lt;/i&gt;.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode.ordered&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;levels&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
This creates another variable, mode.ordered, which looks the same as mode, except that the underlying &lt;i&gt;levels&lt;/i&gt; are different: they are set to the order of the counts. Running the same ggplot code again with mode.ordered in place of mode will give you the plot on the right. How does it work?
&lt;/p&gt;

&lt;p&gt;
First, every level of a factor in R is mapped to an integer, and the default mapping algorithm is
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;sort the unique values of the vector alphabetically,&lt;/li&gt;
&lt;li&gt;map the first value to 1, and the last to 10.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
So &lt;i&gt;emacs-lisp-mode&lt;/i&gt; is mapped to 1 and &lt;i&gt;ssh-mode&lt;/i&gt; is mapped to 10. 
&lt;/p&gt;

&lt;p&gt;
What the reorder script does is sort the levels by count, so that &lt;i&gt;ssh-mode&lt;/i&gt; is mapped to 1 and &lt;i&gt;org-mode&lt;/i&gt; is mapped to 10, i.e. the levels are set to the order of count.
&lt;/p&gt;
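
&lt;p&gt;
The mapping can be inspected directly with &lt;code&gt;as.integer()&lt;/code&gt;. This is a
small base R sketch that rebuilds the table above; the variable names
&lt;code&gt;default.codes&lt;/code&gt; and &lt;code&gt;ordered.codes&lt;/code&gt; are made up for this illustration:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;# The data from the table above, already sorted by count
df &amp;lt;- data.frame(
  mode = c(&quot;ssh-mode&quot;, &quot;fundamental-mode&quot;, &quot;git-commit-mode&quot;,
           &quot;mu4e-compose-mode&quot;, &quot;emacs-lisp-mode&quot;, &quot;shell-mode&quot;,
           &quot;minibuffer-inactive-mode&quot;, &quot;inferior-ess-mode&quot;,
           &quot;ess-mode&quot;, &quot;org-mode&quot;),
  count = c(2361, 4626, 4869, 4964, 6205, 10046, 12624, 25774, 47115, 78195),
  stringsAsFactors = FALSE
)

# Default levels: the sorted unique values
default.codes &amp;lt;- as.integer(factor(df$mode))

# Explicit levels: the row order, i.e. the order of count
ordered.codes &amp;lt;- as.integer(factor(df$mode, levels = df$mode))

print(data.frame(mode = df$mode, default.codes, ordered.codes))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
With the default levels &lt;i&gt;ssh-mode&lt;/i&gt; gets the code 10, while with the
explicit levels the codes simply run from 1 to 10 down the table.
&lt;/p&gt;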

&lt;p&gt;
How does this affect ggplot? I presume ggplot does the plotting in the order of the levels, or let&apos;s say in the integer space, i.e. it plots from 1 to 10, and then adds the label for each.
&lt;/p&gt;

&lt;p&gt;
In this example, the default &lt;code&gt;barplot&lt;/code&gt; function did the job. Usually we need to do extra data manipulation so that ggplot will do what we want, in exchange for a plot that looks better and may fit in with the other plots. Time constraints aside, I would encourage people to stick with ggplot because, like many other things in life, once you understand it, it becomes easier to do. For example, it is actually very easy to specify the order manually in only two steps:
&lt;/p&gt;

&lt;ul class=&quot;org-ul&quot;&gt;
&lt;li&gt;first, sort the whole data.frame by a variable,&lt;/li&gt;
&lt;li&gt;then change the &lt;code&gt;levels&lt;/code&gt; option in &lt;code&gt;factor()&lt;/code&gt; to whatever you want.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
To show a decreasing trend - the reverse of the increasing order - just use &lt;code&gt;levels = rev(mode)&lt;/code&gt;. How neat!
&lt;/p&gt;
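
&lt;p&gt;
The two steps can be sketched in base R as follows. The three rows are a
subset of the table above, deliberately put out of order, and the column
names &lt;code&gt;mode.increasing&lt;/code&gt; and &lt;code&gt;mode.decreasing&lt;/code&gt; are made up for this illustration:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;# A small unsorted subset of the data, for illustration only
df &amp;lt;- data.frame(
  mode = c(&quot;org-mode&quot;, &quot;ssh-mode&quot;, &quot;ess-mode&quot;),
  count = c(78195, 2361, 47115),
  stringsAsFactors = FALSE
)

# Step 1: sort the whole data.frame by the variable of interest
df &amp;lt;- df[order(df$count), ]

# Step 2: set the levels to the sorted order, or to its reverse
df$mode.increasing &amp;lt;- factor(df$mode, levels = df$mode)
df$mode.decreasing &amp;lt;- factor(df$mode, levels = rev(df$mode))

print(levels(df$mode.increasing))
print(levels(df$mode.decreasing))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
Passing &lt;code&gt;mode.decreasing&lt;/code&gt; instead of &lt;code&gt;mode&lt;/code&gt; as the x aesthetic would then
draw the bars from the highest count to the lowest.
&lt;/p&gt;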
</content>
 </entry>
 
 <entry>
   <title>Group Emacs Search Functions using Hydra</title>
   <link href="http://yitang.uk/2015/04/17/group-emacs-search-functions-using-hydra/"/>
   <updated>2015-04-17T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/04/17/group-emacs-search-functions-using-hydra</id>
   <content type="html">&lt;p&gt;
I am a search guy: when I want to know something, I use the search functionality to jump to where the keyword is, rather than scanning the page with my eyes, which is too slow and harmful.
&lt;/p&gt;

&lt;p&gt;
Emacs provides powerful search functionality. For example, I use these commands very often (with their key bindings):
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;&lt;code&gt;isearch&lt;/code&gt; (&lt;code&gt;C-s&lt;/code&gt;), search for a string and move the cursor there,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;helm-swoop&lt;/code&gt; (&lt;code&gt;C-F1&lt;/code&gt;), find all the occurrences of a string and pull the lines containing it into another buffer where I can edit and save,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;helm-multi-swoop&lt;/code&gt; (&lt;code&gt;M-X&lt;/code&gt;), apply &lt;code&gt;helm-swoop&lt;/code&gt; to multiple buffers, very handy if I want to know where a function is called in different buffers,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;projectile-grep&lt;/code&gt; or &lt;code&gt;helm-projectile-grep&lt;/code&gt; (&lt;code&gt;C-c p s g&lt;/code&gt;), find which files in the current project contain a specific string, similar to &lt;code&gt;helm-multi-swoop&lt;/code&gt; but limiting the search to files in the project directory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
I love searching in Emacs, but the problem is having to remember all the key bindings for different tasks. Also, I sometimes forget what alternatives I have and go with the one I am most familiar with, which is often not the right one. I sometimes realise I have used &lt;code&gt;isearch&lt;/code&gt; multiple times to do what &lt;code&gt;ace-jump-word-mode&lt;/code&gt; can achieve in one go.
&lt;/p&gt;

&lt;p&gt;
&lt;a href=&quot;http://oremacs.com/2015/04/14/hydra-org-mode/&quot;&gt;Org-mode Hydras incoming!&lt;/a&gt; gave me the idea of grouping all these functions together and pressing a single key to perform different tasks, which frees my mind from remembering all the key bindings. Also, I can write a few lines of text to remind myself when to use what, which can potentially solve the second problem.
&lt;/p&gt;

&lt;p&gt;
Here is the hydra implementation for searching:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(defhydra hydra-search (:color blue
                               :hint nil)
  &quot;
Current Buffer : _i_search helm-_s_woop _a_ce-jump-word
Multiple Buffers : helm-multi-_S_woop
Project Directory: projectile-_g_rep helm-projectile-_G_rep
&quot;
  (&quot;i&quot; isearch-forward)
  (&quot;s&quot; helm-swoop)
  (&quot;a&quot; ace-jump-word-mode)
  (&quot;S&quot; helm-multi-swoop)
  (&quot;g&quot; projectile-grep)
  (&quot;G&quot; helm-projectile-grep))
(global-set-key [f4] &apos;hydra-search/body)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
So next time I want to search for something, I just press F4, which brings up all the choices I have, and I don&apos;t need to worry about the key bindings or which one to use! That&apos;s cool!
&lt;/p&gt;

&lt;p&gt;
I am looking forward to simplifying my Emacs workflow using the &lt;code&gt;hydra&lt;/code&gt; package. The key challenge is to identify the logical similarities among the tasks and then group them together accordingly. For &lt;code&gt;hydra-search()&lt;/code&gt;, it is &quot;search something somewhere&quot;.
&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>A Workflow for Using Git to Track SVN Repository</title>
   <link href="http://yitang.uk/2015/04/16/use-git-for-svn/"/>
   <updated>2015-04-16T00:00:00+01:00</updated>
   <id>http://yitang.uk/2015/04/16/use-git-for-svn</id>
   <content type="html">&lt;p&gt;
Version control is a complex topic, and the ideas of branching and the different types of merging are hard to understand. I only understand the basics of Git, and it already makes my life a lot easier: I am managing about 10 repositories at the moment without much effort.
&lt;/p&gt;

&lt;p&gt;
But my colleagues are using SVN as the central storage for scripts. Switching to SVN is not a problem; I would just need a few weeks to transfer the knowledge and start using it. But I am reluctant to learn something basic and have duplicated knowledge, and I also use GitHub and Bitbucket, which are Git-based. On the other hand, sticking to Git alone makes it impossible to work with my colleagues.
&lt;/p&gt;

&lt;p&gt;
Then I found out that the Git developers have already made the effort to bridge Git and other version control systems, like SVN. &lt;code&gt;git svn&lt;/code&gt; allows me to use just Git commands for staging, cherry-picking, pulling etc., and then upload to the SVN remote repository with just one command. I really like the idea of transferring skills from one system to another without any cost; it makes me believe Git is great, and I can continue to use Magit in Emacs!
&lt;/p&gt;

&lt;p&gt;
Here are the basic steps and comments for this workflow:
&lt;/p&gt;
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;Create a folder &lt;code&gt;mkdir ProjRepo&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Create an empty Git repository &lt;code&gt;git init&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Add the following to &lt;code&gt;.git/config&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;pre&gt;&lt;code&gt;[svn-remote &quot;svn&quot;]
	url = https://your.svn.repo
	fetch = :refs/remotes/git-svn&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
and change the URL to the right repository,
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot; start=&quot;4&quot;&gt;
&lt;li&gt;pull from the SVN central repository into this folder, &lt;code&gt;git svn fetch svn&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;switch to the SVN remote branch, &lt;code&gt;git checkout -b svn git-svn&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;modify or add files&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;git add&lt;/code&gt; and &lt;code&gt;git commit&lt;/code&gt; to snapshot local changes&lt;/li&gt;
&lt;li&gt;when needed, update the local repository, &lt;code&gt;git svn rebase&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;finally, upload local changes to the SVN central repository, &lt;code&gt;git svn dcommit&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
See the official manual &lt;a href=&quot;http://git-scm.com/book/en/v1/Git-and-Other-Systems-Git-and-Subversion&quot;&gt;8.1 Git and Other Systems - Git and Subversion&lt;/a&gt;
and the &lt;a href=&quot;http://git-scm.com/docs/git-svn&quot;&gt;git-svn documentation&lt;/a&gt; for more details.
&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Why Use Emacs 1 - Emacs Speaks Statistics</title>
   <link href="http://yitang.uk/2015/01/28/why-use-emacs-1-emacs-speaks-statistics-ess/"/>
   <updated>2015-01-28T00:00:00+00:00</updated>
   <id>http://yitang.uk/2015/01/28/why-use-emacs-1--emacs-speaks-statistics-ess</id>
   <content type="html">&lt;p&gt;
I am a statistician; coding in R and writing reports is what I do most of the day. I have come a long way in searching for the perfect editor: I tried RStudio, Sublime Text and TextMate, and settled down happily with ESS/Emacs for both coding and writing.
&lt;/p&gt;

&lt;p&gt;
There are three features that made me decide:
&lt;/p&gt;

&lt;div id=&quot;outline-container-org6ecc18b&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org6ecc18b&quot;&gt;Auto Formatting&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org6ecc18b&quot;&gt;
&lt;p&gt;
Scientists have a reputation for being bad programmers who write code that is unreadable and therefore incomprehensible to others. I intend to become a top-level programmer, so I follow a style guide strictly. That means I have to spend some time adding and removing spaces in the code.
&lt;/p&gt;

&lt;p&gt;
To my surprise, Emacs does it for me automatically, just by hitting TAB. It also indents smartly, which makes me comfortable writing a long function call and splitting it over multiple lines. Also, if I misplace a &apos;)&apos; or &apos;]&apos;, the formatting becomes strange, which reminds me to check. Here&apos;s an example.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-r&quot; data-lang=&quot;r&quot;&gt;&lt;span class=&quot;n&quot;&gt;rainfall.subset&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;london&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rainfall.pairs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rainfall.dublin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgd7c58c5&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgd7c58c5&quot;&gt;Search Command History&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgd7c58c5&quot;&gt;
&lt;p&gt;
I frequently search the command history. Imagine I am producing a plot and realise something is missing in the data, so I go back and fix the data first, then run the ggplot command again. I could press the Up/Down keys many times, or just search once or twice. &lt;code&gt;M-r ggplot(&lt;/code&gt; gives me the most recent command I typed containing the keyword &lt;i&gt;ggplot(&lt;/i&gt;, which might be &lt;code&gt;ggplot(gg.df, aes(lon, lat, col = city)) + geom_line() + .....&lt;/code&gt;. If it is not the one I want, I press &lt;code&gt;C-r&lt;/code&gt; again to get the second most recent one, and repeat until I find the right one; then I press &lt;code&gt;RET&lt;/code&gt; to select it.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org557e18e&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org557e18e&quot;&gt;Literate Programming&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org557e18e&quot;&gt;
&lt;p&gt;
I am a supporter of literate statistical analysis and believe we should put code, results and discoveries together when developing models. RStudio provides an easy-to-use tool for this purpose, but it does not support different R sessions, so if I need to generate a report, I have to re-run all the code from the beginning, which isn&apos;t practical for me with large volumes of data because it takes quite long.
&lt;/p&gt;

&lt;p&gt;
ESS and org-mode work really well together via Babel, which is friendlier to use. I can choose to run only part of the code and have the output inserted automatically, with no need to copy/paste. Also, I can choose where to execute the code: on my local machine, on the remote server, or both at the same time.
&lt;/p&gt;

&lt;p&gt;
These only scratch the surface of ESS; there are a lot more useful features, like spell checking for comments and documentation templates, that make me productive, and I would recommend that anyone who uses R learn ESS/Emacs. The following is my current setting.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; Adapted with one minor change from Felipe Salazar at
;; http://www.emacswiki.org/emacs/EmacsSpeaksStatistics
(require &apos;ess-site)
(setq ess-ask-for-ess-directory nil) ;; start R on default folder
(setq ess-local-process-name &quot;R&quot;)
(setq ansi-color-for-comint-mode &apos;filter) ;;
(setq comint-scroll-to-bottom-on-input t)
(setq comint-scroll-to-bottom-on-output t)
(setq comint-move-point-for-output t)
(setq ess-eval-visibly-p &apos;nowait) ;; no waiting while ESS is evaluating
(defun my-ess-start-R ()
  (interactive)
  (if (not (member &quot;*R*&quot; (mapcar (function buffer-name) (buffer-list))))
      (progn
        (delete-other-windows)
        (setq w1 (selected-window))
        (setq w1name (buffer-name))
        (setq w2 (split-window w1 nil t))
        (R)
        (set-window-buffer w2 &quot;*R*&quot;)
        (set-window-buffer w1 w1name))))
(defun my-ess-eval ()
  (interactive)
  (my-ess-start-R)
  (if (and transient-mark-mode mark-active)
      (call-interactively &apos;ess-eval-region)
    (call-interactively &apos;ess-eval-line-and-step)))
(add-hook &apos;ess-mode-hook
          &apos;(lambda()
             (local-set-key [(shift return)] &apos;my-ess-eval)))
(add-hook &apos;inferior-ess-mode-hook
          &apos;(lambda()
             (local-set-key [C-up] &apos;comint-previous-input)
             (local-set-key [C-down] &apos;comint-next-input)))
(add-hook &apos;ess-mode-hook
          (lambda ()
            (flyspell-prog-mode)
            (run-hooks &apos;prog-mode-hook)
            ;; (prog-mode)
            ))

;; REF: http://stackoverflow.com/questions/2901198/useful-keyboard-shortcuts-and-tips-for-ess-r
;; Control and up/down arrow keys to search history with matching what you&apos;ve already typed:
(define-key comint-mode-map [C-up] &apos;comint-previous-matching-input-from-input)
(define-key comint-mode-map [C-down] &apos;comint-next-matching-input-from-input)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Send Stylish MIME in Emacs</title>
   <link href="http://yitang.uk/2015/01/15/Send-Stylish-MIME-in-Emacs/"/>
   <updated>2015-01-15T00:00:00+00:00</updated>
   <id>http://yitang.uk/2015/01/15/Send-Stylish-MIME-in-Emacs</id>
   <content type="html">&lt;p&gt;
&lt;i&gt;Last Updated&lt;/i&gt;: 18 Jan 2015
&lt;/p&gt;

&lt;p&gt;
This is the first technical article on this blog; however, the main purpose is not to analyse the problem and provide solutions, but to tell the story of an ordinary person trying to pursue his vision in a multi-language environment (Emacs and HTML) of which he only knows the basics.  I hope you find it interesting to read; for those who care more about the solution than the problem-solving approach, please see the last section.
&lt;/p&gt;
&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#orgeced853&quot;&gt;The Problem&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgfb4d1c1&quot;&gt;HTML Attachment Solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org1447c88&quot;&gt;Paradise of MIME&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org1119684&quot;&gt;Hack org-mime&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgd5a8bd1&quot;&gt;MIME Solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgbc51ce1&quot;&gt;Emacs Configuration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgeced853&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgeced853&quot;&gt;The Problem&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgeced853&quot;&gt;

&lt;p&gt;
The first time I thought I needed a fancy email was when I sent a quick model update to my colleague. I had a table like this:
&lt;/p&gt;

&lt;table id=&quot;org6bf4ffd&quot; border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Conditioning Variable&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Dependent Variable&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Probability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;k &amp;gt;= 50&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;t &amp;gt;= 50&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.154&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;k &amp;gt;= 50&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;t &amp;gt;= 100&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.111&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;k &amp;gt;= 50&lt;/td&gt;
&lt;td class=&quot;org-left&quot;&gt;t &amp;gt;= 200&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.078&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;


&lt;p&gt;
It was written in org-mode, in which I can do the formatting quickly and nicely.  But once copied over to Outlook, it looks messy, and the columns do not line up.
&lt;/p&gt;

&lt;p class=&quot;verse&quot;&gt;
| Conditioning Variable | Dependent Variable | Probability |&lt;br /&gt;
|-----------------------+--------------------+-------------|&lt;br /&gt;
| k &amp;gt;= 50               | t &amp;gt;= 50            |       0.154 |&lt;br /&gt;
| k &amp;gt;= 50               | t &amp;gt;= 100           |       0.111 |&lt;br /&gt;
| k &amp;gt;= 50               | t &amp;gt;= 200           |       0.078 |&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;
The correct way is to insert a &lt;i&gt;table&lt;/i&gt; in Outlook.  First, I have to export the table to a CSV file, then open it in Excel, and finally copy it over to Outlook, which will recognise it as a &lt;i&gt;table&lt;/i&gt;.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgfb4d1c1&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgfb4d1c1&quot;&gt;HTML Attachment Solution&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgfb4d1c1&quot;&gt;

&lt;p&gt;
I guess the purpose of that email was to give my colleague a few numbers in a way that lets him compare them and get a feeling for the model.  So the formatting is really necessary, but the workaround is really tedious.
&lt;/p&gt;

&lt;p&gt;
I have another colleague who is an HTML expert and produced a company CSS style sheet. He kindly customised it to match the &lt;a href=&quot;http://orgmode.org/manual/CSS-support.html#CSS-support&quot;&gt;org-export classes&lt;/a&gt;, e.g. &lt;i&gt;org-ul&lt;/i&gt;, &lt;i&gt;org-table&lt;/i&gt;, &lt;i&gt;org-list&lt;/i&gt;.
&lt;/p&gt;

&lt;p&gt;
So what I did was export the org file as HTML and attach it to the email, so that my colleague could simply click and open it in a browser, which gives him a nicely formatted table.  But people get hundreds of emails per day and seem to dislike attachments.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org1447c88&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org1447c88&quot;&gt;Paradise of MIME&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org1447c88&quot;&gt;

&lt;p&gt;
I noticed Bernt Hansen pointed out in his famous &lt;a href=&quot;http://doc.norang.ca/org-mode.html#License&quot;&gt;Org Mode - Organize Your Life In Plain Text!&lt;/a&gt; that he uses &lt;b&gt;org-mime&lt;/b&gt; to send HTML email. MIME, which stands for Multipurpose Internet Mail Extensions, is an extension to plain email that enables users to exchange rich data, including images, tables, video etc.
&lt;/p&gt;

&lt;p&gt;
&lt;b&gt;org-mime&lt;/b&gt; can convert the org file into HTML in a way that an email service like Office365 or Gmail can recognise and render with pre-defined styles.
&lt;/p&gt;

&lt;p&gt;
The default style looks awful: the font, the colour, the size, basically nothing is right.  Recently I sent about 3-5 emails using this style; I suspect the readers spent less time reading and comprehending them, and therefore the message was not conveyed.
&lt;/p&gt;

&lt;p&gt;
But the workflow is fascinating: I call the &lt;code&gt;org-mime-subtree&lt;/code&gt; function, then I just type a few email addresses. There is no need to switch to another program such as Outlook; everything is done in Emacs, at the exact point where the main content is generated.
&lt;/p&gt;

&lt;p&gt;
So I was thinking, what if the email looked as good as the attachment? If I could apply the style to the email, it would look fantastic!
&lt;/p&gt;

&lt;p&gt;
I did my research: &lt;b&gt;org-mime&lt;/b&gt; indeed provides a feature that lets the user change the HTML style; two examples are shown on &lt;a href=&quot;http://orgmode.org/worg/org-contrib/org-mime.html&quot;&gt;worg&lt;/a&gt;. The package first generates the HTML, and then searches and replaces certain chunks, for example,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;p&amp;gt;&lt;/span&gt;
  this is a paragraph
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
becomes something like this, depending on the user&apos;s specification:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;p&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;style=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;color: blue;&quot;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
  this is a paragraph
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
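
&lt;p&gt;
On the Emacs side, this kind of replacement is requested from a hook. A minimal sketch, assuming the &lt;code&gt;org-mime-change-element-style&lt;/code&gt; helper described on the worg page is available in the installed version of org-mime; the element and the colour are arbitrary choices for illustration:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; Assumption: org-mime provides `org-mime-change-element-style&apos;
;; (see the worg page); check your installed version.
(add-hook &apos;org-mime-html-hook
          (lambda ()
            ;; give every &amp;lt;p&amp;gt; element an inline style attribute
            (org-mime-change-element-style &quot;p&quot; &quot;color: blue;&quot;)))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;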



&lt;p&gt;
The search-and-replace mechanism works fine for a small email.  It takes a pair of values &lt;i&gt;(element, style)&lt;/i&gt;, where the element can be a paragraph, table or list, and the style can be colour, font, size etc.  The problem is that this pair does not quite match a standard CSS file,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-css&quot; data-lang=&quot;css&quot;&gt;&lt;span class=&quot;nt&quot;&gt;body&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nl&quot;&gt;font-family&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&quot;Helvetica Neue&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&quot;Lucida Grande&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&quot;Lucida Sans Unicode&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Helvetica&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Arial&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sans-serif&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;!important&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;nl&quot;&gt;font-size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;14px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;body&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;#content&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nl&quot;&gt;padding-top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;70px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;p&gt;
One could process the CSS file and feed the package a long list of pairs.  But this approach does not seem safe.   I quickly skimmed the CSS file and found things I couldn&apos;t understand, for example the &lt;i&gt;body #content&lt;/i&gt; block above.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org1119684&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org1119684&quot;&gt;Hack org-mime&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org1119684&quot;&gt;

&lt;p&gt;
I think the most problem-free approach is to follow &lt;i&gt;org-export-html&lt;/i&gt; and ensure the generated email has the same style as the exported HTML. The &lt;b&gt;org-mime&lt;/b&gt; package will eventually implement this, but I didn&apos;t want to wait and decided to hack.
&lt;/p&gt;

&lt;p&gt;
The &lt;a href=&quot;http://orgmode.org/w/?p%3Dorg-mode.git%3Ba%3Dblob_plain%3Bf%3Dcontrib/lisp/org-mime.el%3Bhb%3DHEAD&quot;&gt;script&lt;/a&gt; is formatted in a nice way and looks like a textbook C program: it first declares variables and functions, with concise documentation, so one can visualise the structure after reading for 5-10 minutes.  But the implementation is way beyond my knowledge of the Emacs Lisp language.  I had to look up almost every function that is used. Take this snippet for example,
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(with-temp-buffer
  (insert html)
  (goto-char (point-min))
  (run-hooks &apos;org-mime-html-hook)
  (buffer-string))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
I had no idea what it meant.  You know the feeling when you try to learn a foreign language but pick a book that is way above your level, you find there is not a single word you can understand, and you are like &lt;i&gt;What The Hell&lt;/i&gt;?  That was my feeling.
&lt;/p&gt;

&lt;p&gt;
The strategy I came up with was to build up my Emacs Lisp vocabulary: try to understand the functions/processes and translate them into plain English, for example,
&lt;/p&gt;

&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;with-temp-buffer&lt;/dt&gt;&lt;dd&gt;create a temporary buffer and run the enclosed code in it&lt;/dd&gt;
&lt;dt&gt;insert&lt;/dt&gt;&lt;dd&gt;insert the string, in this case called html, at point&lt;/dd&gt;
&lt;dt&gt;goto-char&lt;/dt&gt;&lt;dd&gt;move the cursor, which is called &lt;i&gt;point&lt;/i&gt; in Emacs, to a given position&lt;/dd&gt;
&lt;dt&gt;point-min&lt;/dt&gt;&lt;dd&gt;the beginning of a buffer/file&lt;/dd&gt;
&lt;dt&gt;run-hooks&lt;/dt&gt;&lt;dd&gt;run the &lt;i&gt;functions&lt;/i&gt; that are linked to org-mime-html-hook&lt;/dd&gt;
&lt;dt&gt;buffer-string&lt;/dt&gt;&lt;dd&gt;return the contents of the &lt;i&gt;buffer&lt;/i&gt; as a &lt;b&gt;string&lt;/b&gt;&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;
Now that I understood each of the &lt;i&gt;words&lt;/i&gt;, I needed to combine them to understand the meaning of this snippet. I tried to write it in plain English, and the first attempt was like this
&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;
create a temporary buffer, insert the generated HTML, then move the cursor to the very start, and then apply the other functions that are linked to org-mime-html-hook
&lt;/p&gt;
&lt;/blockquote&gt;
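
&lt;p&gt;
To check this reading, the same pattern can be tried outside org-mime. The sketch below is only an illustration: &lt;code&gt;my/html-hook&lt;/code&gt; and &lt;code&gt;my/apply-html-hook&lt;/code&gt; are made-up names, not part of the package. Because the cursor sits at the start of the buffer when the hook runs, whatever a hook function inserts ends up in front of the HTML.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; Hypothetical names, for illustration only (not from org-mime).
(defvar my/html-hook nil
  &quot;Functions run in a temporary buffer that holds the HTML.&quot;)

(defun my/apply-html-hook (html)
  &quot;Return HTML after running `my/html-hook&apos; over it.&quot;
  (with-temp-buffer                 ; work in a scratch buffer
    (insert html)                   ; put the string into it
    (goto-char (point-min))         ; cursor back to the start
    (run-hooks &apos;my/html-hook)       ; let each hook function edit the buffer
    (buffer-string)))               ; hand the result back as a string

;; A hook function that inserts an HTML comment at the cursor position.
(add-hook &apos;my/html-hook
          (lambda () (insert &quot;&amp;lt;!-- added by hook --&amp;gt;\n&quot;)))

(my/apply-html-hook &quot;&amp;lt;p&amp;gt;this is a paragraph&amp;lt;/p&amp;gt;&quot;)
;; returns the comment line followed by the original paragraph&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;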


&lt;p&gt;
I continued this &lt;i&gt;word-sentence-paragraph&lt;/i&gt; process and came to understand a few functions.  But it can go on and on, and the more I learn about Emacs Lisp, the further away I drift from my original goal: applying the style sheet to the HTML email.  I guess this is a common dilemma in working with multiple languages.  Usually I follow my interests, but this time I chose to focus on achieving the goal.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgd5a8bd1&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgd5a8bd1&quot;&gt;MIME Solution&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgd5a8bd1&quot;&gt;

&lt;p&gt;
It turned out to be the right decision.  The concept of &quot;inline CSS&quot; is mentioned in the script; I googled it and found the solution within 10 minutes.   I realised that all I need to do is add a block at the beginning of the HTML mail!! BINGO!
&lt;/p&gt;
&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-html&quot; data-lang=&quot;html&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;&amp;lt;style&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;&amp;lt;!-- html email content starts here --&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-orgbc51ce1&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgbc51ce1&quot;&gt;Emacs Configuration&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgbc51ce1&quot;&gt;

&lt;p&gt;
Here are the settings:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;(require &apos;org-mime)
;; org-mime moves the cursor to the start of the generated HTML before
;; running this hook, so the inserted text lands in front of the email body.
(add-hook &apos;org-mime-html-hook
          (lambda ()
            (insert
             &quot;
&amp;lt;head&amp;gt;
&amp;lt;style&amp;gt;
/* content of the .css file */
&amp;lt;/style&amp;gt;
&amp;lt;/head&amp;gt;&quot;))
          t) ;; t appends this function to the end of the hook&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Emacs for Writing</title>
   <link href="http://yitang.uk/2014/12/26/emacs-for-writing/"/>
   <updated>2014-12-26T00:00:00+00:00</updated>
   <id>http://yitang.uk/2014/12/26/emacs-for-writing</id>
   <content type="html">&lt;p&gt;
&lt;i&gt;Last Updated&lt;/i&gt;: 31 Dec 2014
&lt;/p&gt;

&lt;p&gt;
Do you use Emacs for writing LaTeX, Markdown, or Org documents?  Do you have a set of specific settings only for writing?  In this article I will share my experience of configuring a writing mode in Emacs that makes it the most efficient writing tool for me.
&lt;/p&gt;

&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#org94515ca&quot;&gt;Word Count&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org4b2fc12&quot;&gt;Variable-width Font&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#orgca66819&quot;&gt;Sentence Highlight&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org6fae444&quot;&gt;Wrap-up&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org94515ca&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org94515ca&quot;&gt;Word Count&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org94515ca&quot;&gt;
&lt;p&gt;
I try to write as concisely as possible, and I use the word count as a benchmark.  Counting words sounds like a trivial task, but it is not in my case, because I have a habit of commenting, even in general writing.  I may comment out a whole paragraph and leave a note beside it about why; these are kept because they will be helpful when editing or reviewing.  These comments and notes should not be counted, since the reader can&apos;t see them.
&lt;/p&gt;

&lt;p&gt;
In addition to comments, there is a whole list of things that should not be counted in technical articles, like source code, tables, figure captions etc.  Some people may add the reference section to the list as well.
&lt;/p&gt;

&lt;p&gt;
&lt;a href=&quot;https://github.com/dato/org-wc&quot;&gt;org-wc&lt;/a&gt; provides the &lt;code&gt;org-wc-subtree&lt;/code&gt; function, which knows what to count and what not to count.  Also, &lt;code&gt;org-wc-display&lt;/code&gt; loops through all sections and overlays the number of words on each section headline.  It is particularly useful when I need to know which sections need trimming down and which need more.
&lt;/p&gt;

&lt;p&gt;
One of my daily goals is to complete a writing challenge, which is to either write about 500 words or write for 45 minutes, whichever comes first.  It is like a racing game for me, so knowing the time and the number of words is important.  Tracking time is simple in Org-mode, but tracking words is problematic: I have to call the &lt;code&gt;org-wc-subtree&lt;/code&gt; function manually.  I raised an issue on GitHub and was pointed to &lt;code&gt;nanowrimo&lt;/code&gt; mode, which updates the word count while I am typing and shows it on the mode line.
&lt;/p&gt;

&lt;p&gt;
It works out of the box for me.  The number of words is adjacent to the time I have spent, which makes it very convenient to compare.  Also, it calculates the average number of words per minute and uses this number to predict how long I need to achieve my daily goal (which is 500 words).
&lt;img src=&quot;https://dl.dropboxusercontent.com/u/43889494/Screenshot%202014-12-26%2014.39.07.png&quot; alt=&quot;Screenshot%202014-12-26%2014.39.07.png&quot; /&gt;
The picture above shows that I spent 30 minutes editing and there are 254 words in this section.  
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-org4b2fc12&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org4b2fc12&quot;&gt;Variable-width Font&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org4b2fc12&quot;&gt;
&lt;p&gt;
I have been a little obsessive about fonts since university. I use Times New Roman for formal reports and any other serif font for general writing, because they make paragraphs and text easier to read.
&lt;/p&gt;

&lt;p&gt;
There was a time a friend passed me a PDF file and asked me to review it.  The problem was that it was in Arial (I think), which looked terrible and also made writing unpleasant. This experience made me think about what the best font for writing is.
&lt;/p&gt;

&lt;p&gt;
I did some research and came across the concept of variable-width fonts.  As a programmer, I use Adobe&apos;s &lt;a href=&quot;https://github.com/adobe-fonts/source-code-pro&quot;&gt;Source Code Pro&lt;/a&gt; as my default font, which means I face a monospaced font all day.  In a monospaced font, each character takes up the same amount of space.
&lt;/p&gt;

&lt;p&gt;
In a variable-width font, by contrast, each character takes a width corresponding to its shape.  For example, the width of &quot;i&quot; is about a quarter of that of &quot;w&quot;.  Needless to say, a variable-width font is closer to the nature of handwriting.  Emacs has a built-in &lt;code&gt;variable-pitch-mode&lt;/code&gt; that can switch the font.
&lt;/p&gt;
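
&lt;p&gt;
A minimal sketch of turning it on for prose buffers only. It assumes that writing happens in modes derived from &lt;code&gt;text-mode&lt;/code&gt; (Org and Markdown are), and the font family is just a placeholder to swap for whatever is installed:
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; Use the proportional face in text buffers; code buffers stay monospaced.
(add-hook &apos;text-mode-hook #&apos;variable-pitch-mode)

;; Assumption: &quot;Georgia&quot; is installed; replace it with any font you like.
(set-face-attribute &apos;variable-pitch nil :family &quot;Georgia&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;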

&lt;p&gt;
But will it make any difference to my writing?  I am not sure at this moment, but I would like to have a special font that I use solely for writing.  The link between the font and my writing mind will gradually become firm, and eventually increase my productivity in writing.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;outline-container-orgca66819&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orgca66819&quot;&gt;Sentence Highlight&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orgca66819&quot;&gt;
&lt;p&gt;
Writing requires thinking and concentration.  People have their own tips that help them stay focused and &lt;a href=&quot;http://www.lifehack.org/articles/communication/tried-tested-and-true-3-ways-to-get-writing-done.html&quot;&gt;get writing done&lt;/a&gt;; these may relate to a place, a time or tools.
&lt;/p&gt;


&lt;p&gt;
I tried many tips, like meditating before writing, drinking coffee and cutting off the internet, but none of them works very well; the effects seem random.  One problem I have in writing is that I jump between sections quite often.
&lt;/p&gt;

&lt;p&gt;
I tried highlighting one sentence at a time so that I can focus on the one I am writing.  I found that the &lt;a href=&quot;https://github.com/milkypostman/hl-sentence&quot;&gt;hl-sentence&lt;/a&gt; package does exactly what I want.  Also, I followed the author&apos;s suggestion and tweaked the configuration to blur the other sentences to reduce the &lt;i&gt;noise&lt;/i&gt;.
&lt;/p&gt;
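
&lt;p&gt;
Enabling it can be sketched as below, assuming the &lt;code&gt;hl-sentence&lt;/code&gt; package is already installed and that it should be active in every &lt;code&gt;text-mode&lt;/code&gt; buffer. The dimming of the other sentences is configured separately through faces and is left out here.
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-emacs&quot; data-lang=&quot;emacs&quot;&gt;;; Assumption: the hl-sentence package is on the load path.
(require &apos;hl-sentence)
;; Highlight the sentence at point whenever a text buffer is opened.
(add-hook &apos;text-mode-hook #&apos;hl-sentence-mode)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;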


&lt;p&gt;
The current setting does both, and it helps me focus naturally: I don&apos;t need to force myself not to look at other sentences.
&lt;/p&gt;

&lt;p&gt;
The sentence highlight feature also has a big impact on my writing process by making editing easier.  One thing I want to achieve is a proper length for each sentence/paragraph: if it is too short, I merge it; if it is too long, I break it up into shorter sentences.  The highlight gives me a visual sense of the length, which I used to get by reading or counting.  To check exactly how many sentences a paragraph has, I move the cursor to the end of a &lt;a href=&quot;http://www.gnu.org/software/emacs/manual/html_node/emacs/Sentences.html&quot;&gt;sentence&lt;/a&gt; with &lt;code&gt;M-e&lt;/code&gt;, and then count how many flashes it takes to reach the end of the paragraph.
&lt;/p&gt;


&lt;div id=&quot;org25b6124&quot; class=&quot;figure&quot;&gt;
&lt;p&gt;&lt;img src=&quot;https://dl.dropboxusercontent.com/u/43889494/Screenshot%202014-12-26%2018.54.03.png&quot; alt=&quot;Screenshot%202014-12-26%2018.54.03.png&quot; /&gt;
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org6fae444&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org6fae444&quot;&gt;Wrap-up&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org6fae444&quot;&gt;
&lt;p&gt;
I am fairly happy with &lt;code&gt;nanowrimo&lt;/code&gt;, &lt;code&gt;hl-sentence&lt;/code&gt; and &lt;code&gt;variable-pitch&lt;/code&gt; mode, and with the powerful Emacs.  Thanks to all the authors who wrote the scripts: because of their quality work, many things work out of the box and I am able to integrate them seamlessly into my current workflow.  It has become more efficient and productive, and makes me believe that Emacs is the best writing tool for me. 
&lt;/p&gt;

&lt;p&gt;
Which program do you use for writing? Which feature do you like most?  
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>How Do I Build This Blog</title>
   <link href="http://yitang.uk/2014/12/17/jekyll/"/>
   <updated>2014-12-17T00:00:00+00:00</updated>
   <id>http://yitang.uk/2014/12/17/jekyll</id>
   <content type="html">&lt;div id=&quot;table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id=&quot;text-table-of-contents&quot; role=&quot;doc-toc&quot;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;#orge110b7f&quot;&gt;Why I Learn Jekyll/Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org08e807a&quot;&gt;Process Raw Data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org91ce073&quot;&gt;Analysis 1 - Mixture of levels&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org8b68f0f&quot;&gt;Analysis 2 - Time distribution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org729b875&quot;&gt;Analysis 3 - Benefits&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;#org10fa780&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;b&gt;To Reader&lt;/b&gt; 
&lt;/p&gt;

&lt;p&gt;
Learning skills are becoming more and more important nowadays because there is just so much to learn, either for the daily job or out of personal interest.  But have you ever thought about the way you learn, or how good your learning skills are?   In this article, I want to share my experience of learning how to build this blog using &lt;a href=&quot;http://jekyllrb.com&quot;&gt;Jekyll&lt;/a&gt; and how I examined the way I learn via data and statistical analysis. It is worth reading if you: 
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;want to improve your learning skill,&lt;/li&gt;
&lt;li&gt;are trying to learn Jekyll or want to build a personal website,&lt;/li&gt;
&lt;li&gt;are interested in quantified-self projects.&lt;/li&gt;
&lt;/ol&gt;


&lt;div id=&quot;outline-container-orge110b7f&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;orge110b7f&quot;&gt;Why I Learn Jekyll/Blog&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-orge110b7f&quot;&gt;
&lt;p&gt;
All my high-school classmates knew that I was really bad at writing. Things became worse when I was at university studying mathematics. I started to avoid writing at all costs: I deliberately &lt;i&gt;volunteered&lt;/i&gt; to do the coding or maths bits and let others do the writing. Everybody in the group seemed to like me. 
&lt;/p&gt;

&lt;p&gt;
During my studies at Warwick University, I took a fancy to statistics and really enjoyed telling people about the relationships between all sorts of facts. Then I came across The Guardian&apos;s Data Journalism program. Data journalism is a term in use since 2009/2010 to describe a journalistic process based on analyzing and filtering large data sets for the purpose of creating a news story&lt;sup&gt;&lt;a id=&quot;fnr.1&quot; class=&quot;footref&quot; href=&quot;#fn.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.  I found it really cool! I tried to analyse a new dataset from the World Bank and wanted to write an article, but when I sat down my mind just went completely blank. I managed to write a few paragraphs but they were really awful. That was the first time I wished I had proper writing skills.  I didn&apos;t bother to learn because I planned to go back to China soon, but I decided to work in the UK at the last minute. 
&lt;/p&gt;

&lt;p&gt;
Last week, I was doing a statistical consultancy project.  I did the analysis and wrote code in the morning, then tried to write it up in the afternoon.  I realised that it was pretty easy for me to do the analysis and coding but a real pain to write it up, although I did enjoy it. So I decided to build a personal blog so that I can practise my writing skills and, at the same time, promote the use of statistics in daily life.
&lt;/p&gt;

&lt;p&gt;
I had no prior knowledge of building a website and spent a couple of evenings and weekends sitting in front of a computer at a library or cafe trying to learn. The project spanned two months and took me about 20 hours in total to finally build this blog.  Motivation really is crucial to learning.  What makes this learning experience really unique is that I have data about my learning. 
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org08e807a&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org08e807a&quot;&gt;Process Raw Data&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org08e807a&quot;&gt;
&lt;p&gt;
I have a very good habit: I record every single task I do, in terms of how much time I spent and what I did.  Take this task for example: 
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-text&quot; data-lang=&quot;text&quot;&gt;PM (strcture)
SCHEDULED: &amp;lt;2014-12-11 Thu 17:45&amp;gt;
:LOGBOOK:  
CLOCK: [2014-12-11 Thu 17:41]--[2014-12-11 Thu 18:28] =&amp;gt;  0:47
:END:      
:PROPERTIES:
:Effort:   0:45
:END:
[2014-12-10 Wed 22:28]
wait for 4 minutes 
- [ ] go though all the headlines,
- [ ] group into few categories, like 1) learn, 2) apply, 3) improve etc.
- [ ] start to edit &lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
This note includes all the info I need to know: 
&lt;/p&gt;
&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;I noted the task down on Wednesday and estimated that it would take 45 minutes to complete,&lt;/li&gt;
&lt;li&gt;I planned to do it on Thursday,&lt;/li&gt;
&lt;li&gt;I started on Thursday at 17:41, 4 minutes before the scheduled time,&lt;/li&gt;
&lt;li&gt;It took me 47 minutes to do the job, 2 minutes longer than expected,&lt;/li&gt;
&lt;li&gt;what I did, in the main body.&lt;/li&gt;
&lt;/ol&gt;
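
&lt;p&gt;
As a rough illustration of how the numbers in this list fall out of the note, here is a minimal Python sketch.  It is only a sketch and not necessarily how the summary table below was produced: the variable names are made up for the example, and the clock line is trimmed to its two timestamps. 
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;import re
from datetime import datetime

# Two lines copied from the note above (clock line trimmed after the closing bracket).
clock_line = "CLOCK: [2014-12-11 Thu 17:41]--[2014-12-11 Thu 18:28]"
effort_line = ":Effort:   0:45"

# Org timestamps look like [YYYY-MM-DD Day HH:MM]; the day name is skipped.
stamps = re.findall(r"\[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]", clock_line)
start, end = (datetime.strptime(d + " " + t, "%Y-%m-%d %H:%M") for d, t in stamps)
spent_minutes = int((end - start).total_seconds() // 60)

# The effort estimate is written as H:MM.
hours, minutes = effort_line.split()[-1].split(":")
effort_minutes = int(hours) * 60 + int(minutes)

# Positive means the task took longer than estimated.
overrun_minutes = spent_minutes - effort_minutes
print(spent_minutes, effort_minutes, overrun_minutes)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
For this note it prints 47, 45 and 2: the time spent, the estimate, and the overrun in minutes. 
&lt;/p&gt;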

&lt;p&gt;
I gathered all the notes that are relevant to this project and added up the time I spent on each sub-task.  It can be summarised as a table: 
&lt;/p&gt;
&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;
&lt;caption class=&quot;t-above&quot;&gt;&lt;span class=&quot;table-number&quot;&gt;Table 1:&lt;/span&gt; Clock summary at &lt;span class=&quot;timestamp-wrapper&quot;&gt;&lt;span class=&quot;timestamp&quot;&gt;[2014-12-10 Wed 21:42]&lt;/span&gt;&lt;/span&gt;&lt;/caption&gt;

&lt;colgroup&gt;
&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Headline&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Time&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&lt;b&gt;Total time&lt;/b&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&lt;b&gt;17:22&lt;/b&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;TODO blog&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;17:22&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; DONE Jekyll official guide on github&amp;#x2026;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:22&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; workflow&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:45&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; TODO how to change the look of html&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2:11&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; org-jeykll workflow (final)&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:26&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; DONE blog, does not looks good on&amp;#x2026;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:10&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; discovery jeykll template&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:57&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; DONE Jeykll disaster&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:00&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; NEXT tweak jkyll (add social content&amp;#x2026;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:17&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; DONE Jekyll (disqus and google&amp;#x2026;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:25&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; Jekll,&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2:23&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; Jekyll general search&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:53&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; NEXT intro to jekyll&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0:28&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; Jekyll code highlight&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:03&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;&amp;emsp; DONE Jekyll - non-doing action&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1:02&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;
It tells me that I spent 17 hours and 22 minutes in total on this project.  The first question is: what do these 17 hours mean to me?  I spent more than 500 hours playing video games between 2013 and 2014 and I really got nothing out of it.  I really enjoyed this learning experience because I had a definite goal I wanted to achieve, I put in the effort, and I got some results.  
&lt;/p&gt;

&lt;p&gt;
The rest of the table lists the tasks I did in chronological order.  The first task was to read the official Jekyll guide, on which I spent 1 hour 22 minutes.  This table looks really awful and gives you an insight into what real-world data looks like.  I didn&apos;t really know what to do with it, so I had to refer to the way I learnt as a child. 
&lt;/p&gt;

&lt;p&gt;
In our education system, the way we learn is that we start at a basic level, study the topic, apply it, make mistakes, correct them, and continue this process at the next level.  That gave me the idea of defining &quot;levels&quot;.  So I skimmed the whole project, from beginning to end, and grouped the data into five categories:
&lt;/p&gt;

&lt;dl class=&quot;org-dl&quot;&gt;
&lt;dt&gt;Basis&lt;/dt&gt;&lt;dd&gt;the foundations of Jekyll, websites and the HTML language,&lt;/dd&gt;
&lt;dt&gt;Features&lt;/dt&gt;&lt;dd&gt;the extended features provided by Jekyll, for example adding social network links or a discussion/comments section,&lt;/dd&gt;
&lt;dt&gt;Workflow&lt;/dt&gt;&lt;dd&gt;integrating the publishing process into my current workflow and trying to automate as much of the process as possible,&lt;/dd&gt;
&lt;dt&gt;Try&lt;/dt&gt;&lt;dd&gt;trying to implement something new for my own needs, or trying other people&apos;s ideas,&lt;/dd&gt;
&lt;dt&gt;Fix&lt;/dt&gt;&lt;dd&gt;fixing problems along the way as I built up the website.&lt;/dd&gt;
&lt;/dl&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org91ce073&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org91ce073&quot;&gt;Analysis 1 - Mixture of levels&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org91ce073&quot;&gt;
&lt;p&gt;
The levels increase linearly, and each one depends on the previous level. Ideally, the most cost-effective way for a person to learn is to focus on one step, master it, and then go on to the next. This is the way of our education system. But that won&apos;t be the case for a self-study project, which is most likely to be interest-driven. It would be very interesting to see how I jumped between these five levels. I sorted out the timeline for each task I did and plotted the time against the levels. 
&lt;/p&gt;


&lt;div id=&quot;orga25e5f2&quot; class=&quot;figure&quot;&gt;
&lt;p&gt;&lt;img src=&quot;https://dl.dropboxusercontent.com/u/43889494/a.gg.plot.png&quot; alt=&quot;a.gg.plot.png&quot; width=&quot;800&quot; /&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;figure-number&quot;&gt;Figure 1: &lt;/span&gt;timeline view&lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;
The x-axis is the timeline of this project in minutes and the y-axis is the level I defined.  I can tell which level I was at during any study period.  For example, for the first 200 minutes I was studying the basic knowledge, then I spent 35 minutes on Feature, and so on. 
&lt;/p&gt;

&lt;p&gt;
I am very surprised that I reached the Workflow level so early, about 20% of the way into this project.  It actually makes a lot of sense: the project spanned 2 months, and having a workflow that suits me really accelerated it, because it took no time to pick things up again after leaving the project for a few days.  
&lt;/p&gt;

&lt;p&gt;
There was a time when the website was broken and I could not figure out why.  It turned out that I had misspelled the name of a configuration file.  
&lt;/p&gt;

&lt;p&gt;
Finally, I spent some time googling how other people use Jekyll, tried quite a few of their ideas, and never went back to a lower level. 
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org8b68f0f&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org8b68f0f&quot;&gt;Analysis 2 - Time distribution&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org8b68f0f&quot;&gt;
&lt;p&gt;
It would also be interesting to see how long I spent on each level: 
&lt;/p&gt;

&lt;div id=&quot;orgf26579b&quot; class=&quot;figure&quot;&gt;
&lt;p&gt;&lt;img src=&quot;https://dl.dropboxusercontent.com/u/43889494/a.pie.chart.2.png&quot; alt=&quot;a.pie.chart.2.png&quot; width=&quot;600&quot; /&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;figure-number&quot;&gt;Figure 2: &lt;/span&gt;time distribution&lt;/p&gt;
&lt;/div&gt;

&lt;table border=&quot;2&quot; cellspacing=&quot;0&quot; cellpadding=&quot;6&quot; rules=&quot;groups&quot; frame=&quot;hsides&quot;&gt;


&lt;colgroup&gt;
&lt;col  class=&quot;org-left&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;

&lt;col  class=&quot;org-right&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-left&quot;&gt;Level&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Time (Min)&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Percentage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;Basis&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;184&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.18&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;Feature&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;276&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.26&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;Workflow&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;191&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.18&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;Fix&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;70&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.07&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;org-left&quot;&gt;Try&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;321&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;0.31&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;
The pie chart should be read clockwise and is ordered by level. I spent 3 hours at the Basis level and to be honest I understand much  about Jekyll.  It is interesting to see that I spent 8 percentage points more of my time on the extended features than on the basic knowledge (26% versus 18%), i.e. using Jekyll rather than studying it.
&lt;/p&gt;
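
&lt;p&gt;
The figures in the table can be checked with a few lines of arithmetic.  The following Python sketch simply re-enters the minutes from the table by hand (the variable names are my own for this example) and recomputes the total and each level&apos;s share: 
&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Minutes per level, copied by hand from the table above.
minutes_by_level = {
    "Basis": 184,
    "Feature": 276,
    "Workflow": 191,
    "Fix": 70,
    "Try": 321,
}

# Total project time, in minutes and as hours plus minutes.
total_minutes = sum(minutes_by_level.values())
total_hours, remainder = divmod(total_minutes, 60)

# Share of the total for each level, rounded to two decimals as in the table.
shares = {
    level: round(spent / total_minutes, 2)
    for level, spent in minutes_by_level.items()
}

print(total_minutes, total_hours, remainder)
for level, share in shares.items():
    print(level, share)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;
The total comes to 1042 minutes, which is the 17 hours and 22 minutes reported in the clock summary, and the shares match the Percentage column. 
&lt;/p&gt;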

&lt;p&gt;
Thanks to all the volunteers who work on connecting Jekyll and Org mode, I only spent 3 hours setting up the workflow, which includes writing an article in Org mode (plain text), converting it to an HTML web page, and then uploading it to my blog.  Jekyll is sophisticated and well-tested software that can be configured easily, and I didn&apos;t run into any problems.
&lt;/p&gt;

&lt;p&gt;
The remaining 5.4 hours were the most inefficient part of this project.  At that time I was keen to include a table of contents and code syntax highlighting in my blog.  I did a lot of further research into solutions and tried many of them, but none fitted well with the foundation I had built up.  Some were buggy and created more problems.  If you really need these features, see this blog.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org729b875&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org729b875&quot;&gt;Analysis 3 - Benefits&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org729b875&quot;&gt;
&lt;p&gt;
The simple question: is it worth spending 18 hours building a website? Timewise, I have spent 20 hours on writing, but this number does not count, because I can write without a blog at all. But I feel that having this blog encourages me to write because: 
&lt;/p&gt;

&lt;ol class=&quot;org-ol&quot;&gt;
&lt;li&gt;My writing has a destination. It is not a digital file on my computer that I will forget in a few weeks, or a page in a notebook that I hardly look back at and could potentially lose. All my writing will be on this single website, which has a unique address that everybody can visit whenever and wherever they are.&lt;/li&gt;
&lt;li&gt;The aim of writing has changed. It is not only to express myself, as in a diary, but to share a journal with others. I have to consider the reader&apos;s feelings: will they like it? Will they understand it? So the writing becomes more proactive, involves more thinking about human relationships, and is thus more fun.&lt;/li&gt;
&lt;li&gt;Writing is linked to this website, which is linked to the quantified-self project and Emacs/org-mode, which are in turn linked back to statistics and programming, my two main passions. Writing is not something that occurred to me one day and that I then swore to master, but something that sits in my network of passions and extends it to another area.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id=&quot;outline-container-org10fa780&quot; class=&quot;outline-2&quot;&gt;
&lt;h2 id=&quot;org10fa780&quot;&gt;Conclusion&lt;/h2&gt;
&lt;div class=&quot;outline-text-2&quot; id=&quot;text-org10fa780&quot;&gt;
&lt;p&gt;
&lt;span class=&quot;timestamp-wrapper&quot;&gt;&lt;span class=&quot;timestamp&quot;&gt;[2014-12-19 Fri 13:21]&lt;/span&gt;&lt;/span&gt;
First, quantify my time on learning is 
&lt;/p&gt;

&lt;p&gt;
Words: 1657, Write: 5 Hours
&lt;/p&gt;







&lt;p&gt;
&lt;a href=&quot;https://www.coursera.org/course/learning&quot;&gt;Learning How to Learn: Powerful mental tools to help you master tough subjects&lt;/a&gt;
&lt;a href=&quot;http://datajournalismhandbook.org&quot;&gt;The Data Journalism Handbook&lt;/a&gt;
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&quot;footnotes&quot;&gt;
&lt;h2 class=&quot;footnotes&quot;&gt;Footnotes: &lt;/h2&gt;
&lt;div id=&quot;text-footnotes&quot;&gt;

&lt;div class=&quot;footdef&quot;&gt;&lt;sup&gt;&lt;a id=&quot;fn.1&quot; class=&quot;footnum&quot; href=&quot;#fnr.1&quot; role=&quot;doc-backlink&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;div class=&quot;footpara&quot; role=&quot;doc-footnote&quot;&gt;&lt;p class=&quot;footpara&quot;&gt;wiki&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;


&lt;/div&gt;
&lt;/div&gt;</content>
 </entry>
 

</feed>

