Git repository via Apache – Take 3

I revisited an old post of mine to see how to host my own private git repository with Apache, but I hadn’t written down any details, which made it hard on myself, so I’ll try to fix that here.

My setup

  • Anonymous read access, user/password protected write access
  • Ubuntu 14.04
  • Apache 2.4
  • Git 1.9.1

Step 1. Enable the prerequisite modules.

I had to add symlinks in /etc/apache2/mods-enabled for the rewrite and cgi modules. These are already available with a vanilla install, so look for rewrite.load and cgi.load in the mods-available folder.

In addition, you need to explicitly enable the rewrite module in the virtual host setup in the vanilla apache2 install. I added these lines right under the log defines in sites-enabled/000-default.conf:

RewriteEngine On
RewriteOptions Inherit

Step 2. Set up the git repositories.

I set up all my repositories under /path/to/git, e.g. /path/to/git/project1.git/, etc. The repositories have to be set up in a specific way to enable push. Here’s a script I use:

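#!/bin/bash
# Create a bare repository under /path/to/git that accepts pushes over HTTP.
if [[ $# -lt 1 ]]; then
    echo "Usage: $(basename "$0") repo-name"
    exit 1
fi
cd /path/to/git || exit 1
if [[ -e ${1}.git ]]; then
    echo "${1}.git exists"
    exit 1
fi
mkdir "${1}.git"
cd "${1}.git" || exit 1
git init --bare
# Allow pushes through git-http-backend.
git config http.receivepack true
git update-server-info
# Mark the repository as exportable.
touch git-daemon-export-ok
cd ..
# Apache runs as www-data and needs to own the repository.
sudo chown -R www-data:www-data "${1}.git"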

Step 3. Set up a users list with htpasswd.

  1. Install apache2-utils
  2. htpasswd -c /path/to/git/users.txt username

Step 4. Set up a git config.

Put this in, say, git.conf under /etc/apache2/conf-available, and enable it with a symlink in conf-enabled, the same way as the modules in Step 1:

SetEnv GIT_PROJECT_ROOT /path/to/git
ScriptAlias /git/ /usr/lib/git-core/git-http-backend/

RewriteCond %{QUERY_STRING} service=git-upload-pack [OR]
RewriteCond %{REQUEST_URI} /git-upload-pack$
RewriteRule ^/git/ - [E=AUTHNOTREQUIRED:yes]

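<LocationMatch "^/git/">
    AuthType Basic
    AuthName "Git access"
    AuthUserFile /path/to/git/users.txt
    # Assumed Apache 2.4 form: let through requests flagged by the rewrite
    # rule above (anonymous reads), otherwise ask for a valid user.
    Require env AUTHNOTREQUIRED
    Require valid-user
</LocationMatch>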

I spent way too much time on this because of a mismatch between git and apache documentation. Basically, the authorization directives in the git documentation were specific to Apache 2.2 but I was working with Apache 2.4; see the Apache 2.4 upgrade notes under run-time configuration changes, where they mention the new authorization module mod_authz_host. Also, because I’m not a sysadmin, I didn’t realize that I had to add “RewriteEngine On”, so I was scratching my head about why the environment variable was never getting set.

Anyways, I hope the above saves someone some time. I promise not to write any tutorial blogs that only link to documentation without adding anything because, as you can see, reading manuals does not solve problems.

Diary #23: Trying to face forward

I often find myself getting into this mood where I want to do something that I know I should have spent more time doing, but out of fear of how much it’s going to hurt because I had put off doing it, I end up putting it off even further.

Most of the time, what seems to hurt is self-inflicted. There is the guilt trip of being less than fully devoted, not committed, and irresponsible that sends me into a negative feedback loop, thinking that if that’s my nature, why should I even waste my time pretending? It’s my nature to care more than I should about what other people think, and combining this with the negativity I mentioned ends up being very debilitating.

I’m trying to get myself to face forward in spite of the bad feelings. To begin with, I’m writing this blog to break the chain of missed blogs that I’ve achieved during this latter half of summer. Now that that’s over with, maybe I’ll be able to focus on some of the positive things that are going on with me. Things like finishing up with Summer of Code and going to Paris in mid-September! Next blog.

Diary #22: A lot of little things

Well, it’s hard to blog about a lot of little things, so I’ll clump them all into one.

I sent off a friend and fellow graduate student at SFO this past week after providing shelter in my apartment for about a week. I felt bad about how I was worrying about how it was cramping my bachelor grad student routine in the back of my head while they were staying over. I ended up just spending most of my time with them towards the end of their stay since, hey, who knows when I’ll get to meet them again after they return to China.

Since publishing the vim addin as a standalone plugin, people have starred the repo on github. I feel obligated to go to work on it and also a little guilty for not spending much time on it. Maybe dedicate an hour or so daily to looking up the API and jotting down plans on the wiki so that the project actually has a freaking pulse.

Speaking of which, I’m making progress on writing my own python console Qt widget. There is an existing project on sourceforge called qconsole, but it’s GPL and I’d rather go through the process of setting up my own.

I think I want to dedicate some time regularly visiting the code review stack exchange, at least to work on my mental checklist of reviewing code when I don’t have it in compilable/debuggable format. I guess this might be more for code quality rather than correctness of implementation of algorithms, but it’s still important to get a mental checklist ironed out for when I have to look over my own code during an interview.

I managed to set myself up to buy Kindle ebooks in Japanese from Amazon Japan. Sadly, you really only have print copy as an option for certain novels, but at least the selection is much larger than what is simply on Amazon US. It is so amazing.

I’ve got the okay from my advisor to take the trip to Paris in September (I was invited to participate in a CGAL developer meeting). It’d be nice to get to talk to people who do geometry for a living, get some feedback about my project, about software engineering prospects, open source development, and life in general. I’m going to focus on doing a good job for the rest of the summer of code and maybe follow the mailing list more closely so I can make a good impression when I get there. Also, time to do some trip planning in case my parents want to tag along.

Polynomial parser in C++ using Boost.Spirit

Edit (07/07/2014): I’m sorry, the initial code compiled but didn’t work. It’s been updated now, along with the blog.

First of all, here is a link to the code. It depends on Boost.Spirit, which is a header-only part of the Boost library. It compiles, and it parses bivariate polynomials (e.g. xy^2 + 2x - 5y).

The input is a user-input string, and the output is a polynomial, which is represented internally as a vector of terms. A term has a coefficient and two exponents, one for each variable.

struct term
{
  boost::optional<int> coefficient;
  boost::optional<int> x_exponent;
  boost::optional<int> y_exponent;
};

So imagine the parser going over the string and processing each term individually, stuffing it into a vector. That’s exactly how the parser rules work.
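
To make the vector-of-terms representation concrete, here is a minimal sketch of evaluating such a polynomial at a point. It is not part of the linked code: SimpleTerm, ipow, evaluate, and example_polynomial are hypothetical names, plain ints stand in for boost::optional, and it assumes that a missing coefficient means 1 and a missing exponent means 0.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for the post's term: plain ints replace boost::optional.
// Assumption: a missing coefficient is stored as 1, a missing exponent as 0.
struct SimpleTerm {
    int coefficient;
    int x_exponent;
    int y_exponent;
};

// Integer power by repeated multiplication; exponents must be non-negative.
long ipow(long base, int exp) {
    assert(exp >= 0);
    long result = 1;
    for (int i = 0; i < exp; ++i) result *= base;
    return result;
}

// Evaluate the polynomial at (x, y) by summing the value of each term.
long evaluate(const std::vector<SimpleTerm>& poly, long x, long y) {
    long sum = 0;
    for (const SimpleTerm& t : poly)
        sum += t.coefficient * ipow(x, t.x_exponent) * ipow(y, t.y_exponent);
    return sum;
}

// xy^2 + 2x - 5y, one entry per term, in the order the terms are written.
std::vector<SimpleTerm> example_polynomial() {
    return { {1, 1, 2}, {2, 1, 0}, {-5, 0, 1} };
}
```

Calling evaluate(example_polynomial(), 2, 3) adds up 18, 4, and -15, one contribution per term in the vector.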

The rules in the parser define how the user input is parsed, and we attach actions in the form of lambda functions to each rule to be executed when each part is parsed. It starts at the top with start:

 namespace phx = boost::phoenix;
 start = eps[_val = std::vector<term>()]
     >> poly_term[phx::push_back(_val, qi::_1)]
     >> *(
         ('+' >> poly_term[phx::push_back(_val, qi::_1)])
         |
         ('-' >> negative_poly_term[phx::push_back(_val, qi::_1)])
     )
     ;

First of all, rules have attributes, and the start rule’s attribute is a std::vector of terms. The rule says that we should initialize with an empty vector, then expect to parse at least one poly_term (i.e. polynomial term).

The poly_term is another rule, whose attribute is term. eps is another rule that consumes no input and always matches.

Inside the square brackets after each parser is a lambda-like function (a Phoenix expression) that does something with the parsed attribute. For example, the action attached to poly_term in the start rule says to take the term, stored in the placeholder qi::_1, and stuff it into the vector that we initialized at the beginning.

A single polynomial term contains three pieces of information: the coefficient out in front, and the two exponents for each variable, and the next rule shows how to break it down.

 poly_term = eps[_val = term()]
     >> -int_[phx::bind(&term::coefficient, _val) = qi::_1]
     >> -x_term[phx::bind(&term::x_exponent, _val) = qi::_1]
     >> -y_term[phx::bind(&term::y_exponent, _val) = qi::_1]
     ;

First, we try to pick out the coefficient. int_ is a “primitive” Spirit rule that will pick up an int from the string being parsed. The action says that the int will be taken and assigned to the coefficient field of the term attribute of this rule.

Note that the coefficient is optional, as indicated by the dash in front of int_ (Spirit’s optional operator). The Kleene star in the start rule up above is another such operator; it matches any number of signed poly_terms. Here’s a complete list of the parser operators.
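
For comparison, here is a hand-rolled sketch of roughly the same grammar without Spirit, which spells out what the operators do step by step. It is an illustration under assumptions, not the linked code: the names (ParsedTerm, PolyParser, parse_polynomial) are made up, a variable is assumed to be written as x or x^n, integers are unsigned, spaces are skipped, and unlike the all-optional poly_term rule it rejects an empty term.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Plain-int stand-in for the post's term struct (no boost::optional).
struct ParsedTerm {
    int coefficient;  // 1 when no number is written
    int x_exponent;   // 0 when x is absent
    int y_exponent;   // 0 when y is absent
};

class PolyParser {
public:
    explicit PolyParser(const std::string& text) : s_(text), pos_(0), ok_(true) {}

    // start: one poly_term, then any number of '+' or '-' followed by a poly_term.
    bool parse(std::vector<ParsedTerm>& out) {
        ParsedTerm t;
        if (!poly_term(t, 1)) return false;
        out.push_back(t);
        for (;;) {
            skip_spaces();
            if (pos_ == s_.size()) return true;  // all input consumed
            const char op = s_[pos_];
            if (op != '+' && op != '-') return false;
            ++pos_;
            if (!poly_term(t, op == '-' ? -1 : 1)) return false;
            out.push_back(t);
        }
    }

private:
    void skip_spaces() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_])))
            ++pos_;
    }

    // Reads an unsigned integer; leaves value untouched if no digit follows.
    bool integer(int& value) {
        skip_spaces();
        const std::size_t begin = pos_;
        int v = 0;
        while (pos_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[pos_]))) {
            v = v * 10 + (s_[pos_] - '0');
            ++pos_;
        }
        if (pos_ == begin) return false;
        value = v;
        return true;
    }

    // Reads "name" or "name^n"; a bare variable gets exponent 1.
    bool variable(char name, int& exponent) {
        skip_spaces();
        if (pos_ == s_.size() || s_[pos_] != name) return false;
        ++pos_;
        exponent = 1;
        skip_spaces();
        if (pos_ < s_.size() && s_[pos_] == '^') {
            ++pos_;
            if (!integer(exponent)) ok_ = false;  // '^' must be followed by a number
        }
        return true;
    }

    // poly_term: optional coefficient, optional x part, optional y part.
    bool poly_term(ParsedTerm& t, int sign) {
        t.coefficient = 1;
        t.x_exponent = 0;
        t.y_exponent = 0;
        const bool has_coefficient = integer(t.coefficient);
        const bool has_x = variable('x', t.x_exponent);
        const bool has_y = variable('y', t.y_exponent);
        t.coefficient *= sign;
        return ok_ && (has_coefficient || has_x || has_y);
    }

    std::string s_;
    std::size_t pos_;
    bool ok_;
};

// Convenience wrapper: appends the parsed terms to out, returns false on bad input.
bool parse_polynomial(const std::string& text, std::vector<ParsedTerm>& out) {
    PolyParser parser(text);
    return parser.parse(out);
}
```

Each private member function corresponds to one rule, and the three optional pieces in poly_term play the role of the dashes in the Spirit version. The Spirit grammar expresses the same thing in a handful of lines, which is much of its appeal.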

Somehow, it is quite beautiful to look at this code.

Japan travel #3: Sausage at Kyodai

This is picking up from my last Japan travel blog…

There are a lot of loan words in the Japanese language, and this can be convenient, endearing, and absolutely frustrating at the same time. Add in the fact that Japanese love to abbreviate things and you’re assured to never have a dull moment.

The convenient thing about loan words is that Japanese has a specific script called katakana that explicitly marks foreign words. The マクドナルド menu in Japan is chock full of the things you’re used to, like アイスコーヒー, チーズバーガー, and ポテト, but it has some interesting things, like ドイツバーガー.

It’s really cute to pronounce the things you see in katakana to see how the Japanese decided to cast the words into their syllabary. For example, カーブ took me a little while to figure out. エネルギー makes me scratch my head because the Japanese do have the soft G phoneme but decided that the hard G at the end was more appropriate.

It gets complicated when abbreviations come into the picture. ファミレス is a common pattern where the first two syllables of each word in a compound word are smashed together to give you the word. Some are just tricky to parse if you don’t have any context and are seeing it for the first time, like スマホ.

The abbreviations extend to the Japanese words as well. For example, a common food combination is 天婦羅 and 玉子, so a noodle menu item might be called 天玉そば; notice how the first character from each item is used for shorthand.

Now, I thought I’d talk a little bit about the actual reason that I went, and was able to go, to Japan in the first place: the theoretical computer science conference called Symposium on Computational Geometry, otherwise abbreviated as SoCG. But that’s an awkward acronym to spell out, and you can confuse it with things like GSoC, for example. Well, if you look at it long enough, I’m sure you’ll agree with a lot of the frequent attendees that “Sausage” is a much more endearing nickname for the conference. Suddenly, the wifi password of ‘sausage2014kyodai’ makes a lot of sense if you consider that the venue was the clock tower hall at Kyoto Daigaku, or Kyoto University.

You clever Japanese.

Well, over the course of the 4-day conference, I witnessed quite a few theoretical talks that went way over my head, but what I took away from those talks at a high level is that the emphasis doesn’t seem to be on any particular application but rather on solving a previously unsolved problem or solving a problem more efficiently, for example, by proving a lower asymptotic bound. Secondly, beautiful, clean, simple schematic figures are a lot better when the point is to illustrate your method in a severely constrained amount of time. I feel like I put a lot of pressure on myself to visualize real data thinking that I need to show that my work is authentic, but I am missing the point that it might be way too cluttered and distracting for my intended audience.

My seat in one of the conference rooms.

A view from the speaker's point of view. Also, blurry Carlos.

Jin Akiyama, very famous Japanese mathematician, giving his invited talk.

As for my talk, I think it could definitely have gone better. I knew that I had 15 minutes to work with ahead of time, but I did not factor in that part of that would be used for Q&A and speaker changeover, so it was pretty rushed. Anyways, I pointed people to the website I made in the end for reference, so it wasn’t so bad.

I was also lucky enough to run into some CGAL editors on the last day of the conference. From left to right are Michael, Monique, and Eric:


The Kyoto University campus was smaller than I expected. I think it is about the size of San Jose State University. The students really struck me as acting very young, maybe because they are, and I’m starting to not be anymore, but I don’t know — this is just my impression, which is similar to how I felt when I studied abroad in Hong Kong.

A view from the northwest corner of Kyoto University.

Lots of bikes on campus.

Also motor bikes on the left, too.

Small cars are pretty common to see.

Having Indian food with Carlos near campus. Crazy amount for less than $10. The naan is huge.

All in all, it was a fun time, probably solidifying my view that the academic life is not the one for me. It felt great to meet people who are interested in what you do, though most likely that is going to be that one person whose name you’ve seen online doing similar stuff already. It was cool and humbling to witness how chummy the researchers are with each other. I’m sure I will make my mark, but most likely it will be writing things other than academic papers. I think I can be more useful writing software or translations or blogs, for example.

Thanks for the trip!

Code Reading #1: Slick callback registration in MonoDevelop

I’m starting a new category of blog posts called “Code Reading” where I’ll talk about something interesting I saw during my week of coding, with some code snippets of course.

Just today, this caught my eye when I was trying to figure out how to hook into an event in the Vim addin that I was trying to package up. So there is a global preferences window where the user checks boxes to enable certain things, and one of them is vim input mode in the text editor. Basically, there’s a global PropertyService that exposes these selections, as well as fires an event when something is updated.

I found an example of how a handler for this event can be registered/unregistered in the DefaultSourceEditorOptions class. Registration happens in the constructor:

        DefaultSourceEditorOptions (MonoDevelop.Ide.Gui.Content.TextStylePolicy currentPolicy)
        {
            LoadAllPrefs ();
            UpdateStylePolicy (currentPolicy);
            PropertyService.PropertyChanged += UpdatePreferences;
        }

And unregistration happens in Dispose, which plays the role of the destructor:

        public override void Dispose()
        {
            PropertyService.PropertyChanged -= UpdatePreferences;
            FontService.RemoveCallback (UpdateFont);
        }

So PropertyChanged shows up in PropertyService on line 270:

public static event EventHandler<PropertyChangedEventArgs> PropertyChanged;

The interesting thing to me is the event keyword, which means there’s language-level support for event handling.

It’s really slick syntax to just add or subtract your event handler to the event as the += and -= lines above show. Let’s just peek at what that event handler looks like:

        void UpdatePreferences (object sender, PropertyChangedEventArgs args)
        {
            try {
                switch (args.Key) {
                    case "TabIsReindent":
                        this.TabIsReindent = (bool)args.NewValue;
                        break;
                    case "EnableSemanticHighlighting":
                        this.EnableSemanticHighlighting = (bool)args.NewValue;
                        break;
                    // ... the remaining cases and the rest of the method are not shown ...

So it’s interesting that you can just pass the member function around. There’s magic that happens inferring the type of that thing, but the point is that it is a first-class object. Here’s another more explicit example elsewhere:

            properties.PropertyChanged += delegate(object sender, PropertyChangedEventArgs args) {
                if (PropertyChanged != null)
                    PropertyChanged (sender, args);
            };

I guess it’s a lambda function, but not quite: C# calls this an anonymous method, written with the delegate keyword. But it has the signature and the body, all the same.

Well, I was really impressed when I tried to plug my own function in and got a compiler error indicating what the signature should have been. I wonder where it is specified and also where the event is generated. It’s a mystery to me, but I know that it’s a lot more heavyweight to achieve in C++.
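
To give a feel for the extra weight in C++, here is a minimal sketch of a comparable event built on std::function. Everything in it is hypothetical (PropertyChange, PropertyEvent, EditorOptions are illustration names, not MonoDevelop or any library API). Because std::function objects cannot be compared, removal goes through a token returned at registration instead of a -= with the handler itself.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for PropertyChangedEventArgs.
struct PropertyChange {
    std::string key;
    bool new_value;
};

// A tiny event: handlers are stored under integer tokens.
class PropertyEvent {
public:
    typedef std::function<void(const PropertyChange&)> Handler;

    // Registers a handler and returns the token needed to remove it later.
    int subscribe(Handler handler) {
        const int token = next_token_++;
        handlers_[token] = handler;
        return token;
    }

    // Removes the handler registered under token; unknown tokens are ignored.
    void unsubscribe(int token) { handlers_.erase(token); }

    // Calls every registered handler, in order of registration.
    void raise(const PropertyChange& change) const {
        for (const auto& entry : handlers_) entry.second(change);
    }

private:
    int next_token_ = 0;
    std::map<int, Handler> handlers_;
};

// Hypothetical listener, loosely mirroring UpdatePreferences above.
struct EditorOptions {
    bool tab_is_reindent = false;

    void update_preferences(const PropertyChange& change) {
        if (change.key == "TabIsReindent") tab_is_reindent = change.new_value;
    }
};
```

A listener would subscribe with a lambda that forwards to update_preferences, keep the token, and call unsubscribe in its destructor. Nothing here checks the handler signature as nicely as the C# compiler error did, beyond whatever message std::function produces.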

Diary #21: Knee deep in templates, long compiles

So CGAL makes pretty heavy use of C++ templates, and the Summer of Code project that I am working on builds a Qt4 visualization of 2D arrangements, one of these CGAL data structures for building collections of 2D curves. I decided to make the individual UI and event handling components templated by the type of arrangement and specialize as needed, and I think it ended up saving me a lot of typing. The problem is the compilation time for the full demo is pretty substantial.


A clean build actually took 5 minutes using a parallel build on a 4-core system. What’s hard to see is that actually one core gets stuck compiling a certain few files, for example, ArrangementDemoWindow.cpp is a major culprit. This bad boy is responsible for instantiating ArrangementDemoTab, which is templated by currently six different types of arrangements. Each of these tabs instantiates another level of at least six callback components that handle various operations on arrangements. There is a Qt GraphicsItem subclass for visualizing the actual arrangement that is also templated based on the arrangements. Finally, each of the components makes use of little utility classes that are templated based on types provided by the arrangement type.

Now, C++ has no choice but to generate code for implicitly instantiated classes on the spot, so it’s probably no surprise that the memory usage spikes up to 4.1 GB when it hits a “fat” class like ArrangementDemoWindow. Sure, ArrangementDemoWindow.cpp is 1200 lines of code, but that shouldn’t cause such a massive effect. Unless you are a compiler that needs to instantiate a wide swath of template classes on demand to do type checking and such. So the silent console fails to convey the chaos that is happening behind the scenes, and the point is that there is a lot of redundant compilation happening.

So, it’s become an absolute pain to try to do any debugging with the full demo. It means that a single line change in a fat class like ArrangementDemoWindow means you have no choice but to sit through the deep compilation of all those templated classes, even if none of them changed at all. I’m fed up with it. Currently, I have to write a smaller UI example that instantiates only one type at a time so that recompilation is not so ridiculous.

But all this really makes me ask, is all this really unavoidable? Couldn’t I explicitly instantiate a certain set of template classes, precompile them once, and save them to a library for linking later on? How can I indicate this to the compiler?

Actually, if we could make use of C++11 features, we could use the extern template feature to introduce some modularity. For example, suppose you wanted to precompile a vector of ints type. You can put this in your IntVector.h:

    #include <vector>
    extern template class std::vector<int>;
    typedef std::vector<int> IntVector;

and this in your IntVector.cpp:

    #include "IntVector.h"
    // instantiates the entire class from the default template
    template class std::vector<int>;

Then you can include the header file whenever you want to use IntVector. Wherever you use it, the compiler will not generate any code like it normally does, but you will need to compile IntVector.cpp and link its object. Now imagine if you have a lot of templated classes that are tied together with dependent parameter types and collaborate closely. Then this can save a ton of compilation time.
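
The same pattern applied to a user-defined class template looks like the sketch below, squeezed into a single file so that it compiles on its own. Accumulator and sum_of_three are hypothetical names for illustration. Using your own template also sidesteps a wrinkle: as far as I can tell, the standard only allows explicitly instantiating a standard library template when a user-defined type is involved, so std::vector<int> is a borderline case even if compilers accept it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class template standing in for a templated demo component.
template <typename T>
class Accumulator {
public:
    void add(const T& value) { values_.push_back(value); }

    // Sum of everything added so far, starting from a value-initialized T.
    T total() const {
        T sum = T();
        for (std::size_t i = 0; i < values_.size(); ++i) sum += values_[i];
        return sum;
    }

    std::size_t count() const { return values_.size(); }

private:
    std::vector<T> values_;
};

// Would live in the header: declares that Accumulator<int> is instantiated
// elsewhere, so translation units including the header do not instantiate it.
extern template class Accumulator<int>;

// Would live in exactly one .cpp file: instantiates every member once.
// When both appear in one translation unit, the definition must come second.
template class Accumulator<int>;

// Ordinary use; no implicit instantiation of Accumulator<int> is needed here.
int sum_of_three(int a, int b, int c) {
    Accumulator<int> acc;
    acc.add(a);
    acc.add(b);
    acc.add(c);
    return acc.total();
}
```

In a real project the extern line goes in the header and the plain template class line in one source file, exactly like IntVector.h and IntVector.cpp above. The explicit instantiation definition on its own is already valid in C++03; only the extern declaration is the C++11 addition.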

Clang will do it, and I guess g++ should do it if you set it to c++11 mode. But as always, I need to support the older compilers, so that’s right out.