Git repository via Apache – Take 3

I revisited an old post of mine to see how to host my own private git repository with Apache, but I hadn't written down any details and made it hard on myself, so I'll try to fix that here.

My setup

  • Anonymous read access, user/password protected write access
  • Ubuntu 14.04
  • Apache 2.4
  • Git 1.9.1

Step 1. Enable the prerequisite modules.

I had to add symlinks in /etc/apache2/mods-enabled for the rewrite and cgi modules; the easiest way is sudo a2enmod rewrite cgi, which creates the symlinks for you. Both modules ship with a vanilla install, so look for rewrite.load and cgi.load in the mods-available folder.

In addition, you need to explicitly enable the rewrite engine in the virtual host setup of the vanilla apache2 install. I added these lines right under the log defines in sites-enabled/000-default.conf:

RewriteEngine On
RewriteOptions Inherit

Step 2. Set up the git repositories.

I set up all my repositories under /path/to/git, e.g. /path/to/git/project1.git/, etc. The repositories have to be set up in a specific way to enable push. Here’s a script I use:

#!/bin/bash
if [[ $# -lt 1 ]]; then
    echo "Usage: $(basename "$0") repo-name"
    exit 1
fi
cd /path/to/git || exit 1
if [[ -e "${1}.git" ]]; then
    echo "${1}.git already exists"
    exit 1
fi
mkdir "${1}.git"
cd "${1}.git"
git init --bare
git config http.receivepack true   # allow push over http
git update-server-info             # generate info/refs for http clients
touch git-daemon-export-ok         # mark the repository as exportable
cd ..
sudo chown -R www-data:www-data "${1}.git"
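To sanity-check what the script produces, the same steps can be replayed in a throwaway directory (project1.git is just an example name; the chown step is skipped since no Apache is involved):

```shell
#!/bin/sh
set -e
# Replay the script's steps in a temporary directory and inspect the result.
cd "$(mktemp -d)"
mkdir project1.git
cd project1.git
git init -q --bare
git config http.receivepack true   # allow push over http
git update-server-info             # generates info/refs for http clients
touch git-daemon-export-ok         # marks the repository as exportable
ls info/refs git-daemon-export-ok
git config http.receivepack
```

If info/refs is present and http.receivepack prints true, the repository is ready to be served.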

Step 3. Set up a users list with htpasswd.

  1. Install apache2-utils (sudo apt-get install apache2-utils)
  2. Create the password file: htpasswd -c /path/to/git/users.txt username
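If apache2-utils isn't handy, an MD5 (apr1) entry in the same format can also be generated with openssl; "alice" and "s3cret" below are placeholder credentials and users.txt a throwaway file:

```shell
#!/bin/sh
set -e
# Generate an htpasswd-style basic auth entry without apache2-utils.
# "alice" / "s3cret" are placeholder credentials.
cd "$(mktemp -d)"
printf 'alice:%s\n' "$(openssl passwd -apr1 s3cret)" >> users.txt
cat users.txt
```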

Step 4. Set up a git config.

Put this in, say, git.conf under /etc/apache2/conf-available, and enable it with sudo a2enconf git:

SetEnv GIT_PROJECT_ROOT /path/to/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAlias /git/ /usr/lib/git-core/git-http-backend/

RewriteCond %{QUERY_STRING} service=git-upload-pack [OR]
RewriteCond %{REQUEST_URI} /git-upload-pack$
RewriteRule ^/git/ - [E=AUTHNOTREQUIRED:yes]

<LocationMatch "^/git/">
    Require env AUTHNOTREQUIRED
    AuthType Basic
    AuthName "Git access"
    AuthUserFile /path/to/git/users.txt
    Require valid-user
</LocationMatch>

Remarks

I spent way too much time on this because of a mismatch between the git and Apache documentation. The authorization directives I was copying were specific to Apache 2.2, but I was working with Apache 2.4 – see the upgrade notes under run-time configuration changes, where they mention the new authorization module mod_authz_host. Also, because I'm not a sysadmin, I didn't realize that I had to add "RewriteEngine On", so I was scratching my head about why the environment variable was never getting set.

Anyway, I hope the above saves someone some time. I promise not to write any more tutorial posts that only link to documentation without adding anything, because, as you can see, reading the manuals alone does not solve problems.


Polynomial parser in C++ using Boost.Spirit

Edit (07/07/2014): I’m sorry, the initial code compiled but didn’t work. It’s been updated now, along with the blog.

First of all, here is a link to the code. It depends on Boost.Spirit, which is a header-only part of the Boost library. It compiles, and it parses bivariate polynomials (e.g. xy^2 + 2x - 5y).

The input is a user-input string, and the output is a polynomial, which is represented internally as a vector of terms. A term has a coefficient and two exponents, one for each variable.

struct term
{
  boost::optional<int> coefficient;
  boost::optional<int> x_exponent;
  boost::optional<int> y_exponent;
};

So imagine the parser going over the string and processing each term individually, stuffing it into a vector. That’s exactly how the parser rules work.

The rules in the parser define how the user input is parsed, and we attach actions in the form of lambda functions to each rule to be executed when each part is parsed. It starts at the top with start:

namespace phx = boost::phoenix;
...
start = eps[_val = std::vector<term>()]
    >> poly_term[phx::push_back(_val, qi::_1)]
    >> *(
           ('+' >> poly_term[phx::push_back(_val, qi::_1)])
         | ('-' >> negative_poly_term[phx::push_back(_val, qi::_1)])
       )
    ;

First of all, rules have attributes, and the start rule's attribute is a std::vector of term objects. The rule says to start with an empty vector, then parse at least one poly_term (i.e. polynomial term).

poly_term is another rule, whose attribute is a term. eps is a rule that consumes no input and always matches, which makes it a convenient spot to attach the initialization action.

Inside the brackets of each rule is a lambda function that does something with the attribute of the rule. For example, the rule for poly_term says to take the term, stored in the placeholder qi::_1, and stuff it into the vector that we initialized at the beginning.

A single polynomial term carries three pieces of information: the coefficient out in front and the two exponents, one for each variable. The next rule shows how to break it down.

poly_term = eps[_val = term()]
    >> -int_[phx::bind(&term::coefficient, _val) = qi::_1]
    >> -x_term[phx::bind(&term::x_exponent, _val) = qi::_1]
    >> -y_term[phx::bind(&term::y_exponent, _val) = qi::_1]
    ;

First, we try to pick out the coefficient. int_ is a “primitive” Spirit rule that will pick up an int from the string being parsed. The action says that the int will be taken and assigned to the coefficient field of the term attribute of this rule.

Note that the coefficient is optional, as indicated by the dash in front of int_. The Kleene star in the start rule up above is another parser operator; it matches any number of poly_terms. Here's a complete list of the parser operators.

Somehow, this code is quite beautiful to look at.

Getting started with CppSharp

CppSharp is one cool project that generates C# bindings for C++ libraries. It's under active development, and I finally got around to getting a sample up and running on OS X. Here's a running example from following the Getting Started guide on the GitHub page.

First, you'll need to set yourself up with Mono. I installed mine through Homebrew, which gave me a 32-bit runtime, so we need to make sure that the C++ library we interface with is also compiled for 32 bits.

Here’s a sample library. Make it and extract the archive into the working directory. You’ll see the following structure get unpacked:

libsample
libsample/include
libsample/lib

Next, you’ll need to clone and build CppSharp. The Getting Started page is the quickest way to build it. In short, you

  1. Clone the particular revisions of LLVM and Clang into the tools subdirectory.
  2. Configure and build LLVM with CMake, enabling C++11 and libc++ standard library support by adding cache entries LLVM_ENABLE_CXX11 and LLVM_ENABLE_LIBCXX.
  3. Configure and build CppSharp. They use premake as their build system, which is a lot simpler to deal with.

The result of building CppSharp is a set of .dlls. The easiest thing is to copy all of them into the working directory of the executable for your binding-generator code. Otherwise, you will need to add the directory containing these libraries to MONO_PATH.

To generate your bindings for libsample, you implement ILibrary. Here’s a barebones example that compiles and runs. It assumes you’ve got the following in your working directory:

lib/Release_x32/
libsample/{include,lib}

The binding generator is Sample.exe; it parses the sample library and spits out bindings in the out/ directory.

Then you can proceed to use your C++ assets from C# — TestSample.exe compiled in the barebones example above will show you how. You just have to make sure the .dylib is in the working directory or visible through LD_LIBRARY_PATH.

Now that I've got this up and running, I'm looking into experimenting with QtSharp. From what I can see, the developer has committed bindings for Qt5, so the path of least friction is to do the same rather than mess around with Qt4. I built 32-bit Qt5 overnight last night and will be testing it shortly.

Git Repository via Apache II

I made a previous post about serving a git repository via Apache, and the method of delivery was WebDAV. Fast forward a couple of years and I find myself trying to use the same setup on our lab machine. In the end, it doesn’t work: in spite of everything, a git push will always fail due to some error locking a refs file. I ended up giving up on WebDAV and going with git-http-backend.

The setup is essentially identical to the first example here. A vanilla Apache 2.4 install has the required modules (mod_cgi, mod_alias, mod_env) available, so it's just a matter of uncommenting them in httpd.conf. I only had to change the ScriptAlias line to point to the git-http-backend command in the local install of git. We already had a basic auth file from our mod_dav_svn setup, which we reused to protect access to the git repositories. Everything just worked, and we didn't need to rebuild Apache.
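For reference, the one line I changed looked roughly like this (the install path is a placeholder for wherever the local git build lives):

```apache
ScriptAlias /git/ /path/to/local/git/libexec/git-core/git-http-backend/
```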

I kind of regret spending so much time compiling Apache 2.4 thinking that the error had to do with running Apache 2.2, but then again it was a nice exercise in building a bunch of interlinked libraries on CentOS. One thing I learned is that in the typical configure-make-make install cycle, ./configure --prefix=PREFIX is really useful. I always felt terrible about sending everything to /usr/local because (1) it gets pretty messy, and (2) you might not have permission to write there. Having picked up the CMake convention of always doing an out-of-source build, setting a prefix makes a lot of sense, especially if you're building stuff just to try out. I ended up not needing most of the stuff I built, and now I don't have to worry about cleaning up /usr/local.

I git it: the branch + merge

It’s been almost a month since I first committed our project to version control under git, but up until now, I didn’t really know how to leverage git to support our two-person development effort. I’ll talk about the project setup that evolved from the past few weeks that’s really working out for me now.

First, a little bit about our setup: we have a server where we have our application deployed, but we also have active development going on right on that server. This is convenient and is a typical workflow for web projects: you’ll make and upload edits to see the changes on the server immediately. I have a local server set up on my laptop that I’ll work on, which saves me the hassle of the upload, and in this case turns out to be nice in that I’m not inadvertently clobbering work on the remote server.

Every once in a while, I'll need to pull in the latest changes and also push out my own work. For the longest time, I didn't have a great way of doing this. Initially, I was just using FTP, at times cherry-picking files and at other times batch-downloading the whole project tree. Needless to say, this wasted a lot of time and involved more pointing-and-clicking than I could stand. It also never gave me much confidence that we were actually in sync.

Sync? Great idea, we can use rsync! We do have shell access after all. The next step came when I wrote an rsync command to pull the files I care about from the remote server to my machine. Rsync is nice because it transfers only the differences, so it's quick. The problem is that it clobbers old stuff, so for a while I rsync'd into a separate folder apart from my working folder and manually diffed and patched. This also got annoying.

But wait a minute, I don’t have to worry about clobbering code, I could just create a branch in git and pull the changes into that branch. My own branch will be untouched and I can use git checkout to switch between branches. From there, I can merge that branch with my main working branch!

For example, the team meets at 5pm to continue work on the project. At that point, I create a branch for myself to work on, and my partner goes about his work on the remote server. When we call it a night later on, I create another branch from the initial 5pm commit and pull in the server's changes using rsync.

At this point, git recognizes the fork, and when you issue a git merge command, git will do its best to merge changes, presenting a list of conflicted files if it can’t resolve them itself.

atsui@atsui-mmx:~/web/letswoosh$ git branch
* alex
  master
  parris
atsui@atsui-mmx:~/web/letswoosh$ git merge parris
Auto-merging app/plugins/woosh/controllers/events_controller.php
Auto-merging app/plugins/woosh/models/event.php
CONFLICT (content): Merge conflict in app/plugins/woosh/models/event.php
Auto-merging app/plugins/woosh/views/events/nearby_events.ctp
CONFLICT (content): Merge conflict in app/plugins/woosh/views/events/nearby_events.ctp
Auto-merging app/views/elements/tag_form.ctp
CONFLICT (content): Merge conflict in app/views/elements/tag_form.ctp
Automatic merge failed; fix conflicts and then commit the result.

I’ll go through the files, clean up the conflicts, add them back to git, and commit the changes, after which the two branches are successfully merged.
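The whole workflow can be reproduced in a toy repository; all names and paths below are made up, and since the two branches touch different files here, the merge is clean (editing the same lines on both branches is what produces the CONFLICT markers):

```shell
#!/bin/sh
set -e
# Branch at the 5pm baseline, do local work on "alex",
# drop the rsync'd server changes on "parris", then merge.
cd "$(mktemp -d)"
git init -q repo
cd repo
git config user.name demo
git config user.email demo@example.com
base=$(git symbolic-ref --short HEAD)   # master or main, depending on git
echo "baseline" > app.php
git add app.php
git commit -qm "5pm baseline"
git checkout -qb alex                   # my working branch
echo "my change" > alex.txt
git add alex.txt
git commit -qm "local work"
git checkout -qb parris "$base"         # holds the rsync'd server files
echo "server change" > parris.txt
git add parris.txt
git commit -qm "server work"
git checkout -q alex
git merge -q --no-edit parris           # clean merge: no overlapping edits
git rev-list --count HEAD               # baseline + 2 branches + merge
```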

The benefits are clear: no manual diff/patch business, the ability to check out any past commit later on, and a pretty graph to look at. After doing things a pretty hard way, I guess I learned a lesson about using git.

This was really helpful for understanding merge:
http://github.com/guides/pull-requests

Knight’s Tour Problem

A knight's tour


One of the homework problems we had in graph theory class asks: given a knight and an 8×8 chessboard, can you start the knight on one square, move into every other square exactly once, and land back on the square you started from? This is a really annoying problem because it's easy to understand and sucks you in for hours. I only got half credit on it by trying to fill a 4×4 part of the board as much as I could and copying the same moves four times. Apparently, a knight's tour doesn't exist on a 4×4 board, as Professor So mentioned along with some theorems about this problem. If you're interested in more, check out Schwenk's theorem.

Coincidentally, while randomly practicing programming problems on TopCoder, I found that the knight's tour makes an appearance as the 2nd problem in division 2 of SRM 447. The problem has you implement Warnsdorff's algorithm, a heuristic that walks the knight around the board, and if you start with a fresh board and the knight in a corner, you can actually discover a knight's tour. But the thing is, it doesn't give you the elusive Hamiltonian circuit but rather a Hamiltonian path, i.e. the starting and ending points you find with this algorithm are different. Still, it's pretty awesome for a 500-point problem.

The cool thing about the algorithm is that although the Hamiltonian path problem is NP-hard, it supposedly finds Hamiltonian paths in many graphs in linear time. I've attached my Java program if you want to test it out. Don't worry about the problem being spoiled by having seen this tour; apparently, there are billions of possible tours. Also, I have not found the Hamiltonian circuit yet, so good job if you manage to find it, but don't tell me 🙂

KnightsTour.java

extra space kills PHP session

So I'm working on a menu system for a CMS project in CakePHP, and I'm really getting the hang of the MVC pattern. Cake also makes things convenient; this isn't fun to do, but it's easy when you stick to the framework.

It's pretty common to keep track of sessions: you write to a session variable and check against it to decide what to display. For some reason, I was logged in, but when I checked my session, Cake couldn't find my session information at all. The first thing I did was check my configuration: I set it to track sessions by writing temporary files so that I'd have evidence that the sessions themselves were fine.

So that's all well and good, but I still got the same problem. It wasn't making much sense to me, but I noticed that it worked fine in another controller. After a while, I found this: https://trac.cakephp.org/ticket/5031. It basically says I had an extra space at the end of my file that was causing the session data to be dropped. Apparently, it's a PHP issue. Quite an annoying bug to track down!