Git repository via Apache – Take 3

I revisited an old post that I wrote to see how to host my own private git repository with Apache, but I didn’t write any details and made it hard on myself, so I’ll try to fix that here.

My setup

  • Anonymous read access, user/password protected write access
  • Ubuntu 14.04
  • Apache 2.4
  • Git 1.9.1

Step 1. Enable the prerequisite modules.

I had to add symlinks in /etc/apache2/mods-enabled for the rewrite and cgi modules. These are already available with a vanilla install, so look for rewrite.load and cgi.load in the mods-available folder.

In addition, you need to explicitly enable the rewrite module in the virtual host setup in the vanilla apache2 install. I added these lines right under the log defines in sites-enabled/000-default.conf:

RewriteEngine On
RewriteOptions Inherit
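
For reference, the symlinks from Step 1 can be scripted. This is a minimal sketch, assuming the stock Ubuntu layout under /etc/apache2. The function name enable_apache_mods and the APACHE_ROOT variable are names I made up for illustration. The packaged a2enmod command does the same job.

```shell
# Hypothetical helper: link each named module's .load file into mods-enabled.
# APACHE_ROOT is an illustrative override; it defaults to the Ubuntu layout.
enable_apache_mods() {
    local root="${APACHE_ROOT:-/etc/apache2}" mod
    for mod in "$@"; do
        if [ ! -e "$root/mods-available/$mod.load" ]; then
            echo "no such module: $mod" >&2
            return 1
        fi
        # Relative link, resolved from inside mods-enabled.
        ln -sf "../mods-available/$mod.load" "$root/mods-enabled/$mod.load"
    done
}
```

Run as root, this would be called as enable_apache_mods rewrite cgi, followed by an Apache restart.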

Step 2. Set up the git repositories.

I set up all my repositories under /path/to/git, e.g. /path/to/git/project1.git/, etc. The repositories have to be set up in a specific way to enable push. Here’s a script I use:

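#!/bin/bash
# Create a bare, push-enabled repository under /path/to/git.
if [[ $# -lt 1 ]]; then
    echo "Usage: $(basename "$0") repo-name" >&2
    exit 1
fi
cd /path/to/git || exit 1
if [[ -e "${1}.git" ]]; then
    echo "${1}.git exists" >&2
    exit 1
fi
mkdir "${1}.git"
cd "${1}.git" || exit 1
git init --bare
# Allow pushes over HTTP through git-http-backend.
git config http.receivepack true
git update-server-info
touch git-daemon-export-ok
cd ..
# Apache runs as www-data and needs write access to the repository.
sudo chown -R www-data:www-data "${1}.git"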

Step 3. Set up a users list with htpasswd.

  1. Install apache2-utils
  2. htpasswd -c /path/to/git/users.txt username

Step 4. Set up a git config.

Put this in, say, git.conf under /etc/apache2/conf-available, and enable it (a2enconf git, or a symlink in conf-enabled):

SetEnv GIT_PROJECT_ROOT /path/to/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAlias /git/ /usr/lib/git-core/git-http-backend/

RewriteCond %{QUERY_STRING} service=git-upload-pack [OR]
RewriteCond %{REQUEST_URI} /git-upload-pack$
RewriteRule ^/git/ - [E=AUTHNOTREQUIRED:yes]

<LocationMatch "^/git/">
    Require env AUTHNOTREQUIRED
    AuthType Basic
    AuthName "Git access"
    AuthUserFile /path/to/git/users.txt
    Require valid-user
</LocationMatch>
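
A quick way to check the result is to ask the server for the two smart-HTTP service advertisements without credentials. This is only a sketch: the function names and the example URL are placeholders of mine, and it assumes curl is installed. If I read the config right, an anonymous fetch should return 200 and an anonymous push should be challenged with 401.

```shell
# Print only the HTTP status code for a URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical smoke test for one repository URL,
# e.g. http://localhost/git/project1.git (placeholder).
check_git_http() {
    local repo_url="$1" fetch push
    fetch=$(http_status "$repo_url/info/refs?service=git-upload-pack")
    push=$(http_status "$repo_url/info/refs?service=git-receive-pack")
    echo "anonymous fetch: $fetch, anonymous push: $push"
    # Succeeds only if reads are open and writes ask for a login.
    [ "$fetch" = 200 ] && [ "$push" = 401 ]
}
```

If the push check comes back 200, the environment variable is probably being set for every request. If the fetch check comes back 401, the rewrite rules are probably not being applied in the virtual host.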

Remarks

I spent way too much time on this because of a mismatch between the git and Apache documentation. Basically, the authorization directives in the git documentation were specific to Apache 2.2, but I was working with Apache 2.4, where access control moved to the new authorization module mod_authz_host (see the run-time configuration changes in Apache’s notes on upgrading from 2.2 to 2.4). Also, because I’m not a sysadmin, I didn’t realize that I had to add “RewriteEngine On”, so I was scratching my head about why the environment variable was never getting set.

Anyway, I hope the above saves someone some time. I promise not to write any tutorial blogs that only link to documentation without adding anything because, as you can see, reading manuals does not solve problems.

Git Repository via Apache II

I made a previous post about serving a git repository via Apache, and the method of delivery was WebDAV. Fast forward a couple of years and I find myself trying to use the same setup on our lab machine. In the end, it doesn’t work: in spite of everything, a git push will always fail due to some error locking a refs file. I ended up giving up on WebDAV and going with git-http-backend.

The setup is essentially identical to the first example here. A vanilla apache 2.4 install will have the modules (mod_cgi, mod_alias, mod_env) available, so it’s just a matter of uncommenting those in httpd.conf. I only had to change the ScriptAlias line to point git to the git-http-backend command in the local install of git. We already had a basic auth file that we use with the mod_dav_svn setup which we reused to protect access to the git repositories. Everything just worked and we didn’t need to rebuild apache.

I kind of regret spending so much time compiling Apache 2.4 thinking that the error had to do with running Apache 2.2, but then again it was a nice exercise in building a bunch of interlinked libraries on CentOS. One thing I learned is that in the typical configure-make-make install cycle, ./configure --prefix=PREFIX is really useful. I always felt terrible about sending everything to /usr/local because (1) it gets pretty messy, and (2) you might not have permission to do it. Having picked up the CMake convention of always building projects out of source, installing into a separate prefix makes sense most of the time, especially if you’re building stuff just to try it out. I ended up not needing most of the stuff I built and now I don’t have to worry about cleaning up /usr/local.

I git it: the branch + merge

It’s been almost a month since I first committed our project to version control under git, but up until now, I didn’t really know how to leverage git to support our two-person development effort. I’ll talk about the project setup that evolved from the past few weeks that’s really working out for me now.

First, a little bit about our setup: we have a server where we have our application deployed, but we also have active development going on right on that server. This is convenient and is a typical workflow for web projects: you’ll make and upload edits to see the changes on the server immediately. I have a local server set up on my laptop that I’ll work on, which saves me the hassle of the upload, and in this case turns out to be nice in that I’m not inadvertently clobbering work on the remote server.

Every once in a while, I’ll need to pull in the latest changes and push out my own work. For the longest time, I didn’t have a great way of doing this. Initially, I was just using FTP, at times to cherry-pick files and other times to batch download the whole project tree. Needless to say, this is a big waste of time and involved more point-and-click than I could stand. It also never left me confident that the two copies were actually in sync.

Sync? Great idea, we can use rsync! We do have shell access after all. The next step came when I wrote an rsync command to pull the files I care about from the remote server to my machine. Rsync is nice because it just transfers differences, so it’s nice and quick. The problem is that it clobbers old stuff, and for a while I was using a separate folder with rsync apart from my working folder and manually diff/patching. This also got annoying.

But wait a minute, I don’t have to worry about clobbering code, I could just create a branch in git and pull the changes into that branch. My own branch will be untouched and I can use git checkout to switch between branches. From there, I can merge that branch with my main working branch!

For example, the team meets at 5pm to continue work on the project. At that point, I will create a branch for myself to work on, and my partner will go about his work on the remote server. When we call it a night later on, I create another branch from the initial 5pm commit and pull in changes from the server using rsync.

At this point, git recognizes the fork, and when you issue a git merge command, git will do its best to merge changes, presenting a list of conflicted files if it can’t resolve them itself.

atsui@atsui-mmx:~/web/letswoosh$ git branch
* alex
  master
  parris
atsui@atsui-mmx:~/web/letswoosh$ git merge parris
Auto-merging app/plugins/woosh/controllers/events_controller.php
Auto-merging app/plugins/woosh/models/event.php
CONFLICT (content): Merge conflict in app/plugins/woosh/models/event.php
Auto-merging app/plugins/woosh/views/events/nearby_events.ctp
CONFLICT (content): Merge conflict in app/plugins/woosh/views/events/nearby_events.ctp
Auto-merging app/views/elements/tag_form.ctp
CONFLICT (content): Merge conflict in app/views/elements/tag_form.ctp
Automatic merge failed; fix conflicts and then commit the result.

I’ll go through the files, clean up the conflicts, add them back to git, and commit the changes, after which the two branches are successfully merged.
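
The whole flow can be replayed in a throwaway repository. This is a sketch rather than our actual project: the branch names follow the example above, but the file name, its contents, and the commit messages are made up, and the rsync step is replaced by a local edit so the script runs on its own.

```shell
# Work in a scratch repository so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# The shared starting point (the "5pm" commit).
echo "shared line" > event.php
git add event.php
git commit -q -m "5pm snapshot"
start=$(git rev-parse HEAD)

# My own work goes on my own branch.
git checkout -q -b alex
echo "change from alex" >> event.php
git commit -q -am "Local work"

# Later: branch again from the 5pm commit and bring in the server's files.
git checkout -q -b parris "$start"
# In real use this is where rsync would overwrite the working tree.
echo "change from parris" >> event.php
git commit -q -am "Server state pulled in"

# Merge the server branch into mine. Both sides touched the same spot,
# so git reports a conflict and stops.
git checkout -q alex
if ! git merge --no-edit parris; then
    # Resolve by hand (here: keep both lines), then add and commit.
    printf 'shared line\nchange from alex\nchange from parris\n' > event.php
    git add event.php
    git commit -q -m "Merge branch 'parris' into alex"
fi
```

Afterwards, git log --graph --oneline shows the two branches joining at the merge commit.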

The benefits are clear: no manual diff/patch business, the ability to check out any past commit later on, and a pretty graph to look at. After doing things a pretty hard way, I guess I learned a lesson about using git.

This was really helpful for understanding merge:
http://github.com/guides/pull-requests

Git Repository via Apache

CakePHP, one of the projects I work with, uses Git. I’ve only ever used Subversion personally, so it seems time to get with the program here by setting up a Git repository. Well, Git is a different animal, enabling a distributed workflow, and you don’t have to use it like you use Subversion, but that’s another story.

In short, we’ll create a bare repository to be hosted via HTTP/DAV on Apache.

Solution

Here are the steps I took on Ubuntu 9.10. I had Git installed and an existing Apache server up and running but didn’t have DAV configured.

Start by creating a bare repository by cloning an existing git project like so:

git clone --bare /path/to/git/project /path/to/new/repo.git

Enable access to the repository:

touch /path/to/new/repo.git/git-daemon-export-ok

Move the repository into place to be served by Apache:

mv /path/to/new/repo.git /var/www/repo.git

Also don’t forget to do this, otherwise clients can’t find the refs over plain HTTP and cloning doesn’t work:

cd /var/www/repo.git
git --bare update-server-info

If you haven’t set up the Dav module for Apache, you’ll need to if you want to push stuff, etc. Here’s how:

# enable the dav module + dependencies
a2enmod dav_fs

In /etc/apache2/sites-available/default, add a Directory container in the VirtualHost:

<Directory /var/www/repo.git>
   Dav On
   Allow from all
</Directory>

Restart the server…

/etc/init.d/apache2 restart

…and test it out.

Troubleshooting

I hadn’t set up DAV properly and spent an hour figuring out this error when I tried a git push:

error: Cannot access URL http://localhost/atsui/taspa.git/, return code 22
error: failed to push some refs to 'http://localhost/atsui/taspa.git'

It confused me because I could do git clone from the server without trouble, so I didn’t quite know what to think. Checking the log is always a wise thing to do. From Apache’s access.log:

127.0.0.1 - - [19/Jan/2010:09:43:49 -0800] "PROPFIND /atsui/taspa.git/ HTTP/1.1" 405 565 "-" "git/1.6.3.3"

The 405 HTTP code translates to “Method not allowed”. Although I had enabled the DAV module, I didn’t have DAV enabled on that directory. Follow the steps above and you shouldn’t see the problem.

Thoughts

If you notice, the directory is configured to allow git read/write access to everyone. An extension to this guide might be to set up user authentication on Apache. Or if you’re into SSH and keys, you can set that up, too. I’m not so familiar with that so I might try that out as well.

Git’s got some stuff to get used to. I only found out about bare repositories after trying to push changes to a repository that I cloned from, which is bad because it’s like shoving changes into someone else’s workspace and confusing Git in the process. I’ll have to read more about this workflow. Do people just pull stuff that they like from others? Or should it be that people use Git to create patches and give it to someone to put together?

Links

Here are the useful references I used for this write-up.

  1. http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#setting-up-a-public-repository
    Here’s where I found out how to create a Git repository to use in the Subversion repository style.
  2. http://www.kernel.org/pub/software/scm/git/docs/howto/setup-git-server-over-http.txt
    Details on how to get the Git repository onto Apache
  3. http://www.jedi.be/blog/2009/05/06/8-ways-to-share-your-git-repository/
    Lots of alternative setups and considerations for choosing specific ones.