Diary #22: A lot of little things

Well, it’s hard to blog about a lot of little things, so I’ll clump them all into one.

I sent off a friend and fellow graduate student at SFO this past week after providing shelter in my apartment for about a week. I felt bad about how I was worrying about how it was cramping my bachelor grad student routine in the back of my head while they were staying over. I ended up just spending most of my time with them towards the end of their stay since, hey, who knows when I’ll get to meet them again after they return to China.

Since publishing the vim addin as a standalone plugin, people have starred the repo on github. I feel obligated to go to work on it and also a little guilty for not spending much time on it. Maybe dedicate an hour or so daily to looking up the API and jotting down plans on the wiki so that the project actually has a freaking pulse.

Speaking of which, I’m making progress on writing my own python console Qt widget. There is an existing project on sourceforge called qconsole, but it’s GPL and I’d rather go through the process of setting up my own.

I think I want to dedicate some time regularly visiting the code review stack exchange, at least to work on my mental checklist of reviewing code when I don’t have it in compilable/debuggable format. I guess this might be more for code quality rather than correctness of implementation of algorithms, but it’s still important to get a mental checklist ironed out for when I have to look over my own code during an interview.

I managed to set myself up to buy Kindle ebooks in Japanese from Amazon Japan. Sadly, you really only have print copy as an option for certain novels, but at least the selection is much larger than what is simply on Amazon US. It is so amazing.

I’ve got the okay from my advisor to take the trip to Paris in September (I was invited to participate in a CGAL developer meeting). It’d be nice to get to talk to people who do geometry for a living, get some feedback about my project, about software engineering prospects, open source development, and life in general. I’m going to focus on doing a good job for the rest of the summer of code and maybe follow the mailing list more closely so I can make a good impression when I get there. Also, time to do some trip planning in case my parents want to tag along.

Polynomial parser in C++ using Boost.Spirit

Edit (07/07/2014): I’m sorry, the initial code compiled but didn’t work. It’s been updated now, along with the blog.

First of all, here is a link to the code. It depends on Boost.Spirit, which is a header-only part of the Boost library. It compiles, and it parses bivariate polynomials (e.g. xy^2 + 2x – 5y).

The input is a user-input string, and the output is a polynomial, which is represented internally as a vector of terms. A term has a coefficient and two exponents, one for each variable.

struct term
{
  boost::optional<int> coefficient;
  boost::optional<int> x_exponent;
  boost::optional<int> y_exponent;
};

So imagine the parser going over the string and processing each term individually, stuffing it into a vector. That’s exactly how the parser rules work.

The rules in the parser define how the user input is parsed, and we attach actions in the form of lambda functions to each rule to be executed when each part is parsed. It starts at the top with start:

 namespace phx = boost::phoenix;
 ...
 start = eps[_val = std::vector<term>()]
>> poly_term[phx::push_back(_val, qi::_1)]
>> *(
('+' >> poly_term[phx::push_back(_val, qi::_1)])
|
('-' >> negative_poly_term[phx::push_back(_val, qi::_1)])
)

First of all, rules have attributes, and the start rule’s attribute is std::vector of term‘s. The rule says that we should initialize with an empty vector, then expect to parse at least one poly_term (i.e. polynomial term).

The poly_term is another rule, whose attribute is term. eps is another rule that consumes no input and always matches.

Inside the brackets of each rule is a lambda function that does something with the attribute of the rule. For example, the rule for poly_term says to take the term, stored in the placeholder qi::_1, and stuff it into the vector that we initialized at the beginning.

A single polynomial term contains three pieces of information: the coefficient out in front, and the two exponents for each variable, and the next rule shows how to break it down.

 poly_term = eps[_val = term()]
>>
-int_[phx::bind(&term::coefficient, _val) = qi::_1]
>>
-x_term[phx::bind(&term::x_exponent, _val) = qi::_1]
>>
-y_term[phx::bind(&term::y_exponent, _val) = qi::_1]
;

First, we try to pick out the coefficient. int_ is a “primitive” Spirit rule that will pick up an int from the string being parsed. The action says that the int will be taken and assigned to the coefficient field of the term attribute of this rule.

Note that the coefficient is optional, as indicated by the dash outside of the parens. The Kleene operator in the start rule up above is another one, which matches any number of poly_terms. Here’s a complete list of the parser operators.

Somehow, it is quite beautiful to look at this code.

Japan travel #3: Sausage at Kyodai

This is picking up from my last Japan travel blog…

There are a lot of loan words in the Japanese language, and this can be convenient, endearing, and absolutely frustrating at the same time. Add in the fact that Japanese love to abbreviate things and you’re assured to never have a dull moment.

The convenient thing about loan words is that Japanese has specific notation called katakana that explicitly indicates foreign words. The マクドナルド menu in Japan is chock full of the things you’re used to, like アイスコーヒー, チーズバーガー, and ポタト, but it has some interesting things, like ドイツバーガー.

It’s really cute to pronounce the things you see in katakana to see how the Japanese decided to cast the words into their syllabary. For example, カーブ took me a little while to figure out. エネルギー makes me scratch my head because the Japanese do have the soft G phoneme but decided that the hard G at the end was more appropriate.

It gets complicated when abbreviations come into the picture. ファミレス is a common pattern where the first two syllables of each word in a compound word is smashed together to give you the word. Some are just tricky to parse if you don’t have any context and are seeing it for the first time, like スマホ.

The abbreviations extends to the Japanese words as well. For example, a common food combination is 天婦羅 and 玉子, so a noodle menu item might be called 天玉そば — notice how the first character from each item is used for shorthand.

Now, I thought I’d talk a little bit about about the actual reason that I went, and was able to go, to Japan in the first place: the theoretical computer science conference called Symposium on Computational Geometry — otherwise abbreviated as SoCG. But that’s an awkward acronym to spell out, and you can confuse it with things like GSoC, for example. Well, if you look at it long enough, I’m sure you’ll agree with a lot of the frequent attendees that “Sausage” is a much more endearing nickname for the conference. Suddenly, the wifi password of ‘sausage2014kyodai’ makes a lot of sense if you consider that the venue was the clock tower hall at Kyoto Daigaku, or Kyoto University.

You clever Japanese.

Well, over the course of the 4-day conference, I witnessed quite a few theoretical talks that went way over my head, but what I took away from those talks at a high level is that the emphasis doesn’t seem to be on any particular application but rather on solving a previously unsolved problem or solving a problem more efficiently, for example, by proving a lower asymptotic bound. Secondly, beautiful, clean, simple schematic figures are a lot better when the point is to illustrate your method in a severely constrained amount of time. I feel like I put a lot of pressure on myself to visualize real data thinking that I need to show that my work is authentic, but I am missing the point that it might be way too cluttered and distracting for my intended audience.

My seat in one of the conference rooms.

My seat in one of the conference rooms.

A view from the speaker's point of view. Also, blurry Carlos.

A view from the speaker’s point of view. Also, blurry Carlos.

Jin Akiyama, very famous Japanese mathematician, giving his invited talk.

Jin Akiyama, very famous Japanese mathematician, giving his invited talk.

As for my talk, I think it could have definitely went better. I knew that I had 15 minutes to work with ahead of time, but I did not factor in that part of that would be used for Q&A and speaker changeover, so it was pretty rushed. Anyways, I pointed people to the website I made in the end for reference, so it wasn’t so bad.

I was also lucky enough to run into some CGAL editors on the last day of the conference. From left to right are Michael, Monique, and Eric:

cgal_socg2014

The Kyoto University campus was smaller than I expected. I think it is about the size of San Jose State University. The students really struck me as acting very young, maybe because they are, and I’m starting to not be anymore, but I don’t know — this is just my impression, which is similar to how I felt when I studied abroad in Hong Kong.

A view from the northwest corner of Kyoto University.

A view from the northwest corner of Kyoto University.

Lots of bikes on campus.

Lots of bikes on campus.

Also motor bikes on the left, too.

Also motor bikes on the left, too.

Small cars are pretty common to see.

Small cars are pretty common to see.

Having Indian food with Carlos near campus. Crazy amount for less than $10. The naan is huge.

Having Indian food with Carlos near campus. Crazy amount for less than $10. The naan is huge.

All in all, it was a fun time, probably solidifying my view that the academic life is not the one for me. It felt great to meet people who are interested in what you do, though most likely that is going to be that one person whose name you’ve seen online doing similar stuff already. It was cool and humbling to witness how chummy the researchers are with each other. I’m sure I will make my mark, but most likely it will be writing things other than academic papers. I think I can be more useful writing software or translations or blogs, for example.

Thanks for the trip!

Code Reading #1: Slick callback registration in MonoDevelop

I’m starting a new category of blog posts called ”Code Reading” where I’ll talk about something interesting I saw during my week of coding, with some code snippets of course.

Just today, this caught my eye when I was trying to figure out how to hook into an event in the Vim addin that I was trying to package up. So there is a global preferences window where the user checks boxes to enable certain things, and one of them is vim input mode in the text editor. Basically, there’s a global PropertyService that exposes these selections, as well as fires an event when something is updated.

I found an example of how this event can be registered/unregistered in the SourceEditorOptions class. Registration happens in the constructor:

        DefaultSourceEditorOptions (MonoDevelop.Ide.Gui.Content.TextStylePolicy currentPolicy)
        {   
            LoadAllPrefs (); 
            UpdateStylePolicy (currentPolicy);
            PropertyService.PropertyChanged += UpdatePreferences;
            ... 

And unregistration happens in the destructor:

        public override void Dispose()
        {   
            PropertyService.PropertyChanged -= UpdatePreferences;
            FontService.RemoveCallback (UpdateFont);
        } 

So PropertyChanged shows up in PropertyService on line 270:

public static event EventHandler<PropertyChangedEventArgs> PropertyChanged;

The interesting thing to me is the event keyword, which means there’s language-level support for event handling.

It’s really slick syntax to just add or subtract your event handler to the event as the highlighted lines above show. Let’s just peek at what that event handler looks like:

        void UpdatePreferences (object sender, PropertyChangedEventArgs args)
        {   
            try {
                switch (args.Key) {
                    case "TabIsReindent":
                    this.TabIsReindent = (bool)args.NewValue;
                    break;
                    case "EnableSemanticHighlighting":
                    this.EnableSemanticHighlighting = (bool)args.NewValue;
                    break;
                    ... 

So it’s interesting that you can just pass the member function around. There’s magic that happens inferring the type of that thing, but the point is that it is a first-class object. Here’s another more explicit example elsewhere:

            properties.PropertyChanged += delegate(object sender, PropertyChangedEventArgs args) {
                if (PropertyChanged != null)
                    PropertyChanged (sender, args);
            }; 

I guess it’s a lambda function, but not quite, because they call it a delegate. But it has the signature and the body, all the same.

Well, I was really impressed when I tried to plug my own function in and got a compiler error indicating what the signature should have been. I wonder where it is specified and also where the event is generated. It’s a mystery to me, but I know that it’s a lot more heavyweight to achieve in C++.

Diary #21: Knee deep in templates, long compiles

So CGAL makes pretty use of C++ templates, and the Summer of Code project that I am working on builds a Qt4 visualization of 2D arrangements, one of these CGAL data structures for building collections of 2D curves. I decided to make the individual UI and event handling components templated by the type of arrangement and specialize as needed, and I think it ended up saving me a lot of typing. The problem is the compilation time for the full demo is pretty substantial.

arr_demo_clean_build

A clean actually took 5 minutes using a parallel build on a 4-core system. What’s hard to see is that actually one core gets stuck compiling a certain few files, for example, ArrangementDemoWindow.cpp is a major culprit. This bad boy is responsible for instantiating ArrangementDemoTab, which is templated by currently six different types of arrangements. Each of these tabs instantiate another level of at least six callback components that handle various operations on arrangements. There is a Qt GraphicsItem subclass for visualizing the actual arrangement that is also templated based on the arrangements. Finally, each of the components make use of little utility classes that are templated based on types provided by the arrangement type.

Now, C++ has no choice but to generate code for implicitly instantiated classes on the spot. so it’s probably no surprise that the memory usage spikes up to 4.1 GB when it hits a “fat” class like ArrangementDemoWindow. Sure, ArrangementDemoWindow.cpp is 1200 lines of code, but that shouldn’t cause such a massive effect. Unless you are a compiler that needs to instantiate a wide swath of template classes on demand to do type checking and such. So the silent console fails to convey the chaos that is happening behind the scenes, and the point is that there is a lot of redundant compilation happening.

So, it’s become an absolute pain to try to do any debugging with the full demo. It means that a single line change in a fat class like ArrangementDemoWindow means you have no choice but to sit through the deep compilation of all those templated classes, even if none of them changed at all. I’m fed up with it. Currently, I have to write a smaller UI example that instantiates only one type at a time so that recompilation is not so ridiculous.

But all this really makes me ask, is all this really unavoidable? Couldn’t I explicitly instantiate a certain set of template classes, precompile them once, and save them to a library for linking later on? How can I indicate this to the compiler?

Actually, if we could make use of C++11 features, we can use the extern templates feature to introduce some modularity. For example, suppose you wanted to precompile a vector of ints type. You can put this in your IntVector.h:

    #include <vector>
    extern template class std::vector<int>;
    typedef std::vector<int> IntVector;

and this in your IntVector.cpp:

    #include "IntVector.h"
    // instantiates the entire class from the default template
    template class std::vector<int>;

Then you can include the header file whenever you want to use IntVector. Wherever you use it, the compiler will not generate any code like it normally does, but you will need to compile IntVector.cpp and link its object. Now imagine if you have a lot of templated classes that are tied together with dependent parameter types and collaborate closely. Then this can save a ton of compilation time.

Clang will do it, and I guess g++ should do it if you set it to c++11 mode. But as always, I need to support the older compilers, so that’s right out.

Diary #20: The final year is coming

I had my last face-to-face meeting with my advisor last Friday, who will be relocating to his new post in Louisiana state. In our meeting, we hashed out some rough plans about how I will be finishing out my graduate studies and moving on. The month of April of next year got labelled with the big red X to signify the point in time where I should have my research finished and should dedicate fully to writing my thesis and ultimately graduating.

The end is in sight.

But getting a Ph.D. isn’t the final step: as one of my committee members indicated, it is the beginning. Unfortunately, I don’t have the intent to set out on the track to becoming a professor like he is. However, I intend to put the skills that I’ve picked up to good use as a research/software engineer.

Much like at the end of high school and college, I’m being told to do some soul-searching and to talk to people about what it’s like. I think, though, that I have largely made up my mind, and I really just want to focus on successful execution of the things that have to happen with the rest of my time here so that I can move on.

I don’t really regret getting into graduate school. I knew it was something that I wanted to do by taking up a double major as an undergraduate, and I followed through with it. As a graduate student, you end up doing a little bit of everything out of necessity, and I’m glad to have applied myself and accomplished what I have up to this point. Having done all of this, I feel like I am clear now on what I like to do, what my skills are, and how I will use/develop them to contribute to society and fulfill myself from here on out.

I am really looking forward to getting out, but there’s still a lot to be done, both to get out, and to get to the next step.

For one, I’m going to need to get started on interview preparation. As ashamed as I am to admit it, I have never gone through the process of interview to employment with any company. I’ve spent most of my life working through school. The closest thing I’ve had to an internship was this thing with Summer of Code, but it’s mainly facilitated online. So it’s not a matter of brushing up on something I haven’t done in a while, but this is going to be a brand new experience for me. I’ve ordered the Cracking the Coding Interview book that was recommended by the CS club and will use it to review algorithms as well as learn how to conduct myself.

Another thing is I need to do some additional research. At this point, I officially feel like I’m stuck in a rut. There’s certainly a roadmap and a direction to go, but (1) code is pretty unreliable, if not broken, and (2) I need to another way to compare results with other methods. Also (3) I need to figure out the details of the next and final method I will be working on.

I think I need to be careful and pace myself. This is the last summer that I will have, and it is going by really fast. There’s a lot that I want to do and also things that I have to do.

As long as I stay focused, I can do this.

Japan travel 2

Last time, I talked a little bit about Japan travel, and this time, I’ll talk about my experience using some of the internet cafes while I was over there.

The hostel that we stayed at provided complementary wifi access, but as with most complementary wifi, it wasn’t reliable. We also rented a Softbank wifi hotspot at the airport, but it was not very reliable, either. I needed to get some work done, so I figured I would go for an internet cafe.

A quick net search revealed one in the area we were staying called Comic Buster. As a general note, I think these stores usually serve Japanese residents, because you need to register in order to use the computers. The first shop clerk asked me if I lived around here and ended up taking my hotel address and number. I think that she thought it was a one-time thing, so she didn’t take my $3 membership fee, but I had to come back later because I didn’t finish my work, and I had to go through that with a different shop clerk…

Now that I’m actually reading the page that I linked, doesn’t it say that I get free registration if I mention that I saw the note on the home page? Damn, I lost 300 yen…

missed_out_free_registration

Well, I was set up with a private booth, which is the standard setup. The walls are probably as tall as a typical fence, and there’s a clear door that you slide closed and can cast a blanket over to secure your privacy. You get a Windows 7 PC. The JP keyboard layout is pretty funky, I had trouble finding certain special symbols for some reason:

IMG_0274

I don’t think I was supposed to, but I ended up hijacking the ethernet cable from the desktop for my own laptop. It was a lot harder in the booth that I got during the evening because of how little space there was in the booth and how the chair was completely not adjustable.

When we went to Tokyo, we also needed another place to hide out from the rain and have internet until we could check in with the person we were staying at in the evening, so asking around, we found a Gera Gera tucked away on the 3rd floor of a building. In hindsight, It’s pretty easy to find if I knew to look up and pay attention to the signs. We had to go through the same registration process, and it was kind of awkward to agree to all the things the shop clerk was telling me in spite of not following most of it — well, we got service, so I guess that it was a success, and it was kind of interesting. I felt sorry for the guests who had to wait until we were done with, though.

There seems to be a lot of different internet cafe chains throughout Japan. While transferring stations, I was handed a pack of tissues, complements of the nearby Space Create. I checked out their website and was surprised to find that they actually put some effort into catering to their female guests. I gotta say, those look like pretty nice booths, and I wonder if they actually see some patronage.

In conclusion, the net cafes are not what I imagined them to be. They really emphasize privacy, and there’s really not a lot of open space. There were “open” booths rather than “private” booths in the Comic Buster that I saw, which amounted to a lounge chair facing out of the window, but there were only a few of those compared to the dozens of booths there were. I guess it’s no big surprise, given the lack of space in general, and you probably don’t need or want an open space for the stuff you’d come to a net cafe to do.

I’m a little scared to think that it really seems like a great place to hang out in Japan: there’s great internet, there’s shelves and shelves of manga, there’s unlimited drinks from the fountain, and you can even order cheap food. I think I saw a shower room in the back as well. It really seems like an ideal setup to sink all of your time there if you wanted to. I feel like the novelty of traveling in Japan was the only thing that made me think of it as a waste of time to hang out there for too long, but it that kind of place really appeals to me…