Polynomial parser in C++ using Boost.Spirit

Edit (07/07/2014): I’m sorry, the initial code compiled but didn’t work. It’s been updated now, along with the blog.

First of all, here is a link to the code. It depends on Boost.Spirit, which is a header-only part of the Boost library. It compiles, and it parses bivariate polynomials (e.g. xy^2 + 2x – 5y).

The input is a user-input string, and the output is a polynomial, which is represented internally as a vector of terms. A term has a coefficient and two exponents, one for each variable.

struct term
{
  boost::optional<int> coefficient;
  boost::optional<int> x_exponent;
  boost::optional<int> y_exponent;
};

So imagine the parser going over the string and processing each term individually, stuffing it into a vector. That’s exactly how the parser rules work.

The rules in the parser define how the user input is parsed, and we attach actions in the form of lambda functions to each rule to be executed when each part is parsed. It starts at the top with start:

 namespace phx = boost::phoenix;
 ...
 start = eps[_val = std::vector<term>()]
>> poly_term[phx::push_back(_val, qi::_1)]
>> *(
('+' >> poly_term[phx::push_back(_val, qi::_1)])
|
('-' >> negative_poly_term[phx::push_back(_val, qi::_1)])
)

First of all, rules have attributes, and the start rule’s attribute is std::vector of term‘s. The rule says that we should initialize with an empty vector, then expect to parse at least one poly_term (i.e. polynomial term).

The poly_term is another rule, whose attribute is term. eps is another rule that consumes no input and always matches.

Inside the brackets of each rule is a lambda function that does something with the attribute of the rule. For example, the rule for poly_term says to take the term, stored in the placeholder qi::_1, and stuff it into the vector that we initialized at the beginning.

A single polynomial term contains three pieces of information: the coefficient out in front, and the two exponents for each variable, and the next rule shows how to break it down.

 poly_term = eps[_val = term()]
>>
-int_[phx::bind(&term::coefficient, _val) = qi::_1]
>>
-x_term[phx::bind(&term::x_exponent, _val) = qi::_1]
>>
-y_term[phx::bind(&term::y_exponent, _val) = qi::_1]
;

First, we try to pick out the coefficient. int_ is a “primitive” Spirit rule that will pick up an int from the string being parsed. The action says that the int will be taken and assigned to the coefficient field of the term attribute of this rule.

Note that the coefficient is optional, as indicated by the dash outside of the parens. The Kleene operator in the start rule up above is another one, which matches any number of poly_terms. Here’s a complete list of the parser operators.

Somehow, it is quite beautiful to look at this code.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s