While writing my HLSL2GLSL parser, I did some research into which C# parser framework to use. I stumbled upon Irony, and now Scott Hanselman has written a blog post about it, so I thought I would share. From the original article on Scott Hanselman’s blog:
One of the best, if not the best, ways to sharpen the saw and keep your software development skills up to date is by reading code. Sure, write lots of code, but don’t forget to explore other people’s code. There are always fifteen different ways to create a “textboxes over data” application, and while it’s interesting to take a look at whatever the newest way to make business software is, sometimes it’s nice to relax by looking at some implementations of classic software issues like parsers, lexers, and abstract syntax trees. If you didn’t go to school or failed to take a compilers class, at least knowing that this area of software engineering exists and is accessible to you is very important.
It’s so nice to discover open source projects that I didn’t know existed. One such project I just stumbled upon while doing research for a customer is “Irony,” a .NET language implementation kit. From the CodePlex site:
Irony is a development kit for implementing languages on .NET platform. Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs. Irony’s scanner and parser modules use the grammar encoded as c# class to control the parsing process. See the expression grammar sample for an example of grammar definition in c# class, and using it in a working parser.
Irony includes “simplified grammars for C#, Scheme, SQL, GwBasic, JSON and others” to learn from. There are different kinds of parser generators you might be familiar with. For example, ANTLR is what’s called an LL(*) parser generator, while Irony builds an LALR (Look-Ahead LR) parser, where LR means the input is scanned left to right and a rightmost derivation is produced in reverse.
I started out using Irony for the initial groundwork of my HLSL2GLSL translator. However, a few syntactic ambiguities were easier to resolve with an LL(*) generator, which is why I switched over to ANTLR. What I really liked about Irony was that the grammar is written in code (instead of putting code in the grammar), and the included grammar debugging tool was a valuable asset.
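To give an idea of what "the grammar is written in code" looks like, here is a minimal sketch of an Irony grammar for sums and products of numbers. The class name SumGrammar and the rule names are made up for this example, and it assumes the Irony.Parsing namespace as shipped in the Irony NuGet package; older releases may expose a slightly different API.

```csharp
using System;
using Irony.Parsing;

// Hypothetical example grammar: numbers combined with + and *, plus parentheses.
public class SumGrammar : Grammar
{
    public SumGrammar()
    {
        // Terminal: a numeric literal
        var number = new NumberLiteral("number");

        // Non-terminals
        var expr = new NonTerminal("expr");
        var binExpr = new NonTerminal("binExpr");
        var parExpr = new NonTerminal("parExpr");

        // Rules use overloaded operators: '|' is alternation, '+' is sequence
        expr.Rule = number | binExpr | parExpr;
        binExpr.Rule = expr + "+" + expr | expr + "*" + expr;
        parExpr.Rule = "(" + expr + ")";

        // Precedence hints so that '*' binds tighter than '+'
        RegisterOperators(1, "+");
        RegisterOperators(2, "*");

        Root = expr;
    }
}

public static class SumGrammarDemo
{
    // Returns true when the input parses without syntax errors.
    public static bool IsValid(string input)
    {
        var parser = new Parser(new SumGrammar());
        ParseTree tree = parser.Parse(input);
        return !tree.HasErrors();
    }
}
```

Calling SumGrammarDemo.IsValid("1 + 2 * (3 + 4)") runs the parser over the input. There is no separate grammar file and no code generation step: the Grammar subclass is the grammar.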
While reading this article, I also stumbled upon Sprache, a really simple parser framework in which parsers are written as LINQ expressions. It is not quite as powerful as a full-blown tool, but it can do the trick for simple languages and parsing tasks. I have to say I just love the LINQ approach to parsers.
A simple parser might parse a sequence of characters:
// Parse any number of capital 'A's in a row
var parseA = Parse.Char('A').AtLeastOnce();
Sprache provides a number of built-in functions that can make bigger parsers from smaller ones, often callable via LINQ query comprehensions:
// An identifier is a letter followed by letters or digits, with optional surrounding whitespace
Parser<string> identifier =
    from leading in Parse.Whitespace.Many()
    from first in Parse.Letter.Once()
    from rest in Parse.LetterOrDigit.Many()
    from trailing in Parse.Whitespace.Many()
    select new string(first.Concat(rest).ToArray());
var id = identifier.Parse(" abc123 ");
Assert.AreEqual("abc123", id);
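To show how these small parsers compose, here is a sketch that combines an identifier parser and a number parser into a parser for a "name = value" setting. The SettingParser class and its members are names invented for this example; the calls used (Parse.Letter, Parse.LetterOrDigit, Parse.Number, Parse.Char, Token and the Parse extension method) come from the Sprache library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Sprache;

// Hypothetical example: parses settings such as "width = 42".
public static class SettingParser
{
    // A letter followed by letters or digits; Token() skips surrounding whitespace
    static readonly Parser<string> Identifier =
        (from first in Parse.Letter.Once()
         from rest in Parse.LetterOrDigit.Many()
         select new string(first.Concat(rest).ToArray())).Token();

    // One or more digits converted to an int
    static readonly Parser<int> Integer =
        (from digits in Parse.Number
         select int.Parse(digits)).Token();

    // Bigger parser built from the two smaller ones
    public static readonly Parser<KeyValuePair<string, int>> Setting =
        from key in Identifier
        from eq in Parse.Char('=').Token()
        from value in Integer
        select new KeyValuePair<string, int>(key, value);
}
```

With this in place, SettingParser.Setting.Parse(" width = 42 ") yields a pair whose Key is "width" and whose Value is 42. Input that does not match, such as "width 42", makes Parse throw a ParseException.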
Enjoy!