[svlug] Some pretty serious parsing

Steve Litt slitt at troubleshooters.com
Sat Nov 14 09:31:16 PST 2015


On Sat, 14 Nov 2015 15:41:18 +0000
Karen Shaeffer <shaeffer at neuralscape.com> wrote:

 
> Hi Steve,
> Parsers are not difficult to write, but you do need to be formal
> about it to assure correctness for the entire input space of token
> sequences. The first thing to do is to characterize your language.
> I'd read up about the Chomsky Hierarchy, which is a containment
> hierarchy of classes of formal languages.
> 
> type-3 grammars are regular grammars that can be parsed with regular
> expressions in the form of a finite state automaton. If the Stylz
> language is a type-3 grammar, then this is the easiest to parse.
> 
> type-2 grammars are context-free grammars. The Stylz language might
> be a context free grammar. The question is about whether a token
> appearing in a sequence of tokens (statement) affects the syntactical
> rules of later tokens within that statement. If not, then your
> language is context-free and rather easy to parse. Just do a google
> search for context free parsing.
> 
> type-1 grammars are context-sensitive grammars. Such languages can be
> processed with linear bounded automatons. Much more complicated than
> context-free grammars, depending on the extent and number of
> contextual idoms possible.

Thanks Karen!

I knew nothing about anything you wrote, so of course without your
suggestion I'd have never gone down that road. I'll first learn about
the Chomsky Hierarchy, probably relying heavily on making myself a
terminology glossary, because all this stuff is brand new to me: I was
an office automation programmer (applications). These are the times I
wish I'd studied Comp-Sci in college.

But anyway, now at least I know the right direction to follow, and what
I need to learn, and after that, which way to influence Stylz (toward
type3 or type2). And given that the project demands I prioritize fast
and easy authoring over parsability, once I know what's type3 and
type2, I could always write an informal preprocessor that turns the
ideal authoring language into a type3 or type2, and when people crash
the corner cases of my informal preparser, I'll just say "sorry, bug".
In the long run, if a lot of people use Stylz, Version 2 will be
written by someone much more knowledgeable than I anyway.

Thanks for pointing me in the right direction!

SteveT

Steve Litt 
November 2015 featured book: Troubleshooting Techniques
     of the Successful Technologist
http://www.troubleshooters.com/techniques



More information about the svlug mailing list