On Wed, 19 Feb 2020 12:27:31 +0100, Bruno Le Floch wrote:

>On 2/19/20 10:25 AM, Jonathan Spratte wrote:
>> Hi,
>>
>>> - Defining an environment that sets up active characters to emulate
>>> traditional BNF syntax is very concise, but comes at the cost of
>>> delimited arguments and catcode madness.
>>
>> You could as well use a letter-by-letter parser that doesn't need altered
>> category codes. Take a look at the pgf module `parser`. That's not `expl3`
>> but could give you an idea on what's possible.
>>
>> Best,
>> Jonathan
>>
>
>I've been *toying* for a long time (haven't gone very far) with the idea
>of writing a parser generator, but I couldn't decide how powerful to
>make it. One option would be to support "parsing expression grammars"
>(PEGs), which can be parsed in linear time using a packrat parser (but
>use a lot of memory, possibly problematic). Another option would be to
>stick with more traditional things like LL or LR parser. To be honest,
>I don't know enough about parsers and what useful languages they cover
>to decide. Thoughts welcome.
>
>Best,
>Bruno

Given that the machinery already exists in the regex module, I would
suggest making a lexer generator and then complementing it with a parser
generator. Some quick reading suggests that LALR parsers (the kind that
Yacc generates) would be a good balance of expressiveness and efficiency.
Because this would be used for document syntax, the parsers do not need
to be very powerful: user-facing syntax should be relatively simple and
certainly should be unambiguous.

Warmly,
Kelly
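
To make the lexer-generator suggestion a little more concrete, here is a
minimal sketch of the idea. It is written in Python with the standard `re`
module rather than in expl3, so it says nothing about how l3regex would be
driven. The token names, the tiny BNF-like token set, and the helper
`make_lexer` are all assumptions made up for illustration.

```python
import re

# Hypothetical token specification for a tiny BNF-like syntax.
# Each entry is (token name, regular expression); order sets priority.
TOKEN_SPEC = [
    ("NAME",    r"<[A-Za-z][A-Za-z0-9-]*>"),  # nonterminal such as <digit>
    ("DEFINE",  r"::="),                      # definition sign
    ("BAR",     r"\|"),                       # alternation
    ("LITERAL", r'"[^"]*"'),                  # quoted terminal
    ("SKIP",    r"\s+"),                      # whitespace, dropped
]


def make_lexer(spec):
    """Build a lexer function from a list of (name, regex) pairs."""
    # Combine all token patterns into one alternation of named groups.
    master = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in spec)
    )

    def lex(text):
        pos = 0
        tokens = []
        while pos < len(text):
            match = master.match(text, pos)
            if match is None:
                raise ValueError(
                    f"unexpected character {text[pos]!r} at position {pos}"
                )
            # lastgroup is the name of the alternative that matched.
            if match.lastgroup != "SKIP":
                tokens.append((match.lastgroup, match.group()))
            pos = match.end()
        return tokens

    return lex


lex = make_lexer(TOKEN_SPEC)
tokens = lex('<digit> ::= "0" | "1"')
```

The point of the split is that the parser generator would then only ever
see a flat stream of (name, text) pairs, so neither stage needs altered
category codes or delimited arguments.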