Comparison of parser generators

This is a list of notable lexer generators and parser generators for various language classes.

Regular languages

Regular languages are a category of languages (sometimes known as Chomsky Type 3) that can be matched by a state machine (more specifically, by a deterministic finite automaton) or, equivalently, by a regular expression. A regular grammar can describe constructs like "A follows B", "either A or B", and "A, followed by zero or more instances of B", but it cannot describe constructs that require consistency between non-adjacent elements, such as "some instances of A followed by the same number of instances of B", nor can it express recursive nesting ("every A is eventually followed by a matching B"). A classic example of a problem that a regular grammar cannot handle is deciding whether a given string contains correctly nested parentheses. (This is typically handled by a Chomsky Type 2 grammar, also known as a context-free grammar.)
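The distinction can be illustrated with a minimal Python sketch. It assumes the standard `re` module for the regular pattern; the function names `matches_a_then_bs` and `parens_balanced` are hypothetical and chosen only for this illustration. The parenthesis check needs a counter that can grow without bound, which is precisely what a finite automaton lacks.

```python
import re

# "A, followed by zero or more instances of B" is a regular pattern.
A_THEN_BS = re.compile(r"AB*")


def matches_a_then_bs(text):
    """Return True if the whole string is an A followed by zero or more Bs."""
    return A_THEN_BS.fullmatch(text) is not None


def parens_balanced(text):
    """Check nesting with an unbounded counter (not expressible as a DFA)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opening partner.
                return False
    return depth == 0
```

For example, `matches_a_then_bs("ABBB")` succeeds with a fixed pattern, while `parens_balanced("(()())")` has to track how many parentheses are currently open.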

Name | Lexer algorithm | Output languages | Grammar, code | Development platform | License
Alex | DFA | Haskell | mixed | all | BSD
AnnoFlex | DFA | Java | mixed | Java Virtual Machine | BSD
AustenX | DFA | Java | separate | all | BSD
C# Flex | DFA | C# | mixed | .NET CLR | GNU GPL
C# Lex | DFA | C# | mixed | .NET CLR | ?
CookCC | DFA | Java | mixed | Java Virtual Machine | Apache License 2.0
DFAlex | DFA | no code generation required | Java | Java | Apache License 2.0
Dolphin | DFA | C++ | separate | all | Proprietary
flex | DFA table driven | C, C++ | mixed | all | BSD
gelex | DFA | Eiffel | mixed | Eiffel | MIT
golex | DFA | Go | mixed | Go | BSD-style
gplex | DFA | C# | mixed | .NET CLR | BSD-like
JFlex | DFA | Java | mixed | Java Virtual Machine | BSD
JLex | DFA | Java | mixed | Java Virtual Machine | BSD-like
lex | DFA | C | mixed | POSIX | Proprietary, CDDL
lexertl | DFA | C++ | | all | GNU LGPL
LRSTAR | DFA | C++ | separate | Windows | Proprietary
Quex | DFA direct code | C, C++ | mixed | all | GNU LGPL
Ragel | DFA | C, C++, Assembly, Objective C, D, Go, Ruby, Java | mixed | all | GNU GPL, MIT[1]
re2c | DFA direct code | C | mixed | all | Public domain

Deterministic context-free languages

Context-free languages are a category of languages (sometimes known as Chomsky Type 2) that can be matched by a set of replacement rules, each of which maps a single nonterminal element to a sequence of terminal elements and/or other nonterminal elements. Grammars of this type can match anything that a regular grammar can match and can also handle recursive nesting ("every A is eventually followed by a matching B"), such as deciding whether a given string contains correctly nested parentheses. The rules of context-free grammars are purely local, however, and therefore cannot answer questions that require non-local analysis, such as "Does a declaration exist for every variable that is used in a function?". Answering such questions within the grammar would technically require a more sophisticated formalism, such as a Chomsky Type 1 grammar, also known as a context-sensitive grammar. However, parser generators for context-free grammars often allow user-written code to introduce limited amounts of context sensitivity. (For instance, upon encountering a variable declaration, user-written code could save the name and type of the variable in an external data structure, so that later variable references detected by the parser can be checked against it.)
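Both ideas can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the mechanism of any particular parser generator: `parse_nested` is a hypothetical hand-written recursive-descent recognizer for the grammar S -> "(" S ")" S | empty, and `undeclared_uses` is a hypothetical stand-in for user-written code that receives ("declare", name) and ("use", name) events in source order and checks them against an external symbol table.

```python
from typing import List, Tuple


def parse_nested(text: str) -> bool:
    """Recognize  S -> "(" S ")" S | empty  by recursive descent."""

    def s(pos: int) -> int:
        # Try the production "(" S ")" S; otherwise use the empty production.
        if pos < len(text) and text[pos] == "(":
            inner_end = s(pos + 1)
            if inner_end < len(text) and text[inner_end] == ")":
                return s(inner_end + 1)
            raise SyntaxError("expected ')' at position %d" % inner_end)
        return pos

    try:
        # The input is accepted only if S consumes the whole string.
        return s(0) == len(text)
    except SyntaxError:
        return False


def undeclared_uses(events: List[Tuple[str, str]]) -> List[str]:
    """Report names that are used before any declaration was recorded."""
    declared = set()   # the external data structure kept by user code
    missing = []
    for kind, name in events:
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            missing.append(name)
    return missing
```

The recognizer itself stays context-free; the declaration check lives entirely in the side table, which is how limited context sensitivity is typically layered on top of a context-free parser.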

The deterministic context-free languages are a proper subset of the context-free languages: they are exactly the languages that can be recognized by deterministic pushdown automata, and they can therefore be parsed efficiently.
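As a rough illustration of a deterministic pushdown automaton, the following Python sketch recognizes the language of n copies of "a" followed by n copies of "b" (n >= 1) in a single left-to-right pass. The function name `accepts_anbn` and the state labels are assumptions made for this example; each step is fully determined by the current state, the next input symbol, and whether the stack is empty.

```python
def accepts_anbn(text):
    """Simulate a deterministic pushdown automaton for a^n b^n, n >= 1."""
    stack = []
    state = "push"  # "push" while reading a's, "pop" once the first b is seen
    for ch in text:
        if state == "push" and ch == "a":
            stack.append("A")      # remember one pending 'a'
        elif ch == "b" and stack:
            state = "pop"
            stack.pop()            # match this 'b' against a pending 'a'
        else:
            return False           # no transition is defined: reject
    # Accept only if at least one b was read and every a was matched.
    return state == "pop" and not stack
```

Because every input symbol triggers at most one stack operation, the running time is linear in the length of the input.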

Name | Parsing algorithm | Input grammar notation | Output languages | Grammar, code | Lexer | Development platform | IDE | License
ANTLR4 | ALL(*)[2] | EBNF | C#, Java, Python, JavaScript, C++, Swift, Go | mixed | generated | Java Virtual Machine | Yes | BSD
ANTLR3 | LL(*) | EBNF | ActionScript, Ada95, C, C++, C#, Java, JavaScript, Objective-C, Perl, Python, Ruby | mixed | generated | Java Virtual Machine | Yes | BSD
APG | Recursive descent, Backtracking | ABNF | C, C++, JavaScript, Java | separate | none | all | No | GNU GPL
AXE | Recursive descent | AXE/C++ | C++11 | mixed | none | any platform with standard C++11 compiler | No | Boost
Beaver | LALR(1) | EBNF | Java | mixed | external | Java Virtual Machine | No | BSD
Bison | LALR(1), LR(1), IELR(1), GLR | ? | C, C++, Java | mixed | external | all | No | GNU GPL
Bison++[note 1] | LALR(1) | ? | C++ | mixed | external | POSIX | No | GNU GPL
Bisonc++ | LALR(1) | ? | C++ | mixed | external | POSIX | No | GNU GPL
BtYacc | Backtracking Bottom-up | ? | C++ | mixed | external | all | No | Public domain
byacc | LALR(1) | YACC | C | mixed | external | all | No | Public domain
BYACC/J | LALR(1) | YACC | C, Java | mixed | external | all | No | Public domain
CL-Yacc | LALR(1) | Lisp | Common Lisp | mixed | external | all | No | MIT
Coco/R | LL(1) | EBNF | C, C++, C#, F#, Java, Ada, Object Pascal, Delphi, Modula-2, Oberon, Ruby, Swift, Unicon, Visual Basic .NET | mixed | generated | Java Virtual Machine, .NET Framework, Microsoft Windows, POSIX (depends on output language) | No | GNU GPL
CookCC | LALR(1) | Java annotations | Java | mixed | generated | Java Virtual Machine | No | Apache License 2.0
CppCC | LL(k) | ? | C++ | mixed | generated | POSIX | No | GNU GPL
CSP | LR(1) | ? | C++ | separate | generated | POSIX | No | Apache License 2.0
CUP | LALR(1) | ? | Java | mixed | external | Java Virtual Machine | No | BSD-like
Dragon | LR(1), LALR(1) | ? | C++, Java | separate | generated | all | No | GNU GPL
eli | LALR(1) | ? | C | mixed | generated | POSIX | No | GNU GPL, GNU LGPL
Epsilon Grammar Studio | LL(*) | ABNF | C++ | separate | internal | Microsoft Windows | Yes | Proprietary
Essence | LR(???) | ? | Scheme 48 | mixed | external | all | No | BSD
Eto.Parse | LL(k) | BNF, EBNF or C# | N/A (state machine is runtime generated) | separate | internal | .NET Framework | No | MIT
eyapp | LALR(1) | ? | Perl | mixed | external or generated | all | No | Perl
Frown | LALR(k) | ? | Haskell 98 | mixed | external | all | No | GNU GPL
geyacc | LALR(1) | ? | Eiffel | mixed | external | all | No | MIT
GOLD | LALR(1) | BNF | x86 assembly language, ANSI C, C#, D, Java, Pascal, Object Pascal, Python, Visual Basic 6, Visual Basic .NET, Visual C++ | separate | generated | Microsoft Windows | Yes | Modified Zlib
GPPG | LALR(1) | YACC | C# | separate | external | Microsoft Windows | Yes | BSD
Grammatica | LL(k) | BNF dialect | C#, Java | separate | generated | Java Virtual Machine | No | BSD
HiLexed | LL(*) | EBNF or Java | Java | separate | internal | Java Virtual Machine | No | GNU LGPL
Hime Parser Generator | LR(1), LALR(1), LR(0) | BNF dialect | C#, Java, Rust | separate | generated | .NET Framework, Java Virtual Machine | No | GNU LGPL
Hyacc | LR(1), LALR(1), LR(0) | YACC | C | mixed | external | all | No | GNU GPL
Irony | LALR(1) | C# | N/A (state machine is runtime generated) | separate | internal | .NET Framework | Yes | MIT
iyacc | LALR(1) | YACC | Icon | mixed | external | all | No | GNU GPL
jacc | LALR(1) | ? | Java | mixed | external | Java Virtual Machine | No | BSD
JavaCC | LL(k) | EBNF | Java, C++, JavaScript (via GWT compiler)[3] | mixed | generated | Java Virtual Machine | Yes | BSD
jay | LALR(1) | YACC | C#, Java | mixed | none | Java Virtual Machine | No | BSD
JFLAP | LL(1), LALR(1) | ? | Java | ? | ? | Java Virtual Machine | Yes | ?
JetPAG | LL(k) | ? | C++ | mixed | generated | all | No | GNU GPL
JS/CC | LALR(1) | EBNF | JavaScript, JScript, ECMAScript | mixed | internal | all | Yes | BSD
KDevelop-PG-Qt | LL(1), Backtracking, Shunting yard | ? | C++ | mixed | generated or external | all, KDE | No | GNU LGPL
Kelbt | Backtracking LALR(1) | ? | C++ | mixed | generated | POSIX | No | GNU GPL
kmyacc | LALR(1) | ? | C, Java, Perl, JavaScript | mixed | external | all | No | GNU GPL
Lapg | LALR(1) | ? | C, C++, C#, Java, JavaScript | mixed | generated | Java Virtual Machine | No | GNU GPL
Lemon | LALR(1) | ? | C | mixed | external | all | No | Public domain
LEPL | Recursive descent | Python | Python (no generation, library) | separate | none | all | No | MPL/GNU LGPL
Lime | LALR(1) | ? | PHP | mixed | external | all | No | GNU GPL
LISA | LR(?), LL(?), LALR(?), SLR(?) | ? | Java | mixed | generated | Java Virtual Machine | Yes | Public domain
LLgen | LL(1) | ? | C | mixed | external | POSIX | No | BSD
LLnextgen | LL(1) | ? | C | mixed | external | all | No | GNU GPL
LLLPG | LL(k) + syntactic and semantic predicates | ANTLR-like | C# | mixed | generated (?) | .NET Framework, Mono | Visual Studio | GNU LGPL
LPG | Backtracking LALR(k) | ? | Java | mixed | generated | Java Virtual Machine | No | EPL
LRSTAR | LALR(1), LR(1), LR(*) | EBNF, TBNF or Yacc-like | C++ | separate | generated | Windows | Visual Studio | Proprietary
Menhir | LR(1) | ? | OCaml | mixed | generated | all | No | QPL
ML-Yacc | LALR(1) | ? | ML | mixed | external | all | No | ?
Monkey | LR(1) | ? | Java | separate | generated | Java Virtual Machine | No | GNU GPL
Msta | LALR(k), LR(k) | YACC, EBNF | C, C++ | mixed | external or generated | POSIX, Cygwin | No | GNU GPL
MTP (More Than Parsing) | LL(1) | ? | Java | separate | generated | Java Virtual Machine | No | GNU GPL
MyParser | LL(*) | Markdown | C++11 | separate | internal | any platform with standard C++11 compiler | No | MIT License
NLT | GLR | C#/BNF-like | C# | mixed | mixed | .NET Framework | No | MIT
ocamlyacc | LALR(1) | ? | OCaml | mixed | external | all | No | QPL
olex | LL(1) | ? | C++ | mixed | generated | all | No | GNU GPL
parglare | Scannerless LALR(1)/SLR(1)/GLR | BNF-like, Python | N/A (state machine is runtime generated) | mixed | none | all | No | MIT
Parsec | LL, Backtracking | Haskell | Haskell | mixed | none | all | No | BSD
Parse::Yapp | LALR(1) | ? | Perl | mixed | external | all | No | GNU GPL
Parser Objects | LL(k) | ? | Java | mixed | ? | Java Virtual Machine | No | zlib
PCCTS | LL | ? | C, C++ | ? | ? | all | No | ?
PLY | LALR(1) | BNF | Python | mixed | generated | all | No | MIT License
PlyPlus | LALR(1) | EBNF | Python | separate | generated | all | No | MIT License
PRECC | LL(k) | ? | C | separate | generated | DOS, POSIX | No | GNU GPL
QLALR | LALR(1) | ? | C++ | mixed | external | all | No | GNU GPL
RPATK | Recursive descent, Backtracking | BNF | C (no generation, library) | separate | none | all | No | GNU GPL
SableCC | LALR(1) | ? | C, C++, C#, Java, OCaml, Python | separate | generated | all | No | GNU LGPL
SLK | LL(k), LR(k), LALR(k) | EBNF | C, C++, C#, Java, JavaScript | separate | external | all | No | MIT-like
SP (Simple Parser) | Recursive descent | Python | Python | separate | generated | all | No | GNU LGPL
Spirit | Recursive descent | ? | C++ | mixed | internal | all | No | Boost
Sprache | LL, Backtracking | C# | interpreted | mixed | internal | .NET Framework | No | MIT
Styx | LALR(1) | ? | C, C++ | separate | generated | all | No | GNU LGPL
Sweet Parser | LALR(1) | ? | C++ | separate | generated | Microsoft Windows | No | zlib
Tap | LL(1) | ? | C++ | mixed | generated | all | No | GNU GPL
TextTransformer | LL(k) | ? | C++ | mixed | generated | Microsoft Windows | Yes | Proprietary
TinyPG | LL(1) | ? | C#, Visual Basic | ? | ? | Microsoft Windows | Yes | CPOL 1.0
Toy Parser Generator | Recursive descent | ? | Python | mixed | generated | all | No | GNU LGPL
TP Yacc | LALR(1) | ? | Turbo Pascal | mixed | external | all | Yes | GNU GPL
UltraGram | LALR(1), LR(1), GLR | BNF | C++, Java, C#, Visual Basic .NET | separate | external | Microsoft Windows | Yes | Public domain
UniCC | LALR(1) | EBNF | C, C++, Python, JavaScript, JSON, XML | mixed | generated | POSIX | No | BSD
UrchinCC | LL(1) | ? | Java | ? | generated | Java Virtual Machine | No | ?
Whale | LR(?), some conjunctive stuff, see Whale Calf | ? | C++ | mixed | external | all | No | Proprietary
wisent | LALR(1) | ? | C++, Java | mixed | external | all | No | GNU GPL
Yacc AT&T/Sun | LALR(1) | YACC | C | mixed | external | POSIX | No | CPL & CDDL
Yacc++ | LR(1), LALR(1) | YACC | C++, C# | mixed | generated or external | all | No | Proprietary
Yapps | LL(1) | ? | Python | mixed | generated | all | No | MIT
yecc | LALR(1) | ? | Erlang | separate | generated | all | No | Erlang
Visual BNF | LR(1), LALR(1) | ? | C# | separate | generated | .NET Framework | Yes | Proprietary
YooParse | LR(1), LALR(1) | ? | C++ | mixed | external | all | No | MIT
Parse | LR(1) | BNF in C++ types | ? | ? | none | C++11 compliant compiler | No | MIT
GGLL | LL(1) | Graph | Java | mixed | generated | Windows | Yes | MIT

Parsing expression grammars, deterministic boolean grammars

Name | Parsing algorithm | Output languages | Grammar, code | Development platform | License
Arpeggio | PEG parser interpreter, Packrat | Python (no generation, interpreted) | mixed | all | MIT
AustenX | Packrat (modified) | Java | separate | all | BSD
Aurochs | Packrat | C, OCaml, Java | mixed | all | GNU GPL
Canopy | Packrat | Java, JavaScript, Python, Ruby | separate | all | GNU GPL
CL-peg | Packrat | Common Lisp | mixed | all | MIT
Drat! | Packrat | D | mixed | all | GNU GPL
Frisby | Packrat | Haskell | mixed | all | BSD
grammar::peg | Packrat | Tcl | mixed | all | BSD
Grako | Packrat + Cut + Left Recursion | Python / C++ (beta) | separate | all | BSD
IronMeta | Packrat | C# | mixed | Microsoft Windows | BSD
Katahdin | Packrat (modified), mutating interpreter | C# | mixed | all | Public domain
Laja | 2-phase scannerless top-down backtracking + runtime support | Java | separate | all | GNU GPL
lars::parser | Packrat (modified to support left-recursion and resolve grammar ambiguity) | C++ | identical | all | GNU GPL, commercial license available on request
LPeg | Parsing Machine | Lua | mixed | all | MIT
lug | Parsing Machine | C++17 | mixed | all | MIT
Mouse | Recursive descent | Java | separate | Java Virtual Machine | Apache License 2.0
Narwhal | Packrat | C | mixed | POSIX, Microsoft Windows | BSD
Nearley | Earley | JavaScript | mixed | all | MIT
Nemerle.Peg | Recursive descent + Pratt | Nemerle | separate | all | BSD
neotoma | Packrat | Erlang | separate | all | MIT
NPEG | Recursive descent | C# | mixed | all | MIT
OMeta | Packrat (modified, partial memoization) | JavaScript, Squeak, Python | mixed | all | MIT
PackCC | Packrat (modified) | C | mixed | all | MIT
Packrat | Packrat | Scheme | mixed | all | MIT
Pappy | Packrat | Haskell | mixed | all | Proprietary
parboiled | Recursive descent | Java, Scala | mixed | Java Virtual Machine | Apache License 2.0
Lambda PEG | Recursive descent | Java | mixed | Java Virtual Machine | Apache License 2.0
parsepp | Recursive descent | C++ | mixed | all | Public domain
Parsnip | Packrat | C++ | mixed | Microsoft Windows | GNU GPL
peg | Recursive descent | C | mixed | all | MIT
PEG.js | Packrat (partial memoization) | JavaScript | mixed | all | MIT
peg-parser | PEG parser interpreter | Dylan | separate | all |
Pegasus | Recursive descent / Packrat (selectively) | C# | mixed | Microsoft Windows | MIT
pegc | Recursive descent | C | mixed | all | Public domain
pest | Recursive descent | Rust | separate | all | MPL
PetitParser | Packrat | Smalltalk, Java, Dart | mixed | all | MIT
PEGTL | Recursive descent | C++11 | mixed | all | MIT
PGE | Hybrid recursive descent / operator precedence[4] | Parrot bytecode | mixed | Parrot virtual machine | Artistic 2.0
PyPy rlib | Packrat | Python | mixed | all | MIT
pyPEG | PEG parser interpreter, Packrat | Python | mixed | all | GNU GPL
Rats! | Packrat | Java | mixed | Java Virtual Machine | GNU LGPL
Spirit2 | Recursive descent | C++ | mixed | all | Boost
textX | PEG parser interpreter, Packrat | Python (no generation, interpreted) | separate | all | MIT
Treetop | Recursive descent | Ruby | mixed | all | MIT
Yard | Recursive descent | C++ | mixed | all | MIT or Public domain
Waxeye | Parsing Machine | C, Java, JavaScript, Python, Racket, Ruby | separate | all | MIT
PHP PEG | ? (PEG Parser?) | PHP | mixed | all | BSD

General context-free, conjunctive or boolean languages

Name | Parsing algorithm | Input grammar notation | Output languages | Grammar, code | Lexer | Development platform | IDE | License
ACCENT | Earley | YACC variant | C | mixed | external | all | No | GNU GPL
APaGeD | GLR, LALR(1), LL(k) | ? | D | mixed | generated | all | No | Artistic
Bison | LALR(1), LR(1), IELR(1), GLR | YACC | C, C++, Java, XML | mixed (except XML) | external | all | No | GNU GPL
DMS Software Reengineering Toolkit | GLR | ? | Parlanse | mixed | generated | Microsoft Windows | No | Proprietary
DParser | Scannerless GLR | ? | C | mixed | scannerless | POSIX | No | BSD
Dypgen | runtime-extensible GLR | ? | OCaml | mixed | generated | all | No | CeCILL-B
E3 | Earley | ? | OCaml | mixed | external, or scannerless | all | No | ?
Elkhound | GLR | ? | C++, OCaml | mixed | external | all | No | BSD
eu.h8me.Parsing | GLR | ? | N/A (state machine is runtime generated) | separate | external | .NET Framework | No | BSD
GDK | LALR(1), GLR | ? | C, Lex, Haskell, HTML, Java, Object Pascal, Yacc | mixed | generated | POSIX | No | MIT
Happy | LALR, GLR | ? | Haskell | mixed | external | all | No | BSD
Hime Parser Generator | GLR | ? | C#, Java, Rust | separate | generated | .NET Framework, Java Virtual Machine | No | GNU LGPL
IronText Library | LALR(1), GLR | C# | C# | mixed | generated or external | .NET Framework | No | Apache License 2.0
Jison | LALR(1), LR(0), SLR(1) | YACC | JavaScript, C#, PHP | mixed | generated | all | No | MIT
Syntax | LALR(1), LR(0), SLR(1), CLR(1), LL(1) | JSON/YACC | JavaScript, Python, PHP, Ruby, C#, Rust | mixed | generated | all | No | MIT
Laja | Scannerless, two phase | Laja | Java | separate | scannerless | all | No | GNU GPL
ModelCC | Earley | Annotated class model | Java | generated | generated | all | No | BSD
parglare | Scannerless LR/GLR | BNF-like | Python interpreted, automata run-time generated | mixed | scannerless | all | No | MIT
P1 | Combinators | BNF-like | OCaml | mixed | external, or scannerless | all | No | ?
P3 | Earley/combinators | BNF-like | OCaml | mixed | external, or scannerless | all | No | ?
P4 | Earley/combinators, infinitary CFGs | BNF-like | OCaml | mixed | external, or scannerless | all | No | ?
Scannerless Boolean Parser | Scannerless GLR (Boolean grammars) | ? | Haskell, Java | separate | scannerless | Java Virtual Machine | No | BSD
SDF/SGLR | Scannerless GLR | SDF | C, Java | separate | scannerless | all | Yes | BSD
SmaCC | GLR(1), LALR(1), LR(1) | ? | Smalltalk | mixed | internal | all | Yes | MIT
SPARK | Earley | ? | Python | mixed | external | all | No | MIT
Tom | GLR | ? | C | generated | none | all | No | "No licensing or copyright restrictions"
UltraGram | LALR, LR, GLR | ? | C++, C#, Java, Visual Basic .NET | separate | generated | Microsoft Windows | Yes | Proprietary
Wormhole | Pruning, LR, GLR, Scannerless GLR | ? | C, Python | mixed | scannerless | Microsoft Windows | No | MIT
Whale Calf | General tabular, SLL(k), Linear normal form (Conjunctive grammars), LR, Binary normal form (Boolean grammars) | ? | C++ | separate | external | all | No | Proprietary
yaep | Earley | YACC-like | C | mixed | external | all | No | GNU LGPL
Zecc | Recursive Pattern Matching | Zecc/Zacc | Linkable Library | mixed | scannerless | macOS | Yes | Proprietary

Context-sensitive grammars

Name | Parsing algorithm | Input grammar notation | Boolean grammar capabilities | Development platform | License
LuZc | delta chain | modular | Conjunctive, not complementary | POSIX | Proprietary
bnf2xml | recursive descent (a text filter; output is XML) | simple BNF grammar (input matching); output is XML | ? | beta, and not a full-fledged EBNF parser | GNU GPLv2

References

  1. http://www.colm.net/open-source/ragel/
  2. "Adaptive LL(*) Parsing: The Power of Dynamic Analysis" (PDF). Terence Parr. Retrieved 2016-04-03.
  3. "Building parsers for the web with JavaCC & GWT (Part one)". Chris Ainsley. Retrieved 2014-05-04.
  4. "Parrot: Grammar Engine". The Parrot Foundation. 2011. "PGE rules provide the full power of recursive descent parsing and operator precedence parsing."

Notes

  1. Bison 1.19 fork