Nice idea. You're assuming one or more of the following:
a) that each tool that has a grammar uses a canonical parsing engine type (e.g., everybody uses bison)
b) that there is some parsing tool that understands the zillion grammar specification schemes that exist
c) that whatever the parser is, it will parse language fragments (perhaps well formed).
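a) is clearly false. I've never seen b). Virtually none of the parsing engines do c); they only parse "full programs".
IMHO, your only hope is to use a parser generator that has a large number of well-tested language definitions.
ANTLR (http://www.antlr.org) is arguably one; it certainly has a long list of contributed language definitions, and they can all be found in one place. But AFAIK, it doesn't do language fragments. I doubt it has XML export for all parse trees.
Bison (http://www.gnu.org/software/bison) is arguably one; there are lots and lots of language processors built using Bison. But the definitions are scattered all over the place, and collecting them would be very hard. It doesn't do language fragments either. I'm pretty sure it has no XML export.
Our DMS Software Reengineering Toolkit (http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html) is arguably one. It has lots of language definitions, all collected in one place (our company). It does produce an AST for every parse, and it has built-in XML export. DMS can also parse the language of any nonterminal, for any of the languages it knows about.
Given a DMS .lex file, an .atg ("attribute grammar") file, and a compatible source file, DMS can simulate your example pretty well.
What follows is a build and run of the DMS lexer/parser, with XML export, for the algebra grammar found at: Algebra as a DMS domain http://www.semanticdesigns.com/Products/DMS/SimpleDMSDomainExample.html
(the ++XML in the middle of the example is the parse step being told to export XML):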
C:\DMS\Domains\Algebra\Tools\Parser\Source>make
perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -lexer
MakeDMSTool: Selected domain "Algebra".
LexerGenerator V2.1a
Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
Parsing lexical specification ...
Processing mode Algebra ...
Exiting with final status 0
perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -tool %Temporaries
MakeDMSTool: Selected domain "Algebra".
Using attribute grammar in "/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/Syntax/Algebra.atg"
AttributeEvaluatorGenerator V3.0
Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
Parsing attribute grammar ...
Generating attribute evaluator(s) ...
Exiting with final status 0
rm -rf /cygdrive/c/DMS/Domains/Algebra/Tools/%Temporaries
perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -prettyprinter
MakeDMSTool: Selected domain "Algebra".
PrettyPrinterGenerator V2.0
Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
Parsing pretty printer specification ...
Generating pretty printer ...
Exiting with final status 0
AttributeEvaluatorGenerator V3.0
Copyright (c) 1999-2010 Semantic Designs, Inc.; All Rights Reserved
Parsing attribute grammar ...
Generating attribute evaluator(s) ...
......................
Exiting with final status 0
cd /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/\%Generated; \
perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -weave-preserve-productions %PreserveProductions.*.par
MakeDMSTool: Selected domain "Algebra".
perl /cygdrive/c/DMS/Executables/MakeDMSTool Algebra -parser
MakeDMSTool: Selected domain "Algebra".
export PARLANSEINCLUDEDIRECTORIES=`perl -e '($_ = $ARGV[0].";/cygdrive/c/DMS/Domains/PARLANSE/Library/Arrays;/cygdrive/c/DMS/Domains
/PARLANSE/Library/Bags;/cygdrive/c/DMS/Domains/PARLANSE/Library/HashTables;/cygdrive/c/DMS/Domains/PARLANSE/Library/Pipes;/cygdrive/
c/DMS/Domains/PARLANSE/Library/Sequences;/cygdrive/c/DMS/Domains/PARLANSE/Library/Sets;/cygdrive/c/DMS/Domains/PARLANSE/Library/Stac
ks;/cygdrive/c/DMS/Domains/PARLANSE/Library/Utilities;/cygdrive/c/DMS/Domains/PARLANSE/Library/Algorithms/Source;/cygdrive/c/DMS/Dom
ains/PARLANSE/Library/Booleans/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Characters/Source;/cygdrive/c/DMS/Domains/PARLANSE/Li
brary/Graphics/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/HashTrees/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Numbers/Sou
rce;/cygdrive/c/DMS/Domains/PARLANSE/Library/References/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/SQL/Source;/cygdrive/c/DMS/D
omains/PARLANSE/Library/Streams/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/SuffixTrees/Source;/cygdrive/c/DMS/Domains/PARLANSE/
Library/System/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/Search/Source;/cygdrive/c/DMS/Domains/PARLANSE/Library/TestSupport/So
urce") =~ s!//(.)/!$1:/!g; $_ =~ s!/cygdrive/(.)/!$1:/!g; print $_' "/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source;/cygdrive/c
/DMS/Domains/Algebra/Tools/Parser/Source/Components;/cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/%Generated;/cygdrive/c/DMS/D
omains/DMSStringGrammar/Tools/DomainParser/Source;/cygdrive/c/DMS/Domains/Algebra/Tools/Lexer/Source;/cygdrive/c/DMS/Domains/Algebra
/Tools/Lexer/Source/%Generated;/cygdrive/c/DMS/Domains/DMSLexical/Tools/DomainLexer/Source;/cygdrive/c/DMS/Infrastructure/HyperGraph
/Source;/cygdrive/c/DMS/Domains"`; \
cd `echo /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source`; \
nice /cygdrive/c/DMS/Domains/PARLANSE/Tools/Compiler/p0c.exe DomainParser.par
PARLANSE0 Compiler V19.16.40
Semantic Designs, Inc. *** Confidential Information
128/485/133408 smallest/average/largest activation record/grain stack space required.
Largest stack space required by function at Line 1533
in file FFIModule.par
89 grains.
3775 functions/procedures.
223447 lines of source code read.
7160772 bytes of object code.
No errors detected.
mv -f /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/Source/DomainParser.P0B /cygdrive/c/DMS/Domains/Algebra/Tools/Parser/DomainParser
.P0B
C:\DMS\Domains\Algebra\Tools\Parser\Source>run ../DomainParser ++XML C:\DMS\Domains\Algebra\Tools\Lexer\TestCase\algebraformula.txt
Domain Parser for Algebra 2.3.3
Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
31 tree nodes in tree.
<DMSForest>
<tree node="formula" type="1" domain="1" id="10qx0" parents="0" line="1" column="1" file="1">
<tree node="product" type="4" domain="1" id="10qwx" line="1" column="1" file="1">
<tree node="term" type="10" domain="1" id="10qwy" line="1" column="1" file="1">
<tree node="'D'" type="19" domain="1" id="10qw5" literal="0" line="1" column="1" file="1"/>
<tree node="'['" type="20" domain="1" id="10qw6" literal="0" line="1" column="2" file="1"/>
<tree node="formula" type="1" domain="1" id="10qwt" line="1" column="4" file="1">
<tree node="product" type="4" domain="1" id="10qws" line="1" column="4" file="1">
<tree node="term" type="9" domain="1" id="10qwr" line="1" column="4" file="1">
<tree node="'('" type="17" domain="1" id="10qw7" literal="0" line="1" column="4" file="1"/>
<tree node="formula" type="3" domain="1" id="10qwp" line="1" column="5" file="1">
<tree node="formula" type="2" domain="1" id="10qwk" line="1" column="5" file="1">
<tree node="formula" type="1" domain="1" id="10qwf" line="1" column="5" file="1">
<tree node="product" type="5" domain="1" id="10qwe" line="1" column="5" file="1">
<tree node="product" type="4" domain="1" id="10qwa" line="1" column="5" file="1">
<tree node="term" type="7" domain="1" id="10qw9" line="1" column="5" file="1">
<tree node="VARIABLE" type="15" domain="1" id="10qw8" line="1" column="5" file="1">
<literal>x</literal>
</tree>
</tree>
</tree>
<tree node="'*'" type="13" domain="1" id="10qwb" literal="0" line="1" column="7" file="1"/>
<tree node="term" type="8" domain="1" id="10qwd" line="1" column="8" file="1">
<tree node="NUMBER" type="16" domain="1" id="10qwc" literal="23" line="1" column="8" file="1"/>
</tree>
</tree>
</tree>
<tree node="'+'" type="11" domain="1" id="10qwg" literal="0" line="1" column="10" file="1"/>
<tree node="product" type="4" domain="1" id="10qwj" line="1" column="12" file="1">
<tree node="term" type="7" domain="1" id="10qwi" line="1" column="12" file="1">
<tree node="VARIABLE" type="15" domain="1" id="10qwh" line="1" column="12" file="1">
<literal>y</literal>
</tree>
</tree>
</tree>
</tree>
<tree node="'-'" type="12" domain="1" id="10qwl" literal="0" line="1" column="13" file="1"/>
<tree node="product" type="4" domain="1" id="10qwo" line="1" column="14" file="1">
<tree node="term" type="7" domain="1" id="10qwn" line="1" column="14" file="1">
<tree node="VARIABLE" type="15" domain="1" id="10qwm" line="1" column="14" file="1">
<literal>z</literal>
</tree>
</tree>
</tree>
</tree>
<tree node="')'" type="18" domain="1" id="10qwq" literal="0" line="1" column="15" file="1"/>
</tree>
</tree>
</tree>
<tree node="','" type="21" domain="1" id="10qwu" literal="0" line="1" column="16" file="1"/>
<tree node="VARIABLE" type="15" domain="1" id="10qwv" line="1" column="18" file="1">
<literal>x</literal>
</tree>
<tree node="']'" type="22" domain="1" id="10qww" literal="0" line="1" column="19" file="1"/>
</tree>
</tree>
</tree>
<FileIndex>
<File index="1">C:/DMS/Domains/Algebra/Tools/Lexer/TestCase/algebraformula.txt</File>
</FileIndex>
<DomainIndex>
<Domain index="1">Algebra</Domain>
</DomainIndex>
</DMSForest>
Exiting with final status 0
C:\DMS\Domains\Algebra\Tools\Parser\Source>
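The XML export above is plain enough to consume with any XML library. As a minimal sketch (not part of DMS), the Python below pulls the `<DMSForest>` element out of captured parser output and recovers the leaf tokens in source order. The function names and the `SAMPLE` text are my own illustration; `SAMPLE` is a hand-trimmed fragment in the same shape as the output above, and the leaf handling assumes only the three leaf forms visible in that output (a `<literal>` child, a quoted node name such as `'*'`, or a `literal` attribute).

```python
import xml.etree.ElementTree as ET


def extract_forest(output: str) -> ET.Element:
    """Cut the <DMSForest> element out of the surrounding banner/status lines."""
    start = output.index("<DMSForest>")
    end = output.index("</DMSForest>") + len("</DMSForest>")
    return ET.fromstring(output[start:end])


def leaf_tokens(node: ET.Element):
    """Yield the token text of every leaf under `node`, left to right."""
    children = node.findall("tree")
    if children:
        for child in children:
            yield from leaf_tokens(child)
        return
    name = node.get("node", "")
    literal_child = node.find("literal")
    if literal_child is not None:
        # e.g. VARIABLE nodes carry their text in a <literal> child
        yield literal_child.text or ""
    elif len(name) >= 2 and name[0] == "'" and name[-1] == "'":
        # e.g. node="'*'": the node name itself is the token
        yield name[1:-1]
    else:
        # e.g. NUMBER nodes carry their value in the literal attribute
        yield node.get("literal", "")


def tokens(output: str) -> list:
    forest = extract_forest(output)
    result = []
    for top in forest.findall("tree"):
        result.extend(leaf_tokens(top))
    return result


# Hand-trimmed illustration, not a verbatim copy of the run above.
SAMPLE = """Domain Parser for Algebra 2.3.3
<DMSForest>
<tree node="product" type="5" domain="1" id="10qwe" line="1" column="5" file="1">
<tree node="VARIABLE" type="15" domain="1" id="10qw8" line="1" column="5" file="1">
<literal>x</literal>
</tree>
<tree node="'*'" type="13" domain="1" id="10qwb" literal="0" line="1" column="7" file="1"/>
<tree node="NUMBER" type="16" domain="1" id="10qwc" literal="23" line="1" column="8" file="1"/>
</tree>
</DMSForest>
Exiting with final status 0
"""

if __name__ == "__main__":
    print(tokens(SAMPLE))
```

The `line`, `column` and `file` attributes are ignored here; a real consumer would probably keep them to map nodes back to source positions.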
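If you really want an engine that understands many grammar notations, it is probably easiest to build one using DMS. Just define each grammar formalism (e.g., ANTLR or bison) as a DSL for DMS, parse specific instances of that formalism (e.g., an ANTLR BNF instance) using DMS, apply DMS rewrite rules to convert them to DMS grammars, and then build the DMS parser. (You'd have to do the same for the lexers, too.)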
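To make the "parse one notation, rewrite it into another" idea concrete, here is a toy sketch in plain Python. It is only a stand-in for the pipeline described above: it does not use DMS or DMS rewrite rules, it handles a tiny bison-like subset (`name : alt | alt ;`, with no semantic actions and no `%` directives), and the flat `lhs = rhs ;` target notation is hypothetical, chosen for illustration rather than taken from any tool's actual grammar syntax.

```python
import re

# Quoted literals, identifiers, and the three punctuation marks of the subset.
TOKEN = re.compile(r"'[^']*'|[A-Za-z_][A-Za-z_0-9]*|[:|;]")


def parse_rules(text):
    """Parse the bison-like subset into a list of (lhs, [rhs symbols]) pairs."""
    toks = TOKEN.findall(text)
    productions = []
    i = 0
    while i < len(toks):
        lhs = toks[i]
        if i + 1 >= len(toks) or toks[i + 1] != ":":
            raise ValueError(f"expected ':' after {lhs!r}")
        i += 2
        rhs = []
        while i < len(toks) and toks[i] != ";":
            if toks[i] == "|":
                # '|' closes one alternative and starts the next
                productions.append((lhs, rhs))
                rhs = []
            else:
                rhs.append(toks[i])
            i += 1
        if i == len(toks):
            raise ValueError(f"missing ';' in rule for {lhs!r}")
        productions.append((lhs, rhs))
        i += 1  # skip ';'
    return productions


def emit_flat(productions):
    """Emit one production per line in the hypothetical 'lhs = rhs ;' notation."""
    return "\n".join(f"{lhs} = {' '.join(rhs)} ;" for lhs, rhs in productions)


if __name__ == "__main__":
    source = "formula : formula '+' product | product ;"
    print(emit_flat(parse_rules(source)))
```

The intermediate list of productions is what matters: once every input notation is reduced to that neutral form, emitting the target notation is a single, shared step. Real grammar files also carry precedence declarations, actions and lexer definitions, which this sketch deliberately ignores.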