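Depending on how complex the configuration language is, a one-pass parser may be a better choice than building an AST and then walking the tree. Both approaches are perfectly valid, though.
You should probably spend a few minutes (or a few hours :) ) reading the bison manual: https://www.gnu.org/software/bison/manual/. Here I'll just focus on the general approach and the bison features you're likely to use.
The most important of these is the ability to pass extra parameters into the parser. In particular, you'll want to pass a reference or pointer to the object which will hold the parsed configuration. You need the extra output parameter because the parser itself only returns a success or failure indication (which you also need).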
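The calling convention looks roughly like the following. This sketch is not bison output: parse_config is a hypothetical hand-written stand-in which only accepts name=value lines. It is here just to show the pattern of a status return combined with an output parameter.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

using Config = std::map<std::string, std::string>;

// Hypothetical stand-in for a generated parser: returns 0 on success and
// nonzero on failure, and stores whatever it parsed in the `config`
// output parameter.
int parse_config(std::istream& in, Config& config) {
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;                 // skip blank lines
    std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
      return 1;                                 // not of the form name=value
    config[line.substr(0, eq)] = line.substr(eq + 1);
  }
  return 0;
}
```

The caller creates the Config, passes it in, and looks at the return value before trusting the contents. The real parser is used in the same way.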
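Here's a simple example which just builds a dictionary from names to strings. Note that, unlike the author of the tutorial you mention, I prefer to compile both the scanner and the parser as C++, which avoids the need for extern "C" interfaces. That works fine with current versions of flex and bison, as long as you don't try to put non-POD objects on the parser stack. Unfortunately, that means we can't use std::string directly; we need to use a pointer (and we can't use a smart pointer either).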
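If you want to check whether a given type counts as POD, the compiler can tell you. This is a minimal sketch using the C++11 definition (trivial and standard-layout); is_pod_like is a name I made up for illustration.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical helper: true if T is POD in the C++11 sense, i.e. both
// trivial and standard-layout.
template <class T>
constexpr bool is_pod_like() {
  return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// A raw pointer qualifies, so it can be used as the semantic type.
static_assert(is_pod_like<std::string*>(), "raw pointer is POD");

// std::string and smart pointers have non-trivial constructors and
// destructors, so they do not qualify.
static_assert(!is_pod_like<std::string>(), "std::string is not POD");
static_assert(!is_pod_like<std::unique_ptr<std::string>>(),
              "std::unique_ptr is not POD");
```

The price of using a raw pointer is that every semantic value has to be deleted by hand, which is why the grammar file needs both explicit delete calls and a %destructor.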
File scanner.l
%{
#include <string>
#include "config.h"
using std::string;
%}
%option noinput nounput noyywrap nodefault
%option yylineno
/* Set the output file to a C++ file. This could also be done on the
   command line. */
%option outfile="scanner.cc"
%%
"#".* ; /* Ignore comments */
[[:space:]] ; /* Ignore whitespace */
[[:alpha:]_][[:alnum:]_]* { yylval = new string(yytext, yyleng); return ID; }
[[:alnum:]_@]+ { yylval = new string(yytext, yyleng); return STRING; }
["][^"]*["] { yylval = new string(yytext+1, yyleng-2); return STRING; }
. { return *yytext; }
Now the bison file, which only recognizes assignments. It requires bison v3; a few minor adjustments would be needed to make it work with bison v2.7.
config.y
%code requires {
#include <map>
#include <string>
#include <cstdio>
using Config = std::map<std::string, std::string>;
// The semantic type is a pointer to a std::string
#define YYSTYPE std::string*
// Forward declarations
extern FILE* yyin;
extern int yylineno;
int yylex();
// Since we've defined an additional parse parameter, it will also
// be passed to yyerror. So we need to adjust the prototype accordingly.
void yyerror(Config&, const char*);
}
// Set the generated code filenames. As with the flex file, this is
// probably better done on the command line.
%output "config.cc"
%defines "config.h"
// The parser takes an additional argument, which is a reference to the
// dictionary which will be returned.
%parse-param { Config& config }
%token ID STRING
// If semantic values are popped off the stack as the result of error
// recovery, they will leak, so we need to clean up.
%destructor { delete $$; } ID STRING
%%
config: %empty
      | config assignment
      ;
assignment: ID '=' STRING { config[*$1] = *$3;
                            delete $1; delete $3;
                          }
          | ID '=' ID     { config[*$1] = config[*$3];
                            delete $1; delete $3;
                          }
          ;
%%
// The driver would normally go into a separate file. I've put it here
// for simplicity.
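#include <cerrno>
#include <cstring>
#include <iostream>
// Report a syntax error. The extra parse parameter is passed here too,
// but it isn't needed for the message.
void yyerror(Config& unused, const char* msg) {
  std::cerr << msg << " at line " << yylineno << '\n';
}
int main(int argc, const char** argv) {
  // Read from the file named on the command line, or from stdin.
  if (argc > 1) {
    yyin = fopen(argv[1], "r");
    if (!yyin) {
      std::cerr << "Unable to open " << argv[1] << ": "
                << strerror(errno) << '\n';
      return 1;
    }
  } else {
    yyin = stdin;
  }
  Config config;
  // yyparse returns 0 on success; the result is left in config.
  int rv = yyparse(config);
  if (rv == 0)
    for (const auto& kv : config)
      std::cout << kv.first << ": \"" << kv.second << "\"\n";
  return rv;
}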
To compile:
flex scanner.l
bison config.y
g++ -std=c++11 -Wall -o config config.cc scanner.cc
To try it out:
$ cat sample.config
a=17
b= @a_single_token@
c = "A quoted string"
d9 =
"Another quoted string"
$ ./config sample.config
a: "17"
b: "@a_single_token@"
c: "A quoted string"
d9: "Another quoted string"
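Once yyparse has succeeded, config is just a std::map, so the rest of the program can read it with ordinary lookups. Here's a minimal sketch of one way to do that; get_or is a hypothetical helper of my own, not something flex or bison provides.

```cpp
#include <map>
#include <string>

using Config = std::map<std::string, std::string>;

// Hypothetical helper: look a name up without inserting it, and fall back
// to a caller-supplied default if the name is absent.
std::string get_or(const Config& config, const std::string& name,
                   const std::string& fallback) {
  Config::const_iterator it = config.find(name);
  return it == config.end() ? fallback : it->second;
}
```

It uses find rather than operator[] because operator[] inserts an empty entry for a name which isn't there. That is also what happens in the ID '=' ID rule if the name on the right-hand side was never assigned.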