我想对于一个随机的路人来说提升精神答案可能看起来很复杂/矫枉过正。
我想再试一次,因为现在是 2021 年了,无论是使用 C++17,我们都可以仅使用标准库来完成合理的工作。
事实证明还有很多工作要做。 Qi 实现需要 86 行代码,但标准库实现需要 136 行。另外,我花了很长时间(几个小时)来调试/编写。尤其是很难得到'='
, '['
, ']'
作为令牌边界std::istream&
。我用的是ctype
来自这个答案的方面方法:如何在 C++ 中逐行迭代 cin?
我确实离开了DebugPeeker
(20行)所以你也许可以自己理解。
简短的博览会
顶级解析函数看起来很正常,并显示了我想要实现的目标:自然std::istream
萃取:
static Ast::File std_parse_game(std::string_view input) {
std::istringstream iss{std::string(input)};
using namespace Helpers;
if (Ast::File parsed; iss >> parsed)
return parsed;
throw std::runtime_error("Unable to parse game");
}
其余的都位于命名空间中Helpers
:
static inline std::istream& operator>>(std::istream& is, Ast::File& v) {
for (section s; is >> s;) {
if (s.name == "parameters")
is >> v.parameters;
else if (s.name == "initialization")
is >> v.initializations.emplace_back();
else if (s.name == "color")
is >> v.colors.emplace_back();
else
is.setstate(std::ios::failbit);
}
if (is.eof())
is.clear();
return is;
}
到目前为止,这一举措的回报良好。不同的部分类型是相似的:
static inline std::istream& operator>>(std::istream& is, Ast::Parameters& v) {
return is
>> entry{"numColors", v.numColors}
>> entry{"boardSize", v.boardSize}
>> entry{"numSnails", v.numSnails};
}
static inline std::istream& operator>>(std::istream& is, Ast::Initialization& v) {
return is
>> entry{"id", v.id}
>> entry{"row", v.row}
>> entry{"col", v.col}
>> entry{"orientation", v.orientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::Color& v) {
return is
>> entry{"id", v.id}
>> entry{"nextColor", v.nextColor}
>> entry{"deltaOrientation", v.deltaOrientation};
}
现在,如果一切都像这样一帆风顺,我就不会推荐 Spirit。现在我们进入条件解析。
这entry{"name", value}
配方使用“操纵器类型”:
template <typename T> struct entry {
entry(std::string name, T& into) : _name(name), _into(into) {}
std::string _name;
T& _into;
friend std::istream& operator>>(std::istream& is, entry e) {
return is >> expect{e._name} >> expect{'='} >> e._into;
}
};
同样,部分正在使用expect
and token
:
struct section {
std::string name;
friend std::istream& operator>>(std::istream& is, section& s) {
if (is >> expect('['))
return is >> token{s.name} >> expect{']'};
return is;
}
};
条件对于能够检测 EOF 而不将流置于硬失败模式(is.bad()
!= is.fail()
).
expect
是建立在token
:
template <typename T> struct expect {
expect(T expected) : _expected(expected) {}
T _expected;
friend std::istream& operator>>(std::istream& is, expect const& e) {
if (T actual; is >> token{actual})
if (actual != e._expected)
is.setstate(std::ios::failbit);
return is;
}
};
您会注意到错误信息少了很多。我们只是做
溪流fail()
如果没有找到预期的令牌。
这就是真正的复杂性所在。我不想解析字符
特点。但读书std::string
using operator>>
只会停止于
空格,意味着节名会“吃掉”括号:parameters]
代替parameters
,钥匙可能会吃掉=
如果没有字符
分隔空间。
在上面链接的答案中,我们学习如何塑造自己的性格
分类区域设置方面:
// make sure =,[,] break tokens
struct mytoken_ctype : std::ctype<char> {
static auto const* get_table() {
static std::vector rc(table_size, std::ctype_base::mask());
rc[' '] = rc['\f'] = rc['\v'] = rc['\t'] = rc['\r'] = rc['\n'] =
std::ctype_base::space;
// crucial for us:
rc['='] = rc['['] = rc[']'] = std::ctype_base::space;
return rc.data();
}
mytoken_ctype() : std::ctype<char>(get_table()) {}
};
然后我们需要使用它,但前提是我们解析一个std::string
令牌。那
方式,如果我们expect('=')
它不会跳过'='
因为我们的方面称其为空白...
template <typename T> struct token {
token(T& into) : _into(into) {}
T& _into;
friend std::istream& operator>>(std::istream& is, token const& t) {
std::locale loc = is.getloc();
if constexpr (std::is_same_v<std::decay_t<T>, std::string>) {
loc = is.imbue(std::locale(std::locale(), new mytoken_ctype()));
}
try { is >> t._into; is.imbue(loc); }
catch (...) { is.imbue(loc); throw; }
return is;
}
};
我试图保持它的简洁。如果我使用正确的格式,我们会
还有更多行代码:)
演示和测试
我使用了相同的 Ast 类型,因此测试这两种实现和
比较结果是否相等。
NOTES:
- 在 Compiler Explorer 上我们可以享受
libfmt
方便输出
- 为了进行比较,我使用了一项 C++20 功能来生成编译器
operator==
实时编译器资源管理器
#include <boost/spirit/home/qi.hpp>
#include <boost/fusion/include/io.hpp>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
namespace qi = boost::spirit::qi;
namespace Ast {
using Id = unsigned;
using Size = uint16_t; // avoiding char types for easy debug/output
using Coord = Size;
using ColorNumber = Size;
using Orientation = Size;
using Delta = signed;
struct Parameters {
Size numColors{}, boardSize{}, numSnails{};
bool operator==(Parameters const&) const = default;
};
struct Initialization {
Id id;
Coord row;
Coord col;
Orientation orientation;
bool operator==(Initialization const&) const = default;
};
struct Color {
Id id;
ColorNumber nextColor;
Delta deltaOrientation;
bool operator==(Color const&) const = default;
};
struct File {
Parameters parameters;
std::vector<Initialization> initializations;
std::vector<Color> colors;
bool operator==(File const&) const = default;
};
using boost::fusion::operator<<;
template <typename T>
static inline std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
return os << fmt::format("vector<{}>{}",
boost::core::demangle(typeid(T).name()), v);
}
} // namespace Ast
BOOST_FUSION_ADAPT_STRUCT(Ast::Parameters, numColors, boardSize, numSnails)
BOOST_FUSION_ADAPT_STRUCT(Ast::Initialization, id, row, col, orientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::Color, id, nextColor, deltaOrientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::File, parameters, initializations, colors)
template <typename It>
struct GameParser : qi::grammar<It, Ast::File()> {
GameParser() : GameParser::base_type(start) {
using namespace qi;
start = skip(blank)[file];
auto section = [](const std::string& name) {
return copy('[' >> lexeme[lit(name)] >> ']' >> (+eol | eoi));
};
auto required = [](const std::string& name, auto value) {
return copy(lexeme[eps > lit(name)] > '=' > value >
(+eol | eoi));
};
file = parameters >
*initialization >
*color >
eoi; // must reach end of input
parameters = section("parameters") >
required("numColors", _size) >
required("boardSize", _size) >
required("numSnails", _size);
initialization = section("initialization") >
required("id", _id) >
required("row", _coord) >
required("col", _coord) >
required("orientation", _orientation);
color = section("color") >
required("id", _id) >
required("nextColor", _colorNumber) >
required("deltaOrientation", _delta);
BOOST_SPIRIT_DEBUG_NODES((file)(parameters)(initialization)(color))
}
private:
using Skipper = qi::blank_type;
qi::rule<It, Ast::File()> start;
qi::rule<It, Ast::File(), Skipper> file;
// sections
qi::rule<It, Ast::Parameters(), Skipper> parameters;
qi::rule<It, Ast::Initialization(), Skipper> initialization;
qi::rule<It, Ast::Color(), Skipper> color;
// value types
qi::uint_parser<Ast::Id> _id;
qi::uint_parser<Ast::Size> _size;
qi::uint_parser<Ast::Coord> _coord;
qi::uint_parser<Ast::ColorNumber> _colorNumber;
qi::uint_parser<Ast::Orientation> _orientation;
qi::int_parser<Ast::Delta> _delta;
};
static Ast::File qi_parse_game(std::string_view input) {
using SVI = std::string_view::const_iterator;
static const GameParser<SVI> parser{};
try {
Ast::File parsed;
if (qi::parse(input.begin(), input.end(), parser, parsed)) {
return parsed;
}
throw std::runtime_error("Unable to parse game");
} catch (qi::expectation_failure<SVI> const& ef) {
std::ostringstream oss;
auto where = ef.first - input.begin();
auto sol = 1 + input.find_last_of("\r\n", where);
auto lineno = 1 + std::count(input.begin(), input.begin() + sol, '\n');
auto col = 1 + where - sol;
auto llen = input.substr(sol).find_first_of("\r\n");
oss << "input.txt:" << lineno << ":" << col << " Expected: " << ef.what_ << "\n"
<< " note: " << input.substr(sol, llen) << "\n"
<< " note:" << std::setw(col) << "" << "^--- here";
throw std::runtime_error(oss.str());
}
}
namespace Helpers {
struct DebugPeeker {
DebugPeeker(std::istream& is, int line) : is(is), line(line) { dopeek(); }
~DebugPeeker() { dopeek(); }
private:
std::istream& is;
int line;
void dopeek() const {
std::char_traits<char> t;
auto ch = is.peek();
std::cerr << "DEBUG " << line << " Peek: ";
if (std::isgraph(ch))
std::cerr << "'" << t.to_char_type(ch) << "'";
else
std::cerr << "<" << ch << ">";
std::cerr << " " << std::boolalpha << is.good() << "\n";
}
};
#define DEBUG_PEEK(is) // Peeker _peek##__LINE__(is, __LINE__);
// make sure =,[,] break tokens
struct mytoken_ctype : std::ctype<char> {
static auto const* get_table() {
static std::vector rc(table_size, std::ctype_base::mask());
rc[' '] = rc['\f'] = rc['\v'] = rc['\t'] = rc['\r'] = rc['\n'] =
std::ctype_base::space;
// crucial for us:
rc['='] = rc['['] = rc[']'] = std::ctype_base::space;
return rc.data();
}
mytoken_ctype() : std::ctype<char>(get_table()) {}
};
template <typename T> struct token {
token(T& into) : _into(into) {}
T& _into;
friend std::istream& operator>>(std::istream& is, token const& t) {
DEBUG_PEEK(is);
std::locale loc = is.getloc();
if constexpr (std::is_same_v<std::decay_t<T>, std::string>) {
loc = is.imbue(std::locale(std::locale(), new mytoken_ctype()));
}
try { is >> t._into; is.imbue(loc); }
catch (...) { is.imbue(loc); throw; }
return is;
}
};
template <typename T> struct expect {
expect(T expected) : _expected(expected) {}
T _expected;
friend std::istream& operator>>(std::istream& is, expect const& e) {
DEBUG_PEEK(is);
if (T actual; is >> token{actual})
if (actual != e._expected)
is.setstate(std::ios::failbit);
return is;
}
};
template <typename T> struct entry {
entry(std::string name, T& into) : _name(name), _into(into) {}
std::string _name;
T& _into;
friend std::istream& operator>>(std::istream& is, entry e) {
DEBUG_PEEK(is);
return is >> expect{e._name} >> expect{'='} >> e._into;
}
};
struct section {
std::string name;
friend std::istream& operator>>(std::istream& is, section& s) {
DEBUG_PEEK(is);
if (is >> expect('['))
return is >> token{s.name} >> expect{']'};
return is;
}
};
static inline std::istream& operator>>(std::istream& is, Ast::Parameters& v) {
DEBUG_PEEK(is);
return is
>> entry{"numColors", v.numColors}
>> entry{"boardSize", v.boardSize}
>> entry{"numSnails", v.numSnails};
}
static inline std::istream& operator>>(std::istream& is, Ast::Initialization& v) {
DEBUG_PEEK(is);
return is
>> entry{"id", v.id}
>> entry{"row", v.row}
>> entry{"col", v.col}
>> entry{"orientation", v.orientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::Color& v) {
DEBUG_PEEK(is);
return is
>> entry{"id", v.id}
>> entry{"nextColor", v.nextColor}
>> entry{"deltaOrientation", v.deltaOrientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::File& v) {
DEBUG_PEEK(is);
for (section s; is >> s;) {
if (s.name == "parameters")
is >> v.parameters;
else if (s.name == "initialization")
is >> v.initializations.emplace_back();
else if (s.name == "color")
is >> v.colors.emplace_back();
else
is.setstate(std::ios::failbit);
}
if (is.eof())
is.clear();
return is;
}
}
static Ast::File std_parse_game(std::string_view input) {
std::istringstream iss{std::string(input)};
using namespace Helpers;
if (Ast::File parsed; iss >> parsed)
return parsed;
throw std::runtime_error("Unable to parse game");
}
std::string read_file(const std::string& name) {
std::ifstream ifs(name);
return std::string(std::istreambuf_iterator<char>(ifs), {});
}
int main() {
std::string const game_save = read_file("input.txt");
Ast::File g1, g2;
try {
std::cout << "Qi: " << (g1 = qi_parse_game(game_save)) << "\n";
} catch (std::exception const& e) { std::cerr << e.what() << "\n"; }
try {
std::cout << "std: " << (g2 = std_parse_game(game_save)) << "\n";
} catch (std::exception const& e) { std::cerr << e.what() << "\n"; }
std::cout << "Equal: " << std::boolalpha << (g1 == g2) << "\n";
}
让我松了口气的是,解析器对数据达成了一致:
Qi: ((4 11 2)
vector<Ast::Initialization>{(0 3 4 0), (1 5 0 1)}
vector<Ast::Color>{(0 1 2), (1 2 1), (2 3 -2), (3 0 -1)})
std: ((4 11 2)
vector<Ast::Initialization>{(0 3 4 0), (1 5 0 1)}
vector<Ast::Color>{(0 1 2), (1 2 1), (2 3 -2), (3 0 -1)})
Equal: true
总结/结论
虽然这个答案是“标准的”和“便携式的”,但它有一些缺点。
例如,它肯定不容易正确,它几乎没有调试选项或错误报告,它不验证输入格式。例如。它仍然会阅读这个邪恶的混乱并接受它:
[parameters] numColors=999 boardSize=999 numSnails=999
[color] id=0 nextColor=1 deltaOrientation=+2 [color] id=1 nextColor=2
deltaOrientation=+1 [
initialization] id=1 row=5 col=0 orientation=1
[color] id=2 nextColor=3 deltaOrientation=-2
[parameters] numColors=4 boardSize=11 numSnails=2
[color] id=3 nextColor=0 deltaOrientation=-1
[initialization] id=0 row=3 col=4 orientation=0
如果您的输入格式不稳定并且是计算机编写的,我强烈建议不要使用标准库方法,因为它将导致难以诊断的问题和可怕的用户体验(不要让您的用户想要扔掉他们的计算机窗口的原因是无用的错误消息,例如“无法解析游戏数据”)。
否则,你可能会。一方面,编译速度会更快。