PHP XML Expat parser: how to read only part of an XML document?

2024-01-16

I have an XML document with the following structure:

<posts>
<user id="1222334">
  <post>
    <message>hello</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>hello client how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
<user id="2333343">
  <post>
    <message>good morning</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>good morning how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
</posts>

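I am able to create the parser and print out the whole document, but the problem is that I only want to print the (user) node that has a specific attribute (id), together with its children.

My PHP code is: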

if (!empty($_GET['id'])) {
    $id = $_GET['id'];
    $parser = xml_parser_create();

    function start($parser, $element_name, $element_attrs)
    {
        switch ($element_name) {
            case "USER":
                echo "-- User --<br>";
                break;
            case "CLIENT":
                echo "Name: ";
                break;
            case "MESSAGE":
                echo "Message: ";
                break;
            case "TIME":
                echo "Time: ";
                break;
            case "POST":
                echo "--Post<br> ";
        }
    }

    function stop($parser, $element_name) { echo "<br>"; }
    function char($parser, $data) { echo $data; }

    xml_set_element_handler($parser, "start", "stop");
    xml_set_character_data_handler($parser, "char");

    $file = "test.xml";
    $fp = fopen($file, "r");
    while ($data = fread($fp, filesize($file))) {
        xml_parse($parser, $data, feof($fp)) or
            die(sprintf("XML Error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)));
    }
    xml_parser_free($parser);
}

Using the following condition inside the start() function selects the right node, but it has no effect on the reading process:

    if(($element_name == "USER") && $element_attrs["ID"] && ($element_attrs["ID"] == "$id"))

Any help would be appreciated.

UPDATE: XMLReader works, but it stops working when I use an if statement:

foreach ($filteredUsers as $user) {
    echo "<table border='1'>";
    foreach ($user->getChildElements('post') as $index => $post) {
        if ($post->getChildElements('client') == "operator") {
            printf("<tr><td class='blue'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));
        } else {
            printf("<tr><td class='green'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));
        }
    }
    echo "</table>";
}

As suggested in a comment earlier, you can alternatively use XMLReader (docs: http://php.net/XMLReader).

The XMLReader extension is an XML pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.

It is a class (with the same name: XMLReader) which can open a file. You typically call read() to advance to the next node; next() also moves forward but skips the current node's subtree. You then check whether the current position is an element and whether that element has the name you are looking for, and then you can process it, for example by reading the outer XML of the element with XMLReader::readOuterXml() (docs: http://php.net/XMLReader.readOuterXml).
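The read/check/process loop described above could be sketched as follows. This is a minimal sketch, not the original answer's code: the inline sample XML and the helper name findUserXml() are assumptions made for illustration, and it uses only the built-in XMLReader class.

```php
<?php
// Hypothetical sample data in the shape of the question's document.
$xml = <<<XML
<posts>
<user id="1222334">
  <post><message>hello</message><client>client</client><time>time</time></post>
</user>
<user id="2333343">
  <post><message>good morning</message><client>client</client><time>time</time></post>
</user>
</posts>
XML;

// Hypothetical helper: returns the outer XML of the <user> with the given id,
// or null when no such element exists.
function findUserXml(string $xml, string $id): ?string
{
    $reader = new XMLReader();
    $reader->XML($xml); // for a file you would call $reader->open($path) instead

    $found = null;
    // read() moves the cursor forward one node at a time.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === 'user'
            && $reader->getAttribute('id') === $id
        ) {
            $found = $reader->readOuterXml();
            break; // stop reading as soon as the wanted element was seen
        }
    }
    $reader->close();

    return $found;
}

echo findUserXml($xml, '2333343'), "\n";
```

Because the loop breaks on the first match, the rest of the document is never read, which is the main difference to the callback-based parser in the question.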

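Compared with the callbacks in the Expat parser, this is a bit cumbersome. To get more flexibility with XMLReader, I usually create my own iterators that work on the XMLReader object and provide the steps I need: https://gist.github.com/hakre/5147685.

They allow iterating over the concrete elements directly with foreach. Here is such an example: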

require('xmlreader-iterators.php'); // https://gist.github.com/hakre/5147685

$xmlFile = '../data/posts.xml';

$ids = array(3, 8);

$reader = new XMLReader();
$reader->open($xmlFile);

/* @var $users XMLReaderNode[] - iterate over all <user> elements */
$users = new XMLElementIterator($reader, 'user');

/* @var $filteredUsers XMLReaderNode[] - iterate over elements with id="3" or id="8" */
$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

I have created an XML file that contains more posts like the ones in your question, with the id attribute numbered from one upwards:

$xmlFile = '../data/posts.xml';

Then I created an array with the two ID values of the users of interest:

$ids = array(3, 8);

It will be used in the filter condition later. Then the XMLReader is created and the XML file is opened with it:

$reader = new XMLReader();
$reader->open($xmlFile);

The next step creates an iterator over the <user> elements of that reader:

$users = new XMLElementIterator($reader, 'user');

These are then filtered by the id attribute values that were stored in the array earlier:

$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

All that is left is to iterate with foreach, now that all conditions have been formulated:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

This returns the XML of the users with the IDs 3 and 8:

---------------
User with ID 3:
<user id="3">
        <post>
            <message>message</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>
---------------
User with ID 8:
<user id="8">
        <post>
            <message>message 8.1</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.2</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.3</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>

The XMLReaderNode, which is part of the XMLReader iterators (https://gist.github.com/hakre/5147685), also provides a SimpleXMLElement (docs: http://php.net/SimpleXMLElement) in case you want to easily read values inside the <user> element.

The following example shows how to get the count of <post> elements inside the <user> element:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
    echo "Number of posts: ", $user->asSimpleXML()->post->count(), "\n";
}

This then displays Number of posts: 1 for user ID 3 and Number of posts: 3 for user ID 8.
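If the iterator library is not available, a similar count can be obtained with built-in functions only. The following is a rough sketch under assumptions: the sample XML and the helper name countPosts() are made up for illustration, and simplexml_load_string() stands in for the asSimpleXML() method used above.

```php
<?php
// Hypothetical sample data in the shape of the question's document.
$xml = <<<XML
<posts>
<user id="1222334">
  <post><message>hello</message><client>client</client><time>time</time></post>
  <post><message>hello client how can I help?</message><client>operator</client><time>time</time></post>
</user>
<user id="2333343">
  <post><message>good morning</message><client>client</client><time>time</time></post>
</user>
</posts>
XML;

// Hypothetical helper: counts the <post> children of the <user> with the
// given id, or returns null when that user is not in the document.
function countPosts(string $xml, string $id): ?int
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $count = null;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === 'user'
            && $reader->getAttribute('id') === $id
        ) {
            // Only this one <user> element is loaded into SimpleXML.
            $user  = simplexml_load_string($reader->readOuterXml());
            $count = $user->post->count();
            break;
        }
    }
    $reader->close();

    return $count;
}

echo "Number of posts: ", countPosts($xml, '1222334'), "\n";
```

Only the matching element is held in memory as a SimpleXMLElement, so this stays cheap as long as a single <user> element is reasonably small.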

However, if the outer XML is large, you do not want to do that; instead you want to continue iterating inside that element:

// rewind
$reader->open($xmlFile);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    foreach ($user->getChildElements('post') as $index => $post) {
        printf(" * #%d: %s\n", ++$index, $post->getChildElements('message'));
    }
    echo "Number of posts: ", $index, "\n";
}

This produces the following output:

---------------
User with ID 3:
 * #1: message 3
Number of posts: 1
---------------
User with ID 8:
 * #1: message 8.1
 * #2: message 8.2
 * #3: message 8.3
Number of posts: 3
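The nested iteration shown above relies on the iterator library from the gist. As a comparison, here is a minimal sketch of the same idea with the built-in XMLReader only, where a flag records whether the cursor is inside the wanted <user>. The sample XML and the helper name collectMessages() are assumptions for illustration.

```php
<?php
// Hypothetical sample data in the shape of the question's document.
$xml = <<<XML
<posts>
<user id="1222334">
  <post><message>hello</message><client>client</client><time>time</time></post>
</user>
<user id="2333343">
  <post><message>good morning</message><client>client</client><time>time</time></post>
  <post><message>good morning how can I help?</message><client>operator</client><time>time</time></post>
</user>
</posts>
XML;

// Hypothetical helper: collects the <message> texts of one user while
// streaming, without loading the whole <user> element at once.
function collectMessages(string $xml, string $id): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $messages = [];
    $inUser   = false; // true while the cursor is inside the wanted <user>

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            if ($reader->name === 'user') {
                $inUser = ($reader->getAttribute('id') === $id);
            } elseif ($inUser && $reader->name === 'message') {
                // readString() returns the text content of the current element.
                $messages[] = $reader->readString();
            }
        } elseif ($inUser
            && $reader->nodeType === XMLReader::END_ELEMENT
            && $reader->name === 'user'
        ) {
            break; // the wanted user is complete, no need to read further
        }
    }
    $reader->close();

    return $messages;
}

foreach (collectMessages($xml, '2333343') as $index => $message) {
    printf(" * #%d: %s\n", $index + 1, $message);
}
```

The flag is reset on every <user> start tag, so posts of other users are read past but never collected.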

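This example shows: depending on how large the nested children are, you can traverse further with the iterators available via getChildElements(), or you can process that subset of the XML with a common XML parser such as SimpleXML or even DOMDocument.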
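If you prefer to stay with the Expat parser from the question, the same filtering could be sketched with a state flag: the condition in the start handler only selects the node, so the other handlers have to consult a flag before they print anything. This is a sketch under assumptions, not part of the original answer; the helper name renderUser(), the inline sample XML and the plain-text output format are made up for illustration.

```php
<?php
// Hypothetical sample data in the shape of the question's document.
$xml = <<<XML
<posts>
<user id="1222334">
  <post><message>hello</message><client>client</client><time>time</time></post>
</user>
<user id="2333343">
  <post><message>good morning</message><client>client</client><time>time</time></post>
</user>
</posts>
XML;

// Hypothetical helper: returns the formatted posts of one user only.
function renderUser(string $xml, string $wantedId): string
{
    $labels = ['MESSAGE' => 'Message: ', 'CLIENT' => 'Name: ', 'TIME' => 'Time: '];
    $out    = '';
    $inUser = false; // true while inside the wanted <user>
    $field  = null;  // name of the labelled element currently open, if any

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$out, &$inUser, &$field, $labels, $wantedId) {
            if ($name === 'USER') {
                // Names are folded to upper case by default, hence "USER" and "ID".
                $inUser = isset($attrs['ID']) && $attrs['ID'] === $wantedId;
                if ($inUser) {
                    $out .= "-- User --\n";
                }
            } elseif ($inUser && isset($labels[$name])) {
                $field = $name;
                $out  .= $labels[$name];
            }
        },
        function ($parser, $name) use (&$out, &$inUser, &$field) {
            if ($name === 'USER') {
                $inUser = false;
            } elseif ($name === $field) {
                $out  .= "\n";
                $field = null;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$out, &$field) {
            // Text is kept only while a labelled element of the wanted user is open.
            if ($field !== null) {
                $out .= $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML Error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $out;
}

echo renderUser($xml, '2333343');
```

Note that the parser still reads the whole document this way; the flag only suppresses output for users that do not match.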
