<?php
$html = <<<HTML
税务调查。
[caption id="attachment_111" align="aligncenter" width="100"]<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登与儿子。" width="100" height="100" class="size-full wp-image" /> 拜登与儿子。[/caption]
他在声明中说:“我会非常认真地调查,往来。”
<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登总统" width="100" height="100" class="aligncenter size-full wp-image" />
<div style="position:relative; overflow:hidden"> <iframe src="https://cdn.google.com/players/VM.html" width="100" height="100" frameborder="0" scrolling="auto" title="大促销 拜登的美国" style="position:absolute;"></iframe> </div>
<iframe style="border: none; overflow: hidden;" src="https://www.facebook.com/plugins/video.php?height=100&href=https%3A%2F%2Fwww.facebook.com;width=100&t=0" width="100" height="100" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
<iframe src="https://www.facebook.com/plugins/video.php?height=400&href=https%3A%2F%2&show_text=false&width=100&t=0" width="100" height="100" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowfullscreen="true" allow="autoplay; clipboard-write; encrypted-media; picture-in-picture; web-share" allowFullScreen="true"></iframe>
<b>更多热点</b>
<p>halo拜登也指美国经济不会衰退</p>
<figure id="attachment_279" style="width: 100px" class="wp-caption alignnone"><img class="size-full wp-imag" src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="修理厂商总会拜登城" width="100" height="100" /><figcaption class="wp-caption-text">修理厂商总会拜登城</figcaption></figure>
<a href="http://google.com">go to google</a>
<span style="color: #ff6600;"><strong>另外,拜登声明中说</strong></span>
HTML;
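// Parse the fragment leniently: libxml warnings about the non-standard markup are collected instead of printed.
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->preserveWhiteSpace = true;
// Converting to 'HTML-ENTITIES' keeps the UTF-8 text intact for loadHTML(), but this target is deprecated as of PHP 8.2.
// The two flags suppress the implied <html>/<body> elements and the default doctype.
$dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED );
$xpath = new DOMXPath($dom);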
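// Keywords to wrap in a tag link.
$tags = ["拜登", "认真"];
// Match elements whose text contains any keyword.
// In XPath 1.0, contains(text(), ...) only inspects the first text() child of each element.
$orContains = '(' . implode(
    ' or ',
    array_map(
        fn($str) => "contains(text(), '$str')",
        $tags
    )
) . ')';
// Exclude elements that are, or contain, any of these tags.
$blacklisted = implode(
    ' or ',
    array_map(
        fn($tag) => "descendant-or-self::$tag",
        ['a', 'img', 'iframe', 'figure', 'figcaption']
    )
);
$path = "//*[not($blacklisted) and $orContains]";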
//echo $path . "\n\n\n---\n";
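foreach ($xpath->query($path) as $node) {
    // (*SKIP)(*FAIL) discards anything inside a [caption]...[/caption] shortcode; every other keyword hit is wrapped.
    // preg_quote() guards against keywords that contain regex metacharacters.
    $newText = preg_replace(
        '~\[caption .+?\[/caption](*SKIP)(*FAIL)|' . implode('|', array_map(fn($tag) => preg_quote($tag, '~'), $tags)) . '~us',
        '<span class="article-tag"><a class="mytag" href="http://outside.com">$0</a></span>',
        $dom->saveXML($node),
        -1,
        $count
    );
    if ($count) {
        // Re-parse the rewritten markup and swap it in for the original element.
        $replacement = $dom->createDocumentFragment();
        $replacement->appendXML($newText);
        $node->parentNode->replaceChild($replacement, $node);
    }
}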
// Debug pass: dump each text node that is a direct child of the root <p> wrapping the fragment.
foreach ($xpath->query("/p/text()") as $node) {
    var_export($dom->saveXML($node));
    echo "\n====\n";
/*$newText = preg_replace(
'~(?:\[caption .+?\[/caption]|<span class="article-tag"><a class="mytag" href="http://outside.com">.*?</a></span>)(*SKIP)(*FAIL)|' . implode('|', $tags) . '~us',
'<span class="article-tag"><a class="mytag" href="http://outside.com">$0</a></span>',
$dom->saveXML($node),
-1,
$count
);
if ($count) {
$replacement = $dom->createDocumentFragment();
$replacement->appendXML($newText);
$node->parentNode->replaceChild($replacement, $node);
}
*/
}
// Serialize the root <p>, strip its "<p>" and "</p>" wrapper, and decode entities for display.
echo html_entity_decode(mb_substr($dom->saveXML($dom->documentElement), 3, -4));
Output for 8.2.0 - 8.2.18, 8.3.0 - 8.3.6
Deprecated: mb_convert_encoding(): Handling HTML entities via mbstring is deprecated; use htmlspecialchars, htmlentities, or mb_encode_numericentity/mb_decode_numericentity instead in /in/HgdAW on line 32
'税务调查。
[caption id="attachment_111" align="aligncenter" width="100"]'
====
' 拜登与儿子。[/caption]
他在声明中说:“我会非常认真地调查,往来。”
'
====
'
'
====
税务调查。
[caption id="attachment_111" align="aligncenter" width="100"]<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登与儿子。" width="100" height="100" class="size-full wp-image"/> 拜登与儿子。[/caption]
他在声明中说:“我会非常认真地调查,往来。”
<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登总统" width="100" height="100" class="aligncenter size-full wp-image"/>
<div style="position:relative; overflow:hidden"> <iframe src="https://cdn.google.com/players/VM.html" width="100" height="100" frameborder="0" scrolling="auto" title="大促销 拜登的美国" style="position:absolute;"/> </div><iframe style="border: none; overflow: hidden;" src="https://www.facebook.com/plugins/video.php?height=100&href=https%3A%2F%2Fwww.facebook.com;width=100&t=0" width="100" height="100" frameborder="0" allowfullscreen="allowfullscreen"/><iframe src="https://www.facebook.com/plugins/video.php?height=400&href=https%3A%2F%2&show_text=false&width=100&t=0" width="100" height="100" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowfullscreen="true" allow="autoplay; clipboard-write; encrypted-media; picture-in-picture; web-share"/><b>更多热点</b><p>halo<span class="article-tag"><a class="mytag" href="http://outside.com">拜登</a></span>也指美国经济不会衰退</p><figure id="attachment_279" style="width: 100px" class="wp-caption alignnone"><img class="size-full wp-imag" src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="修理厂商总会拜登城" width="100" height="100"/><figcaption class="wp-caption-text">修理厂商总会拜登城</figcaption></figure><a href="http://google.com">go to google</a><span style="color: #ff6600;"><strong>另外,<span class="article-tag"><a class="mytag" href="http://outside.com">拜登</a></span>声明中说</strong></span>
Output for 7.4.0 - 7.4.33, 8.0.1 - 8.0.30, 8.1.0 - 8.1.28
'税务调查。
[caption id="attachment_111" align="aligncenter" width="100"]'
====
' 拜登与儿子。[/caption]
他在声明中说:“我会非常认真地调查,往来。”
'
====
'
'
====
税务调查。
[caption id="attachment_111" align="aligncenter" width="100"]<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登与儿子。" width="100" height="100" class="size-full wp-image"/> 拜登与儿子。[/caption]
他在声明中说:“我会非常认真地调查,往来。”
<img src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="拜登总统" width="100" height="100" class="aligncenter size-full wp-image"/>
<div style="position:relative; overflow:hidden"> <iframe src="https://cdn.google.com/players/VM.html" width="100" height="100" frameborder="0" scrolling="auto" title="大促销 拜登的美国" style="position:absolute;"/> </div><iframe style="border: none; overflow: hidden;" src="https://www.facebook.com/plugins/video.php?height=100&href=https%3A%2F%2Fwww.facebook.com;width=100&t=0" width="100" height="100" frameborder="0" allowfullscreen="allowfullscreen"/><iframe src="https://www.facebook.com/plugins/video.php?height=400&href=https%3A%2F%2&show_text=false&width=100&t=0" width="100" height="100" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowfullscreen="true" allow="autoplay; clipboard-write; encrypted-media; picture-in-picture; web-share"/><b>更多热点</b><p>halo<span class="article-tag"><a class="mytag" href="http://outside.com">拜登</a></span>也指美国经济不会衰退</p><figure id="attachment_279" style="width: 100px" class="wp-caption alignnone"><img class="size-full wp-imag" src="https://royaldesign.com/image/11/gubi-moon-dining-table-round-120-h73-3?w=168&quality=80" alt="修理厂商总会拜登城" width="100" height="100"/><figcaption class="wp-caption-text">修理厂商总会拜登城</figcaption></figure><a href="http://google.com">go to google</a><span style="color: #ff6600;"><strong>另外,<span class="article-tag"><a class="mytag" href="http://outside.com">拜登</a></span>声明中说</strong></span>
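
The deprecation notice in the 8.2+ output suggests mb_encode_numericentity as a replacement for the 'HTML-ENTITIES' conversion. Below is a minimal sketch of that swap, assuming the same loadHTML() flags as the script above. The helper name loadUtf8Fragment is hypothetical and the sample input is a single paragraph taken from the test data; this is not a drop-in tested against the full fragment.

```php
<?php
// Hypothetical helper (not part of the original script): load a UTF-8 HTML
// fragment without the deprecated 'HTML-ENTITIES' conversion.
function loadUtf8Fragment(string $html): DOMDocument
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    // Encode every code point from 0x80 upward as a numeric entity;
    // ASCII markup is left untouched, so tags and attributes still parse.
    $encoded = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
    $dom->loadHTML($encoded, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
    libxml_clear_errors();
    return $dom;
}

$dom = loadUtf8Fragment('<p>halo拜登也指美国经济不会衰退</p>');
echo $dom->documentElement->textContent, "\n";
```

The conversion map has the form [start, end, offset, mask], so only characters outside ASCII are rewritten and libxml decodes them back to UTF-8 text while parsing.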