3v4l.org

run code in 150+ php & hhvm versions
<?php

$hrefPattern = '/<a\\s+([^>]*)href=(["\']??)([^"\'>]*?)\\2([^>]*)>(.*)<\\/a>/siU';

$html = <<<HTML
<p>If you find any cases where this code falls down, let us know using the Feedback link below.</p>
<p>Before using this or similar scripts to fetch pages from other websites, we suggest you read through the related article on <a href="/php/parse-robots/" title="foobar" target="_parent">setting a user agent and parsing robots.txt</a>.</p>
<h2>First checking robots.txt</h2>
<p>As mentioned above, before using a script to download files you should always <a target="_blank" href="/php/parse-robots/">check the robots.txt file</a>. Here we're making use of the <tt>robots_allowed</tt> function from the article linked above to determine whether we're allowed to access files:</p>
<p>As mentioned above, before using a script to download files you should always <a href="/php/parse-robots/">check the robots.txt file</a>. Here we're making use of the <tt>robots_allowed</tt> function from the article linked above to determine whether we're allowed to access files:</p>
HTML;

preg_match_all($hrefPattern, $html, $matches, PREG_OFFSET_CAPTURE);
var_dump($matches);
Output for 4.3.3 - 7.1.0
array(6) {
  [0]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(108) "<a href="/php/parse-robots/" title="foobar" target="_parent">setting a user agent and parsing robots.txt</a>"
      [1]=>
      int(228)
    }
    [1]=>
    array(2) {
      [0]=>
      string(75) "<a target="_blank" href="/php/parse-robots/">check the robots.txt file</a>"
      [1]=>
      int(462)
    }
    [2]=>
    array(2) {
      [0]=>
      string(59) "<a href="/php/parse-robots/">check the robots.txt file</a>"
      [1]=>
      int(773)
    }
  }
  [1]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(0) ""
      [1]=>
      int(231)
    }
    [1]=>
    array(2) {
      [0]=>
      string(16) "target="_blank" "
      [1]=>
      int(465)
    }
    [2]=>
    array(2) {
      [0]=>
      string(0) ""
      [1]=>
      int(776)
    }
  }
  [2]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(1) """
      [1]=>
      int(236)
    }
    [1]=>
    array(2) {
      [0]=>
      string(1) """
      [1]=>
      int(486)
    }
    [2]=>
    array(2) {
      [0]=>
      string(1) """
      [1]=>
      int(781)
    }
  }
  [3]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(18) "/php/parse-robots/"
      [1]=>
      int(237)
    }
    [1]=>
    array(2) {
      [0]=>
      string(18) "/php/parse-robots/"
      [1]=>
      int(487)
    }
    [2]=>
    array(2) {
      [0]=>
      string(18) "/php/parse-robots/"
      [1]=>
      int(782)
    }
  }
  [4]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(32) " title="foobar" target="_parent""
      [1]=>
      int(256)
    }
    [1]=>
    array(2) {
      [0]=>
      string(0) ""
      [1]=>
      int(506)
    }
    [2]=>
    array(2) {
      [0]=>
      string(0) ""
      [1]=>
      int(801)
    }
  }
  [5]=>
  array(3) {
    [0]=>
    array(2) {
      [0]=>
      string(43) "setting a user agent and parsing robots.txt"
      [1]=>
      int(289)
    }
    [1]=>
    array(2) {
      [0]=>
      string(26) "check the robots.txt file"
      [1]=>
      int(507)
    }
    [2]=>
    array(2) {
      [0]=>
      string(26) "check the robots.txt file"
      [1]=>
      int(802)
    }
  }
}
Output for 4.3.0 - 4.3.2
Warning: Wrong value for parameter 4 in call to preg_match_all() in /in/ucc3J on line 28
NULL
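
In the successful dump, the result is grouped by capture group: index 0 holds the full matches, index 3 the href values and index 5 the link text, and every entry is a pair of matched string and byte offset. As a rough sketch of how that result could be consumed, the following reuses the same pattern with PREG_SET_ORDER so that each match arrives as one row. The function name extract_links, the array keys and the sample string are assumptions made for illustration and are not part of the original snippet.

```php
<?php
// Hypothetical helper (not from the original snippet): returns one row per
// link, using the same pattern as above.
function extract_links($html)
{
    $hrefPattern = '/<a\\s+([^>]*)href=(["\']??)([^"\'>]*?)\\2([^>]*)>(.*)<\\/a>/siU';
    $links = array();

    // PREG_SET_ORDER groups the result per match instead of per capture group.
    $flags = PREG_SET_ORDER | PREG_OFFSET_CAPTURE;
    if (preg_match_all($hrefPattern, $html, $matches, $flags)) {
        foreach ($matches as $m) {
            // With PREG_OFFSET_CAPTURE each group is array(string, byte offset).
            $links[] = array(
                'href'   => $m[3][0], // group 3: the href value
                'text'   => $m[5][0], // group 5: the link text
                'offset' => $m[0][1], // offset of the whole <a> element
            );
        }
    }

    return $links;
}

// Illustrative input, shortened from the HTML used above.
$sample = '<p>See <a target="_blank" href="/php/parse-robots/">check the robots.txt file</a>.</p>';

foreach (extract_links($sample) as $link) {
    echo $link['offset'], ' ', $link['href'], ' => ', $link['text'], "\n";
}
```

Offsets are counted in bytes from the start of the input string, so they shift whenever the whitespace of the input changes.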