
I have a list of links on a website that I wish to check with a PHP web crawler program. Some of these links may be broken. Rather than attempting to follow a broken link and getting, say, a 404 error message, I would like to skip over the link before trying to load it.

I am using the features of simple_html_dom.php in my web crawler, so I would like to detect a broken link before I call $html->load_file($link);

How can I do this?
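One common approach (a sketch, not taken from simple_html_dom.php itself) is to request just the response headers with PHP's built-in get_headers() and inspect the status code before calling $html->load_file(). The $links array and the "anything below 400 is usable" policy here are illustrative assumptions:

```php
<?php
// Extract the numeric status code from a status line like "HTTP/1.1 404 Not Found".
function status_code_from_line($statusLine) {
    if (preg_match('{HTTP/\S+\s+(\d{3})}', $statusLine, $m)) {
        return (int) $m[1];
    }
    return 0; // unparseable status line
}

// Returns true when the server answers with a 2xx or 3xx status.
// Note: get_headers() follows redirects by default, so a 3xx chain
// still ends at a loadable page; treat those as usable here.
function link_is_ok($url) {
    $headers = @get_headers($url);   // false on DNS/connection failure
    if ($headers === false) {
        return false;
    }
    $code = status_code_from_line($headers[0]);
    return $code >= 200 && $code < 400;
}

// Usage sketch with a hypothetical $links array:
// foreach ($links as $link) {
//     if (link_is_ok($link)) {
//         $html->load_file($link);
//     }
// }
```

This costs one extra HTTP round trip per link, but it avoids pulling down (and parsing) the body of a page that is going to fail anyway.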

Jerry Yurow
2011-12-18 22:02:00
Jeff, thanks for your answer. It turns out that, this time, I do not have to worry so much about broken links. Rather, instead of seeing a warning message like the one below on my output screen, I would like it to go into a file in a sub-directory of my own choosing on my website. I have tried the PHP statements:

ini_set('error_log','www.yurowdesigns.com/programs/error.log');
ini_set('log_errors',TRUE);

in my PHP script, which are supposed to do this, but they do not seem to do anything. My error.log file remains empty and I am still getting warning messages like the one below on my output screen. I do not want to suppress these messages (they are useful), just re-route them to my error.log file.

Warning: "file_get_contents(http://www.yurowdesigns.com/UkraineSIG/test.asp) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /home/yurow/wwwroot/yurowdesigns.com/programs/simple_html_dom.php on line 555"

Any ideas?
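A likely cause, sketched below: the 'error_log' ini setting expects a filesystem path, not a URL, so 'www.yurowdesigns.com/programs/error.log' is treated as a relative path that probably is not writable from the script's working directory. The absolute path used here is an assumption inferred from the path shown in the warning message above; adjust it to your own account:

```php
<?php
// Re-route PHP warnings to a log file instead of the output screen.
// Assumed server path, based on the directory visible in the warning above.
$logFile = '/home/yurow/wwwroot/yurowdesigns.com/programs/error.log';

ini_set('log_errors', '1');        // enable logging to a file
ini_set('display_errors', '0');    // keep warnings off the output screen
ini_set('error_log', $logFile);    // must be a filesystem path, not a URL
```

With display_errors off and log_errors on, the warnings are not suppressed, only redirected: each one is appended to $logFile instead of being printed into the page output. The web server user also needs write permission on that file.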
Jeff Fabish
2011-12-20 23:55:00
Hi Jerry,

Can you post the source code to PasteBin so I can analyze it?

Thanks,
- Jeff