What are some essential PHP scripts for a webcrawler?

MafioSol November 9, 2011

I’m creating a web crawler in PHP to extract URLs from a web page. Building on this, what scripts do I need in order to:

Take a ‘root’ URL embedded in the code,
Check whether robots.txt exists at the root,
If robots.txt exists, read the file and store its rules for later use.
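
The three steps above could be sketched roughly as follows. This is only a minimal outline under some assumptions, not a complete crawler: ROOT_URL and the function names robotsUrl, fetchRobots, parseRobots and isAllowed are made up for illustration, the parser only understands User-agent and Disallow lines, and fetching over HTTP with file_get_contents assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical root URL embedded in the code; replace with the site to crawl.
const ROOT_URL = 'http://www.example.com';

// Build the robots.txt URL from the scheme, host and optional port of a URL.
function robotsUrl(string $rootUrl): string
{
    $parts  = parse_url($rootUrl);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    return $scheme . '://' . $host . $port . '/robots.txt';
}

// Return the robots.txt body, or null if it is missing or unreachable.
function fetchRobots(string $rootUrl): ?string
{
    $body = @file_get_contents(robotsUrl($rootUrl));
    return $body === false ? null : $body;
}

// Simplified parser: collect the Disallow paths from every group that
// names "*" or $agent. Other fields (Allow, Crawl-delay, ...) are ignored.
function parseRobots(string $text, string $agent = '*'): array
{
    $disallowed   = [];
    $applies      = false; // does the current group apply to us?
    $inAgentLines = false; // consecutive User-agent lines share one group
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // drop comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            if (!$inAgentLines) {
                $applies = false; // a new group starts here
            }
            $inAgentLines = true;
            if ($value === '*' || strcasecmp($value, $agent) === 0) {
                $applies = true;
            }
        } else {
            $inAgentLines = false;
            if ($field === 'disallow' && $applies && $value !== '') {
                $disallowed[] = $value;
            }
        }
    }
    return $disallowed;
}

// Check a URL path against the stored rules using simple prefix matching.
function isAllowed(string $path, array $disallowed): bool
{
    foreach ($disallowed as $prefix) {
        if (strpos($path, $prefix) === 0) {
            return false;
        }
    }
    return true;
}
```

To use it, call fetchRobots(ROOT_URL); if the result is not null, pass it to parseRobots() and keep the returned array, then call isAllowed($path, $rules) before requesting each page. Keeping the download separate from the parsing means the rules can be stored and reused without fetching robots.txt again.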

 

 


  1. James Bruce
    November 10, 2011 at 10:24 am

    Hi Mafio - I'm not really sure what you're asking for. The pseudocode you have there is certainly a good start; now you just need to code it. Read the links Jay posted, then ask again with something more specific.

  2. Jay
    November 10, 2011 at 5:43 am

    Maybe James Bruce's articles, and James Bruce himself, can help you!

    Read his articles:

    http://www.makeuseof.com/tag/build-basic-web-crawler-pull-information-website
    http://www.makeuseof.com/tag/build-webcrawler-part-2