[svlug] wget: disable robots.txt compliance?

Jeffrey Siegal jbs at quiotix.com
Sat Oct 27 01:46:01 PDT 2001


Karsten M. Self wrote:

> I'm trying to grab a site using wget, but its respect for the robots.txt
> standard is unfortunately preventing this.  Any convenient way to bypass
> this, or an alternate tool for the job?

The man page says that "robots = off" in wget's config file (.wgetrc)
should do the trick.
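
Something like the following, as a rough sketch; I'm assuming the
per-user startup file lives at ~/.wgetrc, so adjust if yours is
elsewhere:

    # ~/.wgetrc (assumed location of the per-user config file)
    # Tell wget not to honor robots.txt when retrieving recursively.
    robots = off

If you'd rather not change the setting permanently, the -e option
should let you pass the same wgetrc command for a single run. The URL
here is only a placeholder, not the site in question:

    wget -e robots=off -r http://www.example.com/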

Personally, I think it should default to that. A human using the wget
command is not a "robot".






