[svlug] wget: disable robots.txt compliance?
Jeffrey Siegal
jbs at quiotix.com
Sat Oct 27 01:46:01 PDT 2001
Karsten M. Self wrote:
> I'm trying to grab a site using wget, but its respect for the robots.txt
> standard is unfortunately preventing this. Any convenient way to bypass
> this, or an alternate tool for the job?
The man page says that "robots = off" in the config file should do the
trick.
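As a rough sketch, assuming the config file in question is the per-user
startup file ~/.wgetrc (the path and the comment lines are my additions,
not quoted from the man page), the entry would look like this:

```
# ~/.wgetrc: per-user wget startup file (assumed location)
# Tell wget not to honor robots.txt when retrieving recursively
robots = off
```

For a one-off run, the same wgetrc command can also be passed on the
command line with -e, e.g. "wget -e robots=off -r URL".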
Personally, I think it should default to that. A human using the wget
command is not a "robot".