Thursday, August 13, 2009

wget with proxy

As researchers, we often need to crawl web pages and save a dump on local disk for future use. Sometimes, however, we have to go through a proxy to reach a specific website, for security or performance reasons. There are two ways to set this up:


  • Create a ~/.wgetrc file. wget reads this file as its configuration file.
    http_proxy = <your http proxy>

    Reference: wget manual

  • Add an environment variable to your ~/.bashrc:
    export http_proxy="...url..."
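
Here is a minimal sketch of both methods side by side. The proxy address proxy.example.com:8080 is a made-up placeholder, and the config is written to a temporary file (wget is pointed at it through the WGETRC environment variable) rather than to the real ~/.wgetrc, so trying it out does not touch your existing settings.

```shell
# Hypothetical proxy address; replace it with your own.
PROXY_URL="http://proxy.example.com:8080"

# Method 1: a wgetrc file. A temporary file is used here instead of ~/.wgetrc;
# wget loads the file named by WGETRC when that variable is set.
WGETRC="$(mktemp)"
export WGETRC
printf 'use_proxy = on\nhttp_proxy = %s\n' "$PROXY_URL" > "$WGETRC"

# Method 2: the environment variable, as it would appear in ~/.bashrc.
export http_proxy="$PROXY_URL"
```

After either step, a plain wget call on an http URL should go through the proxy. Only one of the two methods is needed.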



Either way works well, but what about going through a SOCKS proxy?

Sorry, I have not found a solution for that yet.
