As researchers, we often need to crawl web pages and save a dump to local disk for future use. Sometimes, however, we have to connect to a specific website through a proxy, for security or performance reasons. There are two ways to set this up. The first method:
- Create a ~/.wgetrc file: this file serves as the configuration file for the wget command.
http_proxy=<your http proxy>
Reference: wget manual
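A fuller ~/.wgetrc might look like the sketch below; the proxy address is hypothetical, so replace it with your own:

```
# ~/.wgetrc — proxy settings for wget
# proxy.example.com:3128 is a placeholder address
use_proxy = on
http_proxy = http://proxy.example.com:3128
https_proxy = http://proxy.example.com:3128
```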
- Add an environment variable to your ~/.bashrc:
export http_proxy="...url..."
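In full, the ~/.bashrc approach might look like this sketch (again with a hypothetical proxy address); wget picks up these variables automatically, and you can also override them per invocation with the -e flag:

```shell
# Hypothetical proxy address; replace with your own.
export http_proxy="http://proxy.example.com:3128"
export https_proxy="$http_proxy"

# wget now routes traffic through the proxy, e.g.:
#   wget https://example.com/page.html
# Or override per invocation without touching ~/.bashrc:
#   wget -e use_proxy=yes -e http_proxy=proxy.example.com:3128 https://example.com/
echo "$http_proxy"
```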
Either way works well, but what about connecting through a SOCKS proxy?
Sorry... I have not found a solution for that yet.