Recursively download a whole website
A few times I've needed to download a full website, either for offline browsing or to make a full static HTML backup/archive.
wget makes it quite simple. It usually comes preinstalled on Unix-like systems, and it's available for Windows as well.
```
wget --wait 3 --random-wait --timeout 5 --tries 0 --mirror --no-parent --adjust-extension --convert-links --page-requisites --directory-prefix=output/dir http://example.com
```
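When mirroring, wget puts everything in a directory named after the host, inside the directory given by --directory-prefix. For a hypothetical run against http://example.com, the result looks roughly like this:

```
output/dir/
└── example.com/
    ├── index.html
    ├── about.html
    ├── css/
    │   └── style.css
    └── images/
        └── logo.png
```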
Explanation of options
| Option | Explanation |
|---|---|
| --wait 3 | Wait 3 seconds between each request to not punch the server too badly |
| --random-wait | Randomly vary the wait time to prevent potential identification and blocking by statistical analysis |
| --timeout 5 | Give up on a stalled connection after 5 seconds |
| --tries 0 | Retry infinitely (0 means unlimited retries) |
| --mirror | Enables several options suitable for mirroring, like recursion and time-stamping |
| --no-parent | Don't traverse upwards to parent directories when mirroring |
| --adjust-extension | Adds proper file extensions, like .html, to downloaded files that lack them |
| --convert-links | Converts links in downloaded files to work better locally |
| --page-requisites | Download things like images, stylesheets, etc. required by pages |
| --directory-prefix=output/dir | The directory to dump all the downloaded files in |
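If you mirror sites regularly, that's a lot of flags to retype. Here's a minimal wrapper script sketch using the same options; the script name mirror.sh is just a placeholder, and you supply the URL and output directory as arguments:

```sh
#!/bin/sh
# mirror.sh - recursively download a site for offline browsing.
# Usage: ./mirror.sh http://example.com output/dir
set -eu

url="${1:?usage: $0 URL OUTPUT_DIR}"
outdir="${2:?usage: $0 URL OUTPUT_DIR}"

# Same flags as above: be polite (wait/random-wait), keep retrying
# (timeout/tries 0), and mirror with links adjusted for local browsing.
wget --wait 3 --random-wait \
     --timeout 5 --tries 0 \
     --mirror --no-parent \
     --adjust-extension --convert-links --page-requisites \
     --directory-prefix="$outdir" \
     "$url"
```

Because --mirror enables time-stamping, re-running the script later only fetches files that changed since the previous run.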