Recursively download a whole website

A few times I've needed to download a full website, for example for offline browsing or to make a static HTML backup/archive.

Using wget makes this quite simple. It comes pre-installed on most Unix systems, and it’s available for Windows as well.
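
If you're not sure whether wget is installed, printing its version is usually enough to find out:

```sh
# Print the installed wget version (first line only)
wget --version | head -n 1
```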

```sh
wget --wait 3 --random-wait --timeout 5 --tries 0 \
     --mirror --no-parent --adjust-extension --convert-links \
     --page-requisites --directory-prefix=output/dir \
     http://example.com
```

Explanation of options

| Option | Explanation |
| --- | --- |
| `--wait 3` | Wait 3 seconds between each request to avoid hitting the server too hard |
| `--random-wait` | Randomly vary the wait time to prevent potential identification and blocking by statistical analysis |
| `--timeout 5` | Give up on a request after 5 seconds |
| `--tries 0` | Retry infinitely |
| `--mirror` | Enables several options suitable for mirroring, like recursion and time-stamping |
| `--no-parent` | Don't traverse upwards to parent directories when mirroring |
| `--adjust-extension` | Converts file extensions like `.php` and `.cgi` to `.html` |
| `--convert-links` | Converts links in downloaded files to work better locally |
| `--page-requisites` | Download things like images, stylesheets, etc. required by pages |
| `--directory-prefix=output/dir` | Which directory to dump all the downloaded files in |
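
For repeated use, the command can be wrapped in a small shell script. This is just a sketch: the script name `mirror.sh`, the positional arguments, and the fallback to `output/dir` are my own choices, not part of the original command.

```sh
#!/bin/sh
# mirror.sh -- hypothetical wrapper around the wget invocation above.
# Usage: mirror.sh <url> [output-dir]
URL="${1:?usage: mirror.sh <url> [output-dir]}"
DIR="${2:-output/dir}"   # falls back to the directory used above

wget --wait 3 --random-wait --timeout 5 --tries 0 \
     --mirror --no-parent --adjust-extension --convert-links \
     --page-requisites --directory-prefix="$DIR" "$URL"
```

Running `./mirror.sh http://example.com my-archive` mirrors the site into `my-archive/`. Re-running the same command later only fetches files that have changed, since `--mirror` enables time-stamping.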