Using wget to Download an Entire Website
Note: Don't expect wget to gain access to "back-end" components (server-side code, databases, and the like), which are required for the full functionality of the Website. The --mirror option only captures the visible, "front-end" components.
Here's how it's done (https://example.com/ serves as a placeholder for the target site):
$ wget --mirror --convert-links -p --no-parent -w 2 -P ./example-mirror https://example.com/
Explanation of the options used above:
--mirror (or: '-m') : The main option, which achieves the task of backing up the Website; it turns on recursive downloading with infinite depth, plus time-stamping.
--convert-links (or: '-k') : Rewrite the links in the downloaded pages so that they point to the local copies and work offline.
-p (or: '--page-requisites') : Download all the files required to properly display each page (images, stylesheets, scripts), so that the Website's functionality is preserved as much as possible.
--no-parent : Only files below the specified location will be downloaded; the parent directory won't be accessed during the download process.
-w (or: '--wait=SECONDS') : Wait the given number of seconds between requests (2 in the command above), so as not to put too much load on the server.
-P (or: '--directory-prefix=PREFIX') : Save everything under the given directory ('./example-mirror' in the command above) instead of the current one.
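The same command can also be written with the long-form options spelled out, which some may find easier to read in scripts (same placeholders as before):
$ wget --mirror --convert-links --page-requisites --no-parent --wait=2 --directory-prefix=./example-mirror https://example.com/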
It's also possible to exclude certain file types & directories from the download, and to prevent access to external domains, by adding the following options (see the example after the list):
--reject=LIST : Comma-separated list of file-name suffixes or patterns that should not be downloaded (e.g. 'zip,iso').
--exclude-directories=LIST : Comma-separated list of directories to skip during the download (e.g. '/forum,/ads').
--domains=LIST : Comma-separated list of domains that may be followed; links pointing outside of them won't be downloaded.
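For instance, a run that skips archive files, leaves out a couple of directories, and stays on the main domain (again with example.com as a stand-in for the real site, and the excluded directories chosen purely for illustration) could look like this:
$ wget --mirror --convert-links -p --no-parent -w 2 --reject=zip,iso --exclude-directories=/forum,/ads --domains=example.com https://example.com/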
Note: The mirroring achieved with wget isn't always perfect, and there are cases in which certain aspects of the Website's functionality won't be fully available offline. If the results aren't satisfactory, a possible alternative is the tool HTTrack Website Copier.
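As a rough sketch (the URL and the output directory are placeholders, not tested against any specific site), a basic HTTrack run from the command line looks like this:
$ httrack https://example.com/ -O ./example-mirror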