Even though WiFi is everywhere these days, you will find yourself without it from time to time. And when you do, there may be certain websites you wish you could access while offline — perhaps for research, entertainment, or posterity.
For example, if you’re about to embark on a 12-hour international flight, downloading an entire website can be a great alternative to ebooks, audiobooks, podcasts, and movies. Good, information-dense websites for this include Michael Bluejay’s guides and Wait But Why.
But how do you go about downloading an entire website? It’s easier than you think! Here are four tools that can do the job for you with almost no effort.
1. WebCopy (Windows)
WebCopy by Cyotek takes a website URL and scans it for links, pages, and media. As it finds pages, it recursively looks for more links, pages, and media until the whole website is discovered. Then you can use the configuration options to decide which parts to download offline.
The interesting thing about WebCopy is that you can set up multiple “projects” that each have their own settings and configurations. This makes it easy to re-download many different sites whenever you want, each one in exactly the same way every time. One project can copy many websites, so organize them with a plan (e.g. a “Tech” project for copying tech sites).
To download a website with WebCopy:
- Install and launch the app.
- Navigate to File > New to create a new project.
- Type the URL into the Website field.
- Change the Save folder field to where you want the site saved.
- Play around with Project > Rules… to control which parts of the site get copied.
- Navigate to File > Save As… to save the project.
- Click Copy Website in the toolbar to start the process.
Once the copying is done, you can use the Results tab to see the status of each individual page and/or media file. The Errors tab shows any problems that may have occurred and the Skipped tab shows files that weren’t downloaded. But most important is the Sitemap, which shows the full directory structure of the website as discovered by WebCopy.
To view the website offline, open File Explorer and navigate to the save folder you designated. Open the index.html (or sometimes index.htm) in your browser of choice to start browsing.
2. HTTrack (Windows, Linux, Android)
HTTrack is better known than WebCopy, and is arguably better because it’s open source and available on platforms other than Windows. The interface is a bit clunky and leaves much to be desired, but it works well, so don’t let that turn you away.
Like WebCopy, it uses a project-based approach that lets you copy multiple websites and keep them all organized. You can pause and resume downloads, and you can update copied websites by re-downloading old and new files.
To download a website with HTTrack:
- Install and launch the app.
- Click Next to begin creating a new project.
- Give the project a name, category, and base path, then click Next.
- Select Download web site(s) for Action, then type each website’s URL in the Web Addresses box, one URL per line. You can also store URLs in a TXT file and import it, which is convenient when you want to re-download the same sites later. Click Next.
- Adjust parameters if you want, then click Finish.
Once everything is downloaded, you can browse the site like normal by going to where the files were downloaded and opening the index.html or index.htm in a browser.
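If the download folder contains many subfolders, a small shell function can help locate the entry page. This is only a sketch: the function name find_entry_page is made up for this example, and it assumes a Unix-style shell with find, awk, sort, head, and cut available (as on Linux or Mac). It prints the index.html or index.htm file with the shortest path, which is usually the one closest to the top of the copied site.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTTrack or WebCopy): print the
# index.html or index.htm with the shortest path under a folder.
find_entry_page() {
    save_dir="$1"
    # Prefix each candidate with its path length, sort numerically,
    # keep the shortest one, then strip the length prefix again.
    find "$save_dir" -type f \( -name 'index.html' -o -name 'index.htm' \) |
        awk '{ print length($0) "\t" $0 }' |
        sort -n |
        head -n 1 |
        cut -f 2-
}
```

For example, find_entry_page ~/websites would print the path to open in your browser, where ~/websites stands in for whatever base path you chose.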
3. SiteSucker (Mac, iOS)
If you’re on a Mac, your best option is SiteSucker. This simple tool rips entire websites, maintains the same overall structure, and includes all relevant media files too (e.g. images, PDFs, style sheets). It has a clean interface that could not be easier to use: you literally paste in the website URL and press Enter.
One nifty feature is the ability to save the download to a file, then use that file to download the same exact files and structure again in the future (or on another machine). This feature is also what allows SiteSucker to pause and resume downloads.
SiteSucker costs $5 and does not come with a free version or a free trial, which may be its biggest downside. The latest version requires macOS 10.11 El Capitan or later. Older versions of SiteSucker are available for older Mac systems, but some features may be missing.
4. Wget (Windows, Mac, Linux)
Wget is a command-line utility that can retrieve all kinds of files over the HTTP and FTP protocols. Since websites are served through HTTP and most web media files are accessible through HTTP or FTP, this makes Wget an excellent tool for ripping websites.
While Wget is typically used to download single files, it can be used to recursively download all pages and files that are found through an initial page:
wget -r -p https://www.makeuseof.com
However, some sites may detect and prevent what you’re trying to do because ripping a website can cost them a lot of bandwidth. To get around this, you can disguise yourself as a web browser with a user agent string:
wget -r -p -U Mozilla https://www.makeuseof.com
If you want to be polite, you should also limit your download speed (so you don’t hog the web server’s bandwidth) and pause between each download (so you don’t overwhelm the web server with too many requests):
wget -r -p -U Mozilla --wait=10 --limit-rate=35K https://www.makeuseof.com
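If you plan to reuse these polite settings, one option is to wrap them in a small shell function. The sketch below is an illustration, not an official recipe: mirror_site and DRY_RUN are names invented for this example, and the --convert-links and --no-parent options are additions that the commands above do not use (they ask Wget to rewrite links for offline viewing and to stay below the starting URL). By default the function only prints the command, so nothing is downloaded by accident.

```shell
#!/bin/sh
# Sketch: wrap the polite Wget options in a reusable function.
# mirror_site and DRY_RUN are hypothetical names for this example.
# --convert-links and --no-parent are extras not used in the article.
mirror_site() {
    url="$1"
    # Build the full command as the function's argument list.
    set -- wget \
        --recursive --page-requisites \
        --convert-links --no-parent \
        --user-agent=Mozilla \
        --wait=10 --limit-rate=35K \
        "$url"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}
```

Calling mirror_site with a URL shows the command that would run. Setting DRY_RUN=0 in the environment first makes the function actually start the download.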
Wget comes bundled with most Unix-based systems. On Mac, you can install Wget using a single Homebrew command: brew install wget. On Windows, you’ll need to use a ported version of Wget instead.
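Not sure whether Wget is already on your system? A quick check along these lines should tell you. The helper name have_tool is invented for this example, and the snippet assumes a POSIX-style shell.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a command-line tool is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool wget; then
    echo "wget is installed"
else
    echo "wget not found; on a Mac, 'brew install wget' should add it"
fi
```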
Which Websites Do You Want to Download?
The bigger the site, the bigger the download. We don’t recommend downloading huge sites like MakeUseOf because you’ll need thousands of megabytes to store all of the media files we use. The same is true for any other site that’s frequently updated and heavy on media.
The best sites to download are those with lots of text and not many images, and sites that don’t regularly add or change pages. Static information sites, online ebook sites, and sites you want to archive in case they go down are ideal.
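To see how much space a finished download actually takes, you can ask the du utility for the folder’s total size. The function below is a rough sketch: mirror_size_mb is a made-up name, and it assumes a Unix-style shell where du -sk reports kilobytes.

```shell
#!/bin/sh
# Hypothetical helper: print a folder's total size in whole megabytes.
mirror_size_mb() {
    # du -sk prints the total in kilobytes, followed by the path.
    kb=$(du -sk "$1" | cut -f 1)
    # Integer division, so the result is rounded down.
    echo $((kb / 1024))
}
```

Running mirror_size_mb on your download folder prints a single number, which makes it easy to spot a site that has grown larger than expected.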
Which sites are you trying to download? Are there any other tools for copying websites that we missed? Share them with us down in the comments below!
Originally written by Justin Pot on April 20, 2013
Image Credit: RawPixel.com via Shutterstock