The modern Web is a dynamic place. However, sometimes it's necessary (or desirable) to remove the dynamic functionality of a website, while preserving its static content.
Inspired in part by Karen Stevenson's excellent blog post, "Sending a Drupal Site into Retirement," I wanted to outline a few other techniques for accomplishing this.
Reasons you may want to create a static copy of a site:
- The site runs on an outdated version of dynamic web software
- The site has been hacked, but its content is still relevant
- The site's content has lost its immediacy, but may need to be revived in the future as a dynamic website
- The site was built in 2004 in ColdFusion by a vendor that has flown the coop (oops)
Method One: wget
Wget is a cross-platform command-line program for retrieving web pages. It's almost like it was built to do this.
Run the following command to crawl www.example.com and save it as flat files in a directory of your choosing (shown here as /path/to/destination/directory):
wget -P /path/to/destination/directory/ -mpck --user-agent="" -e robots=off --wait 1 -E https://www.example.com/
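The short flags in that command are dense, so here is a sketch of the same invocation with each option spelled out. The function name mirror_site and the WGET override are hypothetical conveniences for illustration, not part of wget; the long option names are the ones documented in the GNU Wget manual (on wget older than 1.12, --adjust-extension was called --html-extension).

```shell
#!/bin/sh
# Sketch: the wget command above with every short flag written in long form.
# mirror_site is a hypothetical helper name, not part of wget.
#
#   --directory-prefix (-P)   save everything under the given directory
#   --mirror (-m)             recursive download with timestamping, unlimited depth
#   --page-requisites (-p)    also fetch the images, CSS, and scripts each page needs
#   --continue (-c)           resume partially downloaded files
#   --convert-links (-k)      rewrite links so the copy can be browsed on its own
#   --user-agent=""           send no User-Agent header
#   --execute robots=off (-e) ignore robots.txt
#   --wait=1                  pause one second between requests
#   --adjust-extension (-E)   add .html to HTML pages saved without that suffix
mirror_site() {
  dest=$1   # directory to save into, e.g. /path/to/destination/directory/
  url=$2    # site to crawl, e.g. https://www.example.com/

  # WGET is an assumed override: set WGET=echo to print the command
  # instead of running it.
  "${WGET:-wget}" \
    --directory-prefix="$dest" \
    --mirror \
    --page-requisites \
    --continue \
    --convert-links \
    --user-agent="" \
    --execute robots=off \
    --wait=1 \
    --adjust-extension \
    "$url"
}

# Usage:
# mirror_site /path/to/destination/directory/ https://www.example.com/
```

Running it once with WGET=echo is a cheap way to check the arguments before starting a crawl that may take a while.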
More Information for the Stanford Web Environment
If you have a Drupal, WordPress, or MediaWiki site hosted on the Stanford WWW servers (AKA "AFS"), you can use the wget method to create a static copy of the site that lives in your cgi-bin directory.
Assume you have a site at http://ponies.stanford.edu that lives at /afs/ir/group/ponies/cgi-bin/drupal:
- SSH into corn.stanford.edu
- Run the following command:
wget -P /afs/ir/group/ponies/WWW/ -mpck --user-agent="" -e robots=off --wait 1 -E http://ponies.stanford.edu/
- Visit http://www.stanford.edu/group/ponies/ponies.stanford.edu in a browser; you should have a full copy of your production site
- You may have to do some cleanup of the HTML code, and may want to rename the directory using the following command:
mv /afs/ir/group/ponies/WWW/ponies.stanford.edu /afs/ir/group/ponies/WWW/static
- Once you've checked everything out and it looks good, you can submit a Virtual Host change request so that ponies.stanford.edu points at www.stanford.edu/group/ponies/static
- If you want to then delete the dynamic site, submit a HelpSU request.
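As part of the HTML cleanup mentioned above, it can help to look for pages in the static copy that still link to the original hostname. With -k, wget leaves links to anything it did not download as absolute URLs, so those links will break once the dynamic site is gone. The following is a minimal sketch; find_leftover_links is a hypothetical helper name, and the paths in the usage line are the ones from the ponies example.

```shell
#!/bin/sh
# Sketch: list HTML files in a static copy that still mention the original host.
# find_leftover_links is a hypothetical helper name.
find_leftover_links() {
  dir=$1    # root of the static copy
  host=$2   # original site URL prefix, e.g. http://ponies.stanford.edu

  # grep -l prints only the names of matching files; -F treats the host as a
  # literal string, so the dots in it are not regex wildcards.
  find "$dir" -type f -name '*.html' -exec grep -lF -- "$host" {} +
}

# Usage:
# find_leftover_links /afs/ir/group/ponies/WWW/static http://ponies.stanford.edu
```

An empty result suggests every internal link was rewritten to a relative path; any file it lists is worth opening to see which links still point at the dynamic site.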
Method Two: Drupal's "Disable All Forms" Module
This method works well if you may want to revive the Drupal site at some point in the future, but don't want to deal with spammers and other malcontents.