Comparison of software saving Web pages for offline use

A number of software products, both free and proprietary, are available for saving Web pages for later offline use. They differ in the techniques used for saving, the types of content that can be saved, the format and compression of the saved files, and the provisions for working with already saved content.

HTML Content

| Name | Technology | Completeness of saved content | Support for collections | Ease of adding to existing collections | Navigable between saved pages offline | Format of saved files (open/proprietary) | Compression | Notes |
|---|---|---|---|---|---|---|---|---|
| wget | Command-line application | Images and CSS (if the -p option is used), but no client-side generated HTML content | Yes | ? | Yes, if the -k option is used | Open (HTML or WARC) | Yes, if WARC files are used | |
| HTTrack | Command-line application; WinHTTrack (Windows) and WebHTTrack (Linux/BSD/Unix) are GUI front-ends | ? | ? | ? | Yes; all links are rewritten to point to the locally stored pages of the downloaded site | Open; standard HTML pages saved in a folder (open index.html to reach the home page) | No | Many options for refining what is saved |
| Tenmax's Teleport | Windows desktop application and scriptable tools for web crawling and archiving | Multimedia (except streaming files), CSS, limited support for JavaScript events and cookies; Shockwave/Flash content is downloaded but not crawled | ? | ? | Yes | Open; standard HTML pages saved in a folder (open index.html to reach the home page) | No | Supports advanced filtering options and authentication |
| ScrapBook | Firefox extension | See note [ScrapBook 1] [1] | Yes | Easy | Yes, if those pages were saved in ScrapBook | Proprietary catalog; regular HTML and content for each page | No | See note [ScrapBook 2] |
| Mozilla Archive Format | Firefox extension | Images, CSS and other static content; client-side generated HTML content is saved correctly | Yes | Impossible | No | MAFF (a ZIP of regular HTML and web content) | Always | The Mozilla Archive Format add-on has not been maintained since September 5, 2018. [2] |
| Read Later Fast | Google Chrome extension | Stylesheets are saved incompletely or not at all | No | No | | Proprietary; restricted to the Google Chrome profile location | No | |
| PageArchiver | Google Chrome extension | Video and audio files (via Flash or HTML5) are not saved | Yes | Yes (import/export features) | No | Open; regular HTML for pages, regular ZIP file for the catalog | Yes, for the catalog | |
| Archia's Web Page Archiver [3] | E-mail based online service | See note [Archia 1] | No | No | No | Open | Yes | |
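
As an illustration of the wget row, the following Python sketch assembles a wget command line that uses the -p and -k options named in the table, plus --warc-file when a WARC archive is also wanted. The function names build_wget_command and save_page, and the example URL, are hypothetical and chosen only for this example; the sketch assumes wget is installed and on the PATH if the command is actually run.

```python
import subprocess
from typing import List, Optional


def build_wget_command(url: str, warc_name: Optional[str] = None) -> List[str]:
    """Assemble a wget argument list for saving one page for offline use."""
    cmd = [
        "wget",
        "-p",  # --page-requisites: also fetch the images and CSS the page needs
        "-k",  # --convert-links: rewrite links so the local copy is navigable
    ]
    if warc_name is not None:
        # Additionally write a WARC archive named after warc_name
        cmd.append(f"--warc-file={warc_name}")
    cmd.append(url)
    return cmd


def save_page(url: str, warc_name: Optional[str] = None) -> int:
    """Run wget (assumed to be installed) and return its exit code."""
    result = subprocess.run(build_wget_command(url, warc_name), check=False)
    return result.returncode


if __name__ == "__main__":
    # Only print the command here; call save_page() to actually download.
    print(" ".join(build_wget_command("https://example.org/", warc_name="example")))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact options before any download starts.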

Notes

ScrapBook

  1. Saved content:
    Default:
    • images, CSS and other static content, and client-side generated HTML content (all saved correctly)
    Optionally:
    • sound (MP3, WAV, RAM, WMA)
    • video (MPG, AVI, MOV, WMV)
    • archives (ZIP, LZH, RAR, JAR, XPI)
    • Java (can be problematic)
    • custom document extensions (e.g. PDF)
  2. Extra features:
    • Search across collections
    Known issues:
    • TED.com presentations embedded in saved pages (including pages on TED.com itself) cannot be played, even when online
    • selecting part of a page saves only the selected part, which is inconvenient when replacing the page title with a quote from the page
    • does not work with Firefox Quantum (Firefox 57 and later)

Archia

  1. Saved content:
    Images, CSS and other static content, sound (MP3, WAV, RAM, WMA), video (MPG, AVI, MOV, WMV), archives (ZIP, LZH, RAR, JAR, XPI), custom document extensions (e.g. PDF)
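
The file-type groups listed in the ScrapBook and Archia notes amount to a lookup by file extension. The Python sketch below is purely illustrative and does not describe how either tool is implemented: the extension lists are copied from the notes, while the OPTIONAL_TYPES table, the category names, and the classify function are assumptions made for this example.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extension groups copied from the notes; the dictionary layout is an assumption.
OPTIONAL_TYPES = {
    "sound": {"mp3", "wav", "ram", "wma"},
    "video": {"mpg", "avi", "mov", "wmv"},
    "archive": {"zip", "lzh", "rar", "jar", "xpi"},
    "custom": {"pdf"},
}


def classify(url: str) -> Optional[str]:
    """Return the optional-content category for a URL, or None if it has none."""
    # Use only the path component so query strings do not affect the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower().lstrip(".")
    for category, extensions in OPTIONAL_TYPES.items():
        if suffix in extensions:
            return category
    return None
```

A saver could call classify on every linked resource and download only the categories the user has enabled.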

Video

Video embedded in web sites (e.g. YouTube) can be saved with video download extensions, which are available for Firefox (including Video DownloadHelper) and Chrome.

References

  1. Zhang, Gary. "Best Ways to Save Webpage for Offline Viewing (#3 is Awesome!)". Garyzzc. Retrieved 2019-01-03.
  2. "Documentation". mozdev.org. Retrieved 2019-08-03.
  3. "Archia: Web Page Archiver". Archived from the original on 2015-02-19. Retrieved 2015-02-19.