Comparison of software saving Web pages for offline use

A number of software products, some open source and some proprietary, are available for saving Web pages for later use offline. They differ in the techniques used for saving, the types of content that can be saved, the format and compression of the saved files, and the provision for working with already saved content, among other things.

HTML Content

| Name | Technology | Completeness of saved content | Support for collections | Ease of adding to existing collections | Navigable between saved pages offline | Format of saved files; open/proprietary | Compression | Notes |
|---|---|---|---|---|---|---|---|---|
| wget | Command line application | Images and CSS (if -p option is used), but no client-side generated HTML content | Yes | ? | Yes, if -k option is used | Open (HTML or WARC) | Yes, if WARC files are used | |
| HTTrack | Command line application; has WinHTTrack for Windows and WebHTTrack for Linux/BSD/Unix as GUI front-ends | ? | ? | ? | Yes. All links are remade, so you can open your locally stored pages for the site you downloaded | Open. Standard HTML pages saved in a folder. Click on index.html to open the home page | No | Many options to let you refine what you save. |
| Tenmax's Teleport | Windows desktop application and scriptable tools for web crawling and archiving | Multimedia (except streaming files), CSS, limited support for JavaScript events and cookies; Shockwave/Flash content is downloaded but not crawled | ? | ? | Yes | Open. Standard HTML pages saved in a folder. Click on index.html to open the home page | No | Supports advanced filtering options and authentication |
| ScrapBook | Firefox extension | See note [ScrapBook 1] [1] | Yes | Easy | Yes, if those pages were saved in ScrapBook | Proprietary catalog; regular HTML and content for each page | No | See note [ScrapBook 2] |
| Mozilla Archive Format | Firefox extension | Images, CSS and other static content; client-side generated HTML content saved fine | Yes | Impossible | No | MAFF (= ZIP of regular HTML and web content) | Always | The Mozilla Archive Format add-on has not been maintained since September 5, 2018. [2] |
| Read Later Fast | Google Chrome extension | Stylesheets are saved incompletely or not at all | No | N/A | No | Proprietary; restricted to Google Chrome profile location | No | |
| PageArchiver | Google Chrome extension | Video and audio files (via Flash or HTML5) are not saved | Yes | Yes (import/export features) | No | Open; regular HTML for pages, regular zip file for catalog | Yes for catalog | |
| Archia's Web Page Archiver [3] | E-mail based on-line service | See note [Archia 1] | No | No | No | Open | Yes | |
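
The wget entry in the table depends on specific options: -p to fetch images and CSS, -k to keep saved pages navigable offline, and WARC output for compression. As a rough illustration, the sketch below assembles such a command line in Python. The helper name build_wget_command, the URL https://example.org/ and the archive name "example" are assumptions made for this example; only the -p, -k and --warc-file options belong to wget itself.

```python
import shlex


def build_wget_command(url, page_requisites=True, convert_links=True, warc_name=None):
    """Return a wget argument list for saving `url` for offline use.

    Hypothetical helper for illustration; it only builds the command
    and does not run wget.
    """
    command = ["wget"]
    if page_requisites:
        # -p: also download images, CSS and other files the page needs
        command.append("-p")
    if convert_links:
        # -k: rewrite links in saved pages so they point at the local copies
        command.append("-k")
    if warc_name is not None:
        # --warc-file: additionally record the download in a WARC archive
        command.append(f"--warc-file={warc_name}")
    command.append(url)
    return command


if __name__ == "__main__":
    # The URL and archive name are placeholders, not taken from the article.
    args = build_wget_command("https://example.org/", warc_name="example")
    print(shlex.join(args))
```

The resulting list can be passed to subprocess.run on a system where wget is installed. Building the arguments as a list rather than a single string avoids shell quoting problems with unusual URLs.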

Notes

ScrapBook

  1. Saved content:
    Default:
    • images, CSS and other static content, and client-side generated HTML content are all saved correctly
    Optionally:
    • sound (MP3, WAV, RAM, WMA)
    • video (MPG, AVI, MOV, WMV)
    • archives (ZIP, LZH, RAR, JAR, XPI)
    • Java, though this can be problematic
    • custom document extensions (e.g. PDF)
  2. Extra features:
    • Search across collections
    Known issues:
    • TED.com presentations embedded in saved pages (including pages on TED.com itself) cannot be played, even when online
    • selecting part of a page saves only the selected part, which is inconvenient when the selection was only meant to supply a quote for the page title
    • is not compatible with Firefox Quantum

Archia

  1. Saved content:
    Images, CSS and other static content, sound (MP3, WAV, RAM, WMA), video (MPG, AVI, MOV, WMV), archives (ZIP, LZH, RAR, JAR, XPI), custom document extensions (e.g. PDF)

Video

To save video embedded on web sites (e.g. YouTube), there are video download extensions for Firefox (including Download Helper) and Chrome.

References

  1. Zhang, Gary. "Best Ways to Save Webpage for Offline Viewing (#3 is Awesome!)". Garyzzc. Retrieved 2019-01-03.
  2. "Documentation". mozdev.org. Retrieved 2019-08-03.
  3. "Archia: Web Page Archiver". Archived from the original on 2015-02-19. Retrieved 2015-02-19.