National Software Reference Library

Abbreviation: NSRL
Type: GO
Parent organization: NIST
Website: http://www.nsrl.nist.gov/

The National Software Reference Library (NSRL) is a project of the National Institute of Standards and Technology (NIST) that maintains a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved in computer forensic investigations. The project is supported by the United States Department of Justice's National Institute of Justice, the Federal Bureau of Investigation (FBI), the Defense Computer Forensics Laboratory (DCFL), the U.S. Customs Service, software vendors, and state and local law enforcement. [1] It also provides a research environment for computational analysis of large sets of files. [2]

Components

The NSRL is made up of three major elements: [3]

  1. A large physical collection of commercial software packages (e.g., operating systems, off-the-shelf application software);
  2. A database containing detailed information, or metadata, about each file that makes up each of those software packages;
  3. A smaller public dataset, called the Reference Data Set, containing the most widely used metadata for each file in the collection. It is published and updated quarterly.

Reference Data Set

The NSRL collects software from various sources and computes message digests, or cryptographic hash values, for the files it contains. The digests are stored in the Reference Data Set (RDS), which can be used to identify "known" files on digital media. Matching files against the RDS reduces the effort involved in determining which files are important as evidence on computers or file systems that have been seized as part of criminal investigations. [3] Although the RDS hash set contains some malicious software (such as steganography and hacking tools), it does not contain illicit material (e.g. indecent images).
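
The matching step described above can be sketched in a few lines of Python. This is an illustration only, not the NSRL's own tooling: the function names `sha1_of_file` and `split_known_unknown` are hypothetical, the choice of SHA-1 as the digest is an assumption, and `known_hashes` stands in for a set of lowercase hexadecimal digests that an examiner would have loaded from a reference hash set beforehand.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-1 digest of the file at `path`.

    The file is read in chunks so that large files do not have to fit
    in memory. SHA-1 is an assumption made for this sketch.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_known_unknown(paths, known_hashes):
    """Partition `paths` into (known, unknown) lists.

    A file counts as "known" when its digest appears in `known_hashes`,
    a hypothetical set of lowercase hex digests from a reference set.
    """
    known, unknown = [], []
    for path in paths:
        if sha1_of_file(path) in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

An examiner would typically set the "known" files aside and concentrate on the "unknown" list, since a match only says that a file's content is identical to something already catalogued.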

The collection of original software media is maintained in order to provide repeatability of the calculated hash values, ensuring admissibility of this data in court.

In 2004 the NSRL released a set of hashes for verifying eVoting software, as part of the US Election Assistance Commission's Electronic Voting Security Strategy. [4]

As of October 1, 2013, the Reference Data Set was at version 2.42 and contained over 33.9 million unique hash values. The data set is available at no cost to the public. [5]

In addition to operating system and application software, the library has collected numerous popular video game titles, both for use in data forensics and, in part, as a contribution to video game preservation. [6]

References

  1. "NIST Special Database 28 (RDS)". National Software Reference Library (NSRL). National Institute of Standards and Technology. 27 August 2010. Retrieved 7 April 2012.
  2. White, Doug; Owens, Trevor (May 4, 2012). "Life-Saving: The National Software Reference Library". The Signal: The Library of Congress Digital Preservation Blog. The Library of Congress. Retrieved 16 January 2016.
  3. Mead, Steve (September 2006). "Unique file identification in the National Software Reference Library" (PDF). Digital Investigation. 3 (3): 138–150. doi:10.1016/j.diin.2006.08.010. ISSN 1742-2876. S2CID 17079984.
  4. Verton, Dan (November 2004). "Feds Issue Test Copies of E-voting Software". Computer World. Retrieved 1 September 2010.
  5. "RDS Readme.txt". Archived from the original on 6 November 2013. Retrieved 1 October 2013.
  6. Valentine, Rebekah (September 10, 2018). "NIST adds gaming software to National Software Reference Library". GamesIndustry.biz. Retrieved September 10, 2018.