Investigative Data Warehouse

The Investigative Data Warehouse (IDW) is a searchable database operated by the FBI. It was created in 2004. Much of the nature and scope of the database is classified. The database centralizes data from multiple federal and state databases, including criminal records from various law enforcement agencies, the U.S. Department of the Treasury's Financial Crimes Enforcement Network (FinCEN), and public records databases. According to Michael Morehart's testimony before the House Committee on Financial Services in 2006, the "IDW is a centralized, web-enabled, closed system repository for intelligence and investigative data. This system, maintained by the FBI, allows appropriately trained and authorized personnel throughout the country to query for information of relevance to investigative and intelligence matters." [1]

Overview

The database grew rapidly between 2004 and 2007. In 2004, according to a government solicitation for bids to manage the project, it was approximately 10 TB in size. In 2005, according to one FBI official, the IDW contained approximately 100 million documents. In 2006 it contained more than 560 million documents and was accessible to more than 12,000 individuals. According to the FBI's website, as of August 22, 2007, the database contained 700 million records from 53 databases and was accessible to 13,000 individuals around the world.

As of 2007, the FBI was the subject of a lawsuit brought by the Electronic Frontier Foundation (EFF) over the lack of a public notice describing the database and the criteria for including personal information, as required by the Privacy Act of 1974. The lawsuit followed two Freedom of Information Act requests filed by the EFF in 2006.

The IDW was built in part by Chiliad corporation, [2] [3] the FBI Office of the Chief Technology Officer, [4] and others. Companies named in the FOIA files include Northrop Grumman [5] and others.

Purpose

Investigative Data Warehouse–Secret (IDW-S) "provides data and data processing/analysis services to FBI agents and analysts as they perform counter-terrorism, counter-intelligence, and law enforcement missions". The core subsystem supports the Counter-Terrorism Division (CTD), the Special Event Unit, and, via DOCLAB-S, the Joint Intelligence Committee Investigation (JICI) and IntelPlus. [6]

According to a 2005 email, "IDW will also be used for criminal and other authorized non-CT investigations as it evolves" (CT being counter-terrorism). [7]

Subsystems


Within the system, there were subsystems named IDW-S Core, SPT, and DOCLAB-S. [8]

The special projects team (SPT):

allows for the rapid import of new specialized data sources. These data sources are not made available to the general IDW users but instead are provided to a small group of users who have a demonstrated "need-to-know". The SPT System is similar in function to the IDW-S system, with the main difference is a different set of data sources. The SPT System allows its users to access not only the standard IDW Data Store but the specialized SPT Data Store. [9]

Privacy

According to internal emails, the FBI performed several Privacy Impact Assessments (PIAs) of the IDW system. The Bureau worked with lawyers from its National Security Law Branch (NSLB) to try to ensure that the system complied with various laws on information sharing and secrecy [10] (for example, Rule 6(e) of the Federal Rules of Criminal Procedure, which governs the secrecy of grand jury material [11]).

The Information Sharing Policy Group (ISPG) formed a Discretionary Access Control Team (DACT) to work on "approval of data sets" and "access control requirements" for IDW and DataMart, and to respond to other Intelligence Community agencies requesting access. [12]

The EFF FOIA IDW website states "Despite the vast amount of personal information contained in the IDW, the FBI has never published a Privacy Act notice describing the system or explaining the ways in which the records might be used." [13]

A 2005 email from someone in the Office of the General Counsel (OGC) mentioned "preliminary staff musings that maybe we should limit FBI PIA requirements to non-NS systems" (NS being National Security). [14] A 2006 email stated that 'national security systems are exempt from E-Gov', [15] apparently referring to the E-Government Act of 2002, which has a section that deals with privacy.

Data sources

The IDW used many data sources. The FOIA documents obtained by the EFF are heavily redacted, but they identify some of these sources.

There was also talk of linking the FTTTF "Data Mart" with IDW. [26]

The data in IDW is classified at the 'Secret' level or lower. Data with a higher classification is not allowed and can be removed. [27]

References

Sources consulted
Endnotes
  1. Morehart 2005, op. cit.
  2. "Chiliad Case Study" (PDF). Archived from the original (PDF) on 2012-05-24. Retrieved 2009-03-18.
  3. David Gardner (2006-08-30). "FBI Shows off Counterterrorism Database". Information Week. Archived from the original on 2011-06-13. Retrieved 2009-03-18.
  4. EFF FOIA files, 2008 Apr 8, idw01 (archived 2016-03-04 at the Wayback Machine), page 28 of linked PDF.
  5. EFF FOIA files, 2008 Apr 8, idw01, page 27 of linked PDF.
  6. FBI, IDW-S System Security Plan, 2005 Jan 24.
  7. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 13 of linked PDF.
  8. FBI, IDW-S System Security Plan, 2005 Jan 24. The FOIA documents do not make clear the difference between IDW-S and IDW, and thus whether Core, SPT, and DOCLAB-S fall under IDW or IDW-S.
  9. FBI, S-CONOPS IDW, 2004 Nov 29 (archived 2016-03-04 at the Wayback Machine), page 52 of linked PDF.
  10. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine). Most of this FOIA release is emails within the FBI concerning PIAs.
  11. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 73 of linked PDF. For Rule 6(e), see https://www.law.cornell.edu/rules/frcrmp/Rule6.htm (Cornell; archived 2011-11-04 at the Wayback Machine).
  12. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), pages 74 and 75 of linked PDF.
  13. "EFF website, FOIA: DOJ's Investigative Data Warehouse". Archived from the original on 2009-03-28. Retrieved 2009-03-18.
  14. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 10 of linked PDF. This particular email also mentions the VCF system (which was later scrapped), saying that PIAs for VCF could 'entail substantial costs'.
  15. EFF FOIA files, 2008 Jun 9, idw04 (archived 2016-03-04 at the Wayback Machine), page 35 of linked PDF.
  16. FBI, IDW Privileged Users Guide, 2004 Dec 1.
  17. FBI, IDW-S System Security Plan, 2003 Dec 3.
  18. FBI, IDW Status Update, 2005 Sep 21.
  19. FBI, IDW Status Update, 2005 Sep 21. 'Open Source News' is, in other documents, referred to alongside MiTAP and/or DARPA TIDES.
  20. Note: Some FBI documents list DARPA TIDES, some list MiTAP, some simply say "Open Source News". They are related projects, if not perhaps the same thing.
  21. Financial Crimes Enforcement Network.
  22. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 8/9 of linked PDF.
  23. FBI, S-CONOPS IDW, 2004 Nov 29 (archived 2016-03-04 at the Wayback Machine), page 53 of linked PDF.
  24. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 83 of linked PDF.
  25. EFF FOIA files, 2008 Apr 8, idw01 (archived 2016-03-04 at the Wayback Machine), page 33 of linked PDF.
  26. EFF FOIA files, 2008 Apr 8, idw02 (archived 2016-03-04 at the Wayback Machine), page 37 of linked PDF.
  27. EFF FOIA files, 2008 Apr 2, idw01 (archived 2016-03-03 at the Wayback Machine), page 43.