End of Term Web Archive

End of Term Web Archive (EOTArchive)
Image caption: A version of this USGS map was archived by project partner UNT in the 2008 End of Term collection.
Mission statement: "The End of Term Web Archive captures and saves U.S. Government websites at the end of presidential administrations."
Commercial?: No
Type of project: Collaborative government web archive
Established: 2008
Website: eotarchive.org

The End of Term Web Archive preserves U.S. federal government websites during administration changes. [1]

Background

The End of Term Web Archive was set up following a 2008 announcement from the National Archives and Records Administration (NARA) that it would not archive government websites during the transition, having carried out such crawls in 2000 and 2004. [2] The 2004 federal web harvest can be accessed at the National Archives alongside congressional web harvests, which begin with the 109th United States Congress. [3]

The first project partners were the Library of Congress, George Washington University, Stanford University, University of North Texas, the US Government Publishing Office, California Digital Library and the Internet Archive, all members of the International Internet Preservation Consortium. The project was initially sketched out after a General Assembly of the IIPC in 2008. [4] NARA and the Environmental Data & Governance Initiative (EDGI) joined the 2020/21 project. [5]

The project

Custom error page used to direct whitehouse.gov visitors as the website changed in 2009.

The project archives websites and documents for public access and research use. [6] A UNT study of the risk to document files found that 83% of the PDFs present on the .gov domain in 2008 were missing four years later. [7] Such removals are consistent with the requirement to manage websites, but the status of government sites means that changes may be of interest to the public and watchdog groups. [8] Evidence of the demand for continued access to historical web material can be found in an announcement made by the EPA in response to concerns about changes in 2017, stating that pages from the previous administration would be carefully archived. [9] These snapshot pages were clearly marked to distinguish them from contemporary content. [10]

The archive prioritizes sites administering areas regarded as likely to be updated or removed over the period of transition. [11] Members of the public are encouraged to nominate important sites, and these nominations are combined with broad crawls of government domains to create the collection. [12] [13] Although the collection is extensive (the 2016 crawl preserved 11,382 sites), it stops short of being comprehensive. [14] [15] Researchers have used these collections to examine the history of climate change policy and the reuse of suspended U.S. government Twitter accounts. [16] [17]
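As a rough illustration of how publicly nominated sites might be combined with a broad crawl list, the following Python sketch merges two seed lists and removes duplicates. It is not the project's actual tooling: the function names, the normalization rule, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize_seed(url: str) -> str:
    """Reduce a URL to a lowercase scheme://host/path key (hypothetical rule).

    Ports, query strings, and fragments are dropped, and a trailing slash
    is removed so that near-identical seeds compare as equal.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return f"{parts.scheme.lower()}://{host}{path}"


def build_seed_list(nominated, broad_crawl):
    """Merge nominated seeds with broad-crawl seeds, keeping first-seen order."""
    seen = set()
    seeds = []
    for url in list(nominated) + list(broad_crawl):
        key = normalize_seed(url)
        if key not in seen:
            seen.add(key)
            seeds.append(key)
    return seeds


# Illustrative inputs only; these are not taken from a real nomination list.
nominated = ["https://www.epa.gov/climatechange/", "https://WWW.USGS.GOV"]
broad_crawl = ["https://www.usgs.gov/", "https://www.whitehouse.gov"]
seeds = build_seed_list(nominated, broad_crawl)
```

Listing nominated seeds first means a site put forward by the public keeps its place in the merged list even when the broad crawl also covers it.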


References

  1. Dwyer, Jim (2016-12-02). "Harvesting Government History, One Web Page at a Time". The New York Times. ISSN 0362-4331. Archived from the original on 2020-01-18. Retrieved 2020-12-07.
  2. Webster, Peter (2017). Brügger, Niels (ed.). "Users, technologies, organisations: Towards a cultural history of world web archiving". Web 25. Histories from 25 Years of the World Wide Web: 179–190. doi:10.3726/b11492. hdl:2318/1770557. ISBN 9781433140655. Archived from the original on 2020-10-21.
  3. "National Archives". Congressional & Federal Government Web Harvests. Archived from the original on 2017-09-18. Retrieved 2021-01-18.
  4. Seneca, Tracy; Grotke, Abbie; Hartman, Cathy Nelson; Carpenter, Kris (2012). "It Takes a Village to Save the Web: The End of Term Web Archive" (PDF). DTTP: Documents to the People. 40: 16. ISSN 0091-2085. Archived from the original (PDF) on 2015-09-08.
  5. "GitHub - end-of-term/eot2020". GitHub. Archived from the original on 2020-12-05. Retrieved 2020-12-14.
  6. "End of Term Web Archive: U.S. Government Websites". 2020-12-06. Archived from the original on 2020-12-06. Retrieved 2020-12-15.
  7. Gilmore, Courtney (2020-12-04). "UNT Part of Team Archiving Obama Administration Web Content". NBC 5 Dallas-Fort Worth. Archived from the original on 2020-12-07. Retrieved 2020-12-04.
  8. "Website Monitoring". Environmental Data and Governance Initiative. Archived from the original on 2020-12-06. Retrieved 2021-02-24.
  9. Mooney, Chris; Eilperin, Juliet. "EPA website removes climate science site from public view after two decades". Washington Post. ISSN 0190-8286. Archived from the original on 2017-04-29. Retrieved 2021-02-18.
  10. "Climate Change | US EPA". 2017-04-29. Archived from the original on 2017-04-29. Retrieved 2021-04-08.
  11. "Guerrilla Archiving". The Politics of Evidence. 2016-12-05. Archived from the original on 2020-08-04. Retrieved 2020-12-07.
  12. Jacobs, James R. (2020-08-10). "Nominations sought for the U.S. Federal Government Domain End of Term 2020 Web Archive". Free Government Information (FGI). Archived from the original on 2020-10-04. Retrieved 2020-12-07.
  13. "End of Term Archive on Twitter: "And so it begins. We have officially started crawling the websites nominated for the End of Term 2020 web archive! But don't worry, you still have time to nominate more! What are your favorite government sites? #WebArchiveWednesday #WebArchives #GovDocs"". 2020-10-07. Archived from the original on 2020-10-07. Retrieved 2020-11-06.
  14. O'Keefe, Ed (2015-10-08). "How many .gov sites exist? Thousands". The Washington Post. Archived from the original on 2015-10-08. Retrieved 2020-12-04.
  15. Young, Lauren J. "The Librarians Saving The Internet". Science Friday. Archived from the original on 2020-11-09. Retrieved 2020-12-04.
  16. EDGI, Toly Rinberg, Maya Anjur-Dietrich, Marcy Beck, Andrew Bergman, Justin Derry, Lindsey Dillon, Gretchen Gehrke, Rebecca Lave, Chris Sellers, Nick Shapiro, Anastasia Aizman, Dan Allan, Madelaine Britt, Raymond Cha, Janak Chadha, Morgan Currie, Sara Johns, Abby Klionsky, Stephanie Knutson, Katherine Kulik, Aaron Lemelin, Kevin Nguyen, Eric Nost, Kendra Ouellette, Lindsay Poirier, Sara Rubinow, Justin Schell, Lizz Ultee, Julia Upfal, Tyler Wedrosky, Jacob Wylie. "Changing the Digital Climate". 100days.envirodatagov.org. Archived from the original on 2018-04-04. Retrieved 2021-01-14.
  17. Littman, Justin (2017-11-04). "Suspended U.S. government Twitter accounts". Social Feed Manager. Archived from the original on 2017-11-07. Retrieved 2020-12-07.