Internet research

Internet research is the practice of using Internet information, especially free information on the World Wide Web, or Internet-based resources (such as Internet discussion forums) in research.

Internet research has had a profound impact on the way ideas are formed and knowledge is created. Common applications of Internet research include personal research on a particular subject (something mentioned on the news, a health problem, etc.), students doing research for academic projects and papers, and journalists and other writers researching stories.

Research is a broad term. Here, it is used to mean "looking something up (on the Web)". It includes any activity in which a topic is identified and an effort is made to actively gather information for the purpose of furthering understanding. It may also include some post-collection activities, such as reading the material, assessing its quality, or synthesizing it to determine whether it should be read in depth.

Through searches on the Internet, pages with some relation to a given topic can be visited and read, or quickly found and gathered. In addition, the Web can be used to communicate with people who have relevant interests and experience, such as experts, to learn their opinions and what they know. Communication tools used for this purpose on the Web include email (including mailing lists), online discussion forums (also known as message boards or BBSs), and other personal communication facilities (instant messaging, IRC, newsgroups, etc.), all of which can provide direct access to experts and other individuals with relevant interests and knowledge.

Internet research is distinct from library research (focusing on library-bound resources)[citation needed] and commercial database research (focusing on commercial databases).[citation needed] While many commercial databases are delivered through the Internet, and some libraries purchase access to library databases on behalf of their patrons, searching such databases is generally not considered part of “Internet research”.[citation needed] It should also be distinguished from scientific research (research following a defined and rigorous process) carried out on the Internet, from straightforward retrieving of details like a name or phone number, and from research about the Internet.[citation needed]

Internet research can provide quick, immediate, and worldwide access to information, although results may be affected by unrecognized bias, by difficulties in verifying a writer's credentials (and therefore the accuracy or pertinence of the information obtained), and by whether the searcher has sufficient skill to draw meaningful results from the abundance of material typically available.[1] The first resources retrieved may not be the most suitable for answering a particular question. Popularity is often a factor used in structuring Internet search results, but popular information is not always the most correct or the most representative of the breadth of knowledge and opinion on a topic.

While commercial research fosters a deep concern with costs, and library research a concern with access, Internet research fosters a deep concern for quality, for managing the abundance of information, and for avoiding unintended bias. This is partly because Internet research occurs in a less mature information environment: search skills are less sophisticated and poorly communicated, and far less effort is put into organizing information. Library and commercial research offer many search tactics and strategies unavailable on the Internet, and the library and commercial environments invest more deeply in organizing and vetting their information.

Search tools

Search tools for finding information on the Internet include web search engines, the search engines on individual websites, the browsers' hotkey-activated feature for searching in the current page, meta search engines, web directories, and specialty search services.

Web search engines

A web search engine allows a user to enter a search query, in the form of keywords or a phrase, into a search box or search form, and then finds matching results and displays them on the screen. The results are retrieved from a database, using search algorithms that select web pages based on the location and frequency of keywords on them, along with the quality and number of external hyperlinks pointing at them. The database is supplied with data by a web crawler that follows the hyperlinks connecting webpages, copying their content and recording their URLs and other data about each page along the way. The content is then indexed to aid retrieval.

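As a rough illustration of the crawl, index, and query flow described above, here is a minimal sketch in Python of a toy inverted index that ranks pages by simple keyword frequency. The example pages, URLs, and scoring rule are illustrative assumptions, not the ranking method of any real search engine.

from collections import defaultdict

# Toy corpus standing in for pages copied by a crawler (illustrative data only).
pages = {
    "https://example.org/solar": "solar power stores solar energy for later use",
    "https://example.org/wind": "wind turbines convert wind energy into power",
    "https://example.org/intro": "an introduction to renewable energy sources",
}

# Indexing: record how often each keyword occurs on each page.
index = defaultdict(lambda: defaultdict(int))
for url, text in pages.items():
    for word in text.lower().split():
        index[word][url] += 1

def search(query):
    """Rank pages by the total frequency of the query keywords, a crude
    stand-in for the location, frequency, and link signals real engines combine."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index[word].items():
            scores[url] += count
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(search("solar energy"))  # the page mentioning "solar" twice ranks first

A real engine adds many more signals (link analysis, freshness, language, location), but the separation into crawling, indexing, and query-time ranking is the same.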

Websites' search feature

Websites often have a search engine of their own, for searching just the site's content, often displayed at the top of every page. For example, Wikipedia provides a search engine for exploring its content. A search engine within a website allows a user to focus on its content and find desired information with more precision than with a web search engine. It may also provide access to parts of the website that a web search engine does not index.
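For instance, Wikipedia's site search is also exposed through the public MediaWiki API, so a query can be sent to it directly. The short Python sketch below uses the third-party requests library (assumed to be installed) and prints the titles of matching articles.

import requests

# Query Wikipedia's search API (MediaWiki "list=search" module); only pages
# on en.wikipedia.org are searched, unlike a general web search engine.
params = {
    "action": "query",
    "list": "search",
    "srsearch": "internet research",
    "format": "json",
}
response = requests.get("https://en.wikipedia.org/w/api.php", params=params, timeout=10)
for result in response.json()["query"]["search"]:
    print(result["title"])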

Browsers' local search features

Browsers typically provide separate input boxes to search history titles, bookmarks, and the currently displayed web page, though the latter only shows up when a hot key is pressed.

Browsers' search hot key

Using a key combo (two or more keys pressed down at the same time), the user can search the current page displayed by the browser. This is especially useful for long articles. A common key combo for this is Ctrl+F.

Meta search engines

A meta search engine enables users to enter a search query once and have it run against multiple search engines simultaneously, producing an aggregated list of search results. Since no single search engine covers the entire web, a meta search engine can produce a more comprehensive search of the web. Most meta search engines automatically eliminate duplicate search results. However, meta search engines have a significant limitation: the most popular search engines, such as Google, are often not included because of legal restrictions.
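A minimal sketch of the aggregation step might look like the following Python function; the two engine wrappers are hypothetical stand-ins for calls to real search engine APIs.

def metasearch(query, engines):
    """Send one query to several engines and merge the results,
    dropping duplicate URLs while preserving order."""
    seen = set()
    merged = []
    for engine in engines:
        for url in engine(query):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical engine wrappers returning canned results for illustration.
engine_a = lambda q: ["https://a.example/1", "https://shared.example/x"]
engine_b = lambda q: ["https://shared.example/x", "https://b.example/2"]

print(metasearch("renewable energy", [engine_a, engine_b]))
# ['https://a.example/1', 'https://shared.example/x', 'https://b.example/2']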

Web directories

A Web directory organizes subjects in a hierarchical fashion that lets users investigate the breadth of a specific topic and drill down to find relevant links and content. Web directories can be assembled automatically by algorithms or handcrafted. Human-edited Web directories have the distinct advantage of higher quality and reliability, while those produced by algorithms can offer more comprehensive coverage. Web directories are generally broad in scope: directories such as Curlie and The WWW Virtual Library cover a wide range of subjects, while others focus on specific topics.
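Conceptually, a directory is a tree of categories with links at the leaves, and drilling down is a walk along one branch. A small Python sketch, with entirely made-up categories and URLs:

# A hand-built fragment of a hierarchical directory (illustrative entries only).
directory = {
    "Science": {
        "Biology": ["https://example.org/genetics", "https://example.org/ecology"],
        "Physics": ["https://example.org/quantum"],
    },
    "Arts": {
        "Music": ["https://example.org/jazz"],
    },
}

def drill_down(tree, path):
    """Follow a list of category names, e.g. ["Science", "Biology"],
    to the links filed under that subtopic."""
    node = tree
    for category in path:
        node = node[category]
    return node

print(drill_down(directory, ["Science", "Biology"]))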

Specialty search tools

Specialty search tools enable users to find information that conventional search engines and meta search engines cannot access because the content is stored in databases. In fact, the vast majority of information on the web is stored in databases that require users to go to a specific site and access it through a search form. Often, the content is generated dynamically. As a consequence, Web crawlers are unable to index this information. In a sense, this content is "hidden" from search engines, leading to the term invisible or deep Web. Specialty search tools have evolved to provide users with the means to quickly and easily find deep Web content. These specialty tools rely on advanced bot and intelligent agent technologies to search the deep Web and automatically generate specialty Web directories, such as the Virtual Private Library.
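Reaching such content usually means submitting a query to the site's own search form rather than following links. A minimal Python sketch using the third-party requests library is shown below; the endpoint URL and parameter names are hypothetical placeholders, not a real service.

import requests

# Query a database-backed site directly through its search form, the way
# specialty tools reach "deep Web" content that ordinary crawlers miss.
response = requests.get(
    "https://catalog.example.org/search",  # hypothetical search endpoint
    params={"q": "19th century patents", "page": 1},
    timeout=10,
)
print(response.status_code)
print(response.text[:500])  # start of the dynamically generated results page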

Website authorship

When using the Internet for research, a large number of websites may appear in the search results for whatever search query is entered. Each of these sites has one or more authors or associated organizations providing content, and the accuracy and reliability of the content may be extremely variable. It is necessary to identify authorship of web content so that reliability and bias can be assessed.

The author or sponsoring organization of a website may be found in several ways. Sometimes the author or organization is listed at the bottom of the website's home page. Another way is to look in the ‘Contact Us’ section of the website: the author may be listed directly, determined from an email address, or identified by emailing and asking. If the author's name or sponsoring organization cannot be determined, one should question the trustworthiness of the website. If it can be found, an Internet search on the name might provide information that can be used to judge whether the website is reliable and unbiased.
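Some of this checking can be partly automated. The Python sketch below, using the third-party requests and BeautifulSoup libraries (assumed to be installed), looks for the common <meta name="author"> tag and falls back to a link labelled "Contact"; the URL is a hypothetical placeholder, and many sites declare authorship in other ways.

import requests
from bs4 import BeautifulSoup

def find_author(url):
    """Best-effort lookup of a page's declared author: check the common
    <meta name="author"> tag, then fall back to a link labelled "Contact"."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    meta = soup.find("meta", attrs={"name": "author"})
    if meta and meta.get("content"):
        return meta["content"]
    contact = soup.find("a", string=lambda s: s and "contact" in s.lower())
    return contact.get("href") if contact else None

print(find_author("https://example.org"))  # hypothetical URL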

Internet research software

Internet research software captures information while the user performs Internet research. This information can then be organized in various ways, including tagging and hierarchical trees. The goal is to collect information relevant to a specific research project in one place, so that it can be found and accessed again quickly.

These tools also allow captured content to be edited and annotated, and some can export it to other formats. Other features common to these tools include full-text search, which aids in quickly locating information, and filters that let the user drill down to see only the information relevant to a specific query. Captured and kept information also provides a backup in case web pages and sites disappear or become inaccessible later.
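The data model behind such tools can be very simple. The Python sketch below is an illustrative assumption about how captured items, tags, full-text search, and filters might fit together, not a description of any particular product.

from dataclasses import dataclass, field

@dataclass
class Capture:
    """One piece of captured web content with its source and user-assigned tags."""
    url: str
    text: str
    tags: set = field(default_factory=set)
    notes: str = ""

captures = [
    Capture("https://example.org/a", "notes on solar subsidies", {"energy", "policy"}),
    Capture("https://example.org/b", "wind turbine noise study", {"energy", "health"}),
]

def search_captures(items, query=None, tag=None):
    """Combine a tag filter with full-text search; both arguments are optional."""
    results = items
    if tag:
        results = [c for c in results if tag in c.tags]
    if query:
        results = [c for c in results if query.lower() in c.text.lower()]
    return results

print([c.url for c in search_captures(captures, query="solar", tag="energy")])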

Related Research Articles

Meta elements are tags used in HTML and XHTML documents to provide structured metadata about a Web page. They are part of a web page's head section. Multiple Meta elements with different attributes can be used on the same page. Meta elements can be used to specify page description, keywords and any other metadata not provided through the other head elements and attributes.

In general computing, a search engine is an information retrieval system designed to help find information stored on a computer system. It is an information retrieval software program that discovers, crawls, transforms, and stores information for retrieval and presentation in response to user queries. The search results are usually presented in a list and are commonly called hits. A search engine normally consists of four components, as follows: a search interface, a crawler, an indexer, and a database. The crawler traverses a document collection, deconstructs document text, and assigns surrogates for storage in the search engine index. Online search engines store images, link data and metadata for the document as well.

Spamdexing is the deliberate manipulation of search engine indexes. It involves a number of methods, such as link building and repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed in a manner inconsistent with the purpose of the indexing system.

A web portal is a specially designed website that brings information from diverse sources, like emails, online forums and search engines, together in a uniform way. Usually, each information source gets its dedicated area on the page for displaying information; often, the user can configure which ones to display. Variants of portals include mashups and intranet dashboards for executives and managers. The extent to which content is displayed in a "uniform way" may depend on the intended user and the intended purpose, as well as the diversity of the content. Very often design emphasis is on a certain "metaphor" for configuring and customizing the presentation of the content and the chosen implementation framework or code libraries. In addition, the role of the user in an organization may determine which content can be added to the portal or deleted from the portal configuration.

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). It is approved and funded by the government of the United States. The NCBI is located in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by US Congressman Claude Pepper.

Search engine optimization (SEO) is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. SEO targets unpaid traffic rather than direct traffic or paid traffic. Unpaid traffic may originate from different kinds of searches, including image search, video search, academic search, news search, and industry-specific vertical search engines.

An image retrieval system is a computer system used for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata such as captioning, keywords, title or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this, there has been a large amount of research done on automatic image annotation. Additionally, the increase in social web applications and the semantic web have inspired the development of several web-based image annotation tools.

The deep web, invisible web, or hidden web are parts of the World Wide Web whose contents are not indexed by standard web search-engine programs. This is in contrast to the "surface web", which is accessible to anyone using the Internet. Computer scientist Michael K. Bergman is credited with inventing the term in 2001 as a search-indexing term.

A metasearch engine is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. Sufficient data is gathered, ranked, and presented to the users.

The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. The NCBI is a part of the National Library of Medicine (NLM), which is itself a department of the National Institutes of Health (NIH), which in turn is a part of the United States Department of Health and Human Services. The name "Entrez" was chosen to reflect the spirit of welcoming the public to search the content available from the NLM.

Federated search retrieves information from a variety of sources via a search application built on top of one or more search engines. A user makes a single query request which is distributed to the search engines, databases or other query engines participating in the federation. The federated search then aggregates the results that are received from the search engines for presentation to the user. Federated search can be used to integrate disparate information resources within a single large organization ("enterprise") or for the entire web.

A search engine is a software system that finds web pages that match a web search. It searches the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of hyperlinks to web pages, images, videos, infographics, articles, and other types of files. As of January 2022, Google was by far the world's most used search engine, with a market share of 90.6%; the other most used search engines were Bing, Yahoo!, Baidu, Yandex, and DuckDuckGo.

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is web indexing.

ChemXSeer project, funded by the National Science Foundation, is a public integrated digital library, database, and search engine for scientific papers in chemistry. It is being developed by a multidisciplinary team of researchers at the Pennsylvania State University. ChemXSeer was conceived by Dr. Prasenjit Mitra, Dr. Lee Giles and Dr. Karl Mueller as a way to integrate the chemical scientific literature with experimental, analytical, and simulation data from different types of experimental systems. The goal of the project is to create an intelligent search and database which will provide access to relevant data to a diverse community of users who have a need for chemical information. It is hosted on the World Wide Web at the College of Information Sciences and Technology, The Pennsylvania State University.

Science.gov is a web portal and specialized search engine. Using federated search technology, Science.gov serves as a gateway to United States government scientific and technical information and research. Currently in its fifth generation, Science.gov provides a search of over 60 databases from 14 federal science agencies and 200 million pages of science information with just one query, and is a gateway to 2,200+ scientific websites.

DeepPeep was a search engine that aimed to crawl and index every database on the public Web. Unlike traditional search engines, which crawl existing webpages and their hyperlinks, DeepPeep aimed to allow access to the so-called deep Web: World Wide Web content that is available only through, for instance, typed queries into databases. The project started at the University of Utah and was overseen by Juliana Freire, an associate professor at the university's School of Computing WebDB group. The goal was to make 90% of all WWW content accessible, according to Freire. The project ran a beta search engine and was sponsored by the University of Utah and a $243,000 grant from the National Science Foundation. It generated worldwide interest.

Discoverability is the degree to which something, especially a piece of content or information, can be found in a search of a file, database, or other information system. Discoverability is a concern in library and information science, many aspects of digital media, software and web development, and in marketing, since products and services cannot be used if people cannot find them or do not understand what they can be used for.

Personalized search is a web search tailored specifically to an individual's interests by incorporating information about the individual beyond the specific query provided. There are two general approaches to personalizing search results, involving modifying the user's query and re-ranking search results.

The following outline is provided as an overview of and topical guide to search engines.

Contextual search is a form of optimizing web-based search results based on context provided by the user and the computer being used to enter the query. Contextual search services differ from current search engines based on traditional information retrieval that return lists of documents based on their relevance to the query. Rather, contextual search attempts to increase the precision of results based on how valuable they are to individual users.

References

1. Hargittai, E. (April 2002). "Second-Level Digital Divide: Differences in People's Online Skills". First Monday. 7 (4). doi:10.5210/fm.v7i4.942.