Website tracking

Website tracking refers to the practice of archiving existing websites and tracking changes to them over time. Many applications exist for website tracking, and they can be applied to many different purposes.

Website monitoring

Website monitoring allows interested parties to track the health of a website or web application. A software program can periodically check to see if a website is down, if broken links exist, or if errors have occurred on specific pages. For example, a web developer who hosts and maintains a website for a customer may want to be notified instantly if the site goes down or if a web application returns an error.

Monitoring the web is a critical component of marketing, sales, and product support strategies. Over the past decade, transactions on the web have significantly multiplied the use of dynamic web pages, secure websites, and integrated search capabilities, which requires tracking user behavior on websites.

Website change detection

Website change detection allows interested parties to be alerted when a website has changed. A web crawler can periodically scan a website to see if any changes have occurred since its last visit. Reasons to track website changes include:

Web press clippings

This is the online counterpart of the offline press clipping business. For web press clippings, a crawler scours the Internet for terms that match the keywords the clipping service is looking for. For instance, the staff of the Vice President of the United States may review web press clippings to see what is being said about the Vice President on any given day. To do this, a web press clipping service (also known as a media monitoring service) needs to monitor mainstream websites as well as blogs.
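The keyword-matching core of such a service can be sketched as a simple case-insensitive scan over fetched page text. The URLs, page text, and function name below are invented for illustration; a real clipping service would also handle stemming, phrase variants, and deduplication.

```python
def find_clippings(pages: dict[str, str], keywords: list[str]) -> dict[str, list[str]]:
    """Map each page URL to the watched keywords that appear in its text.

    `pages` maps URL -> fetched page text; matching is case-insensitive.
    Pages with no matching keywords are omitted from the result.
    """
    hits: dict[str, list[str]] = {}
    for url, text in pages.items():
        lowered = text.lower()
        matched = [kw for kw in keywords if kw.lower() in lowered]
        if matched:
            hits[url] = matched
    return hits

# Example input (hypothetical URLs and text):
pages = {
    "https://news.example/a": "The Vice President visited the summit today.",
    "https://blog.example/b": "A post about gardening.",
}
print(find_clippings(pages, ["vice president"]))
```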

Website archiving

This type of service archives a website so that changes to it over time can be seen. Unless archived, older versions of a website cannot be viewed and may be lost permanently. Fortunately, at least one web service (see Internet Archive) tracks changes to most websites for free. Past information about a company can therefore be gleaned from such a service, which can be very useful in some circumstances.

Related Research Articles

Web crawler

A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing.

World Wide Web

The World Wide Web is an information system that enables content sharing over the Internet through user-friendly ways meant to appeal to users beyond IT specialists and hobbyists. It allows documents and other web resources to be accessed over the Internet according to specific rules of the Hypertext Transfer Protocol (HTTP).

Website

A website is a collection of web pages and related content that is identified by a common domain name and published on at least one web server. Websites are typically dedicated to a particular topic or purpose, such as news, education, commerce, entertainment or social networking. Hyperlinking between web pages guides the navigation of the site, which often starts with a home page. As of May 2023, the top 5 most visited websites are Google Search, YouTube, Facebook, Twitter, and Instagram.

HTTP 404

In computer network communications, the HTTP 404, 404 not found, 404, 404 error, page not found, or file not found error message is a Hypertext Transfer Protocol (HTTP) standard response code indicating that the browser was able to communicate with a given server, but the server could not find what was requested. The error may also be used when a server does not wish to disclose whether it has the requested information.

Spamdexing is the deliberate manipulation of search engine indexes. It involves a number of methods, such as link building and repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed in a manner inconsistent with the purpose of the indexing system.

The deep web, invisible web, or hidden web are parts of the World Wide Web whose contents are not indexed by standard web search-engine programs. This is in contrast to the "surface web", which is accessible to anyone using the Internet. Computer scientist Michael K. Bergman is credited with inventing the term in 2001 as a search-indexing term.

Metasearch engine

A metasearch engine is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. The gathered data is then ranked and presented to the user.

Google AdSense is a program run by Google through which website publishers in the Google Network of content sites serve text, images, video, or interactive media advertisements that are targeted to the site content and audience. These advertisements are administered, sorted, and maintained by Google. They can generate revenue on either a per-click or per-impression basis. Google beta-tested a cost-per-action service, but discontinued it in October 2008 in favor of a DoubleClick offering. In Q1 2014, Google earned US$3.4 billion, or 22% of total revenue, through Google AdSense. AdSense is a participant in the AdChoices program, so AdSense ads typically include the triangle-shaped AdChoices icon. This program also operates on HTTP cookies. In 2021, over 38.3 million websites used AdSense.

A media monitoring service, formerly known as a press clipping service or simply a clipping service, provides clients with copies of media content of specific interest to them, subject to changing demand; what it provides may include documentation, content, analysis, or editorial opinion, specifically or widely. These services tend to specialize their coverage by subject, industry, size, geography, publication, journalist, or editor. The printed sources that could be readily monitored greatly expanded with the advent of telegraphy and submarine cables in the mid- to late-19th century; the various types of media now available proliferated in the 20th century, with the development of radio, television, the photocopier, and the World Wide Web. Though media monitoring is generally used for capturing content or editorial opinion, it may also be used to capture advertising content.

Business software is any software or set of computer programs used by business users to perform various business functions. These business applications are used to increase productivity, measure productivity, and perform other business functions accurately.

Push technology, also known as server push, refers to a method of communication on the Internet where the initial request for a transaction is initiated by the server, rather than the client. This approach is different from the more commonly known "pull" method, where information transmission is requested by the receiver or client.

In web archiving, an archive site is a website that stores information on webpages from the past for anyone to view.

Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may directly access the World Wide Web using the Hypertext Transfer Protocol or a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis.
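A common first step in automated web scraping is extracting the hyperlinks from a fetched page, which a crawler then follows. Python's standard-library `html.parser` is enough for a minimal sketch; the sample HTML below is invented, and real scrapers typically use a more robust parser and respect robots.txt.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html: str) -> list[str]:
    """Return all hyperlink targets found in an HTML document, in order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

# Example with invented markup:
html = '<p><a href="/one">One</a> and <a href="https://example.com/two">Two</a></p>'
print(extract_links(html))
```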

Search engine

A search engine is a software system that provides hyperlinks to web pages and other relevant information on the Web in response to a user's query. The user inputs a query within a web browser or a mobile app, and the search results are often a list of hyperlinks, accompanied by textual summaries and images. Users also have the option of limiting the search to a specific type of results, such as images, videos, or news.

Website monitoring is the process of testing and verifying that end users can interact with a website or web application as expected. Website monitoring is often used by businesses to ensure that website uptime, performance, and functionality are as expected.

Geotargeting

In geomarketing and internet marketing, geotargeting is the method of delivering different content to visitors based on their geolocation. This includes country, region/state, city, metro code/zip code, organization, IP address, ISP, or other criteria. A common usage of geotargeting is found in online advertising, as well as internet television with sites such as iPlayer and Hulu. In these circumstances, content is often restricted to users geolocated in specific countries; this approach serves as a means of implementing digital rights management. Use of proxy servers and virtual private networks may give a false location.

Social media measurement, also called social media controlling, is the management practice of evaluating successful social media communications of brands, companies, or other organizations.

Change detection and notification (CDN) is the automatic detection of changes made to World Wide Web pages and notification to interested users by email or other means.
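One standard way to implement such detection is to store a content fingerprint per page and compare it against each fresh fetch. The sketch below uses a SHA-256 digest for the fingerprint; this is one common choice, not a prescribed method, and a real system would also normalize dynamic page elements (timestamps, ads) before hashing to avoid false alarms.

```python
import hashlib

def fingerprint(page_text: str) -> str:
    """Stable digest of a page's content."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def detect_changes(previous: dict[str, str], current_pages: dict[str, str]) -> list[str]:
    """Compare stored fingerprints against freshly fetched pages.

    `previous` maps URL -> last-seen digest. Returns the URLs whose content
    changed since the last scan (new URLs count as changed), and updates
    `previous` in place so the next scan compares against this one.
    """
    changed = []
    for url, text in current_pages.items():
        digest = fingerprint(text)
        if previous.get(url) != digest:
            changed.append(url)
            previous[url] = digest
    return changed
```

The list returned by `detect_changes` is what a notification step would then email or push to interested users.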

Online presence management is the process of creating and promoting traffic to a personal or professional brand online. This process combines web design and development, blogging, search engine optimization, pay-per-click marketing, reputation management, directory listings, social media, link sharing, and other avenues to create a long-term positive presence for a person, organization, or product in search engines and on the web in general.

IFTTT is a private commercial company that runs services that allow a user to program a response to events in the world.
