Web traffic

Web traffic is the amount of data sent and received by visitors to a website. This does not include traffic generated by bots. Since the mid-1990s, web traffic has been the largest portion of Internet traffic.[1] Web traffic is determined by the number of visitors and the number of pages they visit. Sites monitor incoming and outgoing traffic to see which parts or pages of the site are popular and whether there are any apparent trends, such as one specific page being viewed mostly by people in a particular country. There are many ways to monitor this traffic, and the gathered data is used to help structure sites, highlight security problems, or indicate a potential lack of bandwidth.

Not all web traffic is welcome. Some companies offer advertising schemes that, in return for increased web traffic (visitors), pay for screen space on the site. There is also "fake traffic", bot traffic generated by a third party. This type of traffic can damage a website's reputation, its visibility on Google, and its overall domain authority.[citation needed]

Sites also often aim to increase their web traffic through inclusion on search engines and through search engine optimization.

Analysis

Web analytics is the measurement of the behavior of visitors to a website. In a commercial context, it especially refers to the measurement of which aspects of the website work towards the business objectives of Internet marketing initiatives; for example, which landing pages encourage people to make a purchase. Notable vendors of web analytics software and services include Google Analytics, IBM Digital Analytics (formerly Coremetrics) and Adobe Omniture.

Measurement

Example graph of web traffic at Wikipedia in December 2004

Web traffic is measured to see the popularity of websites and of individual pages or sections within a site. This can be done by viewing the traffic statistics found in the web server log file, an automatically generated list of all the pages served. A hit is generated when any file is served. The page itself is considered a file, but images are also files, so a page with five images could generate six hits (the five images and the page itself). A page view is generated when a visitor requests any page within the website – a visitor will always generate at least one page view (the main page) but could generate many more. Tracking applications external to the website can record traffic by inserting a small piece of HTML code in every page of the website.[2]
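The hit/page-view distinction above can be sketched as a small log-parsing routine. This is an illustrative sketch, not any real analytics tool: the sample log lines, the Common Log Format parsing, and the page-extension heuristic are all assumptions for demonstration.

```python
# Sketch: tallying "hits" vs. "page views" from web server log lines.
# Every served file (page, image, stylesheet) is one hit; only requests
# for HTML pages count as page views. Sample data is illustrative.

PAGE_EXTENSIONS = (".html", ".htm", "/")  # heuristic for "a page"

def count_hits_and_page_views(log_lines):
    """Return (hits, page_views) for a list of access-log lines."""
    hits = 0
    page_views = 0
    for line in log_lines:
        try:
            # The request field looks like: "GET /index.html HTTP/1.1"
            request = line.split('"')[1]
            path = request.split()[1]
        except IndexError:
            continue  # skip malformed lines
        hits += 1
        if path.endswith(PAGE_EXTENSIONS):
            page_views += 1
    return hits, page_views

sample_log = [
    '1.2.3.4 - - [10/Dec/2004:10:00:00] "GET /index.html HTTP/1.1" 200 512',
    '1.2.3.4 - - [10/Dec/2004:10:00:01] "GET /logo.png HTTP/1.1" 200 2048',
    '1.2.3.4 - - [10/Dec/2004:10:00:01] "GET /style.css HTTP/1.1" 200 128',
]

hits, views = count_hits_and_page_views(sample_log)
print(hits, views)  # one page with two supporting files: 3 hits, 1 page view
```

One page request with two embedded resources thus yields three hits but a single page view, which is why hit counts overstate readership.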

Web traffic is also sometimes measured by packet sniffing, which yields random samples of traffic data from which information about web traffic as a whole, across total Internet usage, can be extrapolated.

A range of standard metrics is collated when monitoring web traffic, as defined by the Web Analytics Association.[3]

Websites produce traffic rankings and statistics based on the people who access the sites while using their toolbars and other means of online measurement. The difficulty with this is that it does not capture the complete traffic picture for a site. Large sites usually hire the services of companies such as Nielsen NetRatings or Quantcast, but their reports are available only by subscription.

Control

The amount of traffic seen by a website is a measure of its popularity. By analysing the statistics of visitors it is possible to see shortcomings of the site and look to improve those areas. It is also possible to increase the popularity of a site and the number of people that visit it.

Limiting access

It is sometimes important to protect some parts of a site by password, allowing only authorized people to visit particular sections or pages.
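On Apache HTTP Server, for example, a section of a site can be password-protected with HTTP Basic authentication. This is a minimal sketch; the directory path and the password-file location are illustrative assumptions, not a recommended layout.

```apache
# Restrict /private/ to authenticated users (Apache 2.4, mod_auth_basic).
# The password file path is illustrative; it could be created with:
#   htpasswd -c /etc/apache2/.htpasswd alice
<Directory "/var/www/html/private">
    AuthType Basic
    AuthName "Restricted Section"
    AuthUserFile /etc/apache2/.htpasswd
    Require valid-user
</Directory>
```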

Some site administrators have chosen to block their page to specific traffic, such as by geographic location. The re-election campaign site for U.S. President George W. Bush (GeorgeWBush.com) was blocked to all Internet users outside the U.S. on 25 October 2004 after a reported attack on the site.[4]

It is also possible to limit access to a web server based on both the number of connections and the bandwidth consumed by each connection. On Apache HTTP Server, this can be accomplished with the limitipconn module, among others.
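A hedged sketch of such a per-IP connection limit follows. The `MaxConnPerIP` directive is the one documented for the third-party limitipconn module; the location and the limit of 5 are illustrative assumptions.

```apache
# Limit each client IP to 5 simultaneous connections for downloads
# (mod_limitipconn, third-party module; directive per its docs).
<IfModule mod_limitipconn.c>
    <Location "/downloads">
        MaxConnPerIP 5
    </Location>
</IfModule>
```

Bandwidth per connection would be capped separately, for instance with Apache's own mod_ratelimit.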

From search engines

The majority of website traffic is driven by search engines. Millions of people use search engines every day to research topics, buy products, and go about their daily browsing. Search engines use keywords to help users find relevant information, and each of the major search engines has developed a unique algorithm to determine where websites are placed within the search results. When a user clicks on one of the listings in the search results, they are directed to the corresponding website and data is transferred from the website's server, so the visit counts towards the overall flow of traffic to that website.

Search engine optimization (SEO) is the ongoing practice of optimizing a website to improve its rankings in the search engines. Several internal and external factors are involved which can help improve a site's listing within the search engines. The higher a site ranks within the search engines for a particular keyword, the more traffic it will receive.

Increasing traffic

Web traffic can be increased by placement of a site in search engines and purchase of advertising, including bulk e-mail, pop-up ads, and in-page advertisements.

Web traffic can also be increased by purchasing it from web traffic providers who specialize in delivering targeted traffic; however, buying traffic has in the past led to many websites being penalized by search engines.

Web traffic can be increased not only by attracting more visitors to a site, but also by encouraging individual visitors to "linger" on the site, viewing many pages in a visit (see Outbrain for an example of this practice).

If a web page is not listed in the first pages of a search, the odds of someone finding it diminish greatly (especially if there is other competition on the first page). Very few people go past the first page, and the percentage that go to subsequent pages is substantially lower. Consequently, getting proper placement on search engines, a practice known as SEO, is as important as the website itself.[citation needed]

Traffic overload

Too much web traffic can dramatically slow down or prevent all access to a website. This is caused by more file requests going to the server than it can handle and may be an intentional attack on the site or simply the result of over-popularity. Large-scale websites with numerous servers can often cope with the traffic required, and it is more likely that smaller services are affected by traffic overload. A sudden traffic load may also hang a server or result in a shutdown of services.

Denial of service attacks

Denial-of-service attacks (DoS attacks) have forced websites to close after a malicious attack, flooding the site with more requests than it could cope with. Viruses have also been used to coordinate large-scale distributed denial-of-service attacks.[5]

Sudden popularity

A sudden burst of publicity may accidentally cause a web traffic overload. A news item in the media, a quickly propagating email, or a link from a popular site may cause such a boost in visitors (sometimes called a flash crowd or the Slashdot effect).

Overall worldwide

According to Mozilla, since January 2017 more than half of web traffic has been encrypted with HTTPS.[6][7]

According to estimates cited by the Interactive Advertising Bureau in 2014, around one third of web traffic is generated by Internet bots and malware.[8][9]


References

  1. Jeffay, Kevin. "Tracking the Evolution of Web Traffic: 1995-2003*" (PDF). UNC DiRT Group's Publications. University of North Carolina at Chapel Hill.
  2. Malacinski, Andrei; Dominick, Scott; Hartrick, Tom (1 March 2001). "Measuring Web Traffic". IBM. Archived from the original on 19 July 2008. Retrieved 10 October 2011.
  3. "Web Analytics Definitions" (PDF). Web Analytics Association. 22 September 2008. Retrieved 18 May 2015.
  4. Miller, Rich (2004-10-26). "Bush Campaign Web Site Rejects Non-US Visitors".
  5. "Denial of Service". Cert.org. Retrieved 28 May 2012.
  6. "We're Halfway to Encrypting the Entire Web". Electronic Frontier Foundation. 21 February 2017. Retrieved 3 May 2017.
  7. Finley, Klint. "Half the Web Is Now Encrypted. That Makes Everyone Safer". WIRED. Retrieved 1 May 2017.
  8. Vranica, Suzanne (23 March 2014). "A 'Crisis' in Online Ads: One-Third of Traffic Is Bogus". Wall Street Journal. Retrieved 3 May 2017.
  9. "36% Of All Web Traffic Is Fake". Business Insider. Retrieved 3 May 2017.