Common Log Format


For computer log management, the Common Log Format,[1] also known as the NCSA Common log format[2] (after NCSA HTTPd), is a standardized text file format used by web servers when generating server log files.[3] Because the format is standardized, the files can be readily analyzed by a variety of web analysis programs, such as Webalizer and Analog.


Each line in a file stored in the Common Log Format has the following syntax:

host ident authuser date request status bytes 

The Combined Log Format extends the Common Log Format with two additional fields: referer and user-agent.

Example

127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

A dash (-) in a field indicates missing data.
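As an illustration, the following is a minimal parsing sketch in Python; the regular expression, function name, and field handling are choices made for this sketch, not part of the format's specification.

import re

# One capture group per Common Log Format field:
# host, ident (client identity per RFC 1413, rarely used), authuser
# (userid from HTTP authentication), date, request line, status code,
# and size of the returned object in bytes.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    # A "-" in the bytes field means no data was returned.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields

print(parse_clf_line('127.0.0.1 user-identifier frank '
                     '[10/Oct/2000:13:55:36 -0700] '
                     '"GET /apache_pb.gif HTTP/1.0" 200 2326'))

A Combined Log Format line can be handled the same way by appending two further quoted capture groups for the referer and user-agent fields.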

Usage

Log files are a standard tool for computer systems developers and administrators. They record the "what happened, when, by whom" of the system. This information can record faults and aid their diagnosis, identify security breaches and other computer misuse, and be used for auditing and accounting purposes.

The information stored is only available for later analysis if it is stored in a form that can be analysed. This data can be structured in many ways for analysis. For example, storing it in a relational database would force the data into a queryable format. However, it would also make the data more difficult to retrieve if the computer crashed, and logging would not be available unless the database was available. A plain text format minimises dependencies on other system processes and assists logging at all phases of computer operation, including start-up and shut-down, where such processes might be unavailable.
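As a sketch of how readily such a plain text log can be analysed, the short Python example below counts HTTP status codes; the file name access.log and the choice of report are assumptions made for illustration.

import collections

# Tally status codes from a Common Log Format file. Every line ends with
# ... "request" status bytes, so the status code is the second-to-last field.
counts = collections.Counter()
with open("access.log", encoding="utf-8") as log:
    for line in log:
        parts = line.strip().rsplit(" ", 2)
        if len(parts) == 3:
            counts[parts[1]] += 1
print(counts.most_common())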


Related Research Articles

<span class="mw-page-title-main">Apache HTTP Server</span> Open-source web server software

The Apache HTTP Server is a free and open-source cross-platform web server software, released under the terms of Apache License 2.0. Apache is developed and maintained by an open community of developers under the auspices of the Apache Software Foundation.

In computing, Common Gateway Interface (CGI) is an interface specification that enables web servers to execute an external program, typically to process user requests.
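As a minimal illustration of the interface, a CGI program writes a header block, a blank line, and a body to standard output, while the server passes request details through environment variables. The Python sketch below is illustrative only and assumes the server is configured to execute it as a CGI script.

#!/usr/bin/env python3
# Minimal CGI program: the web server executes it per request and
# relays whatever it writes to standard output back to the client.
import os

print("Content-Type: text/plain")
print()  # a blank line separates the CGI headers from the body
print("Hello from CGI")
print("Query string:", os.environ.get("QUERY_STRING", ""))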

<span class="mw-page-title-main">Web server</span> Computer software that distributes web pages

A web server is computer software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.

<span class="mw-page-title-main">HTTP 404</span> Internet error message

In computer network communications, the HTTP 404, 404 not found, 404, 404 error, page not found or file not found error message is a Hypertext Transfer Protocol (HTTP) standard response code indicating that the browser was able to communicate with a given server, but the server could not find what was requested. The error may also be used when a server does not wish to disclose whether it has the requested information.

httpd: Index of articles associated with the same name

HTTPd is a software program that usually runs in the background, as a process, and plays the role of a server in a client–server model using the HTTP and/or HTTPS network protocol(s).

<span class="mw-page-title-main">Squid (software)</span> Caching and forwarding HTTP web proxy

Squid is a caching and forwarding HTTP web proxy. It has a wide variety of uses, including speeding up a web server by caching repeated requests, caching web, DNS and other computer network lookups for a group of people sharing network resources, and aiding security by filtering traffic. Although primarily used for HTTP and FTP, Squid includes limited support for several other protocols including Internet Gopher, SSL, TLS and HTTPS. Squid does not support the SOCKS protocol, unlike Privoxy, with which Squid can be used in order to provide SOCKS support.

NCSA HTTPd is an early, now discontinued, web server originally developed at the NCSA at the University of Illinois at Urbana–Champaign by Robert McCool and others. First released in 1993, it was among the earliest web servers developed, following Tim Berners-Lee's CERN httpd, Tony Sanders' Plexus server, and some others. It was for some time the natural counterpart to the Mosaic web browser in the client–server World Wide Web. It also introduced the Common Gateway Interface, allowing for the creation of dynamic websites.

Server Side Includes (SSI) is a simple interpreted server-side scripting language used almost exclusively for the World Wide Web. It is most useful for including the contents of one or more files into a web page on a web server, using its #include directive, typically for pieces of code shared throughout a site, such as a page header, a page footer and a navigation menu. SSI also contains control directives for conditional features and directives for calling external programs. It is supported by Apache, LiteSpeed, nginx, IIS as well as W3C's Jigsaw. It has its roots in NCSA HTTPd.

An .htaccess file is a directory-level configuration file supported by several web servers, used for configuration of website-access issues, such as URL redirection, URL shortening, access control, and more. The 'dot' before the file name makes it a hidden file in Unix-based environments.

NCSA is the National Center for Supercomputing Applications at the University of Illinois at Urbana–Champaign.

<span class="mw-page-title-main">AWStats</span> Web analytics reporting tool

AWStats is an open source Web analytics reporting tool, suitable for analyzing data from Internet services such as web, streaming media, mail, and FTP servers. AWStats parses and analyzes server log files, producing HTML reports. Data is visually presented within reports by tables and bar graphs. Static reports can be created through a command line interface, and on-demand reporting is supported through a Web browser CGI program.

<span class="mw-page-title-main">Digest access authentication</span> Method of negotiating credentials between web server and browser

Digest access authentication is one of the agreed-upon methods a web server can use to negotiate credentials, such as username or password, with a user's web browser. This can be used to confirm the identity of a user before sending sensitive information, such as online banking transaction history. It applies a hash function to the username and password before sending them over the network. In contrast, basic access authentication uses the easily reversible Base64 encoding instead of hashing, making it non-secure unless used in conjunction with TLS.
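As a sketch of the hashing involved, the following computes the response value in the original RFC 2069 form of the scheme (without the later qop extensions); the parameter values are placeholders, not a working client.

import hashlib

def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    # HA1 covers the credentials, HA2 covers the request being made;
    # only the final hash travels over the network, not the password.
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

print(digest_response("frank", "example", "secret", "GET", "/index.html", "abc123"))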

The Webalizer is web log analysis software that generates web pages of analysis from access and usage logs. It is one of the most commonly used web server administration tools. It was initiated by Bradford L. Barrett in 1997. Statistics commonly reported by Webalizer include hits, visits, referrers, the visitors' countries, and the amount of data downloaded. These statistics can be viewed graphically and presented by different time frames, such as by day, hour, or month.

Web analytics is the measurement, collection, analysis, and reporting of web data to understand and optimize web usage. Web analytics is not just a process for measuring web traffic; it can also be used as a tool for business and market research and to assess and improve website effectiveness. Web analytics applications can also help companies measure the results of traditional print or broadcast advertising campaigns, for example by estimating how traffic to a website changes after launching a new advertising campaign. Web analytics provides information about the number of visitors to a website and the number of page views, and can help create user behavior profiles. It helps gauge traffic and popularity trends, which is useful for market research.

<span class="mw-page-title-main">HTTP compression</span> Capability that can be built into web servers and web clients

HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization.

A web desktop or webtop is a desktop environment embedded in a web browser or similar client application. A webtop integrates web applications, web services, client–server applications, application servers, and applications on the local client into a desktop environment using the desktop metaphor. Web desktops provide an environment similar to that of Windows, Mac, or a graphical user interface on Unix and Linux systems. It is a virtual desktop running in a web browser. In a webtop the applications, data, files, configuration, settings, and access privileges reside remotely over the network. Much of the computing takes place remotely. The browser is primarily used for display and input purposes.

<span class="mw-page-title-main">HTTP persistent connection</span> Using a single TCP connection to send and receive multiple HTTP requests/responses

HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair. The newer HTTP/2 protocol uses the same idea and takes it further to allow multiple concurrent requests/responses to be multiplexed over a single connection.

A file inclusion vulnerability is a type of web vulnerability that is most commonly found to affect web applications that rely on a scripting run time. This issue is caused when an application builds a path to executable code using an attacker-controlled variable in a way that allows the attacker to control which file is executed at run time. A file include vulnerability is distinct from a generic directory traversal attack, in that directory traversal is a way of gaining unauthorized file system access, and a file inclusion vulnerability subverts how an application loads code for execution. Successful exploitation of a file inclusion vulnerability will result in remote code execution on the web server that runs the affected web application. An attacker can use remote code execution to create a web shell on the web server, which can be used for website defacement.

In computing, logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations. These events may occur in the operating system or in other software. A message or log entry is recorded for each such event. These log messages can then be used to monitor and understand the operation of the system, to debug problems, or during an audit. Logging is particularly important in multi-user software, to have a central overview of the operation of the system.

<span class="mw-page-title-main">Web server directory index</span> Index page of a websites directory

When an HTTP client requests a URL that points to a directory structure instead of an actual web page within the directory structure, the web server will generally serve a default page, which is often referred to as a main or "index" page.

References

  1. "Logging in W3C httpd". World Wide Web Consortium. 1995-10-12. Retrieved 2015-04-16.
  2. "Log File Formats: NCSA Common". IBM. 2004-05-19. Archived from the original on 2021-02-24. Retrieved 2013-05-07.
  3. stevewhims. "NCSA Logging - Win32 apps". learn.microsoft.com. Retrieved 2023-02-17.