HyperText Transfer Protocol (HTTP)

HTTP (Hypertext Transfer Protocol) is perhaps the most popular application protocol used on the Internet. It is the protocol on which the World Wide Web is built.

  • HTTP is an asymmetric request-response client-server protocol. An HTTP client sends a request message to an HTTP server, and the server, in turn, returns a response message. In other words, HTTP is a pull protocol: the client pulls information from the server (instead of the server pushing information down to the client).

  • HTTP is a stateless protocol. In other words, each request is handled on its own, with no knowledge of what was done in previous requests.

  • HTTP permits negotiating of data type and representation, so as to allow systems to be built independently of the data being transferred.

Browser

Whenever you issue a URL from your browser to get a web resource using HTTP, e.g. http://www.example.com/index.html, the browser turns the URL into a request message and sends it to the HTTP server. The HTTP server interprets the request message and returns an appropriate response message, which contains either the resource you requested or an error message. This process is explained stepwise below:

  1. User issues a URL from the browser: http://host:port/path/file

  2. Browser sends a request message

  3. Server maps the URL to a file or program under the document directory

  4. Server returns a response message

  5. Browser formats the response and displays it

Uniform Resource Locator (URL):

A URL (Uniform Resource Locator) is used to uniquely identify a resource on the web. A URL has the following syntax:

protocol://hostname:port/path-and-file-name

There are 4 parts in a URL:

  1. Protocol: The application-level protocol used by the client and server, e.g., HTTP, FTP, and telnet.

  2. Hostname: The DNS domain name (e.g. www.example.com) or IP address (e.g., 192.128.1.2) of the server.

  3. Port: The TCP port number on which the server listens for incoming requests from clients.

  4. Path-and-file-name: The name and location of the requested resource, under the server document base directory.

For example, in the URL http://www.example.com/docs/index.html, the communication protocol is HTTP and the hostname is www.example.com. The port number is not specified in the URL, so it takes on the default, which is TCP port 80 for HTTP. The path and file name of the resource to be located is /docs/index.html.
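The breakdown above can be reproduced with Python's standard urllib.parse module. This is only a minimal sketch: the function name split_url and the small DEFAULT_PORTS table are assumptions made for this example, not part of HTTP or of the library.

```python
from urllib.parse import urlsplit

# Default ports for a few schemes. This table is an assumption for the
# example; only the HTTP default (80) is stated in the text above.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def split_url(url):
    """Return (protocol, hostname, port, path) for the given URL."""
    parts = urlsplit(url)
    # urlsplit reports port as None when the URL does not specify one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


print(split_url("http://www.example.com/docs/index.html"))
# → ('http', 'www.example.com', 80, '/docs/index.html')
```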

HTTP Protocol

As mentioned, whenever we enter a URL in the address box of the browser, the browser translates the URL into a request message according to the specified protocol and sends the request message to the server.

For example, the browser translates the URL http://www.example.com/docs/index.html into the following request message:

GET /docs/index.html HTTP/1.1
Host: www.example.com
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
(blank line)
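
The translation from URL to request message can be sketched in a few lines of Python. The function name build_get_request is hypothetical, and the header set is deliberately smaller than what a real browser sends. Note that on the wire each line ends with CRLF, and the "(blank line)" is an empty line that marks the end of the header.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build a minimal HTTP/1.1 GET request message for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",     # request line: method, path, protocol version
        f"Host: {parts.netloc}",    # hostname (and port, if the URL gives one)
        "Accept: */*",
        "Connection: close",
    ]
    # Every line ends with CRLF; an extra empty line ends the header.
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_get_request("http://www.example.com/docs/index.html"))
```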

When this request message reaches the server, the server takes one of these actions:

  1. The server interprets the request received, maps the request into a file under the server's document directory, and returns the file requested to the client.

  2. The server interprets the request received, maps the request into a program kept in the server, executes the program, and returns the output of the program to the client.

  3. If the request cannot be satisfied, the server returns an error message.

An example of the HTTP response message is shown below:
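
Actions 1 and 3 can be sketched as follows. This is a simplified illustration, not how any particular web server is implemented: the names resolve_request and handle_get are hypothetical, action 2 (running a program) is left out, and the only error reported is 404.

```python
from pathlib import Path


def resolve_request(doc_root, url_path):
    """Map a URL path to a file under doc_root, or return None if it cannot be served."""
    root = Path(doc_root).resolve()
    target = (root / url_path.lstrip("/")).resolve()
    # Reject paths that escape the document directory (for example via "..").
    if target != root and root not in target.parents:
        return None
    return target if target.is_file() else None


def handle_get(doc_root, url_path):
    """Return (status code, body bytes) for a GET request on url_path."""
    target = resolve_request(doc_root, url_path)
    if target is None:
        return 404, b"Not Found"       # action 3: the request cannot be satisfied
    return 200, target.read_bytes()    # action 1: return the requested file
```

For instance, handle_get("/var/www", "/docs/index.html") would return status 200 with the file's contents if /var/www/docs/index.html exists, and 404 otherwise (the directory /var/www is just an example).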

HTTP/1.1 200 OK
Date: Sun, 18 Oct 2009 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug
  
<html><body><h1>It works!</h1></body></html>

The browser receives the response message, interprets the message and displays the contents of the message on the browser's window according to the media type of the response (as in the Content-Type response header). Common media types include "text/plain", "text/html", "image/gif", "image/jpeg", "audio/mpeg", "video/mpeg", "application/msword", and "application/pdf".

In its idle state, an HTTP server does nothing but listen on the IP address(es) and port(s) specified in its configuration for incoming requests. When a request arrives, the server analyzes the message header, applies the rules specified in the configuration, and takes the appropriate action. The webmaster's main control over the action of the web server is via the configuration, which will be dealt with in greater detail in the later sections.

HTTP Request and Response Messages

An HTTP client and server communicate by sending text messages. The client sends a request message to the server. The server, in turn, returns a response message.

An HTTP message consists of a message header and an optional message body, separated by a blank line.
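
The header/body split can be sketched as below. The function name split_message is hypothetical, and the sketch assumes the complete message is already in memory as bytes with CRLF line endings. Real parsers also have to deal with partial reads, repeated header names, and chunked bodies.

```python
def split_message(raw):
    """Split a raw HTTP message into (start line, headers dict, body bytes)."""
    # The first empty line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 44\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body><h1>It works!</h1></body></html>"
)
start_line, headers, body = split_message(raw)
print(start_line)                 # → HTTP/1.1 200 OK
print(headers["content-type"])    # → text/html
print(len(body))                  # → 44
```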

HTTP defines a set of request methods. A client can use one of these request methods to send a request message to an HTTP server. Commonly used methods include:

  • GET: A client can use the GET request to get a web resource from the server.

  • POST: A client can use the POST request to submit data to the web server.

  • PUT: Asks the server to store the enclosed data at the requested URL.

  • DELETE: Asks the server to delete the resource at the requested URL.
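
As a rough illustration, the standard urllib.request module lets a client choose the method when it builds a request. The sketch below only constructs the request objects and does not send anything over the network. The URL is the example used earlier in this article and the request bodies are made up.

```python
from urllib.request import Request

URL = "http://www.example.com/docs/index.html"  # example URL from this article

# One request object per method; the bodies are made-up sample data.
requests = [
    Request(URL, method="GET"),
    Request(URL, data=b"name=value", method="POST"),
    Request(URL, data=b"<html>new content</html>", method="PUT"),
    Request(URL, method="DELETE"),
]

for req in requests:
    # get_method() returns the method name that would appear in the request line.
    print(req.get_method(), req.selector, len(req.data or b""))
```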

The first line of the response message (i.e., the status line) contains the response status code, which is generated by the server to indicate the outcome of the request.

The status code is a 3-digit number:

  • 1xx (Informational): The request was received and the server is continuing to process it.

  • 2xx (Success): The request was successfully received, understood, accepted, and serviced.

  • 3xx (Redirection): Further action must be taken in order to complete the request.

  • 4xx (Client Error): The request contains bad syntax or cannot be fulfilled.

  • 5xx (Server Error): The server failed to fulfill an apparently valid request.
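
Because the class is determined by the first digit alone, a client can classify any status code with integer division. A minimal sketch follows; the names STATUS_CLASSES and status_class are hypothetical.

```python
# Class names as listed above, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Return the class name for a 3-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(200))  # → Success
print(status_class(404))  # → Client Error
```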
