HTTP protocol basics
HTTP is the protocol that browsers use to talk to web servers.
The acronym stands for "HyperText Transfer Protocol",
which reflects its purpose: transferring web (hypertext) documents.
The World Wide Web is a latecomer to the Internet. Internet old-timers
will recall the time when there was no WWW, and in order to communicate
using this medium, you had to use text-mode tools, such as the UNIX vi editor,
to write mail and participate in newsgroups.
Some of the other common protocols in use for Internet data
transmission are:
- SMTP/POP/IMAP - Used for mail transmission and retrieval
- FTP - File Transfer Protocol - One of many protocols used for exchanging files
- Gopher - Multi-use protocol used for displaying and transferring hierarchical data
- NNTP - Network News Transfer Protocol - Used to share discussion messages
- TELNET - A very simple protocol used for interactive text transmission
The protocols that I mentioned are all high-level protocols;
that is, they piggyback on lower-level protocols to get from one
point to another. Usually they ride on TCP/IP, but a multitude of other
transport protocols are also in use today.
The main point to keep in mind with most high-level Internet
protocols is that they are usually text-based and human-readable. That is,
you can quite easily read and write the data yourself, as if you were one end
of the protocol. There are many Telnet-type tools available for talking with
servers of all protocol flavors. It is a relatively simple matter to connect to
a mail server with a telnet client, compose a mail message, complete with
headers, and send it.
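To make the "type it by hand" idea concrete, here is a rough sketch of the lines you might type into a telnet session with a mail server. The function name, the client.example.com host name and the addresses are all made up for illustration, and a real server will answer each command with a status line before you send the next one.

```python
def smtp_session_lines(sender, recipient, subject, body):
    """Return the lines a person might type into a raw SMTP session.

    Hypothetical helper for illustration only; it sends nothing.
    """
    return [
        "HELO client.example.com",   # introduce ourselves (made-up host name)
        f"MAIL FROM:<{sender}>",     # envelope sender
        f"RCPT TO:<{recipient}>",    # envelope recipient
        "DATA",                      # the message itself follows
        f"From: {sender}",           # mail headers...
        f"To: {recipient}",
        f"Subject: {subject}",
        "",                          # blank line separates headers from body
        body,
        ".",                         # a lone dot ends the message
        "QUIT",
    ]


for line in smtp_session_lines(
    "alice@example.com", "bob@example.com", "Hello", "Typed by hand."
):
    print(line)
```

Every line is ordinary text, which is exactly why a plain telnet client is enough to play the part of a mail program.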
HTTP connections can also be established using a Telnet client,
and this is a good way to learn about the construction of HTTP headers.
Headers are blocks of data that the HTTP server and client (browser) use to
exchange metadata. Metadata is simply data that describes the attributes of
other data. A header, as the name implies, comes at the start of each message
in an HTTP conversation, ahead of any content. There are several standard bits
of metadata that are transmitted in the header, and many non-standard bits (use
Interceptor to view the data from this site, and you'll see what I mean).
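As a small illustration of headers as metadata, the sketch below splits a raw header block into its first line and a table of name/value pairs. The sample response and the X-Example-Custom field are invented for this example, and parse_header_block is a hypothetical helper, not part of any library.

```python
def parse_header_block(raw):
    """Split a raw HTTP header block into its first line and a dict of fields."""
    lines = raw.split("\r\n")
    first_line = lines[0]
    fields = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line marks the end of the header block
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lowercase.
        fields[name.strip().lower()] = value.strip()
    return first_line, fields


# Invented sample: two standard fields and one non-standard one.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 1234\r\n"
    "X-Example-Custom: anything\r\n"
    "\r\n"
)

status, fields = parse_header_block(sample)
print(status)
for name, value in fields.items():
    print(name, "=", value)
```

Nothing in the header is the page itself; every field only describes the data that follows.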
There are two basic types of headers: Request and Response.
The request headers are sent by the client to the server, and the response
headers are sent by the server to the client. Normally, this transfer takes
place in a step-by-step order: I send a request, and the server sends a response
to that request. In some situations this is not the case, such as in pipelining
(sending multiple requests without waiting for each response).
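Here is a minimal sketch of what pipelining looks like on the wire: two requests written back to back before any response has been read. The build_get helper, the paths and the www.example.com host are assumptions for illustration, and nothing is actually sent anywhere.

```python
def build_get(path, host):
    """Build the raw bytes of a bare-bones HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"                      # blank line: end of this request's headers
    ).encode("ascii")


# Step by step, the client would send the first request, wait for its
# response, and only then send the second. Pipelined, both go out together:
pipelined = (
    build_get("/index.html", "www.example.com")
    + build_get("/about.html", "www.example.com")
)

print(pipelined.decode("ascii"))
```

The server is still expected to answer in the order the requests arrived, so the client can match each response to its request.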
Request Headers
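Among request headers, three types are the most basic and the most common:
Head, Post and Get. (Strictly speaking, the HTTP specification calls these
request methods; the method name appears on the first line of the request,
ahead of the header fields.)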
"Get" is the most often used of the three - this is
the header that your browser sends to a server to request a web page.
"Head" is a header that requests that the server only
return the response header of a resource. In other words, it will send the
header as if you did a "Get", but no data will follow. This is very useful
for testing purposes, and when using a Telnet client to analyze a resource.
I've used this technique extensively in the past to troubleshoot complex
web pages.
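To see the difference for yourself without touching a real site, the sketch below starts a throwaway server on the local machine and sends it one "Get" and one "Head". The DemoHandler class, the fetch helper and the PAGE content are all made up for this example; the standard library modules http.server and http.client do the real work.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"  # made-up page content


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)      # headers, then the page itself

    def do_HEAD(self):
        self._send_headers()        # same headers, but no data follows

    def log_message(self, format, *args):
        pass                        # keep the demo output quiet


def fetch(method, port):
    """Send one request to the local demo server; return (status, length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    resp = conn.getresponse()
    body = resp.read()
    length = resp.getheader("Content-Length")
    conn.close()
    return resp.status, length, body


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch("GET", port)
head_result = fetch("HEAD", port)

server.shutdown()
server.server_close()

print("GET :", get_result)
print("HEAD:", head_result)
```

Both responses should report the same status and Content-Length, but only the "Get" carries the page; the body of the "Head" response comes back empty.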
"Post" is a little different. This is used when a web
user sends HTML form data to a server. This is an example of a request
that actually sends data other than metadata to the server: the form data
travels in the body of the request, after the headers.
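As a rough sketch of what a browser sends when a form is submitted, the code below builds the raw text of a "Post" request. The build_post helper, the /signup path, the host name and the form fields are all invented for illustration; urlencode comes from Python's standard library.

```python
from urllib.parse import urlencode


def build_post(path, host, form_fields):
    """Build the raw bytes of a form-encoded HTTP/1.1 POST request."""
    # The form data becomes the body, encoded as name=value pairs.
    body = urlencode(form_fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"  # tells the server how much data follows
        "\r\n"                              # blank line: headers end, data begins
    ).encode("ascii")
    return head + body


request = build_post(
    "/signup", "www.example.com", {"name": "Ada Lovelace", "age": "36"}
)
print(request.decode("ascii"))
```

Everything above the blank line is metadata; the single line below it is the form data itself.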
There are several other types of headers, which I will not get into right now. HTTP covers a lot of ground, and much functionality has been added, especially since the introduction of version 1.1. I will be adding more information on the other types of headers and on different communication mechanisms in a future article. For now, I highly suggest that you read the RFC that describes all of the parameters of the HTTP protocol. This may be found at www.faqs.org/rfcs/rfc2616.html. Note that RFC 2616 has since been superseded by newer RFCs, most recently RFC 9110 and RFC 9112.
In future articles, I will discuss many of the HTTP headers in depth, including their construction, their different fields and how they are used, and I will provide methods of manipulating headers that give you fine-grained control over how your web sites communicate with the world. Though most of my programming examples will deal with PHP, I will try to include some ASP tricks as well. I will probably also post some snippets of code that I used in designing an HTML-based chatroom, as this is about the most extreme example of pure server-side HTTP programming, and it requires tight control over HTTP transmissions.
So check back often, as I will be adding lots of useful information.