- Hypertext Transfer Protocol (HTTP) is an application-layer protocol, primarily designed for hypertext documents and later extended to support any data type. Development was coordinated by the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C).
- The communication model is client-server. The client sends an HTTP request message to the server, which handles the requested resource and sends back an HTTP response message. The server typically listens on port 80 (443 for HTTPS).
- Example, using telnet:
$ telnet localhost 80
Trying ::1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0
Host: localhost

HTTP/1.1 200 OK
Date: Tue, 23 Oct 2012 06:31:30 GMT
Server: Apache/2.2.22 (Fedora)
Last-Modified: Tue, 23 Oct 2012 06:30:51 GMT
Accept-Ranges: bytes
Content-Length: 124
Connection: close
Content-Type: text/html; charset=UTF-8

<html>
<head><title>Server test</title></head>
<body>
<h1>Congratulations</h1>
<p>Server works fine !!!</p>
</body>
</html>
Connection closed by foreign host.
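The request typed into telnet and the first line of the reply can also be handled programmatically. The following is a minimal sketch, not a full HTTP client: the helper names `build_request` and `parse_status_line` are made up for illustration, and the sample response is a shortened copy of the one shown in the telnet session.

```python
def build_request(method, path, host):
    """Serialize a minimal HTTP/1.0 request like the one typed into telnet."""
    lines = [
        f"{method} {path} HTTP/1.0",  # request line: method, resource, version
        f"Host: {host}",
        "",  # the empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first response line into version, status code and reason."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


request = build_request("GET", "/", "localhost")
sample = b"HTTP/1.1 200 OK\r\nContent-Length: 124\r\nConnection: close\r\n\r\n"
status = parse_status_line(sample)
print(request)
print(status)
```

Sending `request` over a TCP socket connected to port 80 would play the same role as the telnet session; that part is left out here so the sketch runs without a server.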
- The HTTP method (command) informs the server about the goal of the request
GET | requests specified resource, only retrieves data, no other effects |
HEAD | similar to GET, but without response body |
POST | submits data to be processed on server |
PUT | uploads specified resource to the server |
DELETE | deletes specified resource |
OPTIONS | returns the HTTP methods supported for the specified URL |
TRACE | echoes the received request back, so the client can see what changes intermediate servers have made to it |
CONNECT | establishes a tunnel through an HTTP proxy, typically for SSL-encrypted communication (HTTPS) through an unencrypted proxy |
PATCH | applies partial modifications to a resource |
- Header fields are components of the message header of requests and responses; they define the operating parameters of an HTTP transaction. Each header field is written as follows:
header field: value
- Four groups (general, request, response and entity headers), for example:
Accept | Acceptable Content-Types (e.g. text/plain) |
Accept-Charset | Acceptable character sets (e.g. utf-8) |
Accept-Encoding | Acceptable encodings (e.g. gzip) |
Accept-Language | Acceptable languages for response (e.g. en-US) |
Host | The domain name of the server and the TCP port number on which the server is listening (e.g. example.com:80) |
... | ... |
Content-Location | Location of the data content (e.g. /image.jpg) |
Content-Language | The language of returned content (e.g. cs) |
Content-Length | Length of the returned content (e.g. 408) |
Expires | Date/time after which the response is considered stale (e.g. Sun, 16 Sep 2012 18:00:00 GMT) |
WWW-Authenticate | Authentication scheme that should be used to access the requested entity (e.g. Basic) |
... | ... |
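The `header field: value` format can be parsed in a few lines. This is a minimal sketch under simplifying assumptions: `parse_headers` is a made-up helper name, repeated fields are not merged (the last one wins), and field names are lower-cased because HTTP header names are case-insensitive.

```python
def parse_headers(block):
    """Parse 'field: value' lines into a dict keyed by lower-cased field name."""
    headers = {}
    for line in block.splitlines():
        if not line.strip():
            break  # an empty line marks the end of the header section
        # Split on the first colon only, so values such as
        # "example.com:80" keep their own colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


sample = (
    "Host: example.com:80\r\n"
    "Accept: text/plain\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)
headers = parse_headers(sample)
print(headers["host"])
print(headers["accept-language"])
```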
- The numeric status code always appears in the first line of the response and reports the outcome of the processed request.
- Five groups of codes: 1xx informational, 2xx success, 3xx redirection, 4xx client error, 5xx server error
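The group of a status code is determined by its first digit. A short sketch of that rule follows; the `status_group` helper and the `STATUS_GROUPS` table are illustrative names, while `HTTPStatus` comes from the Python standard library and is used only to look up the reason phrase.

```python
from http import HTTPStatus

# First digit of the status code -> group name
STATUS_GROUPS = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_group(code):
    """Return the group name determined by the first digit of the code."""
    return STATUS_GROUPS.get(code // 100, "Unknown")


for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_group(code))
```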
- Still one of the most popular HTTP servers (serves 54.98% of all active websites, according to estimates made in September 2012).
- Open source software, developed and maintained by the Apache Software Foundation; originally based on the NCSA HTTPd code, one of the first web servers developed.
Installation:
Fedora/CentOS: # yum install httpd
RHEL: # up2date httpd
Debian: # apt-get install apache2
- Configuration parameters are defined in the main configuration file, whose name depends on the Linux distribution
- Three groups of directives:
ServerAdmin | E-mail address to which problems should be reported (e.g. root@example.com) |
ServerName | Name and port of the server (e.g. www.example.com:80) |
KeepAlive | Enables persistent connections (On or Off) |
MaxKeepAliveRequests | Maximum number of requests allowed within a single persistent connection |
DocumentRoot | Root directory from which documents are served (e.g. /var/www/html) |
... | ... |
- Apache allows running more than one web site on a single machine.
- IP-based vs. name-based virtual hosts.
Example of Name-based configuration:
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.company.com
    ServerAlias company.com *.company.com
    DocumentRoot /www/company
</VirtualHost>

<VirtualHost *:80>
    ServerName www.other.company.com
    DocumentRoot /www/othercompany
</VirtualHost>
Example of IP-based Configuration:
<VirtualHost 193.29.200.130>
    DocumentRoot /www/company1
    ServerAdmin webmaster@first.company.com
    ServerName www.first.company.com
</VirtualHost>

<VirtualHost 193.29.200.140>
    DocumentRoot /www/company2
    ServerAdmin webmaster@second.company.com
    ServerName www.second.company.com
</VirtualHost>
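With name-based virtual hosts, the server decides which site to serve from the Host header of the request. The sketch below is a simplified model of that idea, not Apache's actual matching code: the host names, document roots and the `select_vhost` helper are hypothetical, wildcard aliases are matched with `fnmatch`, and the first listed host is used as the default when nothing matches.

```python
from fnmatch import fnmatch

# Hypothetical virtual hosts, listed in configuration order.
VHOSTS = [
    {"server_name": "www.example.com",
     "aliases": ["example.com", "*.example.com"],
     "document_root": "/www/example-com"},
    {"server_name": "www.example.org",
     "aliases": [],
     "document_root": "/www/example-org"},
]


def select_vhost(host_header):
    """Pick the virtual host whose name or alias matches the Host header."""
    hostname = host_header.split(":", 1)[0].lower()  # drop an optional port
    for vhost in VHOSTS:
        names = [vhost["server_name"]] + vhost["aliases"]
        if any(fnmatch(hostname, pattern) for pattern in names):
            return vhost
    return VHOSTS[0]  # no match: the first listed host acts as the default


print(select_vhost("www.example.org")["document_root"])
print(select_vhost("static.example.com:80")["document_root"])
print(select_vhost("unknown.invalid")["document_root"])
```

IP-based virtual hosts need no such lookup, because the address the connection arrived on already identifies the site.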
- Apache functionality can be easily extended by loadable modules
- More than 500 modules for different purposes, some of them developed by the Apache Software Foundation, others by independent open source developers
- Some modules are compiled statically into the Apache core; others can be loaded dynamically, using the LoadModule directive
- Some of the modules:
- SSL provides cryptographically secure transactions between the web server and clients
- In most cases only the server end is authenticated, which gives the client a guarantee that the server is who it claims to be
- Self-signed certificates vs. certificates signed by a CA
- Problem with virtual hosts: without Server Name Indication (SNI), it is impossible to host more than one SSL virtual host on the same address and port.
- mod_ssl must be loaded, as it provides the interface to OpenSSL
- SSL configuration example:
Listen 443
NameVirtualHost *:443

<VirtualHost *:443>
    SSLEngine on
    SSLCertificateFile /etc/pki/tls/certs/localhost.cert
    SSLCertificateKeyFile /etc/pki/tls/private/localhost.key
</VirtualHost>
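Server authentication, as seen from the client side, can be illustrated with Python's standard `ssl` module. This is only a sketch: `make_client_context` is a made-up helper, and passing the path of a self-signed certificate as `ca_file` is one assumed way of trusting a certificate that no CA has signed.

```python
import ssl


def make_client_context(ca_file=None):
    """Build a client-side context that authenticates the server."""
    # With ca_file=None the system CA store is used. Passing the path of
    # a self-signed server certificate trusts that certificate explicitly.
    return ssl.create_default_context(cafile=ca_file)


context = make_client_context()
# The default context requires a valid server certificate and checks
# that the certificate matches the host name being contacted.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

No client certificate is configured here, which matches the common case where only the server end is authenticated.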
- CGI (Common Gateway Interface): a mechanism that allows the web server to delegate request handling to executable files, i.e. CGI scripts
- The programs are typically written in a scripting language
- The mod_cgi module must be loaded for CGI support
- By common convention, CGI scripts should be located in the cgi-bin/ directory or have a .cgi extension
Example of defining the cgi-bin/ directory:
ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"
Configuration that allows execution of CGI scripts ending in .cgi in users' home directories:
<Directory /home/*/public_html>
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>
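A CGI script is an ordinary executable that writes header lines, an empty line and then the body to standard output, while the server passes request details in environment variables. Below is a minimal sketch of such a script. `REQUEST_METHOD` and `QUERY_STRING` are standard CGI variables; the `render_response` helper and the page content are made up for illustration.

```python
#!/usr/bin/env python3
import html
import os


def render_response(environ):
    """Build the full CGI output: header lines, a blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    # Escape the query string before echoing it back into HTML.
    query = html.escape(environ.get("QUERY_STRING", ""))
    body = (
        "<html><body>"
        f"<p>Method: {method}, query string: {query}</p>"
        "</body></html>"
    )
    return "Content-Type: text/html; charset=UTF-8\n\n" + body


if __name__ == "__main__":
    # The web server captures standard output and sends it to the client.
    print(render_response(os.environ))
```

Saved with a .cgi extension and made executable, a script of this kind would be picked up by either of the two configurations above.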
- Decentralized management of web server configuration
- The .htaccess (hypertext access) file; its original purpose was to allow per-directory access control
- Global configuration settings can be overridden by the ones defined in .htaccess
- Usage: