Archived Document Notice: This document has been archived by the LDP because it is severely out-of-date. If you are interested in maintaining this document, contact The Linux Documentation Project.
Many people are trying Linux because they are looking for a really good Internet capable operating system. Also, there are institutes, universities, non-profits, and small businesses which want to set up Internet sites on a small budget. This is where the WWW-HOWTO comes in. This document explains how to set up clients and servers for the largest part of the Internet - The World Wide Web.
All prices in this document are stated in US dollars. This document assumes you are running Linux on an Intel platform. Instructions and product availability may vary from platform to platform. There are many links for downloading software in this document. Whenever possible, use a mirror site for faster downloading and to keep the load down on the main server.
The US government forbids US companies from exporting encryption stronger than 40 bit in strength. Therefore US companies will usually have two versions of software. The import version will usually support 128 bit, and the export only 40 bit. This applies to web browsers and servers supporting secure transactions. Another name for secure transactions is Secure Sockets Layer (SSL). We will refer to it as SSL for the rest of this document.
This document is Copyright (c) 1997 by Wayne Leister. The original author of this document was Peter Dreuw (all versions prior to 0.8).
This HOWTO is free documentation; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This document is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.
You can obtain a copy of the GNU General Public License by writing to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
Trademarks are owned by their respective owners.
Any feedback is welcome. I do not claim to be an expert. Some of this information was taken from badly written web sites; there are bound to be errors and omissions. But make sure you have the latest version before you send corrections; it may be fixed in the next version (see the next section for where to get the latest version). Send feedback to email@example.com.
New versions of this document can be retrieved in text format from Sunsite at http://sunsite.unc.edu/pub/Linux/docs/HOWTO/WWW-HOWTO and almost any Linux mirror site. You can view the latest HTML version on the web at http://sunsite.unc.edu/LDP/HOWTO/WWW-HOWTO.html. There are also HTML versions available on Sunsite in a tar archive.
The following chapter is dedicated to setting up web browsers. Please feel free to contact me if your favorite web browser is not mentioned here. In this version of the document only a few of the browsers have their own section, but I tried to include all of them (all I could find) in the overview section. In the future those browsers that deserve their own section will have it.
The overview section is designed to help you decide which browser to use, and give you basic information on each browser. The detail section is designed to help you install, configure, and maintain the browser.
However I use Lynx when I don't feel like firing up the X-windows/Netscape monster.
Lynx is one of the smallest web browsers. It is the king of text based browsers. It's free and the source code is available under the GNU public license. It's text based, but it has many special features.
Kfm is part of the K Desktop Environment (KDE). KDE is a system that runs on top of X-windows. It gives you many features like drag and drop, sounds, a trashcan and a unified look and feel. Kfm is the K File Manager, but it is also a web browser. Don't be fooled by the name; for a young product it is very usable as a web browser. It already supports frames, tables, ftp downloads, looking into tar files, and more. The current version of Kfm is 1.39, and it's free. Kfm can be used without KDE, but you still need the libraries that come with KDE. For more information about KDE and Kfm visit the KDE website at http://www.kde.org.
Emacs is the one program that does everything. It is a word processor, news reader, mail reader, and web browser. It has a steep learning curve at first, because you have to learn what all the keys do. The X-windows version is easier to use, because most of the functions are on menus. Another drawback is that it's mostly text based. (It can display graphics if you are running it under X-windows). It is also free, and the source code is available under the GNU public license.
Mosaic is an X-windows browser developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois. NCSA spent four years on the project and has now moved on to other things. The latest version is 2.6, which was released on July 7, 1995. Source code is available for non-commercial use. Spyglass Inc. has the commercial rights to Mosaic. It's a solid X-windows browser, but it lacks the new HTML features. For more info visit the NCSA Mosaic home page at http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/. The software can be downloaded from ftp://ftp.ncsa.uiuc.edu/Mosaic/Unix/binaries/2.6/Mosaic-linux-2.6.Z.
Arena was an X-windows concept browser for the W3C (World Wide Web Consortium) when they were testing HTML 3.0. Hence it supports all the HTML 3.0 standards such as style sheets and tables. Development was taken over by Yggdrasil Computing, with the idea of turning it into a full-fledged free X-windows browser. However, development stopped in Feb 1997 with version 0.3.11. Only part of the HTML 3.2 standard has been implemented. The source code is released under the GNU public licence. For more information see the web site at http://www.yggdrasil.com/Products/Arena/. It can be downloaded from ftp://ftp.yggdrasil.com/pub/dist/web/arena/.
Amaya is the X-windows concept browser for the W3C for HTML 3.2. Therefore it supports all the HTML 3.2 standards. It also supports some of the features of HTML 4.0. It supports tables, forms, client side image maps, PUT publishing, and GIF, JPEG, and PNG graphics. It is both a browser and an authoring tool. The latest public release is 1.0 beta. Version 1.1 beta is in internal testing and is due out soon. For more information visit the Amaya web site at http://www.w3.org/Amaya/. It can be downloaded from ftp://ftp.w3.org/pub/Amaya-LINUX-ELF-1.0b.tar.gz.
Red Baron is an X-windows browser made by Red Hat Software. It is bundled with The Official Red Hat Linux distribution. I could not find much information on it, but I know it supports frames, forms and SSL. If you use Red Baron, please help me fill in this section. For more information visit the Red Hat website at http://www.redhat.com
Chimera is a basic X-windows browser. It supports some of the features of HTML 3.2. The latest release is 2.0 alpha 6 released August 27, 1997. For more information visit the Chimera website at http://www.unlv.edu/chimera/. Chimera can be downloaded from ftp://ftp.cs.unlv.edu/pub/chimera-alpha/chimera-2.0a6.tar.gz.
Qweb is yet another basic X-windows browser. It supports tables, forms, and server side image maps. The latest version is 1.3. For more information visit the Qweb website at http://sunsite.auc.dk/qweb/. The source is available from http://sunsite.auc.dk/qweb/qweb-1.3.tar.gz. The binaries are available in a Red Hat RPM from http://sunsite.auc.dk/qweb/qweb-1.3-1.i386.rpm.
Grail is an X-windows browser developed by the Corporation for National Research Initiatives (CNRI). Grail is written entirely in Python, an interpreted object-oriented language. The latest version is 0.3, released on May 7, 1997. It supports forms, bookmarks, history, frames, tables, and many HTML 3.2 things.
There are rumors that Microsoft is going to port the Internet Explorer to various Unix platforms, maybe Linux. If it's true they are taking their time doing it. If you know something more reliable, please drop me an e-mail.
In my humble opinion most of the above software is unusable for serious web browsing. I'm not trying to discredit the authors, I know they worked very hard on these projects. Just think, if all of these people had worked together on one project, maybe we would have a free browser that would rival Netscape and Internet Explorer.
In my opinion, out of all of the browsers, Netscape and Lynx are the best. The runners up would be Kfm, Emacs-W3 and Mosaic.
Lynx is one of the smaller (around 600 K executable) and faster web browsers available. It does not eat up much bandwidth nor system resources as it only deals with text displays. It can display on any console, terminal or xterm. You will not need an X Windows system or additional system memory to run this little browser.
Both the Red Hat and Slackware distributions have Lynx in them. Therefore I will not bore you with the details of compiling and installing Lynx.
The latest version is 2.7.1 and can be retrieved from http://www.slcc.edu/lynx/fote/ or from almost any friendly Linux FTP server like ftp://sunsite.unc.edu under /pub/Linux/apps/www/broswers/ or mirror site.
For more information on Lynx try these locations:
http://www.crl.com/~subir/lynx/lynx_help/lynx_help_main.html (the same pages you get from lynx --help and typing ? in lynx)
Note: The Lynx help pages have recently moved. If you have an older version of Lynx, you will need to change your lynx.cfg (in /usr/lib) to point to the new address (above).
I think the most special feature of Lynx compared with all other web browsers is its capability for batch mode retrieval. One can write a shell script which retrieves a document, file or anything like that via http, FTP, gopher, WAIS, NNTP or file:// URLs and saves it to disk. Furthermore, one can fill in data into HTML forms in batch mode by simply redirecting the standard input and using the -post_data option.
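The batch retrieval described above might be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes the -dump and -source options behave as described in the Lynx help pages (check lynx --help for your version). The LYNX variable is a convenience added here so the command lines can be tried with a stand-in program.

```shell
#!/bin/sh
# Sketch of Lynx batch-mode retrieval. LYNX defaults to the real binary;
# point it at another command to see what would be run.
LYNX=${LYNX:-lynx}

# Save the formatted text of a page to a file.
fetch_text() {     # usage: fetch_text URL OUTFILE
    "$LYNX" -dump "$1" > "$2"
}

# Save the unrendered source (HTML or any other file) to a file.
fetch_source() {   # usage: fetch_source URL OUTFILE
    "$LYNX" -source "$1" > "$2"
}

# Submit form data with POST. Lynx reads the data from standard input;
# a line starting with --- marks the end of the data.
post_form() {      # usage: post_form URL 'name=value&name2=value2'
    printf '%s\n---\n' "$2" | "$LYNX" -post_data "$1"
}
```

For example, fetch_text http://sunsite.unc.edu/LDP/HOWTO/WWW-HOWTO.html howto.txt would store a plain text copy of the HTML version of this document.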
For more special features of Lynx just look at the help files and the man pages. If you use a special feature of Lynx that you would like to see added to this document, let me know.
There are several different flavors of Emacs. The two most popular are GNU Emacs and XEmacs. GNU Emacs is put out by the Free Software Foundation, and is the original Emacs. It is mainly geared toward text based terminals, but it does run in X-Windows. XEmacs (formerly Lucid Emacs) is a version that only runs on X-Windows. It has many special features that are X-Windows related (better menus etc).
Both the Red Hat and Slackware distributions include GNU Emacs.
The most recent GNU Emacs is 19.34. It doesn't seem to have a web site. The FTP site is at ftp://ftp.gnu.ai.mit.edu/pub/gnu/.
The latest version of XEmacs is 20.2. The XEmacs FTP site is at ftp://ftp.xemacs.org/pub/xemacs. For more information about XEmacs see its web page at http://www.xemacs.org.
Both are available from the Linux archives at ftp://sunsite.unc.edu under /pub/Linux/apps/editors/emacs/
If you have GNU Emacs or XEmacs installed, you probably have the W3 browser too.
The Emacs W3 mode is a nearly fully featured web browser written in Emacs Lisp. It mostly deals with text, but it can display graphics too, at least if you run Emacs under the X Window system.
To get XEmacs into W3 mode, go to the apps menu and select browse the web.
I don't use Emacs, so if someone will explain how to get it into the W3 mode I'll add it to this document. Most of this information was from the original author. If any information is incorrect, please let me know. Also let me know if you think anything else should be added about Emacs.
Netscape Navigator is the King of WWW browsers. Netscape Navigator can do almost everything. But on the other hand, it is one of the most memory hungry and resource eating programs I've ever seen.
There are 3 different versions of the program:
Netscape Navigator includes the web browser, netcaster (push client) and a basic mail program.
Netscape Communicator includes the web browser, a web editor, an advanced mail program, a news reader, netcaster (push client), and a group conference utility.
Netscape Communicator Pro includes everything Communicator has plus a group calendar, IBM terminal emulation, and remote administration features (administrators can update thousands of copies of Netscape from their desk).
In addition to the three versions there are two other options you must pick.
The first is full install or base install. The full install includes everything. The base install includes enough to get you started. You can download the additional components as you need them (such as multimedia support and netcaster). These components can be installed by the Netscape smart update utility (after installing go to help->software updates). At this time the full install is not available for Linux.
The second option is import or export. If you are from the US or Canada you have the option of selecting the import version. This gives you the stronger 128 bit encryption for secure transactions (SSL). The export version only has 40 bit encryption, and is the only version allowed outside the US and Canada.
The latest version of the Netscape Navigator/Communicator/Communicator Pro is 4.03. There are two different versions for Linux. One is for the old 1.2 series kernels and one for the new 2.0 kernels. If you don't have a 2.0 kernel I suggest you upgrade; there are many improvements in the new kernel.
Beta versions are also available. If you try a beta version, be aware that they usually expire in a month or so!
The best way to get Netscape software is to go through their web site at http://www.netscape.com/download/. They have menus to guide you through the selection. When it asks for the Linux version, it is referring to the kernel (most people should be using 2.0 by now). If you're not sure which kernel version you have, run 'cat /proc/version'. Going through the web site is the only way to get the import versions.
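If you would rather have a script work out the kernel series than read /proc/version by eye, something like the following sketch may help. The function name is made up for illustration; it simply keeps the first two numbers of the release string reported by uname -r.

```shell
#!/bin/sh
# Print the kernel series (major.minor) for a release string such as
# 2.0.30. With no argument, ask the running kernel via uname -r.
kernel_series() {   # usage: kernel_series [RELEASE]
    release=${1:-$(uname -r)}
    echo "$release" | cut -d. -f1,2
}
```

Running kernel_series with no argument reports the series of the running kernel, which tells you whether to pick the 1.2 or the 2.0 download.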
If you want an export version you can download them directly from the Netscape FTP servers. The FTP servers are also more up to date. For example when I first wrote this the web interface did not have the non-beta 4.03 for Linux yet, but it was on the FTP site. Here are the links to the export Linux 2.0 versions:
Netscape Navigator 4.03 is at ftp://ftp.netscape.com/pub/communicator/4.03/shipping/english/unix/linux20/navigator_standalone/navigator-v403-export.x86-unknown-linux2.0.tar.gz
Netscape Communicator 4.03 for Linux 2.0 (kernel) is at ftp://ftp.netscape.com/pub/communicator/4.03/shipping/english/unix/linux20/base_install/communicator-v403-export.x86-unknown-linux2.0.tar.gz
Communicator Pro 4.03 for Linux was not available at the time I wrote this.
These URLs will change as new versions come out. If these links break you can find them by fishing around at the FTP site ftp://ftp.netscape.com/pub/communicator/.
These servers are heavily loaded at times. It's best to wait for off peak hours or select a mirror site. Be prepared to wait; these archives are large. Navigator is almost 8 megs, and the Communicator base install is 10 megs.
This section explains how to install version 4 of Netscape Navigator, Communicator, and Communicator Pro.
First unpack the archive to a temporary directory. Then run the ns-install script (type ./ns-install). Then make a symbolic link from the /usr/local/netscape/netscape binary to /usr/local/bin/netscape (type ln -s /usr/local/netscape/netscape /usr/local/bin/netscape). Finally set the system wide environment variable that points to /usr/local/netscape so Netscape can find its files. If you are using bash for your shell, edit your /etc/profile and add the lines that set and export that variable.
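The install steps above could be sketched as the script below. Several things in it are assumptions rather than facts from this document: the scratch directory /tmp/netscape-install is made up, ns-install is assumed to sit at the top level of the unpacked archive (if your archive unpacks into a sub-directory, change into that instead), and the variable name MOZILLA_HOME is the name I believe Netscape 4 for Unix looks for, since the name is missing from this copy of the text. By default the script only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Dry-run sketch of the Netscape 4 install steps.
# With DRY_RUN=1 (the default) each command is printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_netscape() {   # usage: install_netscape ARCHIVE.tar.gz
    tmp=/tmp/netscape-install      # hypothetical scratch directory
    run mkdir -p "$tmp" &&
    run tar -xzf "$1" -C "$tmp" &&
    run sh -c "cd $tmp && ./ns-install" &&
    run ln -s /usr/local/netscape/netscape /usr/local/bin/netscape
}

# Lines to append to /etc/profile for bash users.
# MOZILLA_HOME is an assumed variable name, see the note above.
profile_lines() {
    echo 'MOZILLA_HOME="/usr/local/netscape"'
    echo 'export MOZILLA_HOME'
}
```

Calling install_netscape with the name of the archive you downloaded shows the command sequence, and profile_lines prints the two lines for /etc/profile so you can review them before adding them.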
After you have it installed, the software can automatically update itself with smart update. Just run Netscape as root and go to help->software updates. If you only got the base install, you can also install the Netscape components from there.
Note: This will not remove any old versions of Netscape; you must manually remove them by deleting the Netscape binary and Java class file (for version 3).
This section contains information on different http server software packages and additional server side tools like script languages for CGI programs etc. There are several dozen web servers; I only covered those that are fully functional. As some of these are commercial products, I have no way of trying them. Most of the information in the overview section was pieced together from various web sites. If there is any incorrect or missing information please let me know.
For a technical description on the http mechanism, take a look at the RFC documents mentioned in the chapter "For further reading" of this HOWTO.
I prefer to use the Apache server. It has almost all the features you would ever need and it's free! I will admit that this section is heavily biased toward Apache. I decided to concentrate my efforts on the Apache section rather than spread them out over all the web servers. I may cover other web servers in the future.
This was the first web server. It was developed by the European Laboratory for Particle Physics (CERN). CERN httpd is no longer supported. The CERN httpd server is reported to have some ugly bugs, to be quite slow and resource hungry. The latest version is 3.0. For more information visit the CERN httpd home page at http://www.w3.org/Daemon/Status.html. It is available for download at ftp://sunsite.unc.edu/pub/Linux/apps/www/servers/httpd-3.0.term.tpz (no it is not a typo, the extension is actually .tpz on the site; probably should be .tgz)
The NCSA HTTPd server is the father to Apache (the development split into two different servers). Therefore the setup files are very similar. NCSA HTTPd is free and the source code is available. This server is not covered in this document, although reading the Apache section may give you some help. The NCSA server was once popular, but most people are replacing it with Apache. Apache is a drop in replacement for the NCSA server (same configuration files), and it fixes several shortcomings of the NCSA server. NCSA HTTPd accounts for 4.9% (and falling) of all web servers (source September 1997 Netcraft survey). The latest version is 1.5.2a. For more information see the NCSA website at http://hoohoo.ncsa.uiuc.edu.
Apache is the king of all web servers. Apache and its source code are free. Apache is modular, therefore it is easy to add features. Apache is very flexible and has many, many features. Apache makes up 44% of all web domains (50% if you count all the derivatives). There are over 695,000 Apache servers in operation (source November 1997 Netcraft survey).
The official Apache is missing SSL, but there are two derivatives that fill the gap. Stronghold is a commercial product that is based on Apache. It retails for $995; an economy version is available for $495 (based on an old version of Apache). Stronghold is the number two secure server behind Netscape (source C2 net and Netcraft survey). For more information visit the Stronghold website at http://www.c2.net/products/stronghold/. It was developed outside the US, so it is available with 128 bit SSL everywhere.
Apache-SSL is a free implementation of SSL, but it is not for commercial use in the US (RSA has US patents on SSL technology). It can be used for non-commercial use in the US if you link with the free RSAREF library. For more information see the website at http://www.algroup.co.uk/Apache-SSL/.
Fast Track was developed by Netscape, but the Linux version is put out by Caldera. The Caldera site lists it as Fast Track for OpenLinux. I'm not sure if it only runs on Caldera OpenLinux or if any Linux distribution will do (E-mail me if you have the answer). Netscape servers account for 11.5% (and falling) of all web servers (source September 1997 http://www.netcraft.com/survey/). The server sells for $295. It is also included with the Caldera OpenLinux Standard distribution which sells for $399 ($199.50 educational). The web pages tell of a nice administration interface and a quick 10 minute setup. The server has support for 40-bit SSL. To get the full 128-bit SSL you need Netscape Enterprise Server. Unfortunately that is not available for Linux :( The latest version available for Linux is 2.0 (Version 3 is in beta, but it's not available for Linux yet). To buy a copy go to the Caldera web site at http://www.caldera.com/products/netscape/netscape.html. For more information go to the Fast Track page at http://www.netscape.com/comprod/server_central/product/fast_track/
WN has many features that make it attractive. First, it is smaller than the CERN, NCSA HTTPd, and Apache servers. It also has many built-in features that would otherwise require CGIs, for example site searches and enhanced server side includes. It can also decompress/compress files on the fly with its filter feature. It also has the ability to retrieve only part of a file with its ranges feature. It is released under the GNU public license. The current version is 1.18.3. For more information see the WN website at http://hopf.math.nwu.edu/.
AOLserver is made by America Online. I'll admit that I was surprised by the features of a web server coming from AOL. In addition to the standard features it supports database connectivity. Pages can query a database by Structured Query Language (SQL) commands. The database is accessed through Open Database Connectivity (ODBC). It also has a built-in search engine and TCL scripting. If that is not enough you can add your own modules through the C Application Programming Interface (API). I almost forgot to mention support for 40 bit SSL. And you get all this for free! For more information visit the AOLserver site at http://www.aolserver.com/server/
Zeus Server was developed by Zeus Technology. They claim that they are the fastest web server (using WebSpec96 benchmark). The server can be configured and controlled from a web browser! It can limit processor and memory resources for CGI's, and it executes them in a secure environment (whatever that means). It also supports unlimited virtual servers. It sells for $999 for the standard version. If you want the secure server (SSL) the price jumps to $1699. They are based outside the US so 128 bit SSL is available everywhere. For more information visit the Zeus Technology website at http://www.zeus.co.uk. The US website is at http://www.zeus.com. I'll warn you they are cocky about the fastest web server thing. But they don't even show up under top web servers in the Netcraft Surveys.
CL-HTTP stands for Common Lisp Hypermedia Server. If you are a Lisp programmer this server is for you. You can write your CGI scripts in Lisp. It has a web based setup function. It also supports all the standard server features. CL-HTTP is free and the source code is available. For more information visit the CL-HTTP website at http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html (could they make that url any longer?).
If you have a commercial purpose (company web site, or ISP), I would strongly recommend that you use Apache. If you are looking for easy setup at the expense of advanced features then the Zeus Server wins hands down. I've also heard that the Netscape Server is easy to set up. If you have an internal use you can be a bit more flexible. But unless one of them has a feature that you just have to use, I would still recommend using one of the three above.
This is only a partial listing of all the servers available. For a more complete list visit Netcraft at http://www.netcraft.com/survey/servers.html or Web Compare at http://webcompare.internet.com.
The current version of Apache is 1.2.4. Version 1.3 is in beta testing. The main Apache site is at http://www.apache.org/. Another good source of information is Apacheweek at http://www.apacheweek.com/. The Apache documentation is ok, so I'm not going to go into detail in setting up Apache. The documentation is on the website and is included with the source (in HTML format). There are also text files included with the source, but the HTML version is better. The documentation should get a whole lot better once the Apache Documentation Project gets under way. Right now most of the documents are written by the developers. Not to discredit the developers, but the documents are a little hard to understand if you don't know the terminology.
Apache is included in the Red Hat, Slackware, and OpenLinux distributions. Although they may not be the latest version, they are very reliable binaries. The bad news is you will have to live with their directory choices (which are totally different from each other and the Apache defaults).
The source is available from the Apache web site at http://www.apache.org/dist/. Binaries are also available from the Apache site at the same place. You can also get binaries from Sunsite at ftp://sunsite.unc.edu/pub/Linux/apps/www/servers/. And for those of us running Red Hat the latest binary RPM file can usually be found in the contrib directory at ftp://ftp.redhat.com/pub/contrib/i386/
If your server is going to be used for commercial purposes, it is highly recommended that you get the source from the Apache website and compile it yourself. The other option is to use a binary that comes with a major distribution, for example the Slackware, Red Hat, or OpenLinux distributions. The main reason for this is security. An unknown binary could have a back door for hackers, or an unstable patch that could crash your system. Compiling it yourself also gives you more control over what modules are compiled in, and allows you to set the default directories. It's not that difficult to compile Apache, and besides you're not a real Linux user until you compile your own programs ;)
First untar the archive to a temporary directory. Next change to the src directory. Then edit the Configuration file if you want to include any special modules. The most commonly used modules are already included. There is no need to change the rules or makefile stuff for Linux. Next run the Configure shell script (./Configure). Make sure it says Linux platform and gcc as the compiler. Next you may want to edit the httpd.h file to change the default directories. The server home (where the config files are kept) default is /usr/local/etc/httpd/, but you may want to change it to just /etc/httpd/. The server root (where the HTML pages are served from) default is /usr/local/etc/httpd/htdocs/, but I like the Red Hat default for Apache. If you are going to be using su-exec (see special features below) you may want to change that directory too. The server root can also be changed from the config files, but it is good to compile it in as well, just in case Apache can't find or read the config file. Everything else should be changed from the config files.
Finally run make to compile Apache.
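Put together, the build steps above might look like the following sketch. The function name and the use of /tmp as the temporary directory are only for illustration, and the archive and directory names you pass in depend on the version you downloaded. The two editing steps are left as comments because they are manual. The commands are only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run sketch of compiling Apache from source.
# With DRY_RUN=1 (the default) each command is printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: build_apache ARCHIVE.tar.gz SOURCE_DIR
# The body runs in a subshell so the cd does not affect the caller.
# Manual steps: edit src/Configuration before Configure if you want
# extra modules, and edit src/httpd.h before make if you want to change
# the default directories.
build_apache() (
    run tar -xzf "$1" -C /tmp &&
    run cd "/tmp/$2/src" &&
    run ./Configure &&
    run make
)
```

For example, build_apache apache_1.2.4.tar.gz apache_1.2.4 prints the four commands in order, assuming the archive carries that name and unpacks into a directory of the same name.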
If you run into problems with include files missing, check the following things. Make sure you have the kernel headers (include files) installed for your kernel version. Also make sure you have these symbolic links in place:
/usr/include/linux should be a link to /usr/src/linux/include/linux
/usr/include/asm should be a link to /usr/src/linux/include/asm
/usr/src/linux should be a link to the Linux source directory (ex. linux-2.0.30)
Links can be made with ln -s; it works just like the cp command except that it makes a link (ln -s source-dir destination-link).
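A quick way to check the three links is a small function like the one below. It is only a sketch: the function name is made up, and the optional argument is a prefix added here so the check can be tried against a scratch directory instead of the real root. The comment at the end shows the ln -s commands for creating the links, using the example source directory from above.

```shell
#!/bin/sh
# Report whether each expected symbolic link exists.
# usage: check_header_links [PREFIX]   (PREFIX is normally left empty)
check_header_links() {
    root=${1:-}
    missing=0
    for link in /usr/include/linux /usr/include/asm /usr/src/linux; do
        if [ -L "$root$link" ]; then
            echo "ok:      $link"
        else
            echo "missing: $link"
            missing=1
        fi
    done
    return $missing
}

# To create missing links (as root), for example:
#   ln -s /usr/src/linux-2.0.30        /usr/src/linux
#   ln -s /usr/src/linux/include/linux /usr/include/linux
#   ln -s /usr/src/linux/include/asm   /usr/include/asm
```

The function returns a non-zero status if any link is missing, so it can be used in a script before starting the build.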
When make is finished there should be an executable named httpd in the directory where you ran make. This needs to be moved into a bin directory; /usr/local/sbin would be a good choice.
Copy the conf, logs, and icons sub-directories from the source to the server home directory. Next rename the 3 files in the conf sub-directory that end in -dist to get rid of the -dist extension.
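The copying and renaming described above can be sketched as two small functions. The function names are made up, the directories are passed in as arguments rather than fixed, and the sketch assumes httpd was built in the src sub-directory of the source tree. It copies the binary rather than moving it, so the build tree stays intact.

```shell
#!/bin/sh
# Strip the -dist suffix from every file in a conf directory.
rename_dist_files() {   # usage: rename_dist_files CONF_DIR
    for f in "$1"/*-dist; do
        [ -e "$f" ] || continue    # nothing matched, the glob stays literal
        mv "$f" "${f%-dist}"
    done
}

# Copy the binary and the conf, logs, and icons sub-directories into place.
# usage: install_apache_files SOURCE_TREE SERVER_HOME BIN_DIR
install_apache_files() {
    mkdir -p "$2" "$3" &&
    cp "$1/src/httpd" "$3/" &&
    cp -R "$1/conf" "$1/logs" "$1/icons" "$2/" &&
    rename_dist_files "$2/conf"
}
```

For example, install_apache_files /tmp/apache_1.2.4 /etc/httpd /usr/local/sbin would install into the directories suggested in this section (the source tree name is an assumption). Note that the rename overwrites an existing file of the same name, so back up any configuration you already have.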
There are also several support programs that are included with Apache. They are in the support directory and must be compiled and installed separately. Most of them can be made by using the makefile in that directory (which is made when you run the main Configure script). You don't need any of them to run Apache, but some of them make the administrator's job easier.
Now you should have four files in your conf sub-directory (under your server home directory). The httpd.conf sets up the server daemon (port number, user, etc). The srm.conf sets the root document tree, special handlers, etc. The access.conf sets the base case for access. Finally mime.types tells the server what mime type to send to the browser for each file extension.
The configuration files are pretty much self-documented (plenty of comments), as long as you understand the lingo. You should read through them thoroughly before putting your server to work. Each configuration item is covered in the Apache documentation.
The mime.types file is not really a configuration file. It is used by the server to translate file extensions into mime-types to send to the browser. Most of the common mime-types are already in the file. Most people should not need to edit this file. As time goes on, more mime types will be added to support new programs. The best thing to do is get a new mime-types file (and maybe a new version of the server) at that time.
Always remember that when you change the configuration files you need to restart Apache or send it the SIGHUP signal with kill for the changes to take effect. Make sure you send the signal to the parent process and not any of the child processes. The parent usually has the lowest process id number. The process id of the parent is also in the httpd.pid file in the log directory. If you accidentally send it to one of the child processes the child will die and the parent will restart it.
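Reading the parent's process id from httpd.pid avoids signalling a child by mistake. A minimal sketch, with a made-up function name, could look like this:

```shell
#!/bin/sh
# Send SIGHUP to the Apache parent process named in a pid file.
reload_apache() {   # usage: reload_apache /path/to/logs/httpd.pid
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 1
    fi
    kill -HUP "$(cat "$pidfile")"
}
```

With the default server home that would be reload_apache /usr/local/etc/httpd/logs/httpd.pid, assuming the log directory sits under the server home; adjust the path to wherever your logs actually live.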
I will not be walking you through the steps of configuring Apache. Instead I will deal with specific issues, choices to be made, and special features.
I highly recommend that all users read through the security tips in the Apache documentation. They are also available from the Apache website at http://www.apache.org/docs/misc/security_tips.html.
Virtual Hosting is when one computer has more than one domain name. The old way was to have each virtual host have its own IP address. The new way uses only one IP address, but it doesn't work correctly with browsers that don't support HTTP 1.1.
My recommendation for businesses is to go with the IP based virtual hosting until most people have browsers that support HTTP 1.1 (give it a year or two). This also gives you a more complete illusion of virtual hosting. While both methods can give you virtual mail capabilities (can someone confirm this?), only IP based virtual hosting can also give you virtual FTP as well.
If it is for a club or personal page, you may want to consider shared IP virtual hosting. It should be cheaper than IP based hosting and you will be saving precious IP addresses.
You can also mix and match IP and shared IP virtual hosts on the same server. For more information on virtual hosting visit Apacheweek at http://www.apacheweek.com/features/vhost.
In this method each virtual host has its own IP address. By determining the IP address that the request was sent to, Apache and other programs can tell what domain to serve. This is an incredible waste of IP space. Take for example the servers where my virtual domain is kept. They have over 35,000 virtual accounts, that means 35,000 IP addresses. Yet I believe at last count they had less than 50 servers running.
Setting this up is a two part process. The first is getting Linux set up to accept more than one IP address. The second is setting up Apache to serve the virtual hosts.
The first step in setting up Linux to accept multiple IP addresses is to make a new kernel. This works best with a 2.0 series kernel (or higher). You need to include IP networking and IP aliasing support. If you need help with compiling the kernel see the kernel howto.
Next you need to setup each interface at boot. If you are using the Red Hat Distribution then this can be done from the control panel. Start X-windows as root, you should see a control panel. Then double click on network configuration. Next goto the interfaces panel and select your network card. Then click alias at the bottom of the screen. Fill in the information and click done. This will need to be done for each virtual host/IP address.
If you are using another distribution you may have to do it manually. You can just put the commands in the rc.local file in /etc/rc.d (really they should go in with the networking stuff). You need to have a route command for each device. The aliased addresses are given a sub device of the main one. For example eth0 would have aliases eth0:0, eth0:1, eth0:2, etc. Here is an example of configuring an aliased device:
ifconfig eth0:0 192.168.1.57
route add -host 192.168.1.57 dev eth0:0
You can also add a broadcast address and a netmask to the ifconfig command. If you have a lot of aliases you may want to make a for loop to make it easier. For more information see the IP alias mini howto.
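The for loop mentioned above might look something like the following sketch. It only prints the commands, so you can check them before running anything. The function name alias_commands is made up for this example, and the addresses after 192.168.1.57 are placeholders.

```shell
#!/bin/sh
# Print ifconfig/route commands for aliases of one device (dry run).
# Usage: alias_commands DEVICE ADDRESS...
# alias_commands is a hypothetical helper, not a standard tool.
alias_commands() {
    dev="$1"
    shift
    n=0
    for ip in "$@"; do
        echo "ifconfig $dev:$n $ip"
        echo "route add -host $ip dev $dev:$n"
        n=$((n + 1))
    done
}

# Example addresses; replace them with your own.
alias_commands eth0 192.168.1.57 192.168.1.58 192.168.1.59
```

Once the output looks right, it could be piped to sh as root (for example from rc.local) to actually configure the aliases.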
Then you need to set up your domain name server (DNS) to serve these new domains. And if you don't already own the domain names, you need to contact the Internic to register them. See the DNS-howto for information on setting up your DNS.
Finally you need to set up Apache to serve the virtual domain correctly. This is in the httpd.conf configuration file near the end. They give you an example to go by. All commands specific to that virtual host are put in virtualhost directive tags. You can put almost any command in there. Usually you set up a different document root, script directory, and log files. You can have an almost unlimited number of virtual hosts by adding virtualhost directive tags.
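As a rough illustration, an IP based virtual host entry in httpd.conf could look like the sketch below. The address is the one from the alias example; the domain name, directories, and log file names are placeholders, so adjust them to your own layout.

```apache
# Hypothetical IP based virtual host. Only the IP address comes from
# the alias example; the name and paths are made up.
<VirtualHost 192.168.1.57>
ServerName www.mysite.com
DocumentRoot /home/httpd/mysite/html
ScriptAlias /cgi-bin/ /home/httpd/mysite/cgi-bin/
ErrorLog logs/mysite-error_log
TransferLog logs/mysite-access_log
</VirtualHost>
```

This covers the usual three things: a separate document root, script directory, and log files for the virtual host.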
In rare cases you may need to run separate servers if a directive is needed for a virtual host, but is not allowed in the virtual host tags. This is done using the bindaddress directive. Each server will have a different name and setup files. Each server only responds to one IP address, specified by the bindaddress directive. This is an incredible waste of system resources.
Shared IP virtual hosting is a newer way to do virtual hosting. It uses a single IP address, thus conserving IP addresses for real machines (not virtual ones). In the same example used above those 35,000 virtual hosts would only take 50 IP addresses (one for each machine). This is done by using the new HTTP 1.1 protocol. The browser tells the server which site it wants when it sends the request. The problem is that browsers that don't support HTTP 1.1 will get the server's main page, which could be set up to provide a menu of the virtual hosts available. That ruins the whole illusion of virtual hosting: the illusion that you have your own server.
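The way an HTTP 1.1 browser tells the server which site it wants is a Host line in the request. The sketch below just prints such a request so you can see what it looks like. The helper name http11_request is made up, and www.mysite.com is only an example name.

```shell
#!/bin/sh
# Print the request an HTTP 1.1 browser would send for a page on a host.
# Usage: http11_request HOST PATH
# http11_request is a hypothetical helper name.
http11_request() {
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1"
}

http11_request www.mysite.com /
```

An older browser may leave out the Host line, so all the server knows is which IP address the request arrived on.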
The setup is much simpler than for IP based virtual hosting. You still need to get your domain from the Internic and set up your DNS. This time the DNS points to the same IP address as the original domain. Then Apache is set up the same as before. Since you are using the same IP address in the virtualhost tags, it knows you want shared IP virtual hosting.
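A minimal sketch of two shared IP virtual hosts, assuming the server's own address is 192.168.1.50 (a made-up address) and using placeholder names and directories:

```apache
# Both hosts use the server's single IP address (hypothetical here).
<VirtualHost 192.168.1.50>
ServerName www.mysite.com
DocumentRoot /home/httpd/mysite/html
</VirtualHost>

<VirtualHost 192.168.1.50>
ServerName www.othersite.com
DocumentRoot /home/httpd/othersite/html
</VirtualHost>
```

Later Apache releases changed how shared IP hosts are declared, so check the documentation for the version you are running.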
There are several workarounds for older browsers. I'll explain the best one. First you need to make your main pages a virtual host (either IP based or shared IP). This frees up the main page for a link list to all your virtual hosts. Next you need to make a back door for the old browsers to get in. This is done using the ServerPath directive for each virtual host inside the virtualhost directive tags. For example by adding ServerPath /mysite/ to www.mysite.com old browsers would be able to access the site by www.mysite.com/mysite/. Then you put the default page on the main server that politely tells them to get a new browser, and lists links to all the back doors of all the sites you host on that machine. When an old browser accesses the site they will be sent to the main page, and get a link to the correct page. New browsers will never see the main page and will go directly to the virtual hosts. You must remember to keep all of your links relative within the web sites, because the pages will be accessed from two different URLs (www.mysite.com and www.mysite.com/mysite/ in this example).
I hope I didn't lose you there, but it's not an easy workaround. Maybe you should consider IP based hosting after all. A very similar workaround is also explained on the Apache website at http://www.apache.org/manual/host.html.
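To make the back door a little more concrete, here is a sketch of the www.mysite.com example with a ServerPath line added. The IP address and document root are placeholders.

```apache
# Old browsers can reach this site as /mysite/ on the main server.
# The address and DocumentRoot are hypothetical.
<VirtualHost 192.168.1.50>
ServerName www.mysite.com
ServerPath /mysite/
DocumentRoot /home/httpd/mysite/html
</VirtualHost>
```

The main server's default page would then carry a plain link to /mysite/ for browsers that never sent the site name.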
If anyone has a great resource for Shared IP hosting, I would like to know about it. It would be nice to know what percent of browsers out there support HTTP 1.1, and to have a list of which browsers and versions support HTTP 1.1.
There are two different ways to give your users CGI script capability. The first is to make everything ending in .cgi a CGI script. The second is to make script directories (usually named cgi-bin). You could also use both methods. For either method to work the scripts must be world executable (chmod 711). Interpreted scripts such as shell or Perl scripts must also be world readable, because the interpreter has to read them, so use chmod 755 for those. By giving your users script access you are creating a big security risk. Be sure to do your homework to minimize the security risk.
I prefer the first method, especially for complex scripting. It allows you to put scripts in any directory. I like to put my scripts with the web pages they work with. For sites with a lot of scripts it looks much better than having a directory full of scripts. This is simple to set up. First uncomment the .cgi handler at the end of the srm.conf file. Then make sure all your directories have Options ExecCGI or Options All set in the access.conf file.
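The two pieces for the .cgi method might look roughly like this. The directory is only an example; use whatever tree your pages live in.

```apache
# srm.conf: treat every file ending in .cgi as a CGI script
AddHandler cgi-script .cgi

# access.conf: allow CGI execution in the tree that holds the pages
# (/home/httpd/html is an example path)
<Directory /home/httpd/html>
Options ExecCGI
</Directory>
```

If the directory already has an Options line, ExecCGI would be added to it rather than replacing the options that are already there.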
Making script directories is considered more secure.
To make a script directory you use the ScriptAlias directive in the srm.conf file. The first argument is the alias and the second is the actual directory. For example ScriptAlias /cgi-bin/ /usr/httpd/cgi-bin/ would make /usr/httpd/cgi-bin able to execute scripts. That directory would be used whenever someone asked for the directory /cgi-bin/. For security reasons you should also change the properties of the directory to Options None, AllowOverride None in access.conf (just uncomment the example that is there). Also do not make your script directories subdirectories of your web page directories. For example if you are serving pages from /home/httpd/html/, don't make the script directory /home/httpd/html/cgi-bin; instead put it outside the /home/httpd/html/ tree.
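Put together, the script directory example from the text would come out something like this sketch:

```apache
# srm.conf: requests for /cgi-bin/ run scripts from /usr/httpd/cgi-bin/
ScriptAlias /cgi-bin/ /usr/httpd/cgi-bin/

# access.conf: lock the script directory down
<Directory /usr/httpd/cgi-bin>
AllowOverride None
Options None
</Directory>
```

Note that /usr/httpd/cgi-bin sits outside the tree the web pages are served from, as recommended above.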
If you want your users to have their own script directories you can use ScriptAlias commands. Virtual hosts should have their ScriptAlias command inside the virtualhost directive tags. Does anyone know a simple way to allow all users to have a cgi-bin directory without individual ScriptAlias commands?
There are two different ways to handle user web directories. The first is to have a subdirectory under the user's home directory (usually public_html). The second is to have an entirely different directory tree for web directories. With both methods make sure to set the access options for these directories.
The first method is already set up in Apache by default. Whenever a request for /~bob/ comes in it looks for the public_html directory in bob's home directory. You can change the directory with the UserDir directive in the srm.conf file. This directory must be world readable and executable. This method creates a security risk because for Apache to access the directory the user's home directory must be world executable.
The second method is easy to set up. You just need to change the UserDir directive in the srm.conf file. It has many different formats; you may want to consult the Apache documentation for clarification. If you want each user to have their own directory under /home/httpd/, you would use UserDir /home/httpd. Then when the request /~bob/ comes in it would translate to /home/httpd/bob/. Or if you want to have a subdirectory under bob's directory you would use UserDir /home/httpd/*/html. This would translate to /home/httpd/bob/html/ and would allow you to have a script directory too.
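The three UserDir forms discussed here are summarized in the sketch below. Only one should be active at a time; the comments show where a request for /~bob/ would end up according to the descriptions above.

```apache
# srm.conf: pick one form.

# A subdirectory of the user's home directory (the default):
UserDir public_html
#   /~bob/  ->  public_html in bob's home directory

# A separate tree with one directory per user:
#UserDir /home/httpd
#   /~bob/  ->  /home/httpd/bob/

# A separate tree with an html subdirectory per user:
#UserDir /home/httpd/*/html
#   /~bob/  ->  /home/httpd/bob/html/
```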
There are two ways that Apache can be run. One is as a daemon that is always running (Apache calls this standalone). The second is from the inetd super-server.
Daemon mode is far superior to inetd mode. Apache is set up for daemon mode by default. The only reason to use inetd mode is for very low use applications, such as internal testing of scripts, a small company intranet, etc. Inetd mode will save memory because Apache will be loaded as needed. Only the inetd daemon will remain in memory.
If you don't use Apache that often you may just want to keep it in daemon mode and start it when you need it. Then you can kill it when you are done (be sure to kill the parent and not one of the child processes).
To set up inetd mode you need to edit a few files. First look in /etc/services to see if http is already in there. If it's not then add it (http is TCP port 80). Right after 79 (finger) would be a good place. Then you need to edit the /etc/inetd.conf file and add the line for Apache:
http stream tcp nowait root /usr/sbin/httpd httpd
Be sure to change the path if you have Apache in a different location. And the second httpd is not a typo; the inet daemon requires that. If you are not currently using the inet daemon, you may want to comment out the rest of the lines in the file so you don't activate other services as well (FTP, finger, telnet, and many other things are usually run from this daemon).
If you are already running the inet daemon (inetd), then you only need to send it the SIGHUP signal (via kill; see kill's man page for more info) or reboot the computer for changes to take effect. If you are not running inetd then you can start it manually. You should also add it to your init files so it is loaded at boot (the rc.local file may be a good place).
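Here is a small sketch of the check for the http entry. The helper name has_http_service is made up, and the test is deliberately simple: some systems list port 80 under another name with http as an alias, which this would miss.

```shell
#!/bin/sh
# Succeed if the given services file already lists http on TCP port 80.
# has_http_service is a hypothetical helper name.
has_http_service() {
    grep -Eq '^http[[:space:]]+80/tcp' "$1" 2>/dev/null
}

if has_http_service /etc/services; then
    echo "http is already listed in /etc/services"
else
    echo "add a line like this after finger: http 80/tcp"
fi
```

After editing inetd.conf, running kill -HUP with the process id of inetd (ps will show it) should make it reread its configuration.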
The newer web publishing tools support a new method of uploading web pages by HTTP (instead of FTP). Some of these products don't even support FTP anymore! Apache does support this, but it is lacking a script to handle the requests. This script could be a big security hole, so be sure you know what you are doing before attempting to write or install one.
If anyone knows of a script that works let me know and I'll include the address to it here.
For more information go to Apacheweek's article at http://www.apacheweek.com/features/put.
User authentication and access control is one of my favorite features. It allows you to password protect a directory or a file without using CGI scripts. It also allows you to deny or grant access based on the IP address or domain name of the client. That is a great feature for keeping jerks out of your message boards and guest books (you get the IP or domain name from the log files).
To allow user authentication the directory must have AllowOverride AuthConfig set in the access.conf file. To allow access control (by domain or IP address) AllowOverride Limit must be set for that directory. Setting up the directory involves putting an .htaccess file in the directory. For user authentication it is usually used with an .htpasswd file and optionally an .htgroup file. Those files can be shared among .htaccess files if you wish.
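For a rough idea of what goes in an .htaccess file, here is a sketch that asks for a password and also shuts out one address. The realm name, the password file location, and the address are all made up for illustration; see the references below for the real details.

```apache
# Password protection (needs AllowOverride AuthConfig)
AuthType Basic
AuthName Members
AuthUserFile /home/bob/.htpasswd
require valid-user

# Access control by address (needs AllowOverride Limit)
order allow,deny
allow from all
deny from 192.168.1.99
```

Keeping the password file outside the directories being served is one more way to stop people from fetching it.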
For security reasons I recommend that everyone use these directives in their access.conf file:
<files ~ "/\.ht">
deny from all
</files>
If you are not the administrator of the system you can also put it in your .htaccess file if AllowOverride Limit is set for your directory. This directive will prevent people from looking into your access control files (.htaccess, .htpasswd, etc).
There are many different options and file types that can be used with access control. Therefore it is beyond the scope of this document to describe the files. For information on how to set up user authentication see the Apacheweek feature at http://www.apacheweek.com/features/userauth or the NCSA pages at http://hoohoo.ncsa.uiuc.edu/docs-1.5/tutorials/user.html.
The su-exec feature runs CGI scripts as the user who owns them. Normally they are run as the user of the web server (usually nobody). This allows users to access their own files in CGI scripts without making them world writable (a security hole). But if you are not careful you can create a bigger security hole by using the su-exec code. The su-exec code does security checks before executing the scripts, but if you set it up wrong you will have a security hole.
The su-exec code is not for amateurs. Don't use it if you don't know what you are doing. You could end up with a gaping security hole where your users can gain root access to your system. Do not modify the code for any reason. Be sure to read all the documentation carefully. The su-exec code is hard to set up on purpose, to keep the amateurs out (everything must be done manually; there is no makefile and no install script).
The su-exec code resides in the support directory of the source. First you need to edit the suexec.h file for your system. Then you need to compile the su-exec code with this command:
gcc suexec.c -o suexec
Then copy the suexec executable to the proper directory. The Apache default is /usr/local/etc/httpd/sbin/. This can be changed by editing httpd.h in the Apache source and recompiling Apache. Apache will only look in this directory; it will not search the path. Next the file needs to be changed to user root (chown root suexec) and the suid bit needs to be set (chmod 4711 suexec). Finally restart Apache; it should display a message on the console that su-exec is being used.
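Since a wrong owner or mode is an easy mistake to make here, a quick check like the sketch below may help. It assumes GNU stat (the -c option) and the default install location mentioned above; the helper names are made up.

```shell
#!/bin/sh
# Print the owner and octal mode of a file, for example "root 4711".
# Assumes GNU stat (the -c option), as found on most Linux systems.
owner_and_mode() {
    stat -c '%U %a' "$1"
}

# Succeed only if the file is owned by root with mode 4711.
# suexec_perms_ok is a hypothetical helper name.
suexec_perms_ok() {
    [ "$(owner_and_mode "$1" 2>/dev/null)" = "root 4711" ]
}

# Default location named in the text; adjust if you changed httpd.h.
wrapper=/usr/local/etc/httpd/sbin/suexec
if suexec_perms_ok "$wrapper"; then
    echo "$wrapper looks correct"
else
    echo "$wrapper is missing or has the wrong owner or mode"
fi
```

This only checks the owner and permission bits. It is no substitute for reading the su-exec documentation.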
CGI scripts should be set world executable like normal. They will automatically be run as the owner of the CGI script. If you set the SUID (set user id) bit on the CGI scripts they will not run. If the directory or file is world or group writable the script will not run. Scripts owned by system users (root, bin, etc.) will not be run. For other security conditions that must be met see the su-exec documentation. If you are having problems see the su-exec log file.
Su-exec does not work if you are running Apache from inetd; it only works in daemon mode. It will be fixed in the next version because there will be no inetd mode. If you like playing around in source code, you can edit http_main.c. You want to get rid of the line where Apache announces that it is using the su-exec wrapper (it wrongly prints this in front of the output of everything).
Be sure to read the Apache documentation on su-exec. It is included with the source and is available on the Apache web site at http://www.apache.org/docs/suexec.html
Apache has the ability to handle server side imagemaps. Imagemaps are images on web pages that take users to different locations depending on where they click. To enable imagemaps first make sure the imagemap module is installed (it's one of the default modules). Next you need to uncomment the .map handler at the end of the srm.conf file. Now all files ending in .map will be imagemap files. Imagemap files map different areas on the image to separate links. Apache uses map files in the standard NCSA format. Here is an example of using a map file in a web page:
<a href="mapfile.map"><img src="picture.gif" ISMAP></a>
In this example mapfile.map is the mapfile, and picture.gif is the image to click on.
There are many programs that can generate NCSA compatible map files or you can create them yourself. For a more detailed discussion of imagemaps and map files see the Apacheweek feature at http://www.apacheweek.com/features/imagemaps.
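If you do create one yourself, a very small map file might look like the sketch below. The page names and coordinates are invented for a picture 200 pixels wide and 100 pixels high; each line gives a shape, the link to follow, and the corner coordinates.

```text
default /index.html
rect /left.html 0,0 99,99
rect /right.html 100,0 199,99
```

Clicks on the left half go to one page, clicks on the right half to another, and anything else falls through to the default.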
Server Side Includes (SSI) add dynamic content to otherwise static web pages. The includes are embedded in the web page as comments. The web server then parses these includes and passes the results to the web browser. SSI can add headers and footers to documents, add the date the document was last updated, or execute a system command or a CGI script. With the new eXtended Server Side Includes (XSSI) you can do a whole lot more. XSSI adds variables and flow control statements (if, else, etc). It's almost like having a programming language to work with.
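To give a feel for it, here is a sketch of a page fragment that pulls in a shared header and prints the date the file was last changed. The file name /header.html is a placeholder.

```html
<!--#include virtual="/header.html" -->
<p>Last updated: <!--#echo var="LAST_MODIFIED" --></p>
```

A browser never sees these comments. The server replaces them with the included file and the date before sending the page.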
Parsing all HTML files for SSI commands would waste a lot of system resources. Therefore you need to distinguish normal HTML files from those that contain SSI commands. This is usually done by changing the extension of the SSI enhanced HTML files. Usually the .shtml extension is used.
To enable SSI/XSSI first make sure that the includes module is installed. Then edit srm.conf and uncomment the AddHandler directives for .shtml files. Finally you must set Options Includes for all directories where you want to run SSI/XSSI files. This is done in the access.conf file. Now all files with the extension .shtml will be parsed for SSI/XSSI commands.
Another way of enabling includes is to use the XBitHack directive. If you turn this on it looks to see if the file is executable by user. If it is, and Options Includes is on for that directory, then it is treated as an SSI file. This only works for files with the mime type text/html (.html .htm files). This is not the preferred method.
There is a security risk in allowing SSI to execute system commands and CGI scripts. Therefore it is possible to lock that feature out with Options IncludesNOEXEC instead of Options Includes in the access.conf file. All the other SSI commands will still work.
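A sketch of the pieces involved, using the locked-down IncludesNOEXEC form. The directory is an example, and the exact lines in your srm.conf may differ a little from these.

```apache
# srm.conf: parse files ending in .shtml for SSI commands
AddType text/html .shtml
AddHandler server-parsed .shtml

# access.conf: allow includes, but not exec, in the page tree
# (/home/httpd/html is an example path)
<Directory /home/httpd/html>
Options IncludesNOEXEC
</Directory>
```

Changing IncludesNOEXEC to Includes would also allow pages to run system commands and CGI scripts.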
For more information see the Apache mod_include documentation that comes with the source. It is also available on the website at http://www.apache.org/docs/mod/mod_include.html.
For a more detailed discussion of SSI/XSSI implementation see the Apacheweek feature at http://www.apacheweek.com/features/ssi.
For more information on SSI commands see the NCSA documentation at http://hoohoo.ncsa.uiuc.edu/docs/tutorials/includes.html.
For more information on XSSI commands go to ftp://pageplus.com/pub/hsf/xssi/xssi-1.1.html.
Apache can be extended to support almost anything with modules. There are a lot of modules already in existence. Only the general interest modules are included with Apache. For links to existing modules go to the Apache Module Registry at http://www.zyzzyva.com/module_registry/.
For module programming information go to http://www.zyzzyva.com/module_registry/reference/
Sorry this section has not been written yet.
Coming soon: mSQL, PHP/FI, cgiwrap, Fast-cgi, MS FrontPage extensions, and more.
There aren't any frequently asked questions - yet...
In my humble opinion O'Reilly & Associates make the best technical books on the planet. They focus mainly on Internet, Unix and programming related topics. They start off slow with plenty of examples and when you finish the book you're an expert. I think you could get by if you only read half of the book. They also add some humor to otherwise boring subjects.
And remember if it doesn't say O'Reilly & Associates on the cover, someone else probably wrote it.