W3 SERVER SOFTWARE
                                   
 A W3 server, like the ftp daemon, is a program which responds to an
 incoming TCP connection and provides a service to the caller.  There
 are many varieties of W3 server software to serve different forms of
 data.
 
Basic W3 servers

  CERN server             The basic W3 daemon program serves files
                         already in hypertext or plain text.  This
                         daemon is then used as a basis for many other
                         types of servers and gateways.
                         
  NCSA server             A server for files, written in C, public
                         domain.  Runs on top of a gopher-style
                         database just like "gopherd".
                         
  Perl server             From Marc VanHeyningen at Indiana
                         University.  Written in perl.
                         
  Plexus                  Tony Sanders' enhanced version of Marc VH's.
                         
  MacHTTPD                Server for the Macintosh
                         
  REXX for VM             A server consisting of a small C program
                         which passes control to a server written in
                         REXX.
                         
 Whatever server you are running, you will probably be interested in:
 
      Tools for information providers
      
      Style Guide for Online Hypertext
      
Making a new server

 The basic daemon is often used as a basis for a more specific server
 for a given application.  A server which allows a world of data to be
 seen as part of the W3 universe is known as a gateway.  (Most servers
 could therefore be regarded as gateways, but the term implies some
 conversion or mapping between dissimilar worlds.)  For short
 tutorials with examples, see:
 
      Writing a server in C
      
      Writing a server as a script
      
 It is a good idea to pick the basic daemon or one of the servers in
 the list as a starting point when making a new server.
 
Other servers and Gateways




T. Berners-Lee                                                       1

                                 WWW Server Guide)        14 July 1993

 These are servers which provide data extracted from other systems.
 They are built using code from the basic daemon, or scripts.  See
 
      List of Gateways available.
      
                                                                Tim BL
                                                                      
About documents generated from hypertext

 Paper manuals generated from hypertext are made for convenience, for
 example for reading when one has no computer to turn to.  We have
 tried to make the hypertext into fairly conventional paper documents,
 but they may seem a little strange in some ways.
 All the links have been removed. Therefore, it is worth looking at
 the table of contents to see what there is in the manual.  Something
 which is not explained in place may be explained in detail elsewhere.
 We have tried to keep related matter together, but sometimes you
 will need to check the table of contents to find it.
 Please remember that these are for the most part "living documents".
 That is, they are constantly changing to reflect current knowledge.
 If you see a statement such as "Product xxx does not support this
 feature", remember that it was the case when the document was
 generated, and may not be the same now.   So if in doubt, check the
 online version. Of course, the living document may be out of date
 too, in which case it is helpful to mail its author.
 
                                                                Tim BL
                                                                      
                        WWW SERVER USER GUIDE
                                   
 The basic WWW server allows files and directories in a file system to
 be served to the world as menu trees, multimedia, and/or hypertext.
 The http daemon, httpd, is a general server program which runs a W3
 protocol, "HTTP".  This is a TCP/IP based protocol running by
 convention on port 80.
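 As an illustration only, the following Python sketch shows how a
 client might talk to such a daemon.  It assumes the simplest form of
 HTTP request, a single "GET" line, and that the server closes the
 connection once the document has been sent.  The function names are
 hypothetical and are not part of the distributed code.

```python
import socket


def build_request(path):
    """Build a one-line HTTP request for the given document path."""
    return ("GET " + path + "\r\n").encode("ascii")


def fetch(host, path, port=80, timeout=10.0):
    """Connect to host:port, send the request, and read until close."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

 A server under test on a non-standard port could be queried with, for
 example, fetch("myhost.dom.ain", "/welcome.html", port=5000).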
 
In this guide

  Distribution            How to get the code.
                         
  Compilation             The daemon is compiled in the same way as
                         the library and line mode browser -- see WWW
                         distributed code.
                         
  Installation            How to install a server under the unix
                         internet daemon
                         
  Options                 Command line options at run time
                         
  Rule File               The format of a rule file. By default,
                         /etc/httpd.conf
                         
  Etiquette               Conventions you should follow to make life
                         smoother
                         
  Debugging               If it doesn't seem to work
                         
  Known bugs              and improvements desired
                         
  Change History          change list of improvements made and bug
                         fixes.
                         
Related documents

  HTML specification      A description of the hypertext markup
                         language used for representing menus, etc
                         
  HTTP specification      A description of the protocol used by the
                         server.
                         
Status of basic WWW server

 A basic fast information server for files.
 
  Author                  TBL
                         
  Status:                 Version  2 available by anonymous FTP, with
                         no index search but file access, name mapping
                         and security filter, ability to act as
                         gateway for anything in the WWW library's
                         repertoire, including WAIS.
                         
  Plans:                  A version which will allow general unix
                         users to set up an index search daemon. As
                         index search tools are not generally
                         available, we may use the NeXT Digital
                         Librarian or WAIS as a basis.
                         
  Platforms               Unix, VMS, VM/CMS (VM/XA).
                         
  Next Milestone:         Run shell scripts to implement virtual
                         documents and searches.
                         
  More information:       User guide, Bug list, Internals, Change
                         history.
                         
  Wider scope:            W3 servers , Other WWW software
                         
 Features include
 
      Installation under inetd or run stand-alone
      
      Can be run stand-alone by normal user
      
      Automatically generates hypertext view of directory tree
      
      Uses "README" files to document directory listings
      
      Handles multiple formats of same file, selects format
      appropriate for client capabilities
      
      Document name to filename mapping for longer-lived document
      names
      
      Can act as gateway for WAIS, news, etc if needed
      
WorldWideWeb distributed code

 See the CERN copyright.  This is the README file which you get when
 you unwrap one of our tar files. These files contain information
 about hypertext, hypertext systems, and the WorldWideWeb project. If
 you have taken this with a .tar file, you will have only a subset of
 the files.
 THIS FILE IS A VERY ABRIDGED VERSION OF THE INFORMATION AVAILABLE ON
 THE WEB.   IF IN DOUBT, READ THE WEB DIRECTLY. If you have not got
 ANY browser installed yet, do this by telnet to info.cern.ch (no
 username or password).
 
  ARCHIVE DIRECTORY STRUCTURE
 Under /pub/www, besides this README file, you'll find bin, src and
 doc directories.  The main archives are as follows:
 
  bin/xxx/bbbb            Executable binaries of program bbbb for
                         system xxx. Check what's there before you
                         bother compiling. (Note HP700/8800 series is
                         "snake")
                         
  bin/next/WorldWideWeb_v.vv.tar.Z
                         The Hypertext Browser/editor for the NeXT --
                         binary.
                         
  src/WWWLibrary_v.vv.tar.Z
                         The W3 Library. All source, and Makefiles
                         for selected systems.
                         
  src/WWWLineMode_v.vv.tar.Z
                         The Line mode browser - all source, and
                         Makefiles for selected systems. Requires the
                         Library.
                         
  src/WWWDaemon_v.vv.tar.Z
                         The HTTP daemon, and WWW-WAIS gateway
                         programs. Source.  Requires the Library.
                         
  src/WWWMailRobot_v.vv.tar.Z
                          The Mail Robot.
                         
  doc/WWWBook.tar.Z       A snapshot of our internal documentation -
                         we prefer you to access this on line -- see
                         warnings below.
                         
  BASIC WWW SOFTWARE INSTALLATION FROM SOURCE
 This applies to the line mode client and the server.  Below, $prod
 means LineMode or Daemon depending on which you are building.
 
    Generated Directory structure
 The tar files are all designed to be unwrapped in the same (this)
 directory. They create different parts of a common directory tree
 under that directory. There may be some duplication. They also
 generate a few files in this directory: README.*, Copyright.*, and
 some installation instructions (.txt).
 The directory structure is, for product $prod  and machine $WWW_MACH
 
  WWW/$prod/Implementation
                          Source files for a given product
                         
  WWW/$prod/Implementation/CommonMakefile
                         The machine-independent parts of the Makefile
                         for this product
                         
  WWW/$prod/$WWW_MACH/    Area for compiling for a given system
                         
  WWW/All/$WWW_MACH/Makefile.include
                         The machine-dependent parts of the makefile
                         for any product
                         
  WWW/All/Implementation/Makefile.product
                         A makefile which includes both parts above
                         and so can be used from any product, any
                         machine.
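 
 To make the layout concrete, here is a small Python sketch which
 spells out the paths listed above for one product and one machine.
 It is only an aid to reading the table: the function name and the
 dictionary keys are invented for this example, and the build itself
 does not use Python.

```python
import posixpath


def build_tree(prod, mach, root="WWW"):
    """Return the paths used when building product `prod` for `mach`."""
    return {
        # Source files for the product
        "source": posixpath.join(root, prod, "Implementation"),
        # Machine-independent part of the Makefile
        "common_makefile": posixpath.join(
            root, prod, "Implementation", "CommonMakefile"),
        # Area in which the product is compiled for this machine
        "build_area": posixpath.join(root, prod, mach),
        # Machine-dependent part of the makefile, shared by all products
        "machine_makefile": posixpath.join(
            root, "All", mach, "Makefile.include"),
        # Makefile which includes both parts above
        "product_makefile": posixpath.join(
            root, "All", "Implementation", "Makefile.product"),
    }
```

 For example, build_tree("Daemon", "sun4") gives WWW/Daemon/sun4 as
 the area for compiling the daemon on a sun4.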
                         
    Compilation on already supported platforms
 You must get the WWWLibrary tar file as well as the products you want
 and unwrap them all from the same directory.
 You must define the environment variable WWW_MACH to be the
 architecture of your machine (sun4, decstation, rs6000, sgi, snake,
 etc.).
 In directory WWW, type BUILD.
 
    Compilation on new platforms
 If your machine is not on the list:
 
      Make up a new subdirectory of that name under WWW/$prod and
      WWW/All, copying the contents of a basically similar
      architecture's directory.
      
      Check the  WWW/All/$WWW_MACH/Makefile.include for suitable
      directory and flag definitions.
      
      Check the file tcp.h for the system-specific include file
      coordinates, etc.
      
      Send any changes you have to make back to
      www-request@info.cern.ch for inclusion into future releases.
      
      Once you have this set up, type BUILD.
      
  NEXTSTEP BROWSER/EDITOR
 The browser for the NeXT consists of the files contained in the
 application directory WWW/Next/Implementation/WorldWideWeb.app and is
 compiled.  When you install the app, you may want to configure the
 default page, WorldWideWeb.app/default.html. This must point to some useful
 information! You should keep it up to date with pointers to info on
 your site and elsewhere. If you use the CERN home page note there is
 a link at the bottom to the master copy on our server.   You should
 set up the address of your local news server with
 
                      dwrite WorldWideWeb NewsHost  news

 replacing the last word with the actual address of your news host.
 See Installation instructions.
 
  LINE MODE BROWSER
 Binaries of this for some systems are available in /pub/www/bin/.
 The binaries can be picked up, set executable, and run immediately.
 If there is no binary, see "Installation from source" above
 (see Installation notes).  Do the same thing (in the same
 directory) to the WWWLibrary_v.cc.tar.Z file to get the common
 library.
 You will have an ASCII printable manual in the file
 WWW/LineMode/Defaults/line-mode-guide.txt which you can print out at
 this stage. This is a frozen copy of some of the online
 documentation.
 When you install the browser, you may configure a default page. This
 is /usr/local/lib/WWW/default.html for the line mode browser. This
 must point to some useful information! You should keep it up to date
 with pointers to info on your site and elsewhere. If you use the CERN
 home page note there is a link at the bottom to the master copy on
 our server.
 Some basic documentation on the browser is delivered with the home
 page in the directory WWW/LineMode/Defaults. A separate tar file of
 that directory (WWWLineModeDefaults.tar.Z) is available if you just
 want to update that.
 The rest of the documentation is in hypertext, and so will be readable
 most easily with a browser. We suggest that after installing the
 browser, you browse through the basic documentation so that you are
 aware of, for example, the options and customisation possibilities.
 
  SERVER
 The server can be run very simply under the internet daemon, to
 export a file directory tree as a browsable hypertext tree.  Binaries
 are available for some platforms; otherwise follow the instructions
 above for compiling and then go on to "Installing the basic W3
 server".
 
  XMOSAIC
 XMosaic is an X11/Motif W3 browser.
 The sources and binaries are distributed separately from
 FTP.NCSA.UIUC.EDU, in /Web/xmosaic.  Binaries are available for some
 platforms.  If you have to build from source, check the README in the
 distribution.
 The binaries can be picked up, uncompressed, set "executable" and run
 immediately.
 
  VIOLA BROWSER FOR X11
 Viola is an X11 application for reading global hypertext.  If a
 binary is available for your machine, in /pub/www/bin/.../viola*,
 then take that and also the Viola "apps" tar file which contains the
 scripts you will need.
 To generate this from source, you will need both the W3 library and
 the Viola source files.  There is an Imakefile in the viola source
 directory. You will need to generate the XPA and XPM libraries and
 the W3 library before you make viola itself.
 
  DOCUMENTATION
 In the /pub/www/doc directory are a number of articles, preprints and
 guides on the web.
 See the online WWW bibliography for a list of these and other
 articles, books, etc. and also the list of WWW Manuals available in
 text and postscript form.
 
  GENERAL
 Your comments will of course be most appreciated, on code, or
 information on the web which is out of date or misleading. If you
 write your own hypertext and make it available by anonymous ftp or
 using a server, tell us and we'll put some pointers to it in ours.
 Thus spreads the web...
 
                                                       Tim Berners-Lee
                                                                      
                                                  WorldWideWeb project
                                                                      
                                     CERN, 1211 Geneva 23, Switzerland
                                                                      
 Tel: +41 22 767 3755; Fax: +41 22 767 7155; email: timbl@info.cern.ch
                                                                      
Installing the basic WWW server

 Instructions for installing it under unix using the inet daemon are
 here.
 There are special instructions if you are installing under VMS.
 The usual way to install a daemon is to either run it from the
 bootstrap command file (for example /etc/rc) so that it runs
 continuously, or to set up the internet daemon (inetd) to run it when
 a call comes in.
 See a csh script which does everything below for unix BSD systems but
 which you should modify with care for your own system.
 Note: From version 2.0 on, a rule file is no longer essential if you
 want to just export a directory tree.
 The installation normally requires superuser status, but it is
 possible to run httpd from a terminal session as a normal user.
 
  LOG FILE
 If a log file is required, make sure that the user name under which
 the daemon is run has the right to write the file.
 
                                                                Tim BL
                                                                      
  PRIVILEGED PORTS
  
   The TCP/IP port numbers below 1024 are special in that normal users
   are not allowed to run servers on them.  This is a security
   feature, in that if you connect to a service on one of these ports
   you are fairly sure that you have the real thing, and not a fake
   which some hacker has put up for you.
   
   The normal port number for W3 servers is port 80, which is such a
   port. (This number is assigned by the Internet Assigned Numbers
   Authority, IANA).
   
   When you run a server as a test from a non-privileged account, you
   will normally test it on other ports, typically ones such as 2784
   or 5000.
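   
   The distinction can be checked from a program.  The Python sketch
   below is an illustration only: the function names are invented
   here, and the bind test assumes a unix-like system on which an
   unprivileged process is refused ports below 1024.

```python
import socket

PRIVILEGED_LIMIT = 1024  # ports below this are reserved for privileged users


def is_privileged(port):
    """Return True if the port lies in the range reserved for root."""
    return 0 < port < PRIVILEGED_LIMIT


def can_listen(port, host="127.0.0.1"):
    """Try to bind a TCP socket to the port; report whether it worked."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Permission denied (privileged port) or port already in use.
            return False
        return True
```

   A normal user would expect is_privileged(80) to be true and would
   therefore pick a test port for which can_listen() succeeds.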
   
    Under unix
    
   The inet daemon (running as root) can listen for incoming
   connections on port 80 and pass them down to a process with a safer
   uid for the server itself. Of course, you have to be root to set up
   the inet daemon.
   
    Under VMS
    
   Under UCX, the process running as a server needs BYPASS privilege
   to listen to ports below 1024.  This might mean you have to install
   the server.  With other TCP/IP packages, privilege of some sort is
   similarly required.
   
                                                                Tim BL
                                                                      
  INSTALLING A DAEMON UNDER INETD
 This is how to set up the internet daemon (inetd) to run your
 HTTPD server whenever a request comes in.   (These steps are the same
 for any daemon under unix: you will probably find a similar thing has
 been done for the FTP daemon, ftpd, for example.)
 
    Step 1
 Copy the daemon program or shell script ( httpd in this example) into
 a suitable directory such as /usr/etc. Protect it from anyone writing
 to it except root.
 
    Step 2
 Put "http" in the /etc/services file, or use the name of a specific
 service of your own if you want to use a special port number.
 (Exceptions: on a NeXT, see using the NetInfoManager.  On any
 machine running NIS (yellow pages), see special instructions.)
 For example,
 
http            80/tcp                  # WorldWideWeb server

    Step 3
 Put a line in the internet daemon configuration file,
 /etc/inetd.conf. For example,
 
http    stream  tcp     nowait  nobody  /usr/etc/httpd          httpd
/Public

 (That was all one line.) Here "http" is used as a link between the
 services file and inetd.conf: it could have been any identifier.
 "nobody" is the user name under which you want the daemon to run,
 which determines what privileges it has for example to read data.
 "/usr/etc/httpd" is the actual file name of the server. The rest of
 the line is the arguments passed to httpd: arg0 is the program name,
 "httpd",  by convention. Here the argument "/Public"  is the
 directory tree to be exported. This is in fact the default if no
 directory is given. See command line syntax for more details.
 Note: The inetd.conf format varies from system to system. If in
 doubt, copy the format of other lines in your existing inetd.conf.
 For example, under ultrix there is no user name field -- everything
 runs as root.
 Note: there seems to be, on the NeXT at least, a limit of 4 arguments
 passed across by inetd!
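 The field layout described above can be made explicit with a short
 Python sketch.  It assumes the seven-or-more field format of the
 example line, including the user name field, so it would need
 changing for systems such as ultrix.  The function name and the
 dictionary keys are invented for this illustration.

```python
def parse_inetd_line(line):
    """Split one inetd.conf entry into named fields."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))
    return {
        "service": fields[0],      # links the entry to /etc/services
        "socket_type": fields[1],
        "protocol": fields[2],
        "wait": fields[3],
        "user": fields[4],         # user name the daemon runs under
        "server": fields[5],       # actual file name of the server
        "argv": fields[6:],        # arg0 (program name) and its arguments
    }
```

 Applied to the example line, this gives "nobody" as the user and
 ["httpd", "/Public"] as the argument list.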
 
    Step 4
 When you have updated inetd.conf, find out which process is running
 inetd, and send it a "HUP" signal.  On BSD unix this looks like the
 following (for System V, use ps -el in place of ps aux):
 
                
                > ps aux | grep inetd | grep -v grep
                root        85   0.0  0.9 1.24M  304K ?  S     0:01 /usr/etc/inetd
                > kill -HUP 85
                >
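
 The same step can be scripted.  The Python sketch below is a rough
 equivalent under stated assumptions: it expects "ps aux" style
 output in which the second column is the process id, and it treats
 any process whose command is named "inetd" as the internet daemon.
 The function names are invented here.  Sending the signal needs the
 same privileges as the kill command above.

```python
import os
import signal


def inetd_pids(ps_output):
    """Return the process ids of inetd found in `ps aux` style output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # header line or blank line
        if "grep" in fields:
            continue  # the search command itself, as with grep -v grep
        if any(os.path.basename(f) == "inetd" for f in fields[2:]):
            pids.append(int(fields[1]))
    return pids


def reload_inetd(ps_output):
    """Send HUP to every inetd process listed in the ps output."""
    for pid in inetd_pids(ps_output):
        os.kill(pid, signal.SIGHUP)
```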


    Test it
 Test the server with the line mode browser by giving its address
 explicitly:
 
                        www http://myhost.dom.ain/welcome.html

 This assumes that you have a file "welcome.html" in your exported
 directory.  If it doesn't work, you have probably missed something.
 See notes on debugging.
 
                                                                Tim BL
                                                                      
  USING NIS (YELLOW PAGES)
 If your machine is running Sun's "Network Information Service",
 originally known as "yellow pages", read this.
 You must:
 
      First make an addition to the /etc/services file just as for a
      normal unix system.
      
      Then, change directory to /var/yp and type "make".
      
 This will load the /etc/services file into the yellow pages
 information system.
 Some people have found that they needed to reboot the system afterward
 for the change to take effect.
 
                                                                Tim BL
                                                                      
  ADDING A SERVICE ON THE NEXT
 The NeXT uses the "netinfo" database instead of the /etc/services
 file.  This is managed with the /NextAdmin/NetInfoManager
 application. Here's how to add the service "www":
 
      Start the NetInfoManager by double-clicking on its icon.
      
      If you are operating in a cluster,  open either your local
      domain (/hostname) or if you have authority, the whole cluster
      domain (/). If you're not in a cluster,  just use the domain you
      are presented with.
      
      Select "services" from the browser tree.
      
      Select "ftp" from the list of services
      
      Select "duplicate" from the edit menu.
      
      Select "copy of ftp" and double-click on its icon to get
      the property editor.
      
      Click on  "name" and then on the value "copy of ftp". Change
      this to "www" by typing "www" in the window at the bottom, and
      hitting return.
      
      Click on "port", and then on the value "21". Change it to "80".
      
      Use "Directory:Save" menu (Command/s) to save the result. You
      will have to give a root password or netinfo manager password.
      
                                                                Tim BL
                                                                      
The Rule File

 The rule file (configuration file) defines how the WWW software will
 translate a request into a document name.   For a server, it allows
 one to provide an extra level of name mapping above that given by
 links in the file system. It allows, for example, out of date names
 to be mapped onto their more recent counterparts.
 For the client, it allows access to certain servers to be remapped,
 for example to caching servers or to local copies of the same
 information.
 The rule file also allows access to be restricted.  This is
 essential, to prevent, for example, unauthorized access to your
 password file.
 By default, the rule file /etc/httpd.conf is loaded, unless specified
 otherwise with the -R or -r options.
 See also: example rule files, Old format for software before 2.0,
 Setting up gateways, Firewall gateways.
 
  FORMAT
 Each line consists of an operation code and one or two parameters,
 referred to as the template and the result. Anything on a line after
 and including a hash sign (#) is ignored, as are empty lines.
 The server uses the top rule first, then EACH SUCCESSIVE RULE unless
 told otherwise by PASS or FAIL. The operation codes are as follows:
 
  map template result     If the address matches the template, use the
                         result string from now on for future rules.
                         
  pass template           If the address matches the template, use it
                         as it is, processing no further rules.
                         
  pass template result    If the address matches the template, use the
                         result string as it is, processing no further
                         rules.
                         
  fail template           If the address matches the template,
                         prohibit access, processing no further rules.
                         
 The template string may contain at most one wildcard asterisk ("*").
 The result string may have one wildcard only if the template has one.
 When matching,
 
      Rules are scanned from the top of the file to the bottom.
      
      If a request matches a "map" template exactly, the result string
      is used instead of the original string and applied to successive
      rules.
      
      If the request matches a "map" template with wildcard, then the
      text of the request which matches the wildcard is inserted in
      place of the wildcard in the result string to form the
      translated request. If the result string has no wildcard, it is
      used as it is.
      
      When a map substitution takes place, the rule scan continues
      with the next rule using the new string in place of the request.
      This is not the case if a pass or fail is matched: they
      terminate the rule scan.
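      
 The scan described above can be sketched in Python as follows.  This
 is an illustration of the matching rules as stated here, not the
 daemon's own code.  The function names are invented, and one point
 is an assumption: when the scan reaches the end of the file without
 meeting a pass or fail, the sketch simply returns the string as
 mapped so far.

```python
def parse_rules(text):
    """Read rule lines, ignoring '#' comments and empty lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            rules.append(tuple(line.split()))
    return rules


def match(template, text):
    """Return (matched, wildcard_text); template has at most one '*'."""
    if "*" not in template:
        return (template == text, "")
    prefix, suffix = template.split("*", 1)
    if (len(text) >= len(prefix) + len(suffix)
            and text.startswith(prefix) and text.endswith(suffix)):
        return (True, text[len(prefix):len(text) - len(suffix)])
    return (False, "")


def translate(rules, request):
    """Apply rules top to bottom; return the result, or None on fail."""
    current = request
    for op, template, *rest in rules:
        matched, wild = match(template, current)
        if not matched:
            continue
        # Insert the wildcard text into the result string, if there is one.
        result = rest[0].replace("*", wild, 1) if rest else current
        if op == "map":
            current = result      # carry on with the new string
        elif op == "pass":
            return result         # accept, no further rules
        elif op == "fail":
            return None           # prohibit access, no further rules
    return current  # assumption: no pass or fail matched
```

 Because the wildcard text is taken as whatever lies between the fixed
 start and end of the template, it may include slashes.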
      
  SUFFIX DEFINITIONS
 As well as any mapping lines in the rule file, the rule file may be
 used to define the data types of files with particular suffixes.  The
 syntax is
 
                suffix  <suffix>  <representation> <encoding> [ <quality> ]

 for example:
 
                suffix  .pc     text/plain          7bit        1.0
                suffix  *.*     application/binary  binary      0.1
                suffix  *       text/plain          7bit


 The parameters are as follows:
 
  <suffix>                The last part of the filename. There are two
                         special cases. "*.*" matches all files
                         which have not been matched by any explicit
                         suffix but do contain a dot. "*" by itself
                         matches any file which does not match any
                         other suffix.
                         
  <representation>        A MIME "content-type" style description of
                         the representation actually in use in the
                         file.  See the HTTP spec.  This need not be a
                         real MIME type -- it will only be used if it
                         matches a type given by a client.
                         
  <encoding>              A MIME content transfer encoding type.  Much
                         more limited in variety than representations,
                         basically whether the file is ASCII (7bit or
                         8bit) or binary. A few other encodings are
                         allowed, and this may be extended to cover
                         compression.
                         
  <quality>               Optional. A floating point number between
                         0.0 and 1.0 which determines the relative
                         merits of files xxx.* which differ in their
                         suffix only, when a link to xxx.multi is
                         being resolved.  Defaults to 1.0.
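                         
 As a sketch of how such lines might be read and applied, the Python
 below parses suffix definitions and looks up a file name, trying
 explicit suffixes first, then "*.*", then "*".  The function names
 are invented for this example, and the lookup takes a bare file name
 rather than a full path.

```python
def parse_suffix_line(line):
    """Parse one 'suffix' line into (suffix, (type, encoding, quality))."""
    fields = line.split()
    if len(fields) not in (4, 5) or fields[0] != "suffix":
        raise ValueError("not a suffix definition: %r" % line)
    quality = float(fields[4]) if len(fields) == 5 else 1.0  # default 1.0
    return fields[1], (fields[2], fields[3], quality)


def describe(filename, table):
    """Return (representation, encoding, quality) for a file name."""
    for suffix, info in table.items():
        if suffix not in ("*", "*.*") and filename.endswith(suffix):
            return info  # explicit suffix
    if "." in filename and "*.*" in table:
        return table["*.*"]  # unmatched, but contains a dot
    return table.get("*")  # anything else, or None if no "*" line
```

 With the three example lines above loaded into a dictionary, a file
 called prog.pc is described as text/plain, data.bin falls through to
 application/binary, and README is picked up by the "*" line.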
                         
  PRESENTATION DEFINITIONS
 In the rule file for a client, you can define the presentation of a
 given data type. The syntax is
 
                presentation   <representation>  <command-string>

 where the parameters are
 
  <representation>        A MIME-style content type. You can use
                         regular MIME types, such as image/jpeg, or
                         your own extensions which start with x-, such
                         as image/x-tiff, application/x-my-app.  See
                         also above.
                         
  <command-string>        The command needed to display a temporary
                         file of this type.  A "%s" within this string
                         will be replaced with the name of the
                         temporary file.  Note that if any file suffix
                         has been specified as corresponding to this
                         representation, then the temporary file will
                         be given that suffix (or the first suitable
                         suffix if there is a choice).
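                         
 The substitution can be pictured with the following Python sketch,
 which writes the data to a temporary file carrying the given suffix
 and fills the file name into the command string.  It is a guess at
 the general shape only: the function name is invented, and quoting
 the file name for the shell is an addition of this example.

```python
import shlex
import tempfile


def build_command(command_string, data, suffix=""):
    """Write data to a temporary file and fill its name into the command.

    Returns (command, temporary_file_name); the caller removes the file.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        name = tmp.name
    # Replace each "%s" with the (shell-quoted) temporary file name.
    return command_string.replace("%s", shlex.quote(name)), name
```

 For instance, with a hypothetical viewer command "viewer %s" and the
 suffix ".jpeg", the result is "viewer" followed by the name of a
 temporary file ending in .jpeg.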
                         
                                                                Tim BL
                                                                      
  RULE FILE EXAMPLES
 A basic rule file for the http daemon might look like this (it looked
 different before version 2.0):
 

pass    /          file:/u/john/welcome.html
pass    /*         file:/u/john/public/*
fail   *

 The first line maps the root document onto a specific document about
 the server, and accepts it.  (See etiquette about the welcome page.)
 The second line maps all document names onto filenames in a
 particular directory and accepts them.
 The third line disallows access to all other documents. (There won't
 be any in this case because of the mapping, but it's wise to put it
 in for later.)
 
    Second example
    

map    /            /tnotes/welcome.html
map    /tnotes/*    file:/u/john/public/*
map    /seminars/*  file:/u/jane/seminars/*
pass   file:/u/john/public/*
pass   file:/u/jane/seminars/*.html
fail   *

 The first line maps the root document onto a specific document about
 the server.   Because it is "map" and not "pass", it DOESN'T accept
 it but passes it on for further mapping by lines further down.
 The second line maps all document names starting with /tnotes/ onto
 filenames in a particular directory where john maintains the
 technical notes. If someone else takes over the technical notes, we
 can change this. Here we are starting to distinguish between document
 names and file names. This can be carried much further if necessary,
 but one level of mapping is enough to allow for changes of
 administration of different areas.
 The third line separately maps the seminar information into Jane's
 directory.
 The fourth and fifth line enable access to anything in John's
 "public" directory, and any .html file in Jane's "seminar" directory
 tree. Note here that the * maps to any sequence INCLUDING SLASHES so
 all files in any subdirectory of /u/jane/seminars will be enabled so
 long as they end in .html.
 The bottom line will pick up for example any attempt to use the
 server to access non-html files in Jane's seminars directory.
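
 The way the rule lines above are applied can be sketched in Python.
 This is a simplified model written for this guide, not the daemon's
 own code: the function names are made up, only one "*" per template
 is handled, and refusing a request which no rule accepts is an
 assumption.

```python
def match(template, name):
    """Return the text matched by "*" in template, or None if no match.

    A template without "*" must equal name exactly. The "*" matches any
    run of characters, slashes included.
    """
    if "*" not in template:
        return "" if template == name else None
    prefix, suffix = template.split("*", 1)
    if (len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix) and name.endswith(suffix)):
        return name[len(prefix):len(name) - len(suffix)]
    return None


def parse_rules(text):
    """Turn rule file text into (operation, template, result) tuples."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        result = fields[2] if len(fields) > 2 else None
        rules.append((fields[0].lower(), fields[1], result))
    return rules


def translate(rules, name):
    """Apply the rules in order; return the accepted address or None."""
    for operation, template, result in rules:
        matched = match(template, name)
        if matched is None:
            continue                      # this line does not apply
        if operation == "fail":
            return None                   # access refused
        if result is not None:
            name = result.replace("*", matched, 1)
        if operation == "pass":
            return name                   # accepted, stop here
        # "map": carry on down the file with the rewritten name
    return None                           # assumption: nothing accepted it
```

 With the second example, translate() turns "/" first into
 /tnotes/welcome.html, then into file:/u/john/public/welcome.html,
 which the first "pass" line accepts. A request for
 /seminars/a/notes.txt is mapped into Jane's directory but falls
 through to the final "fail" line.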
 
    Configuration file for a WAIS gateway
 The httpd daemon can be used as a WAIS gateway if it has been compiled
 with the necessary options and linked with the freeWAIS software. A
 suitable configuration file is
 
map     /*              wais://*
pass    wais://*
fail    *


Server Command Line

 The command line syntax for the basic www server allows a number of
 options and an optional directory argument.
 
                        httpd  [options] [directory]

 The directory argument, if present, indicates the directory to be
 exported. (Version 2.0 and later only.)  If not present, either a
 rule file is used, to export combinations of directories, or else
 the default is to export the "/Public" directory tree.
 
  EXAMPLES
  
                        httpd -p 80  -dyt /ftp/pub

 This exports the entire /ftp/pub tree with browsable directories and
 README files included at the top of directory listings.



 
                        httpd

 This command in the inetd configuration file inetd.conf exports the
 /Public directory tree.  This tree may contain soft links to other
 directory trees.
 
  -dn                     Disable directory browsing. An attempt to
                         access a directory will generate an error
                         response.
                         
  -dy                     Enable directory browsing.  Directories are
                         returned as hypertext documents. See browsing
                         directories. This is the default.
                         
  -ds                     Enable directory browsing only for
                         directories containing a file named
                         ".www_browsable".
                         
  -dt                     For any browsable directory which contains a
                         README file, include the text of the README
                         file at the top of the document before the
                         listing. This is the default.
                         
  -db                     As -dt but put the README at the bottom,
                         after the listing.  The -db and -dt options
                         may be combined with -dy as -dyb, -dty etc.
                         
  -dr                     Disables the README inclusion feature .
                         
  -l  file                Log all calls to the given file. The file is
                         appended to if it already exists.
                         
  -p port                 Specify the port number. If this option is
                         not given, the daemon assumes that it has
                         been run by inetd, and uses stdin and stdout
                         as its communication channel . Note that port
                         numbers under 1024 are privileged .
                         
  -v                      Verbose mode. Copious trace messages are
                         written to the standard output stream. Mainly
                         for debugging.
                         
  -r file                 Load a rule file . The rules are added after
                         any rules already loaded.  Inhibits the
                         loading of the default rule file.
                         
  -R                      Do not use. Inhibit the loading of the
                         default rule file.  Warning: running without
                         a rule file  normally poses a security
                         problem.  It won't work in general as only
                         the path part of a URL is input into the rule
                         file, and a fully qualified URL (with file:
                         in front for example) is required on output.
                         
                                                                Tim BL
                                                                      
Debugging the daemon

 Suppose you think you have installed a W3 server but it doesn't work.
 That is, you have followed the installation instructions and the test
 at the end fails. Here we assume you have used port 80.  If you have
 a situation not handled by this problem-solving guide, please mail
 me.
 Type
 
        www http://myhost.domain:80/


 What happens?
 
      "Cannot connect to information server" message, "Unable to
      access document" or some other generic-sounding error message
      
      An empty document is displayed
      
      A document containing the words "Document address invalid or
      access not authorised", or some "Error 500" message is displayed
      
      A document is displayed, but not what you wanted the server to
      give in response to that document name (/)
      
                                                                Tim BL
                                                                      
  DOCUMENT ADDRESS INVALID
 You have accessed a W3 server and you get back a message "Document
 address invalid or access not authorized", or some other error
 message from the server.
 The 1.x server does not (originally for security reasons)
 distinguish between a document which does not exist, and one to
 which you are not allowed access.  However, most servers are public
 servers which allow access to anyone, so if you are following a bona
 fide link, this could mean
 
      You have been passed a bad document address. If you are
      following a link, check with the author of the document which
      contained the link.
      
      The document has been moved. Check with the server
      administrator. You should be able to find out who runs the
      server by going to the welcome page (type "g /" with the line
      mode browser) and seeing a link to information about the
      maintainers.
      



 If you are the server administrator, and you can't  understand why
 the daemon refuses to deliver the file,
 
      Check the rule file if you have one.  Think out the way the
      document name will be mapped successively by each line, and what
      the result will be. Checking the trace below may help clarify
      this.
      
      Run the daemon with trace from a terminal session to get trace
      information
      
                                                                Tim BL
                                                                      
  CAN'T CONNECT TO SERVER
  
   There is more information you can get.  Use the "verbose" option on
   the browser to find out what went wrong:
   
                        www -v http://myhost.domain:80/

   What do you get? A load of trace messages. There are several cases.
   
      The browser can't look up the name of the host. If it can, it
      will display a "Parsed address as" message. If not, try fixing
      your name server or /etc/hosts file, or quoting the IP number of
      the host in decimal notation (like 128.141.77.45) instead.
      
      The browser can get to the host but gets "Connection refused"
      status back.
      
      Your browser gets an error number but prints "error message not
      translated". This is because when it was compiled on your
      platform it didn't know what form the error message table took.
      Try the same thing from a unix platform for example.
      
      You get some network error like "network unreachable". Depending
      on whether the IP network is your responsibility or not, and
      your attitude to life, either fix it,  try again in an hour's
      time, or complain to someone.
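
      The cases above can also be told apart without the browser. The
      following Python sketch is not part of the www software: the
      function name diagnose and the wording of its messages are
      assumptions made for this example.

```python
import socket


def diagnose(host, port, timeout=10.0):
    """Try to connect and return a one-line description of the outcome."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except socket.gaierror as error:
        # The host name could not be looked up at all.
        return "cannot look up host name: %s" % error
    except ConnectionRefusedError:
        # The host was reached, but no one listens on that port.
        return "connection refused: nothing is listening on that port"
    except OSError as error:
        # Anything else, such as "network unreachable" or a timeout.
        return "network error: %s" % error
    connection.close()
    return "connected"
```

      For example, diagnose("myhost.domain", 80) returns one of the
      messages above depending on which case applies.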
      
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
  "CONNECTION REFUSED"
  
 The browser tries to connect to the daemon but gets this status in
 the trace.
 
 This means that no one was listening on that port number. Check that
 the port numbers match between server and client.  Make sure you
 specify the port number explicitly in the document address for www.
 
 If you are running the daemon without the inet daemon (with the -a
 option), then try running it from the terminal with -v as well.  The
 trace for the server should say "socket, bind and listen all ok". If
 it does, and you still get "connection refused", then you must be
 talking to the wrong host (or, conceivably, different ethernet
 adapters on the same host).
 
 If you are running with the inet daemon, then check both the services
 file (/etc/services) or database (yellow pages, netinfo) if your
 system uses it, and the /etc/inetd.conf file. Check that the service
 name matches between these two.
 
 Did you remember to kill -HUP the inet daemon when you changed the
 inetd.conf file?
 
 Try running the daemon from a shell window to see better what
 happens.
                                                                Tim BL
                                                                      
  YOU GET AN EMPTY DOCUMENT
 The document sent back is empty, but there is no error message.
 The inet daemon has started a process to run your server but it
 immediately failed.  Possibilities include:
 
      The daemon may not be in the file specified, or may not be
      executable by the specified user (or, if a user id is not
      specified in your variety of inetd.conf, root)
      
      You have written your own daemon and it crashes.
      
      You are using ours and it crashes (mail us!)
      
 Try running the daemon from a terminal window to see what happens.
 
                                                                Tim BL
                                                                      
  BAD OUTPUT FROM THE DAEMON
  
 These are some ideas:
 
      Try running the server from the terminal.
      
      Check the HTML source the daemon produces with
      
        www -source http://myhost.domain:80/

      Try telnetting to the daemon and simulating the client:
      

        > telnet myhost.domain 80
        Connected to myhost.domain on port 80
        Escape is ^[
        GET /documentname




                                                                Tim BL
                                                                      
  TELNETTING TO A SERVER
  
 Most implementations of telnet allow you to specify a port number.
 Under unix this is often just a second parameter, under VMS a /PORT
 option.
 
 The HTTP protocol is a telnet-style text protocol, so you can
 simulate it just by typing things in.  This will help you to see
 exactly what the server is sending back, and it will check for you
 that it really is the server, not the browser, which has a problem.
 
 Here is an example. (You type "telnet..." and "GET ...").
                                                                      
        > telnet myhost.domain 80
        Connected to myhost.domain on port 80
        Escape is ^[
        GET /documentname
        <PLAINTEXT>
        Document name "/documentname" invalid.
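
 The same exchange can be scripted. The following is a minimal Python
 sketch, assuming the simple form of the protocol shown above, in
 which the client sends one "GET" line and the server closes the
 connection when the document is complete. The function name
 simple_get is an assumption for this example.

```python
import socket


def simple_get(host, port, document_name, timeout=10.0):
    """Send "GET <name>" as one line and return everything sent back."""
    with socket.create_connection((host, port), timeout=timeout) as connection:
        request = "GET %s\r\n" % document_name
        connection.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = connection.recv(4096)
            if not chunk:       # server closed the connection: reply complete
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("latin-1")
```

 For example, print(simple_get("myhost.domain", 80, "/documentname"))
 shows exactly what the server sends, with no browser in the way.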

  RUNNING UNDER SHELL
 You don't have to run the daemon under inetd if it doesn't work.
 You can run it from a shell session.
 If the daemon is httpd, then run it from your terminal, with a
 different port number like 8000.  You use the -p option.
 
                httpd -p 8000

 Note: You must be root (under VMS, have some privilege) to run with a
 port number below 1024.
 If you select a port above 1024, then you can run as a normal user.
 This way, anyone can publish files on the net. However, it isn't
 very reliable, as your server will not automatically come back up if
 the machine is rebooted. In the long term it is best to install it
 under "inetd".
 You can't use a port number which has been used by a daemon process
 recently, so you may have to switch port number if you ^C and restart
 the daemon.  When it is running like this, you can read the trace
 messages and use a debugger on it if necessary. (See also: telnetting
 to the server )
 
    Debugging using Trace
 If you can't understand why a server refuses to give back a document,
 then run with the -v option to get trace.  You will see the daemon
 setting up the rules for translating requests into local URLs, and
 you will see its attempt to access the file (assuming you map requests
 onto files).
 
                httpd -v -p 8000

 Try to access the document from a client using another terminal
 window. Look at the trace printout.  It will probably explain what is
 happening.  If it includes specific messages below, follow them to
 detailed help.
 
      Can't find internet hostname `'
      
 If you still can't figure out the problem, mail your local guru help
 desk or if desperate www-request@info.cern.ch ENCLOSING a copy of
 that trace.
 
    Even simpler
 For testing a daemon very simply, without using a client, you can
 make the terminal be the client.  With httpd, or if the server is a
 shell script "myserver", try just running it with the terminal and
 typing GET /documentname into its input:
 
                        > httpd
                        GET /

 Try it with the -v option if what comes back isn't a formatted
 document.
 
                                                                Tim BL
                                                                      
The basic W3 server:  Internals

 This describes the generic hypertext daemon (server) program. The
 daemon is part of the WWW project. See also:
 
      User guide .
      
      Bugs and Features
      
      Other servers
      
 The hypertext daemon, like the ftp daemon, is a program which
 responds to an incoming tcp connection and provides a service to the
 caller.
 
  SOURCES
 A compilation option (SELECT) controls whether more than one
 connection can be handled at a time. This is a function of whether
 the TCP/IP implementation beneath the application has a working
 "select()" routine. If  it is not true, this implementation services
 one connection, then drops it before accepting another one. In
 neither case does the daemon concurrently serve two clients, nor does
 it fork off a process to do that.
 The basic server loop is in the file HTDaemon.c .  A separate module
 ( for example HTRetrieve.c ) contains the code to handle one request.
 Various specific versions of this may be written for different
 flavours of server. Also used are various modules of WWW common code.
 The httpd released from CERN uses almost the entire W3 library and
 can therefore access any object which a browser running on that
 machine can access, and return it as HTML or some other format.
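
 The shape of a loop which services one connection and drops it
 before accepting another can be sketched in Python. This is an
 illustration only and is not taken from HTDaemon.c: the names
 listen_on and serve_one_at_a_time, and the max_connections argument
 (there so that a demonstration can stop), are assumptions.

```python
import socket


def listen_on(port, host=""):
    """Create a TCP socket which is bound and listening on port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the port to be reused straight after a restart.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def serve_one_at_a_time(listener, handle_request, max_connections=None):
    """Accept connections on listener and service them strictly in turn.

    handle_request(connection) deals with one request, in the way a
    separate module does in the text. None means loop for ever.
    """
    served = 0
    while max_connections is None or served < max_connections:
        connection, _address = listener.accept()
        try:
            handle_request(connection)
        finally:
            connection.close()    # drop it before accepting another one
        served += 1
    return served
```

 A flavour of server is then made by supplying a different
 handle_request, for example
 serve_one_at_a_time(listen_on(8000), my_handler), where my_handler
 is whatever function answers one request.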
 
                                                                Tim BL
                                                                      
Bugs and Improvements needed

 Improvements to be made in the HTTP daemon program are as follows.
 (See also Features )
 
      Call shell scripts to perform searches on directory trees or
      documents.
      
      The HTRetrieve() routine ought to be able to pick up the user
      node and userid, etc...
      
      Ought to have chroot option. (wwwww July 93)
      
                                                                Tim BL
                                                                      
Daemon features: Update history

 History list for the WWW daemon . (See also bugs ).  Many other
 changes to the daemon are in fact changes to the common code library.
 
  2.06  7 JUNE 93
  
      Bug fix: Load error 500 returned as proper HTTP status, not as
      simple document.
      
      WAIS gateway now caches source files again.
      
      Bug fix: Daemon used to try to display graphics file locally on
      the server when the client couldn't display them!  Cause of much
      confusion  :-)
      
  2.05
  
      Big bug fix in local file directory handling .. didn't work in
      2.04!
      
  2.04  28 APRIL 93
  
      With the properly compiled libwww library, this daemon will
      operate as a WAIS, news etc gateway if so configured.
      
      WAIS gateway operation bug fix.
      
  2.03-BETA: UNRELEASED
  
      Bug fix: operation with no rule file didn't work as expected.
      



  2.02-BETA: 17 MARCH 93
  
      Misleading error trace removed.
      
      Compiled on HP, SGI, Sun, DEC, NeXT and binaries available
      
      Binary handling fixed in library.
      
      Reference to missing HTDirRead.h removed.
      
      Assumes that user can handle files of unknown format
      (application/binary).
      
  2.00-ALPHA  15 MAR 93
  
      Simple command line -- with no parameters, exports the /Public
      directory.
      
      Multiformat handling -- see library changes for 2.0.  Links to
      .multi filenames resolve to any file with same root, any
      recognised extension.
      
  UNRELEASED 0.9B
  
      Bug fix: If a PASS or FAIL line in the configuration file acted
      on a single document id (ie no wildcard) then it crashed the
      daemon. (HTRules.c, 17-Jun-92, TBL).
      
  SEPT 1991 V0.3
  
      Bug fix: Plain text files were returned to be parsed as SGML,
      causing them to come out as garbage. (Mike Sendall)
      
  AUGUST 1991 V 0.2
  
      -R option now suppresses default rule file.
      
      Rule file format changed completely. Now allows authorisation of
      specific paths only.
      
  JUNE 1991 VERSION 0.1
  
      -r and -R options for rules
      
      Default address is now for Inet daemon working. (29 June)
      
      -l option to log to a file.
      
      -a option for address other than default
      
 _________________________________________________________________
 
                                                                Tim BL


                                                                      
                       A SHELL SERVER FOR HTTP
                                   
 The HTTP protocol is very simple. The following is an example of a
 server program written in sh:
 
#! /bin/sh
read get docid
echo "<TITLE>$docid</TITLE>"
echo Here is the data

 The docid may have a trailing carriage return to be stripped off on
 some systems. You can modify that script to produce the data you
 actually want. The HTML syntax for marked-up text is fairly simple,
 but if you want just to send plain text, then just send the
 <PLAINTEXT> tag first:
 
#! /bin/sh
read get docid
sed -f txt2html.sed $docid

 or in csh
 
#! /bin/csh
# Read the request line and split it into words: GET docid
set request = ( `echo $<` )
if ($#request < 2) exit
sed -f txt2html.sed $request[2]
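
 For comparison, the first sh script above can be written in Python.
 This is a sketch under the same assumptions as the sh version (one
 request line of the form "GET docid"); the function name serve is
 made up for this example.

```python
def serve(request_stream, reply_stream):
    """Read one "GET docid" line and write a small hypertext reply."""
    fields = request_stream.readline().split()
    if len(fields) < 2:
        return                      # no document name was given
    docid = fields[1]               # split() has already dropped any CR
    reply_stream.write("<TITLE>%s</TITLE>\n" % docid)
    reply_stream.write("Here is the data\n")
```

 Run by the inet daemon, the script would call
 serve(sys.stdin, sys.stdout) after an "import sys", since the
 connection is presented as standard input and output.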


 When you have written your script, set the execute bit and then
 configure the inet daemon to run it . A few more examples:
 
      A sh script to generate a menu for files in a directory
      
      An awk script to generate menu from a list of files .
      
      A perl script for all kinds of stuff on the ASIS server
      
      The shell script of the Hytelnet gateway
      
 If you know the perl language, then that is a powerful (if otherwise
 incomprehensible) language with which to hack together a server.
 See also a case study of mapping a database onto the web .
 All contributions to these examples welcome!
 
                                                                Tim BL
                                                                      
Making a server

   Here is a run-through of what is needed to make a www server , with
   examples from a suggested server for the HEPDATA base of Mike
   Whalley . See also etiquette .



   
   Basically, to make the data available, you make a server which is a
   modified version of your program. When a user follows a link to
   HEPDATA (or runs a command to jump straight there), the client
   program opens a connection to a server program on a VM machine
   (say, but could be VMS or unix). The server in turn runs your
   program.
   
   Let me just describe the essence of the changes needed so that you
   can get an idea of how much effort would be involved.
   
   The first thing you do is to make up an arbitrary naming method for
   anything which HEPDATA can display.  In this I include the welcome
   page, any menu, any article, any help text.  Typically one invents
   a hierarchical naming scheme, like
   
        /HEPDATA                  The first "welcome" menu
        /HEPDATA/HELP             The top-level help
        /HEPDATA/HELP/REAC        The help on the reaction database
        /HEPDATA/REAC             The reaction database itself
        /HEPDATA/REAC?P+PBAR      List of reactions involving p and
                                  pbar (?)
        /HEPDATA/DATA/RD125V687   Some article (say)

   You do this because, whereas an interactive user follows a path
   through the program, the W3 user calls the program once for each
   thing. There is no "state" information. This allows one to make a
   hypertext link to any part of the scheme and jump back in again
   later. For example, one might want to quote an article, or the
   reaction database, or a particular list of reactions.
   
   Now all you do is modify the program so that, given a name above,
   it will return the required document.  This means basically turning
   it from a sequence the user goes through into a set of conditionals
   to isolate each of the individual cases above. Apart from that, the
   data retrieval code is unchanged apart from the output formatting.
   Many of the options in fact mean mapping the name onto a fixed
   file's name; it's the searches which have to activate real code.
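
   The "set of conditionals" might look like the following Python
   sketch. It is only an illustration of the idea: the function names
   are assumptions, and the strings returned stand in for the real
   HEPDATA retrieval code, which is not shown in this guide.

```python
def split_name(name):
    """Split "/HEPDATA/REAC?P+PBAR" into ("/HEPDATA/REAC", ["P", "PBAR"])."""
    path, _, query = name.partition("?")
    keywords = query.split("+") if query else []
    return path, keywords


def respond(name):
    """Return the document for one name. No state is kept between calls."""
    path, keywords = split_name(name)
    if path == "/HEPDATA":
        return "welcome menu"
    if path == "/HEPDATA/HELP":
        return "top-level help"
    if path == "/HEPDATA/HELP/REAC":
        return "help on the reaction database"
    if path == "/HEPDATA/REAC":
        if keywords:                 # a search: this activates real code
            return "reactions involving " + " and ".join(keywords)
        return "reaction database"
    if path.startswith("/HEPDATA/DATA/"):
        return "article " + path[len("/HEPDATA/DATA/"):]
    return 'Document name "%s" invalid.' % name
```

   Each call stands alone, so a link to /HEPDATA/REAC?P+PBAR works
   whether or not the reader has ever seen the welcome menu.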
   
   The hypertext trick is needed in the menus. Where an option
   is normally output to the screen, you have to tell the client what
   to ask for if the user selects that option. For example, in the
   main menu /HEPDATA you have an option which gives the help. You
   would represent this "anchor" as



   
<A NAME=4 HREF=/HEPDATA/HELP> Help </A>

   "Help" is all that is displayed, with some indication that it is an
   option. If the user chooses it (clicks a mouse on it, or chooses it
   by number, depending on which client he has) then the client asks
   the server for /HEPDATA/HELP. ("A" is for "anchor", "HREF" is for
   "hypertext reference".)
   
   For the index searches, it's just as simple. When the server sends
   the text called /HEPDATA/REAC it also sends a special tag
   (<ISINDEX>). This tells the client to enable a FIND command, or
   find panel etc (depending on the client). You don't have to do any
   human interface work. The client automatically comes back with a
   search coded up in the form /HEPDATA/REAC?P+PBAR etc. Your server
   in turn returns a menu (say) with pointers to the data which has
   been found.
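
   Producing such a menu is mostly a matter of printing anchors. A
   minimal Python sketch follows; the function names and the sample
   entry in the usage note are assumptions, and the anchor layout
   simply follows the "Help" example above.

```python
from html import escape


def anchor(number, href, text):
    """Return one anchor in the style of the "Help" example."""
    return "<A NAME=%d HREF=%s>%s</A>" % (number, href, escape(text))


def menu(title, entries):
    """Build a menu document from (href, text) pairs, one anchor per line."""
    lines = ["<TITLE>%s</TITLE>" % escape(title)]
    for number, (href, text) in enumerate(entries, start=1):
        lines.append(anchor(number, href, text))
    return "\n".join(lines) + "\n"
```

   For example, menu("Reactions found", [("/HEPDATA/DATA/RD125V687",
   "Some article")]) gives a title line followed by one numbered
   anchor for each thing the search has found.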
   
   You can also put some formatting tags (like headings) which will
   make the data look really nice on a window system.
   
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
                           W3 AND HTML TOOLS
                                   
 These tools aid management of W3 servers, generation of hypertext,
 etc.
 
  W3 basic daemon         Part of the W3 project code.
                         
  Index search server    which is a slight modification to basic CERN
                         daemon, with a couple of scripts and WAIS
                         programs. Implements searches on entire
                         directory trees of WWW documents using WAIS
                         inverted indexing.
                         
  Gateway servers        which you can take and adapt.
                         
  FrameMaker interface    There are some tar files on the anonymous
                         FTP archive on file://info.cern.ch/www/src
                         which allow FrameMaker to be used as a W3
                         tool. Dan Connolly, Convex. Includes MIF HTML
                         translation.
                         
  Making HTML into TeX    We did this with the "WWW Book" to print it.
                         See the Makefile for example, and the scripts
                         html2latex.sed and sub1.sed . We wrote a
                         special introduction, but otherwise all the
                         text was hypertext from the W3 project.
                         
  Generating HTML         These are scripts for generating SGML
                         hypertext from things like directory
                         listings, etc. Also, for checking and
                         correcting dubious HTML.
                         
  WP5.1 to HTML          WordPerfect 5.1 to HTML conversion
                         
  LaTeX to HTML           Code from Nikos Drakos, Computer Based
                         Learning Unit, University of Leeds.
                         
  Server log analysis     Analysing server logs requires first of all
                         changing the numeric internet node numbers
                         into domain names. httpd-analyse.c is a
                         program to do that. Feed the results through
                         awk and grep of your choice!
                         
  Server log analysis     Getsites.c is a program which generates
                         reports on a weekly or monthly basis.
                         
  Web-roaming  robot etc
                          Guido van Rossum's knobot code in "Python"
                         language.
                         
  Telnet server           Setting up a service machine for anonymous
                         users to log in to a www client.
                         
  Mail Robot              A program to return any information in the
                         web by electronic mail.
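
 The first step of the server log analysis mentioned above, changing
 numeric internet addresses into domain names, can be sketched in
 Python. This is not httpd-analyse.c: the function names are made up,
 and it is an assumption that each log line begins with the numeric
 address followed by a space.

```python
import socket


def name_for(address, lookup=socket.gethostbyaddr):
    """Return the domain name for address, or address if none is found."""
    try:
        return lookup(address)[0]
    except OSError:
        return address


def translate_log(lines, lookup=socket.gethostbyaddr):
    """Yield each log line with its leading address replaced by a name."""
    cache = {}                      # look each address up only once
    for line in lines:
        first, space, rest = line.partition(" ")
        if first not in cache:
            cache[first] = name_for(first, lookup)
        yield cache[first] + space + rest
```

 The results can then be fed through awk and grep as before. The
 lookup argument is there so that the name server can be replaced by
 a table when testing.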
                         
                                                                Tim BL
                                                                      
HTML Generation

 Here are some example files you can use for generating HTML from
 lists of files and other things.
 
  RTF to HTML            Convert RTF (using specific styles) into
                         HTML.
                         
  fix-html.pl            written by Dan Connolly, is a perl script to
                         legitimize old HTML files into SGML-abiding
                         HTML (as per the DTD that Dan created).
                         
  text2html.sed           A sed script to turn plain text into
                         plain-looking valid HTML markup so that it
                         will be rendered just as it was.
                         
  ls2html.awk            is an awk script which will just take a list
                         of names and generate a menu.
                         
  dir2html               is a shell script which generates a menu of
                         pointers to files with particular suffixes in
                         a set of directories. It also includes a
                         README file at the head of the hypertext list
                         if one exists.
                         
  htn2html.c              See the Hytelnet gateway for the program to
                         convert hytelnet data into HTML.
                         
  findrefs.pl             Written by Ari Lemmke, finds references
                         http:... in plain text files and generates
                         anchors out of them.
                         
 You can make any variations on these you like of course. [CERN does
 not accept any responsibility for things quoted in these lists].
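
 As one such variation, here is a Python sketch of the idea behind
 findrefs.pl: finding http: references in plain text and making
 anchors out of them. It is written for this guide and is not a
 translation of that script; the pattern used to recognise a
 reference is an assumption, and it will take in trailing punctuation.

```python
import re
from html import escape

# Assumption: a reference runs from "http:" to the next space, quote
# or angle bracket.
REFERENCE = re.compile(r"http:[^\s<>\"]+")


def add_anchors(text):
    """Return text as HTML with each http: reference made into an anchor."""
    pieces = []
    position = 0
    for found in REFERENCE.finditer(text):
        pieces.append(escape(text[position:found.start()]))
        address = escape(found.group(0))
        pieces.append('<A HREF="%s">%s</A>' % (address, address))
        position = found.end()
    pieces.append(escape(text[position:]))
    return "".join(pieces)
```

 For example, add_anchors("see http://info.cern.ch/ now") returns the
 same sentence with the address wrapped in an anchor.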
 
Updating the Newsgroup lists

 To update some of the news pages automatically you must be logged on
 to the news server or have the news directories mounted.
  Carl mentioned that you must be a member of the UNIX group news
 (otherwise you won't have permission to read the news directories)
 but that doesn't seem to be necessary for these functions.
 
  UPDATEGROUPS
 This script updates the list of newsgroups. For the overview list ,
 it saves everything before the "Others" heading, and adds on a list
 of pointers to newsgroup stems not already mentioned in the saved
 hypertext.
 For each stem, it saves any comment before the glossary list of
 groups, and then regenerates that list of groups.
 
  NEWSPAGE_UPDATE (OLD)
 The script NewsPage_Update creates complete lists of active groups
 for the following groups: alt, bionet, bit, biz, cern, ch, comp,
 eunet, gnu, news, rec, sci, soc, talk, vmsnet. It does this by
 writing the header in explicitly for each group, and then generating
 a list of subgroups using FindGroups.
 For comp and news, a full list is placed in fullcomp.html and
 fullnews.html. The files comp.html and news.html are formatted by
 hand already, and so are not touched by the script.
 NewsPage_Update works by writing some HTML text into a file for each
 group to be updated, called [newsgroup_name].html.new, then calling
 the script FindNewsGroups.  This checks the file
 /usr/local/lib/news/newsgroups for the groups within the current
 group which are active.  Finally the new file is renamed to remove
 the .new.
 The list of stems to search, and their titles and any other comment
 is hardcoded into the NewsPage_Update script, and the list is
 DUPLICATED in Others_Update.
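
 The "write a .new file, then rename it" pattern described above can be
 sketched in Python as follows.  This is not the NewsPage_Update
 script itself: the function names are hypothetical, and it is assumed
 that each line of the newsgroups file begins with the group name,
 followed by white space and an optional description.

```python
import os


def groups_for_stem(newsgroups_lines, stem):
    """Return the sorted names of the groups under `stem` (e.g. "comp").

    Assumes each line starts with the group name, followed by white
    space and an optional description.
    """
    prefix = stem + "."
    names = set()
    for line in newsgroups_lines:
        fields = line.split(None, 1)
        if fields and fields[0].startswith(prefix):
            names.add(fields[0])
    return sorted(names)


def write_group_page(directory, stem, header_html, group_names):
    """Write <stem>.html.new, then rename it to <stem>.html."""
    final_path = os.path.join(directory, stem + ".html")
    new_path = final_path + ".new"
    with open(new_path, "w") as out:
        out.write(header_html)
        out.write("<DL>\n")
        for name in group_names:
            out.write('<DT><A HREF="news:%s">%s</A>\n' % (name, name))
        out.write("</DL>\n")
    # Renaming only once the file is complete means that readers never
    # see a half-written page.
    os.replace(new_path, final_path)
    return final_path
```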
 
  OTHERS_UPDATE
 The Others_Update script finds stems which are not included in the
 Overview.html file, but which are active.  This list of which groups
 not to include is hardcoded into the script.  For each group, it
 calls GrpCreate.  This adds the name to OtherGroups/Overview.  It
 then runs FindNewsGroups for each group.
 
    NOTE
 Once the script has completed all the .new groups must be renamed
 manually to remove the .new extension.
 
  GRPCREATE
 This reads a newsgroup stem name from stdin.
 It then creates the top of a file for the list of groups with that
 stem. This will be called ${nn}.html.new, where ${nn} is the stem
 name. Unfortunately there is no way to get a description of the stem
 to include in this file. However, if the .html file already exists,
 it will use everything up to and excluding the first DL tag from the
 .html file for the .html.new file. Therefore, everything above the DL
 tag may be hand edited.
 GrpCreate adds a pointer from OtherGroups/Overview.html.new to the
 .html file.
 The .html file is renamed .html.old, and the .html.new becomes .html,
 with diffs being stored in a .diffs file under the date.
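
 The rule that everything above the first DL tag survives regeneration
 can be sketched as below.  This is a Python illustration under stated
 assumptions, not the GrpCreate script: the function names and the
 default header used when no .html file exists are hypothetical.

```python
import re

# The first DL tag, in either case, with or without attributes.
DL_TAG = re.compile(r"<DL\b", re.IGNORECASE)


def preserved_header(old_html):
    """Return everything up to, and excluding, the first DL tag."""
    match = DL_TAG.search(old_html)
    if match is None:
        return old_html     # no list yet: keep the whole file
    return old_html[:match.start()]


def regenerate(old_html, stem, list_html):
    """Combine the hand-edited header with a freshly generated list.

    `old_html` is the text of the existing .html file, or None if
    there is none; `list_html` is the new DL list.
    """
    if old_html is None:
        # Assumed default: no description of the stem is available.
        header = "<TITLE>%s</TITLE>\n<H1>%s</H1>\n" % (stem, stem)
    else:
        header = preserved_header(old_html)
    return header + list_html
```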
 
.\" Macros for HTML
.\" Jim Davis 6 Nov 92
.ps 12
.in 5
.de B
..
.de R
..
.de H1
.ti -5
.ps 18
\fB\\$1\fR
.ps 12
.br
..
.de H2
.ti -3
.ps 14
\fB\\$1\fR
.ps 12
.br
..
.de H3
\\$1
.br
..
.de H4
\\$1
..
.de H5
\\$1
..
.de H6
\\$1
..
.de H7
\\$1
..
.de H8
\\$1
..
.de H9
\\$1
..
.de DL
.in +5
..
.de DE
.in -5
..
.de DT
.ti -3
* \\$1
..
.de DD
.br
..
                                                                      

Date: Wed, 4 Nov 1992 16:48:34 -0500
From: Jim Davis <davis@dri.cornell.edu>
To: wei@xcf.berkeley.edu, www-talk@nxoc01.cern.ch
Subject: improved printing of WWW files

If you can't quite manage to live without hardcopy, you may wish
sometimes to print WWW files.  I have written a couple of scripts to
do this.  They are particularly useful with Pei Wei's excellent Viola
WWW browser.

A tar archive is available for anonymous FTP:

      dri.cornell.edu/pub/davis/print-www.tar

It contains:

README
print-www
print-www.l
html-to-latex
html2latex.sed (modified version of original CERN version)




The hardest part was writing the perl script to obtain documents via
http protocol - turns out you can't just run pipes through telnet.

The conversion from HTML to LaTeX is not really robust yet - this is
doubly hard since there is no guarantee that the HTML is legal.  But
at least it works for my test cases.  No doubt it will be improved in
time.

best wishes
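
 For comparison, obtaining a document over a direct connection is
 short in a language with socket support.  The sketch below is in
 Python rather than the perl of print-www, and is not taken from that
 package: it sends the simple one-line GET request and reads until
 the server closes the connection.  The function names are
 hypothetical, and servers which accept only later versions of the
 protocol may reject this form of request.

```python
import socket


def build_request(path):
    """Return the bytes of a simple request: GET, the path, then CRLF."""
    return ("GET %s\r\n" % path).encode("ascii")


def fetch(host, path, port=80, timeout=30):
    """Send a simple GET and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:        # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

 A call such as fetch(host, "/some/document.html") would then return
 the raw bytes of the reply, ready to be passed to a converter.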
                                                                      
                           GATEWAY SOFTWARE
                                   
 See also: W3 server software, W3 client software
 These are servers which provide data extracted from other systems.
 They are built using code from the basic daemon, or scripts.
 
  FIND gateway           for CERN/VM XFIND which calls a REXX exec to
                         get the information from the XFIND system
                         running on the CERNVM mainframe.
                         
  Hytelnet gateway        A gateway to Peter Scott's list of telnet
                         sites
                         
  VMS Help gateway        This allows any VMS help files to be made
                         available to WWW clients. Runs on VAX/VMS.
                         
  WAISGate                A gateway to information available using the
                         W.A.I.S. protocol.
                         
  DCLServer               A server for VMS systems which allows you to
                         write a gateway to your own favorite
                         information system using DCL.
                         
  System33                A (big) csh script server providing data
                         including Xerox System33 documents, man pages
                         in plain text, phone numbers, etc. etc...!
                         
  Oracle                  A generic server to Oracle. Could be used as
                         a basis for gateways to specific Oracle
                         databases.
                         
  Geography               Gateway to the Geography server at U
                         Michigan
                         
  TechInfo                TechInfo is the CWIS from MIT.  A gateway
                         exists thanks to Linda Murphy/Upenn.
                         
                                                                Tim BL
                                                                      
Geography gateway

                                                      Wed, 18 Nov 1992
                                                                      


 Jim Davis

 Here is a quickly hacked up Gateway from WWW to the University of
 Michigan Geography server.  It expects one argument, a WWW doc id.
 It ignores the "pathname", extracts the search words, then passes
 those to the server.  It does NOT parse the data returned by the
 server (that is an improvement yet to be done) but you can understand
 the output.

 To use this, you would need to have an HTTP server running someplace
 where you can attach this gateway.  I can provide the very simple
 HTTP server I use here, but this subject is already documented in the
 WWW online documentation.

      Source code in perl
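
 The handling of the doc id described above (ignore the pathname, keep
 the search words) might look like this in Python.  This is a sketch,
 not the perl source referred to: it assumes that the search words
 follow a "?" and are separated by "+", and the function name is
 hypothetical.

```python
from urllib.parse import unquote


def search_words(doc_id):
    """Return the search words from a doc id, ignoring the pathname."""
    _path, separator, query = doc_id.partition("?")
    if not separator:
        return []       # no "?" means that no search was requested
    # Undo any %XX escapes in each word; skip empty words.
    return [unquote(word) for word in query.split("+") if word]
```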
                                                                      
The WWW TechInfo gateway

 This is a gateway built using the basic server code, plus one source
 file in C. Thanks to Linda Murphy of the University of Pennsylvania
 for the techinfo code.
 
      The gateway data as running at CERN
      
      The source file
      
                                                                Tim BL
                                                                      
The W.A.I.S. - WWW gateway

 This is an example of a WWW server and a WAIS client. It is just the
 regular httpd daemon linked with:
 
      a version of the libwww library which was compiled with the
      DIRECT_WAIS option, and includes the HTWAIS module;
      
      the freeWAIS libraries from CNIDR.
      
 See a summary of some data available through the gateway.
 
  WSRC FILES
 The gateway keeps a cache of WAIS "source" files. These are files
 describing WAIS servers. They are normally picked up automatically by
 searching a "directory of servers" index. Once the gateway has picked
 up a description of a server, it uses the description to describe
 the server to those who follow links to it. (See the HTWSRC module of
 libwww)
 These source files are parsed, and are kept in the directory
 /usr/local/lib/WAIS under the server name, port, and database name.
 
                                                                Tim BL
                                                                      
VMS Help server

 This server can provide WWW users with any information stored in VMS
 Help format.
 
    Additional information available:       :->
    
  Try me !               An example server running at CERN
                         
  Status                 The current state, pointers to more
                         information
                         
                                                                   JFG
                                                                      
  GATEWAY TO VMS HELP: INTERNALS
  
   These are technical and installation notes about the gateway to VMS
   Help. Please send bug reports and suggestions to Jean-Francois
   Groff (jfg@cernvax.cern.ch).
   
    Sources
    
   The program consists of the generic daemon HTDaemon.c, and a
   special function, stored in VMSHelpGate.c, to retrieve VMS Help
   data and convert it to HTML.
   
    Installation
    
   The files you need are as follows. You should customise them,
   putting in your own directory names:
   
  launchgate.com         Runs the server as a detached process. Put a
                         call to this from your sys$startup procedure,
                         wherever that is. This detaches a job to use
                         www_server.com as input, and a log file as
                         output.
                         
  www_server.com         The server command file, a wrapper for the
                         actual server executable.  In this file, set
                         the temporary directory for the storage of a
                         cache of .HLP files. This file runs the
                         executable.
                         
  test.com               Here is just an example of a file to build
                         and test the server.
                         
  descrip.mms            This is an MMS file to build the executable.
                         If you don't have MMS, you may be able to
                         figure out from looking at it which commands
                         you should use.  You can find a machine
                         running MMS and generate the equivalent .com
                         files. See comments at the top of this file
                         on how to run it.
                         
   The source files and executable .EXE are currently (October 92)
   available on HEP decnet in vxcrna::disk$d1:[jfg.www...].  Note
   also you can pick up the master sources from dxcern:: automatically
   by running
   
   MMS /MACRO=(U=DXCERN::).
   
   If you are not in HEP decnet, you should find the sources in the
   WWWDaemon_v.vv.tar.Z file in the distribution. See the README file.
   
   _________________________________________________________________
   
                                                                   JFG
                                                                      
  VMS HELP SERVER BUGS
  
   This is a list of known bugs and desired improvements. Don't let it
   shrink too fast: send your bug reports and suggestions to
   Jean-Francois Groff (jfg@cernvax.cern.ch).
                                                                      
      The keyword search works fine on any number of levels down, but
      then the generic daemon doesn't know how deep the server went,
      so anchor names lack the intermediate levels. Solution :
      generate anchor names relative to the input path (before '?').
      
      DANGER : Attempts to access VMS topics with a weird name like
      ":=" will crash the server because VMS will try to create a .HLP
      file with an invalid file specification due to these special
      characters. Solution : Make a good escaping system (that works
      with VMS and Un*x styles as well). Crude and bulletproof
      solution : Ignore any offending topic name !
      
      Reference to another help library through @ will only search
      SYS$HELP for the corresponding .HLB file.
      
      We need an overview page that lists all help libraries
      available.
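
   One possible shape for the escaping system called for in the DANGER
   item above is sketched here in Python.  It is only an illustration,
   not part of the gateway: the set of characters treated as safe, the
   "_XX" escape form and both function names are assumptions.  Note
   also that VMS file names do not distinguish upper and lower case,
   which this sketch does not address.

```python
import re

# Assumed safe set: letters, digits, "$" and "-".  The underscore is
# reserved here as the escape character, so it is itself escaped.
UNSAFE = re.compile(r"[^A-Za-z0-9$-]")


def escape_topic(topic):
    """Replace each unsafe character by "_" and two hex digits."""
    topic.encode("ascii")   # raises UnicodeEncodeError if not ASCII
    return UNSAFE.sub(lambda m: "_%02X" % ord(m.group(0)), topic)


def unescape_topic(name):
    """Invert escape_topic."""
    return re.sub(r"_([0-9A-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), name)
```

   With this scheme the topic ":=" would be stored under the name
   "_3A_3D", which contains no special characters.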
      
        __________________________________________________________ JFG
                                                                      
  VMS HELP SERVER FEATURES
  
   This lists the main features of the VMS Help gateway, with
   improvements in reverse chronological order. Help make it grow fast
   : send your bug reports and suggestions to Jean-Francois Groff
   (jfg@cernvax.cern.ch).
   
    Experimental gateway 0.4 -- 2 Oct 91
    
      Accepts user queries by number or by name. In the latter case,
      can go down several levels, for instance, from the main help
      page : "cc /lib" will go to topic CC, subtopic /LIBRARY.
      



      On invocation with only //node:port/HELP, displays the contents
      of the standard VMS Help library SYS$HELP:HELPLIB.HLB (function
      lis_to_html).
      
      Address format : //node:port/HELP/[@library/][topic[/subtopic]*]
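
      To make the address format concrete, here is a small Python
      sketch which splits such an address into its parts.  It is not
      the gateway's own parser: the function name is hypothetical,
      matching HELP without regard to case is an assumption, and how
      a qualifier such as /LIBRARY is written inside the path is not
      stated here, so no special handling is attempted.

```python
def parse_help_address(address):
    """Split //node:port/HELP/[@library/][topic[/subtopic]*] into parts.

    Returns (node, port, library, topics).  `port` and `library` are
    None when absent; `topics` is a possibly empty list.
    """
    if not address.startswith("//"):
        raise ValueError("address must start with //")
    host, _, path = address[2:].partition("/")
    node, _, port = host.partition(":")
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0].upper() != "HELP":
        raise ValueError("path must start with HELP")
    parts = parts[1:]
    library = None
    if parts and parts[0].startswith("@"):
        library = parts[0][1:]      # drop the leading "@"
        parts = parts[1:]
    return node, (int(port) if port else None), library, parts
```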
      
   __________________________________________________________
   
                                                                   JFG
                                                                      
                             STYLE GUIDE
                                   
 This guide is designed to help you create a hypertext database which
 effectively communicates your knowledge to the reader.  It has been
 prepared in the light of comments by readers, and many demands by
 providers of online documentation.   Some of the points made may be
 influenced by personal preference, and some may be common sense, but
 a collection of points has been demanded, and so here it is.
 The guide is designed to be read sequentially, but feel free to
 depart from this.  The sections are as follows:
 
      Introduction
      
      Overall structure of your work
      
      Within each document
      
      Test your document
      
      Background reading
      
      Reader comments
      
This document is open to comment

 Suggestions are strongly invited: if you think of anything, mail it to
 timbl@info.cern.ch, mentioning the Style Guide for Online Hypertext
 or its URL.
 
                                                                Tim BL
                                                                      
Introduction

 You are going to write (or generate) some online hypertext. Because
 hypertext is potentially unconstrained, you are a little daunted. Do
 not be. You can write a document as simply as you like.  In many
 ways, the simpler the better.
 You will be writing a number of separate files.  These files will be
 linked to each other, and to external documents, to make your final
 work.
 You may think of your work as a "document", and if it were on paper,
 then you would call it that.  In the online case though, we tend to
 refer to each individual file as a document. A document may
 correspond, in the book analogy, to a section or a subsection, or
 even a footnote. In this guide, we'll refer to the whole collection
 as a work.
 The document is the unit by which information is picked up.  At any
 one time, a document is completely loaded into the reader's computer.
 It is also normally the amount you edit at any one time, though with
 a good editor you will probably have a number of documents open at a
 time.
 The section on structure discusses how you organise your material
 into documents.  Another section discusses how to organise your
 material within a document.
 (Up to overview, on to structure)
 
                                                                Tim BL
                                                                      
Structure

 If you have in mind a body of information to put across to your
 reader, you probably have a mental organisation for it.  Normally
 this is a sort of hierarchical tree, like the chapters of a book if
 you were to write a book.
 Keep this structure.  It helps readers to have a tree structure as a
 basis for the book: it gives them a feeling of knowing where they
 are.  You can also use this structure for organising your files in
 directories.
 You should also bear in mind:
 
      The reader's preconceived structure
      
      The idea of overlapping trees
      
      How big to make each document
      
 (Up to overview, back to Introduction, on to: writing each document)
 
                                                                Tim BL
                                                                      
  THE READER'S STRUCTURE
 Remember always the audience for whom you are writing.  If they are
 novices in the subject, it will normally help if you are firm about
 the structure of your work, so that they can learn the structure of
 the knowledge itself.  For example, if you feel that the subject
 falls into three distinct areas, then that is an important thing to
 teach.
 If, however, your readers will already have some knowledge in the
 subject, then they will already have formed their own structure for
 it.  In this case they will consciously or subconsciously know where
 they expect to find things. If your structure is different from
 theirs, enforcing it too strongly will confuse them and put them
 off.
 You may in this case have to resist a strong tendency to put across
 your own structure strongly and to the detriment of all others.
 There are two solutions.
 If you have a single well-defined audience in mind, who will share a
 similar world view, then try to write exactly for that world view
 rather than yours.
 If you are simultaneously writing for more than one group, then you
 must provide for both.
 When you make a reference, qualify it with a clue to allow some
 people to skip it. For example, "If you really want to know how it
 works inside, see the Internals guide", or "A step-by-step
 introduction is in the tutorial".
 Provide links for both readers' views. Your work will be more
 connected than a simple tree, but with proper qualification, no one
 should get lost.
 Provide two separate tree "roots". For example, you can write a
 step-by-step tutorial and a functionally direct reference tree for
 the same data. Both will at the lowest level have the same data, but
 while the first will deal with the simple things first, the second
 may be functionally grouped.  This is just like having several
 indexes to a book.  The tutorial might also include information which
 the reference work does not.
 
 (Up to overview, back to Introduction, on to: writing each document)
 
                                                                Tim BL
                                                                      
  OVERLAPPING TREES
 Here is an example of a work (describing some programming functions,
 say) with two separate structures:
 
                Tutorial                          Reference
                   |                                  |
         Let's do it together                -------------------
       from simple to difficult              |                 |
                   |                   by Functional     Alphabetical
                   |                       group            by name
        Task oriented examples               |                 |
                   |                         -------------------
                   |                                  |
          Examples of use of                Syntax definition for
          specific functions   <-------->    specific functions

 The novice user starts at the top left, and works his way down. Where
 he needs specific details, he will get down to the examples and from
 them a link to the underlying definitive descriptions of each. As far
 as he is concerned, he is reading a tree-structured work.  In fact,
 he is reading the same information as the expert who, coming in to
 check on one particular function, then looks up an example of its
 use.
 (Up to structure, back to user's structure, on to: document size)
 
                                                                Tim BL
                                                                      
  HOW BIG TO MAKE EACH DOCUMENT
 The most important point here is that a document should put across a
 well-defined concept.  It is not generally worth splitting one idea
 arbitrarily into two bits in order to make the bits smaller.  Nor is
 it a good idea to put together ideas which are really separate just
 to make a bigger document.
 A document can be as small as a footnote.
 There are two upper limits on a document's size.  One is that long
 documents will take longer to transfer, and so a reader will not be
 able to simply jump to it and back as fast as he or she can think.
 This depends a lot on the link speed of course.
 The other limit is the difficulty for a reader to scroll through
 large documents. Readers with character based terminals don't
 generally read more than a few screens.  They often only absorb what
 is on the first screen: if that is not interesting, they won't be
 bothered to scroll down.  Readers are also put off by being left at
 the top of a large document.
 Readers with graphic interfaces generally scroll through long
 documents with a scroll bar.   When the scroll bar is moved a small
 amount, the document should move a sufficiently small amount so that
 some of the original window-full is still left in the window.  This
 allows the reader to scan the document. If the document is any
 bigger, then it is basically unreadable, in that any movement of the
 scroll bar will lose the place and leave the reader disoriented.
 Advantages with longer documents are that it is easier for readers
 with scrollbars to read through in an uninterrupted flow, if that is
 how the document is written.
 Also,  one doesn't have to go to the trouble of making (or
 generating) so many links and keeping them up to date if things are
 altered.  If making the links is a problem, just settle for one link
 to a contents page.  Some browsers have "next" and "previous" buttons
 to allow a document to be browsed serially according to a list.
 (In fact, one can normally scroll up and down explicitly page by
 page, but this gives the same feeling as the terminal interface.)
 A rough guide, then, for the size of a document is:
 
      For online help, menus giving access to other things: small
      enough to fit on 24 lines.  Check this by using a terminal
      browser.
      
      For textual documents, of the order of half a letter-sized (A4)
      page to 5 pages.
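
 Where no terminal browser is to hand, the 24-line guideline can be
 checked roughly with a few lines of Python.  This is a sketch under
 assumptions: the input is taken to be the text as a terminal browser
 would display it (not the HTML source), an 80-column screen is
 assumed, and the function names are hypothetical.

```python
import textwrap


def screen_lines(text, columns=80):
    """Count the rows `text` occupies when wrapped at `columns`."""
    count = 0
    for line in text.splitlines():
        wrapped = textwrap.wrap(line, width=columns)
        count += max(1, len(wrapped))   # a blank line still uses a row
    return count


def fits_on_screen(text, rows=24, columns=80):
    """Return True if `text` fits on one screen without scrolling."""
    return screen_lines(text, columns) <= rows
```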
      



 (Up to structure, back to overlapping trees, on to: within each
 document)
 
                                                                Tim BL
                                                                      
Within each document

 This section of the style guide deals with the layout of text within
 a "document", the unit of retrieval of information on the web.
 To be completed.
 You should try to:
 
      Sign your work
      
      Give its status
      
      Make links into context
      
      Use context-free document titles
      
      Format device-independently
      
      Write for the printed work too
      
      Write readable text despite the links
      
 (up to overview, back to structure, on to testing)
 
                                                                Tim BL
                                                                      
  SIGN IT!
 An important aspect of information which helps keep it up to date is
 that one can trace its author.  Doing this with hypertext is easy --
 all you have to do is put a link to a page about the author (or
 simply to the author's phone book entry).
 Make a page for yourself with your mail address and phone number. At
 the bottom of files for which you are responsible, put a small note
 -- say just your initials -- and link it to that page. The address
 style (typically right justified) is useful for this.
 Your author page is also a convenient place to put any disclaimers,
 copyright notices, etc. which law or convention require. It saves
 cluttering up the messages themselves with a long signature.
 If you are using the NeXT hypertext editor, then you can put this
 link from your default blank page so that it turns up on the bottom
 of each new document.
 ( up , back to ..., on to  giving your document's status)
 
  THE STATUS OF YOUR DOCUMENT
 Some information is definitive, some is hastily put together and
 incomplete. Both are useful to readers, so do not be shy to put
 information up which is incomplete or out of date -- it may be the
 best there is. However, do remember to state what the status is. When
 was it last updated? Is it complete? What is its scope? For a phone
 book for example, what set of people are in it?
 Not every document needs a status declaration, if  there is something
 in the overview page of the work which covers it.
 You can of course also give a feel for the status of the text by its
 language ... bad spelling, missing capitals, and relaxed grammer all
 indicate informal notes.     Careful use of verbs such as "shall" and
 "should", and the introduction of Long Capitalised Noun Phrases
 (LCNPs) will give at least the impression of an ISO standard.  ;-)
 
    Date it
 In some cases it can be useful to put creation dates and last
 modified dates on your work.  (Note that this is the sort of thing
 which one could make a server do automatically with a little
 programming).
 Figure out whether putting one might later save the reader from
 following out of date information.
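
 The automatic dating mentioned above could be as simple as the
 following Python sketch, in which a server appends the file's
 modification date as it sends the document.  The function names and
 the wording of the date line are assumptions, not a feature of any
 particular server.

```python
import os
import time


def last_modified_line(path):
    """Return an HTML line giving the file's last-modified date (UTC)."""
    stamp = time.gmtime(os.path.getmtime(path))
    return "<P>Last modified: %s\n" % time.strftime("%Y-%m-%d", stamp)


def serve_with_date(path):
    """Return the document text with the date line appended."""
    with open(path) as source:
        return source.read() + last_modified_line(path)
```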
 (back to Sign It, On to links into context )
 
  LINKING TO CONTEXT
 A major difference between writing part of a serial text, and an
 online document, is that your readers may have jumped in from
 anywhere.  Even though you have only made links to it from one
 place, any other person may want to refer to that particular point,
 and so will make a link to that particular part of your work from
 their own. So you can't rely on your reader having followed your
 path through your work.
 Of course if you are writing a tutorial, it will be important to keep
 the flow from one document to the next in the order you intended for
 its primary audience.   You may not wish to cater specially for those
 who jump in out of the blue, but it is wise to leave them with enough
 clues so as not to be hopelessly lost. Some ways of doing this are:
 
      Watch that your text and vocabulary stand by themselves. Starting
      a document with "The next thing we consider is..." or "The only
      solution to this problem is..." will certainly confuse.
      
      Sometimes the opening words refer to the context, and can be
      linked to background information.   For example, in the WWW
      project documentation, the first occurrence of the acronym WWW is
      often linked back to the central project document.
      
      The navigation hints at the top or bottom of the document can
      give explicit pointers.  Examples are at the bottom of this
      document.
      
 It can also be useful to imagine as you are writing that you
 yourself may wish to reuse the document some day.
 (Part of style guide for online hypertext. Up to Writing each
 document, on to Title tag)
 
                                                                Tim BL
                                                                      



  DEVICE INDEPENDENCE
 The hypertext you write is stored in HTML language, which does not
 contain information about the fonts and paragraph shapes and spacing
 which should be used for displaying the document.
 This gives great advantages in that your document will be rendered
 successfully on whatever platform it is viewed, including a plain
 text terminal.
 You should be aware that different clients do use different spacing
 and fonts.   You should be careful to use the structuring elements
 such as headers and lists in the way in which they were intended.  If
 you don't like the rendering on your particular client, don't try to
 fix it by using inappropriate elements, or trying for example to
 force extra spacing with empty elements.  This may well end up being
 interpreted differently by other clients and looking very strange.
 You can in many cases configure how the client displays each element.
 For example:
 
      Always use heading levels in order, with one heading level 1 at
      the top of the document, and if necessary several level 2
      headings, and then if necessary several level 3 headings under
      each level 2 heading.  If you don't like the way heading level 2
      is formatted, fix it on your client, don't just skip to heading
      level 3.
      
      Don't put extra spaces or blank lines into your text to pad it
      out, except in preformatted (PRE) sections.
      
      Don't refer in your text to facets of particular browsers.
      Asking someone to "click here" won't make sense without a mouse,
      just as asking someone to "select a link by number" will betray
      the fact that you were using the line mode browser.  Just leave
      a link.  The instructions get boring as the user will normally
      know how to select a link.
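
 The heading rule above can be checked mechanically.  What follows is
 a minimal sketch in Python, not part of any W3 tool: the function
 names are hypothetical, and the simple pattern match assumes headings
 are written as plain H1 to H6 tags.

```python
import re

def heading_levels(html):
    """Return the heading levels (1-6) in document order."""
    return [int(n) for n in re.findall(r"<h([1-6])\b", html, flags=re.IGNORECASE)]

def heading_problems(html):
    """List human-readable problems with the heading structure."""
    levels = heading_levels(html)
    problems = []
    if not levels or levels[0] != 1:
        problems.append("document should start with a level 1 heading")
    if levels.count(1) > 1:
        problems.append("more than one level 1 heading")
    # A heading may go deeper by at most one level at a time.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(
                f"heading level {previous} is followed by level {current}")
    return problems
```

 An empty list means the headings are used in order; a document which
 skips from level 1 to level 3 is reported.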
      
 See also: testing your document .
 Following these guidelines you may find that the end result does not
 appear on your screen exactly as you would like, but your readers
 will probably be happier.
 (Part of the Style Guide for Online Hypertext .  Up to within each
 document , back to , on to printable hypertext)
 
                                                                Tim BL
                                                                      
  PRINTABLE HYPERTEXT
 In an ideal world, paper might not be necessary.  In a next to ideal
 world, one would have enough time to write a hypertext version of a
 document and also to completely reauthor a paper version.  In the real
 world, you will probably want to generate any printed documents and
 online documents from the same file.
 Suppose the HTML files will be the master, and you will generate the
 printable version from them, by translation into TeX, etc.
 If you might one day want to do this, try to avoid references in the
 text to online aspects.  "See the section on device independence" is
 better than "For more on device independence, click here."  In fact
 we are talking about a form of device independence.
 Unfortunately the recommended practices of signing each document and
 giving navigational links  tend to mess up the printable copy, though
 one can of course develop ways of stripping them out if they follow a
 common format.
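
 As an illustration of such stripping, here is a minimal sketch in
 Python.  It assumes a common format which is only a guess for the
 sake of the example: the signature is a line containing just "Tim BL",
 and the navigational links are a parenthesised block opening with
 "(Part of", "(Up to" or "(Back to".  The function name is hypothetical.

```python
import re

# Assumed conventions (hypothetical, not a fixed standard).
SIGNATURE = re.compile(r"^\s*Tim BL\s*$")
NAV_START = re.compile(r"^\s*\((part of|up to|back to)\b", re.IGNORECASE)

def strip_for_print(text):
    """Drop signature lines and navigation blocks, keep everything else."""
    kept = []
    in_nav = False
    for line in text.splitlines():
        if in_nav:
            # A navigation block ends on the line with the closing bracket.
            in_nav = ")" not in line
            continue
        if SIGNATURE.match(line):
            continue
        if NAV_START.match(line):
            in_nav = ")" not in line
            continue
        kept.append(line)
    return "\n".join(kept)
```

 Anything which does not follow the assumed format is left alone, so
 the result should still be proof-read before printing.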
 (Up to:  within each document;  back to device independence, on to
 ...)
 
                                                                Tim BL
                                                                      
Test your document

 In a way your hypertext is like a book, which you should have
 proof-read. In a way, it is like a program which you should have
 tested.  At least get someone from the group for which you wrote the
 document to read it and give you some feedback.  Other ideas are:
 
      Read the document with several different client programs, to
      ensure that you have formatted it in a device independent way.
      
      Monitor the readership of your document. You can do this by
      analysing the server log files.  You may find that some parts
      are not being read, perhaps because people are looking in the
      wrong place for them.  You may see that people often follow a
      path and backtrack. If you can guess what they were looking for,
      you can make the clues around the link more helpful.  (Remember
      to keep log information confidential until you have removed user
      information from it.)
      
      Make it clear whether you will accept criticism or suggestions
      from your readers, and how they should send them.
      
      Ask people to solve problems using the document, and report on
      their success. If they fail, find out what they were looking
      for, and whether it was in the document at all.
      
  HOW MUCH TESTING?
 Testing takes time.  The decision of how much testing you do is
 based on the quality of the document you wish to provide.  You are
 balancing your readers' time and effort against yours.  If your
 document is "selling" an idea, or if you are selling the document or
 providing a service, you will want to make it as easy as possible
 for the reader.  If many people will read your work, a little of
 your time will save a lot of theirs.
 If however you are documenting some obscure part of a system in which
 no one other than yourself is likely to be interested,  or if you
 feel that your readers are lucky to have anything available at all,
 there is no point wasting time testing it.  In the event of someone
 needing the information, they might have to go to some extra trouble
 to follow several links to find what they want, and then to
 understand what you have written.  This may be the most efficient way
 of working.  I emphasize this because there is a great deal of
 information which is for a fleeting moment in people's minds, or is
 hastily scribbled down in some file, and which may be important to
 posterity.  It is better for this information to be available even in
 unpolished form than for it to be hidden out of embarrassment for its
 form.
 Before electronic technology, the effort of publishing was such that
 this information was never seen, and it was a waste, and considered
 an insult to one's readers, to publish something which was not of
 high quality.  Nowadays, there is "publishing" at all levels, and
 both high quality and hasty documents have their value.  It is
 important, though, to make it clear what the quality of a document is
 when making a reference to it, to avoid disappointment.
 Monitoring the server log files will tell you which documents are
 really being read.  You can use your time most efficiently to improve
 the quality of those.  Of course, analysing the server log files also
 takes time!
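
 A rough count of which documents are being read need not take long.
 The sketch below, in Python, assumes a hypothetical log format of
 "host [date] GET /path" on each line; real server logs differ, so the
 parsing will need adapting.  The requesting host is thrown away, in
 keeping with the advice to remove user information.

```python
from collections import Counter

def count_documents(log_lines):
    """Count requests per document, discarding the requesting host.

    Assumes (hypothetically) that each log line has the form
    'host [date] GET /path'; lines without a GET are skipped.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if "GET" not in fields:
            continue
        position = fields.index("GET")
        if position + 1 < len(fields):
            counts[fields[position + 1]] += 1
    return counts
```

 Calling counts.most_common() on the result lists the documents in
 order of popularity, which is where testing effort pays off first.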
 (Part of the Style Guide for Online Hypertext . Back to Within each
 document, On to Background reading)
 
                                                                Tim BL
                                                                      
Within each document

 This section of the style guide deals with the layout of text within
 a "document", the unit of retrieval of information on the web.
 To be completed.
 You should try to:
 
      Sign your work
      
      Give its status
      
      Make links into context .
      
      Use context-free document titles
      
      Format device-independently
      
      Write for the printed work too
      
      Write readable text despite the links
      
 (up to overview , back to structure , on to testing )
 
                                                                Tim BL
                                                                      
Background reading

 Some other documents which may be of relevance, if you are reading
 the Style Guide for Online Hypertext :
 
      The HTML Specification and references from it
      
      A Beginner's Guide to writing HTML
      
      World-Wide Web server software - a list of pointers
      
      Web Etiquette -- for Server Administrators
      
 (Back to testing, on to ...)
 
                              MAIL ROBOT
                                   
 The mail robot is a program which will accept incoming mail and allow
 remote users to:
 
      Subscribe to mailing lists (and unsubscribe)
      
      Retrieve information given a W3 address (URL)
      
 Originally from UC Berkeley, an enhanced robot is distributed as part
 of the world-wide web global information initiative.  Further
 information available is:
 
  Help                    The help file for users of the robot service
                         
  Installation            Installation instructions for unix system
                         managers
                         
  Bugs                    Lists of improvements requested or needed.
                         
  Change history          A list of features introduced and bugs
                         fixed.
                         
  See also               Other WWW software
                         
Using the W3 mailing robot

 This robot maintains the W3 mailing lists, and allows W3 documents to
 be retrieved on request.
 You can subscribe or unsubscribe to any of the various WWW mailing
 lists by sending email to the robot "listserv@info.cern.ch" -- see
 the commands listed below.
 If you have any problems, requests or questions for a human being,
 mail "www-request@info.cern.ch". Lists are:
 
  www-announce            Anyone interested in WWW, who would like
                         information about new releases or new online
                         data available. Please refrain from posting
                         administrivia to this large list!
                         
  www-talk                Developers of WWW code, or those interested
                         in discussions of technical details
                         
 You can also find information on WWW (as well as many other things!)
 by telnetting to info.cern.ch (no username, no password).
 If you want to pick up the WWW software, then use anonymous FTP to
 info.cern.ch and look in directory /pub/www. Subdirectories are src
 for the latest source packages, bin for executables for various
 machines, doc for "paper copies" of articles on WWW in PostScript and
 ASCII form. To read the latest documentation, use WWW!
 
  COMMANDS
 The commands understood by the listserv program are:
 
  HELP                    lists this file.  This is also sent whenever
                         a message to listserv is received from which
                         no valid command could be parsed.
                         
  HELP groupname          lists a brief description of the group
                         requested.
                         
  ADD listname            Add yourself to the list
                         
  DELETE listname         take yourself off the list
                         
  ADD address listname    Add yourself with a given mail address to
                         the given list. The address must not contain
                         spaces!
                         
  DELETE address listname
                         Remove the given address from the given list.
                         For all ADD/DELETE commands, mail is sent to
                         the address given to confirm the add or
                         delete operation.
                         
  SEND document-address   returns a document with the requested W3
                         address.
                         
  STOP                    Stop processing requests: ignore the rest of
                         the message. Needed if you send a signature
                         on the end of your message (or if some
                         gateway adds one). If in doubt, use it.
                         
 A command must be the first word on each line in the message.  Lines
 which do not start with a command word are ignored.  If no commands
 were found in the entire message, this help file will be returned to
 you. A single message may contain multiple commands; a separate
 response will be sent for each.
 
    Examples
    

        add www-announce

        add me@host.uni.edu www-announce

        delete me@host.uni.edu www-talk
        
        send http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html
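
 The rules for reading a message can be sketched in a few lines.  This
 is an illustration in Python, not the robot's own code, and the
 function name is hypothetical.  It assumes that commands are matched
 without regard to case (the examples are in lower case while the
 command list is in upper case) and that leading spaces on a line are
 ignored.

```python
COMMANDS = {"HELP", "ADD", "DELETE", "SEND", "STOP"}

def parse_commands(body):
    """Return a list of (command, arguments) pairs found in a message body."""
    found = []
    for line in body.splitlines():
        words = line.split()
        if not words:
            continue
        command = words[0].upper()
        if command not in COMMANDS:
            continue  # lines not starting with a command word are ignored
        if command == "STOP":
            break     # ignore the rest of the message, e.g. a signature
        found.append((command, words[1:]))
    return found
```

 An empty result corresponds to the case in which no valid command
 could be parsed, for which the help file is returned.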

  SUBSCRIPTION
 If you are not sending mail from your preferred mail address, then
 you can use the second form of the command to give your mail address.
 If you are not on the internet, please convert your address into arpa
 style. (For example, UK users please use international ordering:
 joe@host.ac.uk.) Just specify the mailbox, without any spaces.
 If you omit the 'address' the command will assume the mailbox that is
 in the From: line of the message.  Note that SUBSCRIBE is a synonym
 for ADD; UNSUBSCRIBE for DELETE.
 Please note that it IS possible to add or delete someone else's
 subscription to a mailing list.  This facility is provided so that
 subscribers may alter their own subscriptions from a new or different
 computer account. There is therefore some potential for abuse; we
 have chosen to limit this by mailing a confirmation notification of
 any addition or deletion to the address added or deleted including a
 copy of the message which requested the operation.  At least you can
 find out who's doing it to you.
 Note that although you would mail submissions to a mailing list by
 addressing mail to e.g., www-talk@info.cern.ch, in a subscription
 request you specify the name of the list simply (without the
 @hostname part) as in the first example above.
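
 The two forms of the ADD and DELETE commands can be told apart by
 counting their arguments.  A minimal sketch in Python follows; the
 function name is hypothetical, and rejecting a list name which
 contains "@" is a choice made for the example, not necessarily what
 the robot does.

```python
def resolve_subscription(arguments, from_address):
    """Return (address, listname) for an ADD or DELETE request.

    'arguments' are the words after the command; 'from_address' is the
    mailbox taken from the From: line of the message.
    """
    if len(arguments) == 1:
        # First form: the mailbox in the From: line is assumed.
        address, listname = from_address, arguments[0]
    elif len(arguments) == 2:
        # Second form: an explicit address, then the list name.
        address, listname = arguments
    else:
        raise ValueError("expected 'listname' or 'address listname'")
    if "@" in listname:
        raise ValueError("give the list name without the @hostname part")
    return address, listname
```

 Because the address may not contain spaces, splitting the command
 line on white space is enough to separate the arguments.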
 
  RETRIEVING DOCUMENTS
 The SEND command (or the WWW command which is equivalent) returns the
 document with the given W3 address, subject to certain restrictions.
 Hypertext documents are formatted to 72 character width.  Links are
 marked by numbers in brackets, and a list of document addresses by
 number is appended to the message. In this way, you can navigate
 through the web, albeit only at mail speed.
 If you don't know where to start, try asking for one of
 

 http://info.cern.ch./hypertext/DataSources/bySubject/Overview.html
 http://info.cern.ch./hypertext/DataSources/bySubject/Physics/HEP.html
 http://info.cern.ch./hypertext/WWW/TheProject.html

 for lists of further pointers.
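
 To show what numbered links and an appended address list look like,
 here is a minimal sketch in Python.  It is not the robot's code (the
 formatting is done by the line mode browser), the exact layout of the
 real output may differ, and the function name and input structure are
 hypothetical.

```python
import textwrap

def render_for_mail(pieces, width=72):
    """Render a document as plain text with numbered links.

    'pieces' is a list of items, each either a plain string or an
    (anchor_text, address) pair standing for a link.
    """
    words = []
    addresses = []
    for piece in pieces:
        if isinstance(piece, tuple):
            anchor, address = piece
            addresses.append(address)
            words.append(f"{anchor}[{len(addresses)}]")
        else:
            words.append(piece)
    # Wrap the running text; the address list is appended unwrapped.
    body = textwrap.fill(" ".join(words), width=width)
    references = [f"[{n}] {address}" for n, address in enumerate(addresses, 1)]
    return "\n".join([body, ""] + references)
```

 A reader follows link 1 by mailing back a SEND command with the
 address given for [1] in the appended list.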
 
  CAUTIONARY NOTE
 As the robot gives potential mail access to a *vast* amount of
 information, we must emphasise that the service should not be abused.
 Examples of appropriate use would be:
 
      Accessing any information about W3 itself;
      
      Accessing any CERN and/or physics-related or network development
      related information;
      
 Examples of INappropriate use would be:
 
      Attempting to retrieve binaries or .tar files or anything more
      than directory listings or short ASCII files from FTP archive
      sites;
      
      Reading internet newsgroups which your site doesn't take;
      
      Repeated automatic use;
      
 There is currently a 1000 line limit on any returned file. We don't
 want to overload other people's mail relays or our server. We reserve
 the right to withdraw the service at any time. We are currently
 monitoring all use of the server, so your reading will not initially
 enjoy privacy. End of cautionary note.
 Enjoy!
 
                           The W3 team at CERN  (www-bug@info.cern.ch)
                                                                      
Installation

 Here are the steps necessary to install the Mail Robot product on
 your unix system.
 
  CUSTOMISATION
 Set up the variables in listserv.h and CommonMakefile to suit your
 site.
 
  POSTMASTER              The address from which messages appear to
                         come. Why not listserv? Perhaps to prevent
                         mail loops.
                         
  SECUREWWW               The executable W3 line mode browser (v1.3 or
                         later, so as to have the -listrefs option).
                         This is a separate product. For security, www
                         should be writable only by root.
                         
  SERVERDIR               The directory in which you want to put your
                         mailing lists and help about them.
                         
  COMPILE THE PROGRAMS
 Everything compiled on AEM's MicroVax II running ULTRIX 3.0 and then
 on TBL's NeXT without any problem at all. Your results may vary.
 
  CREATE YOUR SERVDIR
 Create it wherever you specified in listserv.h. Install a HELP file,
 perhaps using the example-files/HELP in this directory as a template.
 
  SET UP AN ALIAS "LISTSERV"
 Make an alias in your /etc/aliases (or /etc/sendmail/aliases,
 whatever you have) that points to this program, for example:
 

                listserv:       "|/usr/local/mail/listserv"
                robot:          "|/usr/local/mail/listserv"


  FOR EACH MAILING LIST
 Create a name.info file giving a bit of information about that
 mailing list. See the *.info files in the example-files subdirectory.
 Create a name file in the same directory, consisting of the email
 addresses of the subscribers to the group, one to a line. If it is
 for a brand-new group, create an empty file. Remember that this file
 must be writable by the mail daemon. The name of the file is just the
 name of the group.
 Depending on how you have your mailing lists set up, you may need to
 add an alias to the /etc/aliases file for each of the mailing lists.
 For example:
 
        real-recipes: :include:/usr/local/mail/maillists/recipes

 So mail sent to real-recipes actually goes to each of the
 subscribers listed in /usr/local/mail/maillists/recipes.
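
 Since a list file is simply one address to a line, adding and
 deleting subscribers amounts to editing that file.  The sketch below
 is in Python and only illustrates the file format; the robot itself
 is written in C, and these function names are hypothetical.

```python
from pathlib import Path

def read_subscribers(list_file):
    """Return the addresses in a list file, one per line, skipping blanks."""
    text = Path(list_file).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]

def write_subscribers(list_file, subscribers):
    """Rewrite the list file with one address to a line."""
    Path(list_file).write_text("".join(a + "\n" for a in subscribers))

def add_subscriber(list_file, address):
    """Append 'address' unless it is already subscribed."""
    subscribers = read_subscribers(list_file)
    if address not in subscribers:
        subscribers.append(address)
        write_subscribers(list_file, subscribers)

def delete_subscriber(list_file, address):
    """Remove 'address' if present; other entries keep their order."""
    subscribers = read_subscribers(list_file)
    if address in subscribers:
        subscribers.remove(address)
        write_subscribers(list_file, subscribers)
```

 The sketch does no locking, so it is not safe if two messages are
 processed at the same moment.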
 
  INSTALL LISTSERV
 Install in the appropriate directory.  Edit the CommonMakefile and
 then
 
                make install

  RUN NEWALIASES
 This gets sendmail to read the changes in /etc/aliases.
 
                newaliases

  TRY IT OUT
 Send mail to listserv with body
 

                HELP

 for example.  You should get a plain text version of the help file.
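
 If you would rather script the test, a message can be built as in
 the following sketch, which uses Python's standard email package.
 The function name is hypothetical, and handing the message to your
 local mail system is left out, since that differs from site to site.

```python
from email.message import EmailMessage

def build_request(sender, robot_address, commands):
    """Build a mail message whose body holds one command per line."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = robot_address
    message.set_content("\n".join(commands) + "\n")
    return message

request = build_request("me@host.uni.edu", "listserv", ["HELP", "STOP"])
```

 The STOP line guards against a signature being appended to the
 message on its way to the robot.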
 
Mail Robot

 This is a "listserv" type program which maintains mailing lists, and
 allows W3 documents to be retrieved by electronic mail.
 
  Author:                 Various, modified by TBL.
                         
  Status:                 Source available  by anonymous FTP. (Oct 92)
                         
  Current version:        1.0
                         
  Platforms:              Unix only.
                         
  More information:       Overview, Bugs, change history.
                         
Bugs

 This is a list of bugs in or improvements desired in the Mail Robot.
 See also the list of bug fixes.
 
      The INDEX command ought to be implemented, but for some reason
      always returns an empty list.  Occasionally it seems to work.
      
Change History

 Changes to the Mail Robot, in reverse chronological order:
 
  OCTOBER 1992
 TBL added the possibility of information retrieval using WWW.
 Released as an unsupported W3 product to those who ask for it.
 
  1991
 TBL rewrote str.c (used to overwrite its arguments).
 
  AEM
 A. E. Mossberg (aem@mthvax.cs.miami.edu) made a couple of minor
 changes, to make it slightly less UCSD-specific. He also added a
 README, and example files in the subdirectory example-files.
 
  ORIGIN
 Note this is NOT the bitnet LISTSERV program. The term "mail robot"
 is used to attempt to prevent confusion between these two products,
 which have different functionality although they do basically the
 same sort of thing.
 This was the UCSD listserv program, which AEM retrieved from ucsd.edu
 by anonymous ftp, and TBL retrieved from ftp.eff.org.  As retrieved
 from file://ftp.eff.org/pub/listserv2.shar, it consisted of the
 following files:
 
                        README
                        Makefile
                        commands.c
                        listserv.h
                        main.c
                        str.c
                        subscribe.c