                          W3 SERVER SOFTWARE
                                   
   A W3 server, like the ftp daemon, is a program which responds to
   an incoming TCP connection and provides a service to the caller.
   There are many varieties of W3 server software to serve different
   forms of data.
   
Basic W3 servers

  CERN server             The basic W3 daemon program serves files
                         already in hypertext or plain text.  This
                         daemon is also used as a basis for many other
                         types of servers and gateways.  Platforms:
                         unix, VMS.
                         
  NCSA server             A server for files, written in C, public
                         domain.  Runs on top of a gopher-style
                         database just like "gopherd". Platforms:
                         unix.
                         
  GN                      A single server providing both HTTP and
                         Gopher access to the same data. In C, General
                         Public License. Designed to help servers
                         transition from gopher to WWW.  Platforms:
                         unix.
                         
  Perl server             From Marc VanHeyningen at Indiana
                         University. Written in perl. Platforms: unix.
                         
  Plexus                  Tony Sanders' server, originally based on
                         Marc VanHeyningen's but incorporating much
                         more, including an Archie gateway.
                         Platforms: unix.
                         
  MacHTTP                 Server for the Macintosh.
                         
  REXX for VM             A server consisting of a small C program
                         which passes control to a server written in
                         REXX.
                         
   Whatever server you are running, you will probably be interested
   in:
   
      Tools for information providers
      
      Style Guide for Online Hypertext
      
Writing a new server

   The basic daemon is often used as a basis for a more specific
   server for a given application.  A server which allows a world of
   data to be seen as part of the W3 universe is known as a gateway.
   (Most servers could therefore be regarded as gateways, but the term



T. Berners-Lee                                                       1

   implies some conversion or mapping between dissimilar worlds).
   For short tutorials with examples, see:
   
      Writing a server in C
      
      Writing a server as a script
      
   It is a good idea to pick the basic daemon or one of the servers in
   the list as a starting point when making a new server.
   
Other servers and Gateways

   These are servers which provide data extracted from other systems.
   They are built using code from the basic daemon, or scripts. See
   
      List of Gateways available.
      
                                                                Tim BL
                                                                      
About documents generated from hypertext

   Paper manuals generated from hypertext are made for convenience,
   for example for reading when one has no computer to turn to.  We
   have tried to make the hypertext into fairly conventional paper
   documents, but they may seem a little strange in some ways.
   
   All the links have been removed. Therefore, it is worth looking at
   the table of contents to see what there is in the manual.
   Something which is not explained in place may be explained in
   detail elsewhere.
   
   We have tried to keep related matter together, but you may
   sometimes have to check the table of contents to find it.
   
   Please remember that these are for the most part "living
   documents". That is, they are constantly changing to reflect
   current knowledge. If you see a statement such as "Product xxx does
   not support this feature", remember that it was the case when the
   document was generated, and may not be the same now.   So if in
   doubt, check the online version. Of course, the living document may
   be out of date too, in which case it is helpful to mail its author.
   
                                                                Tim BL
                                                                      
                        WWW SERVER USER GUIDE
                                   
   The basic WWW server allows files and directories in a file system
   to be served to the world as menu trees, multimedia, and/or
   hypertext.
   
   The http daemon, httpd, is a general server program which runs a
   W3 protocol, "HTTP".  This is a TCP/IP based protocol running by
   convention on port 80.
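
   As a rough illustration of what such a daemon does, the Python
   sketch below answers one request in the simplest form of the
   protocol, in which the client sends a line of the form
   "GET document-name" and the server replies with the document and
   closes the connection.  The function names, the in-memory table of
   documents and the error text are assumptions made for this example;
   none of them is taken from the httpd source.

```python
import socket


def respond(request_line, documents):
    """Return the bytes to send back for one request line.

    `documents` is a hypothetical in-memory table mapping document
    names (e.g. "/welcome.html") to their contents as bytes.
    """
    parts = request_line.decode("ascii", "replace").split()
    if len(parts) >= 2 and parts[0] == "GET" and parts[1] in documents:
        return documents[parts[1]]
    # Assumed error text; a real server would say more.
    return b"<TITLE>Error</TITLE>Document not found.\n"


def make_listener(port):
    """Create a TCP socket listening on the local machine only."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen(1)
    return listener


def serve_once(listener, documents):
    """Accept one caller, read its request line, reply, and hang up."""
    conn, _addr = listener.accept()
    with conn, conn.makefile("rb") as reader:
        conn.sendall(respond(reader.readline(), documents))
```

   To try it as a normal user, choose a port above 1024, for example
   serve_once(make_listener(8000), {"/welcome.html": b"Welcome\n"}).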
   
In this guide

  Distribution            How to get the code.
                         
  Compilation             The daemon is compiled in the same way as
                         the library and line mode browser -- see WWW
                         distributed code.
                         
  Installation            How to install a server under the unix
                         internet daemon
                         
  Options                 Command line options at run time
                         
  Rule File               The format of a rule file. By default,
                         /etc/httpd.conf
                         
  Etiquette               Conventions you should follow to make life
                         smoother
                         
  Debugging               If it doesn't seem to work
                         
  Known bugs              and improvements desired
                         
  Change History          change list of improvements made and bug
                         fixes.
                         
Related documents

  HTML specification      A description of the hypertext markup
                         language used for representing menus, etc
                         
  HTTP specification      A description of the protocol used by the
                         server.
                         
Status of basic WWW server

   A basic fast information server for files.
   
  Author                  TBL
                         
  Status:                 Version 2 available by anonymous FTP, with
                         no index search but file access, name mapping
                         and security filter, ability to act as
                         gateway for anything in the WWW library's
                         repertoire, including WAIS.
                         
  Plans:                  A version which will allow general unix
                         users to set up an index search daemon. As
                         index search tools are not generally
                         available, we may use the NeXT digital
                         Librarian or WAIS as a basis.
                         
  Platforms               Unix, VMS, VM/CMS (VM/XA).
                         
  Next Milestone:         Run shell scripts to implement virtual
                         documents and searches.
                         
  More information:       User guide, Bug list, Internals, Change
                         history.
                         
  Wider scope:            W3 servers, Other WWW software
                         
   Features include
   
      Installation under inetd or run stand-alone
      
      Can be run stand-alone by normal user
      
      Automatically generates hypertext view of directory tree
      
      Uses "README" files to document directory listings
      
      Handles multiple formats of same file, selects format
      appropriate for client capabilities
      
      Document name to filename mapping for longer-lived document
      names
      
      Can act as gateway for WAIS, news, etc if needed
      
      Provides access authorization
      
WorldWideWeb CERN-distributed code

   See the CERN copyright.  This is the README file which you get
   when you unwrap one of our tar files. These files contain
   information about hypertext, hypertext systems, and the
   WorldWideWeb project. If you have taken this with a .tar file, you
   will have only a subset of the files.
   
   THIS FILE IS A VERY ABRIDGED VERSION OF THE INFORMATION AVAILABLE
   ON THE WEB.   IF IN DOUBT, READ THE WEB DIRECTLY. If you have not
   got ANY browser installed yet, do this by telnet to info.cern.ch
   (no username or password).
   
   Files from info.cern.ch are also mirrored on ftp.ripe.net.
   
  ARCHIVE DIRECTORY STRUCTURE
  
   Under /pub/www, besides this README file, you'll find bin, src
   and doc directories.  The main archives are as follows:
   
  bin/xxx/bbbb            Executable binaries of program bbbb for
                         system xxx. Check what's there before you
                         bother compiling. (Note HP700/8800 series is
                         "snake")
                         
  bin/next/WorldWideWeb_v.vv.tar.Z
                         The Hypertext Browser/editor for the NeXT --
                         binary.
                         
  src/WWWLibrary_v.vv.tar.Z
                          The W3 Library. All source, and Makefiles
                         for selected systems.
                         
  src/WWWLineMode_v.vv.tar.Z
                         The Line mode browser - all source, and
                         Makefiles for selected systems. Requires the
                         Library.
                         
  src/WWWDaemon_v.vv.tar.Z
                         The HTTP daemon, and WWW-WAIS gateway
                         programs. Source.  Requires the Library.
                         
  src/WWWMailRobot_v.vv.tar.Z
                          The Mail Robot.
                         
  doc/WWWBook.tar.Z       A snapshot of our internal documentation -
                         we prefer you to access this on line -- see
                         warnings below.
                         
  BASIC WWW SOFTWARE INSTALLATION FROM SOURCE
  
   This applies to the line mode client and the server.  Below, $prod
   means LineMode or Daemon depending on which you are building.
   
    Generated Directory structure
    
   The tar files are all designed to be unwrapped in the same (this)
   directory. They create different parts of a common directory tree
   under that directory. There may be some duplication. They also
   generate a few files in this directory: README.*, Copyright.*, and
   some installation instructions (.txt).
   
   The directory structure is, for product $prod  and machine
   $WWW_MACH
   
  WWW/$prod/Implementation
                          Source files for a given product
                         
  WWW/$prod/Implementation/CommonMakefile
                         The machine-independent parts of the Makefile
                         for this product
                         
  WWW/$prod/$WWW_MACH/    Area for compiling for a given system
                         
  WWW/All/$WWW_MACH/Makefile.include
                         The machine-dependent parts of the makefile
                         for any product
                         
  WWW/All/Implementation/Makefile.product
                         A makefile which includes both parts above
                         and so can be used from any product, any
                         machine.
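
   The layout above can be summarised in a few lines of Python.  This
   is only a sketch for checking where things should end up after
   unwrapping the tar files; the function name and the dictionary keys
   are invented for the example, and the paths are simply those listed
   above with $prod and $WWW_MACH filled in.

```python
def source_layout(prod, mach):
    """Return the expected paths for product `prod` on machine `mach`.

    `prod` stands for $prod (LineMode or Daemon) and `mach` for
    $WWW_MACH (sun4, decstation, ...).  The key names are hypothetical.
    """
    return {
        "sources": f"WWW/{prod}/Implementation",
        "common_makefile": f"WWW/{prod}/Implementation/CommonMakefile",
        "build_area": f"WWW/{prod}/{mach}/",
        "machine_makefile": f"WWW/All/{mach}/Makefile.include",
        "product_makefile": "WWW/All/Implementation/Makefile.product",
    }
```

   For example, source_layout("Daemon", "sun4")["build_area"] gives
   the directory in which the daemon is compiled for a sun4.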
                         
    Compilation on already supported platforms
    
   You must get the WWWLibrary tar file as well as the products you
   want and unwrap them all from the same directory.
   
   You must define the environment variable WWW_MACH to be the
   architecture of your machine (sun4, decstation, rs6000, sgi, snake,
   etc.).
   
   In directory WWW, type BUILD.
   
    Compilation on new platforms
    
   If your machine is not on the list:
   
      Make up a new subdirectory of that name under WWW/$prod and
      WWW/All, copying the contents of a basically similar
      architecture's directory.
      
      Check the WWW/All/$WWW_MACH/Makefile.include for suitable
      directory and flag definitions.
      
      Check the file tcp.h for the system-specific include file
      coordinates, etc.
      
      Send any changes you have to make back to
      www-request@info.cern.ch for inclusion into future releases.
      
      Once you have this set up, type BUILD.
      
  NEXTSTEP BROWSER/EDITOR
  
   The browser for the NeXT consists of the files contained in the
   application directory WWW/Next/Implementation/WorldWideWeb.app and
   is compiled. When you install the app, you may want to configure
   the default page, WorldWideWeb.app/default.html. This must point
   to some useful information! You should keep it up to date with
   pointers to info on your site and elsewhere. If you use the CERN
   home page note there is a link at the bottom to the master copy on
   our server.   You should set up the address of your local news
   server with
   
                      dwrite WorldWideWeb NewsHost  news

   replacing the last word with the actual address of your news host.
   See Installation instructions.

  LINE MODE BROWSER
  
   Binaries of this for some systems are available in /pub/www/bin/.
   The binaries can be picked up, set executable, and run immediately.
   
   If there is no binary, see "Installation from source" above.
   
   (See Installation notes.)  Do the same thing (in the same
   directory) to the WWWLibrary_v.vv.tar.Z file to get the common
   library.
   
   You will have an ASCII printable manual in the file
   WWW/LineMode/Defaults/line-mode-guide.txt which you can print out
   at this stage. This is a frozen copy of some of the online
   documentation.
   
   When you install the browser, you may configure a default page. This
   is /usr/local/lib/WWW/default.html for the line mode browser. This
   must point to some useful information! You should keep it up to
   date with pointers to info on your site and elsewhere. If you use
   the CERN home page note there is a link at the bottom to the master
   copy on our server.
   
   Some basic documentation on the browser is delivered with the home
   page in the directory WWW/LineMode/Defaults. A separate tar file of
   that directory (WWWLineModeDefaults.tar.Z) is available if you just
   want to update that.
   
   The rest of the documentation is in hypertext, and so will be
   readable most easily with a browser. We suggest that after
   installing the browser, you browse through the basic documentation
   so that you are aware of, for example, the options and
   customisation possibilities.
   
  SERVER
  
   The server can be run very simply under the internet daemon, to
   export a file directory tree as a browsable hypertext tree.
   Binaries are available for some platforms; otherwise follow the
   instructions above for compiling and then go on to "Installing the
   basic W3 server".
   
  XMOSAIC
  
   XMosaic is an X11/Motif W3 browser.
   
   The sources and binaries are distributed separately from
   FTP.NCSA.UIUC.EDU, in /Web/xmosaic.  Binaries are available for
   some platforms.  If you have to build from source, check the README
   in the distribution.
   
   The binaries can be picked up, uncompressed, set "executable" and
   run immediately.

  VIOLA BROWSER FOR X11
  
   Viola is an X11 application for reading global hypertext.  If a
   binary is available for your machine, in /pub/www/bin/.../viola*,
   then take that and also the Viola "apps" tar file which contains
   the scripts you will need.
   
   To generate this from source, you will need both the W3 library and
   the Viola source files.  There is an Imakefile in the viola
   source directory. You will need to generate the XPA and XPM
   libraries and the W3 library before you make viola itself.
   
  DOCUMENTATION
  
   In the /pub/www/doc directory are a number of articles, preprints and
   guides on the web.
   
   See the online WWW bibliography for a list of these and other
   articles, books, etc. and also the list of WWW Manuals available in
   text and postscript form.
   
  GENERAL
  
   Your comments will of course be most appreciated, on code, or
   information on the web which is out of date or misleading. If you
   write your own hypertext and make it available by anonymous ftp or
   using a server, tell us and we'll put some pointers to it in ours.
   Thus spreads the web...
   
                                                       Tim Berners-Lee
                                                                      
                                                  WorldWideWeb project
                                                                      
                                     CERN, 1211 Geneva 23, Switzerland
                                                                      
 Tel: +41 22 767 3755; Fax: +41 22 767 7155; email: timbl@info.cern.ch
                                                                      
Installing the basic WWW server

   If using unix, for the simplest method see Installation under the
   Internet Daemon.
   
   There are special instructions if you are installing under VMS.
   
   The usual way to install a daemon is to either run it from the
   bootstrap command file (for example /etc/rc) so that it runs
   continuously, or to set up the internet daemon (inetd) to run it
   when a call comes in.
   
   See a csh script which does everything below for unix BSD systems
   but which you should modify with care for your own system.
   
   Note: From version 2.0 on, a rule file is no longer essential if
   you want to just export a directory tree.
   
   The installation normally requires superuser status, but it is
   possible to run httpd from a terminal session as a normal user.
   
  ACCESS AUTHORIZATION
  
   See quick guide on how to set up access authorization (for versions
   2.12 and newer).
   
  LOG FILE
  
   If a log file is required, make sure that the user name under
   which the daemon is run has the right to write the file.
   
                                                                Tim BL
                                                                      
  PRIVILEGED PORTS
  
   The TCP/IP port numbers below 1024 are special in that normal users
   are not allowed to run servers on them.  This is a security
   feature, in that if you connect to a service on one of these ports
   you are fairly sure that you have the real thing, and not a fake
   which some hacker has put up for you.
   
   The normal port number for W3 servers is port 80, which is such a
   port. (This number is assigned by the Internet Assigned Numbers
   Authority, IANA).
   
   When you run a server as a test from a non-privileged account, you
   will normally test it on another port, typically one such as 2784
   or 5000.
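
   The rule can be written down as a small check.  The Python below is
   only a sketch of the arithmetic described here; the function names
   are invented for the example, and the real decision is of course
   made by the operating system when the server tries to listen.

```python
PRIVILEGED_LIMIT = 1024  # ports below this are reserved for privileged users


def is_privileged(port):
    """True if `port` is one that normal users may not serve on."""
    if not 0 < port < 65536:
        raise ValueError("not a valid TCP port: %r" % (port,))
    return port < PRIVILEGED_LIMIT


def may_listen(port, is_superuser):
    """True if a process with the given status could serve on `port`."""
    return is_superuser or not is_privileged(port)
```

   Thus may_listen(80, False) is false, while the test ports mentioned
   above (2784, 5000) are open to a normal user.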
   
    Under unix
    
   The inet daemon (running as root) can listen for incoming
   connections on port 80 and pass them down to a process with a safer
   uid for the server itself. Of course, you have to be root to set up
   the inet daemon.
   
    Under VMS
    
   Under UCX, the process running as a server needs BYPASS privilege
   to listen to ports below 1024.  This might mean you have to install
   the server.  With other TCP/IP packages, privilege of some sort is
   similarly required.
   
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
  UNDER VMS
  
   The daemon runs just as under unix, for which the rest of the
   documentation was written.  These instructions are my ideas about
   how to run it under VMS but it is a long time since I did anything
   like this, so please tell me what is wrong. We don't have effort
   available to distribute HTTPD.EXE, sorry.
   
   Compilation of the daemon for VMS requires taking the library and
   Daemon source files from the unix release, copying them all onto
   the VMS system, and compiling them all.  The object files from the
   Library should go into a libwww.olb file.  The object files from
   the Daemon should be linked together and with the libwww.olb
   library.
   
   When compiling the sources, you must use a compiler flag to specify
   whether you have Multinet, UCX or Wollongong TCP/IP (cf. rebuilding
   the line mode browser).  The flags should be one of:
                                                                      
                /DEF=MULTINET
                /DEF=WIN_TCP
                /DEF=UCX

    Running
    


   The daemon works with document names which look like unix-style
   filenames. At the point of reading a file, these are converted into
   VMS-style filenames.
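
   To make the conversion concrete, here is a minimal Python sketch of
   how a unix-style document name might be turned into a VMS file
   specification.  It assumes the first component is the device, the
   last is the file name and everything in between is the directory
   path; the function name is invented, and the daemon's own
   conversion code may well handle more cases than this.

```python
def unix_to_vms(path):
    """Convert e.g. "/sys$disk/my/public/welcome.html" to VMS form.

    Assumption for this sketch: device first, file name last, and the
    components in between form the directory.  A file directly under
    the device is placed in the top-level directory [000000].
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("need at least a device and a file name")
    device, *dirs, filename = parts
    directory = "[" + ".".join(dirs) + "]" if dirs else "[000000]"
    return f"{device}:{directory}{filename}"
```

   With this sketch, the name /sys$disk/my/public/welcome.html becomes
   sys$disk:[my.public]welcome.html.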
                                                                      
    Testing it
    
   Suppose you have compiled and linked httpd successfully. You write
   a "welcome.html" file as an introduction to your server for those
   from outside, and you put it in some suitable directory which you
   wish to export, say sys$disk:[my.public]welcome.html
   
   You run it as an ordinary user on a port over 1024 from a terminal
   window.
                                                                      
                httpd :== $sys$disk:[my.directory]httpd.exe
                httpd -p 8000 -v "/sys$disk/my/public"

   Note that the directory to be exported is given in unix style.
   Don't panic. Watch the trace (enabled by the -v option). The server
   should end up waiting for a message.
   
   From another terminal window, you test the server, giving the
   internet node name of your machine in place of mynode.dom.ain and
   the same port number.  We assume you have the line mode browser
   installed.  You could test it with a GUI browser, but the trace
   might be more difficult to find.
                                                                      
                www -v "http://mynode.dom.ain:8000/welcome.html"

   You should now get your welcome page displayed on the terminal.
   There will be a lot of trace as well which may make it almost
   unreadable, but if it works you can of course run both server and
   client next time without the -v.
                                                                      
    Installing properly.
    
   Check whether your TCP/IP brand contains an inetd daemon. If it
   does, that is great: you just run it under the inetd daemon
   following the manufacturer's instructions.  Set the daemon up to
   run on any TCP connection to port 80.  (The service name for port
   80 is http.)  In this case the only command line parameter which
   you will need to pass to httpd is the directory name.  Omit the
   port number to tell httpd that it is running under the inet daemon.
   If you find that this daemon is too slow (very possible under VMS),
   then switch to using the method below.
   
   If you don't have an inet daemon, then you have to run the daemon
   as a detached process.  To do this you have to add something to one
   of the many VMS boot startup files like SYSTARTUP.COM or some such.
   You need to be the system manager to do this, and if you are, you
   probably know where you personally like to put these things.  The
   command line should be as in the example when you tested it, except
   the port should be 80 (not 8000), and there should be no trace
   requested.
   
   In practice it seems that under VMS you always have to start a DCL
   environment to run a server, because if you just detach HTTPD you
   can't pass it any parameters.  So you use the usual trick of
   running loginout.exe to create a DCL environment:
                                                                      
                $RUN/DETACH/IN=SYS$EXE:HTTPD.COM/OUT=SYS$TEMP:HTTPD.LOG -
                    SYS$SYSTEM:LOGINOUT.EXE

   where HTTPD.COM is
                                                                      
                $ httpd :== $sys$disk:[my.directory]httpd.exe
                $ httpd -p 80 "/sys$disk/my/public"

   Check that out and tell me if it doesn't work... Tim BL
                                                                      
  INSTALLING A DAEMON UNDER INETD
  
   This is how to set up the internet daemon (inetd) to run your
   HTTPD server whenever a request comes in.   (These steps are the
   same for any daemon under unix: you will probably find a similar
   thing has been done for the FTP daemon, ftpd, for example.)
   
    Step 1
    
   Copy the daemon program or shell script (httpd in this example)
   into a suitable directory such as /usr/etc. Protect it from anyone
   writing to it except root.
   
    Step 2
    
   Put "http" in the /etc/services file, or use the name of a specific
   service of your own if you want to use a special port number.
   
   (Exceptions: on a NeXT, see using the NetInfoManager. On any
   machine running NIS (yellow pages), see special instructions.)
   
   For example,
   
http            80/tcp                  # WorldWideWeb server
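
   For readers who want to check such an entry by program, the Python
   sketch below splits a services line into its parts: the service
   name, the port number, the protocol and any aliases, ignoring the
   comment after the hash sign.  The function name is invented for the
   example.

```python
def parse_services_line(line):
    """Split one /etc/services line into (name, port, protocol, aliases).

    Returns None for blank lines and lines holding only a comment.
    """
    fields = line.split("#", 1)[0].split()
    if len(fields) < 2:
        return None
    name = fields[0]
    port, protocol = fields[1].split("/", 1)
    return name, int(port), protocol, fields[2:]
```

   Applied to the example line above, it yields the name "http", the
   port 80 and the protocol "tcp", with no aliases.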

    Step 3
    
   Put a line in the internet daemon configuration file,
   /etc/inetd.conf. For example,
   
http    stream  tcp     nowait  nobody  /usr/etc/httpd  httpd /Public

   Here "http" is used as a link between the
   services file and inetd.conf: it could have been any identifier.
   "nobody" is the user name under which you want the daemon to run,
   which determines what privileges it has for example to read data.
   "/usr/etc/httpd" is the actual file name of the server. The rest of
   the line is the arguments passed to httpd: arg0 is the program
   name, "httpd",  by convention. Here the argument "/Public"  is the
   directory tree to be exported. This is in fact the default if no
   directory is given. See command line syntax for more details.
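
   The fields just described can be picked apart as follows.  This
   Python sketch assumes the seven-field layout of the example line
   (service, socket type, protocol, wait flag, user, program,
   arguments); systems whose inetd.conf has no user name field would
   need a different split.  The type and function names are invented
   for the example.

```python
from collections import namedtuple

# Field names are this sketch's own labels for the columns.
InetdEntry = namedtuple(
    "InetdEntry", "service socket_type protocol wait user program argv"
)


def parse_inetd_line(line):
    """Parse one inetd.conf line in the seven-field format."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))
    # Everything from the seventh field on is the argument list,
    # starting with arg0 (the program name by convention).
    return InetdEntry(*fields[:6], fields[6:])
```

   For the example line, the entry's user is "nobody", its program is
   "/usr/etc/httpd" and its argument list is ["httpd", "/Public"].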
   
   Note: The inetd.conf format varies from system to system. If in
   doubt, copy the format of other lines in your existing inetd.conf.
   For example, under ultrix there is no user name field -- everything
   runs as root.
   
   Note: there seems to be, on the NeXT at least, a limit of 4
   arguments passed across by inetd!
   
    Step 4
    
   When you have updated inetd.conf, find out which process is running
   inetd, and send it a "HUP" signal.  On BSD unix this looks like the
   following (for System V, use ps -el in place of ps aux):
   
                
                > ps aux | grep inetd | grep -v grep
                root        85   0.0  0.9 1.24M  304K ?  S     0:01 /usr/etc/inetd
                > kill -HUP 85
                >


    Test it
    
   Test the server with the line mode browser by giving its address
   explicitly:
   
                        www http://myhost.dom.ain/welcome.html

   This assumes that you have a file "welcome.html" in your exported
   directory.  If it doesn't work, you have probably missed something.
   See notes on debugging.
   
                                                                Tim BL
                                                                      
  USING NIS (YELLOW PAGES)
  
   If your machine is running Sun's "Network Information Service",
   originally known as "yellow pages", read this.
   
   You must:
   
      First make an addition to the /etc/services file just as for a
      normal unix system.
      
      Then, change directory to /var/yp and type "make".
      
   This will load the /etc/services file into the yellow pages
   information system.
   
   Some people have found that they needed to reboot the system
   afterward for the change to take effect.
   
                                                                Tim BL
                                                                      
  ADDING A SERVICE ON THE NEXT
  
   The NeXT uses the "netinfo" database instead of the
   /etc/services file.  This is managed with the
   /NextAdmin/NetInfoManager application. Here's how to add the
   service "www":
   
      Start the NetInfoManager by double-clicking on its icon.
      
      If you are operating in a cluster, open either your local
      domain (/hostname) or if you have authority, the whole cluster
      domain (/). If you're not in a cluster, just use the domain you
      are presented with.
      
      Select "services" from the browser tree.
      
      Select "ftp" from the list of services.
      
      Select "duplicate" from the edit menu.
      
      Select "copy of ftp" and double-click on its icon to get
      the property editor.
      
      Click on "name" and then on the value "copy of ftp". Change
      this to "www" by typing "www" in the window at the bottom, and
      hitting return.
      
      Click on "port", and then on the value "21". Change it to "80".
      
      Use "Directory:Save" menu (Command/s) to save the result. You
      will have to give a root password or netinfo manager password.
      
                                                                Tim BL
                                                                      
The Rule File

   The rule file (configuration file) defines how the WWW software
   will translate a request into a document name.  For a server, it
   allows one to provide an extra level of name mapping above that
   given by links in the file system. It allows, for example, out of
   date names to be mapped onto their more recent counterparts.
   
   For the client, it allows access to certain servers to be remapped,
   for example to caching servers or to local copies of the same
   information.
   
   The rule file also allows access to be restricted.  This is
   essential, to prevent, for example, unauthorized access to your
   password file.
   
   By default, the rule file /etc/httpd.conf is loaded, unless
   specified otherwise with the -R or -r options.
   
   See also: example rule files, Old format for software before 2.0,
   Setting up gateways, Firewall gateways.
   
  MAPPING AND FILTERING
  
   Each line consists of an operation code and one or two parameters,
   referred to as the template and the result. Anything on a line
   after and including a hash sign (#) is ignored, as are empty lines.
   
   The server uses the top rule first, then EACH SUCCESSIVE RULE
   unless told otherwise by PASS or FAIL. The operation codes are as
   follows:
   
  map template result     If the address matches the template, use the
                         result string from now on for future rules.
                         
  pass template           If the address matches the template, use it
                         as it is, processing no further rules.
                         
  pass template result    If the address matches the template, use the
                         result string as it is, processing no further
                         rules.
                         
  fail template           If the address matches the template,
                         prohibit access, processing no further rules.
                         
   The template string may contain at most one wildcard asterisk
   ("*"). The result string may have one wildcard only if the template
   has one.
   
   When matching,
   
      Rules are scanned from the top of the file to the bottom.
      
      If a request matches a "map" template exactly, the result string
      is used instead of the original string and applied to successive
      rules.
      
      If the request matches a "map" template with wildcard, then the
      text of the request which matches the wildcard is inserted in
      place of the wildcard in the result string to form the
      translated request. If the result string has no wildcard, it is
      used as it is.
      
      When a map substitution takes place, the rule scan continues
      with the next rule using the new string in place of the request.
      This is not the case if a pass or fail is matched: they
      terminate the rule scan.
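
   The matching procedure above can be sketched in Python. This is an
   illustration only, not the daemon's own code: the function names
   are invented for the example, and refusing a request which matches
   no rule at all is an assumption, since the text does not say what
   happens in that case.

```python
def match(template, address):
    """Return the text matched by '*' ('' if the template has no
    wildcard), or None if the address does not match the template."""
    if "*" not in template:
        return "" if address == template else None
    head, tail = template.split("*", 1)
    if (len(address) >= len(head) + len(tail)
            and address.startswith(head) and address.endswith(tail)):
        return address[len(head):len(address) - len(tail)]
    return None


def translate(rules, address):
    """Scan the rules from top to bottom. Return the translated
    address, or None if access is refused."""
    for op, template, *result in rules:
        wild = match(template, address)
        if wild is None:
            continue                      # rule does not apply
        if op == "fail":
            return None                   # terminates the scan
        if result:
            # Insert the wildcard text into the result string.
            address = result[0].replace("*", wild, 1)
        if op == "pass":
            return address                # terminates the scan
        # op == "map": carry on with the new string
    return None  # assumption: nothing matched, so refuse


RULES = [
    ("map", "/", "/tnotes/welcome.html"),
    ("map", "/tnotes/*", "file:/u/john/public/*"),
    ("pass", "file:/u/john/public/*"),
    ("fail", "*"),
]

if __name__ == "__main__":
    print(translate(RULES, "/"))
    print(translate(RULES, "/etc/passwd"))
```

   Note how the request "/" is rewritten twice by successive map rules
   before the pass rule accepts it, while "/etc/passwd" reaches the
   final fail rule untouched and is refused.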
      
    Access Authorization
    
   From version 2.12 on, the daemon supports access authorization,
   which introduces two new rules: protect and defprot. They have the
   following syntax:
   

        defprot <template> <setupfile> <uid.gid>
        protect <template> <setupfile> <uid.gid>

  <setupfile>             is a pathname for protection setup file
                         which sets up the actual protection
                         parameters. The setup file can be omitted
                         from a protect rule, but it is obligatory in
                         a defprot rule. If the setup file is omitted
                         it is not possible to give the <uid.gid>
                         part, either.
                         
  <uid.gid>               are the Unix user id and group id (either by
                         name or by number, separated by a dot) to
                         which the server should change when serving
                         the request. These are only meaningful when
                         the server is running as root. They can be
                         omitted, in which case they default to nobody
                         and nogroup.
                         



   See also the full description of protect and defprot.
   
  SUFFIX DEFINITIONS
  
   As well as any mapping lines in the rule file, the rule file may be
   used to define the data types of files with particular suffixes.
   The syntax is
   
                suffix  <suffix>  <representation> <encoding> [ <quality> ]

   for example:
   
                suffix  .pc     text/plain          7bit        1.0
                suffix  *.*     application/binary  binary      0.1
                suffix  *       text/plain          7bit


   The parameters are as follows:
   
  <suffix>                The last part of the filename. There are two
                         special cases. "*.*" matches all files which
                         have not been matched by any explicit suffix
                         but do contain a dot. "*" by itself matches
                         any file which does not match any other
                         suffix.
                         
  <representation>        A MIME "content-type" style description of
                         the representation actually in use in the
                         file.  See the HTTP spec.  This need not be a
                         real MIME type -- it will only be used if it
                         matches a type given by a client.
                         
  <encoding>              A MIME content transfer encoding type.  Much
                         more limited in variety than representations,
                         basically whether the file is ASCII (7bit or
                         8bit) or binary. A few other encodings are
                         allowed, and this may be extended to cover
                         compression.
                         
  <quality>               Optional. A floating point number between
                         0.0 and 1.0 which determines the relative
                         merits of files xxx.* which differ in their
                         suffix only, when a link to xxx.multi is
                         being resolved.  Defaults to 1.0.
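
   The way the two special suffixes fall back can be sketched in
   Python, using the three example lines above as the table. The
   function name is invented for the example, and taking the first
   explicit suffix in the file when several match is an assumption.

```python
# (suffix, representation, encoding, quality); quality defaults to 1.0
SUFFIXES = [
    (".pc", "text/plain", "7bit", 1.0),
    ("*.*", "application/binary", "binary", 0.1),
    ("*", "text/plain", "7bit", 1.0),
]


def lookup(filename, table=SUFFIXES):
    """Return (representation, encoding, quality) for a filename,
    or None if no suffix line applies."""
    # Explicit suffixes are tried first.
    for suffix, *description in table:
        if suffix not in ("*.*", "*") and filename.endswith(suffix):
            return tuple(description)
    # "*.*" applies only to names containing a dot; "*" to anything.
    fallbacks = (["*.*"] if "." in filename else []) + ["*"]
    for wanted in fallbacks:
        for suffix, *description in table:
            if suffix == wanted:
                return tuple(description)
    return None


if __name__ == "__main__":
    for name in ("notes.pc", "picture.xyz", "README"):
        print(name, lookup(name))
```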
                         
  PRESENTATION DEFINITIONS
  
   In the rule file for a client, you can define the presentation of a
   given data type. The syntax is
   
                presentation   <representation>  <command-string>

   where the parameters are



  <representation>        A MIME-style content type. You can use
                         regular MIME types, such as image/jpeg, or
                         your own extensions which start with x-, such
                         as image/x-tiff, application/x-my-app.  See
                         also above.
                         
  <command-string>        The command needed to display a temporary
                         file of this type.  A "%s" within this string
                         will be replaced with the name of the
                         temporary file.  Note that if any file suffix
                         has been specified as corresponding to this
                         representation, then the temporary file will
                         be given that suffix (or the first suitable
                         one if there is a choice).
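
   The "%s" substitution can be sketched in Python as follows. This
   shows the idea only, not the client's own code; the function name
   and the viewer command "myviewer" are hypothetical, and no quoting
   of the file name is attempted.

```python
import os
import tempfile


def build_command(command_string, data, suffix=""):
    """Write data to a temporary file carrying the given suffix and
    return (command, path), with "%s" replaced by the file name."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return command_string.replace("%s", path), path


if __name__ == "__main__":
    # "myviewer" is a made-up command; it is printed, not run.
    command, path = build_command("myviewer %s", b"...", ".jpeg")
    print(command)
    os.remove(path)
```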
                         
                                                                Tim BL
                                                                      
  ALLOWING DIRECTORY BROWSING
  
   Sometimes one has a large body of information and no desire to
   write or generate hypertext for it.  In this case, the WWW daemon
   may be set up to present the directory structure of existing files
   as a hypertext tree.
   
   The rule file is still used in exactly the same way to map document
   names onto directory names.  When a document name is allowed and it
   corresponds to a directory, then the behaviour of the httpd server
   depends on the command line options given.
   
    Controlling access
    
   If -dn is given, access is denied.
   
   If -dy is given, or -ds is given and a file named ".www_browsable"
   exists in the directory, then browsing is allowed.  Note that -ds
   is the default if neither -dn nor -dy is specified.
   
    Inclusion of README files
    
   It is common practice to put a file named README into a directory
   containing instructions or notices to be read by anyone new to the
   directory. The http server will by default embed any README file in
   the hypertext version of a directory.
   
   If the -dr option is given, README files are not embedded, but a
   link is included in the listing as for other files.
   
   The -db and -dt options control whether the README file is included
   at the top, above the listing (-dt, the default) or at the bottom,
   below the listing (-db).  To put them at the top is normal, but
   they might be better at the bottom if they are very long.
   
   These features are available in httpd version 0.9b or later.
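
   The combined effect of the browsing and README options can be
   sketched in Python. This is a simplified model under stated
   assumptions, not the daemon's code: the function name is invented,
   the listing is plain text rather than hypertext, and leaving the
   README out of the listing when it is embedded is inferred from the
   description of -dr above.

```python
import os


def directory_page(path, browse="s", readme="t"):
    """Return the text of a directory page, or None if browsing is
    refused. browse is "n", "y" or "s"; readme is "t", "b" or "r"."""
    if browse == "n":
        return None
    marker = os.path.join(path, ".www_browsable")
    if browse == "s" and not os.path.exists(marker):
        return None
    readme_path = os.path.join(path, "README")
    embed = readme in ("t", "b") and os.path.isfile(readme_path)
    names = sorted(os.listdir(path))
    if embed:
        names = [name for name in names if name != "README"]
    listing = "\n".join(names)
    if not embed:
        return listing
    with open(readme_path) as handle:
        text = handle.read().rstrip("\n")
    if readme == "t":
        return text + "\n" + listing      # README above the listing
    return listing + "\n" + text          # README below the listing
```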



                                                                Tim BL
                                                                      
  RULE FILE EXAMPLES
  
   A basic rule file for the http daemon might look like this (it
   looked different before version 2.0):
   

pass    /          file:/u/john/welcome.html
pass    /*         file:/u/john/public/*
fail   *

   The first line maps the root document onto a specific document
   about the server, and accepts it.  (See etiquette about the welcome
   page.)
   
   The second line maps all document names onto filenames in a
   particular directory and accepts them.
   
   The third line disallows access to all other documents. (There
   won't be any in this case because of the mapping, but it's wise to
   put it in for later.)
   
    Second example
    

map    /            /tnotes/welcome.html
map    /tnotes/*    file:/u/john/public/*
map    /seminars/*  file:/u/jane/seminars/*
pass   file:/u/john/public/*
pass   file:/u/jane/seminars/*.html
fail   *

   The first line maps the root document onto a specific document
   about the server.   Because it is "map" and not "pass",  it DOESN'T
   accept it  but passes it on for further mapping by lines further
   down.
   
   The second line maps all document names starting with /tnotes/ onto
   filenames in a particular directory where John maintains the
   technical notes. If someone else takes over the technical notes, we
   can change this. Here we are starting to distinguish between
   document names and file names. This can be carried much further if
   necessary, but one level of mapping is enough to allow for changes
   of administration of different areas.
   
   The third line separately maps the seminar information into Jane's
   directory.
   
   The fourth and fifth lines enable access to anything in John's
   "public" directory, and any .html file in Jane's "seminars"
   directory tree. Note here that the * maps to any sequence INCLUDING
   SLASHES so all files in any subdirectory of /u/jane/seminars will
   be enabled so long as they end in .html.
   
   The bottom line will pick up for example any attempt to use the
   server to access non-html files in Jane's seminars directory.
   
    Configuration file for a WAIS gateway
    
   The httpd daemon can be used as a WAIS gateway if it has been
   compiled with the necessary options and linked with the freeWAIS
   software. A suitable configuration file is
   
map     /*              wais://*
pass    wais://*
fail    *


Server Command Line

   The command line syntax for the basic www server allows a number of
   options and an optional directory argument.
   
                        httpd  [options] [directory]

   The directory argument, if present, indicates the directory to be
   exported. (Version 2.0 and later only.)  If not present, either a
   rule file is used, to export combinations of directories, or else
   the default is to export the "/Public" directory tree.
   
  EXAMPLES
  
                        httpd -p 80  -dyt /ftp/pub

   This exports the entire /ftp/pub tree with browsable directories
   and README files included at the top of directory listings.
   
                        httpd

   This command in the inetd configuration file inetd.conf exports the
   /Public directory tree.  This tree may contain soft links to other
   directory trees.
   
  -dn                     Disable directory browsing. An attempt to
                         access a directory will generate an error
                         response.
                         
  -dy                     Enable directory browsing.  Directories are
                         returned as hypertext documents. See browsing
                         directories. This is the default.
                         
  -ds                     Enable directory browsing only for
                         directories containing a file named
                         ".www_browsable".
                         



  -dt                     For any browsable directory which contains a
                         README file, include the text of the README
                         file at the top of the document before the
                         listing. This is the default.
                         
  -db                     As -dt but put the README at the bottom,
                         after the listing.  The -db and -dt options
                         may be combined with -dy as -dyb, -dty etc.
                         
  -dr                     Disables the README inclusion feature.
                         
  -l  file                Log all calls to the given file. The file is
                         appended to if it already exists.
                         
  -p port                 Specify the port number. If this option is
                         not given, the daemon assumes that it has
                         been run by inetd, and uses stdin and stdout
                         as its communication channel. Note that port
                         numbers under 1024 are privileged.
                         
  -v                      Verbose mode. Copious trace messages are
                         written to the standard output stream. Mainly
                         for debugging.
                         
  -r file                 Load a rule file. The rules are added after
                         any rules already loaded.  Inhibits the
                         loading of the default rule file.
                         
  -R                      Do not use. Inhibit the loading of the
                         default rule file.  Warning: running without
                         a rule file normally poses a security
                         problem.  It won't work in general, as only
                         the path part of a URL is input into the rule
                         file, and a fully qualified URL (with file:
                         in front for example) is required on output.
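
   The way the -d letters combine (as in -dyt or -dty) can be sketched
   in Python. The function name is invented for the example, and the
   starting values passed in as defaults are an assumption of the
   sketch rather than a statement about the daemon.

```python
def parse_d_option(argument, browse="s", readme="t"):
    """Split an option such as "-dyt" into (browse, readme) letters.
    browse is one of n, y, s; readme is one of t, b, r."""
    if not argument.startswith("-d"):
        raise ValueError("not a -d option: " + argument)
    for letter in argument[2:]:
        if letter in "nys":
            browse = letter
        elif letter in "tbr":
            readme = letter
        else:
            raise ValueError("unknown -d letter: " + letter)
    return browse, readme


if __name__ == "__main__":
    print(parse_d_option("-dyt"))
    print(parse_d_option("-dty"))
```

   Because each letter belongs to exactly one of the two groups, the
   order in which the letters are written makes no difference.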
                         
                                                                Tim BL
                                                                      
Debugging the daemon

   Suppose you think you have installed a W3 server but it doesn't
   work. That is, you have followed the installation instructions and
   the test at the end fails. Here we assume you have used port 80.
   If you have a situation not handled by this problem-solving guide,
   please mail me.
   
   Type
   
        www http://myhost.domain:80/


   What happens?
   



      "Cannot connect to information server" message, "Unable to
      access document" or some other generic-sounding error message
      
      An empty document is displayed
      
      A document containing the words "Document address invalid or
      access not authorised", or some "Error 500" message is displayed
      
      A document is displayed, but not what you wanted the server to
      give in response to that document name (/)
      
                                                                Tim BL
                                                                      
  DOCUMENT ADDRESS INVALID
  
   You have accessed a W3 server and you get back a message "Document
   address invalid or access not authorized", or some other error
   message from the server.
   
   The 1.x server does not (originally for security reasons)
   distinguish between a document which does not exist, and one to
   which you are not allowed access.  However, most servers are public
   servers which allow access to anyone, so if you are following a
   bona fide link, this could mean
   
      You have been passed a bad document address. If you are
      following a link, check with the author of the document which
      contained the link.
      
      The document has been moved. Check with the server
      administrator. You should be able to find out who runs the
      server by going to the welcome page (type "g /" with the line
      mode browser) and seeing a link to information about the
      maintainers.
      
   If you are the server administrator, and you can't  understand why
   the daemon refuses to deliver the file,
   
      Check the rule file if you have one.  Think out the way the
      document name will be mapped successively by each line, and what
      the result will be. Checking the trace below may help clarify
      this.
      
      Run the daemon with trace from a terminal session to get trace
      information.
      
                                                                Tim BL
                                                                      
  CAN'T CONNECT TO SERVER
  
   There is more information you can get.  Use the "verbose" option on
   the browser to find out what went wrong:
   
                        www -v http://myhost.domain:80/



   What do you get? A load of trace messages. There are several cases.
   
      The browser can't look up the name of the host. If it can, it
      will display a "Parsed address as" message. If not, try fixing
      your name server or /etc/hosts file, or quoting the IP number of
      the host in decimal notation (like 128.141.77.45) instead.
      
      The browser can get to the host but gets "Connection refused"
      status back.
      
      Your browser gets an error number but prints "error message not
      translated". This is because when it was compiled on your
      platform it didn't know what form the error message table took.
      Try the same thing from a unix platform for example.
      
      You get some network error like "network unreachable". Depending
      on whether the IP network is your responsibility or not, and
      your attitude to life, either fix it,  try again in an hour's
      time, or complain to someone.
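
   If the www browser is not to hand, the cases above can be told
   apart with a few lines of Python. This is a rough sketch using the
   standard socket module; the function name and the wording of the
   returned strings are invented for the example.

```python
import socket


def diagnose(host, port, timeout=5.0):
    """Report which stage of connecting to host:port fails."""
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror:
        return "cannot look up host name"
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "connection refused"
    except OSError as error:
        # Covers timeouts, "network unreachable" and the like.
        return "network error: %s" % error


if __name__ == "__main__":
    print(diagnose("myhost.domain", 80))
```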
      
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
  "CONNECTION REFUSED"
  
   The browser tries to connect to the daemon but gets this status in
   the trace.
   
   This means that no one was listening on that port number. Check
   that the port numbers match between server and client.  Make sure
   you specify the port number explicitly in the document address for
   www.
   
   If you are running the daemon without the inet daemon (with the -a
   option), then try running it from the terminal with -v as well.
   The trace for the server should say "socket, bind and listen all
   ok". If it does, and you still get "connection refused", then you
   must be talking to the wrong host (or, conceivably, different
   ethernet adapters on the same host).
   
   If you are running with the inet daemon, then check both the
   services file (/etc/services) or database (yellow pages, netinfo)
   if your system uses it, and the /etc/inetd.conf file. Check that
   the service name matches between these two.
   
   Did you remember to kill -HUP the inet daemon when you changed the
   inetd.conf file?
   
   Try running the daemon from a shell window to get a better view of
   what happens.
                                                                Tim BL
                                                                      
  YOU GET AN EMPTY DOCUMENT
  



   The document sent back is empty, but there is no error message.
   
   The inet daemon has started a process to run your server but it
   immediately failed.  Possibilities include:
                                                                      
      The daemon may not be in the file specified, or may not be
      executable by the specified user (or, if a user id is not
      specified in your variety of inetd.conf, root)
      
      You have written your own daemon and it crashes.
      
      You are using ours and it crashes (mail us!)
      
   Try running the daemon from a terminal window to see what happens.
   
                                                                Tim BL
                                                                      
  BAD OUTPUT FROM THE DAEMON
  
   These are some ideas:
   
      Try running the server from the terminal.
      
      Check the HTML source the daemon produces with
      
        www -source http://myhost.domain:80/

      Try telnetting to the daemon and simulating the client:
      

        > telnet myhost.domain 80
        Connected to myhost.domain on port 80
        Escape is ^[
        GET /documentname


                                                                Tim BL
                                                                      
  TELNETTING TO A SERVER
  
   Most implementations of telnet allow you to specify a port number.
   Under unix this is often just a second parameter, under VMS a /PORT
   option.
   
   The HTTP protocol is a telnet-style protocol, so you can simulate
   it just by typing things in.  This will help you to see exactly
   what the server is sending back, and it will let you check that it
   really is the server, not the browser, which has a problem.
   
   Here is an example. (You type "telnet..." and "GET ...").
   
                                                                      
        > telnet myhost.domain 80
        Connected to myhost.domain on port 80
        Escape is ^[
        GET /documentname
        <PLAINTEXT>
        Document name "/documentname" invalid.
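
   The same exchange can be scripted rather than typed. The following
   Python sketch sends the one-line request shown above and collects
   whatever comes back. It assumes the server answers and then closes
   the connection, as in the transcript; the function name is
   invented for the example.

```python
import socket


def simple_get(host, port, path, timeout=10.0):
    """Send "GET <path>" on a fresh connection and return the raw
    bytes the server sends before it closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(("GET %s\r\n" % path).encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    reply = simple_get("myhost.domain", 80, "/documentname")
    print(reply.decode("latin-1"))
```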

  RUNNING UNDER SHELL
  
   You don't have to run the daemon under inetd if it doesn't work.
   You can run it from a shell session.
   
   If the daemon is httpd, then run it from your terminal, with a
   different port number like 8000.  You use the -p option.
   
                                                                      
                httpd -p 8000

   Note: You must be root (under VMS, have some privilege) to run with
   a port number below 1024.
   
   If you select a port above 1024, then you can run as a normal user.
   This way, anyone can publish files on the net. However, it isn't
   very reliable, as your server will not automatically come back up
   if the machine is rebooted. In the long term it is best to install
   it under "inetd".
   
   You can't use a port number which has been used by a daemon process
   recently, so you may have to switch port number if you ^C and
   restart the daemon.  When it is running like this, you can read the
   trace messages and use a debugger on it if necessary. (See also:
   telnetting to the server.)
                                                                      
    Debugging using Trace
    
   If you can't understand why a server refuses to give back a
   document, then run with the -v option to get trace.  You will see
   the daemon setting up the rules for translating requests into local
   URLs, and you will see its attempt to access the file (assuming you
   map requests onto files).
   
                                                                      
                httpd -v -p 8000

   Try to access the document from a client using another terminal
   window. Look at the trace printout.  It will probably explain what
   is happening.  If it includes specific messages below, follow them
   to detailed help.
                                                                      
      Can't find internet hostname `'
      
   If you still can't figure out the problem, mail your local guru or
   help desk, or if desperate www-request@info.cern.ch, ENCLOSING a
   copy of that trace.
                                                                      
    Even simpler
    
   For testing a daemon very simply, without using a client, you can
   make the terminal be the client.  With httpd, or if the server is a
   shell script "myserver", try just running it with the terminal and
   typing GET /documentname into its input:
                                                                      
                        > httpd
                        GET /

   Try it with the -v option if what comes back isn't a formatted
   document.
   
                                                                Tim BL
                                                                      
The basic W3 server:  Internals

   This describes the generic hypertext daemon (server) program. The
   daemon is part of the WWW project. See also:
                                                                      
      User guide
      
      Bugs and Features
      
      Other servers
      
   The hypertext daemon, like the ftp daemon, is a program which
   responds to an incoming tcp connection and provides a service to
   the caller.
                                                                      
  SOURCES
  
   A compilation option (SELECT) controls whether more than one
   connection can be handled at a time. This is a function of whether
   the TCP/IP implementation beneath the application has a working
   "select()" routine. If it does not, this implementation services
   one connection, then drops it before accepting another one. In
   neither case does the daemon concurrently serve two clients, nor
   does it fork off a process to do that.
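
   The simpler of the two cases, servicing one connection and dropping
   it before accepting the next, has roughly the following shape. The
   sketch is in Python for brevity and is not a transcription of
   HTDaemon.c; the function names and the max_connections parameter
   are invented for the example.

```python
import socket


def make_listener(port, host=""):
    """Create a TCP socket listening on the given port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def serve_one_at_a_time(listener, handle_request, max_connections=None):
    """Accept a connection, handle it completely, drop it, and only
    then accept the next. No forking, no concurrency."""
    served = 0
    while max_connections is None or served < max_connections:
        connection, _peer = listener.accept()
        try:
            handle_request(connection)
        finally:
            connection.close()
        served += 1
```

   A second caller simply waits in the listen queue until the first
   request has been dealt with.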
                                                                      
   The basic server loop is in the file HTDaemon.c.  A separate module
   (for example HTRetrieve.c) contains the code to handle one request.
   Various specific versions of this may be written for different
   flavours of server. Also used are various modules of WWW common
   code.  The httpd released from CERN uses almost the entire W3
   library and can therefore access any object which a browser running
   on that machine can access, and return it as HTML or some other
   format.
   
                                                                Tim BL
                                                                      
Bugs and Improvements needed

   Improvements to be made in the HTTP daemon program are as follows.
   (See also Features.)
                                                                      
      Call shell scripts to perform searches on directory trees or
      documents.
      
      The HTRetrieve() routine ought to be able to pick up the user
      node and userid, etc...
      



      Ought to have chroot option. (wwwww July 93)
      
                                                                Tim BL
                                                                      
Daemon features: Update history

   History list for the WWW daemon. (See also bugs.)  Many other
   changes to the daemon are in fact changes to the common code
   library.
   
  2.12  11 OCTOBER 93
  
      First release with access authorization.
      
  2.06  7 JUNE 93
  
      Bug fix: Load error 500 returned as proper HTTP status, not as
      simple document.
      
      WAIS gateway now caches source files again.
      
      Bug fix: Daemon used to try to display graphics file locally on
      the server when the client couldn't display them!  Cause of much
      confusion  :-)
      
  2.05
  
      Big bug fix in local file directory handling .. didn't work in
      2.04!
      
  2.04  28 APRIL 93
  
      With the properly compiled libwww library, this daemon will
      operate as a WAIS, news etc gateway if so configured.
      
      WAIS gateway operation bug fix.
      
  2.03-BETA: UNRELEASED
  
      Bug fix: operation with no rule file didn't work as expected.
      
  2.02-BETA: 17 MARCH 93
  
      Misleading error trace removed.
      
      Compiled on HP, SGI, Sun, DEC, NeXT and binaries available
      
      Binary handling fixed in library.
      
      Reference to missing HTDirRead.h removed.
      
      Assumes that user can handle files of unknown format
      (application/binary).



  2.00-ALPHA  15 MAR 93
  
      Simple command line -- with no parameters, exports the /Public
      directory.
      
      Multiformat handling -- see library changes for 2.0.  Links to
      .multi filenames resolve to any file with same root, any
      recognised extension.
      
  UNRELEASED 0.9B
  
      Bug fix: If a PASS or FAIL line in the configuration file acted
      on a single document id (ie no wildcard) then it crashed the
      daemon. (HTRules.c, 17-Jun-92, TBL).
      
  SEPT 1991 V0.3
  
      Bug fix: Plain text files were returned to be parsed as SGML,
      causing them to come out as garbage. (Mike Sendall)
      
  AUGUST 1991 V 0.2
  
      -R option now suppresses default rule file.
      
      Rule file format changed completely. Now allows authorisation of
      specific paths only.
      
  JUNE 1991 VERSION 0.1
  
      -r and -R options for rules
      
      Default address is now for Inet daemon working. (29 June)
      
      -l option to log to a file.
      
      -a option for address other than default
      
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
  INTERNET HOSTNAME `'
  
   Sounds as though you are running a server which has a bad rule
   file. (This error also happened with pre-2.07 servers when they
   weren't given a rule file).
   
   You need something like
   
                pass     /*    file:/my/directory/*

   in your rule file.  The "file:" bit is important as it shows that
   the rest is a local filename. If you don't put that, then the
   server can output this message in attempting to access the document
   over the net without a hostname.
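
   As a rough illustration of why the "file:" prefix matters, the
   following Python sketch applies one pass rule containing a single
   wildcard. The helper name apply_pass_rule and its exact behaviour
   are assumptions made for this example; it is not the daemon's code.

```python
def apply_pass_rule(template, target, path):
    """Map a request path through one 'pass' rule with a single '*'.

    Returns the mapped address, or None if the template does not match.
    Hypothetical helper, for illustration only.
    """
    prefix, star, suffix = template.partition("*")
    if not star:
        # No wildcard: the template must match the path exactly.
        return target if path == template else None
    if (len(path) < len(prefix) + len(suffix)
            or not path.startswith(prefix)
            or not path.endswith(suffix)):
        return None
    matched = path[len(prefix):len(path) - len(suffix)]
    # The text matched by '*' replaces the '*' in the target.
    return target.replace("*", matched, 1)


with_scheme = apply_pass_rule("/*", "file:/my/directory/*", "/docs/index.html")
without_scheme = apply_pass_rule("/*", "/my/directory/*", "/docs/index.html")
```

   Here with_scheme starts with "file:" and so names a local file,
   while without_scheme carries no access scheme at all, which is the
   situation the message above complains about.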
   
                                                                Tim BL
                                                                      
Access Authorization Overview

  Status of this documentation:
                         This is the documentation of WWW telnet-level
                         Access Authorization as implemented in
                         October 1993 (Basic scheme, part of the WWW
                         Common Library).  It also contains proposals
                         for encryption-level protection (Pubkey
                         scheme proposal and RIPEM-based proposal).
                         
  QUICK REFERENCES
  
  AA protocol in a nutshell:
                         Protocol examples.
                         
  Setup in a nutshell:   a quick manual on how to set up protection in
                         the CERN Daemon.
                         
  COMPLETE DOCUMENTATION
  
   CERN Server: Protection setup user manual.
   
   Scheme specifications:       Basic Protection Scheme (Basic) as
   implemented in October 1993, Public Key Protection Scheme (Pubkey)
   proposal, and RIPEM-based proposal by Tony Sanders.
   
   Details:     Browser side AA and Server side AA.
   
   See also:    Vocabulary.
   
    AA Testing Page
    
                                                    AL 14 October 1993
                                                                      
  ACCESS CONTROL LIST FILE
  
   The Access Control List File (.www_acl) contains access information
   for the files in its own directory. It has the following format:
   

        template: method,method,...: group,user,group,...

  template                is the name of a file in that directory (not
                         containing the path). It may contain one
                         wildcard *, like any template in the WWW
                         server rule file. Entries that refer to
                         files in other directories are completely
                         ignored.

  method,method,...       is a list of methods allowed.
                         
  group,user,group,...    lists the groups and users allowed to
                         execute those methods for files matching the
                         template.  Group list can also have IP number
                         templates, and in fact the group definition
                         syntax is exactly as in the group file.
                         
   There may be many entries for each file, for example the following
   is valid:
   

        * : get,put : ari,tim,robert
        * : get     : group1,group2

   Should put imply get?
   
   If an entry for a file is missing, the file is considered to be
   completely protected from everybody, and it is never served to
   anybody.
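   
   A minimal Python sketch of how such a file could be read and
   consulted is given here. It is only an illustration: the function
   names are invented, members are treated as plain names (no group
   expansion or IP templates), and fnmatchcase accepts more pattern
   characters than the single * described above.

```python
from fnmatch import fnmatchcase


def parse_acl(text):
    """Parse 'template: method,...: member,...' lines into tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        template, methods, members = (p.strip() for p in line.split(":", 2))
        entries.append((
            template,
            {m.strip().lower() for m in methods.split(",")},
            {m.strip() for m in members.split(",")},
        ))
    return entries


def is_allowed(entries, filename, method, member):
    """True if any entry matching filename grants method to member."""
    for template, methods, members in entries:
        if (fnmatchcase(filename, template)
                and method.lower() in methods
                and member in members):
            return True
    # No matching entry: the file is protected from everybody.
    return False


acl = parse_acl("""
* : get,put : ari,tim,robert
* : get     : group1,group2
""")
```

   With the two example entries, is_allowed(acl, "a.html", "put",
   "tim") holds, while a put by group1 is refused because only the
   first entry grants put.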
   
   If an Access Control List File exists, it is always consulted, even
   when there is no protect rule saying so.  In other words, an
   existing ACL file turns protection on (and then there must be a
   matching defprot rule).
   
   If there is a protect rule protecting a directory, but there is no
   corresponding ACL and no Mask-Group definition in the protection
   setup file, the situation is handled as if the ACL were empty
   (i.e. contained no entries for any files), so access will be
   forbidden.
   
                                                    AL 14 October 1993
                                                                      
  BASIC PROTECTION SCHEME
  
   The Basic Protection Scheme consists of the following steps:
   
       (Server sends an Unauthorized status).
      
       Client authenticates himself.
      
       Server checks authentication and authorization.
      
       If the previous step was successful, the document is sent
      normally by the server.
      
       The document is received normally by the client.
      
   If the server protection hierarchy is clear and the browser is
   sophisticated enough to figure out right away whether a document is
   protected, the first step is visited very seldom (possibly only
   once) during the entire browsing session for each protected server.

    Step 1:  Server Sends an Unauthorized Status
    
   Once a server receives a request without an Authorization: field to
   access a document that is protected, it sends an Unauthorized 401
   status code, and a set of WWW-Authenticate: fields containing valid
   authentication schemes and their scheme-specific parameters.
   
   In the Basic scheme the reply is as follows:
   

        HTTP/1.0 401 Unauthorized -- authentication failed
        WWW-Authenticate: Basic realm="CollabName"

   where realm specifies which password file is used; the same server
   can use a different password file for different trees of documents
   (this is the server-id specified in the CERN server protection
   setup file). The client can thus figure out which password to use
   at any given time.
   
    Step 2: Client Authenticates Himself
    
   After receiving Unauthorized status code, the browser prompts for
   user name and password (if they are not already given by the user),
   and constructs a string containing those two separated by a colon:
   

        username:password

   This string is then encoded into printable characters and sent
   along with the next request in the Authorization: field as follows:
   

        Authorization: Basic encoded_string
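
   The construction of this field can be sketched in Python as
   follows. The text above only says "printable characters"; base64,
   which HTTP Basic authentication was later standardised on, is
   assumed here, and the function name is invented for the example.

```python
import base64


def basic_authorization(username, password):
    """Build an Authorization: field for the Basic scheme.

    Assumption: the printable encoding is base64.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {encoded}"


header = basic_authorization("Aladdin", "open sesame")
# header == "Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
```

   Note that the encoding is trivially reversible, so it hides nothing
   from anyone who can read the request.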

    Step 3:  Server Checks Authentication and Authorization
    
   When the server receives a request to access a document protected
   by the Basic Scheme, and the request is a full request containing
   Authorization: field which contains the Basic Scheme information,
   it will execute the following Access Request Validation Procedure:
   
      The server receives an Authorization: field with the scheme name
      Basic and encoded authorization string.
      
       If the scheme name is wrong, access is denied, and an
      Unauthorized 401 status with WWW-Authenticate: field containing
      appropriate scheme name (Basic) and realm name is sent back (as
      if no authorization information was given).
      
       If scheme name is correct the authorization string is decoded.
      
       If the access information is correct, the result should have
      two fields separated by a colon, of which at least the first
      must be non-empty (there can be a username without a password).
      
       If not, access is denied, and an Unauthorized 401 status with
      appropriate WWW-Authenticate: field is sent back.
      
       Otherwise, username and password are checked for validity from
      the password file.
      
       If the username-password pair is incorrect, access is denied
      with an Unauthorized 401 status and WWW-Authenticate: field etc.
      
       If the username-password pair is correct, the server checks
      whether the user and the connecting IP address are members of the
      mask-group, if one is specified in the protection setup file
      (using the group file).
      
       Server then looks for an entry for the requested file in the
      corresponding Access Control List File, which is in the same
      directory as the file to be accessed, named .www_acl (if any).
      
       If there is neither a mask-group nor an ACL, or if an ACL
      exists but has no entry for that file, access is denied with a
      Forbidden 403 status code.
      
       If there is an ACL entry for it, server checks if the user and
      connecting IP address belong to the list of groups and users
      allowed to access it (using group file).
      
       If not, an Unauthorized 401 status etc. is sent.
      
       Otherwise, the server checks if the requested file exists.
      
       If not, a Not found 404 status is sent back.
      
       Otherwise access is allowed, and the server sends the document
      normally to the browser.
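      
   The procedure above can be condensed into a short Python sketch.
   Several simplifications are assumptions of this example and not
   part of the scheme: passwords are compared in plain text, the
   printable encoding is taken to be base64, IP address checks are
   left out, and a failed mask-group check is answered with 401 (the
   text does not say which status applies there).

```python
import base64
import binascii


def validate(auth_field, filename, passwords, mask_group, acl, files):
    """Return the status code the validation procedure would produce.

    auth_field: value of the Authorization: field, e.g. "Basic dGltOnB3"
    passwords:  dict username -> password (plain text, for brevity)
    mask_group: set of allowed usernames, or None if not configured
    acl:        dict filename -> set of allowed usernames, or None
    files:      set of filenames that exist
    """
    scheme, _, encoded = auth_field.partition(" ")
    if scheme != "Basic":
        return 401                      # wrong scheme name
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return 401                      # string cannot be decoded
    username, colon, password = decoded.partition(":")
    if not colon or not username:
        return 401                      # need "user:password", user non-empty
    if passwords.get(username) != password:
        return 401                      # bad username-password pair
    if mask_group is not None and username not in mask_group:
        return 401                      # assumption: status not given in text
    if mask_group is None and acl is None:
        return 403                      # neither mask-group nor ACL
    if acl is not None:
        if filename not in acl:
            return 403                  # ACL exists but has no entry
        if username not in acl[filename]:
            return 401                  # user not listed for this file
    if filename not in files:
        return 404
    return 200
```

   The order of the tests matters: a missing file is only reported as
   404 after the caller has been authorized, so unauthorized callers
   cannot probe for file names.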
      
   See also the discussion about Basic Protection Scheme.
   
                                                    AL 14 October 1993
                                                                      
  DISCUSSION ABOUT THE BASIC PROTECTION SCHEME
  
   Because the password flies (almost) unencrypted through the
   Internet, anyone who is listening to the Internet traffic can find
   out people's user names and corresponding passwords. Thus this kind
   of a telnet-level protection only protects from accidental viewing
   of classified documents.
   
   You might think of this as a door. If you really want to get in,
   you can always break the lock and get what you want, but the
   bottom line is that you have then broken something, and that is
   wrong.
   
   Thus the Basic Protection Scheme only provides the means of telling
   people that a document is a protected one; it does not prevent the
   document from being accessed by someone who wants it badly enough
   to go through the trouble of listening to the Internet traffic,
   finding out which printable encoding scheme we use, and decoding
   your username and password.
   
   However, using IP address masking together with usernames ("only
   these people from these internet addresses") makes it more secure,
   because an intruder would also have to have access to the machine
   having the required IP address.
   
                                                    AL 14 October 1993
                                                                      
  BROWSER SIDE ACCESS AUTHORIZATION DESCRIPTION
  
   The exact browser side Access Authorization procedures are
   described in the corresponding protection scheme specification:
   
       Basic Protection Scheme (Basic) and
      
       Public Key Protection Scheme (Pubkey)
      
   During a browsing session the client side keeps track of the hosts,
   schemes, and the corresponding usernames and passwords.  Because
   the browser keeps track of this authorization information, on
   subsequent requests to servers that it has contacted already during
   a particular browsing session, the browser can automatically send
   the authorization information
   
      without first failing to access the document, and
      
      without having to re-prompt for the username and password from
      the user.
      
    How Does the Browser Know When to Send AA Info
    
   Protected documents should be collected into directories of
   protected documents. Those directories should contain only
   protected documents, all of which are protected by the same scheme.
   The browser can then use this assumption to make the decision about
   whether to send authorization information along with the request:
   
       
     If the server replies 401 (Unauthorized) for some file, every
     other file in that directory and in its subdirectories is
     considered protected by that same server (and scheme).
     
   The 'directory' in this context means what seems to be a directory
   when examining a given URL.
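   
   One possible way for a browser to apply this rule is sketched below
   in Python. The class AuthMemory and its methods are hypothetical
   names chosen for the example; a real browser would also have to
   handle several realms per host.

```python
from urllib.parse import urlsplit


class AuthMemory:
    """Remembers which apparent directories answered 401."""

    def __init__(self):
        self._protected = {}   # (host, directory) -> (scheme, credentials)

    @staticmethod
    def _directory(path):
        # Everything up to and including the last '/' looks like a directory.
        return path[:path.rfind("/") + 1]

    def remember(self, url, scheme, credentials):
        """Record that url was protected and which credentials worked."""
        parts = urlsplit(url)
        key = (parts.netloc, self._directory(parts.path))
        self._protected[key] = (scheme, credentials)

    def lookup(self, url):
        """Return (scheme, credentials) to send with url, or None."""
        parts = urlsplit(url)
        for (host, directory), info in self._protected.items():
            if parts.netloc == host and parts.path.startswith(directory):
                return info
        return None


memory = AuthMemory()
memory.remember("http://info.cern.ch/secret/a.html", "Basic", "tim:pw")
```

   After the single remember() call, a lookup for any URL under
   /secret/ on the same host returns the stored information, so the
   request can carry the Authorization: field on the first attempt.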
   
                                                    AL 15 October 1993
                                                                      
  SECURITY HOLE IN THE UNIX FINGER DAEMON
  
   On some systems, the finger daemon, fingerd, was run under user-id
   zero (root). In this case a user could make his .plan file a link
   to a read-protected file. Then, by fingering himself, he could
   read that file.
   
                                                    AL 14 October 1993
                                                                      
  GROUP FILE
  
   Each user may belong to zero or more groups, and a group may
   contain zero or more users and/or other groups. Groups are just
   abbreviations for long lists of users. Group names can be referenced
   in the protection setup file (in the mask-group field), and in the
   ACL file (the last field in each line).
   
    Group Declaration
    
   Each line in the group file contains information about one group,
   in the format shown in the following example (this is called a
   group declaration):
   

        groupname: user1,user2,group1,user3,group2

   That is, the groupname is followed by a colon followed by a
   comma-separated list of usernames and/or groupnames in arbitrary
   order (this list is called a group definition).
   
   A groupname must be defined before it is referenced (and a
   groupname is not defined inside its own definition).  An undefined
   reference is treated as a username.  This guarantees the absence of
   circular structures in the group hierarchy.
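   
   The define-before-reference rule can be illustrated with a small
   Python sketch. It is an assumption-laden simplification: the
   function name is invented, and IP address masks and parenthesised
   lists (described below) are not handled.

```python
def parse_groups(text):
    """Expand each group declaration to a flat set of usernames."""
    groups = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, definition = line.partition(":")
        members = set()
        for item in definition.split(","):
            item = item.strip()
            if not item:
                continue
            # A group defined earlier is expanded; anything else,
            # including the group's own name, is taken as a username.
            members |= groups.get(item, {item})
        # The group only becomes visible after its definition is read.
        groups[name.strip()] = members
    return groups


groups = parse_groups("""
staff: ari,tim
all: staff,robert,later
later: someone
""")
```

   In this example "all" expands staff to ari and tim, but "later" is
   kept as a plain username because it is declared only afterwards.
   Since a name is never expanded before it has been defined, no
   cycle can arise.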
   
    Syntax of Group Definition Part
    
   Group definition part appears not only in the group file, but also
   
       in mask-group field in protection setup file, and
      
       as last item on each line of the ACL file.
      
   Group definitions are in their simplest form just one user or group
   name, or a comma-separated list of them.
   
      IP Address Masks
      
   Any group definition may contain an IP address restriction like:
   
       "anybody from these IP addresses"
      
       "this user from these addresses"
      
       "these users and groups from these addresses"
      
   An IP address restriction starts with an at sign @ and is followed
   by an IP number template. In an IP template each of the 4 parts may
   contain one wildcard character *.
   
   An IP address restriction can stand on its own, in which case it
   allows anyone from a matching address:
   

    cern_site: @128.141.*.*

   However, it can also immediately follow a user or group name in
   which case these users are only allowed if they connect from a
   matching address:
   

    ari_at_work: luotonen@128.141.8.187
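
   Matching against such templates might look like the following
   Python sketch. It assumes that each of the four parts is either a
   literal number or a lone *; the function names are illustrative and
   parenthesised lists are not covered.

```python
def ip_matches(template, address):
    """Compare a dotted address with a template such as 128.141.*.*"""
    t_parts = template.split(".")
    a_parts = address.split(".")
    if len(t_parts) != 4 or len(a_parts) != 4:
        return False
    return all(t == "*" or t == a for t, a in zip(t_parts, a_parts))


def member_allowed(item, user, address):
    """item is 'name', '@template' or 'name@template'."""
    name, at, template = item.partition("@")
    if name and name != user:
        return False
    # Without an '@' part there is no address restriction at all.
    return not at or ip_matches(template, address)
```

   For instance, member_allowed("@128.141.*.*", "anyone",
   "128.141.1.1") holds, whereas "luotonen@128.141.8.187" admits
   luotonen from that one address only.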

      Lists of Names and IP Address Templates
      
   It is possible to make a list of users and groups, and IP
   addresses, and combine them all together with parentheses:
   

    cern_hackers: (luotonen,timbl)@(128.141.8.187, 128.141.244.101)

      Continuation Line
      
   Long group definitions can be split over multiple lines after any
   comma in the group definition:
   

    wizards: marca, sanders, kevin, dave, montulli, timbl,
             cailliau, hallam, jak
    hackers: marca@141.142.*.*, sanders@153.39.*.*,
             (luotonen, timbl, hallam)@128.141.*.*,
             cailliau@(128.141.201.162, 128.141.248.119)
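
   A reader of the group file has to join such continuation lines
   before parsing them. The Python sketch below shows one way to do
   it, on the assumption that a line ending in a comma is always
   continued on the next line; join_continuations is a made-up name.

```python
def join_continuations(lines):
    """Merge physical lines into logical group declarations."""
    logical, pending = [], ""
    for line in lines:
        pending += line.strip()
        if pending.endswith(","):
            pending += " "
            continue              # definition continues on the next line
        if pending:
            logical.append(pending)
        pending = ""
    if pending:
        # File ended in the middle of a definition: keep what we have.
        logical.append(pending.rstrip(", "))
    return logical


declarations = join_continuations([
    "wizards: marca, sanders, kevin, dave, montulli, timbl,",
    "         cailliau, hallam, jak",
    "cern_site: @128.141.*.*",
])
```

   The three physical lines above become two declarations, one for
   wizards and one for cern_site.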

   See also: Password file.
   
                                                    AL 14 October 1993
                                                                      
  DISCUSSION ABOUT UNIX LINKS
  
   Usually WWW servers providing protected information also want to
   provide public information. Since the information about which
   document files are protected cannot reside in the same file as the
   document itself, the Unix links (both soft and hard) pose a serious
   safety problem, because with them it is possible to make a file
   appear in some other directory than where it really resides.
   
    Description of How to Override the Protection
    
      Make a document.
      
      Make a Unix (soft or hard) link to the protected document
      (has to be on the same machine, of course).
      
      From your own document make a hypertext link to the newly
      created Unix link.
      
      You can now access the protected document by following the
      hypertext link in your own document.
      
   As you can see, in order to gain illegal access to protected
   documents, the person has to have an account on the same machine
   where the document resides. Also, the Unix link must be put under
   the real WWW server on that machine, not just a privately run copy
   of it (because otherwise it would not have Unix read access to the
   protected documents). Thus not just anybody can override the
   access authorization system, even with this weak spot.
   
   However, the worst thing about this is the fact that it is not just
   the creator of the Unix link who gains access to the protected
   data, but in fact every person, who can access that file (link)
   through the Web (and that's the entire world).
   
   The problem originates from the fact that the WWW server has Unix
   access to both protected and public documents, and it is the server
   that has to resolve whether a given document is in fact protected
   or public; the underlying Unix file system certainly doesn't make
   that any easier.
   
   Unix links have caused similar trouble before, too.
   
    Solution in CERN Daemon
    
   Obviously the simplest and safest solution is to run the server
   under a user-id that has access to the documents of one
   collaboration, but not to any others. Because the server has to be
   able to serve documents of multiple collaborations, it first runs
   as root, and sets its process user and group ids just before
   serving the request.
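   
   The idea of switching ids before serving can be sketched in Python
   (Unix only). This is an illustration of the general technique, not
   the daemon's code; the function name and the injectable setgid and
   setuid parameters are assumptions that make the sketch testable
   without root privileges.

```python
import os


def drop_privileges(uid, gid, setgid=os.setgid, setuid=os.setuid):
    """Switch from root to the collaboration's ids before serving.

    The group id is changed first: once the user id is no longer
    root, the process would not be allowed to change its group.
    """
    setgid(gid)
    setuid(uid)
```

   A server would typically call this in the process that handles one
   request, after deciding which collaboration the requested document
   belongs to, so that the Unix file permissions themselves refuse
   access to other collaborations' files.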
   
                                                    AL 14 October 1993
                                                                      
  DISCUSSION ABOUT PASSWORDS IN ACCESS AUTHORIZATION
  
   There are a number of ways in which a user can identify himself to
   a system; some of them are better than the others in some way, but
   worse in another.
   
   In order to build a totally, completely bullet proof protection
   scheme into WWW system we would have to construct a system so heavy
   that it would ruin everything that WWW stands for: fast access to a
   large amount of data from anywhere in the world easily. The Web is
   not a place to keep state secrets. However, it should provide at
   least some level of protection.

   There are three methods worth considering in this connection:
   
      username and password identification
      
      single key encryption
      
      public key encryption
      
   If we use the public key encryption, the user must keep the private
   key in some file somewhere accessible to the WWW browser. Since
   some platforms do not provide any kind of protection of data at
   all, this method by itself is not sufficient.
   
   Moreover, WWW users must be able to access the Web from anywhere in
   the world. This implies that they would have to carry a diskette
   or some other form of media containing the private key with them.
   Also, the users would have to worry about the media being
   compatible with the platform they are temporarily using. Otherwise
   the private key would have to be transmitted via an insecure
   channel, which is why we need the encryption system in the first
   place.
   
   Even if we should end up using public key or other advanced
   encryption method as a means of authentication, one problem still
   remains: a chain is as weak as the weakest link in it. In other
   words, no matter how safe the WWW system itself is, the platforms
   using it cannot all guarantee that the user is who he says he is.
   
   For example, someone might break into a machine and use someone
   else's public-private key pair, and the WWW system could not do
   anything about it.
   
   Since most of the platforms use only a simple username-password
   method, it will suffice for the protection needs of the WWW system,
   at least for the time being. We shall call this telnet-level
   protection scheme the Basic Access Authorization Scheme, or Basic
   for short.
   
   Later on, an enhanced version of this (a combination of all the
   three methods mentioned above) will be implemented, called the
   Public Key Access Authorization Scheme, or Pubkey for short.
   
                                                    AL 14 October 1993
                                                                      
  PASSWORD FILE
  
   The information about users and their passwords is kept in a
   password file of the server. Each line in the password file
   contains information about one user, in the following format:
   

        username:password:real name and maybe other information

   The password field is encrypted by the C library crypt() function.
   This makes it compatible with the Unix password file (/etc/passwd).
   The password file can be maintained by the htadm program.
   
   Password file should not reside in the served tree of documents, or
   it should be carefully checked that the rule file prevents it from
   being accessed via the WWW server.
   
   There must not be duplicate entries for the same username, and
   username must never contain colons.
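   
   The format and the no-duplicates rule can be illustrated with the
   following Python sketch. The function name is invented, the sample
   password fields are placeholders rather than real crypt() output,
   and checking a password against the encrypted field is deliberately
   left out.

```python
def parse_password_file(text):
    """Return dict username -> (encrypted_password, other_info)."""
    users = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        # Only the first two colons separate fields; the free-form
        # third field may itself contain colons.
        fields = line.split(":", 2)
        if len(fields) < 2:
            raise ValueError(f"line {number}: expected username:password")
        username, encrypted = fields[0], fields[1]
        info = fields[2] if len(fields) == 3 else ""
        if username in users:
            raise ValueError(f"line {number}: duplicate entry {username!r}")
        users[username] = (encrypted, info)
    return users


users = parse_password_file(
    "tim:PLACEHOLDER1:Tim BL, CERN: office unknown\n"
    "ari:PLACEHOLDER2:Ari L\n"
)
```

   Because the username is the first field and fields are separated
   by colons, a colon inside a username would shift every following
   field, which is why usernames must not contain one.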
   
   See also: Group file.
   
                                                    AL 14 October 1993
                                                                      
  SELECTING THE ENCRYPTION METHODS FOR THE PUBKEY PROTECTION SCHEME
  
   There are two encryption methods needed to implement the Public Key
   Protection Scheme. We need a conventional single key method, where
   the same key both encrypts and decrypts (for encrypting and
   decrypting the server reply: the headers and the document itself),
   and a public key method (used for encrypting user's identification
   information and his encryption key).
   
   The reason for using two encryption methods is the fact that public
   key encryption is too slow for large amounts of data (documents),
   so the documents have to be encrypted with a single key method. But
   the key has to be sent over an insecure channel, and the way to do
   this securely is to use a public key method.
   
    Single Key Methods to Consider
    
   The following single key encryption methods are worth considering:
   
  DES                     Patent in the U.S.
                         
  IDEA                    Patent in Europe, no license fee for
                         noncommercial use.
                         
   I suggest that DES encryption be used, since there are so many
   different implementations all over the world that it is easy to
   plug one in, provided clear hooks are left in the WWW Common
   Library code.
   
    Public Key Methods to Consider
    
   The following public key encryption methods are worth considering:
   
  RSA                     Rivest-Shamir-Adleman, patent in the U.S.
                         
  Rabin                   Public Key Partners claim their patent
                         covers all public key cryptography.

                                                    AL 14 October 1993
                                                                      
  PUBLIC KEY PROTECTION SCHEME PROPOSAL
  
   In the Basic Protection Scheme the password flies through the net
   unencrypted, which is not a very good idea. One solution to this is
   to encrypt the username and password with a public key of the server.
   
   Furthermore, the documents might be classified or copyrighted in
   such a way that they need to be encrypted, too, while transferring
   them through the Internet.
   
   The Public Key Protection Scheme consists of the following steps:
   
       (Server sends an Unauthorized status).
      
       Client authenticates himself.
      
       Server checks authentication and authorization.
      
       Server sends an encrypted reply.
      
       Client decrypts the reply from server.
      
    Step 1:  Server Sends an Unauthorized 401 Status
    
   On reception of a request to access a protected document the Public
   Key Protection Scheme works like the Basic Protection Scheme,
   except that the WWW server also sends its public key in the
   WWW-Authenticate: header field of the reply:
   

        HTTP/1.0 401 Unauthorized -- authentication failed
        WWW-Authenticate: Basic realm="CollabName",
                                key="encodedPublicKey"

   If the client had given the Authorization: field already with the
   request, then the scheme continues at step 3: server checks
   authentication and authorization.
   
    Step 2:  Client Authenticates Himself
    
   After having received the Unauthorized status code (or otherwise
   knowing from a previous request to the server that it requires
   authorization information when accessing the desired file), browser
   prompts for username and password (unless already given), generates
   a random encryption key, then concatenates the user name, password,
   browser's IP address, timestamp and the generated encryption key,
   with colons as separators:
   

    username:password:browser_inet_address:timestamp:browser_key

   It then encrypts the resulting string with the server's public key,
   and encodes it into printable characters.
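   
   The plain-text part of this step can be sketched in Python. The
   proposal does not fix the timestamp or key formats, so Unix seconds
   and a 64-bit key written in hexadecimal are assumptions of this
   example, as is the function name. Encryption with the server's
   public key and the printable encoding are not shown.

```python
import secrets
import time


def build_pubkey_credentials(username, password, browser_address):
    """Return the colon-separated string that is later encrypted."""
    # A fresh random key is drawn for every request, never reused.
    browser_key = secrets.token_hex(8)
    timestamp = str(int(time.time()))
    return ":".join(
        [username, password, browser_address, timestamp, browser_key]
    )


first = build_pubkey_credentials("tim", "pw", "128.141.8.187")
second = build_pubkey_credentials("tim", "pw", "128.141.8.187")
```

   Two successive calls give strings that differ at least in the last
   field, in line with the requirement below that a key never be used
   twice.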
   
   The client then sends the encrypted string along with the next
   request in the Authorization: field as follows:
   

        Authorization: Pubkey encrypted_string

   Although the browser's encryption key exists only in its memory and
   in the server's memory, and is encrypted with the server's public
   key while it flies through the network, the same key should not be
   used twice; a new key should be generated even when accessing the
   same server, thus reducing the possibility of the encryption key
   being cracked.
   
   The browser's encryption key is concatenated with the
   identification information before encryption to guarantee that even
   if someone catches the authorization string it will be useless,
   because using it will produce undecryptable results. Thus replaying
   is possible from the same internet address as the original request
   during the (short) time when the timestamp is valid, but it is
   useless.
   
    Step 3:  Server Checks Authentication and Authorization
    
   When the server receives an access request to a document protected
   by the Public Key Access Authorization Scheme, and the request is a
   full request containing Authorization: field which contains the
   Public Key Scheme authorization information, it will execute the
   same Access Request Validation Procedure as in Basic scheme with
   the following exceptions and additions:
   
       The authorization string is decrypted with the server's private
      key after decoding it from printable characters.
      
       If access information is correct, result should be five fields
      with colons as field separators. Those fields contain username,
      password, internet address, timestamp and browser's encryption
      key, respectively.
      
       The IP address is checked against the actual requesting address.
      If they do not match, access is forbidden (403 status code).
      
       The timestamp is checked against the current server time. If it
      is not within limits, access is denied because of failing
      authentication (401 status code). The server also sends a
      WWW-Server-Time: field giving the browser its current time (this
      removes the need for synchronized clocks).
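      
   The additional checks can be sketched in Python as below, operating
   on the already decrypted string. The function name, the integer
   timestamp and the 300 second limit are assumptions; the proposal
   only says "within limits". A result of 200 means that these extra
   checks passed and the Basic procedure continues.

```python
def check_pubkey_fields(decrypted, requesting_address, now, max_skew=300):
    """Return (status, fields) for a decrypted authorization string."""
    fields = decrypted.split(":")
    if len(fields) != 5:
        return 401, None            # malformed access information
    username, password, address, timestamp, browser_key = fields
    if address != requesting_address:
        return 403, None            # request came from another address
    try:
        sent_at = int(timestamp)
    except ValueError:
        return 401, None
    if abs(now - sent_at) > max_skew:
        return 401, None            # timestamp outside the allowed window
    return 200, (username, password, browser_key)
```

   Splitting on every colon means that in this sketch neither the
   password nor the key may contain a colon; a real implementation
   would have to settle that detail.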
      
    Step 4: Server Sends An Encrypted Reply
    
   In the Public Key Scheme, if the client is allowed to access the
   document, the reply from the server may be encrypted. The server
   replies with the usual status line, and immediately after that
   follow the DEK-Info:, Key-Info: and MIC-Info: fields (almost as in
   RFC 1421):
   

        HTTP/1.0 200 Document follows
        DEK-Info: DES-CBC,BFF968AA74691AC1
        Key-Info: DES_ECB,DJSFo7dSDFf34hKHFD8234jDFf2bfasdf832DF3nZ
        MIC-Info: MD5,
         LDKJF3kr34hfDuf23r98FBk38ftDFP9873hbrFDp9gb23kfDPF2b3JfKeL7G
         DLkwtDICl234FJi9834kjfslk
        ... other headers and the encrypted document follow ...

   The document body is not encoded into printable characters, but is
   pure binary as output by the encryption procedure. This is to save
   time, space and bandwidth.
   
    Step 5: Client Decrypts The Reply From Server
    
   When the client receives a reply with DEK-Info:, Key-Info: and
   MIC-Info: it knows that the body is encrypted. These fields are
   used to decrypt the document as described in RFC 1421.
   
   In further discussion about the Public Key Scheme there are
   considerations about possible encryption methods to use.
   
                                                    AL 14 October 1993
                                                                      
  DISCUSSION ABOUT THE PUBLIC KEY PROTECTION SCHEME
  
   Implemented in this way, Access Authorization does not violate the
   implementation guideline requiring that there be no sessions
   between client and server.
   
   The Public Key Scheme in itself is independent of the encryption
   methods selected. The only requirements are that one is a public
   key cryptosystem, and the other one is a single key cryptosystem,
   and that server and client agree on the cryptosystems used.  See
   the discussion about possible encryption methods.
   
                                                    AL 14 October 1993
                                                                      
   Origin: This is the file THEORY in rpem.tar.Z, available from
   dcssparc.cl.msu.edu (35.8.1.6).
   
   Edited into HTML by: AL
   
  DESCRIPTION OF THE RABIN PUBLIC KEY CRYPTOSYSTEM
  
   Here are some messages from Marc Ringuette and Bennet Yee
   concerning the Rabin system.  They provide a succinct description
   of the system, and statements concerning its public domainness.
   
   Note that the version of the Rabin system I/we have implemented is
   not exactly as described in Rabin's papers, so I may be giving him
   short shrift here.  We/I use the Berlekamp square root algorithm
   (which is very much different than the exponentiation that RSA
   uses) in order to be sure that no one at RSA can claim this is an
   RSA ripoff. I think it's safe to say that this square root
   algorithm, coupled with the Chinese Remainder Theorem, is the
   "magic" that makes this whole system work.
   

-------- Messages follow ---------------------------------------

Date: Fri, 24 Aug 1990 11:26-EDT
From: Marc.Ringuette@DAISY.LEARNING.CS.CMU.EDU
To: Mark Riordan
Subject: Re: Royalty-free public key algorithm wanted

   Happy news - I have something for you.  My friend Bennet Yee
   introduced me to it, and it's a simple PK technique, provably as
   hard as factoring, that is probably equivalent to or better than
   RSA.  It's not patented as far as I know...but I haven't written
   away to the author yet.
   
   It was invented by Michael Rabin, and goes like this:
   
       The private key is a pair of large random primes, as for RSA
      
       The encryption function is squaring/square root modulo pq.
      Squaring is easy -- modular multiplication -- but taking a
      square root modulo pq is as hard as factoring.  Once you know
      the factors, though, it is possible.
      
       So to encrypt a short message with the public key, square the
      message modulo pq.
      
       To decrypt it, take the four square roots modulo pq, and choose
      the correct one somehow.
      
   In a practical system, you use this function to encrypt a one-time
   key for DES or some other private-key system, then encrypt the rest
   of the message with the private key system.
   
   p.s. Here's a brief proof that the method is as hard as factoring:
   
   Assume you can take arbitrary square roots modulo pq.  If a number
   has a square root (1 out of 4 numbers do), then it has 4 square
   roots, two distinct ones and their negations mod pq.
   
   To factor pq, choose a random number, square it, and take the
   square root. With 50% probability, you will obtain the other
   distinct square root.  From these you can derive the factoring
   (damn, I can't quite remember how - was it the Chinese Remainder
   Theorem, or some sort of GCD?).  I can fill in the details sometime
   if you want.



Return-Path:
Received: from DAISY.LEARNING.CS.CMU.EDU by clvax1.cl.msu.edu with SMTP ;
          Thu, 13 Sep 90 14:09:28 EDT
Date: Thu, 13 Sep 1990 14:06-EDT
From: Marc.Ringuette@DAISY.LEARNING.CS.CMU.EDU
To: ceblair@ux1.cso.uiuc.edu, riordanmr@clvax1.cl.msu.edu
Subject: Re: Is Rabin cryptosystem covered by patents?

   I just got mail from Michael Rabin, saying that his technique is in
   the public domain.  Yay!
   
   Bennet Yee adds:
   

Date: Sun, 28 Apr 91 22:06:12 EDT
From: Bennet.Yee@PLAY.MACH.CS.CMU.EDU

   Rabin's protocol is equivalent to factoring:  Suppose you have a
   procedure P which, given a quadratic residue, gives one of its
   square roots mod pq.  The four square roots of a quadratic residue
   y=x^2 mod pq are -x, x, -gamma x, gamma x, where gamma is a
   nontrivial square root of unity mod pq.
   
   Aside:  you can find gamma if you know p and q by using the Chinese
   Remainder Theorem (CRT) and solving the system of equations
   

        x = -1 mod p
        x = 1 mod q

   [ You can see where the other square roots of unity come from:
   they are the other possible patterns of signs on the 1's in the
   system of eqns for CRT. ]
   
   Now, given P, you choose a random r between 1 and pq-1 inclusive
   and compute y = P(r^2).  With 1/2 probability, y = +/- gamma r.
   Since you knew r, you can find g = y/r = +/- gamma.  Now, since g-1
   is either 0 mod q or 0 mod p, GCD(g-1,pq) will give you p or q.
   
   [ To find 1/r mod pq, use EGCD:  The extended Euclidean algorithm,
   given m,n, will find GCD(m,n) as well as the pair a,b such that
   am+bn=GCD(m,n). When GCD(m,n)=1, we have a=1/m mod n. ]
   
   Note that this can be simplified a little, since with very high
   probability r shares no factor with pq:  r(g-1) = r(y/r - 1) =
   y - r, so GCD(y-r,pq) will work just as well.  If r does share a
   factor with pq, you've already (accidentally) factored the modulus.
   

-------- End of Messages -----------------------------------------------
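   The reduction described in the messages above can be tried out on
   toy numbers.  The following Python sketch is only an illustration:
   the modulus 77 = 7 * 11, the function names, and the brute-force
   stand-in for the procedure P are assumptions made for this example
   and are not taken from rpem.

```python
import math
import random

def make_oracle(n):
    """Hypothetical stand-in for procedure P: return some square root mod n.

    It searches exhaustively, so it is only usable for toy moduli.
    """
    def P(y):
        for x in range(n):
            if (x * x) % n == y % n:
                return x
        raise ValueError("not a quadratic residue")
    return P

def factor_with_oracle(n, P, seed=0):
    """Factor n = p*q with a square-root oracle, using GCD(y - r, n)."""
    rng = random.Random(seed)         # seeded only to make the demo repeatable
    while True:
        r = rng.randrange(1, n)       # random r between 1 and n-1 inclusive
        y = P((r * r) % n)            # some square root of r^2 mod n
        d = math.gcd((y - r) % n, n)
        if 1 < d < n:                 # y was not +/- r, so d is p or q
            return d, n // d

print(sorted(factor_with_oracle(77, make_oracle(77))))
```

   Each attempt succeeds with probability of about 1/2, so the loop
   is expected to finish after a couple of iterations.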




   Let me add a few words about "choosing the correct root somehow".
   If there's one square root of X mod pq, then there are four square
   roots. In general, it's not obvious which of the four square roots
   is the original message.
   
   H. C. Williams devised a modification of the Rabin system which
   allows the cryptographer to decide definitively which of the four
   square roots is the original message.  I started to implement
   Williams' variation (see the code in cippkg.c that has been #if'ed
   out), but decided that his variation made the system look too much
   like RSA.  The RSA system is great, but I don't want their lawyers
   after me.
   
   So, the question remains:  how should we distinguish which of four
   candidates is the original plaintext?  I decided upon a brute force
   approach:  I add 64 bits of redundant information to a message
   before encrypting it.  The 64 bits are simply the first 64 bits of
   the message.  If the message is less than 64 bits long, it is
   repeated as necessary to fill out the 64 bits.  When the ciphertext
   is decrypted, the correct plaintext can be detected (with a
   probability of error of 2^-64, I assume) by looking for the
   redundancy.
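   The redundancy check can be sketched in Python as follows.  This is
   a sketch under several assumptions, not the rpem code: it uses two
   Mersenne primes as toy keys, it places the 64 redundant bits after
   the message (the text above does not say where they go), it works
   on whole bytes, and it takes square roots with the shortcut that
   exists for primes congruent to 3 mod 4 instead of the Berlekamp
   algorithm mentioned earlier.  All names are hypothetical.

```python
P_PRIME = 2**89 - 1     # toy primes, both congruent to 3 mod 4
Q_PRIME = 2**107 - 1
N = P_PRIME * Q_PRIME   # the public modulus pq

def redundancy(msg: bytes) -> bytes:
    """First 64 bits of msg, repeating msg if it is shorter than 64 bits."""
    reps = -(-8 // len(msg))            # ceil(8 / len(msg)); msg must be non-empty
    return (msg * reps)[:8]

def encrypt(msg: bytes) -> int:
    """Square (message + redundancy) modulo N."""
    block = int.from_bytes(msg + redundancy(msg), "big")
    if block >= N:
        raise ValueError("message chunk too long for this modulus")
    return pow(block, 2, N)

def square_roots(c: int):
    """All square roots of c mod N, combined with the CRT."""
    rp = pow(c, (P_PRIME + 1) // 4, P_PRIME)   # a root mod p (p = 3 mod 4)
    rq = pow(c, (Q_PRIME + 1) // 4, Q_PRIME)   # a root mod q (q = 3 mod 4)
    inv_q = pow(Q_PRIME, -1, P_PRIME)          # needs Python 3.8 or later
    roots = set()
    for sp in (rp, P_PRIME - rp):
        for sq in (rq, Q_PRIME - rq):
            # x = sq mod q and x = sp mod p
            roots.add((sq + Q_PRIME * ((sp - sq) * inv_q % P_PRIME)) % N)
    return roots

def decrypt(c: int) -> bytes:
    """Return the root whose last 64 bits repeat the start of the message."""
    for x in square_roots(c):
        raw = x.to_bytes((x.bit_length() + 7) // 8, "big")
        msg, tag = raw[:-8], raw[-8:]
        if msg and tag == redundancy(msg):
            return msg
    raise ValueError("no root carries the expected redundancy")

print(decrypt(encrypt(b"DESKEY01")))
```

   A message that begins with a zero byte would lose that byte in this
   simplified encoding; a real implementation would use fixed-size
   chunks.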
   
   This technique is ugly because it does not *guarantee* unique
   detection of the correct root (though 2^-64 is good enough for me),
   and also because it wastes bits.  However, the waste of bits isn't
   as bad as it looks.
   
   Messages in the Rabin system have to be broken up into chunks of
   size (just less than) pq.  But since p and q need to be rather
   large in order to provide adequate security, each chunk of the
   message should be several hundred bits or more in size. Using 64
   bits of that to discriminate amongst the square roots is not much
   overhead.  Plus, public key systems are typically used only to
   encipher a message key for a more conventional (and much faster)
   secret key system.  The message key is typically much smaller than
   several hundred bits, so there's plenty of room left over for
   redundancy.
   
    Selected References
    
      M. O. Rabin, "Digitized signatures and public-key functions as
      intractable as factorization," MIT Lab. for Computer Science,
      Technical Report LCS/TR-212, 1979.  [I've not located this
      paper myself and have instead relied upon references to it in
      other papers and upon Marc Ringuette's description.]

      H. C. Williams, "A Modification of the RSA Public-Key Encryption
      Procedure," IEEE Transactions on Information Theory, Vol.
      IT-26, No. 6, November 1980.  [I decided not to use this
      because it looked too RSA-like.]

      Trygve Nagell, Introduction to Number Theory.  New York:
      Chelsea Publishing Company, 1964.  [Basic number theory text,
      better for cryptographic purposes than most.  See esp. the
      chapter "Theory of Quadratic Residues".]

      Henk C. A. van Tilborg, An Introduction to Cryptology.  Boston:
      Kluwer Academic Publishers, 1988.  [Especially strong on
      public key systems.  Comes with handy appendices on number
      theory and the theory of finite fields.]

      Jennifer Seberry and Josef Pieprzyk, Cryptography:  An
      Introduction to Computer Security.  Sydney, Australia:
      Prentice Hall, 1989.  [More easily readable than most similar
      books, with more of an eye toward applications.  Contains
      complete C source to a DES implementation.  So much for DES
      being a secret.]
      
       Mark Riordan,   riordanmr@clvax1.cl.msu.edu,    late April 1991
                                                                      
  AA ADDITIONS TO RULE FILE
  
   Access Authorization brings two additional rules to the rule file:
   protect and defprot. They have the same syntax:
   

        defprot <template> <setupfile> <uid.gid>
        protect <template> <setupfile> <uid.gid>

  <template>              is the usual template used in the rule file
                         to match against the requested URL.
                         
  <setupfile>             is the pathname of the protection setup
                         file, which sets up the actual protection
                         parameters.
                         
                         The setup file can be omitted from a protect
                         rule, but it is obligatory in a defprot rule.
                         If the setup file is omitted, it is not
                         possible to give the <uid.gid> part either.
                         
  <uid.gid>               are the Unix user id and group id (either by
                         name or by number, separated by a dot) to
                         which the server should change when serving
                         the request. These are only meaningful when
                         the server is running as root.
                         
                         These can be omitted, in which case they
                         default to nobody and nogroup. Either part
                         may also be omitted by itself, as long as it
                         is kept in mind that the dot belongs to the
                         group id part:

        user.group        user        .group

                         are all valid.
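   As an illustration of the defaulting rules for the <uid.gid> field,
   here is a minimal Python sketch.  The function name and its
   parameters are hypothetical; the server's own parser is not shown
   in this document and may differ, for instance in how it treats
   names that themselves contain a dot.

```python
def parse_uid_gid(spec=None, default_user="nobody", default_group="nogroup"):
    """Split a <uid.gid> field into (user, group), applying the defaults."""
    if not spec:
        return default_user, default_group
    # The dot belongs to the group part, so ".group" leaves the user empty.
    user, dot, group = spec.partition(".")
    return user or default_user, (group if dot and group else default_group)

for field in ("user.group", "user", ".group", None):
    print(field, "->", parse_uid_gid(field))
```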
                         
    The defprot Rule
    
   The defprot rule specifies the default protection setup file and
   process uid and gid.
   
   defprot by itself does not protect anything, but if protection is
   later turned on by
   
      either an existing access control list file
      
      or a protect rule without a setup file name
      
   the protection settings of the defprot rule are used. Rule
   translation continues normally after a defprot rule. If another
   defprot rule is matched, it overrides the previous one.
   
    The protect Rule
    
   The protect rule tells the server that the document matching the
   template is protected. If the protection setup file is not
   specified, it is taken from the previously matched defprot rule.
   If no defprot rule has matched before, it is an error.
   
   Rule translation continues normally, but the document is served in
   protected mode: either an access control list file (.www_acl) must
   be found in the same directory as the document, or a mask must be
   present in the protection setup file (or both), and in addition, of
   course, the requirements in the mask/ACL must be met (i.e. the
   user/IP number must belong to an allowed group).
   
   If another protect rule is matched, it overrides the previous one.
   
   Note: Even without a protect rule, protection is enabled if there is
   an Access Control List in the same directory as the requested file.
   
   The reason the protect rule exists is that it makes it possible to
   declare an entire hierarchy of files protected, so that if for some
   reason the ACL is missing, protected files are not exposed.
   
   It can also be used to avoid having ACLs altogether when
   Mask-Group is set in the protection setup file.
   
    Examples
    

    defprot  *               /WWW/httpd.prot
    protect  /priv/*         /WWW/priv/httpd.prot         foo.bar
    protect  /priv/secret/*  /WWW/priv/secret/httpd.prot  foo.bar
    fail     *.prot
    map      /*              file:/WWW/*
    fail     *

   This setup uses protection setup files in the top-level directory
   for each different protection level (this doesn't need to be the
   case). When accessing "private" and "secret" files the server sets
   its process user and group id to foo and bar. Otherwise it runs
   as nobody in nogroup. The first fail rule explicitly fails every
   request to access any protection setup file (however, they need
   not be called httpd.prot).
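   How the defprot and protect rules of such a rule file combine can
   be sketched in Python.  This is a simplified model under stated
   assumptions, not the server's code: template matching is
   approximated with fnmatch, only the defprot and protect rules are
   interpreted (map, fail and ACL-triggered protection are ignored),
   and the function and variable names are hypothetical.

```python
from fnmatch import fnmatchcase

# (rule, template, setup file or None, uid.gid or None)
RULES = [
    ("defprot", "*",              "/WWW/httpd.prot",             None),
    ("protect", "/priv/*",        "/WWW/priv/httpd.prot",        "foo.bar"),
    ("protect", "/priv/secret/*", "/WWW/priv/secret/httpd.prot", "foo.bar"),
]

def resolve_protection(rules, url):
    """Return (setup_file, uid_gid) if url ends up protected, else None."""
    default = None       # settings of the most recently matched defprot
    protection = None    # settings of the most recently matched protect
    for kind, template, setup, ids in rules:
        if kind not in ("defprot", "protect"):
            continue
        if not fnmatchcase(url, template):
            continue
        if kind == "defprot":
            default = (setup, ids)          # a later defprot overrides
        elif setup is None:
            if default is None:
                raise ValueError("protect without setup file, no defprot matched")
            protection = default            # fall back to defprot settings
        else:
            protection = (setup, ids)       # a later protect overrides
    return protection

print(resolve_protection(RULES, "/priv/secret/plan.html"))
print(resolve_protection(RULES, "/index.html"))
```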
   
                                                    AL 14 October 1993
                                                                      
  SERVER SIDE ACCESS AUTHORIZATION DESCRIPTION
  
   The exact server side Access Authorization procedures are described
   in the corresponding protection scheme specification:
   
      Basic Protection Scheme (basic) and
      
      Public Key Protection Scheme (pubkey)
      
   Because the Unix file system, with its soft and hard links, makes
   it easy to access a file from a directory other than the one where
   the file actually resides, the server needs to use the Unix file
   system protections to its advantage. The Unix file system must
   therefore provide the protection between the collaborations using
   the same machine, and the server sets its process uid and gid
   according to which set of files is currently being served.
   
    Forking and Process uid and gid
    
   The server can be standalone, in which case it forks another copy
   of itself and after that sets its user and group ids. (Forking is
   necessary because once a process has set its user-id to something
   other than root it cannot change back.) If the server is run by
   inetd (inet daemon) there is no need for forking.
   
   If users on the server machine can be trusted, files can have world
   (or group) read permission, and the server can run as nobody (or
   with an appropriate group id). In this case there is no need to
   fork even when running standalone.
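   The fork-then-change-ids pattern can be sketched in Python with the
   standard os module.  This is an illustration rather than the
   server's implementation: the function names are hypothetical, the
   setgid and setuid parameters exist only so that the call order can
   be exercised without root, and supplementary groups are not
   handled.

```python
import os

def drop_privileges(uid, gid, setgid=os.setgid, setuid=os.setuid):
    """Give up root for good: group id first, then user id."""
    setgid(gid)   # must happen while the process is still root
    setuid(uid)   # after this the process cannot change back to root

def serve_as(uid, gid, handler):
    """Fork; the child changes ids and runs handler, the parent stays root."""
    pid = os.fork()
    if pid == 0:                        # child process
        try:
            drop_privileges(uid, gid)
            handler()
        finally:
            os._exit(0)                 # never fall back into the parent's code
    os.waitpid(pid, 0)                  # a real server would not block here
```

   The group id is changed before the user id because a process that
   is no longer root is not allowed to change its group.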
   
                                                    AL 14 October 1993
                                                                      
  SERVER'S PUBLIC AND PRIVATE KEYS
  
   The server's public and private keys must remain the same for a
   reasonably long time because, in principle, every time the keys are
   changed it is likely that one or two clients are just waiting
   for their users to type in their usernames and passwords. When
   they have finished, the authorization string is encrypted with the
   old public key, leading to an Unauthorized status from the
   server even though the user may well be authorized.



   The server might accept data encrypted with either of the keys for
   a while, but this would introduce state to the server, and would
   complicate things too much for something that is really not that
   vital.
   
   Furthermore, if the keys keep changing all the time (say once a
   minute, or even every ten minutes), the browser will practically
   always have to first fail trying to access a document to get the
   new public key, and then use it to encrypt the authorization
   information again. It must of course generate a new encryption key
   when doing so, because otherwise the material encrypted with the
   new public key would be exactly the same as that encrypted with
   the old key, and having two different encryptions of the same
   message makes the system easier to break.
   
   Since public key encryption can be considered rather safe for a
   period of even years, it is reasonable to say that the server
   need not change its public and private keys more often than, say,
   every couple of weeks.
   
   On Suns, if the server is run by inetd which only starts the server
   when someone is requesting a connection to it, i.e., the server is
   not running all the time, there may be a separate program updating
   the keys either regularly (run by cron), or during the system init
   (run from /etc/rc.local).
   
   On other platforms, especially those not providing multiple
   processes, the key update has to be done either once at the server
   startup, or if the server is not booted often enough (and why would
   it be?) the server itself must do this task regularly.
   
   The private key must be kept in a directory with no world or group
   permissions under the WWW server pseudo account's home directory,
   in a file with no world or group permissions either.
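   One way to set up such a directory and file is sketched below in
   Python.  The directory name "keys", the file name "private.key" and
   the function name are assumptions made for the example; the
   document does not prescribe them.

```python
import os
import stat
import tempfile

def store_private_key(home, key_bytes, dirname="keys", filename="private.key"):
    """Write key_bytes under home so that only the owner can reach it."""
    key_dir = os.path.join(home, dirname)
    os.makedirs(key_dir, mode=0o700, exist_ok=True)
    os.chmod(key_dir, 0o700)            # be explicit, whatever the umask is
    path = os.path.join(key_dir, filename)
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key_bytes)
    os.chmod(path, 0o600)               # also covers a pre-existing file
    return path

with tempfile.TemporaryDirectory() as home:
    path = store_private_key(home, b"not a real key")
    print(oct(stat.S_IMODE(os.stat(path).st_mode)))
```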
   
                                                    AL 14 October 1993
                                                                      
  PRINTABLE ENCODING
  
   Encoding into printable characters is done as defined in RFC 1421.
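   The printable encoding of RFC 1421 uses the same 64-character
   alphabet and "=" padding as what is now commonly called base64,
   with the output broken into lines of 64 characters.  A minimal
   Python sketch, assuming the standard base64 module and using
   hypothetical function names:

```python
import base64

def printable_encode(data: bytes, width: int = 64) -> str:
    """Encode data into printable characters, 64 characters per line."""
    text = base64.b64encode(data).decode("ascii")
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))

def printable_decode(text: str) -> bytes:
    """Reverse printable_encode, ignoring line breaks and other whitespace."""
    return base64.b64decode("".join(text.split()))

print(printable_encode(b"hello"))
```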
   
                                                    AL 14 October 1993
                                                                      
  VOCABULARY
  
    A
    
  AA                     Access Authorization
                         
  Asymmetric Cryptography
                         Cryptography in which two keys are used: one
                         encrypts and the other decrypts, and the
                         roles can also be reversed.  Moreover,
                         something that has been encrypted with one
                         of the keys can be decrypted only with the
                         other one.
                         
    D
    
  DEK                    Data Encryption Key. A single (symmetric) key
                         used to encrypt the document (but not the
                         headers) sent by the server.
                         
    H
    
  HTTP                   HyperText Transfer Protocol.
                         
    P
    
  PEM                    Privacy Enhanced Mail.
                         
  Private Key            The secret component of the two keys used in
                         public key encryption.
                         
  Public Key             The public component of the two keys used in
                         public key encryption.
                         
  Public Key Cryptography
                         See Asymmetric Cryptography.
                         
    S
    
  Secret Key             The single key used in symmetric
                         cryptography.
                         
  Symmetric Cryptography
                         Encryption and decryption done by the same
                         (secret) key.
                         
    U
    
  URL                    Universal Resource Locator.
                         
                                                    AL 14 October 1993
                                                                      

Network Working Group                                            J. Linn
Request for Comments: 1421                    IAB IRTF PSRG, IETF PEM WG
Obsoletes: 1113                                            February 1993


           Privacy Enhancement for Internet Electronic Mail:
        Part I: Message Encryption and Authentication Procedures



Status of this Memo

   This RFC specifies an IAB standards track protocol for the Internet
   community, and requests discussion and suggestions for improvements.
   Please refer to the current edition of the "IAB Official Protocol
   Standards" for the standardization state and status of this protocol.
   Distribution of this memo is unlimited.

Acknowledgements

   This document is the outgrowth of a series of meetings of the Privacy
   and Security Research Group (PSRG) of the IRTF and the PEM Working
   Group of the IETF.  I would like to thank the members of the PSRG and
   the IETF PEM WG, as well as all participants in discussions on the
   "pem-dev@tis.com" mailing list, for their contributions to this
   document.

1.  Executive Summary

   This document defines message encryption and authentication
   procedures, in order to provide privacy-enhanced mail (PEM) services
   for electronic mail transfer in the Internet.  It is intended to
   become one member of a related set of four RFCs.  The procedures
   defined in the current document are intended to be compatible with a
   wide range of key management approaches, including both symmetric
   (secret-key) and asymmetric (public-key) approaches for encryption of
   data encrypting keys.  Use of symmetric cryptography for message text
   encryption and/or integrity check computation is anticipated. RFC
   1422 specifies supporting key management mechanisms based on the use
   of public-key certificates.  RFC 1423 specifies algorithms, modes,
   and associated identifiers relevant to the current RFC and to RFC
   1422.  RFC 1424 provides details of paper and electronic formats and
   procedures for the key management infrastructure being established in
   support of these services.

   Privacy enhancement services (confidentiality, authentication,
   message integrity assurance, and non-repudiation of origin) are
   offered through the use of end-to-end cryptography between originator
   and recipient processes at or above the User Agent level.  No special
   processing requirements are imposed on the Message Transfer System at
   endpoints or at intermediate relay sites.  This approach allows
   privacy enhancement facilities to be incorporated selectively on a
   site-by-site or user-by-user basis without impact on other Internet
   entities.  Interoperability among heterogeneous components and mail
   transport facilities is supported.

   The current specification's scope is confined to PEM processing
   procedures for the RFC-822 textual mail environment, and defines the
   Content-Domain indicator value "RFC822" to signify this usage.
   Follow-on work in integration of PEM capabilities with other
   messaging environments (e.g., MIME) is anticipated and will be
   addressed in separate and/or successor documents, at which point
   additional Content-Domain indicator values will be defined.

2.  Terminology

   For descriptive purposes, this RFC uses some terms defined in the OSI
   X.400 Message Handling System Model per the CCITT Recommendations.
   This section replicates a portion of (1984) X.400's Section 2.2.1,
   "Description of the MHS Model: Overview" in order to make the
   terminology clear to readers who may not be familiar with the OSI MHS
   Model.

   In the [MHS] model, a user is a person or a computer application.  A
   user is referred to as either an originator (when sending a message)
   or a recipient (when receiving one).  MH Service elements define the
   set of message types and the capabilities that enable an originator
   to transfer messages of those types to one or more recipients.

   An originator prepares messages with the assistance of his or her
   User Agent (UA).  A UA is an application process that interacts with
   the Message Transfer System (MTS) to submit messages.  The MTS
   delivers to one or more recipient UAs the messages submitted to it.
   Functions performed solely by the UA and not standardized as part of
   the MH Service elements are called local UA functions.

   The MTS is composed of a number of Message Transfer Agents (MTAs).
   Operating together, the MTAs relay messages and deliver them to the
   intended recipient UAs, which then make the messages available to the
   intended recipients.

   The collection of UAs and MTAs is called the Message Handling System
   (MHS).  The MHS and all of its users are collectively referred to as
   the Message Handling Environment.

3.  Services, Constraints, and Implications

   This RFC defines mechanisms to enhance privacy for electronic mail
   transferred in the Internet. The facilities discussed in this RFC
   provide privacy enhancement services on an end-to-end basis between
   originator and recipient processes residing at the UA level or above.
   No privacy enhancements are offered for message fields which are
   added or transformed by intermediate relay points between PEM
   processing components.

   If an originator elects to perform PEM processing on an outbound
   message, all PEM-provided security services are applied to the PEM
   message's body in its entirety; selective application to portions of
   a PEM message is not supported. Authentication, integrity, and (when
   asymmetric key management is employed) non-repudiation of origin
   services are applied to all PEM messages; confidentiality services
   are optionally selectable.

   In keeping with the Internet's heterogeneous constituencies and usage
   modes, the measures defined here are applicable to a broad range of
   Internet hosts and usage paradigms.  In particular, it is worth
   noting the following attributes:

        1.  The mechanisms defined in this RFC are not restricted to a
            particular host or operating system, but rather allow
            interoperability among a broad range of systems.  All
            privacy enhancements are implemented at the application
            layer, and are not dependent on any privacy features at
            lower protocol layers.

        2.  The defined mechanisms are compatible with non-enhanced
            Internet components.  Privacy enhancements are implemented
            in an end-to-end fashion which does not impact mail
            processing by intermediate relay hosts which do not
            incorporate privacy enhancement facilities.  It is
            necessary, however, for a message's originator to be
            cognizant of whether a message's intended recipient
            implements privacy enhancements, in order that encoding and
            possible encryption will not be performed on a message whose
            destination is not equipped to perform corresponding inverse
            transformations.  (Section 4.6.1.1.3 of this RFC describes a
            PEM message type ("MIC-CLEAR") which represents a signed,
            unencrypted PEM message in a form readable without PEM
            processing capabilities yet validatable by PEM-equipped
            recipients.)

        3.  The defined mechanisms are compatible with a range of mail
            transport facilities (MTAs).  Within the Internet,
            electronic mail transport is effected by a variety of SMTP
            [2] implementations.  Certain sites, accessible via SMTP,
            forward mail into other mail processing environments (e.g.,
            USENET, CSNET, BITNET).  The privacy enhancements must be
            able to operate across the SMTP realm; it is desirable that
            they also be compatible with protection of electronic mail
            sent between the SMTP environment and other connected
            environments.

        4.  The defined mechanisms are compatible with a broad range of
            electronic mail user agents (UAs).  A large variety of
            electronic mail user agent programs, with a corresponding
            broad range of user interface paradigms, is used in the
            Internet.  In order that electronic mail privacy
            enhancements be available to the broadest possible user
            community, selected mechanisms should be usable with the
            widest possible variety of existing UA programs.  For
            purposes of pilot implementation, it is desirable that
            privacy enhancement processing be incorporable into a
            separate program, applicable to a range of UAs, rather than
            requiring internal modifications to each UA with which PEM
            services are to be provided.

        5.  The defined mechanisms allow electronic mail privacy
            enhancement processing to be performed on personal computers
            (PCs) separate from the systems on which UA functions are
            implemented.  Given the expanding use of PCs and the limited
            degree of trust which can be placed in UA implementations on
            many multi-user systems, this attribute can allow many users
            to process PEM with a higher assurance level than a strictly
            UA-integrated approach would allow.

        6.  The defined mechanisms support privacy protection of
            electronic mail addressed to mailing lists (distribution
            lists, in ISO parlance).

        7.  The mechanisms defined within this RFC are compatible with a
            variety of supporting key management approaches, including
            (but not limited to) manual pre-distribution, centralized
            key distribution based on symmetric cryptography, and the
            use of public-key certificates per RFC 1422.  Different
            key management mechanisms may be used for different
            recipients of a multicast message.  For two PEM
            implementations to interoperate, they must share a common
            key management mechanism; support for the mechanism defined
            in RFC 1422 is strongly encouraged.

   In order to achieve applicability to the broadest possible range of
   Internet hosts and mail systems, and to facilitate pilot
   implementation and testing without the need for prior and pervasive
   modifications throughout the Internet, the following design
   principles were applied in selecting the set of features specified in
   this RFC:

        1.  This RFC's measures are restricted to implementation at
            endpoints and are amenable to integration with existing
            Internet mail protocols at the user agent (UA) level or
            above, rather than necessitating modifications to existing
            mail protocols or integration into the message transport
            system (e.g., SMTP servers).

        2.  The set of supported measures enhances rather than restricts
            user capabilities.  Trusted implementations, incorporating
            integrity features protecting software from subversion by
            local users, cannot be assumed in general.  No mechanisms
            are assumed to prevent users from sending, at their
            discretion, messages to which no PEM processing has been
            applied. In the absence of such features, it appears more
            feasible to provide facilities which enhance user services
            (e.g., by protecting and authenticating inter-user traffic)
            than to enforce restrictions (e.g., inter-user access
            control) on user actions.

        3.  The set of supported measures focuses on a set of functional
            capabilities selected to provide significant and tangible
            benefits to a broad user community.  By concentrating on the
            most critical set of services, we aim to maximize the added
            privacy value that can be provided with a modest level of
            implementation effort.

   Based on these principles, the following facilities are provided:

        1.  disclosure protection,

        2.  originator authenticity,

        3.  message integrity measures, and

        4.  (if asymmetric key management is used) non-repudiation of
            origin,

   but the following privacy-relevant concerns are not addressed:

        1.  access control,

        2.  traffic flow confidentiality,

        3.  address list accuracy,

        4.  routing control,

        5.  issues relating to the casual serial reuse of PCs by
            multiple users,

        6.  assurance of message receipt and non-deniability of receipt,

        7.  automatic association of acknowledgments with the messages
            to which they refer, and

        8.  message duplicate detection, replay prevention, or other
            stream-oriented services.

4.  Processing of Messages

4.1  Message Processing Overview

   This subsection provides a high-level overview of the components and
   processing steps involved in electronic mail privacy enhancement
   processing.  Subsequent subsections will define the procedures in
   more detail.

4.1.1  Types of Keys

   A two-level keying hierarchy is used to support PEM transmission:

        1.  Data Encrypting Keys (DEKs) are used for encryption of
            message text and (with certain choices among a set of
            alternative algorithms) for computation of message integrity
            check (MIC) quantities.  In the asymmetric key management
            environment, DEKs are also used to encrypt the signed
            representations of MICs in PEM messages to which
            confidentiality has been applied. DEKs are generated
            individually for each transmitted message; no
            predistribution of DEKs is needed to support PEM
            transmission.

        2.  Interchange Keys (IKs) are used to encrypt DEKs for
            transmission within messages.  Ordinarily, the same IK will
            be used for all messages sent from a given originator to a
            given recipient over a period of time.  Each transmitted
            message includes a representation of the DEK(s) used for
            message encryption and/or MIC computation, encrypted under
            an individual IK per named recipient.  The representation is
            associated with Originator-ID and Recipient-ID fields
            (defined in different forms so as to distinguish symmetric
            from asymmetric cases), which allow each individual
            recipient to identify the IK used to encrypt DEKs and/or
            MICs for that recipient's use.  Given an appropriate IK, a
            recipient can decrypt the corresponding transmitted DEK
            representation, yielding the DEK required for message text
            decryption and/or MIC validation.  The definition of an IK
            differs depending on whether symmetric or asymmetric
            cryptography is used for DEK encryption:

                 2a. When symmetric cryptography is used for DEK
                     encryption, an IK is a single symmetric key shared
                     between an originator and a recipient.  In this
                     case, the same IK is used to encrypt MICs as well
                     as DEKs for transmission.  Version/expiration
                     information and IA identification associated with
                     the originator and with the recipient must be
                     concatenated in order to fully qualify a symmetric
                     IK.

                 2b. When asymmetric cryptography is used, the IK
                     component used for DEK encryption is the public
                     component [8] of the recipient.  The IK component
                     used for MIC encryption is the private component of
                     the originator, and therefore only one encrypted
                     MIC representation need be included per message,
                     rather than one per recipient.  Each of these IK
                     components can be fully qualified in a Recipient-ID
                     or Originator-ID field, respectively.
                     Alternatively, an originator's IK component may be
                     determined from a certificate carried in an
                     "Originator-Certificate:" field.

4.1.2  Processing Procedures

   When PEM processing is to be performed on an outgoing message, a DEK
   is generated [1] for use in message encryption and (if a chosen MIC
   algorithm requires a key) a variant of the DEK is formed for use in
   MIC computation.  DEK generation can be omitted for the case of a
   message where confidentiality is not to be applied, unless a chosen
   MIC computation algorithm requires a DEK.  Other parameters (e.g.,
   Initialization Vectors (IVs)) as required by selected encryption
   algorithms are also generated.

   One or more Originator-ID and/or "Originator-Certificate:" fields are
   included in a PEM message's encapsulated header to provide recipients
   with an identification component for the IK(s) used for message
   processing.  All of a message's Originator-ID and/or "Originator-
   Certificate:" fields are assumed to correspond to the same principal;
   the facility for inclusion of multiple such fields accommodates the
   prospect that different keys, algorithms, and/or certification paths
   may be required for processing by different recipients.  When a
   message includes recipients for which asymmetric key management is
   employed as well as recipients for which symmetric key management is
   employed, a separate Originator-ID or "Originator-Certificate:" field
   precedes each set of recipients.

   In the symmetric case, per-recipient IK components are applied for
   each individually named recipient in preparation of ENCRYPTED, MIC-
   ONLY, and MIC-CLEAR messages. A corresponding "Recipient-ID-
   Symmetric:" field, interpreted in the context of the most recent
   preceding "Originator-ID-Symmetric:" field, serves to identify each
   IK.  In the asymmetric case, per-recipient IK components are applied
   only for ENCRYPTED messages, are independent of originator-oriented
   header elements, and are identified by "Recipient-ID-Asymmetric:"
   fields.  Each Recipient-ID field is followed by a "Key-Info:" field,
   which transfers the message's DEK encrypted under the IK appropriate
   for the specified recipient.
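
   The two-level keying arrangement can be illustrated with a minimal
   sketch.  It is not part of this specification: the function names
   toy_wrap and prepare_key_info are hypothetical, the recipient names
   and IK values are invented, and the XOR-based wrapping is only a
   placeholder for the key encryption algorithms that RFC 1423 defines.
   The sketch shows one fresh DEK being generated per message and then
   wrapped separately under each named recipient's IK.

```python
import hashlib
import secrets


def toy_wrap(ik: bytes, data: bytes) -> bytes:
    """Placeholder for IK encryption: XOR with a SHA-256-derived pad.

    NOT a real cipher. XOR is its own inverse, so the same call
    also unwraps. Only suitable for data of 32 octets or fewer.
    """
    pad = hashlib.sha256(ik).digest()
    return bytes(b ^ p for b, p in zip(data, pad))


def prepare_key_info(recipient_iks: dict[str, bytes]) -> tuple[bytes, dict[str, bytes]]:
    """Generate one fresh DEK and wrap it once per named recipient."""
    dek = secrets.token_bytes(8)  # generated per message, never predistributed
    wrapped = {name: toy_wrap(ik, dek) for name, ik in recipient_iks.items()}
    return dek, wrapped


# Hypothetical recipients and IKs, for illustration only.
iks = {"alice": b"ik-shared-with-alice", "bob": b"ik-shared-with-bob"}
dek, key_info = prepare_key_info(iks)

# Each recipient recovers the same DEK using only its own IK.
recovered = {name: toy_wrap(iks[name], blob) for name, blob in key_info.items()}
```

   The per-recipient dictionary plays the role of the sequence of
   "Key-Info:" fields: the message text is encrypted once, while only
   the short DEK is repeated for each recipient.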

   When symmetric key management is used for a given recipient, the
   "Key-Info:" field following the corresponding "Recipient-ID-
   Symmetric:" field also transfers the message's computed MIC,
   encrypted under the recipient's IK. When asymmetric key management is
   used, a "MIC-Info:" field associated with an "Originator-ID-
   Asymmetric:" or "Originator-Certificate:" field carries the message's
   MIC, asymmetrically signed using the private component of the
   originator.  If the PEM message is of type ENCRYPTED (as defined in
   Section 4.6.1.1.1 of this RFC), the asymmetrically signed MIC is
   symmetrically encrypted using the same DEK, algorithm, encryption
   mode and other cryptographic parameters as used to encrypt the
   message text, prior to inclusion in the "MIC-Info:" field.

4.1.2.1  Processing Steps

   A four-phase transformation procedure is employed in order to
   represent encrypted message text in a universally transmissible form
   and to enable messages encrypted on one type of host computer to be
   decrypted on a different type of host computer.  A plaintext message
   is accepted in local form, using the host's native character set and
   line representation.  The local form is converted to a canonical
   message text representation, defined as equivalent to the inter-SMTP
   representation of message text.  This canonical representation forms
   the input to the MIC computation step (applicable to ENCRYPTED, MIC-
   ONLY, and MIC-CLEAR messages) and the encryption process (applicable
   to ENCRYPTED messages only).



   For ENCRYPTED PEM messages, the canonical representation is padded as
   required by the encryption algorithm, and this padded canonical
   representation is encrypted. The encrypted text (for an ENCRYPTED
   message) or the unpadded canonical form (for a MIC-ONLY message) is
   then encoded into a printable form.  The printable form is composed
   of a restricted character set which is chosen to be universally
   representable across sites, and which will not be disrupted by
   processing within and between MTS entities. MIC-CLEAR PEM messages
   omit the printable encoding step.

   The output of the previous processing steps is combined with a set of
   header fields carrying cryptographic control information.  The
   resulting PEM message is passed to the electronic mail system to be
   included within the text portion of a transmitted message. There is
   no requirement that a PEM message comprise the entirety of an MTS
   message's text portion; this allows PEM-protected information to be
   accompanied by (unprotected) annotations.  It is also permissible for
   multiple PEM messages (and associated unprotected text, outside the
   PEM message boundaries) to be represented within the encapsulated
   text of a higher-level PEM message. PEM message signatures are
   forwardable when asymmetric key management is employed; an authorized
   recipient of a PEM message with confidentiality applied can reduce
   that message to a signed but unencrypted form for forwarding purposes
   or can re-encrypt that message for subsequent transmission.

   When a PEM message is received, the cryptographic control fields
   within its encapsulated header provide the information required for
   each authorized recipient to perform MIC validation and decryption of
   the received message text.  For ENCRYPTED and MIC-ONLY messages, the
   printable encoding is converted to a bitstring.  Encrypted portions
   of the transmitted message are decrypted.  The MIC is validated.
   Then, the recipient PEM process converts the canonical representation
   to its appropriate local form.

4.1.2.2  Error Cases

   A variety of error cases may occur and be detected in the course of
   processing a received PEM message. The specific actions to be taken
   in response to such conditions are local matters, varying as
   functions of user preferences and the type of user interface provided
   by a particular PEM implementation, but certain general
   recommendations are appropriate. Syntactically invalid PEM messages
   should be flagged as such, preferably with collection of diagnostic
   information to support debugging of incompatibilities or other
   failures.  RFC 1422 defines specific error processing requirements
   relevant to the certificate-based key management mechanisms defined
   therein.




   Syntactically valid PEM messages which yield MIC failures raise
   special concern, as they may result from attempted attacks or forged
   messages.  As such, it is unsuitable to display their contents to
   recipient users without first indicating the fact that the contents'
   authenticity and integrity cannot be guaranteed and then receiving
   positive user confirmation of such a warning.  MIC-CLEAR messages
   (discussed in Section 4.6.1.1.3 of this RFC) raise special concerns,
   as MIC failures on such messages may occur for a broader range of
   benign causes than are applicable to other PEM message types.
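
   One possible way to honor the recommendation on MIC failures is
   sketched below.  The function name present_message, the warning
   wording, and the callback-based confirmation are all assumptions
   made for illustration; actual handling is a local matter, as noted
   above.

```python
from typing import Callable, Optional

# Hypothetical warning text; an implementation would choose its own wording.
MIC_WARNING = (
    "MIC validation failed: the authenticity and integrity of this "
    "message cannot be guaranteed. Display it anyway?"
)


def present_message(
    text: str, mic_ok: bool, confirm: Callable[[str], bool]
) -> Optional[str]:
    """Return the text to display, or None if it must be withheld.

    `confirm` stands for whatever user-interface prompt is available;
    it receives the warning and returns True only on positive
    confirmation by the user.
    """
    if mic_ok:
        return text
    # Warn first; show the contents only after explicit confirmation.
    return text if confirm(MIC_WARNING) else None
```

   Passing the prompt in as a callback keeps the policy independent of
   any particular user interface.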

4.2  Encryption Algorithms, Modes, and Parameters

   For use in conjunction with this RFC, RFC 1423 defines the
   appropriate algorithms, modes, and associated identifiers to be used
   for encryption of message text with DEKs.

   The mechanisms defined in this RFC incorporate facilities for
   transmission of cryptographic parameters (e.g., pseudorandom
   Initializing Vectors (IVs)) with PEM messages to which the
   confidentiality service is applied, when required by symmetric
   message encryption algorithms and modes specified in RFC 1423.

   Certain operations require encryption of DEKs, MICs, and digital
   signatures under an IK for purposes of transmission.  A header
   facility indicates the mode in which the IK is used for encryption.
   RFC 1423 specifies encryption algorithm and mode identifiers and
   minimum essential support requirements for key encryption processing.

   RFC 1422 specifies asymmetric, certificate-based key management
   procedures based on CCITT Recommendation X.509 to support the message
   processing procedures defined in this document. Support for the key
   management approach defined in RFC 1422 is strongly recommended.  The
   message processing procedures can also be used with symmetric key
   management, given prior distribution of suitable symmetric IKs, but
   no current RFCs specify key distribution procedures for such IKs.

4.3  Privacy Enhancement Message Transformations

4.3.1  Constraints

   An electronic mail encryption mechanism must be compatible with the
   transparency constraints of its underlying electronic mail
   facilities.  These constraints are generally established based on
   expected user requirements and on the characteristics of anticipated
   endpoint and transport facilities.  An encryption mechanism must also
   be compatible with the local conventions of the computer systems
   which it interconnects.  Our approach uses a canonicalization step to
   abstract out local conventions and a subsequent encoding step to
   conform to the characteristics of the underlying mail transport
   medium (SMTP).  The encoding conforms to SMTP constraints.  Section
   4.5 of RFC 821 [2] details SMTP's transparency constraints.

   To prepare a message for SMTP transmission, the following
   requirements must be met:

        1.  All characters must be members of the 7-bit ASCII character
            set.

        2.  Text lines, delimited by the character pair <CR><LF>, must
            be no more than 1000 characters long.

        3.  Since the string <CR><LF>.<CR><LF> indicates the end of a
            message, it must not occur in text prior to the end of a
            message.
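
   The three requirements above can be checked mechanically.  The
   following is a minimal sketch rather than a normative test: the
   function name smtp_violations is hypothetical, and the line length
   is counted here without the <CR><LF> delimiter, which is an
   assumption about how the 1000-character limit is measured.

```python
def smtp_violations(message: bytes) -> list[str]:
    """Return a list of SMTP transparency problems found in `message`.

    `message` is the text to be handed to SMTP, with lines delimited
    by <CR><LF>. An empty list means none of the three checks failed.
    """
    problems = []

    # Requirement 1: every octet must be 7-bit ASCII.
    if any(octet > 0x7F for octet in message):
        problems.append("octet outside 7-bit ASCII")

    # Requirement 2: no line longer than 1000 characters
    # (assumption: counted without the <CR><LF> delimiter).
    if any(len(line) > 1000 for line in message.split(b"\r\n")):
        problems.append("line longer than 1000 characters")

    # Requirement 3: the end-of-message marker must not appear in the text.
    if b"\r\n.\r\n" in message:
        problems.append("<CR><LF>.<CR><LF> occurs within the text")

    return problems
```

   For example, smtp_violations(b"hello\r\nworld\r\n") reports no
   problems, while a body containing a line that consists of a single
   "." is reported under the third check.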

   Although SMTP specifies a standard representation for line delimiters
   (ASCII <CR><LF>), numerous systems in the Internet use a different
   native representation to delimit lines.  For example, the <CR><LF>
   sequences delimiting lines in mail inbound to UNIX systems are
   transformed to single <LF>s as mail is written into local mailbox
   files.  Lines in mail incoming to record-oriented systems (such as
   VAX VMS) may be converted to appropriate records by the destination
   SMTP server [3].  As a result, if the encryption process generated
   <CR>s or <LF>s, those characters might not be accessible to a
   recipient UA program at a destination which uses different line
   delimiting conventions.  It is also possible that conversion between
   tabs and spaces may be performed in the course of mapping between
   inter-SMTP and local format; this is a matter of local option.  If
   such transformations changed the form of transmitted ciphertext,
   decryption would fail to regenerate the transmitted plaintext, and a
   transmitted MIC would fail to compare with that computed at the
   destination.
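
   The effect of a line delimiter transformation on an integrity check
   can be demonstrated in a few lines.  This sketch uses SHA-256 purely
   as a convenient stand-in digest; it is not one of the MIC algorithms
   specified for PEM, and the sample text is invented.

```python
import hashlib

# Text as the originator computed the check over it: <CR><LF> delimiters.
canonical = b"Line one\r\nLine two\r\n"

# The same text after delivery into a UNIX-style mailbox: bare <LF>.
local_unix = canonical.replace(b"\r\n", b"\n")

check_sent = hashlib.sha256(canonical).hexdigest()
check_seen = hashlib.sha256(local_unix).hexdigest()

# The delimiter rewrite alone is enough to make the comparison fail.
mismatch = check_sent != check_seen

# Restoring the canonical delimiters before checking repairs it.
restored = local_unix.replace(b"\n", b"\r\n")
check_restored = hashlib.sha256(restored).hexdigest()
```

   No message content changed, yet the two digests differ; only after
   the <CR><LF> delimiters are restored does the check agree with the
   transmitted value.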

   The conversion performed by an SMTP server at a system with EBCDIC as
   a native character set has even more severe impact, since the
   conversion from EBCDIC into ASCII is an information-losing
   transformation.  In principle, the transformation function mapping
   between inter-SMTP canonical ASCII message representation and local
   format could be moved from the SMTP server up to the UA, given a
   means to direct that the SMTP server should no longer perform that
   transformation.  This approach has a major disadvantage: internal
   file (e.g., mailbox) formats would be incompatible with the native
   forms used on the systems where they reside.  Further, it would
   require modification to SMTP servers, as mail would be passed to SMTP
   in a different representation than it is passed at present.




4.3.2  Approach

   Our approach to supporting PEM across an environment in which
   intermediate conversions may occur defines an encoding for mail which
   is uniformly representable across the set of PEM UAs regardless of
   their systems' native character sets.  This encoded form is used (for
   specified PEM message types) to represent mail text in transit from
   originator to recipient, but the encoding is not applied to enclosing
   MTS headers or to encapsulated headers inserted to carry control
   information between PEM UAs.  The encoding's characteristics are such
   that the transformations anticipated between originator and recipient
   UAs will not prevent an encoded message from being decoded properly
   at its destination.

   Four transformation steps, described in the following four
   subsections, apply to outbound PEM message processing:

4.3.2.1  Step 1: Local Form

   This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
   MIC-CLEAR.  The message text is created in the system's native
   character set, with lines delimited in accordance with local
   convention.

4.3.2.2  Step 2: Canonical Form

   This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
   MIC-CLEAR.  The message text is converted to a universal canonical
   form, similar to the inter-SMTP representation [4] as defined in RFC
   821 [2] and RFC 822 [5]. The procedures performed in order to
   accomplish this conversion are dependent on the characteristics of
   the local form and so are not specified in this RFC.

   PEM canonicalization assures that the message text is represented
   with the ASCII character set and "<CR><LF>" line delimiters, but does
   not perform the dot-stuffing transformation discussed in RFC 821,
   Section 4.5.2.  Since a message is converted to a standard character
   set and representation before encryption, a transferred PEM message
   can be decrypted and its MIC can be validated at any type of
   destination host computer.  Decryption and MIC validation are
   performed before any conversions which may be necessary to transform
   the message into a destination-specific local form.
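
   Because the conversion procedure depends on the local form, this RFC
   does not specify it.  The following sketch shows one plausible
   procedure for a host whose local form is text with <LF>, <CR>, or
   <CR><LF> line ends; the function names and the choice of accepted
   line ends are assumptions made for illustration.

```python
def canonicalize(local_text: str) -> bytes:
    """Convert local text to ASCII with <CR><LF> line delimiters.

    No dot-stuffing is applied. Text that cannot be represented in
    ASCII raises UnicodeEncodeError rather than being altered silently.
    """
    # First reduce every accepted local line end to a bare <LF> ...
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    # ... then expand each <LF> to the canonical <CR><LF> pair.
    return unified.replace("\n", "\r\n").encode("ascii")


def decanonicalize(canonical: bytes, local_newline: str = "\n") -> str:
    """Convert canonical text back to a local form (default: bare <LF>)."""
    return canonical.decode("ascii").replace("\r\n", local_newline)
```

   A recipient would call decanonicalize only after decryption and MIC
   validation have been completed on the canonical form.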

4.3.2.3  Step 3: Authentication and Encryption

   Authentication processing is applicable to PEM message types
   ENCRYPTED, MIC-ONLY, and MIC-CLEAR.  The canonical form is input to
   the selected MIC computation algorithm in order to compute an
   integrity check quantity for the message.  No padding is added to the
   canonical form before submission to the MIC computation algorithm,
   although certain MIC algorithms will apply their own padding in the
   course of computing a MIC.

   Encryption processing is applicable only to PEM message type
   ENCRYPTED.  RFC 1423 defines the padding technique used to support
   encryption of the canonically-encoded message text.

4.3.2.4  Step 4: Printable Encoding

   This printable encoding step is applicable to PEM message types
   ENCRYPTED and MIC-ONLY.  The same processing is also employed in
   representation of certain specifically identified PEM encapsulated
   header field quantities as cited in Section 4.6.  Proceeding from
   left to right, the bit string resulting from step 3 is encoded into
   characters which are universally representable at all sites, though
   not necessarily with the same bit patterns (e.g., although the
   character "E" is represented in an ASCII-based system as hexadecimal
   45 and as hexadecimal C5 in an EBCDIC-based system, the local
   significance of the two representations is equivalent).
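
   The "E" example can be reproduced directly.  The sketch below
   assumes code page 037 as a representative EBCDIC variant; other
   EBCDIC code pages exist and are not examined here.

```python
# The same character under two different native character sets.
ascii_e = "E".encode("ascii")    # ASCII-based system
ebcdic_e = "E".encode("cp037")   # EBCDIC-based system (code page 037 assumed)

# Different bit patterns on the wire or on disk ...
patterns_differ = ascii_e != ebcdic_e

# ... but each decodes locally to the same character.
same_meaning = ascii_e.decode("ascii") == ebcdic_e.decode("cp037") == "E"
```

   This is why the printable encoding is defined in terms of
   characters rather than bit patterns.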

   A 64-character subset of International Alphabet IA5 is used, enabling
   6 bits to be represented per printable character.  (The proposed
   subset of characters is represented identically in IA5 and ASCII.)
   The character "=" signifies a special processing function used for
   padding within the printable encoding procedure.

   To represent the encapsulated text of a PEM message, the encoding
   function's output is delimited into text lines (using local
   conventions), with each line except the last containing exactly 64
   printable characters and the final line containing 64 or fewer
   printable characters.  (This line length is easily printable and is
   guaranteed to satisfy SMTP's 1000-character transmitted line length
   limit.) This folding requirement does not apply when the encoding
   procedure is used to represent PEM header field quantities; Section
   4.6 discusses folding of PEM encapsulated header fields.
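
   The line folding rule for encapsulated text can be sketched as
   follows.  The function name fold is hypothetical; the sketch returns
   a list of lines and leaves the choice of local line delimiter to the
   caller.

```python
def fold(encoded: str, width: int = 64) -> list[str]:
    """Split encoder output into lines of exactly `width` characters.

    Every line except possibly the last holds exactly `width`
    characters; the last holds `width` or fewer.
    """
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


# 130 encoded characters fold into two full lines and one short line.
line_lengths = [len(line) for line in fold("A" * 130)]
```

   Here line_lengths is [64, 64, 2].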

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right across
   a 24-bit input group extracted from the output of step 3, each 6-bit
   group is used as an index into an array of 64 printable characters.
   The character referenced by the index is placed in the output string.
   These characters, identified in Table 1, are selected so as to be
   universally representable, and the set excludes characters with
   particular significance to SMTP (e.g., ".", "<CR>", "<LF>").



   Special processing is performed if fewer than 24 bits are available
   in an input group at the end of a message.  A full encoding quantum
   is always completed at the end of a message.  When fewer than 24
   input bits are available in an input group, zero bits are added (on
   the right) to form an integral number of 6-bit groups.  Output
   character positions which are not required to represent actual input
   data are set to the character "=".  Since all canonically encoded
   output is an integral number of octets, only the following cases can
   arise: (1) the final quantum of encoding input is an integral
   multiple of 24 bits; here, the final unit of encoded output will be
   an integral multiple of 4 characters with no "=" padding, (2) the
   final quantum of encoding input is exactly 8 bits; here, the final
   unit of encoded output will be two characters followed by two "="
   padding characters, or (3) the final quantum of encoding input is
   exactly 16 bits; here, the final unit of encoded output will be three
   characters followed by one "=" padding character.
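
   The three padding cases can be observed with any encoder that uses
   the same 64-character set and "=" padding.  The sketch below relies
   on Python's standard base64 module, which uses this alphabet; the
   sample inputs are arbitrary.

```python
import base64

# Case (1): 24 bits of input, four output characters, no padding.
case_24_bits = base64.b64encode(b"abc").decode("ascii")

# Case (2): 8 bits of input, two characters followed by "==".
case_8_bits = base64.b64encode(b"a").decode("ascii")

# Case (3): 16 bits of input, three characters followed by "=".
case_16_bits = base64.b64encode(b"ab").decode("ascii")
```

   The results are "YWJj", "YQ==", and "YWI=" respectively; in every
   case the output length is a multiple of four characters.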

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1
       3 D            20 U            37 l            54 2
       4 E            21 V            38 m            55 3
       5 F            22 W            39 n            56 4
       6 G            23 X            40 o            57 5
       7 H            24 Y            41 p            58 6
       8 I            25 Z            42 q            59 7
       9 J            26 a            43 r            60 8
      10 K            27 b            44 s            61 9
      11 L            28 c            45 t            62 +
      12 M            29 d            46 u            63 /
      13 N            30 e            47 v
      14 O            31 f            48 w         (pad) =
      15 P            32 g            49 x
      16 Q            33 h            50 y

                   Printable Encoding Characters
                              Table 1
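
   The table-driven procedure can be written out explicitly.  The
   following is an illustrative sketch, not a reference implementation:
   the name printable_encode is hypothetical, and ALPHABET is simply
   the 64 characters of Table 1 in value order.

```python
import string

# Table 1 in value order: A-Z (0-25), a-z (26-51), 0-9 (52-61), "+", "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def printable_encode(data: bytes) -> str:
    """Encode octets as 4 printable characters per 24-bit input group."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Left-justify the group in 24 bits, adding zero bits on the right.
        bits = int.from_bytes(group, "big") << (8 * (3 - len(group)))
        # Each 6-bit slice, left to right, indexes the 64-character array.
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # 1, 2 or 3 input octets need 2, 3 or 4 characters; the rest is "=".
        used = len(group) + 1
        out.append("".join(chars[:used]) + "=" * (4 - used))
    return "".join(out)
```

   For inputs of any length this produces the same output as the
   standard base64 encoding, which shares the character set of Table 1.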

4.3.2.5  Summary of Transformations

   In summary, the outbound message is subjected to the following
   composition of transformations (or, for some PEM message types, a
   subset thereof):

         Transmit_Form = Encode(Encrypt(Canonicalize(Local_Form)))








   The inverse transformations are performed, in reverse order, to
   process inbound PEM messages:

       Local_Form = DeCanonicalize(Decipher(Decode(Transmit_Form)))

   Note that the local form and the functions to transform messages to
   and from canonical form may vary between the originator and recipient
   systems without loss of information.
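
   The two compositions can be exercised end to end with a small
   sketch.  Everything cryptographic here is a placeholder: toy_cipher
   is a repeating-key XOR used only to show where encryption sits in
   the chain, the padding step is omitted, and the local form is
   assumed to be text with bare <LF> line ends.  The function names
   are hypothetical.

```python
import base64
from itertools import cycle


def canonicalize(local_form: str) -> bytes:
    """Local form (bare LF line ends assumed) to ASCII with CR LF."""
    return local_form.replace("\n", "\r\n").encode("ascii")


def decanonicalize(canonical: bytes) -> str:
    return canonical.decode("ascii").replace("\r\n", "\n")


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR stream; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def to_transmit_form(local_form: str, key: bytes) -> str:
    # Transmit_Form = Encode(Encrypt(Canonicalize(Local_Form)))
    return base64.b64encode(toy_cipher(canonicalize(local_form), key)).decode("ascii")


def to_local_form(transmit_form: str, key: bytes) -> str:
    # Local_Form = DeCanonicalize(Decipher(Decode(Transmit_Form)))
    return decanonicalize(toy_cipher(base64.b64decode(transmit_form), key))
```

   A recipient whose local convention differs would substitute its own
   decanonicalize function; the transmitted form is unaffected.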

4.4  Encapsulation Mechanism

   The encapsulation techniques defined in RFC-934 [6] are adopted for
   encapsulation of PEM messages within separate enclosing MTS messages
   carrying associated MTS headers. This approach offers a number of
   advantages relative to a flat approach in which certain fields within
   a single header are encrypted and/or carry cryptographic control
   information.  As far as the MTS is concerned, the entirety of a PEM
   message will reside in an MTS message's text portion, not the MTS
   message's header portion. Encapsulation provides generality and
   segregates fields with user-to-user significance from those
   transformed in transit.  All fields inserted in the course of
   encryption/authentication processing are placed in the encapsulated
   header.  This facilitates compatibility with mail handling programs
   which accept only text, not header fields, from input files or from
   other programs.

   The encapsulation techniques defined in RFC-934 are consistent with
   existing Internet mail forwarding and bursting mechanisms.  These
   techniques are designed so that they may be used in a nested manner.
   The encapsulation techniques may be used to encapsulate one or more
   PEM messages for forwarding to a third party, possibly in conjunction
   with interspersed (non-PEM) text which serves to annotate the PEM
   messages.

   Two encapsulation boundaries (EB's) are defined for delimiting
   encapsulated PEM messages and for distinguishing encapsulated PEM
   messages from interspersed (non-PEM) text.  The pre-EB is the string
   "-----BEGIN PRIVACY-ENHANCED MESSAGE-----", indicating that an
   encapsulated PEM message follows.  The post-EB is either (1) another
   pre-EB indicating that another encapsulated PEM message follows, or
   (2) the string "-----END PRIVACY-ENHANCED MESSAGE-----" indicating
   that any text that immediately follows is non-PEM text.  A special
   point must be noted for the case of MIC-CLEAR messages, the text
   portions of which may contain lines which begin with the "-"
   character and which are therefore subject to special processing per
   RFC-934 forwarding procedures.  When the string "- " must be
   prepended to such a line in the course of a forwarding operation in
   order to distinguish that line from an encapsulation boundary, MIC
   computation is to be performed prior to prepending the "- " string.
   Figure 1 depicts the encapsulation of a single PEM message.
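
   The boundary rules lend themselves to a simple line scanner.  The
   sketch below is illustrative only: the function names are
   hypothetical, boundaries are matched as whole lines, nesting is not
   handled, and a message lacking a post-EB is silently dropped, which
   is a simplification rather than specified behavior.

```python
PRE_EB = "-----BEGIN PRIVACY-ENHANCED MESSAGE-----"
POST_EB = "-----END PRIVACY-ENHANCED MESSAGE-----"


def split_encapsulated(text: str) -> list[list[str]]:
    """Return the lines of each encapsulated PEM message found in `text`.

    Interspersed non-PEM text outside the boundaries is skipped.
    """
    messages: list[list[str]] = []
    current = None  # lines of the message being collected, if any
    for line in text.splitlines():
        if line == PRE_EB:
            if current is not None:
                # A pre-EB also serves as the post-EB of the prior message.
                messages.append(current)
            current = []
        elif line == POST_EB:
            if current is not None:
                messages.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return messages


def dash_stuff(line: str) -> str:
    """Prepend "- " to a line beginning with "-", as done on forwarding.

    For MIC-CLEAR text the MIC must be computed before this is applied.
    """
    return "- " + line if line.startswith("-") else line
```

   Each returned list holds the encapsulated header lines, the blank
   separator line, and the encapsulated text lines of one message, in
   the order shown in Figure 1.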

   This RFC places no a priori limits on the depth to which such
   encapsulation may be nested nor on the number of PEM messages which
   may be grouped in this fashion at a single nesting level for
   forwarding.  An implementation compliant with this RFC must not
   preclude a user from submitting or receiving PEM messages which
   exploit this encapsulation capability.  However, no specific
   requirements are levied upon implementations with regard to how this
   capability is made available to the user.  Thus, for example, a
   compliant PEM implementation is not required to automatically detect
   and process encapsulated PEM messages.

   In using this encapsulation facility, it is important to note that it
   is inappropriate to forward directly to a third party a message that
   is ENCRYPTED because recipients of such a message would not have
   access to the DEK required to decrypt the message.  Instead, the user
   forwarding the message must transform the ENCRYPTED message into a
   MIC-ONLY or MIC-CLEAR form prior to forwarding.  Thus, in order to
   comply with this RFC, a PEM implementation must provide a facility to
   enable a user to perform this transformation, while preserving the
   MIC associated with the original message.

   If a user wishes PEM-provided confidentiality protection for
   transmitted information, such information must occur in the
   encapsulated text of an ENCRYPTED PEM message, not in the enclosing
   MTS header or PEM encapsulated header. If a user wishes to avoid


















T. Berners-Lee                                                       82

Linn                                                           [Page 1
6]






















































T. Berners-Lee                                                       83

RFC 1421        Privacy Enhancement for Electronic Mail    February 19
93


   Encapsulated Message

       Pre-Encapsulation Boundary (Pre-EB)
           -----BEGIN PRIVACY-ENHANCED MESSAGE-----

       Encapsulated Header Portion
           (Contains encryption control fields inserted in plaintext.
           Examples include "DEK-Info:" and "Key-Info:".
           Note that, although these control fields have line-oriented
           representations similar to RFC 822 header fields, the set
           of fields valid in this context is disjoint from those used
           in RFC 822 processing.)

       Blank Line
           (Separates Encapsulated Header from subsequent
           Encapsulated Text Portion)

       Encapsulated Text Portion
           (Contains message data encoded as specified in Section 4.3.)

       Post-Encapsulation Boundary (Post-EB)
           -----END PRIVACY-ENHANCED MESSAGE-----


                   Encapsulated Message Format
                            Figure 1


   disclosing the actual subject of a message to unintended parties, it
   is suggested that the enclosing MTS header contain a "Subject:" field
   indicating that "Encrypted Mail Follows".
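
   The layout of Figure 1 can be illustrated with a short Python
   sketch that locates the two encapsulation boundaries and splits
   what lies between them at the first blank line.  The function name
   split_encapsulated and the abbreviated sample message are
   assumptions made for illustration only; the sample is not a complete
   PEM message.

```python
BEGIN = "-----BEGIN PRIVACY-ENHANCED MESSAGE-----"
END = "-----END PRIVACY-ENHANCED MESSAGE-----"


def split_encapsulated(message):
    """Return (header_lines, text_lines) for the first encapsulated message.

    Hypothetical helper: raises ValueError if either boundary is missing.
    """
    lines = message.splitlines()
    stripped = [line.strip() for line in lines]
    start = stripped.index(BEGIN)            # Pre-EB
    end = stripped.index(END, start + 1)     # Post-EB
    body = lines[start + 1:end]
    for i, line in enumerate(body):
        if not line.strip():                 # first blank line ends the header
            return body[:i], body[i + 1:]
    return body, []                          # header only, no text portion


# Abbreviated sample for illustration; real messages carry more fields.
sample = """\
-----BEGIN PRIVACY-ENHANCED MESSAGE-----
Proc-Type: 4,MIC-CLEAR
Content-Domain: RFC822

Hello.
-----END PRIVACY-ENHANCED MESSAGE-----
"""
header, text = split_encapsulated(sample)
```

   Header lines are returned unmodified so that folded continuation
   lines keep their leading whitespace for later processing.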

   If an integrity-protected representation of information which occurs
   within an enclosing header (not necessarily in the same format as
   that in which it occurs within that header) is desired, that data can
   be included within the encapsulated text portion in addition to its
   inclusion in the enclosing MTS header.  For example, an originator
   wishing to provide recipients with a protected indication of a
   message's position in a series of messages could include within the
   encapsulated text a copy of a timestamp or message counter value
   possessing end-to-end significance and extracted from an enclosing
   MTS header field.  (Note: mailbox specifiers as entered by end users
   incorporate local conventions and are subject to modification at
   intermediaries, so inclusion of such specifiers within encapsulated
   text should not be regarded as a suitable alternative to the
   authentication semantics defined in RFC 1422 and based on X.500
   Distinguished Names.) The set of header information (if any) included
   within the encapsulated text of messages is a local matter, and this
   RFC does not specify formatting conventions to distinguish replicated
   header fields from other encapsulated text.

4.5  Mail for Mailing Lists

   When mail is addressed to mailing lists, two different methods of
   processing can be applicable: the IK-per-list method and the IK-per-
   recipient method.  Hybrid approaches are also possible, as in the
   case of IK-per-list protection of a message on its path from an
   originator to a PEM-equipped mailing list exploder, followed by IK-
   per-recipient protection from the exploder to individual list
   recipients.

   If a message's originator is equipped to expand a destination mailing
   list into its individual constituents and elects to do so (IK-per-
   recipient), the message's DEK (and, in the symmetric key management
   case, MIC) will be encrypted under each per-recipient IK and all such
   encrypted representations will be incorporated into the transmitted
   message.  Note that per-recipient encryption is required only for the
   relatively small DEK and MIC quantities carried in the "Key-Info:"
   field, not for the message text which is, in general, much larger.
   Although more IKs are involved in processing under the IK-per-
   recipient method, the pairwise IKs can be individually revoked and
   possession of one IK does not enable a successful masquerade of
   another user on the list.
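
   The shape of IK-per-recipient processing can be sketched in Python
   as follows: the message has a single DEK, and only that small
   quantity is protected separately for each recipient.  The function
   toy_wrap is a deliberately insecure XOR stand-in for the real IK
   encryption algorithms (which are defined in RFC 1423); it, the
   mailbox names, and the key values are assumptions made purely for
   illustration.

```python
import hashlib
import os


def toy_wrap(ik, data):
    """Toy XOR stand-in for encryption under an IK.  NOT secure.

    Applying it twice with the same IK recovers the original data.
    """
    stream = hashlib.sha256(ik).digest()
    return bytes(b ^ s for b, s in zip(data, stream))


def wrap_dek_per_recipient(dek, recipient_iks):
    """Protect the one message DEK once under each recipient's IK."""
    return {name: toy_wrap(ik, dek).hex().upper()
            for name, ik in recipient_iks.items()}


dek = os.urandom(8)                       # one DEK for the whole message
iks = {                                   # hypothetical per-recipient IKs
    "alice@example.com": b"ik-alice",
    "bob@example.com": b"ik-bob",
}
wrapped = wrap_dek_per_recipient(dek, iks)
```

   Each entry of the result corresponds to the per-recipient quantity
   that would be carried in that recipient's "Key-Info:" field; the
   message text itself is encrypted only once, under the DEK.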

   If a message's originator addresses a message to a list name or
   alias, use of an IK associated with that name or alias as an entity
   (IK-per-list), rather than resolution of the name or alias to its
   constituent destinations, is implied. Such an IK must, therefore, be
   available to all list members. Unfortunately, it implies an
   undesirable level of exposure for the shared IK, and makes its
   revocation difficult.  Moreover, use of the IK-per-list method allows
   any holder of the list's IK to masquerade as another originator to
   the list for authentication purposes.

   Pure IK-per-list key management in the asymmetric case (with a common
   private key shared among multiple list members) is particularly
   disadvantageous in the asymmetric environment, as it fails to
   preserve the forwardable authentication and non-repudiation
   characteristics which are provided for other messages in this
   environment.  Use of a hybrid approach with a PEM-capable exploder is
   therefore particularly recommended for protection of mailing list
   traffic when asymmetric key management is used; such an exploder
   would reduce (per discussion in Section 4.4 of this RFC) incoming
   ENCRYPTED messages to MIC-ONLY or MIC-CLEAR form before forwarding
   them (perhaps re-encrypted under individual, per-recipient keys) to
   list members.

4.6  Summary of Encapsulated Header Fields

   This section defines the syntax and semantics of the encapsulated
   header fields to be added to messages in the course of privacy
   enhancement processing.

   The fields are presented in three groups.  Normally, the groups will
   appear in encapsulated headers in the order in which they are shown,
   though not all fields in each group will appear in all messages.  The
   following figures show the appearance of small example encapsulated
   messages.  Figure 2 assumes the use of symmetric cryptography for key
   management.  Figure 3 illustrates an example encapsulated ENCRYPTED
   message in which asymmetric key management is used.

   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,ENCRYPTED
   Content-Domain: RFC822
   DEK-Info: DES-CBC,F8143EDE5960C597
   Originator-ID-Symmetric: linn@zendia.enet.dec.com,,
   Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,3
   Key-Info: DES-ECB,RSA-MD2,9FD3AAD2F2691B9A,
             B70665BB9BF7CBCDA60195DB94F727D3
   Recipient-ID-Symmetric: pem-dev@tis.com,ptf-kmc,4
   Key-Info: DES-ECB,RSA-MD2,161A3F75DC82EF26,
             E2EF532C65CBCFF79F83A2658132DB47

   LLrHB0eJzyhP+/fSStdW8okeEnv47jxe7SJ/iN72ohNcUk2jHEUSoH1nvNSIWL9M
   8tEjmF/zxB+bATMtPjCUWbz8Lr9wloXIkjHUlBLpvXR0UrUzYbkNpk0agV2IzUpk
   J6UiRRGcDSvzrsoK+oNvqu6z7Xs5Xfz5rDqUcMlK1Z6720dcBWGGsDLpTpSCnpot
   dXd/H5LMDWnonNvPCwQUHt==
   -----END PRIVACY-ENHANCED MESSAGE-----

          Example Encapsulated Message (Symmetric Case)
                            Figure 2


   Figure 4 illustrates an example encapsulated MIC-ONLY message in
   which asymmetric key management is used; since no per-recipient keys
   are involved in preparation of asymmetric-case MIC-ONLY messages,
   this example should be processable for test purposes by arbitrary PEM
   implementations.

   Fully-qualified domain names (FQDNs) for hosts, appearing in the
   mailbox names found in entity identifier subfields of "Originator-
   ID-Symmetric:" and "Recipient-ID-Symmetric:" fields, are processed in
   a case-insensitive fashion.  Unless specified to the contrary, other
   field arguments (including the user name components of mailbox names)
   are to be processed in a case-sensitive fashion.
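
   As a minimal Python sketch of this rule, a comparison of two mailbox
   names might treat the host part case-insensitively and the user name
   part exactly.  The function name same_mailbox is hypothetical, and
   the sketch assumes mailbox names of the simple form user@host.

```python
def same_mailbox(a, b):
    """Compare two user@host mailbox names.

    The host (FQDN) part is compared case-insensitively; the user name
    part is compared case-sensitively.
    """
    user_a, _, host_a = a.rpartition("@")
    user_b, _, host_b = b.rpartition("@")
    return user_a == user_b and host_a.lower() == host_b.lower()
```

   For instance, "linn@zendia.enet.dec.com" and
   "linn@ZENDIA.ENET.DEC.COM" compare equal under this sketch, while
   "Linn@zendia.enet.dec.com" and "linn@zendia.enet.dec.com" do not.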

   In most cases, numeric quantities are represented in header fields as
   contiguous strings of hexadecimal digits, where each digit is
   represented by a character from the ranges "0"-"9" or upper case
   "A"-"F".  Since public-key certificates and quantities encrypted

   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,ENCRYPTED
   Content-Domain: RFC822
   DEK-Info: DES-CBC,BFF968AA74691AC1
   Originator-Certificate:
    MIIBlTCCAScCAWUwDQYJKoZIhvcNAQECBQAwUTELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDzAN
    BgNVBAsTBk5PVEFSWTAeFw05MTA5MDQxODM4MTdaFw05MzA5MDMxODM4MTZaMEUx
    CzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5jLjEU
    MBIGA1UEAxMLVGVzdCBVc2VyIDEwWTAKBgRVCAEBAgICAANLADBIAkEAwHZHl7i+
    yJcqDtjJCowzTdBJrdAiLAnSC+CnnjOJELyuQiBgkGrgIh3j8/x0fM+YrsyF1u3F
    LZPVtzlndhYFJQIDAQABMA0GCSqGSIb3DQEBAgUAA1kACKr0PqphJYw1j+YPtcIq
    iWlFPuN5jJ79Khfg7ASFxskYkEMjRNZV/HZDZQEhtVaU7Jxfzs2wfX5byMp2X3U/
    5XUXGx7qusDgHQGs7Jk9W8CW1fuSWUgN4w==
   Key-Info: RSA,
    I3rRIGXUGWAF8js5wCzRTkdhO34PTHdRZY9Tuvm03M+NM7fx6qc5udixps2Lng0+
    wGrtiUm/ovtKdinz6ZQ/aQ==
   Issuer-Certificate:
    MIIB3DCCAUgCAQowDQYJKoZIhvcNAQECBQAwTzELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDTAL
    BgNVBAsTBFRMQ0EwHhcNOTEwOTAxMDgwMDAwWhcNOTIwOTAxMDc1OTU5WjBRMQsw
    CQYDVQQGEwJVUzEgMB4GA1UEChMXUlNBIERhdGEgU2VjdXJpdHksIEluYy4xDzAN
    BgNVBAsTBkJldGEgMTEPMA0GA1UECxMGTk9UQVJZMHAwCgYEVQgBAQICArwDYgAw
    XwJYCsnp6lQCxYykNlODwutF/jMJ3kL+3PjYyHOwk+/9rLg6X65B/LD4bJHtO5XW
    cqAz/7R7XhjYCm0PcqbdzoACZtIlETrKrcJiDYoP+DkZ8k1gCk7hQHpbIwIDAQAB
    MA0GCSqGSIb3DQEBAgUAA38AAICPv4f9Gx/tY4+p+4DB7MV+tKZnvBoy8zgoMGOx
    dD2jMZ/3HsyWKWgSF0eH/AJB3qr9zosG47pyMnTf3aSy2nBO7CMxpUWRBcXUpE+x
    EREZd9++32ofGBIXaialnOgVUn0OzSYgugiQ077nJLDUj0hQehCizEs5wUJ35a5h
   MIC-Info: RSA-MD5,RSA,
    UdFJR8u/TIGhfH65ieewe2lOW4tooa3vZCvVNGBZirf/7nrgzWDABz8w9NsXSexv
    AjRFbHoNPzBuxwmOAFeA0HJszL4yBvhG
   Recipient-ID-Asymmetric:
    MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
    LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,
    66
   Key-Info: RSA,
    O6BS1ww9CTyHPtS3bMLD+L0hejdvX6Qv1HK2ds2sQPEaXhX8EhvVphHYTjwekdWv
    7x0Z3Jx2vTAhOYHMcqqCjA==

   qeWlj/YJ2Uf5ng9yznPbtD0mYloSwIuV9FRYx+gzY+8iXd/NQrXHfi6/MhPfPF3d
   jIqCJAxvld2xgqQimUzoS1a4r7kQQ5c/Iua4LqKeq3ciFzEv/MbZhA==
   -----END PRIVACY-ENHANCED MESSAGE-----

    Example Encapsulated ENCRYPTED Message (Asymmetric Case)
                            Figure 3


   using asymmetric algorithms are large in size, use of a more space-
   efficient encoding technique is appropriate for such quantities, and
   the encoding mechanism defined in Section 4.3.2.4 of this RFC,
   representing 6 bits per printed character, is adopted for this
   purpose.

   Encapsulated headers of PEM messages are folded using whitespace per
   RFC 822 header folding conventions; no PEM-specific conventions are
   defined for encapsulated header folding.  The example shown in Figure
   4 shows (in its "MIC-Info:" field) an asymmetrically encrypted
   quantity in its printably encoded representation, illustrating the
   use of RFC 822 folding.
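
   A receiving implementation has to undo this folding before it can
   interpret a field.  The following Python sketch joins continuation
   lines (those beginning with a space or tab) onto the preceding
   field.  The function name unfold_headers is hypothetical.  As a
   simplifying assumption, the folding whitespace is discarded
   entirely, which suits printably encoded quantities but is not
   general RFC 822 unfolding.

```python
def unfold_headers(header_lines):
    """Join RFC 822 continuation lines onto the field they continue."""
    fields = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and fields:
            fields[-1] += line.strip()    # drop the folding whitespace
        else:
            fields.append(line.rstrip())
    return fields


# The "MIC-Info:" field of Figure 4, with the figure's indentation removed.
folded = [
    "MIC-Info: RSA-MD5,RSA,",
    " jV2OfH+nnXHU8bnL8kPAad/mSQlTDZlbVuxvZAOVRZ5q5+Ejl5bQvqNeqOUNQjr6",
    " EtE7K2QDeVMCyXsdJlA8fA==",
]
fields = unfold_headers(folded)
```

   After unfolding, the three physical lines form a single "MIC-Info:"
   field whose third argument is one contiguous encoded string.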

   In contrast to the encapsulated header representations defined in RFC
   1113 and its precursors, the field identifiers adopted in this RFC do
   not begin with the prefix "X-" (for example, the field previously
   denoted "X-Key-Info:" is now denoted "Key-Info:") and such prefixes
   are not to be emitted by implementations conformant to this RFC.  To
   simplify transition and interoperability with earlier
   implementations, it is suggested that implementations based on this
   RFC accept incoming encapsulated header fields carrying the "X-"
   prefix and act on such fields as if the "X-" were not present.

   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,MIC-ONLY
   Content-Domain: RFC822
   Originator-Certificate:
    MIIBlTCCAScCAWUwDQYJKoZIhvcNAQECBQAwUTELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDzAN
    BgNVBAsTBk5PVEFSWTAeFw05MTA5MDQxODM4MTdaFw05MzA5MDMxODM4MTZaMEUx
    CzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5jLjEU
    MBIGA1UEAxMLVGVzdCBVc2VyIDEwWTAKBgRVCAEBAgICAANLADBIAkEAwHZHl7i+
    yJcqDtjJCowzTdBJrdAiLAnSC+CnnjOJELyuQiBgkGrgIh3j8/x0fM+YrsyF1u3F
    LZPVtzlndhYFJQIDAQABMA0GCSqGSIb3DQEBAgUAA1kACKr0PqphJYw1j+YPtcIq
    iWlFPuN5jJ79Khfg7ASFxskYkEMjRNZV/HZDZQEhtVaU7Jxfzs2wfX5byMp2X3U/
    5XUXGx7qusDgHQGs7Jk9W8CW1fuSWUgN4w==
   Issuer-Certificate:
    MIIB3DCCAUgCAQowDQYJKoZIhvcNAQECBQAwTzELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDTAL
    BgNVBAsTBFRMQ0EwHhcNOTEwOTAxMDgwMDAwWhcNOTIwOTAxMDc1OTU5WjBRMQsw
    CQYDVQQGEwJVUzEgMB4GA1UEChMXUlNBIERhdGEgU2VjdXJpdHksIEluYy4xDzAN
    BgNVBAsTBkJldGEgMTEPMA0GA1UECxMGTk9UQVJZMHAwCgYEVQgBAQICArwDYgAw
    XwJYCsnp6lQCxYykNlODwutF/jMJ3kL+3PjYyHOwk+/9rLg6X65B/LD4bJHtO5XW
    cqAz/7R7XhjYCm0PcqbdzoACZtIlETrKrcJiDYoP+DkZ8k1gCk7hQHpbIwIDAQAB
    MA0GCSqGSIb3DQEBAgUAA38AAICPv4f9Gx/tY4+p+4DB7MV+tKZnvBoy8zgoMGOx
    dD2jMZ/3HsyWKWgSF0eH/AJB3qr9zosG47pyMnTf3aSy2nBO7CMxpUWRBcXUpE+x
    EREZd9++32ofGBIXaialnOgVUn0OzSYgugiQ077nJLDUj0hQehCizEs5wUJ35a5h
   MIC-Info: RSA-MD5,RSA,
    jV2OfH+nnXHU8bnL8kPAad/mSQlTDZlbVuxvZAOVRZ5q5+Ejl5bQvqNeqOUNQjr6
    EtE7K2QDeVMCyXsdJlA8fA==

   LSBBIG1lc3NhZ2UgZm9yIHVzZSBpbiB0ZXN0aW5nLg0KLSBGb2xsb3dpbmcgaXMg
   YSBibGFuayBsaW5lOg0KDQpUaGlzIGlzIHRoZSBlbmQuDQo=
   -----END PRIVACY-ENHANCED MESSAGE-----

     Example Encapsulated MIC-ONLY Message (Asymmetric Case)
                            Figure 4
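
   Because a MIC-ONLY message is encoded but not encrypted, the
   encapsulated text of Figure 4 can be recovered without any keys.
   The 6-bit printable encoding uses the alphabet now commonly known
   as base64, so the sketch below assumes that Python's standard
   base64 module is an adequate decoder for it.  Validation of the MIC
   is not attempted here.

```python
import base64

# Encapsulated text portion of Figure 4, with line breaks removed.
encoded = (
    "LSBBIG1lc3NhZ2UgZm9yIHVzZSBpbiB0ZXN0aW5nLg0KLSBGb2xsb3dpbmcgaXMg"
    "YSBibGFuayBsaW5lOg0KDQpUaGlzIGlzIHRoZSBlbmQuDQo="
)
decoded = base64.b64decode(encoded)
text_lines = decoded.decode("ascii").split("\r\n")
```

   The decoded text is delimited by <CR><LF> and begins with the line
   "- A message for use in testing."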


4.6.1  Per-Message Encapsulated Header Fields

   This group of encapsulated header fields contains fields which occur
   no more than once in a PEM message, generally preceding all other
   encapsulated header fields.

4.6.1.1  Proc-Type Field

   The "Proc-Type:" encapsulated header field, required for all PEM
   messages, identifies the type of processing performed on the
   transmitted message.  Only one "Proc-Type:" field occurs in a
   message; the "Proc-Type:" field must be the first encapsulated header
   field in the message.

   The "Proc-Type:" field has two subfields, separated by a comma.  The
   first subfield is a decimal number which is used to distinguish among
   incompatible encapsulated header field interpretations which may
   arise as changes are made to this standard.  Messages processed
   according to this RFC will carry the subfield value "4" to
   distinguish them from messages processed in accordance with prior PEM
   RFCs.  The second subfield assumes one of a set of string values,
   defined in the following subsections.
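
   A parser for this field might look like the following Python
   sketch.  The function name parse_proc_type and its error handling
   are assumptions; the set of accepted specifiers is taken from
   Sections 4.6.1.1.1 through 4.6.1.1.4 of this RFC.

```python
PROC_TYPES = {"ENCRYPTED", "MIC-ONLY", "MIC-CLEAR", "CRL"}


def parse_proc_type(value):
    """Split a Proc-Type value such as "4,ENCRYPTED" into (version, type)."""
    version_text, sep, specifier = value.partition(",")
    if not sep:
        raise ValueError("Proc-Type needs two comma-separated subfields")
    version = int(version_text.strip())       # decimal version number
    specifier = specifier.strip()
    if specifier not in PROC_TYPES:
        raise ValueError("unknown Proc-Type specifier: %r" % specifier)
    return version, specifier
```

   A caller would compare the returned version with 4 before applying
   the field interpretations described in this RFC.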

4.6.1.1.1  ENCRYPTED

   The "ENCRYPTED" specifier signifies that confidentiality,
   authentication, integrity, and (given use of asymmetric key
   management) non-repudiation of origin security services have been
   applied to a PEM message's encapsulated text.  ENCRYPTED messages
   require a "DEK-Info:" field and individual Recipient-ID and "Key-
   Info:" fields for all message recipients.

4.6.1.1.2  MIC-ONLY

   The "MIC-ONLY" specifier signifies that all of the security services
   specified for ENCRYPTED messages, with the exception of
   confidentiality, have been applied to a PEM message's encapsulated
   text. MIC-ONLY messages are encoded (per Section 4.3.2.4 of this RFC)
   to protect their encapsulated text against modifications at message
   transfer or relay points.

   Specification of MIC-ONLY, when applied in conjunction with certain
   combinations of key management and MIC algorithm options, permits
   certain fields which are superfluous in the absence of encryption to
   be omitted from the encapsulated header.  In particular, when a
   keyless MIC computation is employed for recipients for whom
   asymmetric cryptography is used, "Recipient-ID-Asymmetric:" and
   "Key-Info:" fields can be omitted.  The "DEK-Info:" field can be
   omitted for all "MIC-ONLY" messages.

4.6.1.1.3  MIC-CLEAR

   The "MIC-CLEAR" specifier represents a PEM message with the same
   security service selection as for a MIC-ONLY message.  The set of
   encapsulated header fields required in a MIC-CLEAR message is the
   same as that required for a MIC-ONLY message.

   MIC-CLEAR message processing omits the encoding step defined in
   Section 4.3.2.4 of this RFC to protect a message's encapsulated text
   against modifications within the MTS.  As a result, a MIC-CLEAR
   message's text can be read by recipients lacking access to PEM
   software, even though such recipients cannot validate the message's
   signature. The canonical encoding discussed in Section 4.3.2.2 is
   performed, so interoperation among sites with different native
   character sets and line representations is not precluded so long as
   those native formats are unambiguously translatable to and from the
   canonical form.  (Such interoperability is feasible only for those
   characters which are included in the canonical representation set.)

   Omission of the printable encoding step implies that MIC-CLEAR
   message MICs will be validatable only in environments where the MTS
   does not modify messages in transit, or where the modifications
   performed can be determined and inverted before MIC validation
   processing.  Failed MIC validation on a MIC-CLEAR message does not,
   therefore, necessarily signify a security-relevant event; as a
   result, it is recommended that PEM implementations reflect to their
   users (in a suitable local fashion) the type of PEM message being
   processed when reporting a MIC validation failure.

   A case of particular relevance arises for inbound SMTP processing on
   systems which delimit text lines with local native representations
   other than the SMTP-conventional <CR><LF>.  When mail is delivered to
   a UA on such a system and presented for PEM processing, the <CR><LF>
   has already been translated to local form.  In order to validate a
   MIC-CLEAR message's MIC in this situation, the PEM module must
   recanonicalize the incoming message in order to determine the inter-
   SMTP representation of the canonically encoded message (as defined in
   Section 4.3.2.2 of this RFC), and must compute the reference MIC
   based on that representation.
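
   The recanonicalization step can be sketched in Python as below, for
   a system whose local line delimiter is a bare line feed.  The
   function names are hypothetical.  MD5 is used only because RSA-MD5
   appears in the examples of this RFC; the actual MIC algorithms and
   the signature check over the digest are defined in RFC 1423 and are
   not implemented here.  The sketch also assumes the local text
   contains no stray <CR> characters.

```python
import hashlib


def recanonicalize(local_text, local_eol="\n"):
    """Rebuild the <CR><LF>-delimited form from locally delimited text."""
    lines = local_text.split(local_eol)
    return "\r\n".join(lines).encode("ascii")


def reference_digest(local_text):
    """MD5 digest of the recanonicalized text, as uppercase hex."""
    return hashlib.md5(recanonicalize(local_text)).hexdigest().upper()
```

   Computing the digest over the local form instead would in general
   yield a different value, which is why a MIC validation failure on a
   MIC-CLEAR message need not indicate tampering.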

4.6.1.1.4  CRL

   The "CRL" specifier indicates a special PEM message type, used to
   transfer one or more Certificate Revocation Lists.  The format of PEM
   CRLs is defined in RFC 1422.  No user data or encapsulated text
   accompanies an encapsulated header specifying the CRL message type; a
   correctly-formed CRL message's PEM header is immediately followed by
   its terminating message boundary line, with no blank line
   intervening.

   Only three types of fields are valid in the encapsulated header
   comprising a CRL message.  The "CRL:" field carries a printable
   representation of a CRL, encoded using the procedures defined in
   Section 4.3.2.4 of this RFC. "CRL:" fields may (as an option) be
   followed by no more than one "Originator-Certificate:" field and any
   number of "Issuer-Certificate:" fields. The "Originator-Certificate:"
   and "Issuer-Certificate:" fields refer to the most recent preceding
   "CRL:" field, and provide certificates useful in validating the
   signature included in the CRL.  "Originator-Certificate:" and
   "Issuer-Certificate:" fields' contents are the same for CRL messages
   as they are for other PEM message types.

4.6.1.2  Content-Domain Field

   The "Content-Domain:" encapsulated header field describes the type of
   content which is represented within a PEM message's encapsulated
   text.  It carries one string argument, whose value is defined as
   "RFC822" to indicate processing of RFC-822 mail as defined in this
   specification.  It is anticipated that additional "Content-Domain:"
   values will be defined subsequently, in additional or successor
   documents to this specification. Only one "Content-Domain:" field
   occurs in a PEM message; this field is the PEM message's second
   encapsulated header field, immediately following the "Proc-Type:"
   field.

4.6.1.3  DEK-Info Field

   The "DEK-Info:" encapsulated header field identifies the message text
   encryption algorithm and mode, and also carries any cryptographic
   parameters (e.g., IVs) used for message encryption.  No more than one
   "DEK-Info:" field occurs in a message; the field is required for all
   messages specified as "ENCRYPTED" in the "Proc-Type:" field.

   The "DEK-Info:" field carries either one argument or two arguments
   separated by a comma.  The first argument identifies the algorithm
   and mode used for message text encryption.  The second argument, if
   present, carries any cryptographic parameters required by the
   algorithm and mode identified in the first argument.  Appropriate
   message encryption algorithms, modes and identifiers and
   corresponding cryptographic parameters and formats are defined in RFC
   1423.
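
   As an illustration, the "DEK-Info:" value shown in Figure 2 can be
   taken apart with the Python sketch below.  The function name
   parse_dek_info is hypothetical, and interpretation of the second
   argument (here treated as an opaque string) depends on the
   algorithm definitions in RFC 1423.

```python
def parse_dek_info(value):
    """Return (algorithm_and_mode, parameters) from a DEK-Info value.

    parameters is None when the optional second argument is absent.
    """
    algorithm, sep, params = value.partition(",")
    algorithm = algorithm.strip()
    if not algorithm:
        raise ValueError("DEK-Info needs an algorithm identifier")
    return algorithm, (params.strip() if sep else None)


algorithm, parameters = parse_dek_info("DES-CBC,F8143EDE5960C597")
```

   For the Figure 2 value, the first argument is "DES-CBC" and the
   second is a string of 16 hexadecimal digits, that is, 8 octets.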

4.6.2  Encapsulated Header Fields Normally Per-Message

   This group of encapsulated header fields contains fields which
   ordinarily occur no more than once per message.  Depending on the key
   management option(s) employed, some of these fields may be absent
   from some messages.

4.6.2.1  Originator-ID Fields

   Originator-ID encapsulated header fields identify a message's
   originator and provide the originator's IK identification component.
   Two varieties of Originator-ID fields are defined, the "Originator-
   ID-Asymmetric:" and "Originator-ID-Symmetric:" fields.  An
   "Originator-ID-Symmetric:" header field is required for all PEM
   messages employing symmetric key management.  The analogous
   "Originator-ID-Asymmetric:" field, for the asymmetric key management
   case, is used only when no corresponding "Originator-Certificate:"
   field is included.

   Most commonly, only one Originator-ID or "Originator-Certificate:"
   field will occur within a message. For the symmetric case, the IK
   identification component carried in an "Originator-ID-Symmetric:"
   field applies to processing of all subsequent "Recipient-ID-
   Symmetric:" fields until another "Originator-ID-Symmetric:" field
   occurs.  It is illegal for a "Recipient-ID-Symmetric:" field to occur
   before a corresponding "Originator-ID-Symmetric:" field has been
   provided.  For the asymmetric case, processing of "Recipient-ID-
   Asymmetric:" fields is logically independent of preceding
   "Originator-ID-Asymmetric:" and "Originator-Certificate:" fields.

   Multiple Originator-ID and/or "Originator-Certificate:" fields may
   occur in a message when different originator-oriented IK components
   must be used by a message's originator in order to prepare a message
   so as to be suitable for processing by different recipients. In
   particular, multiple such fields will occur when both symmetric and
   asymmetric cryptography are applied to a single message in order to
   process the message for different recipients.

   Originator-ID subfields are delimited by the comma character (","),
   optionally followed by whitespace.  Section 5.2, Interchange Keys,
   discusses the semantics of these subfields and specifies the alphabet
   from which they are chosen.

4.6.2.1.1  Originator-ID-Asymmetric Field

   The "Originator-ID-Asymmetric:" field contains an Issuing Authority
   subfield, and then a Version/Expiration subfield.  This field is used
   only when the information it carries is not available from an
   included "Originator-Certificate:" field.

4.6.2.1.2  Originator-ID-Symmetric Field

   The "Originator-ID-Symmetric:" field contains an Entity Identifier
   subfield, followed by an (optional) Issuing Authority subfield, and
   then an (optional) Version/Expiration subfield.  Optional
   "Originator-ID-Symmetric:" subfields may be omitted only if rendered
   redundant by information carried in subsequent "Recipient-ID-
   Symmetric:" fields, and will normally be omitted in such cases.

4.6.2.2  Originator-Certificate Field

   The "Originator-Certificate:" encapsulated header field is used only
   when asymmetric key management is employed for one or more of a
   message's recipients.  To facilitate processing by recipients (at
   least in advance of general directory server availability), inclusion
   of this field in all messages is strongly recommended.  The field
   transfers an originator's certificate as a numeric quantity,
   comprised of the certificate's DER encoding, represented in the
   header field with the encoding mechanism defined in Section 4.3.2.4
   of this RFC.  The semantics of a certificate are discussed in RFC
   1422.

4.6.2.3  MIC-Info Field

   The "MIC-Info:" encapsulated header field, used only when asymmetric
   key management is employed for at least one recipient of a message,
   carries three arguments, separated by commas.  The first argument
   identifies the algorithm under which the accompanying MIC is
   computed.  The second argument identifies the algorithm under which
   the accompanying MIC is signed.  The third argument represents a MIC
   signed with an originator's private key.

   For the case of ENCRYPTED PEM messages, the signed MIC is, in turn,
   symmetrically encrypted using the same DEK, algorithm, mode and
   cryptographic parameters as are used to encrypt the message's
   encapsulated text.  This measure prevents unauthorized recipients
   from determining whether an intercepted message corresponds to a
   predetermined plaintext value.

   Appropriate MIC algorithms and identifiers, signature algorithms and
   identifiers, and signed MIC formats are defined in RFC 1423.

   A "MIC-Info:" field will occur after a sequence of fields beginning
   with an "Originator-ID-Asymmetric:" or "Originator-Certificate:" field
   and followed by any associated "Issuer-Certificate:" fields.  A
   "MIC-Info:" field applies to all subsequent recipients for whom
   asymmetric key management is used, until and unless overridden by a
   subsequent "Originator-ID-Asymmetric:" or "Originator-Certificate:"
   and corresponding "MIC-Info:".

4.6.3  Encapsulated Header Fields with Variable Occurrences

   This group of encapsulated header fields contains fields which will
   normally occur variable numbers of times within a message, with
   numbers of occurrences ranging from zero to non-zero values which are
   independent of the number of recipients.

4.6.3.1  Issuer-Certificate Field

   The "Issuer-Certificate:" encapsulated header field is meaningful
   only when asymmetric key management is used for at least one of a
   message's recipients.  A typical "Issuer-Certificate:" field would
   contain the certificate containing the public component used to sign
   the certificate carried in the message's "Originator-Certificate:"
   field, for recipients' use in chaining through that certificate's
   certification path.  Other "Issuer-Certificate:" fields, typically
   representing higher points in a certification path, also may be
   included by an originator.  It is recommended that the "Issuer-
   Certificate:" fields be included in an order corresponding to
   successive points in a certification path leading from the originator
   to a common point shared with the message's recipients (i.e., the
   Internet Certification Authority (ICA), unless a lower Policy
   Certification Authority (PCA) or CA is common to all recipients.)
   More information on certification paths can be found in RFC 1422.

   The certificate is represented in the same manner as defined for the
   "Originator-Certificate:" field (transporting an encoded
   representation of the certificate in X.509 [7] DER form), and any
   "Issuer-Certificate:" fields will ordinarily follow the "Originator-
   Certificate:" field directly.  Use of the "Issuer-Certificate:" field
   is optional even when asymmetric key management is employed, although
   its incorporation is strongly recommended in the absence of alternate
   directory server facilities from which recipients can access issuers'
   certificates.

4.6.4  Per-Recipient Encapsulated Header Fields

   The encapsulated header fields in this group appear for each of an
   "ENCRYPTED" message's named recipients.  For "MIC-ONLY" and "MIC-
   CLEAR" messages, these fields are omitted for recipients for whom
   asymmetric key management is employed in conjunction with a keyless
   MIC algorithm but the fields appear for recipients for whom symmetric
   key management or a keyed MIC algorithm is employed.

4.6.4.1  Recipient-ID Fields

   A Recipient-ID encapsulated header field identifies a recipient and
   provides the recipient's IK identification component.  One
   Recipient-ID field is included for each of a message's named
   recipients. Section 5.2, Interchange Keys, discusses the semantics of
   the subfields and specifies the alphabet from which they are chosen.
   Recipient-ID subfields are delimited by the comma character (","),
   optionally followed by whitespace.

   For the symmetric case, all "Recipient-ID-Symmetric:" fields are
   interpreted in the context of the most recent preceding "Originator-
   ID-Symmetric:" field.  It is illegal for a "Recipient-ID-Symmetric:"
   field to occur in a header before the occurrence of a corresponding
   "Originator-ID-Symmetric:" field.  For the asymmetric case,
   "Recipient-ID-Asymmetric:" fields are logically independent of a
   message's "Originator-ID-Asymmetric:" and "Originator-Certificate:"
   fields.  "Recipient-ID-Asymmetric:" fields, and their associated
   "Key-Info:" fields, are included following a header's originator-
   oriented fields.

4.6.4.1.1  Recipient-ID-Asymmetric Field

   The "Recipient-ID-Asymmetric:" field contains, in order, an Issuing
   Authority subfield and a Version/Expiration subfield.

4.6.4.1.2  Recipient-ID-Symmetric Field

   The "Recipient-ID-Symmetric:" field contains, in order, an Entity
   Identifier subfield, an (optional) Issuing Authority subfield, and an
   (optional) Version/Expiration subfield.

4.6.4.2  Key-Info Field

   One "Key-Info:" field is included for each of a message's named
   recipients.  In addition, it is recommended that PEM implementations
   support (as a locally-selectable option) the ability to include a
   "Key-Info:" field corresponding to a PEM message's originator,
   following an Originator-ID or "Originator-Certificate:" field and
   before any associated Recipient-ID fields, but inclusion of such a
   field is not a requirement for conformance with this RFC.

   Each "Key-Info:" field is interpreted in the context of the most
   recent preceding Originator-ID, "Originator-Certificate:", or
   Recipient-ID field; normally, a "Key-Info:" field will immediately
   follow its associated predecessor field. The "Key-Info:" field's
   argument(s) differ depending on whether symmetric or asymmetric key
   management is used for a particular recipient.
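
   The ordering rules for these fields can be illustrated in code.  The
   following Python fragment is a sketch only and is not part of this
   specification: the function name attach_key_info and the
   (name, body) tuple representation of header fields are assumptions
   made for the example.  It pairs each "Key-Info:" field with the most
   recent preceding Originator-ID, "Originator-Certificate:", or
   Recipient-ID field, and rejects a "Recipient-ID-Symmetric:" field
   that occurs before any "Originator-ID-Symmetric:" field.

```python
ORIGINATOR_FIELDS = {
    "Originator-ID-Symmetric",
    "Originator-ID-Asymmetric",
    "Originator-Certificate",
}
RECIPIENT_FIELDS = {"Recipient-ID-Symmetric", "Recipient-ID-Asymmetric"}


def attach_key_info(fields):
    """Pair each Key-Info body with the ID field it follows.

    `fields` is a list of (name, body) tuples in header order
    (a hypothetical representation chosen for this sketch).
    Returns a list of [name, body, key_info_or_None] entries.
    """
    entries = []
    seen_symmetric_originator = False
    for name, body in fields:
        if name in ORIGINATOR_FIELDS or name in RECIPIENT_FIELDS:
            if name == "Originator-ID-Symmetric":
                seen_symmetric_originator = True
            if (name == "Recipient-ID-Symmetric"
                    and not seen_symmetric_originator):
                raise ValueError(
                    "Recipient-ID-Symmetric before Originator-ID-Symmetric")
            entries.append([name, body, None])
        elif name == "Key-Info":
            if not entries:
                raise ValueError("Key-Info with no preceding ID field")
            # Assumption: at most one Key-Info per ID field, as the
            # grammar in Section 9 suggests.
            if entries[-1][2] is not None:
                raise ValueError("second Key-Info for the same ID field")
            entries[-1][2] = body
        # Other fields (e.g. Issuer-Certificate, MIC-Info) are ignored here.
    return entries
```

   The field bodies are treated as opaque strings; interpreting them
   depends on the key management mode, as described in the following
   subsections.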

4.6.4.2.1  Symmetric Key Management

   When symmetric key management is employed for a given recipient, the
   "Key-Info:" encapsulated header field transfers four items, separated
   by commas: an IK Use Indicator, a MIC Algorithm Indicator, a DEK and
   a MIC.  The IK Use Indicator identifies the algorithm and mode in
   which the identified IK was used for DEK and MIC encryption for a
   particular recipient.  The MIC Algorithm Indicator identifies the MIC
   computation algorithm used for a particular recipient.  The DEK and
   MIC are symmetrically encrypted under the IK identified by a
   preceding "Recipient-ID-Symmetric:" field and/or prior "Originator-
   ID-Symmetric:" field.

   Appropriate symmetric encryption algorithms, modes and identifiers,
   MIC computation algorithms and identifiers, and encrypted DEK and MIC
   formats are defined in RFC 1423.

4.6.4.2.2  Asymmetric Key Management

   When asymmetric key management is employed for a given recipient, the
   "Key-Info:" field transfers two quantities, separated by a comma.
   The first argument is an IK Use Indicator identifying the algorithm
   and mode in which the DEK is asymmetrically encrypted.  The second
   argument is a DEK, asymmetrically encrypted under the recipient's
   public component.

   Appropriate asymmetric encryption algorithms and identifiers, and
   encrypted DEK formats are defined in RFC 1423.
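
   The two "Key-Info:" argument layouts can be sketched as follows.
   This Python fragment is illustrative only: the type and function
   names are assumptions, and telling the two cases apart by counting
   items is a simplification made for the example (a real
   implementation would normally know the key management mode from the
   associated ID field).  The algorithm identifiers themselves are
   defined in RFC 1423 and are treated here as opaque strings.

```python
from typing import NamedTuple, Union


class SymmetricKeyInfo(NamedTuple):
    ik_use: str    # IK Use Indicator
    mic_alg: str   # MIC Algorithm Indicator
    enc_dek: str   # DEK, encrypted under the IK
    enc_mic: str   # MIC, encrypted under the IK


class AsymmetricKeyInfo(NamedTuple):
    ik_use: str    # IK Use Indicator
    enc_dek: str   # DEK, encrypted under the recipient's public component


def parse_key_info(body: str) -> Union[SymmetricKeyInfo, AsymmetricKeyInfo]:
    """Split a Key-Info field body into its comma-separated items."""
    # Drop all whitespace so that a value folded over several header
    # lines is rejoined before it is examined.
    parts = ["".join(p.split()) for p in body.split(",")]
    if len(parts) == 4:
        return SymmetricKeyInfo(*parts)
    if len(parts) == 2:
        return AsymmetricKeyInfo(*parts)
    raise ValueError(f"Key-Info needs 2 or 4 items, got {len(parts)}")
```

   Decoding and decrypting the DEK and MIC values is outside the scope
   of this sketch.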

5.  Key Management

   Several cryptographic constructs are involved in supporting the PEM
   message processing procedure.  A set of fundamental elements is
   assumed.  Data Encrypting Keys (DEKs) are used to encrypt message
   text and (for some MIC computation algorithms) in the message
   integrity check (MIC) computation process.  Interchange Keys (IKs)
   are used to encrypt DEKs and MICs for transmission with messages.  In
   a certificate-based asymmetric key management architecture,
   certificates are used as a means to provide entities' public
   components and other information in a fashion which is securely bound
   by a central authority.  The remainder of this section provides more
   information about these constructs.

5.1  Data Encrypting Keys (DEKs)

   Data Encrypting Keys (DEKs) are used for encryption of message text
   and (with some MIC computation algorithms) for computation of message
   integrity check quantities (MICs).  In the asymmetric key management
   case, they are also used for encrypting signed MICs in ENCRYPTED PEM
   messages.  It is strongly recommended that DEKs be generated and used
   on a one-time, per-message, basis.  A transmitted message will
   incorporate a representation of the DEK encrypted under an
   appropriate interchange key (IK) for each of the named recipients.
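
   The one-time, per-message use of DEKs can be sketched as follows.
   This Python fragment is a sketch under stated assumptions, not a
   definitive implementation: the function name, the 8-octet default
   key length, and the "wrap" callable (which stands in for encryption
   of the DEK under a recipient's IK, as defined in RFC 1423) are all
   hypothetical.  No real encryption is performed here.

```python
import secrets
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical signature: (recipient, dek) -> DEK encrypted under that
# recipient's interchange key.  Supplied by the caller.
WrapFn = Callable[[str, bytes], bytes]


def new_message_dek(
    recipients: Iterable[str], wrap: WrapFn, dek_len: int = 8
) -> Tuple[bytes, Dict[str, bytes]]:
    """Return a fresh DEK plus one encrypted copy per named recipient."""
    # A new random DEK is drawn on every call, so no DEK is reused
    # across messages.
    dek = secrets.token_bytes(dek_len)
    per_recipient = {r: wrap(r, dek) for r in recipients}
    return dek, per_recipient
```

   The caller would use the returned DEK to encrypt the message text
   and place each encrypted copy in the "Key-Info:" field of the
   corresponding recipient.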

   DEK generation can be performed either centrally by key distribution
   centers (KDCs) or by endpoint systems.  Dedicated KDC systems may be
   able to implement stronger algorithms for random DEK generation than
   can be supported in endpoint systems.  On the other hand,
   decentralization allows endpoints to be relatively self-sufficient,
   reducing the level of trust which must be placed in components other
   than those of a message's originator and recipient.  Moreover,
   decentralized DEK generation at endpoints reduces the frequency with
   which originators must make real-time queries of (potentially unique)
   servers in order to send mail, enhancing communications availability.

   When symmetric key management is used, one advantage of centralized
   KDC-based generation is that DEKs can be returned to endpoints
   already encrypted under the IKs of message recipients rather than
   providing the IKs to the originators.  This reduces IK exposure and
   simplifies endpoint key management requirements.  This approach has
   less value if asymmetric cryptography is used for key management,
   since per-recipient public IK components are assumed to be generally
   available and per-originator private IK components need not
   necessarily be shared with a KDC.

5.2  Interchange Keys (IKs)

   Interchange Key (IK) components are used to encrypt DEKs and MICs.
   In general, IK granularity is at the pairwise per-user level except
   for mail sent to address lists comprising multiple users.  In order
   for two principals to engage in a useful exchange of PEM using
   conventional cryptography, they must first possess common IK
   components (when symmetric key management is used) or complementary
   IK components (when asymmetric key management is used).  When
   symmetric cryptography is used, the IK consists of a single
   component, used to encrypt both DEKs and MICs.  When asymmetric
   cryptography is used, a recipient's public component is used as an IK
   to encrypt DEKs (a transformation invertible only by a recipient
   possessing the corresponding private component), and the originator's
   private component is used to encrypt MICs (a transformation
   invertible by all recipients, since the originator's certificate
   provides all recipients with the public component required to perform
   MIC validation).

   This RFC does not prescribe the means by which interchange keys are
   made available to appropriate parties; such means may be centralized
   (e.g., via key management servers) or decentralized (e.g., via
   pairwise agreement and direct distribution among users).  In any
   case, any given IK component is associated with a responsible Issuing
   Authority (IA).  When certificate-based asymmetric key management, as
   discussed in RFC 1422, is employed, the IA function is performed by
   a Certification Authority (CA).

   When an IA generates and distributes an IK component, associated
   control information is provided to direct how it is to be used.  In
   order to select the appropriate IK(s) to use in message encryption,
   an originator must retain a correspondence between IK components and
   the recipients with which they are associated.  Expiration date
   information must also be retained, in order that cached entries may
   be invalidated and replaced as appropriate.
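
   An originator-side cache of this kind might look like the following
   Python sketch.  The class and attribute names are assumptions made
   for illustration, and the expiration time is taken to be an absolute
   timestamp supplied with each entry; this RFC does not prescribe a
   storage format.

```python
from datetime import datetime
from typing import Dict, NamedTuple, Optional


class IKEntry(NamedTuple):
    issuing_authority: str   # IA that issued the component
    version: str             # version/expiration subfield value
    component: bytes         # the IK component itself
    expires: datetime        # time after which the entry is stale


class IKCache:
    """Maps recipient names to the IK component used when sending."""

    def __init__(self) -> None:
        self._by_recipient: Dict[str, IKEntry] = {}

    def store(self, recipient: str, entry: IKEntry) -> None:
        self._by_recipient[recipient] = entry

    def lookup(self, recipient: str, now: datetime) -> Optional[IKEntry]:
        """Return the cached entry, or None if absent or expired."""
        entry = self._by_recipient.get(recipient)
        if entry is not None and entry.expires <= now:
            # Invalidate the stale entry; the caller is expected to
            # obtain a replacement from the IA.
            del self._by_recipient[recipient]
            return None
        return entry
```

   This sketch covers only selection of IK components for outgoing
   messages.  Retention of expired components needed to process
   previously received mail is a separate matter, discussed in Section
   5.2.2.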

   Since a message may be sent with multiple IK components identified,
   corresponding to multiple intended recipients, each recipient's UA
   must be able to determine that recipient's intended IK component.
   Moreover, if no corresponding IK component is available in the
   recipient's database when a message arrives, the recipient must be
   able to identify the required IK component and identify that IK
   component's associated IA.  Note that different IKs may be used for
   different messages between a pair of communicants.  Consider, for
   example, one message sent from A to B and another message sent (using
   the IK-per-list method) from A to a mailing list of which B is a
   member.  The first message would use IK components associated
   individually with A and B, but the second would use an IK component
   shared among list members.

   When a PEM message is transmitted, an indication of the IK components
   used for DEK and MIC encryption must be included.  To this end,
   Originator-ID and Recipient-ID encapsulated header fields provide
   (some or all of) the following data:

        1.  Identification of the relevant Issuing Authority (IA
            subfield)

        2.  Identification of an entity with which a particular IK
            component is associated (Entity Identifier or EI subfield)

        3.  Version/Expiration subfield

   In the asymmetric case, all necessary information associated with an
   originator can be acquired by processing the certificate carried in
   an "Originator-Certificate:" field; to avoid redundancy in this case,
   no "Originator-ID-Asymmetric:" field is included if a corresponding
   "Originator-Certificate:" appears.

   The comma character (",") is used to delimit the subfields within an
   Originator-ID or Recipient-ID.  The IA, EI, and version/expiration
   subfields are generated from a restricted character set, as
   prescribed by the following BNF (using notation as defined in RFC
   822, Sections 2 and 3.3):

   IKsubfld       :=       1*ia-char

   ia-char        :=       DIGIT / ALPHA / "'" / "+" / "(" / ")" /
                           "." / "/" / "=" / "?" / "-" / "@" /
                           "%" / "!" / '"' / "_" / "<" / ">"



   An example Recipient-ID field for the symmetric case is as follows:

   Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,2

   This example field indicates that IA "ptf-kmc" has issued an IK
   component for use on messages sent to "linn@zendia.enet.dec.com",
   and that the IA has provided the number 2 as a version indicator for
   that IK component.
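
   As an illustration, the symmetric example can be parsed and checked
   against the ia-char alphabet with the following Python sketch.  The
   names SymmetricID and parse_symmetric_id are assumptions made for
   this example; the character class is transcribed from the BNF above.

```python
import re
from typing import NamedTuple, Optional

# ia-char from the BNF: DIGIT, ALPHA and 16 punctuation characters.
# The comma is absent, so it can serve as the subfield delimiter.
_IA_CHAR_CLASS = r"""[0-9A-Za-z'+()./=?\-@%!"_<>]"""
_IKSUBFLD = re.compile(_IA_CHAR_CLASS + "+")


class SymmetricID(NamedTuple):
    entity: str                        # Entity Identifier (EI) subfield
    issuing_authority: Optional[str]   # IA subfield (optional)
    version: Optional[str]             # Version/Expiration (optional)


def parse_symmetric_id(body: str) -> SymmetricID:
    """Parse the body of an Originator-/Recipient-ID-Symmetric field."""
    # Subfields are comma-delimited, optionally followed by whitespace.
    parts = [p.strip() for p in body.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-delimited subfields")
    entity, ia, version = parts
    if not _IKSUBFLD.fullmatch(entity):
        raise ValueError("invalid Entity Identifier subfield")
    for optional in (ia, version):
        if optional and not _IKSUBFLD.fullmatch(optional):
            raise ValueError("invalid IKsubfld")
    return SymmetricID(entity, ia or None, version or None)
```

   Applied to the example field body above, this yields the entity
   "linn@zendia.enet.dec.com", the IA "ptf-kmc", and the version "2".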

   An example Recipient-ID field for the asymmetric case is as follows:

   Recipient-ID-Asymmetric:
    MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
    LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,66

   This example field includes the printably encoded BER representation
   of a certificate's issuer distinguished name, along with the
   certificate serial number 66 as assigned by that issuer.
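
   The asymmetric example can be unpacked with the following Python
   sketch.  It assumes that the printable encoding of Section 4.3.2.4
   can be decoded with a standard base64 decoder, and it reads the
   second subfield as a hexadecimal serial number, following Section
   5.2.1.3.  The names AsymmetricID and parse_asymmetric_id are
   hypothetical.

```python
import base64
from typing import NamedTuple

# Field body of the example above, including its line folding.
EXAMPLE_BODY = """
 MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
 LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,66
"""


class AsymmetricID(NamedTuple):
    issuer_der: bytes   # encoded issuer distinguished name
    serial: int         # certificate serial number


def parse_asymmetric_id(body: str) -> AsymmetricID:
    """Parse the body of an Originator-/Recipient-ID-Asymmetric field."""
    # Remove folding whitespace, then split at the delimiting comma.
    compact = "".join(body.split())
    issuer_b64, serial_hex = compact.split(",")
    issuer_der = base64.b64decode(issuer_b64, validate=True)
    return AsymmetricID(issuer_der, int(serial_hex, 16))
```

   The decoded octets begin with the ASN.1 SEQUENCE tag (hexadecimal
   30) and contain the organization name "RSA Data Security, Inc." as a
   printable string.  Interpreting the full distinguished name would
   require an ASN.1 decoder, which is outside the scope of this sketch.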

5.2.1  Subfield Definitions

   The following subsections define the subfields of Originator-ID and
   Recipient-ID fields.

5.2.1.1  Entity Identifier Subfield

   An entity identifier (used only for "Originator-ID-Symmetric:" and
   "Recipient-ID-Symmetric:" fields) is constructed as an IKsubfld.
   More restrictively, an entity identifier subfield assumes the
   following form:

                      <user>@<domain-qualified-host>

   In order to support universal interoperability, it is necessary to
   assume a universal form for the naming information.  For the case of
   installations which transform local host names before transmission
   into the broader Internet, it is strongly recommended that the host
   name as presented to the Internet be employed.

5.2.1.2  Issuing Authority Subfield

   An IA identifier subfield is constructed as an IKsubfld.  This RFC
   does not define this subfield's contents for the symmetric key
   management case. Any prospective IAs which are to issue symmetric
   keys for use in conjunction with this RFC must coordinate assignment
   of IA identifiers in a manner (centralized or hierarchic) which
   assures uniqueness.

   For the asymmetric key management case, the IA identifier subfield
   will be formed from the ASN.1 BER representation of the distinguished
   name of the issuing organization or organizational unit.  The
   distinguished encoding rules specified in Clause 8.7 of
   Recommendation X.509 ("X.509 DER") are to be employed in generating
   this representation.  The encoded binary result will be represented
   for inclusion in a transmitted header using the procedure defined in
   Section 4.3.2.4 of this RFC.

5.2.1.3  Version/Expiration Subfield

   A version/expiration subfield is constructed as an IKsubfld.  For the
   symmetric key management case, the version/expiration subfield format
   is permitted to vary among different IAs, but must satisfy certain
   functional constraints.  An IA's version/expiration subfields must be
   sufficient to distinguish among the set of IK components issued by
   that IA for a given identified entity.  Use of a monotonically
   increasing number is sufficient to distinguish among the IK
   components provided for an entity by an IA; use of a timestamp
   additionally allows an expiration time or date to be prescribed for
   an IK component.

   For the asymmetric key management case, the version/expiration
   subfield's value is the hexadecimal serial number of the certificate
   being used in conjunction with the originator or recipient specified
   in the "Originator-ID-Asymmetric:" or "Recipient-ID-Asymmetric:"
   field in which the subfield occurs.

5.2.2  IK Cryptoperiod Issues

   An IK component's cryptoperiod is dictated in part by a tradeoff
   between key management overhead and revocation responsiveness.  It
   would be undesirable to delete an IK component permanently before
   receipt of a message encrypted using that IK component, as this would
   render the message permanently undecipherable.  Access to an expired
   IK component would be needed, for example, to process mail received
   by a user (or system) which had been inactive for an extended period
   of time.  In order to enable very old IK components to be deleted, a
   message's recipient desiring encrypted local long term storage should
   transform the DEK used for message text encryption via re-encryption
   under a locally maintained IK, rather than relying on IA maintenance
   of old IK components for indefinite periods.

6.  User Naming

   Unique naming of electronic mail users, as is needed in order to
   select corresponding keys correctly, is an important topic and one
   which has received (and continues to receive) significant study.  For
   the symmetric case, IK components are identified in PEM headers
   through use of mailbox specifiers in traditional Internet-wide form
   ("user@domain-qualified-host"). Successful operation in this mode
   relies on users (or their PEM implementations) being able to
   determine the universal-form names corresponding to PEM originators
   and recipients.  If a PEM implementation operates in an environment
   where addresses in a local form differing from the universal form are
   used, translations must be performed in order to map between the
   universal form and that local representation.

   The use of user identifiers unrelated to the hosts on which the
   users' mailboxes reside offers generality and value.  X.500
   distinguished names, as employed in the certificates of the
   recommended key management infrastructure defined in RFC 1422,
   provide a basis for such user identification. As directory services
   become more pervasive, they will offer originators a means to search
   for desired recipients which is based on a broader set of attributes
   than mailbox specifiers alone. Future work is anticipated in
   integration with directory services, particularly the mechanisms and
   naming schema of the Internet OSI directory pilot activity.

7.  Example User Interface and Implementation

   In order to place the mechanisms and approaches discussed in this RFC
   into context, this section presents an overview of a hypothetical
   prototype implementation.  This implementation is a standalone
   program which is invoked by a user, and lies above the existing
   UA sublayer.  In the UNIX system, and possibly in other environments
   as well, such a program can be invoked as a "filter" within an
   electronic mail UA or a text editor, simplifying the sequence of
   operations which must be performed by the user.  This form of
   integration offers the advantage that the program can be used in
   conjunction with a range of UA programs, rather than being
   compatible only with a particular UA.

   When a user wishes to apply privacy enhancements to an outgoing
   message, the user prepares the message's text and invokes the
   standalone program, which in turn generates output suitable for
   transmission via the UA.  When a user receives a PEM message, the UA
   delivers the message in encrypted form, suitable for decryption and
   associated processing by the standalone program.

   In this prototype implementation, a cache of IK components is
   maintained in a local file, with entries managed manually based on
   information provided by originators and recipients.  For the
   asymmetric key management case, certificates are acquired for a
   user's PEM correspondents; in advance and/or in addition to retrieval
   of certificates from directories, they can be extracted from the
   "Originator-Certificate:" fields of received PEM messages.

   The IK/certificate cache is, effectively, a simple database indexed
   by mailbox names.  IK components are selected for transmitted
   messages based on the originator's identity and on recipient names,
   and corresponding Originator-ID, "Originator-Certificate:", and
   Recipient-ID fields are placed into the message's encapsulated
   header.  When a message is received, these fields are used as a basis
   for a lookup in the database, yielding the appropriate IK component
   entries.  DEKs and cryptographic parameters (e.g., IVs) are generated
   dynamically within the program.

   Options and destination addresses are selected by command line
   arguments to the standalone program.  The function of specifying
   destination addresses to the privacy enhancement program is logically
   distinct from the function of specifying the corresponding addresses
   to the UA for use by the MTS.  This separation results from the fact
   that, in many cases, the local form of an address as specified to a
   UA differs from the Internet global form as used in "Originator-ID-
   Symmetric:" and "Recipient-ID-Symmetric:" fields.

8.  Minimum Essential Requirements

   This section summarizes particular capabilities which an
   implementation must provide for full conformance with this RFC.

   RFC 1422 specifies asymmetric, certificate-based key management
   procedures to support the message processing procedures defined in
   this document; PEM implementation support for these key management
   procedures is strongly encouraged.  Implementations supporting these
   procedures must also be equipped to display the names of originator
   and recipient PEM users in the X.500 DN form as authenticated by the
   procedures of RFC 1422.

   The message processing procedures defined here can also be used with
   symmetric key management techniques, though no RFCs analogous to RFC
   1422 are currently available to provide correspondingly detailed
   description of suitable symmetric key management procedures.  A
   complete PEM implementation must support at least one of these
   asymmetric and/or symmetric key management modes.

   A full implementation of PEM is expected to be able to send and
   receive ENCRYPTED, MIC-ONLY, and MIC-CLEAR messages, and to receive
   CRL messages.  Some level of support for generating and processing
   nested and annotated PEM messages (for forwarding purposes) is to be
   provided, and an implementation should be able to reduce ENCRYPTED
   messages to MIC-ONLY or MIC-CLEAR for forwarding. Fully-conformant
   implementations must be able to emit Certificate and Issuer-
   Certificate fields, and to include a Key-Info field corresponding to
   the originator, but users or configurers of PEM implementations may
   be allowed the option of deactivating those features.

9.  Descriptive Grammar

   This section provides a grammar describing the construction of a PEM
   message.

   ; PEM BNF representation, using RFC 822 notation.

   ; imports field meta-syntax (field, field-name, field-body,
   ; field-body-contents) from RFC-822, sec. 3.2
   ; imports DIGIT, ALPHA, CRLF, text from RFC-822
   ; Note: algorithm and mode specifiers are officially defined
   ; in RFC 1423

   <pemmsg> ::= <preeb>
                <pemhdr>
                [CRLF <pemtext>]   ; absent for CRL message
                <posteb>

   <preeb> ::= "-----BEGIN PRIVACY-ENHANCED MESSAGE-----" CRLF
   <posteb> ::= "-----END PRIVACY-ENHANCED MESSAGE-----" CRLF / <preeb>

   <pemtext> ::= <encbinbody>      ; for ENCRYPTED or MIC-ONLY messages
               / *(<text> CRLF)    ; for MIC-CLEAR

   <pemhdr> ::= <normalhdr> / <crlhdr>

   <normalhdr> ::=  <proctype>
               <contentdomain>
               [<dekinfo>]         ; needed if ENCRYPTED
               (1*(<origflds> *<recipflds>)) ; symmetric case --
                            ; recipflds included for all proc types
               / ((1*<origflds>) *(<recipflds>)) ; asymmetric case --
                            ; recipflds included for ENCRYPTED proc type

   <crlhdr> ::= <proctype>
               1*(<crl> [<cert>] *(<issuercert>))

   <asymmorig> ::= <origid-asymm> / <cert>

   <origflds> ::= <asymmorig> [<keyinfo>] *(<issuercert>)
                  <micinfo>                        ; asymmetric
                  / <origid-symm> [<keyinfo>]      ; symmetric

   <recipflds> ::= <recipid> <keyinfo>

   ; definitions for PEM header fields

   <proctype> ::= "Proc-Type" ":" "4" "," <pemtypes> CRLF
   <contentdomain> ::= "Content-Domain" ":" <contentdescrip> CRLF
   <dekinfo> ::= "DEK-Info" ":" <dekalgid> [ "," <dekparameters> ] CRLF
   <symmid> ::= <IKsubfld> "," [<IKsubfld>] "," [<IKsubfld>]
   <asymmid> ::= <IKsubfld> "," <IKsubfld>
   <origid-asymm> ::= "Originator-ID-Asymmetric" ":" <asymmid> CRLF
   <origid-symm> ::= "Originator-ID-Symmetric" ":" <symmid> CRLF
   <recipid> ::= <recipid-asymm> / <recipid-symm>
   <recipid-asymm> ::= "Recipient-ID-Asymmetric" ":" <asymmid> CRLF
   <recipid-symm> ::= "Recipient-ID-Symmetric" ":" <symmid> CRLF
   <cert> ::= "Originator-Certificate" ":" <encbin> CRLF
   <issuercert> ::= "Issuer-Certificate" ":" <encbin> CRLF
   <micinfo> ::= "MIC-Info" ":" <micalgid> "," <ikalgid> ","
                  <asymsignmic> CRLF
   <keyinfo> ::= "Key-Info" ":" <ikalgid> "," <micalgid> ","
                 <symencdek> "," <symencmic> CRLF     ; symmetric case
                 / "Key-Info" ":" <ikalgid> "," <asymencdek>
                 CRLF                                ; asymmetric case
   <crl> ::= "CRL" ":" <encbin> CRLF

   <pemtypes> ::= "ENCRYPTED" / "MIC-ONLY" / "MIC-CLEAR" / "CRL"

   <encbinchar> ::= ALPHA / DIGIT / "+" / "/" / "="
   <encbingrp> ::= 4*4<encbinchar>
   <encbin> ::= 1*<encbingrp>
   <encbinbody> ::= *(16*16<encbingrp> CRLF) [1*16<encbingrp> CRLF]
   <IKsubfld> ::= 1*<ia-char>
   ; Note: "," removed from <ia-char> set so that Orig-ID and Recip-ID
   ; fields can be delimited with commas (not colons) like all other
   ; fields
   <ia-char> ::=  DIGIT / ALPHA / "'" / "+" / "(" / ")" /
                  "." / "/" / "=" / "?" / "-" / "@" /
                  "%" / "!" / '"' / "_" / "<" / ">"
   <hexchar> ::= DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
                                                      ; no lower case

   ; This specification defines one value ("RFC822") for
   ; <contentdescrip>: other values may be defined in future in
   ; separate or successor documents
   ;
   <contentdescrip> ::= "RFC822"

   ; The following items are defined in RFC 1423
   ;  <dekalgid>
   ;  <dekparameters>
   ;  <micalgid>
   ;  <ikalgid>
   ;  <asymsignmic>
   ;  <symencdek>
   ;  <symencmic>
   ;  <asymencdek>
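
   Two of the productions above can be checked mechanically, as in the
   following Python sketch.  The function names are assumptions made
   for illustration, and the tolerance for whitespace after the colon
   and around the comma in "Proc-Type" is an assumption about RFC 822
   field syntax rather than something stated in this grammar.

```python
import re

PEM_TYPES = ("ENCRYPTED", "MIC-ONLY", "MIC-CLEAR", "CRL")

# <proctype> ::= "Proc-Type" ":" "4" "," <pemtypes> CRLF
_PROC_TYPE = re.compile(
    r"Proc-Type:\s*4\s*,\s*(%s)\r\n"
    % "|".join(re.escape(t) for t in PEM_TYPES)
)

# <encbinbody>: any number of full lines of 16 four-character groups,
# optionally followed by one final line of 1 to 16 groups.
_ENCBIN_BODY = re.compile(
    r"(?:[A-Za-z0-9+/=]{64}\r\n)*"
    r"(?:(?:[A-Za-z0-9+/=]{4}){1,16}\r\n)?"
)


def proc_type(line: str) -> str:
    """Return the <pemtypes> value of a version 4 Proc-Type field."""
    m = _PROC_TYPE.fullmatch(line)
    if m is None:
        raise ValueError("not a version 4 Proc-Type field")
    return m.group(1)


def is_encbinbody(text: str) -> bool:
    """True if `text` has the line structure of <encbinbody>."""
    return _ENCBIN_BODY.fullmatch(text) is not None
```

   The second check covers only the line structure and alphabet of the
   encoded body; it does not verify that the padding character appears
   only at the end of the data.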


NOTES:

     [1]  Key generation for MIC computation and message text encryption
          may either be performed by the sending host or by a
          centralized server.  This RFC does not constrain this design
          alternative.  Section 5.1 identifies possible advantages of a
          centralized server approach if symmetric key management is
          employed.

     [2]  Postel, J., "Simple Mail Transfer Protocol", STD 10,
          RFC 821, August 1982.

     [3]  This transformation should occur only at an SMTP endpoint, not
          at an intervening relay, but may take place at a gateway
          system linking the SMTP realm with other environments.

     [4]  Use of a canonicalization procedure similar to that of SMTP
          was selected because its functions are widely used and
          implemented within the Internet mail community, not for
          purposes of SMTP interoperability with this intermediate
          result.

     [5]  Crocker, D., "Standard for the Format of ARPA Internet Text
          Messages", STD 11, RFC 822, August 1982.

     [6]  Rose, M. T. and Stefferud, E. A., "Proposed Standard for
          Message Encapsulation", RFC 934, January 1985.

     [7]  CCITT Recommendation X.509 (1988), "The Directory -
          Authentication Framework".

     [8]  Throughout this RFC we have adopted the terms "private
          component" and "public component" to refer to the quantities
          which are, respectively, kept secret and made publicly
          available in asymmetric cryptosystems.  This convention is
          adopted to avoid possible confusion arising from use of the
          term "secret key" to refer to either the former quantity or to
          a key in a symmetric cryptosystem.

Patent Statement

   This version of Privacy Enhanced Mail (PEM) relies on the use of
   patented public key encryption technology for authentication and
   encryption.  The Internet Standards Process as defined in RFC 1310
   requires a written statement from the Patent holder that a license
   will be made available to applicants under reasonable terms and
   conditions prior to approving a specification as a Proposed, Draft or
   Internet Standard.

   The Massachusetts Institute of Technology and the Board of Trustees
   of the Leland Stanford Junior University have granted Public Key
   Partners (PKP) exclusive sub-licensing rights to the following
   patents issued in the United States, and all of their corresponding
   foreign patents:

      Cryptographic Apparatus and Method
      ("Diffie-Hellman")............................... No. 4,200,770

      Public Key Cryptographic Apparatus
      and Method ("Hellman-Merkle").................... No. 4,218,582

      Cryptographic Communications System and
      Method ("RSA")................................... No. 4,405,829

      Exponential Cryptographic Apparatus
      and Method ("Hellman-Pohlig").................... No. 4,424,414

   These patents are stated by PKP to cover all known methods of
   practicing the art of Public Key encryption, including the variations
   collectively known as El Gamal.

   Public Key Partners has provided written assurance to the Internet
   Society that parties will be able to obtain, under reasonable,
   nondiscriminatory terms, the right to use the technology covered by
   these patents.  This assurance is documented in RFC 1170 titled
   "Public Key Standards and Licenses".  A copy of the written assurance
   dated April 20, 1990, may be obtained from the Internet Assigned
   Number Authority (IANA).

   The Internet Society, Internet Architecture Board, Internet
   Engineering Steering Group and the Corporation for National Research
   Initiatives take no position on the validity or scope of the patents
   and patent applications, nor on the appropriateness of the terms of
   the assurance.  The Internet Society and other groups mentioned above
   have not made any determination as to any other intellectual property
   rights which may apply to the practice of this standard. Any further
   consideration of these matters is the user's own responsibility.

Security Considerations

   This entire document is about security.

Author's Address

   John Linn

   EMail: 104-8456@mcimail.com


                       A SHELL SERVER FOR HTTP
                                   
   The HTTP protocol is very simple. The following is an example of a
   server program written in sh:
   
#! /bin/sh
read get docid
echo "<TITLE>$docid</TITLE>"
echo Here is the data

   The docid may have a trailing carriage return to be stripped off on
   some systems. You can modify that script to produce the data you
   actually want. The HTML syntax for marked-up text is fairly simple,
   but if you want just to send plain text, then just send the
   <PLAINTEXT> tag first, or convert the text with a sed script:
   
#! /bin/sh
read get docid
sed -f txt2html.sed $docid

   or in csh
   
#! /bin/csh
set request = ( `echo $<` )
if ($#request < 2) exit
sed -f txt2html.sed $request[2]
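
   The pieces described above can be put together in one sketch: read
   the request line, strip a trailing carriage return from the docid,
   and send the file after a <PLAINTEXT> tag. The function names
   strip_cr and serve_plain are assumptions made for illustration, and
   the sketch makes no attempt to check that the docid names a file
   the caller should be allowed to read.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# strip_cr TEXT: print TEXT with any carriage returns removed.
strip_cr() {
    printf '%s' "$1" | tr -d '\r'
}

# serve_plain: read one request line ("GET docid") from standard input
# and send the named file as plain text.
serve_plain() {
    read -r get docid || return 1
    docid=$(strip_cr "$docid")
    echo "<TITLE>$docid</TITLE>"
    echo "<PLAINTEXT>"
    cat "$docid"
}
```

   A script run by the inet daemon would end with a call to
   serve_plain, so that the request arrives on standard input.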


   When you have written your script, set the execute bit and then
   configure the inet daemon to run it. A few more examples:
   
      A sh script to generate a menu for files in a directory
      
      An awk script to generate a menu from a list of files
      
      A perl script for all kinds of stuff on the ASIS server
      
      The shell script of the Hytelnet gateway
      
   If you know the perl language, then that is a powerful (if
   otherwise incomprehensible) language with which to hack together a
   server.
   
   See also a case study of mapping a database onto the web.
   
   All contributions to these examples welcome!
   
                                                                Tim BL

Making a server

   Here is a run-through of what is needed to make a www server, with
   examples from a suggested server for the HEPDATA base of Mike
   Whalley. See also etiquette.
   
   Basically, to make the data available, you make a server which is a
   modified version of your program. When a user follows a link to
   HEPDATA (or runs a command to jump straight there), the client
   program opens a connection to a server program on a VM machine
   (say, but could be VMS or unix). The server in turn runs your
   program.
   
   Let me just describe the essence of the changes needed so that you
   can get an idea of how much effort would be involved.
   
   The first thing you do is to make up an arbitrary naming method for
   anything which HEPDATA can display.  In this I include the welcome
   page, any menu, any article, any help text.  Typically one invents
   a hierarchical naming scheme, like
   
        /HEPDATA                        The first "welcome" menu
        /HEPDATA/HELP                   The top-level help
        /HEPDATA/HELP/REAC              The help on the reaction database
        /HEPDATA/REAC                   The reaction database itself
        /HEPDATA/REAC?P+PBAR            List of reactions involving p
                                        and pbar (?)
        /HEPDATA/DATA/RD125V687         Some article (say)

   You do this because, whereas an interactive user follows a path
   through the program, the W3 user calls the program once for each
   thing. There is no "state" information. This allows one to make a
   hypertext link to any part of the scheme and jump back in again
   later. For example, one might want to quote an article, or the
   reaction database, or a particular list of reactions.
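
   Because each request carries a complete name, the server can decide
   what to return from the name alone. The following is a minimal
   sketch of such a dispatch, assuming the naming scheme above. The
   function name route_docid and the labels it prints are placeholders
   for illustration; a real server would run the corresponding
   retrieval code instead.

```shell
#!/bin/sh
# route_docid NAME: classify a document name from the /HEPDATA scheme.
# The labels printed are hypothetical stand-ins for real handlers.
route_docid() {
    case "$1" in
        /HEPDATA)          echo "welcome menu" ;;
        /HEPDATA/HELP)     echo "top-level help" ;;
        /HEPDATA/HELP/*)   echo "help file: ${1#/HEPDATA/HELP/}" ;;
        /HEPDATA/REAC)     echo "reaction database" ;;
        /HEPDATA/REAC[?]*) echo "search: ${1#*[?]}" ;;
        /HEPDATA/DATA/*)   echo "article: ${1#/HEPDATA/DATA/}" ;;
        *)                 echo "unknown" ;;
    esac
}
```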
   
   Now all you do is modify the program so that, given a name above,
   it will return the required document.  This means basically turning
   it from a sequence the user goes through into a set of conditionals
   to isolate each of the individual cases above. Otherwise, the data
   retrieval code is unchanged apart from the output formatting. Many
   of the options in fact mean mapping the name onto a fixed file's
   name; it's the searches which have to activate real code.
   
   The hypertext trick is what you need to use in the menus. Where an
   option is normally output to the screen, you have to tell the client
   what to ask for if the user selects that option. For example, in the
   main menu /HEPDATA you have an option which gives the help. You
   would represent this "anchor" as
   
<A NAME=4 HREF=/HEPDATA/HELP> Help </A>

   "Help" is all that is displayed, with some indication that it is an
   option. If the user chooses it (by mouse click or by number,
   depending on which client he has) then the client asks the server
   for /HEPDATA/HELP. ("A" is for "anchor", "HREF" is for "hypertext
   reference")
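
   A server script can produce such anchors with a small helper. The
   sketch below assumes menu entries arrive as lines of the form
   "HREF TEXT"; the function names anchor and menu, and that input
   format, are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# anchor NAME HREF TEXT: print one anchor in the form shown above.
anchor() {
    printf '<A NAME=%s HREF=%s> %s </A>\n' "$1" "$2" "$3"
}

# menu: read "HREF TEXT" lines from standard input and print one
# anchor per line, numbering the anchors from 1.
menu() {
    n=0
    while read -r href text; do
        n=$((n + 1))
        anchor "$n" "$href" "$text"
    done
}
```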
   
   For the index searches, it's just as simple. When the server sends the
   text called /HEPDATA/REAC it also sends the <ISINDEX> tag. This tells
   the client to enable a FIND command, or find panel etc (depending
   on the client). You don't have to do any human interface work. The
   client automatically comes back with a search coded up in the form
   /HEPDATA/REAC?P+PBAR etc. Your server in turn returns a menu (say)
   with pointers to the data which has been found.
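
   On the server side, a search name has to be split into the document
   path and the search words. The following is a minimal sketch,
   assuming as in the example above that '?' introduces the search and
   '+' separates the words. The function name split_query is
   hypothetical.

```shell
#!/bin/sh
# split_query DOCID: print the path on the first line, then one
# search word per line.
split_query() {
    case "$1" in
        *[?]*)
            echo "${1%%[?]*}"                        # part before the first '?'
            printf '%s\n' "${1#*[?]}" | tr '+' '\n'  # words after it
            ;;
        *)
            echo "$1"                                # no search part
            ;;
    esac
}
```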
   
   You can also put some formatting tags (like headings) which will
   make the data look really nice on a window system.
   
   _________________________________________________________________
   
                                                                Tim BL
                                                                      
                           W3 AND HTML TOOLS
                                   
   These tools, part of the available WWW software, are for the
   management of W3 servers, generation of hypertext, etc.
   
Generating HTML

  List of filters         and converters between various formats and
                         HTML, collected by Richard Brandwein.
                         
  Mail Archive to HTML    Make that mail archive available on the web.
                         Markus.Stumpf@Informatik.TU-Muenchen.DE
                         
  Framemaker interface    There are some tar files on the anonymous
                         FTP archive on file://info.cern.ch/www/src
                         which allow FRAMEmaker to be used as a W3
                         tool. Dan Connolly, Convex. Includes MIF HTML
                         translation.
                         
  Generating HTML         These are scripts for generating SGML
                         hypertext from things like directory
                         listings, etc. Also, for checking and
                         correcting dubious HTML.
                         
  WP5.1 to HTML           WordPerfect 5.1 to HTML conversion
                         
  LaTeX to HTML           Code from Nikos Drakos, Computer Based
                         Learning Unit, University of Leeds.
                         
Editing HTML

  BBEdit Extensions       Allow easier editing of HTML files with BBEdit
                         on the Mac.
                         
  NeXTStep editor         WYSIWYG hypertext.
                         
  html-mode for Emacs     Not WYSIWYG but useful.
                         
Generating things from HTML

  Plain text             Use the line mode browser, www with options
                         like -n and -na or -listrefs.
                         
  LaTeX                  There are some scripts around to generate
                         LaTeX or variations from HTML.  Other sed
                         scripts can be used to combine documents at
                         various levels into one big book.
                         
Analysing Log Files

  Server log analysis     Analysing server logs requires first of all
                         changing the numeric internet node numbers
                         into domain names. httpd-analyse.c is a
                         program to do that. Feed the results through
                         awk and grep of your choice!  Some
                         documentation on the program.
                         
  Server log analysis     Getsites.c is a program which generates
                         reports on a weekly or monthly basis.
                         
Web Wanderers

  Web-roaming robot etc   Guido van Rossum's knobot code in the "Python"
                         language.
                         
  Web Checker             James Pitkow's web checking robot
                         
Public WWW Access Services

  Telnet server           Setting up a service machine for anonymous
                         users to log in to a www client.
                         
  Mail Robot              A program to return any information in the
                         web by electronic mail.
                         
                                                                Tim BL
                                                                      
HTMLGeneration

   Here are some example files you can use for generating HTML from
   lists of files and other things.
   
  RTF to HTML             Convert RTF (using specific styles) into
                         HTML.
                         
  fix-html.pl            written by Dan Connolly, is a perl script to
                         legitimize old HTML files into SGML-abiding
                         HTML (as per the DTD that Dan created).
                         
  texi2html              Lionel Cons' converter from GNU Texinfo
                         format.
                         
  text2html.sed           A sed script to turn plain text into
                         plain-looking valid HTML markup so that it
                         will be rendered just as it was.
                         
  ls2html.awk            is an awk script which will just take a list
                         of names and generate a menu.
                         
  dir2html               is a shell script which generates a menu of
                         pointers to files with particular suffixes in
                         a set of directories. It also includes a
                         README file at the head of the hypertext list
                         if one exists.
                         
  htn2html.c              See the Hytelnet gateway for the program to
                         convert hytelnet data into HTML.
                         
  findrefs.pl             Written by Ari Lemmke, finds references
                         http:... in plain text files and generates
                         anchors out of them.
                         
  LaTeX to HTML           LaTeX to HTML converter program by Nikos
                         Drakos - not only does it successfully show
                         the more complex LaTeX formatting, for
                         example for mathematics, but it also has a
                         set of iconic images, which are included for
                         navigation, and to mark footnotes and
                         references.
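
   As an illustration of the simplest kind of filter in this list, a
   plain-text converter in the spirit of text2html.sed might escape
   the characters that HTML treats as markup and wrap the result in a
   PRE element. This is a sketch under that assumption, not the script
   distributed by CERN; the function name text_to_html is
   hypothetical.

```shell
#!/bin/sh
# text_to_html: copy standard input to standard output as
# preformatted HTML, escaping '&', '<' and '>'.
text_to_html() {
    echo "<PRE>"
    # '&' must be escaped first so the other entities are not re-escaped.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
    echo "</PRE>"
}
```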
                         
   You can make any variations on these you like of course. [CERN does
   not accept any responsibility for things quoted in these lists].
   
Updating the Newsgroup lists

   To update some of the news pages automatically you must be logged
   on to the news server or have the news directories mounted.
   
    Carl mentioned that you must be a member of the UNIX group news
   (otherwise you won't have permission to read the news directories)
   but that doesn't seem to be necessary for these functions.
   
  UPDATEGROUPS
  
   This script updates the list of newsgroups. For the overview list,
   it saves everything before the "Others" heading, and adds on a list
   of pointers to newsgroup stems not already mentioned in the saved
   hypertext.
   
   For each stem, it saves any command before the glossary list of
   groups, and then regenerates that list of groups.
   
  NEWSPAGE_UPDATE (OLD)
  
   The script NewsPage_Update creates complete lists of active groups
   for the following groups: alt, bionet, bit, biz, cern, ch, comp,
   eunet, gnu, news, rec, sci, soc, talk, vmsnet. It does this by
   writing the header in explicitly for each group, and then
   generating a list of subgroups using FindGroups.
   
   For comp and news, a full list is placed in fullcomp.html and
   fullnews.html. The files comp.html and news.html are formatted by
   hand already, and so are not touched by the script.
   
   NewsPage_Update works by writing some HTML text into a file for
   each group to be updated, called [newsgroup_name].html.new, then
   calling the script FindNewsGroups.  This checks the file
   /usr/local/lib/news/newsgroups for the groups within the current
   group which are active.  Finally the new file is renamed to remove
   the .new.
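
   The write-then-rename step can be sketched as follows. The function
   name update_page is hypothetical, the header line is a placeholder
   (the real header text is not shown here), and standard input stands
   in for the list produced by FindNewsGroups.

```shell
#!/bin/sh
# update_page NAME: build NAME.html.new from a header plus standard
# input, then rename it to NAME.html once it is complete.
update_page() {
    {
        echo "<TITLE>Newsgroups: $1</TITLE>"   # placeholder header
        cat                                     # list of active groups
    } > "$1.html.new" || return 1
    mv "$1.html.new" "$1.html"
}
```

   Writing to a .new file first means that readers never see a
   half-written page: the old file stays in place until the rename.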
   
   The list of stems to search, and their titles and any other comment
   is hardcoded into the NewsPage_Update script, and the list is
   DUPLICATED in Others_Update.
   
  OTHERS_UPDATE
  
   The Others_Update script finds stems which are not included in the
   Overview.html file, but which are active.  This list of which
   groups not to include is hardcoded into the script.  For each
   group, it calls GrpCreate.  This adds the name to
   OtherGroups/Overview.  It then runs FindNewsGroups for each
   group.
   
    NOTE
    
   Once the script has completed, all the .new groups must be renamed
   manually to remove the .new extension.
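
   The manual renaming can be done with a short loop. This is a
   sketch only: the function name promote_new is hypothetical, and it
   assumes that every file ending in .new in the given directory is
   ready to replace its predecessor.

```shell
#!/bin/sh
# promote_new DIR: rename every *.new file in DIR to drop the suffix.
promote_new() {
    for f in "$1"/*.new; do
        [ -e "$f" ] || continue   # no matches: the pattern is left as-is
        mv "$f" "${f%.new}"
    done
}
```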
   
  GRPCREATE
  
   This reads a newsgroup stem name from stdin.
   
   It then creates the top of a file for the list of groups with that
   stem. This will be called ${nn}.html.new, where ${nn} is the stem
   name. Unfortunately there is no way to get a description of the
   stem to include in this file. However, if the .html file already
   exists, it will use everything up to and excluding the first DL tag
   from the .html file for the .html.new file. Therefore, everything
   above the DL tag may be hand edited.
   
   GrpCreate adds a pointer from OtherGroups/Overview.html.new to the
   .html file.
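
   Keeping the hand-edited text above the first DL tag can be done
   with a single sed command. The sketch below assumes that the DL tag
   begins on a line of its own, so that dropping whole lines loses
   nothing; the function name keep_head is hypothetical and this is
   not the GrpCreate source.

```shell
#!/bin/sh
# keep_head FILE: print the lines of FILE that come before the first
# line containing a DL tag.
keep_head() {
    # Delete from the first line matching <DL> (or <DL ...>) to the end.
    sed '/<[Dd][Ll][ >]/,$d' "$1"
}
```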
   
   The .html file is renamed .html.old, and the .html.new becomes
   .html, with diffs being stored in a .diffs file under the date.

.\" Macros for HTML
.\" Jim Davis 6 Nov 92
.ps 12
.in 5
.de B
..
.de R
..
.de H1
.ti -5
.ps 18
\fB\\$1\fR
.ps 12
.br
..
.de H2
.ti -3
.ps 14
\fB\\$1\fR
.ps 12
.br
..
.de H3
\\$1
.br
..
.de H4
\\$1
..
.de H5
\\$1
..
.de H6
\\$1
..
.de H7
\\$1
..
.de H8
\\$1
..
.de H9
\\$1
..
.de DL
.in +5
..
.de DE
.in -5
..
.de DT
.ti -3
* \\$1
..
.de DD
.br
..
 
   

Date: Wed, 4 Nov 1992 16:48:34 -0500
From: Jim Davis <davis@dri.cornell.edu>
To: wei@xcf.berkeley.edu, www-talk@nxoc01.cern.ch
Subject: improved printing of WWW files

   If you can't quite manage to live without hardcopy, you may wish
   sometimes to print WWW files.  I have written a couple of scripts
   to do this.  They are particularly useful with Pei Wei's excellent
   Viola WWW browser.
   
   A tar archive is available for anonymous FTP:
   
   dri.cornell.edu/pub/davis/print-www.tar
   
   It contains:
   

README
print-www
print-www.l
html-to-latex
html2latex.sed (modified version of original CERN version)

   The hardest part was writing the perl script to obtain documents
   via http protocol - turns out you can't just run pipes through
   telnet.
   
   The conversion from HTML to LaTeX is not really robust yet -  this
   is doubly hard since there is no guarantee that the HTML is legal.
   But at least it works for my test cases.  No doubt it will be
   improved in time.
   
   best wishes
   

                           GATEWAY SOFTWARE
                                   
   See also: W3 server software, W3 client software
   
   These are servers which provide data extracted from other systems.
   They are built using code from the basic daemon, or scripts.
   
  ACEDB gateway (see also the French version)
                          ACEDB is the database program written for
                         the nematode genome project.
                         
  FIND gateway           for CERN/VM XFIND which calls a REXX exec to
                         get the information from the XFIND system
                         running on the CERNVM mainframe.
                         
  Hytelnet gateway        A gateway to Peter Scott's list of telnet
                         sites
                         
  News Indexer           Index a news spool file using a gateway to
                         "ni".  Mitchell Charity, MIT.
                         
  VMS Help gateway        This allows any VMS help files to be made
                         available to WWW clients. Runs on VAX/VMS.
                         
  WAISGate                A gateway to information available using the
                         W.A.I.S. protocol.
                         
  DCLServer               A server for VMS systems which allows you to
                         write a gateway to your own favorite
                         information system using DCL.
                         
  System33                A (big) csh script server providing data
                         including Xerox System33 documents, man pages
                         in plain text, phone numbers, etc. etc...!
                         
  Oracle                  A generic server to Oracle. Could be used as
                         a basis for gateways to specific Oracle
                         databases.
                         
  Geography               Gateway to the Geography server at U
                         Michigan
                         
  TechInfo                TechInfo is the CWIS from MIT.  A gateway
                         exists thanks to Linda Murphy/Upenn.
                         
                                                                Tim BL
                                                                      
Geography gateway

                                                      Wed, 18 Nov 1992
                                                                      
                                                             Jim Davis
                                                                      
   Here is a quickly hacked up Gateway from WWW to the University of
   Michigan Geography server.  It expects one argument, a WWW doc id.
   It ignores the "pathname", extracts the search words, then passes
   those to the server.  It does NOT parse the data returned by the
   server (that is an improvement yet to be done) but you can
   understand the output.
   
   To use this, you would need to have an HTTP server running
   someplace where you can attach this gateway.  I can provide the
   very simple HTTP server I use here, but this subject is already
   documented in the WWW online documentation.
   
   Source code in perl
   
The WWW TechInfo gateway

   This is a gateway built using the basic server code, plus one source
   file in C. Thanks to Linda Murphy of the University of Pennsylvania
   for the techinfo code.
   
      The gateway data as running at CERN
      
      The source file
      
                                                                Tim BL
                                                                      
The W.A.I.S. - WWW gateway

   This is an example of a WWW server and a WAIS client. It is just
   the regular httpd daemon linked with:
   
      a version of the libwww library which was compiled with the
      DIRECT_WAIS option, and includes the HTWAIS module;
      
      the freeWAIS libraries from CNIDR.
      
   See a summary of some data available through the gateway.
   
  WSRC FILES
  
   The gateway keeps a cache of WAIS "source" files. These are files
   describing WAIS servers. They are normally picked up automatically
   by searching a "directory of servers" index. Once the gateway has
   picked up a description of a server, it uses the description to
   describe the server to those who follow links to it. (See the
   HTWSRC module of libwww)
   
   These source files are parsed, and are kept in the directory
   /usr/local/lib/WAIS under the server name, port, and database name.
   
                                                                Tim BL
                                                                      
   Warning: this no longer works with HTTP 1.0. This is a known bug.
   
VMS Help server

   This server can provide WWW users with any information stored in
   VMS Help format.
   
    Additional information available:       :->
    
  Try me!                An example server running at CERN
                         
  Status                 The current state, pointers to more
                         information
                         
                                                                   JFG
                                                                      
  GATEWAY TO VMS HELP: INTERNALS
  
   These are technical and installation notes about the gateway to VMS
   Help. Please send bug reports and suggestions to Jean-Francois
   Groff (jfg@cernvax.cern.ch).
   
    Sources
    
   The program consists of the generic daemon HTDaemon.c, and a
   special function, stored in VMSHelpGate.c, to retrieve VMS Help
   data and convert it to HTML.
   
    Installation
    
   The files you need are as follows. You should customise them,
   putting in your own directory names:
   
  launchgate.com         Runs the server as a detached process. Put a
                         call to this from your sys$startup procedure,
                         wherever that is. This detaches a job to use
                         www_server.com as input, and a log file as
                         output.
                         
  www_server.com         The server command file, a wrapper for the
                         actual server executable.  In this file, set
                         the temporary directory for the storage of a
                         cache of .HLP files. This file runs the
                         executable.
                         
  test.com               Here is just an example of a file to build
                         and test the server.
                         
  descrip.mms            This is an MMS file to build the executable.
                         If you don't have MMS, you may be able to
                         figure out from looking at it which commands
                         you should use.  You can find a machine
                         running MMS and generate the equivalent .com
                         files. See comments at the top of this file
                         on how to run it.
                         
   The source files and executable .EXE are currently (October 92)
   available on HEP  decnet in vxcrna::disk$d1:[jfg.www...].  Note
   also you can pick up the master sources from dxcern:: automatically
   by running
   
   MMS /MACRO=(U=DXCERN::).
   
   If you are not in HEP decnet, you should find the sources in the
   WWWDaemon_v.vv.tar.Z file in the distribution. See the README file.
   
   _________________________________________________________________
   
                                                                   JFG
                                                                      
  VMS HELP SERVER BUGS
  
   This is a list of known bugs and desired improvements. Don't let it
   shrink too fast: send your bug reports and suggestions to
   Jean-Francois Groff (jfg@cernvax.cern.ch).
   
      The keyword search works fine on any number of levels down, but
      then the generic daemon doesn't know how deep the server went,
      so anchor names lack the intermediate levels. Solution:
      generate anchor names relative to the input path (before '?').
      
      DANGER: Attempts to access VMS topics with a weird name like
      ":=" will crash the server because VMS will try to create a .HLP
      file with an invalid file specification due to these special
      characters. Solution: Make a good escaping system (that works
      with VMS and Un*x styles as well). Crude and bulletproof
      solution: Ignore any offending topic name!
      
      Reference to another help library through @ will only search
      SYS$HELP for the corresponding .HLB file.
      
      We need an overview page that lists all help libraries
      available.
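
   The crude solution to the topic name problem above can be sketched
   as a check applied before any .HLP file is created. The function
   name safe_topic and the set of characters it accepts are
   assumptions for illustration; they are not taken from the gateway
   source or from the VMS file specification rules.

```shell
#!/bin/sh
# safe_topic NAME: succeed only if NAME is non-empty and consists of
# letters, digits, '_', '/' and '-' (an assumed, conservative set).
safe_topic() {
    case "$1" in
        ''|*[!A-Za-z0-9_/-]*) return 1 ;;   # empty, or has another character
        *)                    return 0 ;;
    esac
}
```

   A server using this check would simply ignore any request for
   which safe_topic fails, as in: safe_topic "$topic" || exit 0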
      
        __________________________________________________________ JFG
                                                                      
  VMS HELP SERVER FEATURES
  
   This lists the main features of the VMS Help gateway, with
   improvements in reverse chronological order. Help make it grow fast:
   send your bug reports and suggestions to Jean-Francois Groff
   (jfg@cernvax.cern.ch).
   
    Experimental gateway 0.4 -- 2 Oct 91
    
      Accepts user queries by number or by name. In the latter case,
      can go down several levels, for instance, from the main help
      page : "cc /lib" will go to topic CC, subtopic /LIBRARY.
      
      On invocation with only //node:port/HELP, displays the contents
      of the standard VMS Help library SYS$HELP:HELPLIB.HLB (function
      lis_to_html).
      
      Address format : //node:port/HELP/[@library/][topic[/subtopic]*]
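
   The part of the address after the node and port can be taken apart
   with shell parameter expansion. The following is a minimal sketch,
   assuming the address format above; the function name
   parse_help_path and its output layout are hypothetical.

```shell
#!/bin/sh
# parse_help_path PATH: split a path such as /HELP/@mylib/CC/LIBRARY
# into a "library=..." line and a "topics=..." line. An empty library
# means the standard help library should be used.
parse_help_path() {
    rest=${1#/HELP}     # drop the fixed /HELP prefix
    rest=${rest#/}      # and the slash after it, if any
    library=
    case "$rest" in
        @*/*) library=${rest%%/*}; library=${library#@}; rest=${rest#*/} ;;
        @*)   library=${rest#@}; rest= ;;
    esac
    echo "library=$library"
    echo "topics=$rest"
}
```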
      
   __________________________________________________________
   
                                                                   JFG
                                                                      
                             STYLE GUIDE
                                   
   This guide is designed to help you create a hypertext database which
   effectively communicates your knowledge to the reader.  It has been
   prepared in the light of comments by readers, and many demands by
   providers of online documentation.  Some of the points made may be
   influenced by personal preference, and some may be common sense, but
   a collection of points has been demanded, and so here it is.
   
   The guide is designed to be read sequentially, but feel free to
   depart from this.  The sections are as follows:
                                                                      
      Introduction
      
      Overall structure of your work
      
      Within each document
      
      Test your document
      
      Background reading
      
      Reader comments
      
This document is open to comment

   Suggestions are strongly invited; if you think of anything, mail it
   to timbl@info.cern.ch, mentioning the Style Guide for Online
   Hypertext or its URL.
   
                                                                Tim BL
                                                                      
Introduction

   You are going to write (or generate) some online hypertext. Because
   hypertext is potentially unconstrained you are a little daunted. Do
   not be. You can write a document as simply as you like.  In many
   ways, the simpler the better.
   
   You will be writing a number of separate files.  These files will be
   linked to each other, and to external documents, to make your final
   work.
   
   You may think of your work as a "document", and if it were on paper,
   then you would call it that.  In the online case though, we tend to
   refer to each individual file as a document. A document may
   correspond, in the book analogy, to a section or a subsection, or
   even a footnote.  In this guide, we'll refer to the whole collection
   as a work.
   
   The document is the unit by which information is picked up.  At any
   one time, a document is completely loaded into the reader's
   computer. It is also normally the amount you edit at any one time,
   though with a good editor you will probably have a number of
   documents open at a time.
                                                                      
The section on structure discusses how you organise your material into
documents. Another section discusses how to organise your material
within a document.
                                                                      
                           (Up to overview ,  on to structure ) Tim BL
                                                                      
Structure

If you have in mind a body of information to put across to your
reader, you probably have a mental organisation for it. Normally this
is a sort of hierarchical tree, like the chapters of a book if you
were to write a book.
                                                                      
Keep this structure. It helps readers to have a tree structure as a
basis for the book: it gives them a feeling of knowing where they are.
You can also use this structure for organising your files in
directories.
                                                                      
You should also bear in mind:
                                                                      
      The reader's preconceived structure
      
      The idea of overlapping trees
      
      How big to make each document
      
(Up to overview, back to Introduction, on to: writing each document)
                                                                Tim BL
                                                                      
  THE READER'S STRUCTURE .
  
Remember always the audience for whom you are writing. If they are
novices in the subject, it will normally help if you are firm about
the structure of your work, so that they can learn the structure of
the knowledge itself. For example, if you feel that the subject falls
into three distinct areas, then that is an important thing to teach.
                                                                      
If, however, your readers will already have some knowledge in the
subject, then they will already have formed their own structure for
it. In this case they will consciously or subconsciously know where
they expect to find things. If your structure is different from
theirs, enforcing it too strongly will confuse them and put them off.
                                                                      
You may in this case have to resist a strong tendency to put across
your own structure strongly and to the detriment of all others. There
are two solutions.
                                                                      
If you have a single well-defined audience in mind, who will share a
similar world view, then try to write exactly for that world view
rather than yours.
                                                                      
If you are simultaneously writing for more than one group, then you
must provide for both.
                                                                      
When you make a reference, qualify it with a clue to allow some people
to skip it. For example, "If you really want to know how it works
inside, see the Internals guide", or "A step-by-step introduction is
in the tutorial".
                                                                      
Provide links for both readers' views. Your work will be more
connected than a simple tree, but with proper qualification, no one
should get lost.
                                                                      
Provide two separate tree "roots". For example, you can write a
step-by-step tutorial and a functionally organised reference tree for
the same data. Both will at the lowest level have the same data, but
while the first will deal with the simple things first, the second may
be functionally grouped. This is just like having several indexes to a
book. The tutorial might also include information which the reference
work does not.
                                                                      
                                                                      
(Up to overview, back to Introduction, on to: writing each document)
                                                                Tim BL
                                                                      
  OVERLAPPING TREES
  
Here is an example of a work (describing some programming functions,
say) with two separate structures:
                                                                      
                        Tutorial                        Reference
                           |                                |
                  Let's do it together              -----------------
                from simple to difficult            |               |
                           |                  by Functional   Alphabetical
                           |                      group          by name
                  Task oriented examples            |               |
                           |                        -----------------
                           |                                |
                  Examples of use of              Syntax definition for
                  specific functions   <-------->  specific functions

The novice user starts at the top left, and works his way down. Where
he needs specific details, he will get down to the examples and from
them a link to the underlying definitive descriptions of each. As far
as he is concerned, he is reading a tree-structured work. In fact, he
is reading the same information as the expert who, coming in to check
on one particular function, then looks up an example of its use.

(Up to structure, back to user's structure, on to: document size)
                                                                Tim BL
                                                                      
  HOW BIG TO MAKE EACH DOCUMENT
  
The most important point here is that a document should put across a
well-defined concept. It is not generally worth splitting one idea
arbitrarily into two bits in order to make the bits smaller. Nor is it
a good idea to put together ideas which are really separate just to
make a bigger document.
                                                                      
A document can be as small as a footnote.
                                                                      
There are two upper limits on a document's size. One is that long
documents will take longer to transfer, and so a reader will not be
able to simply jump to it and back as fast as he or she can think.
This depends a lot on the link speed of course.
                                                                      
The other limit is the difficulty for a reader to scroll through large
documents. Readers with character based terminals don't generally read
more than a few screens. They often only absorb what is on the first
screen, since if that is not interesting they won't be bothered to
scroll down. Readers are also put off by being left at the top of a
large document.
                                                                      
Readers with graphic interfaces generally scroll through long
documents with a scroll bar. When the scroll bar is moved a small
amount, the document should move a sufficiently small amount so that
some of the original window-full is still left in the window. This
allows the reader to scan the document. If the document is any bigger,
then it is basically unreadable, in that any movement of the scroll
bar loses the place and leaves the reader disoriented.
                                                                      
An advantage of longer documents is that it is easier for readers with
scrollbars to read through in an uninterrupted flow, if that is how
the document is written.
                                                                      
Also, one doesn't have to go to the trouble of making (or generating)
so many links and keeping them up to date if things are altered. If
making the links is a problem, just settle for one link to a contents
page. Some browsers have "next" and "previous" buttons to allow a
document to be browsed serially according to a list.
                                                                      
(In fact, one can normally scroll up and down explicitly page by page,
but this gives the same feeling as the terminal interface.)
                                                                      
A rough guide, then, for the size of a document is:
                                                                      
      For online help, menus giving access to other things: small
      enough to fit on 24 lines.  Check this by using a terminal
      browser.
      
      For textual documents, of the order of half a letter-sized (A4)
      page to 5 pages.
      
(Up to structure, back to overlapping trees, on to: within each
document) Tim BL
                                                                      
Within each document

   This section of the style guide deals with the layout of text
   within a "document", the unit of retrieval of information on the
   web.
   
   To be completed.
   
   You should try to:
   
      Sign your work
      
      Give its status
      
      Make links into context .
      
      Use context-free document titles
      
      Format device-independently
      
      Write for the printed work too
      
      Write readable text despite the links
      
      Avoid talking about mechanics
      
   (up to overview , back to structure , on to testing )
   
                                                                Tim BL
                                                                      
  SIGN IT!
  
   An important aspect of information which helps keep it up to date
   is that one can trace its author.  Doing this with hypertext is
   easy -- all you have to do is put a link to a page about the author
   (or simply to the author's phone book entry).

   Make a page for yourself with your mail address and phone number.
   At the bottom of files for which you are responsible, put a small
   note -- say just your initials -- and link it to that page. The
   address style (typically right justified) is useful for this.
   
   Your author page is also a convenient place to put any
   disclaimers, copyright notices, etc. which law or convention
   require. It saves cluttering up the messages themselves with a long
   signature.
   
   If you are using the NeXT hypertext editor, then you can put this
   link from your default blank page so that it turns up on the bottom
   of each new document.
   
   ( up , back to ..., on to  giving your document's status)
   
  THE STATUS OF YOUR DOCUMENT
  
   Some information is definitive, some is hastily put together and
   incomplete. Both are useful to readers, so do not be shy to put
   information up which is incomplete or out of date -- it may be the
   best there is. However, do remember to state what the status is.
   When was it last updated? Is it complete? What is its scope? For a
   phone book for example, what set of people are in it?
   
   Not every document needs a status declaration, if  there is
   something in the overview page of the work which covers it.
   
   You can of course also give a feel for the status of the text by
   its language ... bad spelling, missing capitals, and relaxed
   grammar all indicate informal notes.  Careful use of verbs such
   as "shall" and "should", and the introduction of Long Capitalised
   Noun Phrases (LCNPs) will give at least the impression of an ISO
   standard.  ;-)
   
    Date it
    
   In some cases it can be useful to put creation dates and last
   modified dates on your work.  (Note that this is the sort of thing
   which one could make a server do automatically with a little
   programming).
   
   Figure out whether putting one might later save the reader from
   following out of date information.
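
   As a minimal sketch of the automatic dating mentioned above, and
   not a feature of any particular server, the Python below reads a
   file's last-modified time from the file system and formats a status
   note.  The function name status_line and the wording of the note
   are assumptions made for illustration.

```python
import os
import time


def status_line(path, status="draft"):
    """Return a one-line status note carrying the file's last-modified date."""
    mtime = os.path.getmtime(path)  # seconds since the epoch
    stamp = time.strftime("%Y-%m-%d", time.gmtime(mtime))
    return "Status: %s. Last modified: %s (UTC)." % (status, stamp)
```

   A server or a document generator could append such a line to each
   page, so that the date never depends on the author remembering to
   update it.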
   
   (back to Sign It, On to links into context )
   
  LINKING TO CONTEXT
  
   A major difference between writing part of a serial text, and an
   online document, is that your readers may have jumped in from
   anywhere.   Even though you have only made links to it from one
   place, any other person may want to refer to that particular point,
   and so will make a link to that particular part of your work from
   their own. So  you can't rely on your reader having followed your
   path through your work.
   
   Of course if you are writing a tutorial, it will be important to
   keep the flow from one document to the next in the order you
   intended for its primary audience.   You may not wish to cater
   specially for those who jump in out of the blue, but it is wise to
   leave them with enough clues so as not to be hopelessly lost. Some
   ways of doing this are:
   
      Watch that your text and vocabulary stand by themselves. Starting
      a document with "The next thing we consider is..." or "The only
      solution to this problem is..." will certainly confuse.
      
      Sometimes the opening words refer to the context, and can be
      linked to background information.   For example, in the WWW
      project documentation, the first occurrence of the acronym WWW is
      often linked back to the central project document.
      
      The navigation hints at the top or bottom of the document can
      give explicit pointers.  Examples are at the bottom of this
      document.
      
   It can also be useful to imagine as you are writing that you
   yourself may wish to reuse the document some day.
   
   (Part of style guide for online hypertext . Up to Writing each
   document , on to Title tag)
   
                                                                Tim BL
                                                                      
  DEVICE INDEPENDENCE
  
   The hypertext you write is stored in the HTML language, which does
   not contain information about the fonts and paragraph shapes and
   spacing which should be used for displaying the document.
   
   This gives great advantages in that your document will be rendered
   successfully on whatever platform it is viewed, including a plain
   text terminal.
   
   You should be aware that different clients do use different spacing
   and fonts.   You should be careful to use the structuring elements
   such as headers and lists in the way in which they were intended.
   If you don't like the rendering on your particular client, don't
   try to fix it by using inappropriate elements, or trying for
   example to force extra spacing with empty elements.  This may well
   end up being interpreted differently by other clients and looking
   very strange.  You can in many cases configure how the client
   displays each element.
   
   For example:

      Always use heading levels in order, with one heading level 1 at
      the top of the document, and if necessary several level 2
      headings, and then if necessary several level 3 headings under
      each level 2 heading.  If you don't like the way heading level 2
      is formatted, fix it on your client, don't just skip to heading
      level 3.
      
      Don't put extra spaces or blank lines into your text to pad it
      out, except in preformatted (PRE) sections.
      
      Don't refer in your text to facets of particular browsers.
      Asking someone to "click here" won't make sense without a mouse,
      just as asking someone to "select a link by number" will betray
      the fact that you were using the line mode browser.  Just leave
      a link.  The instructions get boring as the user will normally
      know how to select a link.
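
   The heading rule in the first example above can be checked
   mechanically.  The following is a rough Python sketch using only
   the standard html.parser module; the names HeadingChecker and
   heading_problems are invented here, and real documents may need a
   more forgiving check.

```python
from html.parser import HTMLParser


class HeadingChecker(HTMLParser):
    """Collect heading levels (h1..h6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <H2> arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html_text):
    """Return a list of complaints about heading order (empty if none)."""
    checker = HeadingChecker()
    checker.feed(html_text)
    checker.close()
    problems = []
    if checker.levels.count(1) != 1:
        problems.append("expected exactly one level 1 heading")
    previous = 0
    for level in checker.levels:
        if level > previous + 1:
            # A heading may go at most one level deeper than the last one.
            problems.append("heading level %d follows level %d" % (level, previous))
        previous = level
    return problems
```

   An empty result means the headings descend one level at a time
   from a single level 1 heading.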
      
   See also: testing your document .
   
   Following these guidelines you may find that the end result does
   not appear on your screen exactly as you would like, but your
   readers will probably be happier.
   
   (Part of the Style Guide for Online Hypertext .  Up to within each
   document , back to , on to printable hypertext)
   
                                                                Tim BL
                                                                      
  PRINTABLE HYPERTEXT
  
   In an ideal world, paper might not be necessary.  In a next to
   ideal world, one would have enough time to write a hypertext
   version of a document and also completely reauthor a paper
   version.  In the real world, you will probably want to generate
   any printed documents and online documents from the same file.
   
   Suppose the HTML files will be the master, and you will generate
   the printable from this, by translation into TeX, etc.
   
   If you might one day want to do this, try to avoid references in
   the text to online aspects.  "See the section on device
   independence " is better than "For more on device independence,
   click here .".  In fact we are talking about a form of device
   independence .
   
   Unfortunately the recommended practices of signing each document
   and giving navigational links  tend to mess up the printable copy,
   though one can of course develop ways of stripping them out if they
   follow a common format.
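
   One hedged sketch of such stripping, in Python, works on the plain
   text of a document.  It assumes that navigation hints are
   parenthesised paragraphs beginning "Up to", "Back to" or "Part of",
   and that the signature is a paragraph consisting only of the
   author's initials.  Both conventions are assumptions drawn from
   this guide's own pages, and strip_for_print is a hypothetical name.

```python
import re

# A paragraph counts as a navigation hint if it opens with "(Up", "(Back"
# or "(Part of", in any letter case (an assumed convention).
NAV = re.compile(r"^\(\s*(up|back|part of)\b", re.IGNORECASE)


def strip_for_print(text, signature="Tim BL"):
    """Drop navigation hints and bare signature paragraphs from plain text."""
    kept = []
    for para in re.split(r"\n\s*\n", text):
        body = para.strip()
        if body == signature or NAV.match(body):
            continue
        kept.append(para)
    return "\n\n".join(kept)
```

   The stricter the common format, the simpler this filter can be,
   which is one more reason to write the hints consistently.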
   
   (Up to:  within each document;  back to device independence , on to
   readable text)
   
                                                                Tim BL
                                                                      
Test your document

   In a way your hypertext is like a book, which you should have
   proof-read. In a way, it is like a program which you should have
   tested.  At least get someone from the group for which you wrote
   the document to read it and give you some feedback.  Other ideas
   are:
   
      Read the document with several different client programs, to ensure
      that you have formatted it in a device independent way.
      
      Monitor the readership of your document. You can do this by
      analysing the server log files .    You may find that some parts
      are not being read, perhaps because people are looking in the
      wrong place for them.  You may see that people often follow a
      path and backtrack. If you can guess what they were looking for,
      you can make the clues around the link more helpful.  (Remember
      to keep log information confidential until you have removed user
      information from it.)
      
      Make it clear whether you will accept criticism or suggestions
      from your readers, and how they should send it.
      
      Ask people to solve problems using the document, and report on
      their success. If they fail, find out what they were looking
      for, whether it was in the document at all,
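
   The log analysis suggested above might be sketched as follows in
   Python.  The log layout assumed here, with the request quoted as
   "GET /path ...", is an assumption and may not match your server's
   log files; count_documents is a hypothetical name.  Only the
   document path is read, so no user information is carried into the
   result.

```python
import re
from collections import Counter

# Assumed layout: the request appears in double quotes, method first.
REQUEST = re.compile(r'"(?:GET|HEAD) ([^\s"]+)')


def count_documents(log_lines):
    """Count requests per document path, ignoring who made them."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

   Sorting the result with counts.most_common() shows which documents
   are read most, and documents missing from it are not being read at
   all.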
      
  HOW MUCH TESTING?
  
   Testing takes time.    The decision of how  much testing you do is
   based on the quality of the document you wish to provide.  You are
   balancing your reader's time and effort against yours.   If your
   document is "selling" an idea, or if you are selling the document
   or providing a service, you will want  to make it as easy as
   possible for the reader.   If many people will read your work, a
   little of your time will save a lot of theirs.
   
   If however you are documenting some obscure part of a system in
   which no one other than yourself is likely to be interested,  or if
   you feel that your readers are lucky to have anything available at
   all, there is no point wasting time testing it.  In the event of
   someone needing the information, they might have to go to some
   extra trouble  to follow several links to find what they want, and
   then to understand what you have written.  This may be the most
   efficient way of working.  I emphasize this because there is very
   much information which is for a fleeting moment in people's minds,
   or is hastily scribbled down on some file, and which may be
   important to posterity.  It is better for this information to be
   available even in unpolished form than for it to be hidden out of
   embarrassment for its form.   Before electronic technology, the
   effort of publishing was such that this information was never seen,
   and it was a waste, and considered an insult to one's readers,
   to publish something which was not of high quality.  Nowadays,
   there is "publishing" at all levels, and both high quality and
   hasty documents have their value.    It is important, though, to
   make it clear what the quality of a document is when making a
   reference to it, to avoid disappointment.
   
   Monitoring the server log files will tell you which documents are
   really being read.  You can use your time most efficiently to
   improve the quality of those.  Of course, analysing the server log
   files also takes time!
   
   (Part of the Style Guide for Online Hypertext . Back to Within each
   document, On to Background reading)
   
                                                                Tim BL
                                                                      
Background reading

   Some other documents which may be of relevance, if you are reading
   the Style Guide for Online Hypertext :
   
      The HTML Specification and references from it

      A Beginner's Guide to writing HTML
      
      World-Wide Web server software - a list of pointers
      
      Web Etiquette -- for Server Administrators
      
   (Back to testing, on to ...)
   
                              MAIL ROBOT
                                   
   The mail robot is a program which will accept incoming mail and
   allow remote users to:
   
      Subscribe to mailing lists (and unsubscribe)
      
      Retrieve information given a W3 address (URL)
      
   Originally from UC Berkeley, an enhanced robot is distributed as
   part of the world-wide web global information initiative . Further
   information available is:
   
  Help                    The help file for users of the robot service
                         
  Installation            Installation instructions for unix system
                         managers
                         
  Bugs                    Lists of improvements requested or needed.
                         
  Change history          A list of features introduced and bugs
                         fixed.
                         
  See also               Other WWW software
                         
Using the W3 mailing robot

   This robot maintains the W3 mailing lists, and allows W3 documents
   to be retrieved on request.
   
   You can subscribe or unsubscribe to any of the various WWW mailing
   lists by sending email to the robot "listserv@info.cern.ch" -- see
   the commands listed below.
   
   If you have any problems, requests or questions for a human being,
   mail "www-request@info.cern.ch". Lists are:
   
  www-announce            Anyone interested in WWW, who would like
                         information about new releases or new online
                         data available. Please refrain from posting
                         administrivia to this large list !
                         
  www-talk                Developers of WWW code, or those interested
                         in discussions of technical details
                         
   You can also find information on WWW (as well as many other
   things!) by telnetting to info.cern.ch (no username, no password).
   
   If you want to pick up the WWW software, then use anonymous FTP to
   info.cern.ch and look in directory /pub/www. Subdirectories are src
   for the latest source packages, bin for executables for various
   machines, doc for "paper copies" of articles on WWW in PostScript
   and ASCII form. To read the latest documentation, use WWW !
   
  COMMANDS
  
   The commands understood by the listserv program are:
   
  HELP                    lists this file.  This is also sent whenever
                         a message to listserv is received from which
                         no valid command could be parsed.
                         
  HELP groupname          lists a brief description of the group
                         requested.
                         
  ADD listname            Add yourself to the list
                         
  DELETE listname         take yourself off the list
                         
  ADD address listname    Add yourself with a given mail address to
                         the given list. The address must not contain
                         spaces!
                         
  DELETE address listname
                          Remove the given name from the given list.
                         For all ADD/DELETE commands, mail is sent to
                         the address given to confirm the add or
                         delete operation.
                         
  SEND document-address   returns a document with the requested W3
                         address.
                         
  STOP                    Stop processing requests: ignore the rest of
                         the message. Needed if you send a signature
                         on the end of your message (or if some
                         gateway adds one). If in doubt, use it.
                         
   A command must be the first word on each line in the message.
   Lines which do not start with a command word are ignored.  If no
   commands were found in the entire message, this help file will be
   returned to you. A single message may contain multiple commands; a
   separate response will be sent for each.
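
   The parsing rules above can be illustrated with a short Python
   sketch.  This is not the robot's own code: the function name
   parse_commands is invented here, and matching command words
   without regard to case is an assumption based on the lower-case
   examples.

```python
COMMANDS = {"HELP", "ADD", "DELETE", "SEND", "STOP"}


def parse_commands(body):
    """Return the (command, arguments) pairs found in a message body."""
    found = []
    for line in body.splitlines():
        words = line.split()
        if not words:
            continue
        command = words[0].upper()
        if command not in COMMANDS:
            continue  # lines without a leading command word are ignored
        if command == "STOP":
            break  # ignore the rest of the message, e.g. a signature
        found.append((command, words[1:]))
    return found
```

   An empty result corresponds to the case in which no valid command
   could be parsed, where the help file is returned.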
   
    Examples
    

        add www-announce

        add me@host.uni.edu www-announce

        delete me@host.uni.edu www-talk

        send http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html

  SUBSCRIPTION
  
   If you are not sending mail from your preferred mail address, then
   you can use the second form of the command to give your mail
   address. If you are not on the internet, please convert your
   address into ARPA style. (For example, UK users please use
   international ordering joe@host.ac.uk) Just specify the mailbox,
   without any spaces.
   
   If you omit the 'address' the command will assume the mailbox that
   is in the From: line of the message.  Note that SUBSCRIBE is a
   synonym for ADD; UNSUBSCRIBE for DELETE.
   
   Please note that it IS possible to add or delete someone else's
   subscription to a mailing list.  This facility is provided so that
   subscribers may alter their own subscriptions from a new or
   different computer account. There is therefore some potential for
   abuse; we have chosen to limit this by mailing a confirmation
   notification of any addition or deletion to the address added or
   deleted including a copy of the message which requested the
   operation.  At least you can find out who's doing it to you.
   
   Note that although you would mail submissions to a mailing list by
   addressing mail to e.g., www-talk@info.cern.ch, in a subscription
   request you specify the name of the list simply (without the
   @hostname part) as in the first example above.
   
  RETRIEVING DOCUMENTS
  
   The SEND command (or the WWW command which is equivalent) returns
   the document with the given W3 address, subject to certain
   restrictions. Hypertext documents are formatted to 72 character
   width, with links numbered. A separate list at the end gives the
   document-addresses of the related documents.
   
   If the document is hypertext, its links will be marked by numbers in
   brackets, and a list of document addresses by number will be
   appended to the message. In this way, you can navigate through the
   web, albeit only at mail speed.
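
   To illustrate the numbered-link format, here is a simplified Python
   sketch using the standard html.parser and textwrap modules.  It is
   not the line mode browser's implementation: it flattens the whole
   document into one paragraph, and the names LinkNumberer and render
   are hypothetical.

```python
import textwrap
from html.parser import HTMLParser


class LinkNumberer(HTMLParser):
    """Collect running text, marking each link with a number in brackets."""

    def __init__(self):
        super().__init__()
        self.text = []          # pieces of running text
        self.refs = []          # link addresses, in order of appearance
        self.open_link = False  # True while inside <a href=...>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            self.open_link = href is not None
            if href is not None:
                self.refs.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self.open_link:
            self.text.append("[%d]" % len(self.refs))
            self.open_link = False

    def handle_data(self, data):
        self.text.append(data)


def render(html_text, width=72):
    """Return wrapped text followed by a numbered list of link addresses."""
    parser = LinkNumberer()
    parser.feed(html_text)
    parser.close()
    words = "".join(parser.text).split()
    body = textwrap.fill(" ".join(words), width=width)
    refs = ["[%d] %s" % (n, url) for n, url in enumerate(parser.refs, 1)]
    return "\n".join([body, ""] + refs)
```

   A reader can then pass any address from the appended list to a
   further SEND command to follow that link.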
   
   If you don't know where to start, try asking for one of
   

 http://info.cern.ch./hypertext/DataSources/bySubject/Overview.html
 http://info.cern.ch./hypertext/DataSources/bySubject/Physics/HEP.html
 http://info.cern.ch./hypertext/WWW/TheProject.html

   for lists of further pointers.
   
  CAUTIONARY NOTE
  
   As the robot gives potential mail access to a *vast* amount of
   information, we must emphasise that the service should not be
   abused. Examples of appropriate use would be:
   
      Accessing any information about W3 itself;
      
      Accessing any CERN and/or physics-related or network development
      related information;
      
   Examples of INappropriate use would be:
   
      Attempting to retrieve binaries or .tar files or anything more
      than directory listings or short ASCII files from FTP archive
      sites;
      
      Reading internet newsgroups which your site doesn't take;
      
      Repeated automatic use;
      
   There is currently a 1000 line limit on any returned file. We don't
   want to overload other people's mail relays or our server. We
   reserve the right to withdraw the service at any time. We are
   currently monitoring all use of the server, so your reading will
   not initially enjoy privacy. End of cautionary note.
   
   Enjoy!
   
                           The W3 team at CERN  (www-bug@info.cern.ch)
                                                                      
Installation

   Here are the steps necessary to install the Mail Robot product on
   your unix system.
   
  CUSTOMISATION
  
   Set up the variables in listserv.h and CommonMakefile to suit your
   site.
   
  POSTMASTER              The address from which messages appear to
                         come. Why not listserv? Perhaps to prevent
                         mail loops.
                         
  SECUREWWW               The executable W3 line mode browser (v1.3 or
                         later, so as to have the -listrefs option).
                         This is a separate product. For security, www
                         should be writable only by root.
                         
  SERVERDIR               The directory in which you want to put your
                         mailing lists and help about them.
                         
  COMPILE THE PROGRAMS
  
   Everything compiled on AEM's MicroVax II running ULTRIX 3.0, and
   then on TBL's NeXT, without any problem at all. Your results may
   vary.
   
  CREATE YOUR SERVDIR
  
   Create it wherever you specified in listserv.h. Install a HELP
   file, perhaps using the example-files/HELP in this directory as a
   template.
   
  SET UP AN ALIAS "LISTSERV"
  
   Make an alias in your /etc/aliases (or /etc/sendmail/aliases,
   whatever you have) that points to this program, for example:
   

                listserv:       "|/usr/local/mail/listserv"
                robot:          "|/usr/local/mail/listserv"
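
   With a piped alias like these, sendmail hands each incoming
   message to the program on its standard input. The sketch below
   shows, in Python rather than the robot's own C, how such a program
   might pick the first command out of a message body. The function
   name first_command and the parsing rule (first non-blank line,
   split on white space) are assumptions, not the robot's documented
   behaviour.

```python
from email import message_from_string


def first_command(raw_message):
    """Return the words of the first non-blank body line, or []."""
    msg = message_from_string(raw_message)
    # For a simple (non-multipart) message the payload is a string.
    body = msg.get_payload()
    for line in body.splitlines():
        if line.strip():
            return line.split()
    return []


raw = (
    "From: reader@example.org\n"
    "To: listserv@example.org\n"
    "\n"
    "\n"
    "HELP\n"
)
print(first_command(raw))
```

   A real filter would read the message with sys.stdin.read()
   instead of using a literal string.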


  FOR EACH MAILING LIST
  
   Create a name.info file giving a bit of information about that
   mailing list. See the *.info files in the example-files
   subdirectory.
   
   Create a name file in the same directory, listing the email
   addresses of the group's subscribers, one to a line. If it is for a
   brand-new group, create an empty file. Remember that this file must
   be writable by the mail daemon. The name of the file is just the
   name of the group.
   
   Depending on how you have your mailing lists set up, you may need
   to add an alias to the /etc/aliases file for each of the mailing
   lists. For example:
   
        real-recipes: :include:/usr/local/mail/maillists/recipes

   Mail sent to real-recipes then goes to each of the subscribers
   listed in /usr/local/mail/maillists/recipes.
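
   Because the subscriber file is just one address per line, it is
   easy to inspect or update with a short script. The sketch below is
   an illustration only: the function names read_subscribers and
   subscribe are hypothetical, and the case-insensitive duplicate
   check is an assumption rather than something the robot is known to
   do.

```python
import os
import tempfile
from pathlib import Path


def read_subscribers(path):
    """Return the addresses in a list file, one per non-blank line."""
    p = Path(path)
    if not p.exists():
        return []
    return [line.strip() for line in p.read_text().splitlines() if line.strip()]


def subscribe(path, address):
    """Add address to the list file; return False if already present."""
    current = read_subscribers(path)
    if address.lower() in (a.lower() for a in current):
        return False
    # Rewrite the whole file so it always ends with a newline.
    Path(path).write_text("\n".join(current + [address]) + "\n")
    return True


# Demonstration on a temporary file standing in for a list file
# such as the "recipes" example above.
demo_dir = tempfile.mkdtemp()
demo_list = os.path.join(demo_dir, "recipes")
Path(demo_list).write_text("")  # a brand-new group starts empty
subscribe(demo_list, "reader@example.org")
print(read_subscribers(demo_list))
```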
   
  INSTALL LISTSERV
  
   Install in the appropriate directory.  Edit the CommonMakefile and
   then run
   
                make install

  RUN NEWALIASES
  
   This gets sendmail to read the changes in /etc/aliases.

                newaliases

  TRY IT OUT
  
   Send mail to listserv with body
   

                HELP

   for example.  You should get a plain text version of the help file.
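
   A test request of this kind can also be composed by program. The
   sketch below builds the message with Python's standard email
   package; the addresses are placeholders and the function name
   build_request is hypothetical. Actually sending it (for instance
   with smtplib) is left out, since that depends on your site's mail
   setup.

```python
from email.message import EmailMessage


def build_request(sender, robot_address, command="HELP"):
    """Build a mail message whose body is a single robot command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = robot_address
    # The command goes in the body, as in the example above.
    msg.set_content(command + "\n")
    return msg


request = build_request("reader@example.org", "listserv@example.org")
print(request)
```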
   
Mail Robot

   This is a "listserv" type program which maintains mailing lists,
   and allows W3 documents to be retrieved by electronic mail.
   
  Author:                 Various, modified by TBL.
                         
  Status:                 Source available by anonymous FTP. (Oct 92)
                         
  Current version:        1.0
                         
  Platforms:              Unix only.
                         
  More information:       Overview, Bugs, change history.
                         
Bugs

   This is a list of bugs in or improvements desired in the Mail
   Robot. See also the list of bug fixes.
   
      The INDEX command ought to be implemented, but for some reason
      it almost always returns an empty list.  Only occasionally does
      it seem to work.
      
Change History

   Changes to the Mail Robot, in reverse chronological order:
   
  OCTOBER 1992
  
   TBL added the possibility of information retrieval using WWW.
   Released as an unsupported W3 product to those who ask for it.
   
  1991
  
   TBL rewrote str.c (used to overwrite its arguments).
   
  AEM
  
   A. E. Mossberg, aem@mthvax.cs.miami.edu made a couple of minor
   changes, to make it slightly less UCSD-specific. He also added a
   README, and example files in the subdirectory example-files.

  ORIGIN
  
   Note this is NOT the bitnet LISTSERV program. The term "mail robot"
   is used to attempt to prevent confusion between these two
   products, which have different functionality although they do
   basically the same sort of thing.
   
   This was the UCSD listserv program, which AEM retrieved from
   ucsd.edu by anonymous ftp and TBL retrieved from ftp.eff.org. As
   retrieved from file://ftp.eff.org/pub/listserv2.shar, it consisted
   of the following files:
   
                        README
                        Makefile
                        commands.c
                        listserv.h
                        main.c
                        str.c
                        subscribe.c