Introduction

By default IMSpector will read the file /usr/etc/imspector/imspector.conf but this can be changed at runtime with the -c /path/to/imspector.conf switch.

IMSpector will always fork into the background when started, unless it is running in debug mode with the -d switch.

General options

port=16667

This option sets the listening port, and is the port that a redirect rule should point outgoing connection requests to. 16667 is the default port, for historical reasons (IMSpector started out as an IRC-only proxy).

http_port=18080

If present, then IMSpector will listen for HTTP proxy requests on this port. Note that IMSpector only understands the HTTP CONNECT method (used for SSL proxying, but also by IM clients). IMSpector is not a real HTTP proxy.

Presently no authentication or client IP checks are performed. Anyone who can connect to this port can use it to connect to any remote port that is understood by IMSpector. You may wish to use the listenaddr directive to restrict access, eg to localhost.

listenaddr=0.0.0.0

Sets the IP address to listen on. This can be used, for example, to limit access to localhost (127.0.0.1).

plugin_dir=/usr/lib/imspector

Sets the directory that IMSpector will look in for its various plugins. /usr/lib/imspector is the default.

user=imspector
group=imspector

If set, IMSpector will setuid and setgid to this named user and group. By default, IMSpector will continue running as whatever user it was started by, including root (which is not recommended).

pidfilename=/var/run/imspector.pid

Sets the PID filename. The PID file is created before optionally dropping privs.
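
The options above all share the same key=value form. As a minimal sketch of how an external tool (say, a script that checks a config before it is deployed) might read them, the following Python turns such a file into a dictionary. This is not IMSpector's own parser: the function name is hypothetical, and treating lines that start with '#' as comments is an assumption.

```python
def parse_imspector_conf(text):
    """Parse key=value lines into a dict (hypothetical helper, not IMSpector code)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no option.
        if not line or line.startswith("#"):
            continue
        if "=" not in line:
            continue
        # Split on the first '=' only, so that a value may itself contain '='.
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


sample = """\
port=16667
listenaddr=127.0.0.1
user=imspector
group=imspector
pidfilename=/var/run/imspector.pid
"""
conf = parse_imspector_conf(sample)
print(conf["port"], conf["listenaddr"])
```

Splitting on the first '=' only matters for options whose value contains further '=' characters, such as a database connection string.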

SSL

There are two methods of performing a man-in-the-middle attack on SSL IM connections. Firstly, with a static certificate:

ssl=on
ssl_cert=/usr/etc/imspector/servercert.pem
ssl_key=/usr/etc/imspector/serverkey.pem

If enabled, IMSpector will proxy SSL connections, presenting the certificate configured with the ssl_cert and ssl_key options to the client. This will enable it to log SSL/TLS IM sessions, but with the obvious drawback of the wrong certificate being given to the client.

Alternatively, a certificate can be created on demand. In this mode IMSpector will create and sign a certificate with the correct Common Name. This certificate will be signed by a CA managed by IMSpector. The public part of this CA can be loaded into client machines, after which the certificate presented by the IM proxy will appear genuine:

ssl=on
ssl_ca_cert=/usr/etc/imspector/cacert.pem
ssl_ca_key=/usr/etc/imspector/cakey.pem
ssl_key=/usr/etc/imspector/serverkey.pem
ssl_cert_dir=/var/lib/imspector

Here, the ssl_ca_cert and ssl_ca_key options set the CA certificate and key, whilst the ssl_key option sets the key which all proxied IM clients will receive. The ssl_cert_dir should point at a directory, writeable by the IMSpector user, which will be used to store on-demand created server certificates. The certificates will be created with hashed filenames, and will be valid for one year after the date of creation.

A disadvantage with any form of man-in-the-middle attack, from the point of view of the client, is that the client will be unable to do any validation checks on the server certs. This opens clients up to the possibility of connecting to IM servers which have invalid or expired certificates. The solution to this problem is to do validation on the IM proxy, on the client's behalf. This feature can be used either with on-demand or static MITM certificates, but is more useful if on-demand certs are used.

ssl_verify_dir=/usr/lib/ssl/certs
ssl_verify=block

The ssl_verify_dir option sets the location of system-wide CA certs within the machine running IMSpector. This is the set of certs that is provided with the OpenSSL package, and unfortunately its exact location varies from system to system. The example given above is for Debian-derived systems, and within Debian the certs are contained within the 'ca-certificates' package. If IMSpector has this list of CAs available, it is able to do its own CA validation, as well as other checks.

There are three actions available for dealing with connections to IM servers whose certificate fails validation:

    * off - No validation. This is the default. All certs created will be signed by the built in CA.
    * selfsigned - Certs that fail validation will be presented to the client as self-signed certs, affording the client user the chance to cancel the connection.
    * block - Close the connection if validation fails.

Note that these SSL options are global options, but presently only Jabber supports SSL. It does so in two ways:

    * The "starttls" command over the normal port, 5222.
    * Port 5223, the "old style" SSL port.

IMSpector supports both methods, but obviously an extra redirect firewall rule is needed for port 5223.

IMSpector could easily support SSL for other protocols, but SSL is not very popular with IM providers, with the exception of Jabber via Gtalk.

Enabling protocols

As well as creating the required redirection firewall rules, IMSpector also needs to be told to handle each protocol you wish it to process:

icq_protocol=on
irc_protocol=on
msn_protocol=on
yahoo_protocol=on
gg_protocol=on
jabber_protocol=on
https_protocol=on

Note that the https_protocol option is only required under the following conditions:

    * You are using IMSpector's built in HTTP proxy port.
    * You are using MSN.

This is because MSN utilises HTTPS on port 443 in its authentication stage. IMSpector does not inspect this traffic, but it does need to be able to pass it between the client and the server.

Typing events

By default, IMSpector will not log "typing events", because they are not usually interesting. None the less, it is possible to log them if you desire:

log_typing_events=on

IRC does not have the concept of typing events and thus IMSpector cannot log them.

IRC and group chats

IRC is a little "special". With its channels, it supports group chatting. IMSpector copes with this in the following way.

The local ID is always the nickname of the person making the IRC connection. The remote ID will be a channel for channel messages, and the other person's nick for private messages. When a remote user says something on a channel, the event data is their nick and a colon, followed by the message said.

MSN and Yahoo group chats are logged in a similar way; in both cases a remote party is created "on demand" to hold the group. ICQ/AIM support within IMSpector currently does not support group chats.

File log format

Logs are filed under a path like this:

{protocol}/{local id}/{remote id}/{year}-{month}-{day}

A prefix to this path is stored in the config file:

file_logging_dir=/var/log/imspector

This directory must be owned by whoever the imspector program will run as.

{protocol} is the name of a supported protocol (currently MSN or ICQ-AIM), {local id} is the local identity who sent or received a message (ie the person "behind" the machine running IMSpector), {remote id} is the person "in front" of the IMSpector machine, and the last item is obviously the date of the chat.

The local and remote IDs are different depending on the protocol. For MSN it is the person's registered passport email address, for AIM and ICQ it is just a simple string (ICQ is now the same as AIM but has a numeric user ID).
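
A script that post-processes the logs can recover these four components from a file's path. The Python below is a minimal sketch of that; the function name is hypothetical, the default prefix is simply the file_logging_dir value shown above, and the example path (including the spelling of the protocol directory and the date) is made up for illustration.

```python
from pathlib import PurePosixPath


def split_log_path(log_path, log_dir="/var/log/imspector"):
    """Return (protocol, local_id, remote_id, date) for one log file path."""
    # Strip the configured prefix; raises ValueError if the path is elsewhere.
    relative = PurePosixPath(log_path).relative_to(log_dir)
    if len(relative.parts) != 4:
        raise ValueError("expected protocol/local id/remote id/date: %s" % log_path)
    protocol, local_id, remote_id, date = relative.parts
    return protocol, local_id, remote_id, date


# Hypothetical example path, for illustration only.
example = "/var/log/imspector/MSN/local@local.com/remote@remote.com/2006-10-14"
print(split_log_path(example))
```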

An actual log file will look something like this:

192.168.0.4:62013,1160853400,0,1,0,,yep 
192.168.0.4:62013,1160853401,0,1,0,,:~)
192.168.0.4:62013,1160853407,1,1,0,1 smoothie;,smoothie should be in the other rack

Columns are:

   1. IP address:Port of the client machine.
   2. UNIX timestamp.
   3. Outgoing. 1 for an outgoing event, 0 for an incoming event.
   4. Type of event. Currently supported are:
          * 1 - Message.
          * 2 - File transfer. Some protocols only.
          * 3 - Typing event. User is typing a message. Some protocols only.
          * 4 - Webcam event. User requested a webcam session. Yahoo only presently.
   5. Filtered. 0 means the message was not filtered (blocked), 1 means it was. Does not relate to content manipulation, but rather ACLs and other filter plugins.
   6. Message categories. This is usually empty, but if a message has been filtered or censored by one of the filtering plugins, then the "category" of the message will be here. For instance, the badwords filter records the count of replaced words here.
   7. The event data, which for messages happens to be whatever was said. For file transfers, the name and length is logged here. Neither typing events nor webcam events use this field. Note that the un-manipulated text is logged, ie. before it has been censored for bad words, if the option is enabled. If the message includes embedded new-lines, they are backslashed as \n. Note that, depending on the client software and protocol, the message may contain embedded HTML or other formatting elements.

The log files are never held open, so for the purposes of log rotation, can be truncated or removed freely.
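
As a minimal sketch of how a reporting script might read one of these lines, the Python below splits a line into the seven columns listed above. The names (LogEvent, parse_log_line) are hypothetical. It splits on the first six commas only, because the event data comes last and may contain commas of its own; it assumes that the categories column never contains a comma, which the format description does not guarantee. The event data is returned as logged, with any \n sequences left untouched.

```python
from collections import namedtuple

LogEvent = namedtuple(
    "LogEvent",
    "client timestamp outgoing kind filtered categories data",
)

# Event type codes from the column list above.
EVENT_TYPES = {1: "message", 2: "file transfer", 3: "typing", 4: "webcam"}


def parse_log_line(line):
    """Split one log line into its seven columns (hypothetical helper)."""
    # Assumption: only the last column (event data) may contain commas.
    client, ts, outgoing, etype, filtered, categories, data = (
        line.rstrip("\r\n").split(",", 6)
    )
    return LogEvent(
        client=client,
        timestamp=int(ts),
        outgoing=(outgoing == "1"),
        kind=EVENT_TYPES.get(int(etype), "unknown"),
        filtered=(filtered == "1"),
        categories=categories,
        data=data,  # left exactly as logged
    )


event = parse_log_line(
    "192.168.0.4:62013,1160853407,1,1,0,1 smoothie;,"
    "smoothie should be in the other rack"
)
print(event.kind, event.outgoing, event.data)
```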

Category file logging

This is a special logging plugin which most people can ignore, but which is useful in some situations.

cats_logging_filename=/var/log/imspector/cats

This will log, to a flat file, all events where the category field is not empty. The purpose is to have a separate log of all questionable messages.

The format of this file is identical to the ordinary file logging plugin, except that each line will be prefixed with the protocol, the local id and the remote id.

MySQL logging

MySQL support is not built unless you invoke the make mysqlloggingplugin.so target. Of course, the client side C MySQL library is required to build this plugin. It adds the following options to the config file:

    * mysql_server - The hostname of the DB server.
    * mysql_database - The database name in the server.
    * mysql_username and mysql_password - Login details for the DB.

The table in the database should be called "messages". The following SQL can be used to create it:

CREATE TABLE `messages` (
        `id` int(11) NOT NULL auto_increment,
        `timestamp` int(11) NOT NULL default '0',
        `clientaddress` text NOT NULL,
        `protocolname` text NOT NULL,
        `outgoing` int(11) NOT NULL default '0',
        `type` int(11) NOT NULL default '0',
        `localid` text NOT NULL,
        `remoteid` text NOT NULL,
        `filtered` int(11) NOT NULL default '0',
        `categories` text NOT NULL,
        `eventdata` blob NOT NULL,
        PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

If this table does not exist, and the DB user is capable of doing so, this table will automatically be created when IMSpector starts.

Note that only MySQL versions 4 and 5 have been tested. Prepared statements are used for efficiency reasons and I believe that these will not work on version 3. Testing on this is welcome, as I am neither a MySQL nor an SQL bod. Then again, MySQL 3 is probably considered ancient.

The plugin supports some level of resilience against server disconnects and other errors; if the server is remote and the connection between IMSpector and the DB server is broken, IMSpector will queue the events and periodically attempt a reconnection. When the connection is reestablished, the queued events will be written to the database in one hit.

PostgreSQL logging

PostgreSQL support is not built unless you invoke the make postgresqlloggingplugin.so target. Of course, the client side C PostgreSQL library is needed to build this plugin. It adds the following options to the config file:

    * pgsql_connect - Connection parameters.

The available connection parameters vary with the details of your client library and so on, but an example would be:

pgsql_connect=host=localhost dbname=imspector user=dbuser password=Password

The table in the database should be called "messages". The following SQL can be used to create it:

CREATE TABLE messages (
        id serial primary key,
        "timestamp" timestamp with time zone default now(),
        clientaddress varchar,
        protocolname varchar,
        outgoing int default 0,
        type int default 0,
        localid varchar,
        remoteid varchar,
        filtered int default 0,
        categories varchar,
        eventdata text )

If this table does not exist, and the DB user is capable of doing so, this table will automatically be created when IMSpector starts.

The PostgreSQL plugin supports the same form of disconnected operation as the MySQL plugin.

SQLite logging

SQLite support is not built unless you invoke the make sqliteloggingplugin.so target. It adds the following options to the config file:

    * sqlite_file - The filename of the DB.

Once again, a "messages" table is required:

CREATE TABLE messages (
        id integer PRIMARY KEY AUTOINCREMENT,
        timestamp integer NOT NULL,
        clientaddress text NOT NULL,
        protocolname text NOT NULL,
        outgoing integer NOT NULL,
        type integer NOT NULL,
        localid text NOT NULL,
        remoteid text NOT NULL,
        filtered integer NOT NULL,
        categories text NOT NULL,
        eventdata blob NOT NULL
);

This table is automatically created if it doesn't already exist.
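
Once events are in the SQLite file, they can be queried with any SQLite client. The following is a minimal Python sketch using the standard sqlite3 module. It builds an in-memory database with the schema above (with IF NOT EXISTS added) so that it can run on its own. The helper names and the sample row are hypothetical, and the sketch assumes that the type and filtered columns carry the same numeric codes as the file log format (type 1 for a message, filtered 1 for a blocked event).

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
        id integer PRIMARY KEY AUTOINCREMENT,
        timestamp integer NOT NULL,
        clientaddress text NOT NULL,
        protocolname text NOT NULL,
        outgoing integer NOT NULL,
        type integer NOT NULL,
        localid text NOT NULL,
        remoteid text NOT NULL,
        filtered integer NOT NULL,
        categories text NOT NULL,
        eventdata blob NOT NULL
);
"""


def open_db(path):
    """Open the DB file and make sure the messages table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def filtered_messages(conn, localid):
    """Return (timestamp, remoteid, eventdata) for one user's filtered messages."""
    # Assumption: type=1 means "message" and filtered=1 means "blocked".
    rows = conn.execute(
        "SELECT timestamp, remoteid, eventdata FROM messages "
        "WHERE localid = ? AND type = 1 AND filtered = 1 "
        "ORDER BY timestamp",
        (localid,),
    )
    return rows.fetchall()


conn = open_db(":memory:")
# Hypothetical sample row, for illustration only.
conn.execute(
    "INSERT INTO messages (timestamp, clientaddress, protocolname, outgoing,"
    " type, localid, remoteid, filtered, categories, eventdata)"
    " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1160853407, "192.168.0.4:62013", "MSN", 1, 1,
     "local@local.com", "remote@remote.com", 1, "", b"blocked text"),
)
print(filtered_messages(conn, "local@local.com"))
```

To query a real log, pass the sqlite_file path to open_db instead of ":memory:".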

Content manipulation

IMSpector is able to remove offensive words from all IM messages. Three config options are used:

    * badwords_filename - Should be pointed at a file of naughty words, one per line.
    * badwords_replace_character - Should be a single character. Bad words will be replaced with the character. Default is an asterisk.
    * badwords_block_count - If a message contains more than this many bad words then the message will be completely blocked, not just replaced.

This filter is by no means uncircumventable, but the supplied list of naughty words is enough to filter most strong swear words. The filter is not enabled by default.

This filter utilises the "categories" event field. If a replacement is made, the field will contain the number of replacements.
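
The effect of the three options can be illustrated with a short Python sketch. This is not the plugin's actual algorithm: case-insensitive substring matching, replacing each letter of a bad word with the replacement character, and "no block count set means never block" are all assumptions made for the illustration, and the function name is hypothetical.

```python
import re


def censor(message, badwords, replace_character="*", block_count=None):
    """Return (text, replacements); text is None when the message is blocked."""
    if not badwords:
        return message, 0
    count = 0

    def mask(match):
        nonlocal count
        count += 1
        # Assumption: every character of the bad word is replaced.
        return replace_character * len(match.group(0))

    # Longest words first, so a longer bad word wins over a shorter prefix.
    words = sorted(badwords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    censored = pattern.sub(mask, message)
    # "More than this many bad words" blocks the whole message.
    if block_count is not None and count > block_count:
        return None, count
    return censored, count


print(censor("what a darn mess", ["darn"]))
print(censor("darn darn darn", ["darn"], block_count=2))
```

The second value returned is the replacement count, which is what the filter records in the categories field.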

File-backed ACL filtering

In addition to content replacement, IMSpector is able to completely block messages and other events from reaching the recipient. This is useful for, say, limiting people to certain contacts, or for blocking a listed group of people. There are two implementations for this type of blocking: file based, and database (SQLite) based. In both cases, the mechanism used is a kind of ACL.

In the file-based filter, a single file holds the ACL. For example:

allow sales@hotmail.com client@hotmail.com company.com
allow admin@company.com
allow all support@company.com
deny all

The format of these lists is:

allow|deny localid|all [remoteid1 ... remoteidN]

Lines are processed in file order, from top to bottom. The action is the first thing on the line, followed by the local ID, followed by an optional list of remote IDs. If the remote ID list is empty, then all remote IDs will match. Also, if the local ID is "all" then all local IDs will match.

IDs can either be complete, such as user@company.com, or partial.

Thus the example above translates to:

   1. sales@hotmail.com can talk to client@hotmail.com and everyone at company.com.
   2. The local user admin@company.com can talk to anyone at all.
   3. The remote user support@company.com can talk to any local user.
   4. Otherwise the communications are blocked.

To enable ACL support, include the following options in the configuration file:

acl_filename=/path/to/file

Of course, the file must be readable by the user IMSpector runs as.
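
The way an ACL file is evaluated can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's code: a "partial" ID is taken to match when it occurs anywhere in the full ID, the first matching line decides (which is what the "otherwise" reading of the example implies), and an event that matches no line at all is allowed. All function names are hypothetical.

```python
EXAMPLE_ACL = """\
allow sales@hotmail.com client@hotmail.com company.com
allow admin@company.com
allow all support@company.com
deny all
"""


def parse_acl(text):
    """Return a list of (action, localid, [remoteids]) in file order."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ("allow", "deny"):
            continue
        rules.append((fields[0], fields[1], fields[2:]))
    return rules


def id_matches(pattern, identity):
    # Assumption: a partial ID matches as a substring of the full ID.
    return pattern in identity


def acl_allows(rules, localid, remoteid, default=True):
    """Apply the first matching rule; fall back to `default` (an assumption)."""
    for action, local, remotes in rules:
        local_ok = local == "all" or id_matches(local, localid)
        # An empty remote ID list matches every remote ID.
        remote_ok = not remotes or any(id_matches(r, remoteid) for r in remotes)
        if local_ok and remote_ok:
            return action == "allow"
    return default


rules = parse_acl(EXAMPLE_ACL)
print(acl_allows(rules, "sales@hotmail.com", "client@hotmail.com"))
print(acl_allows(rules, "sales@hotmail.com", "stranger@example.org"))
```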

Database-backed filter

The DB-backed filter is not built by default. To build it, run make dbfilterplugin.so. SQLite client libraries are needed to build this plugin. It adds the following options to the config file:

    * db_filter_filename - The filename of the DB.

The table in this database will be called "lists", and will be automatically created if needed:

CREATE TABLE IF NOT EXISTS lists (
        id integer PRIMARY KEY AUTOINCREMENT,
        localid text,
        remoteid text,
        action integer NOT NULL,
        type integer NOT NULL,
        timestamp integer NOT NULL );

localid and remoteid may be NULL, which means when a search is done for an entry, it will match any value.

"action" can be one of the following values:

    * 1 - ACCEPT - allow the message to pass.
    * 2 - BLOCK - reject the message.
    * 3 - AWL - matching outgoing messages will have the localid and remote id automatically inserted as ACCEPT rules, so replies can pass.

"type" can be one of the following values:

    * 1 - MANUAL - the type value to use for manually added rules.
    * 2 - AUTO - AWL entries will be given this type.

The timestamp is set when an entry is created by the AWL rules and is useful if one wished to, say, remove all entries over a month old.

The matching logic is as follows:

   1. First look for matches with action=ACCEPT, allowing the messages if we find any.
   2. If a message is outgoing, then look for action=AWL. If we find a match, then allow this message to pass, and automatically create a rule with action=ACCEPT and type=AUTO.
   3. Look for a match with action=BLOCK, blocking the message if we find any.
   4. Finally, allow the message. 

Note that the ordering of rules within the table is not relevant; only the fact that a match was found somewhere in the table is important.

Example: to enable AWL for all local users except example@company.com, who is always to be allowed, create three rows:

   1. localid=example@company.com, remoteid=NULL, action=ACCEPT, type=MANUAL
   2. localid=NULL, remoteid=NULL, action=AWL, type=MANUAL
   3. localid=NULL, remoteid=NULL, action=BLOCK, type=MANUAL
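
The matching logic and the three example rows can be tried out with the following Python sketch, which uses the standard sqlite3 module and an in-memory database. It is an illustration only, not the plugin's code: the function names are hypothetical, IDs are compared for exact equality, and the timestamp is assumed to be a UNIX timestamp in seconds.

```python
import sqlite3
import time

ACCEPT, BLOCK, AWL = 1, 2, 3   # "action" values
MANUAL, AUTO = 1, 2            # "type" values

CREATE_TABLE = """CREATE TABLE IF NOT EXISTS lists (
        id integer PRIMARY KEY AUTOINCREMENT,
        localid text,
        remoteid text,
        action integer NOT NULL,
        type integer NOT NULL,
        timestamp integer NOT NULL );"""

INSERT = ("INSERT INTO lists (localid, remoteid, action, type, timestamp) "
          "VALUES (?, ?, ?, ?, ?)")


def has_match(conn, action, localid, remoteid):
    """True if any row with this action matches; NULL columns match anything."""
    row = conn.execute(
        "SELECT 1 FROM lists WHERE action = ? "
        "AND (localid IS NULL OR localid = ?) "
        "AND (remoteid IS NULL OR remoteid = ?) LIMIT 1",
        (action, localid, remoteid),
    ).fetchone()
    return row is not None


def allow_message(conn, localid, remoteid, outgoing):
    """Apply the four matching steps in order; return True to let the message pass."""
    if has_match(conn, ACCEPT, localid, remoteid):
        return True
    if outgoing and has_match(conn, AWL, localid, remoteid):
        # Remember the pair so that replies are accepted later.
        conn.execute(INSERT, (localid, remoteid, ACCEPT, AUTO, int(time.time())))
        conn.commit()
        return True
    if has_match(conn, BLOCK, localid, remoteid):
        return False
    return True


def example_db():
    """In-memory database holding the three example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(CREATE_TABLE)
    now = int(time.time())
    conn.executemany(INSERT, [
        ("example@company.com", None, ACCEPT, MANUAL, now),
        (None, None, AWL, MANUAL, now),
        (None, None, BLOCK, MANUAL, now),
    ])
    conn.commit()
    return conn


conn = example_db()
print(allow_message(conn, "bob@company.com", "eve@remote.com", outgoing=False))
print(allow_message(conn, "bob@company.com", "alice@remote.com", outgoing=True))
print(allow_message(conn, "bob@company.com", "alice@remote.com", outgoing=False))
```

In this sketch an unsolicited incoming message is blocked, while a reply from someone the local user has already written to is let through by the automatically created ACCEPT row.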

Note 1: because of the way SQLite works, you can freely modify the table from outside of IMSpector, while it is still running, without any problems.

Note 2: What would happen if this filter was combined with the file-based ACL filter is unclear.

Other blocking

Two additional global options are also available; they are applied regardless of the outcome of ACL processing:

block_files=on

This option will block all file transfers on every protocol that IMSpector is watching and for which it understands file transfers.

block_webcams=on

This option will block all webcam sessions. Currently IMSpector can only spot webcam sessions on Yahoo.

Socket-API for filtering

IMSpector is able to talk to an external process by way of a UNIX socket, in order to determine the fate of a message. This enables integrators to implement their filtering routines in any language that is able to listen on a socket. The hypothetical daemon that listens for these connections is called "censord".

Presently only message events are handled by this plugin; all other event types are always allowed.

censord=on

This is the only option needed in IMSpector. The censord filtering plugin will connect to the UNIX socket at /tmp/.censord.sock and send the following information. All lines are CR+LF ended.

imspector-{incoming|outgoing}
protocol {im protocol}
localid {local id}
remoteid {remote id}
charset UTF-8
length {count of message bytes}

{message bytes}

Currently IMSpector knows nothing about character sets, but the charset header is sent for future use.

The censoring daemon should respond with the following. Again CR+LF is the line ending.

{response}
result {category info}
length {count of message bytes}

{optional replacement bytes}

Response is one of the following:

    * BLCK - IMSpector should drop the message completely.
    * PASS - IMSpector should pass the message as is.
    * ERR! - Censoring server had an error.
    * MDFY - Censord has replacement text.

In the case of MDFY, the length returned must currently match the source length. The replacement bytes is the text that will be passed back.

In all cases, if a result header is present, this will be appended to the category field in the log entry. You could, for example, use this field to classify the type of profanity in the message.

Example:

IMSpector sends the following request.

imspector-incoming
protocol MSN
localid local@local.com
remoteid remote@remote.com
charset UTF-8
length 11

Mmmm pizza!

Censord responds with the following.

MDFY
result food
length 11

Mmmm *****!
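
The exchange above can be reproduced with a small amount of Python. The sketch below covers only the message framing of a hypothetical censord: it parses a request and builds a response as byte strings, and leaves out the UNIX socket handling. The function names are made up for illustration, and sending a "length 0" header with responses that carry no replacement text is an assumption, since the description above does not say what a PASS or BLCK response should contain.

```python
def parse_request(raw):
    """Split a request into (direction, headers, message bytes)."""
    # Headers and message are separated by an empty CR+LF line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    direction = lines[0].split("-", 1)[1]  # "incoming" or "outgoing"
    headers = dict(line.split(" ", 1) for line in lines[1:])
    length = int(headers["length"])
    return direction, headers, body[:length]


def build_response(response, category=None, replacement=b""):
    """Build a response; `response` is one of BLCK, PASS, ERR! or MDFY."""
    lines = [response]
    if category is not None:
        lines.append("result " + category)
    # Assumption: a length header is always sent, 0 when nothing follows.
    lines.append("length %d" % len(replacement))
    return "\r\n".join(lines).encode("utf-8") + b"\r\n\r\n" + replacement


def censor_pizza(raw):
    """Toy policy: star out the word 'pizza' and tag the message as 'food'."""
    direction, headers, message = parse_request(raw)
    if b"pizza" in message:
        # The replacement has the same length as the source, as MDFY requires.
        return build_response("MDFY", "food", message.replace(b"pizza", b"*****"))
    return build_response("PASS")


request = (
    b"imspector-incoming\r\n"
    b"protocol MSN\r\n"
    b"localid local@local.com\r\n"
    b"remoteid remote@remote.com\r\n"
    b"charset UTF-8\r\n"
    b"length 11\r\n"
    b"\r\n"
    b"Mmmm pizza!"
)
print(censor_pizza(request))
```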

This socket-API opens up the ability to implement blocking and censoring policies in any language that can manipulate a UNIX socket. In the future, it would be nice to include a simple censoring service, probably written in perl, into the IMSpector project.