RunQ user guide


This is a very brief user guide.  A more extensive one will be available soon.

RunQ is a computer performance management tool.  It is designed to provide IT professionals with a simple tool to keep track of the usage of their computer resources. It can be used to keep track of daily performance data and/or to do specific performance or capacity studies.

More information on how to use RunQ for specific tasks will be given in a separate document that has not been published yet.

Currently it is only available on the Sparc Solaris 2.6 and Intel Linux 2.4 platforms.  Future versions are in development.


RunQ architecture


RunQ uses a two-phase approach.  The first is the data gathering phase, in which performance metrics are collected and stored in a binary file for further processing.  The second is the data processing phase, in which the collected metrics are analyzed and presented.  The data processing output can then be used directly or stored in a Performance Database for historical follow-up.

This version of RunQ is command line driven.  It is our belief that a good performance management tool doesn't need a good looking GUI, but rather has to be a compact, quick and functional utility.  The output of RunQ can nevertheless be used directly by tools that can read comma separated value (CSV) files, for example spreadsheets.  This way you can produce any good looking graph you would like to see.  Also, once the data is stored in a database, you can use a lot of third party tools to explore the data.


RunQ options and commands


RunQ is driven by a single executable for both data gathering and data analysis.  This reduces the overall disk space needed for the tool.  A trimmed-down version for collection only is also available for sites that want one.


Data collection


To invoke the data collector you need to start RunQ as follows:
 

$ runq collect -m number_of_minutes [-d datafile] [-S span time] [-N]
or
$ rqcollect -m number_of_minutes [-d datafile] [-S span time] [-N]


Description of the options:

 
Option  Default value  Type and format  Description
-m      No default     integer          Number of minutes the data collection process has to run.
-d      perf.dat       string           Name of the file where the performance metrics of this collection will be stored.
-S      60             integer          Number of seconds that elapse before each sample record is written to the performance data file.
-N      No             -                Keeps collecting data without yielding the CPU.  Use this option with care: in this mode the collector hogs the system and consumes all the CPU cycles it can get.
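
When the collector is started from a script rather than by hand, the option list above can be assembled programmatically.  The Python sketch below builds the argument list for "runq collect" using only the options documented in the table.  The helper name build_collect_args is hypothetical and not part of RunQ; it only constructs the command and does not run it.

```python
def build_collect_args(minutes, datafile="perf.dat", span=60, no_yield=False):
    """Return the argument list for 'runq collect' (hypothetical helper)."""
    if minutes <= 0:
        raise ValueError("minutes must be a positive integer")
    args = ["runq", "collect", "-m", str(minutes), "-d", datafile, "-S", str(span)]
    if no_yield:
        # -N makes the collector hog the CPU; see the warning in the table.
        args.append("-N")
    return args


print(" ".join(build_collect_args(30, datafile="monday.dat")))
```

The resulting list can be handed to subprocess.run() on a machine where runq is installed.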


Reporting overall consumption


To get a brief report on global system usage during the data collection, you can invoke RunQ with the report option.
RunQ will show one line for each sample found in the performance data file.

 
$ runq report [-d datafile] [-s start time] [-e end time] [-C] [-S]


Description of the options:

 
Option  Default value  Type and format  Description
-d      perf.dat       string           The name of the file where the performance metrics are stored.
-C      No             -                Produce output in CSV format.
-S      No             -                Report only the summary for the interval in CSV mode.
-s      00:00          HH:MM            Start time stamp for the report.
-e      24:00          HH:MM            End time stamp for the report (that time not included).
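
The -s and -e options select a window of samples in which the end time itself is excluded.  The Python sketch below illustrates that selection rule.  It assumes that the start time is included (the table only states that the end time is not), and the function names are hypothetical.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight ('24:00' gives 1440)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_window(sample, start="00:00", end="24:00"):
    """True if the sample time lies in the half-open interval [start, end)."""
    return to_minutes(start) <= to_minutes(sample) < to_minutes(end)


print(in_window("09:00", start="09:00", end="17:00"))  # True (assumed inclusive start)
print(in_window("17:00", start="09:00", end="17:00"))  # False (end time excluded)
```

With the default values 00:00 and 24:00 every sample of the day falls inside the window.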


Listing all processes


You can list all processes that were present on the system during the data collection period.
There will be one line for each process record for each sample.  The data shown in the listing are raw data and most of them are cumulative.  This means that to know, for example, the user CPU consumption of a single process you need to calculate the delta between two samples.
 

$ runq procs [-d datafile] [-s start time] [-e end time] [-T] [-D]

Description of the options:

 
Option  Default value  Type and format  Description
-d      perf.dat       string           The name of the file where the performance metrics are stored.
-T      No             -                Output date and time in DD-MM-YY ; HH:MM:SS format.
-D      No             -                Output time stamps as delta seconds from the start.  This option is ignored if -T is also given.
-s      00:00          HH:MM            Start time stamp for the report.
-e      24:00          HH:MM            End time stamp for the report (that time not included).
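
Because most values in the listing are cumulative, the consumption during one interval is the difference between two consecutive samples of the same process.  The Python sketch below shows that calculation.  It assumes the listing has already been reduced to (timestamp, pid, cumulative user CPU) tuples; this layout and the numbers are made up for the example and are not the actual output format of runq procs.

```python
def cpu_deltas(records):
    """Yield (timestamp, pid, delta) for consecutive samples of each pid.

    records: iterable of (timestamp, pid, cumulative_user_cpu) tuples,
    ordered by timestamp.  The tuple layout is an assumption.
    """
    previous = {}  # pid -> last cumulative value seen
    for timestamp, pid, cumulative in records:
        if pid in previous:
            yield timestamp, pid, cumulative - previous[pid]
        previous[pid] = cumulative


samples = [  # made-up numbers, for illustration only
    (0, 101, 1.5),
    (60, 101, 4.0),
    (120, 101, 4.5),
]
for row in cpu_deltas(samples):
    print(row)
```

The first sample of a process produces no delta, since there is nothing to subtract it from.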


Analyzing a sample


One of the key ideas behind RunQ is to analyze gathered performance metrics.
The purpose of analyzing the data is to reduce it to key metrics and to present it in a way the user can understand.  The outcome of the analysis phase is a system model describing the system at a certain point in time.  Once a system model has been built, you can apply mathematics to that model to answer many of the questions capacity planners typically ask.

Currently RunQ does a basic job of analyzing and modeling the system.  RunQ only reports CPU usage and its derivatives, such as service time and waiting time (which together give the response time).  The waiting time is calculated according to the M/M/m queuing formula.
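
To make the M/M/m calculation concrete, the Python sketch below computes the waiting time and response time from a mean service time, a per-CPU utilization and a number of CPUs.  It uses the textbook Erlang C formulation of the M/M/m queue; this guide does not show RunQ's own code, so the function and its parameters are assumptions made for illustration.

```python
from math import factorial


def mmm_times(service_time, utilization, servers):
    """Return (waiting_time, response_time) for an M/M/m queue.

    service_time: mean service time per request
    utilization:  per-server utilization, 0 <= utilization < 1
    servers:      number of servers (CPUs), m
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    load = utilization * servers  # offered load in Erlangs
    # Erlang C: probability that an arriving request has to wait.
    tail = load ** servers / factorial(servers) / (1 - utilization)
    head = sum(load ** k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    waiting = p_wait * service_time / (servers * (1 - utilization))
    return waiting, service_time + waiting


print(mmm_times(1.0, 0.5, 1))
print(mmm_times(1.0, 0.5, 2))
```

At the same utilization, the two-CPU case gives a shorter waiting time than the single-CPU case, which is the effect the M/M/m model is meant to capture.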

The data reduction is done by grouping processes and reporting the performance metrics against those groups.  The groups also serve to combine system processes into a single business unit.  Also, it is not uncommon that the execution of a task on a Unix system, for example, involves many little processes.  When using conventional tools like top or ps, you don't see the aggregate performance of all those little processes.

During this analysis phase RunQ verifies the global CPU usage against the per process CPU usage.  Because RunQ works by sampling, it misses some of the short-lived processes.  Short-lived processes are processes that are created and die between two samples.  In certain situations (like compiling a lot of small sources on a powerful machine) they can account for a major part of the processes.  RunQ has an option to activate an algorithm to interpolate this data.  This algorithm is currently very beta but has already helped a lot.  Also, when the collector runs over longer periods of time, this unaccounted data tends to become a smaller portion of the activity.
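
The gap between global and per process CPU usage can be pictured with a small calculation.  The Python sketch below only computes the unaccounted part for one interval; it does not implement RunQ's interpolation algorithm, which this guide does not describe.  The function name and the numbers are made up for the example.

```python
def unaccounted_cpu(global_cpu_seconds, per_process_cpu_seconds):
    """Return (unaccounted_seconds, unaccounted_fraction) for one interval."""
    accounted = sum(per_process_cpu_seconds)
    missing = max(global_cpu_seconds - accounted, 0.0)
    fraction = missing / global_cpu_seconds if global_cpu_seconds > 0 else 0.0
    return missing, fraction


# Made-up numbers: 40 CPU seconds used system-wide, 30 attributed to processes.
print(unaccounted_cpu(40.0, [10.0, 12.5, 7.5]))
```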
 

$ runq analyze [-d datafile] [-w workload_definition_file] [-s start time] [-e end time] [-C] [-F]


Description of the options:

 
Option  Default value  Type and format  Description
-d      perf.dat       string           The name of the file where the performance metrics are stored.
-w      workloads.wkl  string           The name of the workload definition file.
-C      No             -                Produce output in CSV format.
-s      00:00          HH:MM            Start time stamp for the report.
-e      24:00          HH:MM            End time stamp for the report (that time not included).
-F      No             -                Use an interpolation algorithm to fix unaccounted CPU usage.  This option is very beta but can be useful, as it is able to recover most of the unaccounted CPU cycles.


Syntax of the process groups definition file


The workload definition files follow a very simple syntax.  For defining the process names RunQ uses the regexp API of Unix.  By using "man regexp" or "man -s3 regexp" you can find more information on using wild cards, repeaters, etc.
RunQ always prepends a "^" to the expression and appends a "$".  This is to enforce complete matching of the given expression.
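
The effect of the added "^" and "$" can be tried out with a few lines of Python.  The sketch below uses Python's re module as a stand-in; its dialect is not identical to the Unix regexp API that RunQ uses, so treat it as an illustration of the anchoring only.  The function name is hypothetical.

```python
import re


def matches_fully(expression, process_name):
    """Mimic the anchoring described above: wrap the expression in '^' and '$'."""
    return re.search("^" + expression + "$", process_name) is not None


print(matches_fully("make", "make"))     # True
print(matches_fully("make", "cmake"))    # False: a partial match is rejected
print(matches_fully("k.*", "kdeinit"))   # True
```

This is why the example definition file writes ".*ftp.*" rather than "ftp" when any process name containing "ftp" should match.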

The matching rules are evaluated in the order in which they appear in the definition file.  When a match has been found by the include rule or by PPID (in the case of the with children clause), the process name is checked against the exclude rule.  If an exclude expression matches, RunQ doesn't use that process group and carries on with the next process group.  If the optional argument expression is given, that argument is also taken into account.  Arguments are given by using a plus sign "+" followed by a regular expression.  For an even more precise selection, a user name and a group name can be specified.  This is done by appending a colon ":" followed by a regular expression for the user name, which is then followed by an optional plus sign "+" and a regular expression for the group name.

Also keep in mind that RunQ checks the parent-child relationship before it starts matching the expressions.  Even when there is a matching parent-child relationship, it always checks the exclude list to ensure that the match may be used.
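
The evaluation order described above can be sketched in Python.  The sketch is a simplification under stated assumptions: it handles only the process name and the optional user name, leaves out arguments, group names and the PPID check of the with children clause, and uses Python's re module in place of the Unix regexp API.  The data layout and function names are hypothetical; the three process groups mirror part of the example definition file in this guide.

```python
import re


def full_match(expression, text):
    """Match with '^' prepended and '$' appended, as RunQ does."""
    return re.search("^" + expression + "$", text) is not None


def rule_matches(rule, name, user):
    """rule is a (name_expression, user_expression or None) pair."""
    name_expr, user_expr = rule
    if not full_match(name_expr, name):
        return False
    return user_expr is None or full_match(user_expr, user)


def classify(groups, name, user):
    """Return the id of the first process group that accepts the process."""
    for group_id, include, exclude in groups:
        if any(rule_matches(r, name, user) for r in include):
            if any(rule_matches(r, name, user) for r in exclude):
                continue  # excluded: carry on with the next process group
            return group_id
    return None


# Process groups in file order: (id, include rules, exclude rules).
groups = [
    ("KDE", [("X.*", None), ("xfs", None), ("k.*", None)], [("k.*d", None)]),
    ("OtherRoots", [(".*", "root")], []),
    ("WildGroup", [(".*", None)], []),
]
print(classify(groups, "kdeinit", "alice"))  # KDE
print(classify(groups, "klogd", "root"))     # OtherRoots
print(classify(groups, "bash", "alice"))     # WildGroup
```

The klogd case shows the exclude rule at work: the process matches the KDE include list, is rejected by "k.*d", and falls through to a later group.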

Below you find the structure of a definition file.  Note that [with children] means that the with children clause is optional; when it is used, it has to be written without the brackets.
 

workload wkl-id
{
    processgroup pg-id [with children]
    include
    {
        "regular expression" [ + "regular expression" ] [ : "regular expression" [ + "regular expression" ] ]
        ...
    }
    exclude
    {
        "regular expression" [ + "regular expression" ] [ : "regular expression" [ + "regular expression" ] ]
    }
    ....
}
....


Be sure to define an ending process group with the ".*" wild card as a catch-all!
 


Example of a definition file

workload Development
{
        processgroup compile with children
        include
        {
                "gcc"
                "g'++'"
                "cpp"
        }
        processgroup tools
        include
        {
                "make"
                "vi"
        }
}

workload Office
{
        processgroup StarOffice with children
        include
        {
                "soffice.*"
        }
        processgroup NetScape with children
        include
        {
                "netscape.*"
        }
}

workload System
{
        processgroup runq
        include
        {
                "runq"
        }
        processgroup KDE
        include
        {
                "X.*"
                "xfs"
                "k.*"
        }
        exclude
        {
                "k.*d"
        }
        processgroup Postgres with children
        include
        {
                "postmaster"
        }
        processgroup Network
        include
        {
                "inetd"
                "portmap"
                "netserv"
                ".*ppp.*"
                ".*ftp.*"
        }
        processgroup System
        include
        {
                "init"
                ".*logd"
                "cardmgr"
                "autom.*"
                "lpd"
                "cron"
                "getty"
                "gpm"
                "sendmail"
                "exim"
                ".*pkg.*"
                "modprobe"
                "rmmod"
                "get_it"
        }
        processgroup OtherRoots
        include
        {
                ".*" : "root"
        }
        processgroup WildGroup
        include
        {
                ".*"
        }
}


Last Updated 17-March-2001 serge.robyns@rc-s.be