OpenMCL uses an interface translation system based on the FFIGEN system (described here and here) to make the constant, type, structure, and function definitions in a set of .h files available to lisp code.
The basic idea of the FFIGEN scheme is to use the C compiler's frontend and parser to translate .h files into semantically equivalent .ffi files, which use an S-expression-based syntax to represent the definitions contained in those headers. Lisp code can then work with the .ffi representation alone, without having to concern itself with the semantics of header file inclusion or the arcana of C parsing.
The original FFIGEN system used a modified version of the LCC C compiler to produce .ffi files. Since many LinuxPPC header files contain GCC-specific constructs, OpenMCL's translation system uses a modified version of GCC (called, somewhat confusingly, ffigen). A LinuxPPC binary is available at ftp://clozure.com/pub/ffigen.tar.gz and source differences are at ftp://clozure.com/pub/ffigen-src.tar.gz
A shell script called h-to-ffi.sh (distributed with the source and binary packages) reads a specified .h file (and optional preprocessor arguments) and writes a (hopefully) equivalent .ffi file to standard output, calling the installed C preprocessor and the ffigen program with appropriate arguments.
Another shell script (distributed with OpenMCL as "ccl:headers;C;populate.sh") calls h-to-ffi.sh on a large number of the header files in /usr/include and creates a parallel directory tree in "ccl:headers;C;usr;include;", populating that directory with .ffi files.
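The parallel-tree idea can be sketched in a few lines of shell. This is only an illustration, not the contents of populate.sh: the function names ffi_path_for and mirror_headers and the TRANSLATOR variable are hypothetical, and the sketch assumes that the translator takes a header path as its only argument and writes .ffi text to standard output.

```shell
# Hypothetical helper: map a header path to its .ffi path in a parallel tree,
# e.g. /usr/include/sys/types.h -> <dest_root>/usr/include/sys/types.ffi
ffi_path_for() {
    rel=${1#/}                          # drop the leading slash
    printf '%s/%s.ffi\n' "${2%/}" "${rel%.h}"
}

# Hypothetical driver: translate each listed header into the parallel tree.
# TRANSLATOR is assumed to accept one header path and write to stdout;
# it defaults to h-to-ffi.sh, which must then be on the PATH.
mirror_headers() {
    dest_root=$1
    shift
    for header in "$@"; do
        out=$(ffi_path_for "$header" "$dest_root")
        mkdir -p "$(dirname "$out")"
        "${TRANSLATOR:-h-to-ffi.sh}" "$header" > "$out" || {
            rm -f "$out"                # do not leave a partial .ffi file behind
            echo "failed to translate $header" >&2
            return 1
        }
    done
}
```

Stopping at the first failure is a deliberate choice here: a header that does not translate cleanly usually means the preprocessor options need attention before the rest of the run is worth trusting.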
A lisp function defined in "ccl:library;parse-ffi.lisp" translates a specified list of .ffi files into a set of corresponding .lisp files (in "ccl:headers;usr;include;") and, in the process, generates new versions of the GDBM databases ("ccl:headers;constants.gdbm", "ccl:headers;functions.gdbm", "ccl:headers;records.gdbm", and "ccl:headers;types.gdbm"). The .lisp files produced in this step aren't used directly by OpenMCL, but may be interesting as reference material: the information in the .gdbm files is an encoded version of the union of the information in the .lisp files.
Most of the entities in the .gdbm files are named (this is true of all types, constants, and functions, and of most record types). These names (and the gensyms used to uniquely identify anonymous records) are mapped to upper case, and the resulting strings are used as database keys. (The case of external function names is preserved, and this information is stored, along with parameter type information, in the "value" associated with that key.)
This means that if two distinct foreign entities (the hypothetical functions Open and oPeN, for instance) differ only in case, only one of them (chosen arbitrarily) will be accessible in the database under the key OPEN. It's assumed that there are some cases where this occurs, but it's not known how often conflicts happen. At this point, the convenience of being able to ignore case issues from Lisp seems to be more important in practice.
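To get a rough idea of how often such conflicts might occur in a given set of names, the folding rule can be imitated outside of Lisp. The sketch below is an assumption-laden illustration: db_key and find_key_collisions are hypothetical names, the script knows nothing about GDBM, and it simply upcases each name and reports any key shared by more than one name.

```shell
# Hypothetical: fold a foreign name to the upper-case string used as a key.
db_key() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

# Hypothetical: print "KEY: name1 name2 ..." for every key that more than
# one of the given names folds to. Names are listed in the order given.
find_key_collisions() {
    for name in "$@"; do
        printf '%s %s\n' "$(db_key "$name")" "$name"
    done | awk '
        {
            if (count[$1]++) names[$1] = names[$1] " " $2
            else             names[$1] = $2
        }
        END {
            for (k in count)
                if (count[k] > 1) print k ": " names[k]
        }'
}
```

For example, `find_key_collisions Open oPeN close` reports the single shared key OPEN along with the two names that fold to it.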
The GDBM databases are used by the #$ and #_ reader macros and
are used in the expansion of RREF, RLET, and related macros.
GDBM is licensed under different terms (the GPL) than OpenMCL (which is licensed under the LGPL), and this may have implications for those parties wishing to distribute OpenMCL-based applications. The code in OpenMCL that uses GDBM is isolated in the files "ccl:lib;db-io.lisp" and "ccl:binppc;db-io.pfsl", and OpenMCL's use of that code is limited to read-time and macroexpand-time. The intent is that GDBM-related code is easily isolated and removed from an OpenMCL application; another approach would be to replace GDBM (which is very good at what it does) with something that offers different licensing terms.
There's probably no such thing as a "standard" set of Linux header files. (Perhaps it's more accurate to say that there are a large number of standards.) Different releases of different distributions may install different versions of different sets of header files in different locations, and users install different sets of different optional packages. (For the benefit of those not paying careful attention, the operative word seems to be "different".)
The populate.sh shell script generates .ffi files from a set of header files installed on a Debian 2.2 LinuxPPC system with a number of optional and local packages installed. Most of the foreign code used internally in OpenMCL is from a small, fairly stable Linux subset; there may be significant differences between the header file information used to generate the distributed GDBM database files and the APIs used in a more recent distribution. (This may be especially true of some high-level user-interface libraries.)
To regenerate the "ccl:headers;*.gdbm" files for your own system, start by reviewing the "populate.sh" shell script. When you're confident that the files and preprocessor options match your environment, cd to the "ccl:headers;C;" directory and invoke ./populate.sh. Repeat this step until you're able to cleanly translate all files referenced in the shell script.
? (require "PARSE-FFI")
PARSE-FFI
? (parse-standard-ffi-files (directory "ccl:headers;C;**;*.ffi"))
;;; lots of output ... after a while, shiny new .gdbm files should
;;; appear in "ccl:headers;"
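Once the translation has finished, a quick check from the shell can confirm that all four database files named earlier are present. This is a minimal sketch: missing_gdbm_files is a hypothetical helper, and the directory passed to it is assumed to be wherever "ccl:headers;" resolves to on your system.

```shell
# Hypothetical: print the name of each interface database that is missing
# from the given directory (defaults to the current directory).
missing_gdbm_files() {
    dir=${1:-.}
    for db in constants functions records types; do
        [ -f "$dir/$db.gdbm" ] || printf '%s.gdbm\n' "$db"
    done
}
```

An empty result means that all four files exist; it says nothing about whether their contents are complete.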