SQL is a popular relational database language first standardized in 1986 by the American National Standards Institute (ANSI). Since then, it has been formally adopted as an International Standard by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It has also been adopted as a Federal Information Processing Standard (FIPS) for the U.S. federal government.
Database Language SQL is under continual development by the above mentioned standardization bodies. The most recent published version was in 1992, a 580 page specification published by ANSI as American National Standard X3.135-1992 and by ISO/IEC as International Standard 9075:1992. The two specifications are word-for-word identical. Both versions are available in hard copy only from ANSI (Sales telephone: +1-212-642-4900). Further extensions and enhancements are under development (see SQL ENHANCEMENTS below).
The 29 page FIPS SQL specification is published as FIPS PUB 127-2; it points to the ANSI specification by reference, presents four FIPS SQL conformance levels, and identifies additional FIPS SQL requirements for flagging extensions, for documentation of supported features, and for character set support. A PostScript copy of FIPS PUB 127-2 is available electronically by anonymous FTP protocol; click here to download.
The SQL standard is very popular with a large and increasing number of conforming implementations. It is, or soon will be, the basis of definition for a majority of Federal databases and database applications involving structured data.
A database language standard is appropriate for all database applications where data will be shared with other applications, where the life of the application is longer than the life of current equipment, or where the application is to be understood and maintained by programmers other than the original ones.
SQL data manipulation operations may be invoked through a cursor or through a general query specification. The language includes all arithmetic operations, predicates for comparison and string matching, universal and existential quantifiers, summary operations for max/min or count/sum, and GROUP BY and HAVING clause to partition tables by groups. Transaction management is achieved through COMMIT and ROLLBACK statements.
The standard provides language facilities for defining application specific views of the data. Each view is the specification of database operations that would produce a desired table. The viewed table is then materialized at application execution time.
The SQL standard provides a Module Language for interface to other languages. Each SQL statement may be packaged as a procedure that can be called and have parameters passed to it from an external language. A cursor mechanism provides row-at-a-time access from languages that can only handle one row of a table at one time.
Access control is provided by GRANT and REVOKE statements. Each prospective user must be explicitly granted the privilege to access a specific table or view using a specific statement.
The SQL Integrity Enhancement facility offers additional tools for referential integrity, CHECK constraint clauses, and DEFAULT clauses. Referential integrity allows specification of primary and foreign keys with the requirement that no foreign key row may be inserted or updated unless a matching primary key row exists. Check clauses allow specification of inter-column constraints to be maintained by the database system. Default clauses provide optional default values for missing data.
The Embedded SQL specification provides SQL interface to programming languages, specifically Ada, C, COBOL, FORTRAN, Pascal, and PL/I. Applications may thereby integrate program control structures with SQL data manipulation capabilities. The Embedded SQL syntax is just a shorthand for an explicit SQL Module accessed from a standard conforming programming language.
SQL-92 significantly increases the size of the original 1986 standard to include a schema manipulation language for modifying or altering schemas, schema information tables to make schema definitions accessible to users, new facilities for dynamic creation of SQL statements, and new data types and domains. Other new SQL-92 features include outer join, cascade update and delete referential actions, set algebra on tables, transaction consistency levels, scrolled cursors ,deferred constraint checking, and greatly expanded exception reporting. SQL-92 also removes a number of restrictions in order to make the language more flexible and orthogonal.
The following textbooks identify and explain all of the features in the SQL-92 specification:
Jim Melton and Alan R. Simon Understanding the New SQL: A Complete Guide Morgan-Kauffman Publishers, San Mateo, CA 94403 U.S.A. October 1992 C.J. Date with Hugh Darwen A Guide to the SQL Standard, 3rd Edition Addison-Wesley Publishing Company, Reading, MA 01867 U.S. A. October 1992 Stephen Cannan and Gerard Otten SQL - The Standard Handbook McGraw-Hill Book Company Europe, Berkshire, SL6 2QL, England November 1992
The relational data model, and thereby the SQL standard, is appropriate for database applications requiring flexibility in the data structures and access paths of the database. It is desirable both for applications under production control and when there is a substantial need for ad hoc data manipulation by end users who are not computer professionals.
The SQL Schema Definition Language is particularly appropriate for describing tables of information that may be transferred among various data management applications. Used with the Remote Database Access (RDA) standard, it is possible to exchange data occurrences in a standard manner and for conforming systems to interoperate.
In 1993, the ANSI and ISO development committees decided to split future SQL development into a multi-part standard. The Parts, as of August 1994, are:
Part 1: Framework A non-technical description of how the document is structured.
Part 2: Foundation The core specification, including all of the new ADT features; currently over 800 pages.
Part 3: SQL/CLI The call level interface. Currently being processed for DIS Ballot.
Part 4: SQL/PSM The stored procedures specification, including computational completeness. Currently being processed for 1st CD Ballot.
Part 5: SQL/Bindings The Dynamic SQL and Embedded SQL bindings taken from SQL-92. No active new work at this time, although C++ interfaces are under discussion.
Part 6: SQL/XA An SQL specialization of the popular XA Interface developed by X/Open (see below).
The early piece could be progressed to CD in 1995 with final adoption in 1997. All of the user-defined VALUE ADTs, subtype families, Assertions, Triggers, Savepoints, and Cursor extensions easily made it into the "early" piece as everyone agrees they are stable specifications. The OBJECT ADT facility is on the edge between "early" and "later" and the WITH OID VISIBLE alternative is currently in the "later" piece. All of the collection type stuff (i.e. lists, sets, arrays, multisets) are in the "later" piece because their specification is still incomplete.
The Call Level Interface is an alternative mechanism for executing SQL statements. The SQL/CLI consists of a number of routines that:
-- Allocate and deallocate resources.
-- Control connections to SQL-servers.
-- Execute SQL statements using mechanisms similar to Dynamic SQL.
-- Obtain diagnostic information.
-- Control transaction termination.
The CLI ExecDirect routine and the CLI Prepare routine each support an input character string parameter identified as StatementText. If P is the value of StatementText, then P shall satisfy the following restrictions:
1. P shall conform to the Format, Syntax Rules, and Access Rules for a "preparable statement" as specified in Subclause 17.6, "prepare statement", of the SQL'92 standard.
2. P shall not be a "commit statement" or a "rollback statement".
3. P shall abide by the Leveling Rules of the level of SQL support claimed by the SQL/ERI Server.
The SQL/CLI specification is intended to support CLI routines embedded into both pointer-based programming languages and non-pointer-based programming languages.
In particular, CLI routines may be embedded into any one of the following standard programming languages: Ada, C, COBOL, Fortran, MUMPS, Pascal, and PL/I.
An SQL/ERI Server will indicate which of those languages it supports.
SQL/CLI contains a number of features to make it fully compatible with SQL-92 (e.g. handling of international character sets, scrolled cursor support, and support for dynamic SQL), and "concise" routines have been added to make it backward compatible with MicroSoft's ODBC and the early SQL Access and X/Open Snapshot specification.
The Editor will produce DIS text by mid-October, 1994, with the expectation that the DIS Ballot can be completed by May 1995. If all goes well, there should be a published ISO/IEC SQL/CLI standard in late 1995.
This is a very important requirement in client/server environments.
An advantage of the PSM specification is that non-SQL data repositories will now be able to present themselves to an SQL Environment as conforming to a Minimal level of SQL data manipulation, while at the same time offering their "value- added" facilities as SQL-callable functions and procedures. This should be an important stimulant to get legacy systems and other non-SQL data repositories, such as Document Management Systems and Geographic Information Systems, to claim conformance to one of the SQL/ERI profiles defined in the draft FIPS PUB for SQL Environments (see SQL ENVIRONMENTS below).
The SQL/PSM specification will undergo its 1st CD Ballot this Fall, with likely progression to DIS no later than July 1995. It should be formally adopted as an ISO/IEC standard in calendar year 1996.
The current schedule for SQL/XA calls for CD registration in 1996, with formal adoption as an ISO/IEC standard in 1998.
The Project Plan for SQL/MM indicates that it will be a multi-part standard consisting of an evolving number of parts. Part 1 will be a Framework that specifies how the other parts are to be constructed. Each of the other parts will be devoted to a specific SQL application package. The following SQL/MM Part structure exists as of August 1994:
Part 1: Framework A non-technical description of how the document is structured.
Part 2: FullText Methods and ADTs for text data processing. Only minimal content at present.
Part 3: Spatial Methods and ADTs for spatial data management. About 125+ pages with active contributions from Spatial Data experts from 3 national bodies.
Part 4: Gen Purpose Methods and ADTs for complex numbers, Facilities trig and exponential functions, vectors, sets, etc. Currently about 85 pages.
The Draft FIPS PUB for SQL Environments is the beginning of a continuing effort to define appropriate conformance profiles that can be used by both vendors and users to specify exact requirements for how various products fit into an SQL environment. The emphasis is to specify general purpose, SQL external repository interface (SQL/ERI) profiles for non-SQL data repositories. These profiles specify how a subset of the SQL standard can be used to provide limited SQL access to legacy databases, or to support SQL gateways to specialized data managers such as Geographic Information Systems (GIS), full-text document management systems, or object database management systems. All of the profiles specified therein are for server-side products, that is, products that control persistent data and provide an interface for user access to that data. Subsequent versions of the FIPS PUB for SQL Environments may specify SQL profiles for client-side products, that is, products that access data and then present that data in graphical or report-writer style to an end user, or process the data in some other way on behalf of the end user.
Many applications require a logically integrated database of diverse data (e.g. documents, graphics, spatial data, alphanumeric records, complex objects, images, voice, video) stored in geographically separated data banks under the management and control of heterogeneous data management systems. An over-riding requirement is that these various data managers be able to communicate with one another and provide shared access to data and data operations and methods under appropriate security, integrity, and access control mechanisms. Much of this source data may be stored in simple file systems, legacy data management systems, or very specialized data repositories that satisfy only a small percentage of these data management requirements. The objective of an SQL environment is to logically integrate these diverse data repositories "as if" they were under the control of a single SQL data manager. User presentation tools, such as graphical user interfaces or report writers, can then use this SQL interface to collect data from various sources, merge it together under ad hoc join conditions, and present it to the user in a pleasing graphical format.
A properly functioning SQL environment will use the SQL language to describe this data using standardized facilities, integrate it into a single federated collection, enforce any integrity or access control constraints, and make it available as a logical whole to sophisticated user productivity or presentation tools. These client-side tools can then use the full power and flexibility of SQL for data retrieval and manipulation. The underlying data managers may implement non-relational data models and thus may have difficulty supporting SQL requirements for nested subqueries, multi-table joins, derived columns in a Select list, referential integrity, or other relational model features. On the other hand, they may offer advanced features of other data models that are rarely supported by relational implementations. With emerging features in the SQL language for user-defined abstract data types (ADTs), stored procedures, encapsulation, polymorphism, and other object management facilities, these diverse data repositories can be described as specialized SQL repositories and accessed using already standardized SQL binding alternatives.
With this approach, SQL may prove to be as successful as an integrator of heterogeneous data repositories as it has been as a language interface to the relational data model. The SQL language can meet these integration objectives if non-SQL implementations provide a "simple" SQL interface to their data and services, and if full-function SQL implementations use that simple interface to provide full-function services to end user tools. The Draft FIPS for SQL Environments specifies standard profiles for such "simple" SQL interfaces, thereby making it easier to specify and support the desired integration. A PostScript copy of the Draft FIPS for SQL Environments is available electronically by anonymous FTP protocol; click here to download.
SQL and RDA provide the basis for standardized communication. An SQL external repository interface (SQL/ERI) makes it possible for non-SQL data repositories to share their data with user applications. With emerging SQL3 enhancements for object-oriented and knowledge-based data management and emerging RDA extensions for distributed database, the ERI can evolve to support "seamless" data integration.