SQL Processor

This project has retired. For details please refer to its Attic page.

SQL Processor

Description

SQLProcessor is deprecated and no longer supported, and it is recommended that you use the ESQL XSP taglib instead. This documentation is provided for those who are still using SQLProcessor.

SQLProcessor is a processor for Cocoon that performs SQL queries, translates the resultset into an XML fragment, and inserts the fragment in the original document. I wrote it because I've got quite a bit of "legacy" data in SQL databases that I wanted to be able to easily access from Cocoon. I believe that servers and data structures capable of storing and retrieving large amounts of XML data natively will arise, obviating the need for this type of middleware. I also believe that it's going to take a while for them to achieve the same performance and stability we've come to expect from SQL databases, hence the current need for SQLProcessor or something like it.

Installation

Check your cocoon.properties for this line and add it if it's not already there:


	processor.type.sql = org.apache.cocoon.processor.sql.SQLProcessor

Configuration

Add this PI to the XML files to which you wish to add SQL queries:


	<?cocoon-process type="sql"?>

You probably want this before any "xslt" processing directives. Then add SQL connection information to your XML files. The tagset for connection defitions looks like this:

<connectiondefs>
 <connection name="foo_connection">
  <driver>org.gjt.mm.mysql.Driver</driver>
  <dburl>jdbc:mysql://mysql.example.com/foo_database</dburl>
  <username>foo_user</username>
  <password>foo_password</password>
 </connection>
 ... other connections if desired ...
</connectiondefs>

Username and password are optional. This tagset will be removed from the document after processing. Multiple connectiondefs should be processed properly, but there's no good reason I can think of to do that. It's probably desirable to put the connectiondefs in a standalone XML file and link it in with external entities. The syntax for that is:


	<!DOCTYPE page [ <!ENTITY connection_defs SYSTEM "http://www.example.com/cdefs.xml"> ]> <page> &connection_defs; ... other XML data ... </page>

but this has the significant disadvantage that your database usernames and passwords are by default accessible from the WWW. You may wish to protect the cdefs.xml file by restricting access to the directory it lives in to the ip address(es) of your server(s).

The other warning here is that if cocoon is operating on the www.example.com site, it will try to process the cdefs.xml file before giving it to the XML parser. As long as cdefs.xml has no cocoon-process PI, you will not have any trouble but it will be somewhat slower than if cocoon is not involved. A simple trick to get around this is to rename the file to an extension that does not trigger cocoon. I use .sxml, shorthand for Source XML. Those of you with real operating systems can just make a link to the .xml file.

In addition to the connection element, you can also have querydefs tags in your connectiondefs node. A querydefs node must have a unique name attribute. query tags (described below) may reference a querydefs tag in their defs attribute, which causes the query tag's attributes to default to the attributes of the querydefs tag. This is useful for setting commonly used groups of attributes. You may also specify a default querydefs. Here is an example:


	<querydefs name="standard" doc-element="options" row-element="option" connection="foo"/> <query defs="standard"/>

is equivalent to


	<query doc-element="options" row-element="option" connection="foo"/>

You can also flag a querydefs node as the default for all query tags by setting the default attribute to yes. Note that query tags can always override the defaults.

Usage

Add SQL queries to your XML files. The tagset looks like this:


	<query connection="foo_connection"> select name,number,message from foo_table order by name </query>

This will be replaced by a tagset that looks like this:


	<ROWSET> <ROW ID="0"> <NAME>Donald Ball</NAME> <NUMBER>23</NUMBER> <MESSAGE> The Illuminati do not exist. This message paid for by the Illuminati.</MESSAGE > </ROW> ... other rows ... </ROWSET>

You can also have SQLProcessor substitute values from the servlet request into your query. The syntax for that is:


	<query connection="foo_connection"> select name,number,message from foo_table where id={@id} order by name </query>

This is, of course, highly configurable by setting attributes of the query tag. A partial list of attributes is:

doc-element -
The tag with which to wrap the whole shebang. Default is ROWSET. If an empty string is specified, e.g. doc-element="", the whole shebang will not be wrapped.
row-element -
The tag with which to wrap each row. Default is ROW. Same deal as with doc-element.
tag-case -
The case in which to present the column name tags. Default is "preserve", meaning use whatever the JDBC driver likes to call the columns. Options are "upper" for forced uppercase (boo) or "lower" for forced lowercase (yay).
null-indicator -
What do we do with null columns. default is to not print anything for null columns, not even a column tag. If this is set to "y" or "yes", we add a NULL="YES" attribute to the column and put an empty string inside.
id-attribute -
What is the name of the attribute that uniquely identifies the rows? Default is ID. This is, of course, meaningless if row-element is set to an empty string.
id-attribute-column -
Which column should we use for the id attribute value (think primary key column). Default is to use the offset of this row in the query's resultset.
max-rows-attribute -
How many rows should we display
skip-rows-attribute -
How many rows should we skip before starting to display rows
variable-left-delimiter -
What string delimits variables in the query (HttpRequest.getParameter(name)) on the left side? Default is '{@'.
variable-right-delimiter -
What string delimits variables in the query (HttpRequest.getParameter(name))on the right side? Default is '}'.
session-variable-left-delimiter -
What string delimits session-variables (HttpSession.getValue(name)) in the query on the left side? Default is '{@session.'.
session-variable-right-delimiter -
What string delimits session-variables (HttpSession.getValue(name)) in the query on the right side? Default is '}'.
update-rows-attribute -
Simple query or are rows updated. Default is ''. For a update query use attribute 'update-rows-attribute="yes"'
count-attribute -
Add a attribute with the specified name to the doc-element which contains the row count of the query result. Default is '' (do not use this feature).
query-attribute -
Add a attribute with the specified name to the doc-element which contains the ULR encoded query string. Default is '' do not use this feature).

Creating Complex Queries

URL request parameter substitution using {@var} is fine for many situations, but is inadequate for handling queries whose conditions are dependent on the state of many variables. To accomodate these queries, you can specify the name of a subclass of SQLQueryCreator that will create the query string. The method that should be overridden is:


	public String getQuery(Connection conn, String query, Element query_element, Properties query_props, Hashtable parameters) throws Exception;

This looks like a lot of parameters. I didn't figure that it made sense to deny any of the potential information available to the query creator. Just ignore what you don't care about. The parameters that deserve explanation are query_props, which is a Properties object containing all of the attribute values of the query tag, keyed by name. The parameters table is passed from cocoon to its processors, and contains lots of informatin, notable the original HttpServletRequest, keyed by the string "request".

You specify the particular subclass of SQLQueryCreator in the creator attribute of the query tag. It's probably easiest if you put your query creators in a package, e.g. com.webslingerZ.cocoon.ComplexQueryCreator.

One utility method is provided for your convenience. The SQLEscape method should parse a string for apostrophes and replace them with whichever string their database likes to use instead. It's not working correctly for my database drivers, so the default getQuery method does not use it. If anyone can suggest the correct way to go about doing this, please contact me with your suggestion or, ideally, a patch.

This is probably an overly complex way to handle most complex queries. It is hopefully the case that a more sophisticated variable substitution scheme or alternate attribute set will arise that accomodates more common complex queries without the need to write Java code. There is also some rationale for making SQLQueryCreator an interface. I welcome suggestions along these lines.

Error Handling

In a perfect world, nothing would ever go wrong in your queries but this is not a perfect world. In our world, you can check for SQLExceptions in your stylesheets and present them to your users in a nice way. The tags used to render the error are configurable using attributes from the query tag. The attributes in question are:

error-element -
The name of the error-element to create, default is sqlerror.
error-message-attribute -
The name of the attribute of the error-element in which to put the Exception's message. Default is message.
error-message-element -
The name of the child element of the error-element in which to put the Exception's message. The default is to not do so.
error-stacktrace-attribute -
The name of the attribute of the error-element in which to put the Exception's stacktrace. The default is to not do so.
error-stacktrace-element -
The name of the child element of the error-element in which to put the Exception's stacktrace. The default is to not do so.
error-message-text -
The text to pass out as an error message instead of the Exception's own message. This is useful for displaying user friendly error messages. The default is to use the Exception's message.

So by default, when an SQLException occurs, the processor generates this XML:


	<sqlerror message="The database server is on fire."/>

Notes and Disclaimers

I sure hope this is useful to you. It's proving to be very useful to me. The DTD used here is a superset of that used by Oracle's XSQL servlet (which provides no methods for specifying alternate variable delimiters, creating complex queries, query default attributes, or error handling). Technical notes for the XSQL servlet are available - note you apparantly must now fill out an annoying survey to get access to this page. We're sharing the DTD with the tacit approval of Steve Meunch, who kindly pointed me in that direction when he saw a) how similar our methodologies were and b) how horrible my original DTD was. Thanks also go out to Stefano Mazzocchi, who started this whole cocoon business in the first place, and Ricardo Rocha, whose DCP processor provided the framework for SQLProcessor's parent, XMLResultSet. Stefano Malimpensa has suggested many useful features that are beginning to make their way into the code.

Planned Features

support for a <querydefs> tag for setting default query attributes

time-based caching of query results

simple postprocessing of SQL values (e.g. date formats)

error message display