A Publishing Infrastructure
The Cocoon project aims at changing the way web information is
created, rendered and served. This new paradigm is based on the fact
that document content, style and logic are often created by different
individuals or working groups. Cocoon's goal of a complete separation
of the three layers allows them to be independently designed, created
and managed. This reduces management overhead, increases work reuse
and reduces time to market.
The Cocoon publishing model is heavily based on the XSLT
transformation capabilities. XSLT allows complete separation of
content and style (something that is much harder to obtain with HTML,
even using CSS2 or other styling technologies). But Cocoon goes
further and defines a way of separating content and style from the
programming logic that drives server side behavior.
This fact has been widely acknowledged by most major vendors:
Microsoft, IBM, SUN, Oracle, etc. All of these vendors offer XML and
XSL processors. Cocoon uses these technologies and incorporates them
into a publishing infrastructure.
An infrastructure must have these characteristics:
- pervasive scope
- standardized access points
- minimal (almost zero) user intervention
- transparent service provision
- applicability to existing and future applications
Cocoon has a pervasive scope as it addresses all aspects of web
publishing needs. Even if the most common use of Cocoon is the
automatic creation of HTML (and XHTML) through the processing of
statically or dynamically generated XML files, Cocoon is also able to
perform more sophisticated formatting, such as XSL:FO rendering to
PDF, client-dependent transformations such as WML formatting for
WAP-enabled devices, or direct XML serving to XML and XSL aware
clients. Cocoon can even apply different stylesheets to the same XML
content based on the requesting client.
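As a rough illustration of client-dependent styling, the sketch below
picks a stylesheet from the User-Agent header of a request. It is not
Cocoon's own selection mechanism: the class name
StylesheetSelectorSketch, the stylesheet file names and the
substring-matching rule are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Cocoon: maps a User-Agent fragment
// to the name of the stylesheet that should render the content.
final class StylesheetSelectorSketch {
    // Insertion order is kept so that earlier rules win.
    private final Map<String, String> rules = new LinkedHashMap<>();
    private final String fallback;

    StylesheetSelectorSketch(String fallback) {
        this.fallback = fallback;
    }

    // Registers a rule; matching is case-insensitive.
    StylesheetSelectorSketch addRule(String userAgentFragment, String stylesheet) {
        rules.put(userAgentFragment.toLowerCase(Locale.ROOT), stylesheet);
        return this;
    }

    // Returns the stylesheet of the first matching rule, or the fallback.
    String select(String userAgent) {
        if (userAgent != null) {
            String ua = userAgent.toLowerCase(Locale.ROOT);
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                if (ua.contains(rule.getKey())) {
                    return rule.getValue();
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // File names are placeholders for the example.
        StylesheetSelectorSketch selector =
            new StylesheetSelectorSketch("page-html.xsl")
                .addRule("Nokia", "page-wml.xsl");
        System.out.println(selector.select("Nokia7110/1.0"));
        System.out.println(selector.select("Mozilla/4.0"));
    }
}
```

The same content file can then be paired with whichever stylesheet
the selector returns, so only the styling layer differs per client.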
Once installed on a web server such as Site Server, Apache or
Netscape, Cocoon doesn't require any user intervention. It is there
and usable for content publishing. Only when creating a custom XML
processor does one need to touch Cocoon itself.
Cocoon offers transparent services: its API is available through a
simple XML syntax suitable for most users' needs. For more
specialized needs, a simple and effective Java interface provides the
access needed for more powerful manipulation of XML elements.
Finally, Cocoon is applicable to existing and future
applications. This is due in part to the fact that XML is the glue
used to connect most of the parts together, ensuring a long life for
the content and an easy interface to current applications. The way
Cocoon handles dynamic site creation also enhances Cocoon's
applicability to current and future applications' needs.
Publishing Static Content
Cocoon can apply different stylesheets to different clients. In the
more esoteric examples, it allows one to create a WML and an HTML
version of the same content file. In most cases, HTML is the desired
output. The reason is that unless the XML and XSL aware client
understands the same version of the standards as the one Cocoon is
using (XML 1.0 and XSLT 1.0), only HTML can truly be seen as a
suitable output for today's web publishing needs.
This means that server side processing of the XML content files is
needed. This is done by creating an XML file and an appropriate XSL
file. This is the standard way of generating HTML files from XML
content. What Cocoon offers is a transparent way of doing this
processing. There is no need to write an ASP, JSP or CGI to call an
XML parser and then process the output with an XSL processor. This is
automatically handled by Cocoon's infrastructure.
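For comparison, here is a minimal sketch of the hand-written step
that this infrastructure makes unnecessary: parsing a content file
and applying a stylesheet with the standard javax.xml.transform API.
The content, the stylesheet and the class name SingleTransformSketch
are invented for the example and are not taken from Cocoon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SingleTransformSketch {
    // Hypothetical content file: pure content, no presentation.
    static final String CONTENT =
        "<page><title>Hello</title><para>First paragraph.</para></page>";

    // Hypothetical stylesheet that turns the content into HTML.
    static final String STYLE =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/page'>"
      + "<html><head><title><xsl:value-of select='title'/></title></head>"
      + "<body><h1><xsl:value-of select='title'/></h1>"
      + "<xsl:apply-templates select='para'/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='para'>"
      + "<p><xsl:value-of select='.'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies one stylesheet to one document and returns the result.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(CONTENT, STYLE));
    }
}
```

With Cocoon in place, the content author only supplies the XML and
XSL files; code of this kind does not have to be written per page.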
In that infrastructure, it is possible to chain calls to XSLT
processors. This has the possibly great advantage of separating all
the layers of the publishing needs: flow, layout and display. A user
concentrates on writing a content file. Cocoon can then call a first
transformation that deals with the flow. The flow transformation
usually adds menus suitable for a directory, a whole site or any
special need of the content file. The flow transformation can then
call a layout transformation sheet. That sheet will add more layout
information such as headers, footers, color schemes, etc. Finally, the
display transformation handles the actual conversion to HTML
from the output of all the previous processing.
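The flow, layout and display chain described above can be sketched
with plain javax.xml.transform calls, feeding the output of each
stylesheet into the next. This is only an illustration under stated
assumptions: the element names (menu, header, footer), the three
stylesheets and the class XsltChainSketch are hypothetical, and the
loop stands in for whatever mechanism Cocoon itself uses to chain
processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltChainSketch {
    private static final String XSL_OPEN =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";
    private static final String XSL_CLOSE = "</xsl:stylesheet>";
    // Identity template: copies everything the other templates ignore.
    private static final String IDENTITY =
        "<xsl:template match='@*|node()'><xsl:copy>"
      + "<xsl:apply-templates select='@*|node()'/>"
      + "</xsl:copy></xsl:template>";

    // Hypothetical content file written by the author.
    static final String CONTENT =
        "<page><title>News</title><para>Body text.</para></page>";

    // Flow: adds a menu in front of the content.
    static final String FLOW = XSL_OPEN + IDENTITY
      + "<xsl:template match='page'><page>"
      + "<menu><item>Home</item></menu>"
      + "<xsl:apply-templates select='node()'/>"
      + "</page></xsl:template>" + XSL_CLOSE;

    // Layout: wraps the page with a header and a footer.
    static final String LAYOUT = XSL_OPEN + IDENTITY
      + "<xsl:template match='page'><page>"
      + "<header>My Site</header>"
      + "<xsl:apply-templates select='node()'/>"
      + "<footer>Footer text</footer>"
      + "</page></xsl:template>" + XSL_CLOSE;

    // Display: converts the enriched document to HTML.
    static final String DISPLAY = XSL_OPEN
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/page'><html><body>"
      + "<xsl:apply-templates/></body></html></xsl:template>"
      + "<xsl:template match='header'><h1><xsl:value-of select='.'/></h1></xsl:template>"
      + "<xsl:template match='menu'><ul><xsl:apply-templates/></ul></xsl:template>"
      + "<xsl:template match='item'><li><xsl:value-of select='.'/></li></xsl:template>"
      + "<xsl:template match='title'><h2><xsl:value-of select='.'/></h2></xsl:template>"
      + "<xsl:template match='para'><p><xsl:value-of select='.'/></p></xsl:template>"
      + "<xsl:template match='footer'><div><xsl:value-of select='.'/></div></xsl:template>"
      + XSL_CLOSE;

    // Applies the stylesheets in order; each one consumes the
    // previous output.
    static String render(String xml, String... stylesheets) {
        TransformerFactory factory = TransformerFactory.newInstance();
        String current = xml;
        try {
            for (String xsl : stylesheets) {
                Transformer step = factory.newTransformer(
                    new StreamSource(new StringReader(xsl)));
                StringWriter out = new StringWriter();
                step.transform(new StreamSource(new StringReader(current)),
                               new StreamResult(out));
                current = out.toString();
            }
        } catch (TransformerException e) {
            throw new IllegalStateException("chain failed", e);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(render(CONTENT, FLOW, LAYOUT, DISPLAY));
    }
}
```

Because each stage only knows about its own concern, the menu can
change without touching the layout or display stylesheets.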
Although the previous example is overkill for most web sites'
needs, Cocoon's caching system ensures that multiple XSLT
transformations do not slow down the delivery of pages to the
user. The transformation is done once and the
resulting HTML output is presented many times.
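The idea of transforming once and serving many times can be sketched
as a small in-memory cache. This is not Cocoon's actual cache: the
class RenderCacheSketch, its methods and the use of the resource name
as the cache key are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache: stores the rendered output per resource name.
final class RenderCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger renderCount = new AtomicInteger();
    private final Function<String, String> renderer;

    // The renderer stands in for the (possibly chained) transformation.
    RenderCacheSketch(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    // Renders on the first request only; later requests hit the cache.
    String get(String resource) {
        return cache.computeIfAbsent(resource, key -> {
            renderCount.incrementAndGet();
            return renderer.apply(key);
        });
    }

    // Drops a cached page, for example after its source file changed.
    void invalidate(String resource) {
        cache.remove(resource);
    }

    // Number of times the renderer actually ran.
    int renderCount() {
        return renderCount.get();
    }

    public static void main(String[] args) {
        RenderCacheSketch pages =
            new RenderCacheSketch(name -> "<html>" + name + "</html>");
        pages.get("index.xml");
        pages.get("index.xml");
        System.out.println("renders: " + pages.renderCount());
    }
}
```

A real cache would also need to notice when the content or a
stylesheet changes; here that is left to an explicit invalidate call.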
Cocoon's infrastructure can also be applied to an offline
scenario. In the case of web sites, this generates static HTML
files. Offline, very complex processing can be performed on the XML
files, processing that might be too time consuming at run time:
automatic image generation, flow generation by directory analysis,
etc. An example of an offline scenario can be seen at Cocoon's site
(http://cocoon.apache.org/1.x/).
On that site, the menus and headers
are automatically generated by processing XML content files. This
allows for a more professional look and feel, yet it doesn't require
an artist each time a label changes.
Publishing Dynamic Content
In dynamic content generation technology, content and logic are
combined: in every page there is a mix of static content and dynamic
logic that work together to create the final result, usually using
run-time or time-dependent input. XSP is no exception, since it
defines a syntax to mix static content and programmatic logic in a way
that is independent of both the programming language used and the
binary results that the final source-rendering generates.
But it must be understood that XSP is just a piece of the
framework: just as formatting objects mix style and content,
XSP objects mix logic and content. On the other hand, since both are
XML DTDs, XSLT can be used to move from pure content to these final
DTDs, placing the style and logic on the transformation layers and
guaranteeing complete separation and easier maintenance.
Other ways to create dynamic content are through XSP taglibs,
processors and formatters. An XSP taglib allows Cocoon to map XML
elements to processing instructions. Custom Cocoon processors can go
through an XML document and modify its content. Finally, the formatters
are used to generate XML documents based on a user's request and
its session information.
Together these elements offer the equivalent of COM and CORBA. They
offer a way to solve the interoperability problem between different
applications. In the Cocoon infrastructure, XML becomes the API from
which other components are called. Whether COM, CORBA, Java or any
other technology is actually called becomes transparent to the user. The
user only needs to know about the XML API which, by definition, should
strive to be human readable. This makes debugging and maintenance a
lot easier.
This approach to dynamic content generation is quite different from
other methods. The typical method proposed by SUN and Microsoft is to
generate an XML output from a servlet or ASP page, then process it
with an XSLT processor. Some of the drawbacks of that method are:
- Logic and content are mixed. Editing such a file requires more
senior programmers than editing a simple XML file.
- There is no proper multiple XSLT processing. This can be
individually programmed; however, it is a difficult problem and might
prevent some web servers from effectively caching the output.
- The XML output must be display friendly. This means that the
output XML must be easily understandable by the final XSLT file to
generate a suitable HTML file. Sometimes, an XML element must be
converted to be more suitable to a given display context. This process
can be automated inside Cocoon by adding XSLT transformations and
adding display hints. However, inside a JSP or ASP page,
programmers will tend to write more in terms of the display and less
in terms of the semantics.
- Reusability is limited. With Cocoon's approach, it is
straightforward for a content writer to add a <locale:date> tag
inside his document. That tag will call the proper functionality
required to display the current date in the user's current
locale. Without Cocoon, this becomes a lot more cumbersome since the
programmer must explicitly call the proper functions and incorporate
the result inside the XML output stream. In that case, locale
considerations might not have been applied across the site, and this
is difficult to check when the content is mixed with the logic.
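To make the reusability point concrete, the sketch below expands a
<locale:date> element into a localized date using the standard DOM
and java.time APIs. It only illustrates the idea of mapping a tag to
logic; it is not how Cocoon's taglibs are implemented. The namespace
URI urn:example:locale, the class LocaleDateTagSketch and the choice
of the LONG date style are assumptions made for the example.

```java
import java.io.StringReader;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LocaleDateTagSketch {
    // Hypothetical namespace for the example tag.
    static final String LOCALE_NS = "urn:example:locale";

    // Replaces every <locale:date/> with the date formatted for the
    // given locale and returns the text of the resulting document.
    static String expand(String xml, LocalDate date, Locale locale) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            String text = DateTimeFormatter
                .ofLocalizedDate(FormatStyle.LONG)
                .withLocale(locale)
                .format(date);

            // The node list is live: each replaced tag drops out of it,
            // so the loop ends once no tag is left.
            NodeList tags = doc.getElementsByTagNameNS(LOCALE_NS, "date");
            while (tags.getLength() > 0) {
                Node tag = tags.item(0);
                tag.getParentNode().replaceChild(doc.createTextNode(text), tag);
            }
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not expand <locale:date>", e);
        }
    }

    public static void main(String[] args) {
        String content =
            "<para xmlns:locale='urn:example:locale'>"
          + "Today is <locale:date/>.</para>";
        LocalDate day = LocalDate.of(2000, 1, 15);
        System.out.println(expand(content, day, Locale.US));
        System.out.println(expand(content, day, Locale.FRANCE));
    }
}
```

The content author writes only the tag; the locale handling lives in
one place and applies to every document that uses it.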
Conclusion
The Cocoon model allows web sites to be highly structured and well
designed, reducing duplication efforts and site management costs by
allowing different presentations of the same data depending on the
requesting client (HTML clients, PDF clients, WML clients) and
separating different requirements, skills and capacities into
different contexts. Cocoon allows better human resource management by
giving each individual their own job and reducing to a minimum the
cross-talk between different working contexts.
To do this, the Cocoon model divides the development of web content
into three separate levels:
- XML creation: the XML file is created by the content
owners. They do not need specific knowledge of how the XML content
is further processed, other than the particular chosen
DTD/namespace. This is done through human intervention or through
dynamic generation.
- XML processing: the requested XML file is processed and
the logic contained in its logicsheet is applied. Unlike other dynamic
content generators, the logic is separated from the content
file.
- XSL rendering: the created document is then rendered by
applying an XSL stylesheet to it and formatting it to the specified
resource type (HTML, PDF, XML, WML, XHTML).
Unlike other XML projects, Cocoon concentrates on solving the
publishing infrastructure problem. In that respect it is ahead of a
lot of the major vendors that up to now seem to only worry about the
technology needed and less about how it integrates into a publishing
framework.