What is it?
Cocoon is a 100% pure
Java publishing framework that relies on new W3C
technologies (such as DOM, XML, and XSL) to provide web content.
The Cocoon project aims to change the way web information is created,
rendered and served. The new Cocoon paradigm is based on the fact that
document content, style and logic are often created by different individuals
or working groups. Cocoon aims for a complete separation of these three layers,
so that each can be independently designed, created and managed,
reducing management overhead, increasing work reuse and reducing time to market.
The Introduction to Cocoon
Technologies white paper provides a clear, introductory-level overview.
What does it do?
Web content generation is mostly based on HTML, but HTML doesn't separate
information from its presentation: it mixes formatting tags, descriptive tags and
programmable logic (on both the server side and the client side). Cocoon
offers a different way of working,
allowing content, logic and style to be separated out into different XML files,
and using XSL transformations to merge them.
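As a rough illustration of merging separated content and style, the sketch below applies an XSL stylesheet to an XML document. It uses the standard javax.xml.transform API shipped with the JDK rather than Cocoon itself, and the page/title/para tagset, the class name and the render method are assumptions made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StyleSeparationSketch {

    // Content only: no formatting tags. The tagset is hypothetical.
    static final String CONTENT =
        "<page><title>Hello</title><para>First paragraph.</para></page>";

    // Style only: maps the hypothetical tagset onto HTML.
    static final String STYLE =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"page\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1>"
      + "<xsl:apply-templates select=\"para\"/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match=\"para\">"
      + "<p><xsl:value-of select=\".\"/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result.
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(CONTENT, STYLE));
    }
}
```

Because the content file carries no HTML, a different stylesheet could be applied to the same content without touching it.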
What does it change for me?
Although the most common use of Cocoon is the automatic creation of HTML
through the processing of statically or dynamically generated XML files, Cocoon is also
able to perform more sophisticated formatting, such as XSL-FO rendering
to PDF files,
client-dependent transformations such as WML formatting for WAP-enabled
devices, or direct XML serving to XML- and XSL-aware clients.
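One way to picture a client-dependent transformation is a small function that picks a stylesheet from what is known about the requesting client. The sketch below is purely illustrative and does not reflect how Cocoon is configured: the stylesheet names, the "wap" user-agent token and the clientHandlesXsl flag are all hypothetical.

```java
import java.util.Locale;
import java.util.Optional;

public class ClientStylesheetSketch {

    // Hypothetical stylesheet names; a real site would define its own.
    static final String HTML_STYLE = "page-html.xsl";
    static final String WML_STYLE = "page-wml.xsl";

    // Returns the stylesheet to apply, or empty to serve the XML untransformed.
    static Optional<String> stylesheetFor(String userAgent, boolean clientHandlesXsl) {
        if (clientHandlesXsl) {
            // XML- and XSL-aware client: send the XML as-is.
            return Optional.empty();
        }
        String ua = userAgent == null ? "" : userAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("wap")) { // illustrative token only
            return Optional.of(WML_STYLE);
        }
        // Default: render to HTML.
        return Optional.of(HTML_STYLE);
    }

    public static void main(String[] args) {
        System.out.println(stylesheetFor("Example-WAP-Browser/1.0", false));
        System.out.println(stylesheetFor("Example-Desktop-Browser/1.0", false));
    }
}
```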
The Cocoon model allows web sites to be highly structured and
well-designed, reducing duplicated effort and site management costs by
allowing different presentations of the same data depending on the requesting
client (HTML clients, PDF clients, WML clients)
and separating out different contexts with
different requirements, skills and capacities. Cocoon allows better human
resource management by giving each individual their own job and keeping
cross-talk between different working contexts to a minimum.
To do this, the Cocoon model divides the development of web content into three
stages:
XML creation -
The XML file is created by the content owners. They do not
require specific knowledge of how the XML content is further processed;
they only need to know about the particular chosen "DTD" or tagset
for their stage in the process. (As one would expect from a fully generic
XML framework, DTDs are not required in Cocoon, but can be used.)
This stage is always performed by humans directly,
through normal text editors or XML-aware tools/editors.
XML processing -
The requested XML file is processed and the logic contained in its
logicsheet(s) is applied. Unlike other dynamic content generators, the logic
is separated from the content file.
XSL rendering -
The created document is then rendered by applying an XSL
stylesheet to it and formatting it to the specified resource type
(HTML, PDF, XML, WML, XHTML, etc.).
Are there any known problems?
The biggest known problem in this framework is the lack of XML/XSL
expertise - both being relatively new formats. We do believe, though, that this publishing
framework will be a winner in web sites of medium to high complexity and will lead the
transition from an HTML-oriented to an XML-oriented web publishing model,
whilst still allowing the
use of existing client technologies as well as supporting new types of clients
(such as WAP-aware ultra thin clients like cell phones or PDAs).
Even if considered heavy and over-hyped, the XML/XSL pair will do
magic once it receives the widespread public knowledge it deserves. This project
intends to be a small step in that direction - helping people to learn this
technology and to focus on what they need, with examples, tutorials and source code
and a real-life system carefully designed with portability, modularity and
real-life usage in mind.
The main concern remains processing complexity: the kinds of operations required
to process the document layers are complex and were not designed for real-time
operation on the server side. For this reason, Cocoon is designed to be a page compiler
for dynamic pages,
hardcoding those layers, whenever possible, in precompiled binary
code, coupled with an adaptive and memory-wise cache system for both static and
dynamic pages. A key development goal is improving the performance
of both processing subsystems, along with creating and testing
special cache systems.
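The idea of precompiling and caching can be sketched with the JDK's javax.xml.transform.Templates, which holds a compiled stylesheet that can be reused across requests. This is not Cocoon's actual cache: the class name, the size bound and the least-recently-used eviction policy are assumptions chosen only to illustrate the principle.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class CompiledStylesheetCache {

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache;
    private int compileCount = 0;

    // maxEntries bounds memory use; the bound is a hypothetical tuning knob.
    public CompiledStylesheetCache(final int maxEntries) {
        // Access-ordered map that drops the least recently used entry when full.
        this.cache = new LinkedHashMap<String, Templates>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Templates> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the compiled stylesheet for key, compiling xsl only on a miss.
    public synchronized Templates get(String key, String xsl) {
        Templates templates = cache.get(key);
        if (templates == null) {
            try {
                templates = factory.newTemplates(
                    new StreamSource(new StringReader(xsl)));
            } catch (TransformerConfigurationException e) {
                throw new IllegalArgumentException("Cannot compile " + key, e);
            }
            compileCount++;
            cache.put(key, templates);
        }
        return templates;
    }

    public synchronized int compileCount() {
        return compileCount;
    }

    public synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        String xsl = "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"/\"><out/></xsl:template>"
            + "</xsl:stylesheet>";
        CompiledStylesheetCache cache = new CompiledStylesheetCache(8);
        cache.get("page", xsl);
        cache.get("page", xsl); // second call reuses the compiled stylesheet
        System.out.println("compilations: " + cache.compileCount());
    }
}
```

Each request would then call newTransformer() on the cached Templates object instead of reparsing the stylesheet.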
Cocoon cannot be employed to do general-purpose HTML transformations unless
all the HTML involved is well-formed XHTML (i.e. XML).
Are there books that cover this stuff?
Yes. Even though XML publishing is a brand new area, the incredible acceptance
of these technologies has prompted publishers to provide books that cover the subject.
While many books that cover XML exist, one of them, "Java and XML",
dedicates an entire chapter
to XML publishing frameworks and Cocoon in particular, and that chapter
was made available free of charge.
Our grateful thanks go to both O'Reilly and Brett McLaughlin for this.
Where do I get it?
The official distribution site is here.
Cocoon requires many different packages to work (Xerces, Xalan, FOP, etc.),
and small incompatibilities between them sometimes make
installation harder, so we decided to help you by placing all the required
binary libraries inside the Cocoon distribution. After you have
downloaded the latest Cocoon distribution, you don't need anything else to get
started (unless you want to use optional components such as the LDAP processor).
But, if you want, you can find unofficial RPM packages here
(which may not always be up-to-date).
How can I contribute?
The Cocoon Project is an Open Source volunteer project under the auspices of the
Apache Software Foundation (ASF),
and, in harmony with the Apache webserver itself, it is released under
a very open license.
This means there are
lots of ways to contribute to the project...