Cocoon Core 2.2

Sitemap Evaluation

Introduction

While you can probably understand the basics of a simple sitemap just by looking at it, for some constructs it is essential to have some understanding of how the sitemap is evaluated by Cocoon. Here we will have a deeper look into this.

What you already knew

When a request enters Cocoon, Cocoon uses a sitemap to determine what should be done. The sitemap is an XML file. There is one root sitemap called sitemap.xmap which is located in the root of the Cocoon web application. The root sitemap can mount other sitemaps, allowing modularization of Cocoon-based applications.
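As a minimal sketch of mounting, the root sitemap could hand every request below a URL prefix to a sub-sitemap. The prefix 'shop' and the path shop/sitemap.xmap are hypothetical names chosen for illustration; map:mount with its uri-prefix and src attributes is regular sitemap syntax.

```xml
<map:pipelines>
  <map:pipeline>
    <!-- 'shop' and 'shop/sitemap.xmap' are hypothetical names -->
    <map:match pattern="shop/**">
      <map:mount uri-prefix="shop" src="shop/sitemap.xmap"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
```

With this in place, a request for shop/cart.html should be evaluated by the mounted sitemap, which sees the request path with the prefix removed.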

Sitemap evaluation result

To decide how a request should be handled, Cocoon first looks for the map:pipelines element in the sitemap file. Usually, Cocoon looks for this element in the root sitemap. The exception is internal requests, which Cocoon can evaluate relative to the current sitemap.

Skipping the details of the sitemap evaluation for a moment, the final result of the evaluation of a sitemap must always be one of the following:

  • an XML-based pipeline is executed (one map:generate or map:aggregate, a series of map:transform elements, one map:serialize)
  • a reader is executed (map:read); a reader's purpose is to serve binary content without using an XML pipeline
  • a flow controller is called to:
    • start a new flow (<map:call function="..."/>)
    • continue an existing flow (<map:call continuation="..."/>)
  • a redirect is performed (map:redirect-to), which can be:
    • an HTTP redirect (a redirect response is sent to the browser to point the browser to a new URL)
    • an internal Cocoon redirect (does not involve HTTP)
  • none of the above, in which case Cocoon will give the error message "No pipeline matched request".
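For the less familiar results in this list, a sketch of how they might appear in a sitemap follows. The patterns, the function name startCheckout, the script checkout.js and the target URLs are all hypothetical. The sketch assumes that a flowscript is registered with map:flow and that a redirect to a cocoon: URI is handled as an internal redirect.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <!-- Hypothetical flowscript providing the startCheckout function -->
  <map:flow language="javascript">
    <map:script src="checkout.js"/>
  </map:flow>

  <map:pipelines>
    <map:pipeline>
      <!-- Start a new flow -->
      <map:match pattern="checkout">
        <map:call function="startCheckout"/>
      </map:match>

      <!-- Continue an existing flow; {1} is the text matched by * -->
      <map:match pattern="*.continue">
        <map:call continuation="{1}"/>
      </map:match>

      <!-- HTTP redirect: the browser is told to fetch a new URL -->
      <map:match pattern="old-page.html">
        <map:redirect-to uri="http://somehost/new-page.html"/>
      </map:match>

      <!-- Internal redirect (assumption: cocoon: URIs stay inside Cocoon) -->
      <map:match pattern="index">
        <map:redirect-to uri="cocoon:/page.html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

A request that matches none of these patterns would end in the "No pipeline matched request" error.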

The simplest pipeline

The simplest pipeline specification is the one that does nothing:
<map:pipelines>
  <map:pipeline>
  </map:pipeline>
</map:pipelines>
Suppose you have a sitemap containing this. Now when a request enters Cocoon, Cocoon evaluates the content of the map:pipelines element to decide what to do. However, here we have specified nothing, thus Cocoon will respond with the error "No pipeline matched request".

Matchers and readers

To determine how a request should be handled, one of the more important tools available in the sitemap is the matcher. A matcher typically tests some aspect of the request, most commonly the requested URL path. For example:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="mydoc.pdf">
      <map:read src="blabla/mydoc.pdf"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
Let's see what happens when a user requests the mydoc.pdf file by entering a URL like 'http://somehost/mydoc.pdf' in the location bar of the browser. Remember that a reader was one of the 'final evaluation results' mentioned above. When Cocoon encounters the reader, it knows everything it needs to finish off this request (namely, asking the reader to do its thing), so it does not look at the rest of the sitemap anymore.
Suppose the sitemap had looked like this instead:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="mydoc.pdf">
      <map:read src="blabla/mydoc.pdf"/>
      <map:read src="anotherfile.pdf"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
Notice the second map:read element. Since Cocoon will stop evaluating the sitemap once it encounters the first map:read, it will do nothing with, nor complain about, the presence of the second map:read element.

XML pipelines

Cocoon is all about generating pages using XML processing pipelines (to be technically correct, they are SAX processing pipelines). Every example so far has contained the map:pipelines and map:pipeline elements, yet none of them actually defined an XML pipeline. That is because the map:pipelines element describes how an XML pipeline can be composed; it is not the pipeline itself. When the word pipeline is used in a Cocoon context, it usually refers to the XML pipeline, not to the map:pipelines sitemap element. Let's look at an example of the specification of an XML pipeline in the sitemap:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:generate src="page.xml"/>
      <map:transform src="page2html.xsl"/>
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>

This could be interpreted as follows: when 'page.html' is requested, execute the described generator-transformer-serializer pipeline.

In reality, this is how Cocoon looks at it:

  • At the start of the sitemap evaluation, Cocoon creates an empty pipeline object
  • The first matcher matches, so Cocoon will look at its child elements
  • A map:generate element is encountered: put it aside in the pipeline object
  • Then a map:transform element is encountered: put it aside in the pipeline object
  • Then a map:serialize element is encountered: put it aside in the pipeline object
  • Once a serializer is encountered, Cocoon knows enough to finish off this request (it can execute the XML pipeline), so it doesn't look any further at the rest of the sitemap.
For the simple example above, this reasoning might seem overkill, but it is essential for more complex sitemaps. One important point is that when an element such as a generator or transformer is encountered, it is not executed immediately. A complete pipeline must be found first, up to the serializer, before Cocoon can execute it.
Here is another pipeline, which does exactly the same as the example above:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:generate src="page.xml"/>
      <map:transform src="page2html.xsl"/>
    </map:match>

    <map:match pattern="page.html">
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>
Of course, it serves no purpose (here) to write the pipeline like this.
Here is another example of a pipeline which does exactly the same when page.html is requested, but not when page.pdf is requested:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:generate src="page.xml"/>
      <map:transform src="page2html.xsl"/>
    </map:match>

    <map:match pattern="page.pdf">
      <map:read src="page.pdf"/>
    </map:match>

    <map:match pattern="page.html">
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>
When page.html is requested, the first matcher matches, and the generator and transformer are put aside in the pipeline object. The second matcher does not match, so its content is ignored. The third matcher matches again and a serialize element is encountered, so the collected pipeline is executed and the remainder of the sitemap (if any) is ignored.
Another example:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:generate src="page.xml"/>
      <map:transform src="page2html.xsl"/>
      <map:read src="page.xml"/>
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>

What will Cocoon do in this case? Well, let's follow the usual reasoning:

  • start evaluation at top of the map:pipelines element
  • a matcher is encountered which matches
  • map:generate is encountered: put it aside in the pipeline object
  • map:transform is encountered: put it aside in the pipeline object
  • map:read is encountered. When encountering a reader, Cocoon knows enough to finish off this request (namely by executing this reader), so it doesn't look any further at the rest of the sitemap. The generator and transformer put aside in the pipeline object are ignored.
It is an error to have more than one generator in a pipeline, or to have a pipeline with a serializer but no generator. Thus when Cocoon encounters a second generator after a generator has already been set, it will give the error "Generator already set.". When a map:serialize is encountered before a generator, it will give the error "Must set a generator before setting serializer".
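To make these two error cases concrete, here are two sketches of faulty definitions. The patterns first.html and second.html and the file other.xml are hypothetical; the messages in the comments are the ones quoted above.

```xml
<map:pipelines>
  <map:pipeline>
    <!-- Second generator: expected to fail with "Generator already set." -->
    <map:match pattern="first.html">
      <map:generate src="page.xml"/>
      <map:generate src="other.xml"/>
      <map:serialize/>
    </map:match>

    <!-- Serializer without a generator: expected to fail with
         "Must set a generator before setting serializer" -->
    <map:match pattern="second.html">
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>
```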
Note: Everywhere we talk about map:generate or generators, you can substitute this by map:aggregate, which is just a special kind of generator.
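A sketch of map:aggregate taking the place of the generator could look like this. The pattern, the element name, the part sources and the stylesheet are hypothetical. The cocoon:/ sources are internal requests, so header.xml and news.xml are assumed to be matched elsewhere in the same sitemap.

```xml
<map:match pattern="portal.html">
  <!-- The aggregate fills the generator slot of the pipeline object -->
  <map:aggregate element="portal">
    <map:part src="cocoon:/header.xml"/>
    <map:part src="cocoon:/news.xml"/>
  </map:aggregate>
  <map:transform src="portal2html.xsl"/>
  <map:serialize/>
</map:match>
```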

Actions

A sitemap action, map:act, is simply some Java code that can be called. The action implementation can return either null or a map. When it returns null, the child elements of the map:act are not considered, so a map:act can also serve as an 'if'.
To return to the topic at hand, sitemap evaluation: it is important to note that when a map:act is encountered, it is executed immediately. As a result, the following two pipeline definitions are equivalent:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:act type="something"/>
      <map:generate src="page.xml"/>
      <map:transform src="page2html.xsl"/>
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>
and
<map:pipelines>
  <map:pipeline>
    <map:match pattern="page.html">
      <map:generate src="page.xml"/>
      <map:act type="something"/>
      <map:transform src="page2html.xsl"/>
      <map:serialize/>
    </map:match>
  </map:pipeline>
</map:pipelines>
Remember that when Cocoon encounters map:generate during the evaluation of the sitemap, it does not execute the generator immediately, but puts it aside in a pipeline object.
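The 'if' behaviour of map:act described at the start of this section can be sketched as follows. The action type 'something' is reused from the examples above; login.xml and login2html.xsl are hypothetical names. The sketch assumes the action returns a map on success and null otherwise.

```xml
<map:match pattern="page.html">
  <map:act type="something">
    <!-- Only considered when the action returns a map -->
    <map:generate src="page.xml"/>
    <map:transform src="page2html.xsl"/>
    <map:serialize/>
  </map:act>

  <!-- Reached only when the action returned null -->
  <map:generate src="login.xml"/>
  <map:transform src="login2html.xsl"/>
  <map:serialize/>
</map:match>
```

When the action returns a map, the serializer inside map:act completes the pipeline and evaluation stops there, so the second generator is never reached and no "Generator already set." error occurs.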