LinkStatus Generator

The LinkStatus Generator emits a list of links that are reachable.

The LinkStatusGenerator has several configuration options:

  • include-name: RE pattern for including links
    By default include-name is empty.
  • exclude-name: RE pattern for excluding links.
    By default exclude-name is defined as .*\.gif(\?.*)?$, .*\.png(\?.*)?$, .*\.jpe?g(\?.*)?$, .*\.js(\?.*)?$, .*\.css(\?.*)?$ .
  • link-content-type: expected MIME type of the XML document returned when a link is requested with link-view-query
    By default link-content-type is set to application/x-cocoon-links.
  • link-view-query: A query-string appended to the crawling URL
    By default link-view-query is set to cocoon-view=links.
  • user-agent: HTTP User-Agent header sent when requesting links
    By default user-agent is set to the value of org.apache.cocoon.Constants.COMPLETE_NAME, for example Apache Cocoon 2.1-dev.
  • accept: Not currently used
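
To make the include-name, exclude-name and link-view-query options more concrete, here is a minimal Python sketch of how such filtering could behave. It is not Cocoon's implementation: the function names are hypothetical, Python's re module stands in for the regular expression library Cocoon uses, and two behaviours are assumptions rather than documented facts, namely that an empty include-name accepts every link and that the query string is joined with "?" or "&" depending on whether the URL already has one.

```python
import re

# Default exclude-name patterns, copied from the option list above.
DEFAULT_EXCLUDES = [
    r".*\.gif(\?.*)?$",
    r".*\.png(\?.*)?$",
    r".*\.jpe?g(\?.*)?$",
    r".*\.js(\?.*)?$",
    r".*\.css(\?.*)?$",
]
DEFAULT_LINK_VIEW_QUERY = "cocoon-view=links"


def is_excluded(url, excludes=DEFAULT_EXCLUDES):
    """Return True if the URL matches any exclude pattern."""
    return any(re.match(pattern, url) for pattern in excludes)


def is_included(url, includes=()):
    """Assumption: an empty include list means every URL is included."""
    return not includes or any(re.match(pattern, url) for pattern in includes)


def should_check(url, includes=(), excludes=DEFAULT_EXCLUDES):
    """A link is checked only if it is included and not excluded."""
    return is_included(url, includes) and not is_excluded(url, excludes)


def view_url(url, query=DEFAULT_LINK_VIEW_QUERY):
    """Append the link-view query (assumed joining rule: '?' or '&')."""
    separator = "&" if "?" in url else "?"
    return url + separator + query
```

With the defaults, should_check("index.html") is true while should_check("style.css") and should_check("logo.gif?v=2") are false, because the optional (\?.*)? group lets each pattern match a trailing query string as well.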

A simple example shows how to use the LinkStatusGenerator.

First, add the LinkStatusGenerator to the components in your sitemap.xmap:

  <map:generators default="file">
    <map:generator name="linkStatus"
  <map:serializers default="html">
    <map:serializer name="links"
    <map:view from-position="last" name="links">
      <map:serialize type="links"/>

Next, use the LinkStatusGenerator in your pipeline:

<map:match pattern="/linkStatus">
  <map:generate type="linkStatus" name="my-root"/>