WPS design guide

This guide serves as an introduction to the WPS module. As such, it does not:

  • provide a primer on the WPS protocol, which can be found in the WPS specification (the module implements the WPS 1.0 specification)
  • repeat what can already be found in the class Javadocs
  • explain how to implement an OWS service using the GeoServer OWS framework, which is left to its dedicated guide

In short, it provides an overview of how the module fits together, leaving the details to other information sources.

General architecture

Note

We really need to publish the Javadocs somewhere so that this document can link to them

The module follows the usual structure of a GeoServer OWS framework application:

  • a set of KVP parsers and KVP readers to parse the HTTP GET requests, found in the org.geoserver.wps.kvp package
  • a set of XML parsers to parse the HTTP POST requests, found in the org.geoserver.wps.xml and org.geoserver.wps.xml.v1_0_0 packages
  • a service object interface and implementations responding to the various WPS methods, in particular org.geoserver.wps.DefaultWebProcessingService, which in turn delegates most of the work to the GetCapabilities, DescribeProcess and ExecuteProcess classes
  • a set of output transformers taking the results generated by DefaultWebProcessingService and turning them into the appropriate response (usually, XML). You can find some of those in the org.geoserver.wps.response package, whilst some others are generic ones that have been parametrized and declared in the Spring context (see the applicationContext.xml file).
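
To make the first step of that pipeline concrete, here is a minimal sketch of what a KVP parsing stage does with an HTTP GET query string. KvpSketch and its methods are hypothetical names invented for this illustration and are not the classes found in org.geoserver.wps.kvp; the process identifiers are made up as well.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not a GeoServer class: illustrates the KVP parsing step only.
class KvpSketch {

    // Splits a raw query string into a map with upper-cased keys,
    // since OWS parameter names are case-insensitive.
    static Map<String, String> parse(String query) {
        Map<String, String> kvp = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            kvp.put(
                    URLDecoder.decode(key, StandardCharsets.UTF_8).toUpperCase(Locale.ROOT),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return kvp;
    }

    // DescribeProcess accepts a comma-separated list of process identifiers.
    static List<String> identifiers(Map<String, String> kvp) {
        String raw = kvp.getOrDefault("IDENTIFIER", "");
        if (raw.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(raw.split(","));
    }
}
```

In the real module the parsed values end up in the net.opengis.wps request objects rather than in a plain map; the sketch stops at the map to stay self-contained.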

The module makes extensive use of the following GeoTools modules:

  • net.opengis.wps which contains EMF models of the various elements and types described in the WPS schemas. Those objects are usually what flows between the KVP parsers, XML decoders, the service implementation, and the output transformers
  • gt-xsd-wps and gt-xsd, used for all XML encoding and decoding needs
  • gt-process, which provides the concept of a process that can describe its own inputs and outputs and, of course, execute and produce results

The processes

The module relies on the gt-process SPI-based plugin mechanism to look up and use the processes available on the classpath. Implementing a new process boils down to:

  • creating a ProcessFactory implementation
  • creating one or more Process implementations
  • registering the ProcessFactory in SPI by adding the factory class name in the META-INF/services/org.geotools.process.ProcessFactory file

The WPS module shows an example of the above by bridging the Sextante API to the GeoTools process API; see the org.geoserver.wps.sextante package. This also means that libraries of existing processes can be reused, provided they are wrapped in a GeoTools process API container.
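
The SPI registration step can be illustrated with a small, self-contained sketch of how a provider file is read and its factories instantiated. SketchProcessFactory is a hypothetical, heavily reduced stand-in for org.geotools.process.ProcessFactory (the real interface has many more methods), and ExampleProcessFactory and SpiSketch are invented for this example. The lookup code actually used by GeoTools is not shown here; only the file format (one class name per line, # for comments) and the no-argument constructor requirement are the standard Java SPI conventions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, heavily reduced stand-in for org.geotools.process.ProcessFactory.
interface SketchProcessFactory {
    List<String> processNames();
}

// A provider: SPI requires a no-argument constructor (here the implicit one).
class ExampleProcessFactory implements SketchProcessFactory {
    @Override
    public List<String> processNames() {
        return Collections.singletonList("ex:Buffer");
    }
}

// Hypothetical loader that mimics what an SPI lookup does with a services file.
class SpiSketch {

    // Reads provider class names from the text of a META-INF/services file:
    // one fully qualified class name per line, '#' starts a comment.
    static List<String> providerClassNames(String fileContents) {
        List<String> names = new ArrayList<>();
        for (String line : fileContents.split("\\R")) {
            int hash = line.indexOf('#');
            String name = (hash < 0 ? line : line.substring(0, hash)).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Instantiates each listed provider through its no-argument constructor.
    static List<SketchProcessFactory> load(String fileContents)
            throws ReflectiveOperationException {
        List<SketchProcessFactory> factories = new ArrayList<>();
        for (String name : providerClassNames(fileContents)) {
            Class<? extends SketchProcessFactory> type =
                    Class.forName(name).asSubclass(SketchProcessFactory.class);
            factories.add(type.getDeclaredConstructor().newInstance());
        }
        return factories;
    }
}
```

For a real process library, the equivalent of the string passed to load would be the contents of META-INF/services/org.geotools.process.ProcessFactory, listing the fully qualified name of each factory class.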

An alternative way of implementing a custom WPS process, based on Java Annotations, is described in the Implementing a WPS Process section.

Bridging between objects and I/O formats

The WPS specification is very generic. Any process can take as input pretty much anything, and return anything. It basically means WPS is a complex, XML based RPC protocol.

This means WPS can exchange vector data, raster data, plain strings and numbers, spreadsheets, word processor documents, and whatever else one can imagine. Also, given a single type of data, say a plain geometry, there are many useful ways to represent it: GML2, GML3, WKT, WKB, or a one-row shapefile. Different clients will find some formats easier to use than others, so the WPS should try to offer as many options as possible for both input and output.

The classes stored in the org.geoserver.wps.ppio package serve exactly this purpose: turning a representation format into an in-memory object and vice versa. A new subclass of ProcessParameterIO (PPIO) is needed each time a new format for a known parameter type is desired, or when a process requires a new kind of parameter. The subclass then needs to be registered in the Spring context so that ProcessParameterIO.find(Parameter, ApplicationContext) can find it.

Both the XML reader and the XML encoders use the PPIO dynamically: the WPS document structure is made so that parameters are actually xs:Any.

The code providing the description of the various processes also scans the available ProcessParameterIO implementations so that each parameter can be matched with all formats in which it can be represented.
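
The idea behind this section (one converter per type and format pair, looked up by parameter type, and scanned to list the formats a parameter supports) can be sketched as follows. All class names here (Pt, SketchPPIO, WktPointPPIO, CsvPointPPIO, PpioRegistrySketch) are hypothetical and the MIME types are illustrative; the real ProcessParameterIO hierarchy has a different API and is discovered through the Spring context rather than an explicit registry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory parameter type used only for this sketch.
final class Pt {
    final int x;
    final int y;

    Pt(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in for a PPIO: one instance per (Java type, format) pair.
abstract class SketchPPIO<T> {
    final Class<T> type;
    final String mimeType;

    SketchPPIO(Class<T> type, String mimeType) {
        this.type = type;
        this.mimeType = mimeType;
    }

    abstract T decode(String input);

    abstract String encode(T value);
}

// Reads and writes "POINT (x y)" with integer coordinates.
class WktPointPPIO extends SketchPPIO<Pt> {
    WktPointPPIO() {
        super(Pt.class, "application/wkt");
    }

    @Override
    Pt decode(String input) {
        String[] xy = input.replace("POINT", "").replace("(", "").replace(")", "")
                .trim().split("\\s+");
        return new Pt(Integer.parseInt(xy[0]), Integer.parseInt(xy[1]));
    }

    @Override
    String encode(Pt p) {
        return "POINT (" + p.x + " " + p.y + ")";
    }
}

// Reads and writes "x,y" for the same in-memory type.
class CsvPointPPIO extends SketchPPIO<Pt> {
    CsvPointPPIO() {
        super(Pt.class, "text/csv");
    }

    @Override
    Pt decode(String input) {
        String[] xy = input.trim().split(",");
        return new Pt(Integer.parseInt(xy[0].trim()), Integer.parseInt(xy[1].trim()));
    }

    @Override
    String encode(Pt p) {
        return p.x + "," + p.y;
    }
}

// Hypothetical registry playing the role the Spring context plays in the module.
class PpioRegistrySketch {
    private final List<SketchPPIO<?>> registered = new ArrayList<>();

    void register(SketchPPIO<?> ppio) {
        registered.add(ppio);
    }

    // Lists every format available for a type, as a process description would.
    List<String> formatsFor(Class<?> type) {
        List<String> formats = new ArrayList<>();
        for (SketchPPIO<?> ppio : registered) {
            if (ppio.type.equals(type)) {
                formats.add(ppio.mimeType);
            }
        }
        return formats;
    }

    // Returns the converter for a type and format, or null if none is registered.
    @SuppressWarnings("unchecked")
    <T> SketchPPIO<T> find(Class<T> type, String mimeType) {
        for (SketchPPIO<?> ppio : registered) {
            if (ppio.type.equals(type) && ppio.mimeType.equals(mimeType)) {
                return (SketchPPIO<T>) ppio;
            }
        }
        return null;
    }
}
```

With both converters registered, a point received as WKT can be decoded once and returned as CSV, without the process itself knowing about either format.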

Filtering processes

By default GeoServer will publish every process found in SPI or registered in the Spring context.

The org.geoserver.wps.process.ProcessFilter interface can be implemented to exert some control over how the processes are published. The interface looks as follows:

public interface ProcessFilter {
    ProcessFactory filterFactory(ProcessFactory pf);
}

An implementation of ProcessFilter can return null from the filterFactory call to hide all the processes in that factory from the user, or wrap the factory so that some of its functionality is changed. By wrapping a factory the following could be achieved:

  • Selectively hide some processes
  • Change the process metadata, such as its title and description, and possibly add more translations of the process metadata
  • Hide some of the process inputs and outputs, possibly defaulting them to a constant value
  • Exert control over the process inputs, possibly refusing to run the process under certain circumstances

For the common case of mere process selection a base class is provided, org.geoserver.wps.process.ProcessSelector, whose subclasses only have to check whether a certain process, specified by Name, is allowed to be exposed.
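
The two behaviours described above, hiding a whole factory by returning null and wrapping it to expose only some processes, can be sketched as follows. SketchFactory and SketchFilter are hypothetical, reduced stand-ins for ProcessFactory and ProcessFilter (process names are plain strings instead of Name objects), and the allowProcess hook in SketchSelector is an assumed name; check the ProcessSelector source for the actual method to override.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, reduced stand-in for org.geotools.process.ProcessFactory.
interface SketchFactory {
    Set<String> getNames();
}

// Mirrors the shape of ProcessFilter, using the stand-in factory type.
interface SketchFilter {
    SketchFactory filterFactory(SketchFactory pf);
}

// Selection-only filter in the spirit of ProcessSelector (hook name is an assumption).
abstract class SketchSelector implements SketchFilter {

    abstract boolean allowProcess(String name);

    @Override
    public SketchFactory filterFactory(SketchFactory pf) {
        Set<String> allowed = new LinkedHashSet<>();
        for (String name : pf.getNames()) {
            if (allowProcess(name)) {
                allowed.add(name);
            }
        }
        if (allowed.isEmpty()) {
            // Nothing survives: hide the whole factory.
            return null;
        }
        if (allowed.size() == pf.getNames().size()) {
            // Everything is allowed: no wrapper needed.
            return pf;
        }
        // Wrap the factory so that only the allowed names are visible.
        return () -> allowed;
    }
}

// Hypothetical helper showing how several filters could be chained.
class FilterChainSketch {

    // Applies each filter in turn; a null result drops the factory altogether.
    static SketchFactory apply(SketchFactory pf, List<SketchFilter> filters) {
        SketchFactory current = pf;
        for (SketchFilter filter : filters) {
            if (current == null) {
                break;
            }
            current = filter.filterFactory(current);
        }
        return current;
    }
}
```

A real wrapper would also have to delegate the remaining factory methods (titles, parameter descriptions, process creation) and refuse requests for hidden processes, which is where metadata changes or input checks would be added.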

The GeoServer code base provides (by default) two implementations of a ProcessFilter:

  • org.geoserver.wps.UnsupportedParameterTypeProcessFilter, which hides all the processes having an input or an output that the available ProcessParameterIO classes cannot handle
  • org.geoserver.wps.DisabledProcessesSelector, which hides all the processes that the administrator disabled in the WPS Admin page in the administration console

Once the ProcessFilter is coded it can be activated by declaring it in the Spring application context. For example, the ProcessSelector subclass that controls which processes can be exposed based on the WPS admin panel configuration is registered in applicationContext.xml as follows:

<!-- The default process filters -->
<bean id="unsupportedParameterTypeProcessFilter" class="org.geoserver.wps.UnsupportedParameterTypeProcessFilter"/>
<bean id="configuredProcessesFilter" class="org.geoserver.wps.DisabledProcessesSelector"/>

Implementation level

At the moment the WPS is pretty much bare bones protocol-wise: it implements only the required behaviour, leaving out pretty much everything else. In particular:

  • GetCapabilities and DescribeProcess are supported in both GET and POST form, but Execute is implemented only as a POST request
  • there is no raster data I/O support
  • there is no asynchronous support, no process monitoring, and no output storage ability
  • there is no integration whatsoever with the WMS to visualize the results of an analysis (this will require output storage and per-session catalog extensions)
  • the vector processes do not use any kind of disk buffering, meaning everything is kept in memory (this will not scale to larger amounts of data)
  • there is no set of demo requests, nor a GUI to build a request. This is considered fundamental to reduce the time spent figuring out how to build a proper request, so it will be tackled sooner rather than later.

The transmute package

The org.geoserver.wps.transmute package is an earlier attempt at doing what PPIO does. It also attempts to provide a custom schema for each type of input/output, using subsetted schemas that contain only one type (e.g., GML Point) but still have to reference the full schema definition.

Note

This package is a leftover and should be completely removed and replaced with PPIO usage. At the moment only the DescribeProcess code uses it.