Multimedia Research @ ECS

Adding references to code.

Note: this is the first in a short series of articles about features in the new 1.1 release of OpenIMAJ.

As academics we are quite used to the idea of thoroughly referencing the ideas and work of others when we write a paper. Unfortunately, this is not often carried forward to other forms of writing, such as the writing of the code for computer software.

Within OpenIMAJ, we implement and expand upon much of our own published work, but also the published work of others. For the next release of OpenIMAJ we’ve decided that we want to make it explicit where the idea for an implementation of each algorithm and technique came from. Rather than haphazardly adding references and citations in the Javadoc code, we decided that the process of referencing should be more formal, and that the references should be machine readable.

The annotations feature added in Java 5 allows us to explicitly define structured annotations that can be applied to the source code. In the OpenIMAJ core-citation subproject we’ve added two annotation types: Reference and References. The Reference annotation describes a citation to a publication in a format that is heavily inspired by BibTeX. The References annotation is a holder for multiple Reference annotations. Both Reference and References annotations can be applied to packages, types (classes and interfaces) and methods. An example of the Reference annotation applied to a class can be seen below:

@Reference(
        type = ReferenceType.Inproceedings,
        author = { "Jonathan Hare", "Sina Samangooei", "David Dupplaw" },
        title = "OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia Analysis and Indexing of Images",
        year = "2011",
        booktitle = "ACM Multimedia 2011",
        pages = { "691", "694" },
        url = "http://eprints.soton.ac.uk/273040/",
        note = " Event Dates: 28/11/2011 until 1/12/2011",
        month = "November",
        publisher = "ACM")
public class OpenIMAJ {
    ...
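
For readers who have not declared annotation types before, the following is a minimal sketch of how a BibTeX-style annotation and its holder could be defined and then read back by reflection. The names Cite, Cites, AnnotatedAlgorithm and CiteDemo, and the reduced set of fields, are hypothetical stand-ins chosen for illustration; they are not the actual OpenIMAJ Reference and References definitions.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical, simplified stand-in for a BibTeX-style citation annotation.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.PACKAGE, ElementType.TYPE, ElementType.METHOD })
@interface Cite {
    String[] author();

    String title();

    String year();

    String url() default "";
}

// Hypothetical holder for several Cite annotations on one element.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.PACKAGE, ElementType.TYPE, ElementType.METHOD })
@interface Cites {
    Cite[] value();
}

// An example class carrying a single (made-up) citation.
@Cite(author = { "A. Author", "B. Author" }, title = "An Example Paper", year = "2011")
class AnnotatedAlgorithm {
}

class CiteDemo {
    // Formats the citation attached to a class, or returns "" if there is none.
    static String describe(Class<?> type) {
        final Cite cite = type.getAnnotation(Cite.class);
        if (cite == null)
            return "";
        return String.join(", ", cite.author()) + ". " + cite.title() + ". " + cite.year() + ".";
    }

    public static void main(String[] args) {
        System.out.println(describe(AnnotatedAlgorithm.class));
    }
}
```

The RUNTIME retention is what makes the annotation visible to reflection, and @Documented is what makes it appear in generated Javadoc.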

Manually typing in citations for each publication we’ve based the code on would be rather tedious. Most academic publishers and digital libraries provide the option to export BibTeX formatted citations for specific papers on their websites. In order to relieve most of the pain of writing the Reference annotations, we have also written a tool (BibtexToReference; in the OpenIMAJ core-citation subproject) that automatically converts BibTeX record(s) into Reference/References annotations. This means that it’s possible to run the tool, paste in the BibTeX, and then copy and paste the generated annotations directly into the code.
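
To give a feel for what such a conversion involves, here is a rough sketch that turns one simple BibTeX record into annotation text in the style of the example above. It is not the BibtexToReference implementation: the class BibtexSketch is hypothetical, it only understands flat `key = {value}` or `key = "value"` fields without nested braces, and it simply mirrors the BibTeX keys as attribute names, which is an assumption rather than a guarantee that the real annotation accepts every one of them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a BibTeX-to-annotation converter for simple records.
class BibtexSketch {
    // Matches the record header, e.g. "@inproceedings{".
    private static final Pattern HEADER = Pattern.compile("@(\\w+)\\s*\\{");
    // Matches flat fields: key = {value} or key = "value" (no nested braces).
    private static final Pattern FIELD = Pattern
            .compile("(\\w+)\\s*=\\s*(?:\\{([^{}]*)\\}|\"([^\"]*)\")");

    static String toAnnotation(String bibtex) {
        final Matcher header = HEADER.matcher(bibtex);
        if (!header.find())
            throw new IllegalArgumentException("No BibTeX record found");
        final String type = header.group(1).toLowerCase();

        // Collect the fields in the order they appear after the header.
        final Map<String, String> fields = new LinkedHashMap<>();
        final Matcher field = FIELD.matcher(bibtex).region(header.end(), bibtex.length());
        while (field.find()) {
            final String value = field.group(2) != null ? field.group(2) : field.group(3);
            fields.put(field.group(1).toLowerCase(), value.trim());
        }

        final StringBuilder out = new StringBuilder("@Reference(\n");
        out.append("    type = ReferenceType.").append(capitalise(type));
        for (final Map.Entry<String, String> e : fields.entrySet()) {
            out.append(",\n    ").append(e.getKey()).append(" = ");
            if (e.getKey().equals("author"))
                out.append(array(e.getValue().split("\\s+and\\s+")));
            else if (e.getKey().equals("pages"))
                out.append(array(e.getValue().split("\\s*-+\\s*")));
            else
                out.append(quote(e.getValue()));
        }
        return out.append(")").toString();
    }

    private static String capitalise(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Wraps a value in double quotes, escaping backslashes and quotes.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    private static String array(String[] items) {
        final StringBuilder sb = new StringBuilder("{ ");
        for (int i = 0; i < items.length; i++) {
            if (i > 0)
                sb.append(", ");
            sb.append(quote(items[i]));
        }
        return sb.append(" }").toString();
    }

    public static void main(String[] args) {
        System.out.println(toAnnotation(
                "@inproceedings{key,\n  author = {A. Author and B. Author},\n  year = {2011}\n}"));
    }
}
```

A real converter needs a proper BibTeX parser to cope with nested braces, string macros and multiple records; the regular expressions here are only adequate for the simplest exports.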

Quickly annotating code

Once we’ve annotated the code of a project with Reference and References annotations, it would be quite nice if it were possible to automatically produce a comprehensive bibliography of all the citations in the code. To this end, the core-citation subproject contains an implementation of an annotation processor that can be hooked into the Java compiler in order to generate a complete bibliography as the project is compiled. Currently, the ReferenceProcessor produces BibTeX, HTML and plain-text formatted bibliographies. In OpenIMAJ, we have the processor hooked up to the standard compile phase, so the bibliographies get generated every time the project builds. As OpenIMAJ is structured as many subprojects, we also have Maven concatenate all the bibliographies for each subproject during the site generation phase. The generated website then contains a complete combined bibliography.
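
The general shape of such a processor can be sketched with the standard javax.annotation.processing API. The sketch below is not the ReferenceProcessor: the Paper annotation, the PaperProcessorSketch class and the bibliography.txt file name are all hypothetical, and it writes only a plain list of titles.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.StandardLocation;

// Hypothetical marker annotation used only by this sketch.
@interface Paper {
    String value();
}

// Gathers every @Paper value seen during compilation and writes the sorted
// list to a text file in the final processing round.
// Declared package-private to keep the sketch in one file; a processor that
// javac instantiates by name must be a public class with a public
// no-argument constructor.
class PaperProcessorSketch extends AbstractProcessor {
    private final Set<String> titles = new TreeSet<>();

    @Override
    public Set<String> getSupportedAnnotationTypes() {
        return Collections.singleton(Paper.class.getCanonicalName());
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        // Record the annotation value of every annotated element in this round.
        for (final Element element : roundEnv.getElementsAnnotatedWith(Paper.class)) {
            titles.add(element.getAnnotation(Paper.class).value());
        }

        // Write the file once, after the last round.
        if (roundEnv.processingOver() && !titles.isEmpty()) {
            try (Writer writer = processingEnv.getFiler()
                    .createResource(StandardLocation.CLASS_OUTPUT, "", "bibliography.txt")
                    .openWriter()) {
                for (final String title : titles)
                    writer.write(title + "\n");
            } catch (final IOException e) {
                processingEnv.getMessager().printMessage(Diagnostic.Kind.WARNING,
                        "Could not write bibliography: " + e);
            }
        }
        return false; // do not claim the annotation; other processors may still see it
    }
}
```

Deferring the write until processingOver() means that annotations on sources generated in earlier rounds are also included.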

References in the Javadoc

Having the annotations in the code and complete project-level bibliographies is a nice addition. If you are browsing the code, it is easy to see what papers the code is based on. However, this doesn’t help end-users who are using OpenIMAJ but don’t want to look through all the source code to find out whose ideas they are using. The first place end-users are likely to look for details about the implementation is in the Javadoc. Java annotations will end up in the Javadoc by default if they are themselves annotated with the @Documented annotation, but as you can see below they are not so nice to read:

[Image: unformatted references]

In order to improve matters, in the OpenIMAJ Javadoc we’ve injected a little Javascript magic to properly format the annotations, and provide a way for them to be exported:

[Image: formatted references]

The updated Javadoc for OpenIMAJ with this functionality enabled will be available online with the release of OpenIMAJ 1.1.

Gathering annotations at runtime

Even though it’s easy to look at the citations in the Javadoc, it can still be a bit of a chore digging through the documentation to find references for all the techniques that are used in a particular runnable application. This is where the final nicety of the core-citation project comes into play: the ReferencesClassTransformer.

The ReferencesClassTransformer is a special class that can be loaded by a Java virtual machine to augment or inspect classes from their raw bytecode as they are loaded. The ReferencesClassTransformer specifically looks for Reference and References annotations and augments the methods and classes they belong to with calls to a listener object (ReferenceListener) that records the reference whenever the class or method is used. At the end of a program’s execution, the ReferenceListener will only contain the citations for techniques and algorithms that have actually been used during the execution. Just before the program finishes, the current list of references can be written to a file in a variety of formats.
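
Conceptually, the injected calls behave like the hand-written record calls in the sketch below. Everything in it is hypothetical (the Source annotation, UsageListenerSketch and the two algorithm classes); the real transformer inserts equivalent calls into the bytecode, so the source code itself needs no changes.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical runtime-visible citation marker.
@Retention(RetentionPolicy.RUNTIME)
@interface Source {
    String value();
}

// Hypothetical listener: remembers each distinct citation that was reported.
class UsageListenerSketch {
    private static final Set<String> USED = Collections
            .synchronizedSet(new LinkedHashSet<String>());

    // Looks up the citation on the given class and records it if present.
    static void record(Class<?> type) {
        final Source source = type.getAnnotation(Source.class);
        if (source != null)
            USED.add(source.value());
    }

    // Returns a snapshot of the citations recorded so far.
    static Set<String> used() {
        synchronized (USED) {
            return new LinkedHashSet<>(USED);
        }
    }
}

@Source("Paper describing algorithm A")
class AlgorithmA {
    int run(int x) {
        UsageListenerSketch.record(AlgorithmA.class); // a transformer would inject this call
        return x * 2;
    }
}

@Source("Paper describing algorithm B")
class AlgorithmB {
    int run(int x) {
        UsageListenerSketch.record(AlgorithmB.class); // a transformer would inject this call
        return x + 1;
    }
}

class UsageDemo {
    public static void main(String[] args) {
        new AlgorithmA().run(21);
        new AlgorithmA().run(4);
        // AlgorithmB is never called, so its citation is not recorded.
        System.out.println(UsageListenerSketch.used());
    }
}
```

Because the listener stores citations in a set, an algorithm that is called many times still contributes a single entry to the bibliography.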

There are a number of different ways in which the ReferencesClassTransformer can be loaded – the README.mdown file in the core-aop-support project gives the gory details if you are interested. We’ll look at the most common techniques here:

Using the ReferencesTool

The ReferencesTool is a command line tool that runs another Java program, collects its bibliography, and outputs it to the console or a file. Functionally, the tool behaves just like the java command: its -jar and -cp options work in exactly the same way, setting either the jar file to run or the classpath and main class:

java -jar ReferencesTool.jar [references output options] -jar jarFile [tool arguments and options]
java -jar ReferencesTool.jar [references output options] -cp classpath mainClass [tool arguments and options]
java -jar ReferencesTool.jar [references output options] mainClass [tool arguments and options]

The following list shows the options that can be specified to select the bibliography output format(s) (the [references output options] of the above commands):

--printBibtex (-pb)      : Print BibTeX formatted references to STDOUT.
--printHTML (-ph)        : Print HTML formatted references to STDOUT.
--printText (-pt)        : Print text formatted references to STDOUT.
--writeBibtex (-wb) FILE : Write BibTeX formatted references to a file.
--writeHTML (-wh) FILE   : Write HTML formatted references to a file.
--writeText (-wt) FILE   : Write text formatted references to a file.

As an example, we can run the OpenIMAJ GlobalFeaturesTool through the ReferencesTool to extract a colour-contrast feature with the following command:

java -jar target/ReferencesTool.jar -pt -jar ../GlobalFeaturesTool/target/GlobalFeaturesTool.jar -f COLOUR_CONTRAST -i /Users/jsh2/Desktop/face.jpg

The -pt option tells the tool to print the bibliography in text format to the console after the application has finished executing. The output from the command looks like the following:

org.openimaj.feature.DoubleFV 1
2.0918493493374553 
J. Hare, S. Samangooei and D. Dupplaw. OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia Analysis and Indexing of Images. ACM Multimedia 2011. ACM. pp691-694. November, 2011. http://eprints.soton.ac.uk/273040/
C. Yeh, Y. Ho, B. A. Barsky and M. Ouhyoung. Personalized Photograph Ranking and Selection System. Proceedings of ACM Multimedia. pp211-220. October, 2010. 
P. F. Felzenszwalb and D. P. Huttenlocher. Efficient Graph-Based Image Segmentation. Int. J. Comput. Vision. Kluwer Academic Publishers. pp167-181. September, 2004. http://dx.doi.org/10.1023/B:VISI.0000022288.19776.77

The first two lines are the output of the GlobalFeaturesTool (the contrast feature), and the remaining output is a list of formatted references to the algorithms that were used to extract the feature.

Modifying your code to listen for references using a custom classloader

The second technique for building a bibliography of your code at runtime involves actually modifying your code to use a special ClassLoader that can modify the bytecode of the classes it loads. Specifically, you need to use a ClassLoaderTransform to tell the ReferencesClassTransformer to augment any classes that are loaded, and at the end of the program’s execution you need to do something with the bibliography that has been created. This is not as hard as it sounds! The following example class shows the standard SIFT feature extractor being run on an image:

import java.io.IOException;
 
import org.openimaj.OpenIMAJ;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;
 
public class SIFTDemo {
    public static void main(String[] args) throws IOException {
        final FImage image = ImageUtilities.readF(OpenIMAJ.getLogoAsStream());
 
        final DoGSIFTEngine engine = new DoGSIFTEngine();
        for (final Keypoint keypoint : engine.findFeatures(image)) {
            System.out.println(keypoint);
        }
    }
}

The following code shows the additions needed to make the application load the ReferencesClassTransformer and print the bibliography to System.out at the end of execution:

import org.openimaj.OpenIMAJ;
import org.openimaj.aop.classloader.ClassLoaderTransform;
import org.openimaj.citation.ReferenceListener;
import org.openimaj.citation.ReferencesClassTransformer;
import org.openimaj.citation.annotation.output.StandardFormatters;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;
 
public class SIFTDemoRefs {
    public static void main(String[] args) throws Throwable {
        if (ClassLoaderTransform.run(SIFTDemoRefs.class, args, new ReferencesClassTransformer()))
            return;
 
        final FImage image = ImageUtilities.readF(OpenIMAJ.getLogoAsStream());
 
        final DoGSIFTEngine engine = new DoGSIFTEngine();
        for (final Keypoint keypoint : engine.findFeatures(image)) {
            System.out.println(keypoint);
        }
 
        System.out.println(StandardFormatters.STRING.format(ReferenceListener.getReferences()));
    }
}

How does this work? On the first invocation, the call to ClassLoaderTransform.run in the if statement reloads the class with a custom classloader that is capable of performing the bytecode augmentation specified by the ReferencesClassTransformer object, and calls the main method of the reloaded class. Once that inner call has finished, run returns true, signalling the original main method to exit via the return statement. In the second invocation (the call to main made from within the transformed class), run returns false, and the main method continues beyond the if statement.

Using a dynamically loaded citation-agent instead of the classloader

There is an alternative method to using the custom classloader described above that uses a Java agent to augment the code. The following example shows how the same effect can be achieved by dynamically loading an agent in the main method:

import org.openimaj.OpenIMAJ;
import org.openimaj.citation.CitationAgent;
import org.openimaj.citation.ReferenceListener;
import org.openimaj.citation.annotation.output.StandardFormatters;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;
 
public class SIFTDemoRefsAgent {
    public static void main(String[] args) throws Throwable {
        CitationAgent.initialise();
 
        final FImage image = ImageUtilities.readF(OpenIMAJ.getLogoAsStream());
 
        final DoGSIFTEngine engine = new DoGSIFTEngine();
        for (final Keypoint keypoint : engine.findFeatures(image)) {
            System.out.println(keypoint);
        }
 
        System.out.println(StandardFormatters.STRING.format(ReferenceListener.getReferences()));
    }
}

The agent-based method is slightly simpler to code than the classloader, and doesn’t rely on the class being loaded twice in different classloaders. However, it does have some caveats: it will only work on Oracle or OpenJDK JVMs due to the way the agent is loaded at runtime, and it can only transform classes that have yet to be loaded, so you need to call CitationAgent.initialise(); at the earliest possible point in your program, before any other classes are loaded (i.e. at the beginning of the main method, or in a static initialiser in the main class).
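
The restriction to classes that have yet to be loaded follows from how Java agents work in general: a transformer registered through java.lang.instrument is consulted as each class is loaded. The sketch below shows a minimal agent that only logs class names and leaves the bytecode untouched. LoggingAgentSketch and LoadLogger are hypothetical names, and this is not how CitationAgent itself is written.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent sketch. A deployable agent class would normally be
// public and named in the agent jar's manifest (Premain-Class / Agent-Class).
class LoggingAgentSketch {

    // Records the name of every class offered to it, without changing it.
    static final class LoadLogger implements ClassFileTransformer {
        final List<String> seen = new CopyOnWriteArrayList<>();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            if (className != null) // the name may be null for some generated classes
                seen.add(className.replace('/', '.'));
            return null; // null tells the JVM to keep the original bytecode
        }
    }

    // Invoked by the JVM when the agent is given with -javaagent at startup.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadLogger());
    }

    // Invoked by the JVM when the agent is attached to a running process.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadLogger());
    }
}
```

A transformer added this way is not shown classes that were loaded before it was registered unless they are explicitly retransformed, which is why the agent should be initialised as early as possible.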

Using core-citation outside OpenIMAJ

It’s easy to use the core-citation code in your own project, without needing to link against large parts of OpenIMAJ. To use the annotations, you just need the core-citation dependency. With a Maven-based project this is a matter of adding it to your pom.xml file:

<dependencies>
    ...
    <dependency>
        <groupId>org.openimaj</groupId>
        <artifactId>core-citation</artifactId>
        <version>1.0.6-SNAPSHOT</version>
    </dependency>
    ...
</dependencies>

and additionally adding the OpenIMAJ Maven repositories:

<repositories>
    ...
    <repository>
        <id>openimaj-maven</id>
        <url>http://maven.openimaj.org/</url>
    </repository>
    <repository>
        <id>openimaj-snapshots</id>
        <url>http://snapshots.openimaj.org/</url>
    </repository>
</repositories>

Note that the version number refers to 1.0.6-SNAPSHOT. Once version 1.1 is released, this can be updated.

With Maven, instructing the compiler to run the ReferenceProcessor during the compile process is just a matter of adding the following lines to the project’s pom.xml file:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>2.3.2</version>
  <configuration>
    <source>1.6</source>
    <target>1.6</target>
    <annotationProcessors>
      <annotationProcessor>org.openimaj.citation.annotation.processor.ReferenceProcessor</annotationProcessor>
    </annotationProcessors>
  </configuration>
</plugin>

The core-citation project also needs to be listed as a dependency for this to work, but you’d need that to add the Reference annotations in the first place.

If your project has Reference annotations within it, it can be run through the ReferencesTool. The core-citation dependency (and its sub-dependencies) includes everything needed to perform the runtime augmentation described above if you want to take the dynamic approach.
