Last updated June 22, 2010 22:31, by Ken Cavanaugh
Feedicon  

A Java Method Tracing Facility

One critical facility needed in a production system is the ability to gather trace information from a running system in a variety of ways. Lots of different mechanisms can be used for this:

  • System.out.println
  • java.util.logging
  • Various elaborations of this (e.g. add enter/exit code to every important method)
  • Use AOP to weave tracing code into the product

The ORB has used an increasingly elaborate collection of mechanisms for this over the years. Recently I've cleaned up, simplified, and consolidated this into a new tool and library: the Tracing Facility (TF).

The basic idea is to make it easy to inject a form of "Around Advice" (in AOP terms) into selected methods. Obviously an AOP system could do this, but I don't want a full AOP system in my projects. I also want to use annotations to indicate which methods should be traceable, and thus subjected to the overhead of the additional code for tracing method entry and exit. Furthermore, I want to use meta-annotations to define the annotations which define traceable methods. This last point seems to be a bit beyond the capabilities of most AOP systems.

It's also necessary to reduce the overhead of tracing as much as possible. To that end, I have written a tool using ASM that enhances the bytecode in an existing class in two stages:

  1. First the static initializer is enhanced, new fields are added for tracing information, and certain private methods (and calls to these methods) are rewritten. This stage changes the class "schema", and so must be done at build time.
  2. Second the actual tracing code is injected into the traceable methods. This does not change the class schema, so this step could be done either at build time or at run time (or class load time), as desired. Note that the current tool and library ONLY support stage 2 at build time, but this should be easy to enhance later.

Step 1 has minimal impact on the execution time of an application, since the only runtime change is some extra code in the class's static initializer, which simply registers some information about the class with a registry. Step 2 adds code which may have a noticable (but still small) impact on the execution speed of code, even when no tracing is actively enabled.

Note that all packages are currently in the ORB workspace, but this code is independent of the ORB, and could easily be repackaged elsewhere.

Client view: changes to classes to enable tracing

A number of annotations are defined for the TF in the com.sun.corba.se.spi.orbutil.tf.annotation package (this is in the ORB library):

  • MethodMonitorGroup
    • This meta-annotation is applied to annotations, which themselves are applied to classes and methods. We'll call such annotations "TF annotations". A method can have at most one TF annotation. The presence of a TF annotation indicates that the method should be enhanced to include monitoring of method entry and exit.
    • This meta-annotation also includes a value of type Class<? extends Annotation>[], which lists all meta-annotations that form subgroups of this meta-annotation. This allows the definition of meta-annotations that can be used to collectively control the action of a number of related meta-annotations.
  • InfoMethod
    • This simple marker annotation is used on private non-static methods that are used to trace actions within a method (that is, other than entry and exit). Any such method must have an empty body (the enhancer will replace the body of all such methods).
  • TFEnhanced.
    • A user of TF does not need to use this annotation. It is used to mark enhanced code so that the enhancer cannot enhance the same class multiple times.

Here's a simple example of how this is used:

@MethodMonitorGroup( { IORead.Class, IOWrite.Class } )
public @interface IO { }

@MethodMonitorGroup
public @interface IORead { }

@MethodMonitorGroup
public @interface IOWrite { }

@IORead
@IOWrite
public class IOStream {
    @IORead
    int readInt() (
        ...
    }

    @IORead
    long readLong() {
        ...
    }

    @IOWrite
    void writeInt( int arg ) {
        ...
    }
    @InfoMethod
    private void interestingWriteLong( String msg, long value ) {
         // must be empty!
    }

    @IOWrite
    void writeLong( long arg ) {
        ...

        // something interesting happened in writeLong
        interestingWriteLong( "An odd value", arg ) ;

        ...
    }
}

Some additional considerations are needed here to handle overloaded methods. The basic problem is that some anticipated uses of the timing facility require unique names for the methods used in the tracing facility. This leads to another annotation (TracingName), which optionally (mandatory for overloaded methods) defines the name to use instead of the method name in the tracing facility. This annotation can be applied to any InfoMethod or Monitored method. Note that overloading is all or nothing: if one method name is overloaded, all uses of that method name must be annotated the same way: either they are all InfoMethods, all Monitored methods, or none are either.

For timing points, each monitored method corresponds to a TImingPoint, and entering and exiting the method correspond to entering and exiting the corresponding timing point. InfoMethods have no inherent enter/exit semantics, so an enumerated type (TimingPointType) is optionally used in the InfoMethod annotation to indicate whether the InfoMethod corresponds to an ENTER TimingPoint, EXIT TImingPoint, or NONE (no TimingPoint).

The TF library provides methods that can attach monitoring behavior to anything tagged with a MethodMonitorGroup annotation. For example, attaching monitoring to IORead will cause the monitoring interface to be invoked on calls to readInt and readLong. Since the IO annotation includes both IORead and IOWrite, attaching monitoring to IO will cause the monitoring interface to be invoked on calls to readInt, readLong, writeInt, and writeLong. Note that it is also possible to trace actions inside a method by using an @InfoMethod, like interestingWriteLong above. Such methods must be private non-static methods with a void return type and a null body.

The TF library

The main interface in the library is MethodMonitor, defined as follows:

public interface MethodMonitor {
    /** The class for which this MethodMonitor is defined.
     *
     * @return The class of this MethodMonitor.
     */
    Class<?> myClass() ;

    /** Invoked at the start of a method, before any actions in the method
     * take place.
     *
     * @param ident The method identifier.
     * @param args The arguments passed into the method.
     */
    void enter( int ident, Object... args ) ;

    /** Invoked anywhere in the method after enter and before exit, to indicate
     * some useful tracing information in the method.
     *
     * @param ident The method identifier.
     * @param args Any information needed in the info call.
     * @param tpType The type of TimingPoint corresponding to this 
     * InfoMethod (ENTER, EXIT, or NONE)
     */
    void info( Object[] args, int callerIdent, int selfIdent, 
        TimingPointType tpType ) ;

    /** An exit from a method that has a void return type.  Called as the last
     * operation in the method.
     *
     * @param ident The method identifier.
     */
    void exit( int ident ) ;

    /** An exit from a method that has a non-void return type.  Called as the last
     * operation in the method.  result will be null if the method terminates
     * by throwing an exception.
     *
     * @param ident The method identifier.
     * @param result The method result.
     */
    void exit( int ident, Object result ) ;

    /** Called to report an exception that is thrown in the method.  If the
     * method throws and catches the exception, it will still be reported.
     *
     * @param ident The method identifier.
     * @param thr The exception that terminates the method.
     */
    void exception( int ident, Throwable thr ) ;

    /** Provided for MethodMonitor instances that maintain state.  Simply removes
     * the state and resets the MethodMonitor to its initial state.
     *
     */
    void clear() ;
}


public interface MethodMonitorFactory {
    /** Return an instance of a MethodMonitor suitable for use in the given
     * class cls, according to the currently registered MethodMonitorFactory
     * instances in the MethodMonitorRegistry.
     *
     * @param cls The class for which we need the MethodMonitor.
     * @return The MethodMonitor for cls.
     */
    MethodMonitor create( Class<?> cls ) ;
}
A private static field of type SynchronizedHolder%lt;MethodMonitor> is added to the class by the enhancer for each TF annotation carried by the class. The SynchronizedHolder is a simple class that allows getting and setting the value in the field: this makes it easy to externally modify the field. Obviously this is a security problem, so the API that allows actually setting the MethodMonitor bound to a class field needs to carefully manage permissions.

The basic use of this should be clear: in the generated code,

  • enter is called (with the method parameters) each time the method is entered.
  • exit is called (form depends on whether the return type is void) each time the method is exited.
  • If a method explicitly throws an exception, that exception is reported to the exception method before it is thrown.
  • info is called from inside an info method.

Note that ints are used for identifiers. This is done for efficiency, so that minimal space is needed for recording and analyzing the data gathered in a MethodMonitor. A per-class mapping between identifiers and method names (as Strings) is maintained for each class. No attempt is made to distinguish different overloaded forms of the method in the identifiers: this can usually be determined easily enough from the arguments passed into the enter method.

Note also that a MethodMonitorFactory interface is provided. This is used simply because many uses of the MethodMonitor interface also want the class, so the factory provides a convenient place to record the class in a MethodMonitor that is stored in a class field.

There is also a class, MethodMonitorFactoryDefaults, that provides several pre-defined methods for creating MethodMonitorFactory instances:

  • operationTracer(). This provides a simple buffer that accumulates only the current state of the nested enter calls. This can be used to provide additional context in error messages.
  • noOp(). Does nothing at all.
  • dprint(). Similar to the ORB's existing dprint mechanisms, this MethodMonitorFactory prints everything to the system output.
  • compose( final Coolection<MethodMonitorFactory> factories ). This provides a way to compose a number of MethodMonitorFactories into a single MethodMonitorFactory.

Finally, there is the MethodMonitorRegistry class:

    /** Register a class with the tracing facility.  The class must be an
     * instrumented class that is annotated with an annotation with a
     * meta-annotation of @MethodMonitorGroup.  Note that this method should
     * only be called from the enhanced class, not directly by the user.
     *
     * @param cls  Class to register, which must have 1 or more MM annotations.
     * @param methodNames The list of method names used in the enhanced code.
     * The index of the name is the value used in the method.
     * @param annoMM The MM holders for each MM annotation on the class.
     */
    public static void registerClass( final Class<?> cls,
        final List<String> methodNames,
        final Map<Class<? extends Annotation>,
            SynchronizedHolder<MethodMonitor>> annoMM ) { ... }

    /** Provided so that implementation of the MethodMonitor interface can
     * obtain the method name for use in log reports or for other purposes.
     *
     * @param cls The enhanced class
     * @param identifier An Integer representing the method name.
     * @return The name of the method corresponding to the identifier.
     */
    public static String getMethodName( Class<?> cls, int identifier ) { ... }

    public static int getMethodIdentifier( Class<?> cls, String mname ) { ... }

    /** Register a particular MethodMonitorFactory against an annotation.
     * Annot must be annotated with the MethodMonitorGroup meta-annotation.
     * Only a single mmf may be registered against an annotation.
     * A subsequent register call overwrites the registered annotation.
     * Annot must not be null.
     *
     * @param annot
     * @param mmf
     */
    public static void register( Class<? extends Annotation> annot,
        MethodMonitorFactory mmf ) { ... }

    /** Remove the MethodMonitorFactory (if any) that is associated with annot.
     *
     * @param annot
     */
    public static void clear( Class<? extends Annotation> annot ) { ... }

   /** Return the MethodMonitorFactory registered against the annotation, or
     * null if nothing is registered.
     *
     * @param annot
     * @return
     */
    public static MethodMonitorFactory registeredFactory(
        Class<? extends Annotation> annot ) { ... }

    /** Return the current MethodMonitor in use for the given cls and annot.
     * Returns null if no MethodMonitor is in use. Throws an exception if
     * either cls is not a traced class, or annot is not a tracing annotation
     * on cls.
     *
     * @param cls The Traced class.
     * @param annot A trace annotation on cls.
     * @return The MethodMonitor, if any.
     */
    public static MethodMonitor getMethodMonitorForClass( final Class<?> cls,
        final Class<? extends Annotation> annot ) { ... }
}
This singleton class controls the entire TF library. The various methods are used as follows:

  • registerClass. A call to this method is generated in the traced class's static initializer (one is generated if the class does not have one, otherwise the TF code is appended to the class's static initializer). This informs that TF library of the existence of classes that can be traced, and gives the TF code the method names (the index in the list is the method name's identifier) and mapping from TF annotation to SynchronizedHolder for each TF annotation applied to the class. This allows the MethodMonitorRegistry to provide string-to-identifier mapping APIs, and also provides the information needed for the MethodMonitorRegistry to update the class fields for the SynchronizedHolders as MethodMonitorFactory instances are registered against TF annotations. Note that the method names are the tracing method names as defined (as needed) by @TracingName annotations as well as the method names. All names are sorted to give a standard order.
    • Security Impact: none. Calling this on a traceable class with at worst make tracing malfunction.
  • getMethodName, getMethodIdentifier. These methods provide the String-to-Identifier mapping.
    • Security Impact: minor. Can expose names of private methods.
  • register. This is the main runtime API. It allows any MethodMonitorFactory to be associated with any TF annotation.
    • Security implact: significant. There is the potential to expose any secure information associated with any private method that is traceable, since an attacker could create a special MethodMonitorFactory that is used to attack specific code paths. This method should have its access protected by a Permission (details TBD).
  • clear. Simply removes an associated MethodMonitorFactory from a TF annotation.
    • Security impact: none. Calling this simply removes access to a TF anotation's methods.
  • registeredFactory, getMethodMonitorForClass: these simply provide access to the current state of the MethodMonitorRegistry, primarily for testing.
    • Security impact: none.

The Enhancer Tool

Finally, we need to take a brief look at the bytecode enhancer tool. This tool uses ASM to modify class files, and the ORB library's file package to scan directories for class files. The built-in help text in EnhanceTool describes its arguments as follows:

Error in argument parser: No argument supplied for keyword help
    Legal arguments are:
        -debug <A valid boolean>
            (default false)
            Debug flag

        -dir <A string that can create a java.io.File>
            (default .)
            Directory to scan for class file

        -dryrun <A valid boolean>
            (default false)
            Indicates a run that only prints out actions, but does not perform them

        -newout <A valid boolean>
            (default false)
            If true, write output to a .class.new file

        -rf <A string that can create a java.io.File>
            (default tfannotations.properties)
            Name of resource file containing information about tf annotations

        -tracegen <A valid boolean>
            (default true)
            If true, do second phase (add tracing code) processing

        -verbose <A valid int>
            (default 0)
            Verbose flag
Note that -dryrun is not currently implemented. The maximum level for verbose that will add any additional output is currently 4.

Simple invoking java com.sun.tools.corba.se.enhancer.EnhanceTool by default will:

  • scan all files and directories in the current directory for .class files
  • run with no debug or verbose output
  • generate a tfannotations.properties file in the current directory
  • fully enhance each .class file found

The tfannotation.properties file is used to list all of the TF annotations found when scanning the directory. This is useful for applications that want to generically apply standard actions to whatever TF annotations exist in the project. For example, the ORB debug flags can be converted to this approach simply by requiring that the arguments passed to the ORBDebug property be the simple class names of a TF annotation. So for example we may create annotations like @transport, @subcontract (do we care about case? Should I do something about this?) to correspond to annotations such as com.sun.corba.se.spi.tfannotation.Transport (or perhaps transport? or we just modify the case?).

tfannotations.properties contains two sorts of properties:

  • com.sun.corba.tf.annotations.size, which gives the number of the second kind of property name in the file
  • com.sun.corba.tf.annotation.{N}, which gives the name of a TF annotation class. N ranges from 1 to the size.

EnhanceTool operates roughly as follows:

  • Arguments are parsed.
  • The directory scanner is initialized. The scanner will only look at class files.
  • The first directory scan is set up. This scanner simply looks for @MethodMonitorGroup-annotated annotation classes.
  • A Set<String> of all TF annotations is extracted from the first scanner.
  • The second directory scan is set up. This operates as follows on each class file:
    • First the class is read into a ClassNode using the ASM ClassReader.
      • An instance of EnhancedClassData (ECD) is constructed from the ClassNode. This contains some useful information about the class:
        • Whether or not this class should be enhanced (if not, no further processing is performed, and the class file is not changed).
        • The SynchronizedHolder for each TF annotation on the class.
        • The list of method names (in order for the identifiers)
    • The ECD and the byte[] representing the class is the only information passed into the second and third phases.
      • Note that the ECD can be easily constructed either from ASM, or from reflection on a Class instance
        • currently only the ASM case is written, in EnhancedClassDataASMImpl
    • The second phase (see ClassEnhancer) is simply an ASM Visitor that does the following:
      • It adds a private static final field of type SynchronizedHolder<MethodMonitor> for each TF annotation carried by the class.
      • It either creates or enhances the static initializer as follows:
        • a class constant is constructed
        • the list of method names is constructed
        • The map from annotation name to SynchronizedHolder<MethodMonitor> is constructed.
        • The above information is passed to the MethodMonitorRegistry.registerClass method.
      • Each info method is re-written to add 2 new arguments to the end of its existing arguments:
        • The MethodMonitor, which was used in the monitored method that called this info method
        • The callerId, the identifier of the monitored method that called this info method
      • The null body of the info method is replaced with code that is essentially as in the <INFO METHOD> box below.
        • Note that selfId is the index of the name of the info method in the method names
        • Note that this body could actually be generated in the second phase, saving an if null check when no tracing is used.
      • Each call to an info method from a traced method is re-written to pass null and 0 as the last two arguments of the call.
      • Each normal method is scanned to make sure that it does NOT call an info method (which is an error)
      • At the end, a byte[] representing the transformed class is generated and (optionally) passed to the third phase
  • The third phase (which could be done at runtime if desired) operates on the class as follows:
    • Each monitored method is augmented by bytecode that basically does what is described in the <TRACED METHOD> box below
    • Note that some additional transformations are needed (or implied by the above) on the method's body:
      • Each RETURN and ATHROW bytecode must have the finally block code included before it exits the method (standard finally block implementation)
      • Each call to an info method is rewritten, and the NULL-0-call sequence is replaced with MethodMonitor-identifier-call
        • a simple state machine is used to detect these sequences

<INFO METHOD>

if (mm != null) {
    mm.info( <args>, callerId, selfId )
}
<TRACED METHOD>

    <return type> __result = <init value of return type> // only present if return type not void
    MethodMonitor __mm = <this method's SynchronizedHolder class field>.content() ;
    if (__mm != null) {
        __mm.enter( <this method's identifier>, <wrapped args of method> ) ;
    }
    try {
        <body of method>
    } finally {
        if (__mm != null) {
            __mm.exit( <this method's identifier>, <optional return value> ) ;
        }
    }

Integrating Tracing and Timing

The ORB has a flexible timing system that can be used to time events, such as executing a method, or an interval between two significant events, such as the start and end of marshaling. This is organized around timing points. Each timing point has an ENTER and an EXIT, and the duration for an event (which corresponds to a particular timing point) is simply the different in the EXIT and ENTER times.

This has an obvious analog to the tracing facility, since each monitored method has an ENTER and an EXIT associated with it. InfoMethods require a little more work: for each InfoMethod, we need to know whether the InfoMethod is an ENTER, EXIT, or simply not of interest in the timing system.

The construction and initialization of the timing points is handled in part by code generated from an XML file that describes all of the timing points (com.sun.tools.corba.timing.timing.xml) and generates serveral classes in the package com.sun.corba.se.impl.orbutil.newtimer.generated:

  • TimingPoints
    • This is the interface for accessing the defined Timers and TimerGroups, and for the enter/exit methods for each Timer
  • TimingPointsBase
    • This abstract class initallizes the Timers and TimerGroups
  • TimingPointsDisabled
    • This provides no-op implementations of the enter and exit methods
  • TImingPointsEnabled
    • This provides real implementations of the enter and exit methods

The disabled/enabled classes are no longer needed, nor are the enter/exit methods themselves, as all of this is easily handled through the tracing facility. In fact, the generated code that is needed is just a single class that defines and initializes the Timers and TimerGtroups.

One small problem still remains: it is hard to fit in the information about TimerGroup containment. The ideal solution is to generate this in TimingPoints. Another possibility is to fill it in dynmically.

The information we need for generating TimingPoints could be expressed as follows:


interface TPData {
    /** Map class name to a Map from method tracing name to description.  If the description is "",
     * a default description identifying the class and method name should be used.
     */
    Map<String,Map<String,String>> classMethodDescriptions() ;

    Map<String,Map<String,String>> classMethodAnnotation

    /** Map TimerGroup name to its description.  If description is "", a default description using 
     * the timer group name should be used.
     */
    Map<String,String> timerGroupDescriptions() ;

    /** For each TimerGroup name, give the names of the 
    Map<String,List<String>> subgroups() ;
}

Computing this is fairly simple:

  • timingPointsForClass is taken from EnhancedClassData.getMethodNames for each EnhancedClassData
    • Still need descriptions if we do that
  • timingGroups and subgroups can be gotten from the AnnotationScannerAction if it is modified to visit annotations
  • Mysql
  • Glassfish
  • Jruby
  • Rails
  • Nblogo
Terms of Use; Privacy Policy;
© 2013, Oracle Corporation and/or its affiliates
(revision 20140418.2d69abc)
 
 
Close
loading
Please Confirm
Close