[Article] Monitoring Applications with the Analytics Instrumentation Java Agent
- Udani Weeraratne
- Software Engineering Intern - WSO2
Table of Contents
- Introduction
- Applies to
- Architecture of the agent
- Java agent concepts
- Configuration file layout
- How to run the agent
- Building and using the agent
- Use case
- Summary
Introduction
When using complex products, we often have questions about how they work internally. Some common questions include:
- When did the first method execute?
- How long did it take this method to execute? Does it take more time than the second method?
- What were the values passed to the third method during runtime?
- How many times did the fourth method execute during run time?
Answering these questions for a third-party product is harder than for our own product, because we lack control over its source code.
What if we were able to modify these applications by injecting our own code? Then the answers would be available within a matter of seconds. But modifying the source code is not as easy as it sounds, because we only have access to the .class files in the jars, which contain bytecode. Decompiling and editing these files causes many problems.
The analytics-instrumentation-agent of WSO2 Data Analytics Server (WSO2 DAS) provides a quick solution for this. It gives you the ability to inject a set of lines at a specific line number of a method in a class. The results obtained will be published to a stream in WSO2 DAS, which can later be analyzed by using its tools.
Applies to
Client | Any Java application |
---|---|
Server | WSO2 DAS version 3.0.1 and above |
Architecture of the agent
The analytics-instrumentation-agent is built upon the Java agent feature of Java. More details on the Java agent feature will be explained later. Its architecture is presented in the following diagram:
Figure 1
As shown in the diagram, the end to end scenario requires three main parts:
- Configuration file
- analytics-instrumentation-agent jar running with a Java based product
- WSO2 DAS to receive and store published results
First, we need to run the analytics-instrumentation-agent with the subject product before the product starts its main tasks. This is done through the java-agent.sh file. It sets the JVM parameters (-javaagent, classpath) that are required to run the agent before the main class of the subject product. This step is required because the modifications must be made before the class files of the product are executed by the JVM.
As mentioned earlier, the agent can run with any Java based product (WSO2 Carbon product or any other third party product).
Once the agent is up and running with the product, it fetches the set of classes and methods that need to be instrumented from the configuration file (inst-agent-config.xml). It iterates through all the classes loaded onto the JVM to find a match with the classes given in the configuration file. If a match is found, it obtains the method body by matching the method name and method signature. It then injects code into the method at three different locations:
- At the beginning - this refers to the part after the method signature and before the first line of the method body. The injected code will define two local variables to store the start time of the method and a correlation ID (to keep track of all the events sent with a single method execution).
- At a specific line - this is included only if the user has specified a set of parameters that they wish to monitor. It injects the code at a specific line of the method body. E.g. lineNo=3 adds content to the third line of the method body. The current content of the third line is then shifted down.
- At the end - this refers to the part after all the lines of the method body and before the final curly brace (‘}’) of the method. It generates a map here including parameter values. The total method execution time is calculated here as well.
If the user has specified any parameters they want to monitor, the agent assigns those values to a map (referred to as an arbitrary map) at all three locations mentioned above. The arbitrary map allows you to publish values that are not included in the stream definition. It may contain any number of key-value pairs.
Values intercepted at each method and location are published to a stream in WSO2 DAS as wso2events. These events consist of three sections (meta_data, correlation_data, payload), of which we use only correlation_data and payload. Values passed with arbitrary maps are also considered part of the payload. The layout of the events at each location is shown below:
Injecting Location | Metadata | Correlation Data | Payload Data | Arbitrary Map |
---|---|---|---|---|
Beginning | N/A | method-id | [scenario-name,class-name,method-name,’start’] | Map of <key,value> |
Insert at | N/A | method-id | [scenario-name,class-name,method-name,’line-no’] | Map of <key,value> |
End | N/A | method-id | [scenario-name,class-name, method-name,’end’,duration] | Map of <key,value> |
In the above map, the ‘key’ is the name of the column in the analytics table, and the value may be a parameter value or a String provided by the user. Events published at each location are received by an event receiver and stored in the WSO2 DAS analytics table through a persisted event stream.
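To make the three injection points and the event layout more concrete, the following is a minimal hand-written sketch of what a method might look like after instrumentation. It is not the code generated by the agent: the class com.example.QueryRunner, the method runQuery, the in-memory PUBLISHED list, the counter-based correlation ID and the use of the line number as the location value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hand-written approximation of a method after instrumentation (illustrative only). */
class InstrumentedMethodSketch {

    /** One event: correlation ID, ordered payload values, arbitrary map. */
    static final class Event {
        final long methodId;
        final List<String> payload;
        final Map<String, String> arbitraryMap;

        Event(long methodId, List<String> payload, Map<String, String> arbitraryMap) {
            this.methodId = methodId;
            this.payload = payload;
            this.arbitraryMap = arbitraryMap;
        }
    }

    /** Stand-in for the publisher; the real agent sends events to WSO2 DAS instead. */
    static final List<Event> PUBLISHED = new ArrayList<>();

    /** Assumption: a simple counter is used as the correlation ID. */
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private static void publish(long methodId, String location, Long durationNanos,
                                String key, String value) {
        List<String> payload = new ArrayList<>(
                Arrays.asList("jdbc-monitoring", "com.example.QueryRunner", "runQuery", location));
        if (durationNanos != null) {
            payload.add(String.valueOf(durationNanos)); // only the 'end' event carries a duration
        }
        Map<String, String> arbitraryMap = new LinkedHashMap<>();
        arbitraryMap.put(key, value); // key = column name in the analytics table
        PUBLISHED.add(new Event(methodId, payload, arbitraryMap));
    }

    /** Hypothetical target method with the injected code written out by hand. */
    static String runQuery(String sql) {
        // Injected at the beginning: start time and correlation ID.
        long startTime = System.nanoTime();
        long methodId = NEXT_ID.incrementAndGet();
        publish(methodId, "start", null, "query", sql);

        String trimmed = sql.trim();                 // original line 1

        // Injected at lineNo=2: the original second line shifts down.
        publish(methodId, "2", null, "note", "after trim");

        String result = "rows for: " + trimmed;      // original line 2

        // Injected at the end: total execution time in nanoseconds.
        long duration = System.nanoTime() - startTime;
        publish(methodId, "end", duration, "query", sql);
        return result;
    }

    public static void main(String[] args) {
        runQuery("  SELECT 1  ");
        for (Event e : PUBLISHED) {
            System.out.println(e.methodId + " " + e.payload + " " + e.arbitraryMap);
        }
    }
}
```

All three events of one call share the same correlation ID, which is what allows them to be grouped later in the analytics table.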
Note: For the moment we can intercept only String parameters.
instrumentation.agent.receiver.xml
The layout of the event receiver of the agent should have arbitrary maps enabled as shown below in order to receive and store maps.

    <?xml version="1.0" encoding="UTF-8"?>
    <eventReceiver name="instrumentation.agent.receiver" statistics="disable" trace="disable" xmlns="https://wso2.org/carbon/eventreceiver">
        <from eventAdapterType="wso2event">
            <property name="events.duplicated.in.cluster">false</property>
        </from>
        <mapping arbitraryMaps="enable" customMapping="disable" type="wso2event"/>
        <to streamName="test.stream.1" version="1.0.0"/>
    </eventReceiver>

Note: Without an event receiver, values published from the subject product are not received by the event stream, which results in an empty table.
instrumentation.agent.stream.xml
The layout of the event stream should include one correlation attribute and five payload attributes as shown below. The order of values in the payload will match the definition below.

    {
      "name": "instrumentation.agent.stream",
      "version": "1.0.0",
      "nickName": "",
      "description": "",
      "correlationData": [
        { "name": "method_Id", "type": "LONG" }
      ],
      "payloadData": [
        { "name": "scenario_name", "type": "STRING" },
        { "name": "class_name", "type": "STRING" },
        { "name": "method_name", "type": "STRING" },
        { "name": "inst_location", "type": "STRING" },
        { "name": "duration", "type": "STRING" }
      ]
    }

Note:
- inst_location indicates the instrumented location (start / line_no / end) of the method
- duration contains the method execution time in nanoseconds
Deploying a stream and receiver that match the above layouts and configuring them through the agent configuration file enables users to fetch and store data from different products at different analytics tables.
Once all the intercepted data is stored properly, users may carry out relevant analyses using WSO2 DAS tools. This may be useful when making better decisions in the future and collecting data that can later be used with WSO2 Machine Learner (WSO2 ML) as well.
Java agent concepts
The analytics-instrumentation-agent of WSO2 DAS is built upon the Java agent feature of Java [1]. It is a powerful feature that is designed to facilitate bytecode instrumentation.
The Java agent feature allows you to implement a Java-based agent to instrument programs running on the JVM. Instrumentation involves modifying bytecode, that is, the compiled form of Java source files stored in .class files, which is difficult to read and modify by hand.
Once the agent is packaged as a JAR file whose manifest specifies the agent class, the JVM runs the agent before the subject product starts executing its main class.
- Passing the path of the agent jar as a parameter to the JVM (through -javaagent:jarpath[=options]) notifies the JVM of the presence of the agent.
- A string of options prefixed by an ‘=’ sign is used to pass a set of parameters to the agent.
- The premain method included in the agent class ensures that the agent runs before the product’s main class executes.
A Java agent can start executing at one of two points:
- Before the product starts and the relevant classes are loaded into the JVM
- At a specific point in time, instrumenting only the required classes that are already loaded into the JVM
The analytics-instrumentation-agent uses the first approach, which relies on the premain method mentioned above.
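The premain entry point and the option string described above can be sketched as follows. This is a minimal illustration, not the agent's actual source: the class name AgentEntryPointSketch, the AgentOptions holder and the parsing rules are assumptions based on the option formats shown in this article (a single true, or false followed by a configuration folder ending in ‘/’). The JVM locates the agent class through the Premain-Class attribute of the agent jar's manifest.

```java
import java.lang.instrument.Instrumentation;

/** Illustrative agent entry point; names and parsing rules are assumptions. */
class AgentEntryPointSketch {

    /** Parsed form of the text that follows '=' in -javaagent:jarpath=options. */
    static final class AgentOptions {
        final boolean carbonProduct; // true when running with a WSO2 product
        final String configDir;      // only set for other products

        AgentOptions(boolean carbonProduct, String configDir) {
            this.carbonProduct = carbonProduct;
            this.configDir = configDir;
        }
    }

    static AgentOptions parseOptions(String agentArgs) {
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            throw new IllegalArgumentException("Expected 'true' or 'false,<config folder>/'");
        }
        String[] parts = agentArgs.split(",", 2);
        boolean carbonProduct = Boolean.parseBoolean(parts[0].trim());
        if (carbonProduct) {
            return new AgentOptions(true, null);
        }
        if (parts.length < 2 || !parts[1].trim().endsWith("/")) {
            throw new IllegalArgumentException("A configuration folder ending in '/' is required");
        }
        return new AgentOptions(false, parts[1].trim());
    }

    /** Called by the JVM before the main class of the subject product runs. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        AgentOptions options = parseOptions(agentArgs);
        System.out.println("Agent started, WSO2 product: " + options.carbonProduct
                + ", config folder: " + options.configDir);
        // A real agent would load its configuration here and register a
        // ClassFileTransformer through instrumentation.addTransformer(...).
    }
}
```

Keeping the option parsing in its own method makes it possible to check the accepted formats without starting a JVM with the -javaagent flag.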
Once the agent starts, the next step is to carry out the instrumentation on the required class files. This is done through the ClassFileTransformer of the agent. When a transformer is registered in the agent’s premain method (using addTransformer()), the JVM passes the bytecode of each class to the ClassFileTransformer as the class is loaded. Once the transformer comes across the bytecode of a class specified by the user, it carries out the required instrumentation.
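The following sketch shows the general shape of such a transformer using only the JDK. It selects classes by name and leaves the bytecode untouched; the class name SelectiveTransformerSketch and the matched list are assumptions made for illustration, and the actual rewriting step (done with a bytecode library in the real agent) is deliberately left out.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

/** Illustrative transformer that only picks out the classes to be instrumented. */
class SelectiveTransformerSketch implements ClassFileTransformer {

    /** Target classes in the JVM's internal form, e.g. org/h2/jdbc/JdbcConnection. */
    private final Set<String> targets = new HashSet<>();

    /** Names of the classes that matched, kept here only for demonstration. */
    final List<String> matched = new CopyOnWriteArrayList<>();

    SelectiveTransformerSketch(String... fullyQualifiedClassNames) {
        for (String name : fullyQualifiedClassNames) {
            targets.add(name.replace('.', '/')); // the JVM passes names with '/' separators
        }
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className == null || !targets.contains(className)) {
            return null; // returning null tells the JVM to keep the class unchanged
        }
        matched.add(className);
        // A real agent would rewrite classfileBuffer here and return the new bytes.
        return null;
    }
}
```

Inside premain, a transformer like this would be registered with instrumentation.addTransformer(new SelectiveTransformerSketch("org.h2.jdbc.JdbcConnection")).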
Although Java provides the required services for the instrumentation, a bytecode modification library is also needed to handle the actual modifications of bytecode. ASM, Javassist and AspectJ are a few libraries that can be used in the process. For the analytics-instrumentation-agent we used the Javassist library since it is easier to handle than ASM and doesn't require close engagement with the bytecode.
Configuration file layout
The following configuration file, inst-agent-config.xml, is used as the input that carries the details of the classes to instrument, along with the connection and authentication settings, to the agent.
inst-agent-config.xml

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <instrumentationAgent>
        <agentConnection>
            <streamName>...</streamName>
            <tableName>...</tableName>
            <version>...</version>
            <receiverURL>...</receiverURL>
            <authURL>...</authURL>
            <username>...</username>
            <password>...</password>
            <hostName>...</hostName>
            <servicePort>...</servicePort>
        </agentConnection>
        <scenarios>
            <scenario name="..">
                <instrumentingClass name="..">
                    <instrumentingMethod name=".." signature="..">
                        <insertBefore>
                            <parameterName key="..">...</parameterName>
                        </insertBefore>
                        <insertAts>
                            <insertAt lineNo="..">
                                <parameterName key="..">...</parameterName>
                            </insertAt>
                        </insertAts>
                        <insertAfter>
                            <parameterName key="..">...</parameterName>
                        </insertAfter>
                    </instrumentingMethod>
                </instrumentingClass>
            </scenario>
            ...
        </scenarios>
    </instrumentationAgent>

The agent configuration file contains two main sections - agentConnection and scenarios.
<agentConnection>
This section contains the parameters required to establish the connection with WSO2 DAS and publish events from the instrumented product. A few of these attributes are also used to modify the schema of the analytics table.
Attribute | Description | Sample Value |
---|---|---|
streamName | Name of the stream in WSO2 DAS | instrumentation.agent.stream |
tableName | Table created in the DAL by persisting the above stream | INSTRUMENTATION_AGENT_STREAM |
version | Version of the stream | 1.0.0 |
receiverURL | The WSO2 DAS thrift receiver URL | tcp://localhost:7611 |
authURL | The WSO2 DAS thrift receiver auth URL | ssl://localhost:7711 |
username | Username for connecting WSO2 DAS receiver node | admin |
password | Password for connecting WSO2 DAS receiver node | admin |
hostName | Hostname or the IP address of the server | localhost |
servicePort | Service port of the server | 9443 |
<scenarios>
This section contains the details of the classes to instrument. A scenario lists a set of classes to be instrumented. Within a single class there can be a list of methods, each identified by its name and method signature. The javap tool included with the JDK can be used to generate method signatures.
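As an alternative to reading javap output, the JVM-style descriptor of a method can also be derived through reflection. The sketch below is an illustration only; the helper class MethodSignatureSketch is hypothetical, and it assumes that the signature attribute expects the descriptor form shown in the sample values of this article (for example ()Ljava/sql/ResultSet;).

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

/** Illustrative helper that prints JVM method descriptors via reflection. */
class MethodSignatureSketch {

    /** Builds the descriptor, e.g. (II)Ljava/lang/String; for String.substring(int, int). */
    static String descriptorOf(Method method) {
        return MethodType.methodType(method.getReturnType(), method.getParameterTypes())
                .toMethodDescriptorString();
    }

    /** Convenience lookup for a public method by name and parameter types. */
    static String descriptorOf(Class<?> owner, String methodName, Class<?>... parameterTypes) {
        try {
            return descriptorOf(owner.getMethod(methodName, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such public method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Descriptor for the no-argument executeQuery() of java.sql.PreparedStatement.
        System.out.println("executeQuery"
                + descriptorOf(java.sql.PreparedStatement.class, "executeQuery"));
    }
}
```

The class must be on the classpath for the reflective lookup to work, so javap remains the simpler option when only the jar file is at hand.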
Once a method is selected, the user can choose where to inject code and which values to intercept. Instrumentation can be done at all three locations (beginning, specific line, end) or at only one of them, according to the user’s need. To instrument at a specific line of the method, a valid line number is required. The user can also use more than one insertAt tag to instrument at different line numbers of the same method.
When specifying classes, the user can provide three types:
- Interface: instrument all the classes that implement a given interface
  - e.g. java.sql.PreparedStatement
- Superclass: instrument all the classes that extend a given class
  - e.g. org.h2.jdbc.JdbcStatement
- Simple class: instrument only the given class
  - e.g. org.h2.jdbc.JdbcPreparedStatement
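One way to express the three cases above with a single check is a subtype test, sketched below with plain reflection. This is an assumption about how such matching could be done, not a description of the agent's own logic: the article does not say how the agent tells a superclass from a simple class, and the helper name ClassMatchSketch is hypothetical.

```java
/** Illustrative subtype check covering interface, superclass and simple class targets. */
class ClassMatchSketch {

    /**
     * Returns true when the loaded class is the configured class itself, extends it,
     * or implements it. Interfaces are skipped because they carry no method bodies
     * to instrument in this sketch.
     */
    static boolean shouldInstrument(Class<?> configured, Class<?> loaded) {
        return configured.isAssignableFrom(loaded) && !loaded.isInterface();
    }

    public static void main(String[] args) {
        // Interface target: every implementing class matches.
        System.out.println(shouldInstrument(java.util.List.class, java.util.ArrayList.class));
        // Superclass target: every subclass matches.
        System.out.println(shouldInstrument(java.util.AbstractList.class, java.util.ArrayList.class));
        // Simple class target: unrelated classes do not match.
        System.out.println(shouldInstrument(java.util.ArrayList.class, java.util.LinkedList.class));
    }
}
```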
The <parameterName> tag is used to specify which values to capture. Each location (insertBefore, insertAt, insertAfter) can have more than one parameter. The value of a parameter can be either of the following:
- String - “value to print”
- Method parameter - $1 (first parameter of the method) / $0.attributeName ($0 stands for this in Java) [2]
The ‘key’ attribute holds the name of the column in which the user wishes to store the value. Once all the values are set properly, the analytics-instrumentation-agent carries out the rest of the instrumentation.
Attribute | Description | Sample Value |
---|---|---|
scenarioName | Unique name for the scenario | jdbc-monitoring |
instrumentingClass | Fully qualified class name | java.sql.PreparedStatement |
instrumentingMethod | Method name and signature | executeQuery()Ljava/sql/ResultSet; |
insertBefore | Values to obtain after the method signature and before the first line of the method body | <parameterName> tags |
insertAt | Line number to instrument within the method body (must be less than the number of lines in the method body) | 1 |
insertAfter | Values to obtain before the method ends | <parameterName> tags |
parameterName | Column name (key) and the value to store in it | column_1 $1 / “string_value” |
Note: There can be cases where the agent fails to instrument certain classes due to internal factors (e.g. javassist.NotFoundException). These failures are logged and can be checked in the inst-agent.log file.
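To illustrate how the two sections of the configuration file fit together, the sketch below reads a small configuration with the JDK's DOM parser. It is not the agent's own loader: the class AgentConfigSketch, its method names and the sample values (including the prepareStatement entry and the query key) are assumptions, while the element and attribute names follow the layout shown above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative reader for an inst-agent-config.xml style document. */
class AgentConfigSketch {

    /** Hypothetical sample; a real file would contain all connection attributes. */
    static final String SAMPLE =
            "<instrumentationAgent>"
            + "<agentConnection>"
            + "<streamName>instrumentation.agent.stream</streamName>"
            + "<receiverURL>tcp://localhost:7611</receiverURL>"
            + "</agentConnection>"
            + "<scenarios><scenario name=\"jdbc-monitoring\">"
            + "<instrumentingClass name=\"org.h2.jdbc.JdbcConnection\">"
            + "<instrumentingMethod name=\"prepareStatement\""
            + " signature=\"(Ljava/lang/String;)Ljava/sql/PreparedStatement;\">"
            + "<insertBefore><parameterName key=\"query\">$1</parameterName></insertBefore>"
            + "</instrumentingMethod>"
            + "</instrumentingClass>"
            + "</scenario></scenarios>"
            + "</instrumentationAgent>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid agent configuration", e);
        }
    }

    /** Reads one child of agentConnection, e.g. streamName; null when absent. */
    static String connectionValue(Document doc, String tag) {
        Element connection = (Element) doc.getElementsByTagName("agentConnection").item(0);
        if (connection == null || connection.getElementsByTagName(tag).getLength() == 0) {
            return null;
        }
        return connection.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    /** Lists every configured method as class#nameSignature. */
    static List<String> describeMethods(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList classes = doc.getElementsByTagName("instrumentingClass");
        for (int i = 0; i < classes.getLength(); i++) {
            Element cls = (Element) classes.item(i);
            NodeList methods = cls.getElementsByTagName("instrumentingMethod");
            for (int j = 0; j < methods.getLength(); j++) {
                Element method = (Element) methods.item(j);
                result.add(cls.getAttribute("name") + "#"
                        + method.getAttribute("name") + method.getAttribute("signature"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(connectionValue(doc, "streamName"));
        System.out.println(describeMethods(doc));
    }
}
```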
How to run the agent
Step 1 - Copying files
WSO2 product
- Create a folder named javaagent in <PRODUCT_HOME>/lib/ and copy the required jars to it.
- Copy the java-agent.sh file to <PRODUCT_HOME>/bin/.
- Create a folder named javaagent in <PRODUCT_HOME>/repository/conf/ and copy the inst-agent-config.xml and log4j.properties files to it.
- Copy analytics-instrumentation-agent.jar to <PRODUCT_HOME>/repository/components/lib.
Other product
- Create a folder within the product folder structure and copy the following files to it.
  - inst-agent-config.xml
  - log4j.properties
  - data-agent-config.xml
  - client-truststore.jks
- Copy the analytics-instrumentation-agent.jar to the above folder or any preferred location.
- Copy the set of jars mentioned below to a preferred location.
- Copy the java-agent.sh file to a preferred location.
Note: jars to be copied to the lib/ folder:
javassist-3.18.1-GA.jar
commons-logging-1.2.jar
commons-pool-1.6.jar
disruptor-3.3.2.jar
libthrift-0.8.0.jar
log4j-1.2.13.jar
org.wso2.carbon.databridge.agent-5.0.8.jar
org.wso2.carbon.databridge.commons.thrift-5.0.8.jar
org.wso2.carbon.databridge.commons-5.0.8.jar
json-simple-1.1.1.jar
org.wso2.carbon.utils_4.4.3.jar
org.wso2.carbon.base_4.4.3.jar
Step 2 - Passing parameters to the agent
In order to run the agent you have to pass the paths of configuration files and the set of other files mentioned above, as agent parameters. These values are passed as a string of options with the -javaagent parameter. E.g. -javaagent:path/to/agent.jar[=options]
WSO2 product
- A single parameter, ‘true’, informs the agent that it is running with a WSO2 product.
  -javaagent:/path/to/analytics-instrumentation-agent-1.0-SNAPSHOT.jar=true
Other product
- The first parameter, ‘false’, informs the agent that it is running with a third-party product. The file path provided in the second parameter tells the agent where to look for the relevant configuration files (Note: ‘/’ must be used at the end).
  -javaagent:/path/to/analytics-instrumentation-agent-1.0-SNAPSHOT.jar=false,path/to/config/folder/
Instructions on setting these parameters are provided with the java-agent.sh file.
Step 3 - Add jars required to the classpath
Like any other Java program, the agent requires its dependencies to be on the classpath. The jars required by the agent may vary based on the product and the classes being instrumented, so copying the required jars to the lib folder is up to the user. The main set of required jars is listed in Step 1.
WSO2 product
- Jars are appended to CARBON_CLASSPATH, which is later added to the JVM classpath.
  - export CARBON_CLASSPATH="$CARBON_CLASSPATH":"$(echo $CARBON_HOME/lib/javaagent/*.jar | tr ' ' ':')"
Other product
- Use the CLASSPATH environment variable.
  - export CLASSPATH="$(echo absolute/path/to/lib/*.jar | tr ' ' ':')"
- Add dependencies from the pom file to the MANIFEST file while building the jar. This can be done by uncommenting the relevant sections of the maven-jar-plugin and maven-dependency-plugin.
Note: If a required jar is not available on the classpath, the JVM throws a ClassNotFoundException along with a NoClassDefFoundError and aborts the Java agent before the product starts.
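Since a missing jar only shows up when the JVM aborts, it can help to check the classpath beforehand. The sketch below is one possible pre-flight check, not part of the agent: the class ClasspathCheckSketch is hypothetical, and javassist.ClassPool is used in main only as an example of a class that one of the listed jars is expected to provide.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative pre-flight check for classes the agent depends on. */
class ClasspathCheckSketch {

    /** Returns the names of the classes that cannot be found on the current classpath. */
    static List<String> missingClasses(String... classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClasspathCheckSketch.class.getClassLoader();
        for (String name : classNames) {
            try {
                // 'false' avoids running static initializers during the check.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example class name; adjust the list to the jars your setup needs.
        System.out.println("Missing: " + missingClasses("javassist.ClassPool"));
    }
}
```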
Step 4 - Set up WSO2 DAS
This step is common to both WSO2 products and other products because we need to set up a stream and a receiver in WSO2 DAS to store the received data.
- Create a stream that matches the definition given in the ‘Architecture of the agent’ section.
- Persist the table before saving the stream (check the ‘Merge with existing schema’ option in the Advanced section to enable schema merging; otherwise the schema modified by the agent will be removed).
Figure 2
- Create an event receiver as mentioned above and enable arbitrary maps using arbitraryMaps="enable".
A step-by-step description can be found in the product documentation [3].
Step 5 - Configure the agent
Fill in the inst-agent-config.xml file with the correct connection data and instrumentation class data. The connection data should exactly match the values of the WSO2 DAS server (the sample values given above change if the server is started with a port offset). The table name can be found in the Data Explorer of the server.
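The effect of a port offset on the connection values can be worked out with simple arithmetic, sketched below. The sketch assumes that the offset is added uniformly to each default port listed in the agentConnection table (7611, 7711 and 9443); the class PortOffsetSketch and its method names are hypothetical.

```java
/** Illustrative calculation of connection values for a server started with a port offset. */
class PortOffsetSketch {

    static final int DEFAULT_THRIFT_PORT = 7611;      // receiverURL sample value
    static final int DEFAULT_THRIFT_AUTH_PORT = 7711; // authURL sample value
    static final int DEFAULT_SERVICE_PORT = 9443;     // servicePort sample value

    static String receiverUrl(String host, int offset) {
        return "tcp://" + host + ":" + (DEFAULT_THRIFT_PORT + offset);
    }

    static String authUrl(String host, int offset) {
        return "ssl://" + host + ":" + (DEFAULT_THRIFT_AUTH_PORT + offset);
    }

    static int servicePort(int offset) {
        return DEFAULT_SERVICE_PORT + offset;
    }

    public static void main(String[] args) {
        int offset = 1; // example offset
        System.out.println(receiverUrl("localhost", offset));
        System.out.println(authUrl("localhost", offset));
        System.out.println(servicePort(offset));
    }
}
```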
Step 6 - Run the agent
Once all the above steps are completed, the final step is to run the agent with the product. It can be achieved by doing the following.
WSO2 product
- Add the following section to the wso2server.sh file (it can be added anywhere after the CARBON_HOME variable is set) and start the product as usual using sh wso2server.sh.
    #load Java agent
    . $CARBON_HOME/bin/java-agent.sh
Other product
- Add the above lines to the product’s startup script or modify it according to the requirement.
- If it is a simple Java application with a main class, the agent can be run from the command line.
    ~$ java -javaagent:path/to/analytics-instrumentation-agent.jar[=options] com.test.TestClass
Published values can be searched and queried using the data explorer tool of WSO2 DAS.
Use case
The following ‘jdbc-monitoring’ scenario gives an overview of how to run the agent with a WSO2 product. It walks through an end-to-end scenario for intercepting JDBC queries. You can refer to the configuration files in the Appendix.
Objective: Intercept JDBC calls. List all queries executed along with connection details.
Classes and methods:
- org.h2.jdbc.JdbcConnection - prepareStatement()
- org.h2.jdbc.JdbcPreparedStatement - executeQuery()
inst-agent-config.xml
Figure 3
java-agent.sh
Figure 4
instrumentation.agent.receiver.xml
Figure 5
instrumentation.agent.stream_1.0.0.json
Figure 6
Output in WSO2 DAS
Figure 7
Building and using the instrumentation agent
The source code for the analytics-instrumentation-agent is hosted on GitHub [4]. The repository contains the following resources.
- java-instrumentation-agent/conf: Required configuration files that need to be copied.
- java-instrumentation-agent/lib: Required dependencies that need to be copied to the <PRODUCT_HOME>/lib/javaagent folder.
- java-instrumentation-agent/src: Source code for the instrumentation agent.
Building analytics-instrumentation-agent source code
Java 1.6+ and Maven 3.0+ are required to build the source code.
- Download and unzip the source repository or clone it using Git: git clone https://github.com/wso2/analytics-data-agents
- cd into analytics-data-agents/java-instrumentation-agent and type mvn clean install
- java-instrumentation-agent binary can be found inside java-instrumentation-agent/target folder
- Follow “How to run the agent” section to install and configure the agent with any Java based server
Summary
The new analytics-instrumentation-agent feature introduced with WSO2 DAS provides a Java agent that can be used to intercept values from a given set of Java classes. It modifies the bytecode of the given classes and publishes the intercepted values to a stream in WSO2 DAS. These values can later be analyzed using WSO2 DAS tools (Spark SQL, index search queries) to derive useful facts.
It proves to be a useful tool for developers who wish to monitor the internal parameters of products for performance testing and profiling. This article explained how it works and how to use it with any Java-based product.
References
- [1] https://docs.oracle.com/javase/7/docs/api/java/lang/instrument/package-summary.html
- [2] https://jboss-javassist.github.io/javassist/tutorial/tutorial2.html
- [3] https://docs.wso2.com/display/DAS300/Quick+Start+Guide
- [4] https://github.com/wso2/analytics-data-agents