# OpenTelemetry Java Agent Safety Mechanisms

This document outlines the safety mechanisms that give us confidence that the Java agent can be
attached to a user's application with a very low chance of affecting it negatively, for example
by introducing crashes.

## Instrumentation tests

Every instrumentation is written with instrumentation tests, which can be considered the unit tests
of this project.

Instrumentation tests run with a fully shaded `-javaagent` so that they perform the same bytecode
instrumentation as when the agent is attached to a normal app.
By then exercising the instrumented library the way a user would, for example by issuing requests
from an HTTP client, we can assert on the spans that should be generated, including their semantic
attributes. A problem in the instrumentation will generally cause spans to be reported incorrectly
or not at all, and the instrumentation tests catch these situations.

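The shape of such an assertion can be sketched in plain Java. Everything below is a hypothetical stand-in written for illustration (`RecordedSpan`, `FakeInstrumentedClient`, and `checkSingleGetSpan` are not part of the project's test utilities); only the attribute names `http.request.method` and `url.full` are taken from the OpenTelemetry HTTP semantic conventions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustration only: every type here is a hypothetical stand-in, not the project's test API.
final class SpanAssertionSketch {

  /** Minimal span model: a name plus string attributes. */
  static final class RecordedSpan {
    final String name;
    final Map<String, String> attributes;

    RecordedSpan(String name, Map<String, String> attributes) {
      this.name = name;
      this.attributes = attributes;
    }
  }

  /** Stands in for an instrumented HTTP client: each call records one client span. */
  static final class FakeInstrumentedClient {
    final List<RecordedSpan> exported = new ArrayList<>();

    void get(String url) {
      exported.add(
          new RecordedSpan("GET", Map.of("http.request.method", "GET", "url.full", url)));
    }
  }

  /** Returns a description of the first problem found, or null when the spans look right. */
  static String checkSingleGetSpan(List<RecordedSpan> spans, String expectedUrl) {
    if (spans.size() != 1) {
      return "expected exactly 1 span but got " + spans.size();
    }
    RecordedSpan span = spans.get(0);
    if (!"GET".equals(span.name)) {
      return "unexpected span name: " + span.name;
    }
    if (!"GET".equals(span.attributes.get("http.request.method"))) {
      return "missing or wrong http.request.method";
    }
    if (!expectedUrl.equals(span.attributes.get("url.full"))) {
      return "missing or wrong url.full";
    }
    return null;
  }
}
```

A span that is missing, duplicated, or carries a wrong attribute makes `checkSingleGetSpan` return a non-null description, which mirrors how a broken instrumentation shows up as a failed test.
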
## Latest dep tests

Instrumentation tests are generally run against the lowest version of a library that we support,
to ensure a baseline for users with old dependency versions. Because of the nature of the agent,
and because in some places we instrument private APIs, the agent may fail on a newly released
version of the library. We therefore also run the instrumentation tests against the latest version
of the library, as fetched from Maven, as part of a nightly build. If a new version of a library
does not work with the agent, we find out through this build and can address it by the next release
of the agent.

## Muzzle compile time checks

Muzzle is the tool we use to ensure we do not apply agent instrumentation if it would break the
user's app. Details on its implementation can be found [here](./contributing/muzzle.md).

The continuous build runs a Muzzle compile-time check for every library. This check selects random
versions of the library available in Maven and checks whether our agent applies cleanly to each of
them. It collects all references that the agent code makes, e.g., classes that are used and methods
that are called, and verifies that the references exist in that version of the library. This is
important because applying the agent with missing references will generally cause crashes in the
user's app, such as `NoSuchMethodError`. We cannot check every single version of every library in
every build; it would be too slow and wasteful of contributed resources. But by selecting random
versions in every build, over time we gain confidence that the agent can be used on all versions of
a library without causing linkage errors due to missing references.

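The idea behind this check can be sketched as a set difference plus random sampling. This is a simplified model, not Muzzle's implementation: references are represented as plain strings, each library version is reduced to the set of symbols it provides, and all names (`MuzzleCompileTimeSketch`, `missingReferences`, `sampleVersions`, `failingVersions`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of a reference check; all names are hypothetical.
final class MuzzleCompileTimeSketch {

  /** Returns the references the instrumentation needs that a library version does not provide. */
  static Set<String> missingReferences(Set<String> required, Set<String> providedByVersion) {
    Set<String> missing = new TreeSet<>(required);
    missing.removeAll(providedByVersion);
    return missing;
  }

  /** Picks up to sampleSize versions at random; repeated builds cover more versions over time. */
  static List<String> sampleVersions(List<String> allVersions, int sampleSize, Random random) {
    List<String> shuffled = new ArrayList<>(allVersions);
    Collections.shuffle(shuffled, random);
    return shuffled.subList(0, Math.min(sampleSize, shuffled.size()));
  }

  /** Returns the sampled versions on which at least one required reference is missing. */
  static List<String> failingVersions(
      Set<String> required, Map<String, Set<String>> symbolsByVersion, List<String> sampled) {
    List<String> failing = new ArrayList<>();
    for (String version : sampled) {
      if (!missingReferences(required, symbolsByVersion.get(version)).isEmpty()) {
        failing.add(version);
      }
    }
    return failing;
  }
}
```

In this model, a version that drops a method the instrumentation calls appears in `failingVersions` as soon as a build happens to sample it, without every build paying the cost of checking every version.
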
## Muzzle runtime checks

The set of references that Muzzle collects from the agent at compile time is also stored in the
agent's code itself. Similar to the compile-time check, at runtime we validate the references
available in the user's app against those referenced by the agent instrumentation. If the references
do not match up, we do not load the instrumentation, which prevents applying instrumentation that
could cause linkage errors.

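A minimal sketch of that gating, assuming references can be modeled as public methods and resolved through reflection. `MethodRef`, `allReferencesResolve`, and `applyIfSafe` are hypothetical names, and reflection is used here only to keep the example self-contained; it is not a description of how Muzzle inspects the user's classes.

```java
import java.util.List;

// Illustrative gate: apply an instrumentation only if everything it references resolves.
final class MuzzleRuntimeSketch {

  /** A referenced public method: owning class, method name, and parameter types. */
  static final class MethodRef {
    final String className;
    final String methodName;
    final Class<?>[] parameterTypes;

    MethodRef(String className, String methodName, Class<?>... parameterTypes) {
      this.className = className;
      this.methodName = methodName;
      this.parameterTypes = parameterTypes;
    }
  }

  /** Returns true only if every reference resolves in the given class loader. */
  static boolean allReferencesResolve(ClassLoader appLoader, List<MethodRef> refs) {
    for (MethodRef ref : refs) {
      try {
        // initialize=false: look the class up without running its static initializers.
        Class<?> owner = Class.forName(ref.className, false, appLoader);
        // getMethod only sees public methods, which is enough for this sketch.
        owner.getMethod(ref.methodName, ref.parameterTypes);
      } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
        return false;
      }
    }
    return true;
  }

  /** Runs the instrumentation only when the check passes; returns whether it was applied. */
  static boolean applyIfSafe(ClassLoader appLoader, List<MethodRef> refs, Runnable instrumentation) {
    if (!allReferencesResolve(appLoader, refs)) {
      return false; // Skip rather than risk a linkage error in the user's app.
    }
    instrumentation.run();
    return true;
  }
}
```

The design point is that a mismatch leads to the instrumentation being skipped, so the user loses telemetry for that library but their application keeps running.
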
## Classloader separation

See more detail about class loader separation [here](./contributing/javaagent-structure.md).

The Java agent makes sure to include as little code as possible in the user app's class loader, and
all code that is included is either unique to the agent itself or shaded in the agent build. This is
because if the agent included classes that are also used by the user's app and there was a version
mismatch, it could cause linkage crashes.

Instead of executing code in the app's class loader, the agent has its own agent class loader where
instrumentation is loaded and the exporters and SDK are configured. Only when applying an
instrumentation (which will have passed Muzzle runtime checks) do we inject any additional classes
that are needed by the instrumentation into the user's class loader. These classes are always either
unique to the agent or shaded versions of public libraries, such as our library instrumentation
modules, and so cannot cause version conflicts.

To ensure agent classes are not automatically loaded into the user's class loader, possibly by an
eagerly loading application server, they are hidden in the agent JAR as ordinary non-Java files.
All packages are moved into a subdirectory `inst` and all class files are renamed from `.class` to
`.classdata`, so applications with any sort of automatic classpath scanning will not find agent
classes. The agent class loader understands this convention and reverses it when loading classes.

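The naming convention can be sketched with a small class loader. This is an illustration under assumptions, not the agent's class loader: the JAR is replaced by an in-memory map of entry paths to bytes, and `ClassDataLoaderSketch`, `hide`, and `entryFor` are hypothetical names. Only the `inst` subdirectory and the `.classdata` suffix come from the description above.

```java
import java.util.Map;

// Illustrative loader for classes stored as inst/<package path>/<Name>.classdata.
final class ClassDataLoaderSketch extends ClassLoader {
  private static final String PREFIX = "inst/";
  private static final String SUFFIX = ".classdata";

  // Stand-in for the agent JAR: entry path -> raw class bytes.
  private final Map<String, byte[]> jarEntries;

  ClassDataLoaderSketch(Map<String, byte[]> jarEntries, ClassLoader parent) {
    super(parent);
    this.jarEntries = jarEntries;
  }

  /** Build-time direction: "com/example/Foo.class" -> "inst/com/example/Foo.classdata". */
  static String hide(String classFilePath) {
    if (!classFilePath.endsWith(".class")) {
      throw new IllegalArgumentException("not a class file: " + classFilePath);
    }
    String base = classFilePath.substring(0, classFilePath.length() - ".class".length());
    return PREFIX + base + SUFFIX;
  }

  /** Load-time direction: "com.example.Foo" -> "inst/com/example/Foo.classdata". */
  static String entryFor(String className) {
    return PREFIX + className.replace('.', '/') + SUFFIX;
  }

  @Override
  protected Class<?> findClass(String name) throws ClassNotFoundException {
    // Called only after the parent loader has failed to find the class.
    byte[] bytes = jarEntries.get(entryFor(name));
    if (bytes == null) {
      throw new ClassNotFoundException(name);
    }
    return defineClass(name, bytes, 0, bytes.length);
  }
}
```

Because the entries neither end in `.class` nor sit at their normal package path, a scanner that walks the JAR looking for classes finds nothing, while a loader that knows the convention can still resolve them by name.
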
## Smoke tests

We run Docker-based smoke tests in which simple instrumented apps run under various JVMs
and application servers. Application servers in particular sometimes rely on internal details of
the JVM in fragile ways, and an agent can cause problems with that behavior. Smoke tests ensure
compatibility with a wide variety of application servers.