A brief overview of java.lang.invoke
Note: This page is still under construction
java.lang.invoke, also known as JSR 292, is best known for MethodHandles and invokedynamic. It was introduced to support dynamic programming languages, yet it has become increasingly crucial to Java itself over time. Let’s take a look at its history and its implications.
Before java.lang.invoke
We all know that the most common way to get a MethodHandle is through MethodHandles.lookup(), which can find field accessors and methods. But didn’t reflection exist before that? Why couldn’t reflection be used?
Reflection and Unsafe
Before the appearance of invoke, reflection did exist, and this is how it was implemented:
- Method accessors used ad-hoc bytecode generation that was only removed in favor of MethodHandle in JEP 416; as of JDK 23, the infrastructure still exists to support old serialization constructor generation.
- Field accessors used Unsafe, which later became notorious as a major blocker for upgrades past Java 9. Back then, it was much simpler, with only field access methods using a long offset.
So what does MethodHandle do in comparison? Each MethodHandle has a fixed MethodType; knowing the exact type up front can speed up calls significantly compared to the argument conversions performed by reflection. And indeed, each invokedynamic instruction has a fixed MethodType that is passed to the bootstrap method.
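To make the comparison concrete, here is a minimal sketch that calls String.length() both ways. The class and method names (FixedTypeSketch, lengthViaReflection, lengthViaHandle) are made up for illustration; only the java.lang.reflect and java.lang.invoke calls are real API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical demo class; the names are illustrative only.
final class FixedTypeSketch {

    // Looks up String.length() as a handle whose type is fixed to (String)int.
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reflection: arguments are passed as Object varargs and the result comes back boxed.
    static int lengthViaReflection(String s) {
        try {
            Method m = String.class.getMethod("length");
            return (Integer) m.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // MethodHandle: the call site's descriptor (String)int must match the handle's type exactly.
    static int lengthViaHandle(String s) {
        try {
            return (int) lengthHandle().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthHandle().type());
        System.out.println(lengthViaReflection("invoke"));
        System.out.println(lengthViaHandle("invoke"));
    }
}
```

The reflective path has to unpack an Object array and box the int result on every call, while the invokeExact call site is compiled against one exact descriptor.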
Reading the java.lang.invoke code
Entrypoints from the VM
Since invocation happens from the VM, it would be helpful to find where the call sequences start. The entrypoints to the whole invoke system are these 3 methods in MethodHandleNatives:
- linkCallSite: Links a CallSite, i.e. an invokedynamic instruction
- linkMethod: Links a signature-polymorphic method in MethodHandle (invokeExact or invoke) or VarHandle (access methods)
- linkDynamicConstant: Resolves a CONSTANT_Dynamic to a constant value
linkCallSite and linkMethod return a MemberName that points to infrastructure static methods, mostly in dynamically generated LambdaForms (see InvokerBytecodeGenerator and Invokers too). They can also point to pregenerated bytecode, such as VarHandleGuards methods for VarHandle, or Invokers$Holder from pregeneration (via CDS or jlink).
Back into the VM
Execution, of course, eventually comes back into the JVM. The hooks are all in MethodHandle:
- invokeBasic: Used by LambdaForm code generation to easily invoke nested MethodHandles, such as ones with bound arguments (BoundMethodHandle); essentially the same as invokeExact or invoke but without type conversions, as all types are “basic types” (loadable types)
- linkToVirtual, linkToStatic, linkToSpecial, linkToInterface: The most basic calls used by java.lang.invoke. Used by DirectMethodHandle.preparedLambdaForm to simulate invokevirtual, invokestatic, invokespecial, invokeinterface calls. However, they are more powerful, as they can link to hidden classes with the trailing MemberName argument while Java bytecode cannot.
  - In addition, linkToStatic is explicitly used in VarHandleGuards to invoke static methods when there are many MemberName possibilities.
  - In addition, linkToNative works much like the other link methods, except it takes a trailing NativeEntryPoint. Used by NativeMethodHandle.preparedLambdaForm.
LambdaForm
Being thousands of lines long, LambdaForm is daunting to dig through. However, if you are a bytecode guru, you can check out InvokerBytecodeGenerator, which converts LambdaForm to hidden classes. Also check out preparedLambdaForm in a few MethodHandle implementations. Luckily, LambdaForm is a well-encapsulated class, so understanding its upstream and downstream can give you a good grasp of what it does before you dive in.
MethodType
MethodType seems simple on the surface: just a return type plus an array of parameter types. So why do we need it?
It turns out MethodType encapsulates some complex logic too: one piece is its invokers, which dictate how polymorphic methods with its type should be invoked; in addition, it is interned, just like the Strings for method and class names in reflection. It also has some logic for erasure to “basic types” (similar to the loadable types in bytecode) to reduce the number of LambdaForms and the amount of code generation.
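The erasure idea can be glimpsed through the public API. The sketch below uses MethodType.erase(), which replaces every reference type with Object; this is only an analogy for the internal basic-type erasure, which is not exposed publicly. The class name MethodTypeSketch is hypothetical.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class; only MethodType is real API.
final class MethodTypeSketch {

    // A specific type: (Integer,int)String.
    static final MethodType SPECIFIC =
            MethodType.methodType(String.class, Integer.class, int.class);

    // Public erasure: reference types become Object, primitives stay as they are.
    static MethodType erased() {
        return SPECIFIC.erase();
    }

    public static void main(String[] args) {
        MethodType again = MethodType.methodType(String.class, Integer.class, int.class);
        // Equal types are interchangeable; rely on equals() rather than identity.
        System.out.println(SPECIFIC.equals(again));
        System.out.println(SPECIFIC);
        System.out.println(erased());
    }
}
```

Many distinct method types collapse to the same erased shape, which is why erasure helps share LambdaForms and generated code.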
Best practices
MethodHandle and VarHandle
When using MethodHandle and VarHandle, prefer to keep them as constants (another good topic to dive into later), such as in static final fields.
Always prefer calling invokeExact; this method is the fastest. A call to invoke, in contrast, may call asType every time the handle is invoked, and even if the asTypeCache doesn’t miss, it cannot be inlined, since the cache is a soft reference rather than a constant.
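A minimal sketch of both recommendations follows: the handle lives in a static final field and is called through invokeExact with a call-site descriptor that matches its type. The class ExactInvocationSketch and its methods are hypothetical names chosen for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class; the names are illustrative only.
final class ExactInvocationSketch {

    // Kept in a static final field so the JIT can treat the handle as a constant.
    private static final MethodHandle CONCAT;

    static {
        try {
            CONCAT = MethodHandles.lookup().findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The call-site descriptor (String,String)String matches CONCAT.type() exactly.
    static String exact(String a, String b) {
        try {
            return (String) CONCAT.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The call-site descriptor is (Object,Object)Object, so invoke has to adapt the handle.
    static Object generic(Object a, Object b) {
        try {
            return CONCAT.invoke(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invokeExact performs no adaptation: a mismatched descriptor is rejected.
    static boolean exactRejectsMismatch() {
        try {
            Object ignored = CONCAT.invokeExact((Object) "a", (Object) "b");
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The casts and static argument types at an invokeExact call site are part of the contract: they determine the descriptor that javac records, so they must mirror the handle's type.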
Similarly, when declaring a VarHandle, finish the declaration with a call to withInvokeExactBehavior. Otherwise, the VarHandle will suffer from similar performance penalties if called with a suboptimal type (JDK-8160821).
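As a sketch of what this looks like in practice (assuming JDK 16 or later, where withInvokeExactBehavior is available), the example below declares an exact VarHandle for a long field. The Counter class and the other names are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class; the names are illustrative only.
final class ExactVarHandleSketch {

    static final class Counter {
        long value;
    }

    // The declaration ends with withInvokeExactBehavior(), as recommended above.
    private static final VarHandle VALUE;

    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(Counter.class, "value", long.class)
                    .withInvokeExactBehavior();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Descriptors (Counter,long)void and (Counter)long match the access mode types exactly.
    static long setAndGet(long v) {
        Counter c = new Counter();
        VALUE.set(c, v);
        return (long) VALUE.get(c);
    }

    // An int argument gives the descriptor (Counter,int)void, which an exact handle rejects.
    static boolean rejectsIntArgument() {
        Counter c = new Counter();
        try {
            VALUE.set(c, 1);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        }
    }
}
```

With the default behavior, the int argument would be silently adapted; the exact behavior turns that hidden conversion into an early, visible failure.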
Dynamic constants
Compared to invokedynamic bootstrap methods, which are scattered across many classes (LambdaMetafactory, StringConcatFactory), the ConstantBootstraps class provides a lot of bootstrap methods for general-purpose dynamic constants that are otherwise not representable in the constant pool, such as nullConstant and primitiveClass, for use in bootstrap method arguments. Two particularly useful ones are getStaticFinal and invoke, which can turn otherwise eagerly initialized static final fields of a class into lazy constants to reduce class initialization cost.
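Java source cannot express a CONSTANT_Dynamic directly, but the bootstrap methods are ordinary static methods, so calling them by hand shows what value a dynamic constant would resolve to. This is only a sketch: in a real class file the JVM makes these calls lazily on first use of the constant. The class name CondyBootstrapSketch is hypothetical, and the "_" name arguments are placeholders.

```java
import java.lang.invoke.ConstantBootstraps;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; it calls the real bootstrap methods directly.
final class CondyBootstrapSketch {

    // A null constant of a reference type.
    static Object nullOfString() {
        return ConstantBootstraps.nullConstant(MethodHandles.lookup(), "_", String.class);
    }

    // The name is the field descriptor of the primitive, here "I" for int.
    static Class<?> primitiveInt() {
        return ConstantBootstraps.primitiveClass(MethodHandles.lookup(), "I", Class.class);
    }

    // Reads the static final field Boolean.TRUE.
    static Object staticFinal() {
        return ConstantBootstraps.getStaticFinal(
                MethodHandles.lookup(), "TRUE", Boolean.class, Boolean.class);
    }

    // Invokes Integer.parseInt("42") and uses the result as the constant value.
    static Object invoked() {
        try {
            MethodHandle parseInt = MethodHandles.lookup().findStatic(
                    Integer.class, "parseInt", MethodType.methodType(int.class, String.class));
            return ConstantBootstraps.invoke(
                    MethodHandles.lookup(), "_", int.class, parseInt, "42");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```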
Hidden classes
Hidden classes began with Unsafe.defineAnonymousClass, which defined “VM anonymous classes”; they did indeed originate with invoke, as they were first used for LambdaForm implementations. They have since been promoted to a standalone Hidden Classes feature (JEP 371) usable by all Java programs.
NestMates
From JEP 181:
The notion of a common access control context arises in other places as well, such as the host class mechanism in Unsafe.defineAnonymousClass(), where a dynamically loaded class can use the access control context of a host. A formal notion of nest membership would put this mechanism on firmer ground (but actually providing a supported replacement for defineAnonymousClass() would be a separate effort.)
How unexpected! Nestmates come from VM anonymous classes. Indeed, in current invoke, the generated LambdaForm$ hidden classes still have LambdaForm as their host class, though they are not nestmates.
An anecdote about nests is that they were created to enable generic specialization by subclassing in Project Valhalla. (Treat this claim with doubt, since I have forgotten the source.)
The nests also greatly simplified some Java design patterns. For example, before nests:
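private static class Holder {
    // Deliberately not private, so that pre-nestmate javac needed no synthetic accessor.
    static final Object instance = new Object(); // placeholder value
}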
The field declaration avoided private because the Java compiler would otherwise have had to generate an accessor to reach the instance; it had always generated synthetic bridge methods for access to private members across enclosing and inner classes, because these concepts did not exist in the JVM (only packages existed).
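With nests (JDK 11 and later), the same holder can simply use private, and the nest relationship can be inspected at run time. A minimal sketch, with the hypothetical class name NestSketch:

```java
// Hypothetical demo class; getNestHost and isNestmateOf are real Class methods (JDK 11+).
final class NestSketch {

    private static class Holder {
        // private is fine now: nestmates may access each other's private members directly.
        private static final Object INSTANCE = new Object();
    }

    // Reads the private field of the nested class without a synthetic accessor.
    static Object instance() {
        return Holder.INSTANCE;
    }

    // Both classes report the same nest host.
    static boolean sameNestHost() {
        return Holder.class.getNestHost() == NestSketch.class.getNestHost();
    }

    static boolean nestmates() {
        return NestSketch.class.isNestmateOf(Holder.class);
    }

    public static void main(String[] args) {
        System.out.println(Holder.class.getNestHost());
        System.out.println(sameNestHost());
        System.out.println(nestmates());
    }
}
```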
Another anecdote is that a MethodHandle can be created for a nested enum constructor and can be called without any problem, while doing so is prohibited by reflection.
ClassData
Class data is any object passed to MethodHandles$Lookup.defineHiddenClassWithClassData(). Compared to passing the data elsewhere such as via ThreadLocal, using class data is more thread safe and less costly.
Since there are hidden classes, class data becomes necessary, as not all MethodHandle instances are representable by bytecode instructions. LambdaForm classes use class data to represent other hidden classes and MemberName for hidden class members.
Class data is usually accessed in generated code with MethodHandles.classData. Its signature is intentionally compatible with that of a bootstrap method, so that the class data can be loaded as a dynamic constant instead of calling this method at each use site. (Note that InvokerBytecodeGenerator does not use condy, as LambdaForm has to be ready before condy is available for use, so it stores the values in static final fields instead.)
There’s an additional MethodHandles.classDataAt, but calling List.get(int)Object is preferable in actual bytecode to avoid cluttering the constant pool; classDataAt is mostly for supplying bootstrap method arguments.
Other attributes
Other important attributes of hidden classes include:
- Omission in stack traces by default
- Not modifiable by instrumentation
- Final fields are automatically “trusted” (part of constants)
- Class no longer discoverable by Class.forName
These pose risks for migration of regular generated classes to hidden classes.
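Some of these attributes can be observed without generating any bytecode by inspecting a lambda's class. This sketch assumes JDK 15 or later and relies on an implementation detail rather than a specified guarantee: in current OpenJDK builds, the classes that LambdaMetafactory spins for lambdas are hidden classes. The class name HiddenLambdaSketch is hypothetical.

```java
// Hypothetical demo class; assumes lambda classes are hidden (OpenJDK 15+ implementation detail).
final class HiddenLambdaSketch {

    // Returns the runtime class that implements a trivial lambda.
    static Class<?> lambdaClass() {
        Runnable r = () -> { };
        return r.getClass();
    }

    // Tries to find the class again by its reported name.
    static boolean findableByName(Class<?> c) {
        try {
            Class.forName(c.getName());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        System.out.println(c.getName());
        System.out.println(c.isHidden());
        System.out.println(findableByName(c));
    }
}
```

Code that locates generated classes by name, for example through Class.forName, is exactly what breaks when such classes are migrated to hidden classes.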
Impact of java.lang.invoke
invokedynamic
Initially created to let dynamic programming languages resolve calls better (such as the Groovy closures used in Gradle), indy is also noted for its ability to provide distinct implementations on different VMs: just as library methods evolve over time, older code using indy picks up the modern code shape provided by the current bootstrap methods and enjoys the improved performance.
For example, LambdaMetafactory could try a shared-class approach (storing a MemberName or MethodHandle in final fields and creating a class only if the interface differs) to reduce class loading pressure when a few interfaces have a lot of implementations. Already in action are ObjectMethods, where the object methods of records are being improved over time, and StringConcatFactory, which falls back to StringBuilder if the concatenation is too complex.
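To see that the code shape really comes from the library and not from the compiled class, one can call a bootstrap method by hand and invoke the resulting call site. The sketch below does roughly what the JVM does when it links a string concatenation compiled to invokedynamic; the class name IndyBootstrapSketch and the greeting recipe are made up for this example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class; it calls the real StringConcatFactory bootstrap method directly.
final class IndyBootstrapSketch {

    static String greet(String who) {
        try {
            // In the recipe, the character \1 marks where the dynamic argument goes.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class),
                    "Hello, \1!");
            // The target has exactly the requested type (String)String.
            return (String) site.getTarget().invokeExact(who);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("world"));
    }
}
```

How the returned handle builds the string is up to the running JDK, which is why the same class file can get a better strategy on a newer release.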
Reflection
We have discussed how reflection worked before invoke: ad-hoc classes were generated for each distinct method, which created a lot of classes. In comparison, JEP 416 creates a MethodHandle that may use a shared LambdaForm if possible; this change might explain the slowdown observed with reflection for non-constant field/method objects. Yet it’s a good tradeoff, as it significantly reduces classloading pressure.