Note: This page is still under construction


java.lang.invoke, also known as JSR 292, is best known for MethodHandle and invokedynamic. It was introduced to support dynamic programming languages, yet it has become increasingly crucial to Java itself. Let’s take a look at its history and its implications.

Before java.lang.invoke

We all know that the usual way to get a MethodHandle is through the Lookup returned by MethodHandles.lookup(), which can find field accessors and methods. But didn’t reflection exist before that? Why couldn’t reflection be used?

Reflection and Unsafe

Before the appearance of invoke, reflection did exist, and this is how it was implemented:

  • Method accessors used ad-hoc bytecode generation, which was replaced by MethodHandle only in JEP 416; as of JDK 23, the infrastructure still exists to support old serialization constructor generation.
  • Field accessors used Unsafe, which later became notorious as a major blocker for upgrades past Java 9. Back then, it was much simpler, with only field access methods using a long offset.

So what does MethodHandle do in comparison? Each MethodHandle has a fixed MethodType. Because the type is known up front, a call through a MethodHandle can be significantly faster than a reflective call, which has to convert its arguments on every invocation. And indeed, each invokedynamic instruction has a fixed MethodType passed to the bootstrap method.
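To make the contrast concrete, here is a minimal sketch that calls String.length() both ways. The class and method names (TypedCallDemo, lengthViaReflection, lengthViaHandle) are made up for illustration; only the JDK calls are real API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical demo class, not part of the JDK.
final class TypedCallDemo {
    // Reflection: arguments travel as Object (or Object[]) and the int result is boxed.
    static int lengthViaReflection(String s) {
        try {
            Method m = String.class.getMethod("length");
            return (Integer) m.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: the type (String)int is fixed when the handle is created.
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static int lengthViaHandle(String s) {
        try {
            // The call site compiles to the descriptor (String)int, matching the handle exactly.
            return (int) lengthHandle().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthHandle().type());
        System.out.println(lengthViaReflection("hello") + " " + lengthViaHandle("hello"));
    }
}
```

Looking the handle up on every call, as done here for brevity, throws away most of the benefit; in real code the handle would be cached in a constant.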

Reading the java.lang.invoke code

Entrypoints from the VM

Since invocation happens from the VM, it would be helpful to find where the call sequences start. The entrypoints to the whole invoke system are these 3 methods in MethodHandleNatives:

  • linkCallSite: Links a CallSite, i.e. an invokedynamic instruction
  • linkMethod: Links a signature-polymorphic method in MethodHandle (invokeExact or invoke) or VarHandle (access methods)
  • linkDynamicConstant: Resolves a CONSTANT_Dynamic to a constant value

linkCallSite and linkMethod return a MemberName which points to infrastructure static methods, mostly in dynamically-generated LambdaForms (see InvokerBytecodeGenerator and Invokers too). They can also point to pregenerated bytecode, such as to VarHandleGuards methods for VarHandle, or to Invokers$Holder from pregeneration (via CDS or jlink).

Back into the VM

The execution of course comes back into the JVM. The hooks are all in MethodHandle:

  • invokeBasic: Used by LambdaForm code generation to easily invoke nested MethodHandles, such as ones with bound arguments (BoundMethodHandle); essentially the same as invokeExact or invoke but without type conversions, as all types are “basic types” (loadable types)
  • linkToVirtual, linkToStatic, linkToSpecial, linkToInterface: The most basic calls used by java.lang.invoke. Used by DirectMethodHandle.preparedLambdaForm to simulate invokevirtual, invokestatic, invokespecial, invokeinterface calls. However, they are more powerful, as they can link to hidden classes with the trailing MemberName argument while Java bytecode cannot.
    • In addition, linkToStatic is explicitly used in VarHandleGuards to invoke static methods when there are many MemberName possibilities.
  • linkToNative works much like the other link methods, except it takes a trailing NativeEntryPoint. Used by NativeMethodHandle.preparedLambdaForm.

LambdaForm

Being thousands of lines long, LambdaForm is daunting to dig through. However, if you are a bytecode guru, you can check out InvokerBytecodeGenerator which converts LambdaForm to hidden classes. Also check out preparedLambdaForm in a few MethodHandle implementations. Luckily, LambdaForm is a well encapsulated class, so understanding its upstream and downstream can give you a good grasp of what it does before you dive in.

MethodType

MethodType seems simple on the surface: just a return type plus an array of parameter types. So why do we need it?

It turns out that MethodType encapsulates some complex logic too. One piece is its invokers, which dictate how polymorphic methods with its type should be invoked. In addition, it is interned, just like the Strings for method and class names in reflection. It also has some logic for erasure to “basic types” (similar to the loadable types in bytecode) to reduce the number of LambdaForms and the amount of code generation.
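The internal erasure to basic types is not exposed, but the public MethodType.erase() gives a rough feel for it: every reference type collapses to Object while primitives are kept. The sketch below uses only public API; the class name MethodTypeDemo is made up, and the public erasure should be read as an approximation of what LambdaForm does internally, not the same operation.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class MethodTypeDemo {
    // (List,int)String
    static final MethodType EXACT =
            MethodType.methodType(String.class, List.class, int.class);

    // Building the same shape again yields an equal MethodType.
    static MethodType rebuilt() {
        return MethodType.methodType(String.class, List.class, int.class);
    }

    // Reference types become Object, primitives stay: (Object,int)Object
    static MethodType erased() {
        return EXACT.erase();
    }

    public static void main(String[] args) {
        System.out.println(EXACT.toMethodDescriptorString());
        System.out.println(erased().toMethodDescriptorString());
        System.out.println(EXACT.equals(rebuilt()));
    }
}
```

Many distinct method types share one erased shape, which is why erasing first lets the runtime reuse generated code.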

Best practices

MethodHandle and VarHandle

When using MethodHandle and VarHandle, prefer to keep them as constants (another good topic to dive into later), such as in static final fields.

Always prefer calling invokeExact; this method is the fastest. A call to invoke, in contrast, may call asType every time the handle is invoked with a different type, and even if the asTypeCache doesn’t miss, the cached handle cannot be inlined, since it is held in a soft reference instead of a constant.
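A minimal sketch of both recommendations, assuming a handle to String.concat kept in a static final field. The class ExactInvokeDemo and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class, not part of the JDK.
final class ExactInvokeDemo {
    // Kept as a constant so the JIT can treat the handle as a known target.
    static final MethodHandle CONCAT;
    static {
        try {
            CONCAT = MethodHandles.lookup().findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Call site type (String,String)String equals CONCAT.type(): no adaptation needed.
    static String exact(String a, String b) {
        try {
            return (String) CONCAT.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Call site type (Object,Object)Object differs, so invoke adapts it as if by asType.
    static Object generic(Object a, Object b) {
        try {
            return CONCAT.invoke(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invokeExact refuses the mismatched call site type instead of adapting it.
    static boolean exactRejectsMismatch() {
        try {
            Object unused = CONCAT.invokeExact((Object) "a", (Object) "b");
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        } catch (Throwable t) {
            return false;
        }
    }
}
```

The casts at an invokeExact call site are not cosmetic: the static types of the arguments and the cast on the result are what the compiler records as the call site’s type.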

Similarly, when declaring a VarHandle, finish the declaration with a call to withInvokeExactBehavior(). Otherwise, the VarHandle will suffer from similar performance penalties if called with a suboptimal type (JDK-8160821).
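As a sketch of that advice (JDK 16 or later), the hypothetical class below declares a VarHandle for an int field with invoke-exact behavior. A call whose argument types differ from the access mode type then fails fast with WrongMethodTypeException instead of being silently adapted.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class, not part of the JDK.
final class ExactVarHandleDemo {
    private int count;

    static final VarHandle COUNT;
    static {
        try {
            COUNT = MethodHandles.lookup()
                    .findVarHandle(ExactVarHandleDemo.class, "count", int.class)
                    .withInvokeExactBehavior();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int setThenAdd() {
        ExactVarHandleDemo d = new ExactVarHandleDemo();
        COUNT.set(d, 1);                             // (ExactVarHandleDemo,int)void
        int previous = (int) COUNT.getAndAdd(d, 1);  // the cast fixes the return type to int
        return (int) COUNT.get(d);
    }

    // A short argument would be widened in the default mode; exact mode rejects it.
    static boolean rejectsShortArgument() {
        ExactVarHandleDemo d = new ExactVarHandleDemo();
        try {
            COUNT.set(d, (short) 1);
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        }
    }
}
```

Note that in exact mode even the result cast matters: dropping the (int) cast on getAndAdd changes the call site’s return type and makes the call fail.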

Dynamic constants

Compared to invokedynamic bootstrap methods, which are scattered across many classes (LambdaMetafactory, StringConcatFactory), the ConstantBootstraps class provides many bootstrap methods for general-purpose dynamic constants that are otherwise not representable in the constant pool, such as nullConstant and primitiveClass, for use in bootstrap method arguments. Two are particularly useful: getStaticFinal and invoke can turn otherwise eagerly initialized static final fields in a class into lazy constants, reducing class initialization cost.
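Normally the JVM calls these bootstrap methods while resolving a CONSTANT_Dynamic entry, which plain Java source cannot emit. Since they are ordinary public static methods, though, calling them by hand is a quick way to see what each one produces. The sketch below does exactly that; the class CondyBootstrapDemo is hypothetical, and the direct calls only stand in for what the JVM would do once per constant.

```java
import java.lang.invoke.ConstantBootstraps;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class, not part of the JDK.
final class CondyBootstrapDemo {
    // A typed null; the name argument is ignored.
    static Object nullString() {
        return ConstantBootstraps.nullConstant(MethodHandles.lookup(), "_", String.class);
    }

    // The name is a field descriptor: "I" resolves to int.class.
    static Class<?> intClass() {
        return ConstantBootstraps.primitiveClass(MethodHandles.lookup(), "I", Class.class);
    }

    // Reads the static final field Integer.MAX_VALUE (boxed, as the result is an Object).
    static Object maxInt() {
        return ConstantBootstraps.getStaticFinal(
                MethodHandles.lookup(), "MAX_VALUE", int.class, Integer.class);
    }

    // Computes a constant by invoking a handle with static arguments.
    static Object parsed() {
        try {
            MethodHandle parseInt = MethodHandles.lookup().findStatic(
                    Integer.class, "parseInt", MethodType.methodType(int.class, String.class));
            return ConstantBootstraps.invoke(
                    MethodHandles.lookup(), "_", int.class, parseInt, "42");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```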

Hidden classes

Hidden classes began with Unsafe.defineAnonymousClass, which defined “VM anonymous classes”; they indeed began with invoke, as they were first used for LambdaForm implementations. Now, they have been promoted to a standalone Hidden Classes feature usable by all Java programs.

NestMates

From JEP 181:

The notion of a common access control context arises in other places as well, such as the host class mechanism in Unsafe.defineAnonymousClass(), where a dynamically loaded class can use the access control context of a host. A formal notion of nest membership would put this mechanism on firmer ground (but actually providing a supported replacement for defineAnonymousClass() would be a separate effort.)

How unexpected! Nestmates come from VM anonymous classes. Indeed, in current invoke, the generated LambdaForm$ hidden classes still have LambdaForm as their host class, though they are not nestmates.

An anecdote about nests is that they were created to enable generic specialization by subclassing in Project Valhalla. (Treat this claim with doubt, since I have forgotten the source.)

The nests also greatly simplified some Java design patterns. For example, before nests:

private static class Holder {
    // Deliberately not private; new Object() is only a placeholder value
    static final Object instance = new Object();
}

The field declaration avoided private because otherwise the Java compiler had to generate a synthetic accessor method for the enclosing class to read instance. It had always generated such accessor methods for private members shared between enclosing and inner classes, because these concepts don’t exist in the JVM (only packages exist).
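With nests (JDK 11 and later), the field can simply be private. The following sketch, using a hypothetical NestDemo class, checks the nest relationship through the reflection API and looks for leftover synthetic methods. It assumes compilation with javac targeting 11 or newer.

```java
import java.lang.reflect.Method;

// Hypothetical demo class, not part of the JDK.
final class NestDemo {
    private static class Holder {
        // private is fine now: nestmates may access each other's private members directly
        private static final Object INSTANCE = new Object();
    }

    static Object instance() {
        return Holder.INSTANCE; // direct field access, no accessor method in between
    }

    static Class<?> holderNestHost() {
        return Holder.class.getNestHost();
    }

    static boolean areNestmates() {
        return NestDemo.class.isNestmateOf(Holder.class);
    }

    // Older targets would show a synthetic access$NNN method on Holder here.
    static boolean holderHasSyntheticMethod() {
        for (Method m : Holder.class.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                return true;
            }
        }
        return false;
    }
}
```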

Another anecdote is that a MethodHandle can be created for a nested enum constructor and can be called without any problem, while doing so is prohibited by reflection.

ClassData

Class data is any object passed to MethodHandles$Lookup.defineHiddenClassWithClassData(). Compared to passing the data elsewhere such as via ThreadLocal, using class data is more thread safe and less costly.

Since there are hidden classes, class data becomes necessary, as not all MethodHandle instances are representable by bytecode instructions. LambdaForm classes use class data to represent other hidden classes and MemberName for hidden class members.

Class data is usually accessed in generated code with MethodHandles.classData. Its signature is intentionally compatible with that of a bootstrap method, so that the class data can be loaded as a dynamic constant instead of calling this method at each use site. (Note that InvokerBytecodeGenerator does not use condy, as LambdaForm has to be ready before condy is available for use, so it stores the values in static final fields instead.)

There’s an additional MethodHandles.classDataAt, but calling List.get(int)Object is preferable in actual bytecode to prevent spamming up the constant pool; classDataAt is mostly for supplying bootstrap method arguments.

Other attributes

Other important attributes of hidden classes include:

  • Omission in stack traces by default
  • Not modifiable by instrumentation
  • Final fields are automatically “trusted” (part of constants)
  • Class no longer discoverable by Class.forName

These pose risks for migration of regular generated classes to hidden classes.

Impact of java.lang.invoke

invokedynamic

Initially created to allow dynamic programming languages to better resolve calls (like Groovy’s closures), indy is also noted for its ability to provide distinct implementations on different VMs; just like library methods that evolve over time, older code using indy will use the modern code shape provided by the bootstrap method at run time, enjoying improved performance without recompilation.

For example, LambdaMetafactory could try a shared-class approach (storing a MemberName or MethodHandle in final fields and creating a class only if the interface differs) to reduce class loading pressure when a few interfaces have a lot of implementations. Already in action are ObjectMethods, where the implementations of records’ object methods are being improved over time, and StringConcatFactory, which falls back to StringBuilder if the concatenation is too complex.
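To see that the code shape really is chosen at run time, one can call a bootstrap method by hand, the way the JVM does when it links an invokedynamic instruction. The sketch below asks StringConcatFactory for a call site that concatenates a String and an int; the class IndyConcatDemo is hypothetical, and "\1" in the recipe marks an argument slot.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class, not part of the JDK.
final class IndyConcatDemo {
    static String describe(String name, int value) {
        try {
            // Roughly what javac emits as an indy call site for: name + "=" + value
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class, int.class),
                    "\1=\1");
            // The target is whatever strategy the running JDK picked.
            MethodHandle target = site.dynamicInvoker();
            return (String) target.invokeExact(name, value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("x", 1));
    }
}
```

The caller only fixes the type (String,int)String and the recipe; how the string is actually assembled is left to the JDK that runs the code.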

Reflection

We have discussed how reflection worked before invoke: an ad-hoc class was generated for each different method, which created a lot of classes. In comparison, JEP 416 creates a MethodHandle that may use a shared LambdaForm if possible; this change might explain the slowdown observed with reflection for non-constant field/method objects. Yet it’s a good tradeoff, as it significantly reduces class loading pressure.