The selection of the fields stored inside the token and of the methods secured by call protection is an important step in protecting an application. It is currently left to the developer of the application[6]; some guidance on this choice is given below.
Selecting a field for storage in the secure coprocessor causes the translator to perform the following operations:
The translator replaces the field by a pointer and inserts code in the constructor to allocate space in the memory of the coprocessor to hold the value of the field and its associated tag.
The translator transforms instructions located around the accesses to the field so that they are executed by the coprocessor.
The translator inserts instructions to exchange values between the main processor and the coprocessor where needed.
The transformations have two consequences on the application. On one hand, the application becomes potentially more difficult to crack. More information about its state and the operations it performs is hidden from a potential cracker. On the other hand, the application takes a performance hit. The coprocessor is slower than the main processor and the communication bus introduces latency when the main processor must retrieve a value from the coprocessor. A good selection must balance these two conflicting consequences. Below are some characteristics of fields that are good candidates for storage in the coprocessor.
A field that has a long lifetime relative to the runtime of the application and that is modified and used several times over this lifetime is a good candidate. The analysis of the values returned by the coprocessor is more difficult if set and get operations are dissociated and performed over a long period.
A field that contains a configuration value or a state variable of the application is a good candidate. Such a field is often assigned constant values that need not be transmitted at runtime. Instead the constants are encoded inside ciphered instructions at recompilation time and the instructions that set the value of the field are completely indistinguishable from other ciphered instructions. The change is not linked to the transmission of a given value to the coprocessor but to the execution of a given branch of the code. The value of such a field is also often used to drive the flow of control of the application. When a conditional branch uses the value of the field in a comparison, the translator relocates the comparison inside the coprocessor and its boolean result is the only information that is returned to the main processor.
A field that triggers the execution of heavy numerical operations in the coprocessor is likely to cause performance problems. However if a computation takes some parameters that can be precomputed from configuration values, these parameters and the code that precomputes them can be relocated in the coprocessor. The precomputed values are then retrieved and used in the heavy computation that is performed by the main processor.
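As an illustration of the second characteristic above, the sketch below shows a state variable that is only ever assigned constants and only read in comparisons. The Player class and its constants are hypothetical. The local SecureField declaration is a stand-in written here so that the sample compiles without vldy-tech-annot.jar; a real application would import the annotation from com.validy.technology.annotation instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Stand-in for com.validy.technology.annotation.SecureField (assumption:
// declared locally only so that this sample compiles on its own).
@Target(ElementType.FIELD)
@interface SecureField {}

// Hypothetical application class.
class Player {
    static final int STOPPED = 0;
    static final int PLAYING = 1;

    // Long-lived state variable: assigned constants only, read in comparisons only.
    @SecureField
    private int state = STOPPED;

    // Constant assignments: candidates for encoding inside ciphered instructions.
    void play() { state = PLAYING; }
    void stop() { state = STOPPED; }

    // Comparison that drives the flow of control: a candidate for relocation
    // inside the coprocessor, with only the boolean result returned.
    boolean isPlaying() { return state == PLAYING; }
}
```

Before translation this is ordinary Java and behaves as written; the comments only indicate which operations the translator is described as relocating.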
There is no tool yet to evaluate automatically the quality of
protection afforded by a given choice of secured fields. One key point
must be kept in mind when trying to estimate this quality: The result
of all the information hiding performed by the translator is embodied
in the streams of values flowing from the coprocessor to the main
processor as the application is executed. These streams are the values
returned by the calls to the in method of the
runtime. There is one stream for each thread of execution making such
calls. A given selection of fields can provide a weak protection in
two different cases:
A cracker can guess the future values in the streams by
analyzing the streams produced by the application execution up to
a given point in time. This indicates that the hidden part of the
application is not complex enough. The log of the virtual
coprocessor can be used to extract the streams of
in values. The log record of a call to
in is prefixed with the
< character. It contains the
retrieved value and the id of the thread in square brackets. The
code of the ciphered instruction executed right before the call to
in can be used to further classify the
streams by in call site.
Altering the stream does not make the application unusable. This indicates that the fields hidden inside the coprocessor are not significant enough to the behavior of the application.
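The first case can be probed informally once a stream of in values has been extracted from the log. The sketch below is not part of the Validy tools: the class and method names are hypothetical, and the predictor (the next value repeats the last observed difference) is just one very naive guess among many that a cracker could try. A hit rate close to 1 on a real stream would suggest that the hidden part of the application is not complex enough.

```java
import java.util.List;

// Hypothetical helper, not part of the translator or of the virtual coprocessor.
class StreamPredictability {

    // Fraction of values correctly guessed by a naive predictor that assumes
    // the next value continues the last observed difference.
    static double deltaPredictorHitRate(List<Integer> stream) {
        if (stream.size() < 3) {
            return 0.0; // not enough history to make a single guess
        }
        int hits = 0;
        for (int i = 2; i < stream.size(); i++) {
            int previous = stream.get(i - 1);
            int guess = previous + (previous - stream.get(i - 2));
            if (guess == stream.get(i)) {
                hits++;
            }
        }
        return (double) hits / (stream.size() - 2);
    }
}
```

The list passed to deltaPredictorHitRate is assumed to hold the values of a single stream, that is, the in values of one thread, in the order in which they were returned.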
In the evaluation version of the Validy SoftNaos translator and virtual coprocessor package, only scalar fields of type byte, short, or int and Java 5 enumerated types can be stored inside the coprocessor. We plan to add support for the following types in the next versions of the tool.
Boolean, and possibly char types.
Long integers.
Arrays of the preceding types.
When a field of an enumerated type is stored inside the coprocessor, the value that is actually stored is the ordinal of the enum constant. The ordinal is the integer rank of the constant among all the constants defined for the enum type. If the field is null, a negative integer is stored in place of the ordinal. When an enum field is stored inside the coprocessor, the following operations are performed inside the token:
assignment of a constant or another enum local or field to the secured field
comparison of the secured field with a constant or another enum local or field
If the enum is used as the key in a switch statement, the ordinal is retrieved from the coprocessor and used in place of the constant. For operations that require access to the enum constant object itself, such as displaying the string representation of the constant, the ordinal is retrieved and the corresponding enum constant instance is used by the translator.
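The mapping between an enum constant and the integer stored in the coprocessor can be pictured with plain Java. This is only a sketch: the class, the Mode enum, and the method names are hypothetical, and the value -1 for null is an assumption, since the text above only states that a negative integer is stored.

```java
// Hypothetical illustration of the ordinal mapping; not generated by the translator.
class EnumOrdinalCodec {

    enum Mode { OFF, DEMO, FULL } // hypothetical enum type

    // Assumption: any negative value would do, -1 is chosen here.
    static final int NULL_ORDINAL = -1;

    // Value that would be stored in place of the enum reference.
    static int encode(Enum<?> constant) {
        return constant == null ? NULL_ORDINAL : constant.ordinal();
    }

    // Recovers the enum constant instance from the stored integer.
    static <E extends Enum<E>> E decode(Class<E> type, int stored) {
        return stored < 0 ? null : type.getEnumConstants()[stored];
    }
}
```

Because the stored integer is the rank of the constant in its declaration, inserting a constant before DEMO would change the meaning of a previously stored 1.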
To simplify the association between storage locations and tags inside the coprocessor, the minimal unit of memory allocation is the 32 bit integer. A field of type byte or short stored inside the coprocessor occupies the same space as a field of type int.
One provision of the specification for the Java language in the binary compatibility chapter raises several difficulties for the translator. §13.4.26 Evolution of Enums states that “Adding or reordering constants from an enum type will not break compatibility with pre-existing binaries.” However, because the actual value stored inside the coprocessor is the ordinal of the enum, this guarantee cannot be maintained: a class that contains a secured enumerated field must be recompiled and run through the translator again whenever enum constants are added, reordered or removed.
Another consequence of the same provision is that when an enum
type is compiled, the Java compiler must synthesize special code to
handle the mapping between constants and their ordinal. This code is
not part of the specification of the Java language or the Java
virtual machine and is thus compiler dependent. At this time, the
Validy Technology translator supports and has been tested with
bytecodes generated by Sun's javac compiler
(version 1.5.x and 1.6.x) and by the Eclipse internal compiler
(version 3.2.x). Please contact Validy if you need support for
another compiler.
The current implementation of the translator does not
automatically select methods for call protection. The developer must
mark methods with the SecureMethod
annotation or list them in a separate text file (see option markers).
There is no need to mark the call sites. They are identified automatically by the translator.
Because methods are secured in groups (see Section 2.4, “Call protection”), the methods that can be selected must satisfy a few constraints.
All the call sites must be transformed by the translator; this rules out interfaces or virtual methods defined outside the application because some of their call sites may not be accessible to the translator[7].
Methods that are called directly by the virtual machine
cannot be secured; this includes class constructors,
finalize methods, and the
writeObject/readObject
methods used for serialization.
Methods called using reflection cannot be secured.
A slightly different constraint addresses how (rather than which) method groups should be marked so that all members of the group are identified by the translator. A method should always be marked in the highest[8] class or interface in which it is defined. Consider the following counterexample:
abstract class A {
    public abstract void f();
}
class B extends A {
    @SecureMethod
    public void f() {
    }
}
class C extends A {
    public void f() {
    }
}
If C is processed before or independently
from B, C.f will not be
treated as secured and the application will fail if
C.f is called through a reference to
A. The same problem occurs if
A is defined as an interface and the
extends keywords are replaced by
implements keywords.
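Following the rule stated above, the counterexample is repaired by moving the mark to the highest class in which f is defined. The sketch below reuses the classes of the counterexample; the local SecureMethod declaration is a stand-in written here so that the sample compiles without vldy-tech-annot.jar.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Stand-in for com.validy.technology.annotation.SecureMethod (assumption:
// declared locally only so that this sample compiles on its own).
@Target(ElementType.METHOD)
@interface SecureMethod {}

abstract class A {
    // The mark sits on the highest declaration of f, so B.f and C.f
    // belong to the marked group whichever class is processed first.
    @SecureMethod
    public abstract void f();
}

class B extends A {
    public void f() {
    }
}

class C extends A {
    public void f() {
    }
}
```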
Some method groups require several marks in order to be
properly identified by the translator. In the example below,
A.f and D.f belong to the
same group because:
Method A.f and B.f can be called from the same site
(invokeinterface I.f) since B.f is the implementation of
I.f for class C.
Method B.f and D.f can be called from the same site
(invokevirtual B.f).
Both SecureMethod annotations below are
required.
interface I {
    @SecureMethod
    void f();
}
class A implements I {
    public void f() {
    }
}
class B {
    @SecureMethod
    public void f() {
    }
}
class C extends B implements I {
}
class D extends B {
    public void f() {
    }
}
}
Within the constraints described above, the only limit to the number of secured method calls is performance degradation. However, secured calls have less impact on performance than secured fields for the following reasons:
Secured calls do not allocate extra memory in the coprocessor.
Secured calls do not retrieve any value produced by the coprocessor. Their only potential effect is a failure of the application if it is tampered with.
The selection of methods one at a time for call protection can be tedious. A call graph builder was developed to simplify the process. This tool computes call graphs starting from entry point methods, checks that the call graph will be linked to the original secured code, and provides the list of methods that have to be secured. This is a standalone tool but its output can be used directly as an input for the translator and the provided Ant tasks allow easy integration.
Using the secure call graph builder has the following advantages over manual selection of methods to secure:
by selecting only one entry point, many methods are secured
at runtime, the selected methods automatically form a secured call stack rooted at the entry point
by default, the tool outputs only the methods of the call graphs that are actually linked to some of the original secured code. If they were not, all the code added to freeze the graph could be removed in one block without affecting the application. For a graph to be secure, the code of at least one of its methods must access a secured field, allocate an object with a secured field, or make a call to a secured API. A call graph that shares methods with another secure call graph is also secure.
The call graphs computed by the builder are geared to their use for call protection. Compared to complete Java call graphs, they have the following limitations:
Only explicit method calls are explored. Implicit calls such as calls to class initializers are not considered because they cannot be secured. The calling context is not completely under control of the translator so that a proper secured call context cannot be created.
Only methods in application classes are explored. Building the call graph further is unnecessary since the bytecode outside of application classes cannot be, or is not allowed to be, modified to perform secured calls.
No type analysis is performed to prune the method implementations that cannot actually be called from a given site. This analysis could produce a more precise call graph but for call protection, all the methods that share at least one common call site must be secured in one group. Proving that a method implementation cannot be called from a given site is not enough. The correct verification would require a global analysis at the application level to show that the implementation is never called.
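The kind of exploration described above can be pictured as a plain reachability computation. The sketch below is not the Validy call graph builder: the class, the method names, and the representation of methods as strings are hypothetical. The map of calls is assumed to contain only explicit calls between application methods, with every implementation that may be reached from a call site listed as a callee.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a call graph exploration for call protection.
class CallGraphSketch {

    // Returns every method reachable from the entry point through explicit calls.
    static Set<String> reachable(Map<String, List<String>> calls, String entryPoint) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(entryPoint);
        while (!work.isEmpty()) {
            String method = work.remove();
            if (!seen.add(method)) {
                continue; // already explored, also protects against recursion
            }
            for (String callee : calls.getOrDefault(method, Collections.emptyList())) {
                work.add(callee);
            }
        }
        return seen;
    }

    // A graph is worth securing only if at least one of its methods is linked
    // to secured code (secured field access, allocation, or secured API call).
    static boolean isSecure(Set<String> graph, Set<String> linkedToSecuredCode) {
        return !Collections.disjoint(graph, linkedToSecuredCode);
    }
}
```

With this representation, selecting a single entry point yields the whole set of methods to secure, which is the first advantage listed above.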
The com.validy.technology.annotation package
defines three annotation classes, SecureField,
SecureMethod, and
SecureCallGraph. To use these annotations, you
must add the vldy-tech-annot.jar file to your
class path. The last annotation is used only by the call graph
builder, the translator simply ignores it and removes it from the
bytecode it transforms.
The SecureField annotation takes one
optional parameter called registerNumber.
Instance fields have as many copies as there are objects of the class
they belong to. They can only be allocated in the heap of the
coprocessor and they are accessed using an offset from the
k$this pointer. Static fields on the other
hand are unique and can be allocated in one of two ways: in the heap
or in a global register. The allocation of static fields in the heap
is similar to that of an instance field. A chunk of coprocessor memory
is allocated in the class constructor and stored in a static pointer
called k$class. This is the default
behavior. When a register number is specified, memory is not allocated
and the value of the field is stored permanently in the given global
register.
Storing a static field in a register has the following advantages:
Accesses to the field are slightly more efficient because stores and loads are not needed.
If all secured static fields of a class are put into registers, k$class is not defined. There is no hint that the class has secured static fields.
The characteristics of the coprocessor limit the number of
static fields that can be allocated in global registers. The virtual
coprocessor has 16 global registers numbered from 32 to 47. With the
annotation below, the counter field of
class A is allocated to register 40.
class A {
@SecureField(registerNumber=40)
private static int counter = 0;
...
}
The SecureField,
SecureMethod, and
SecureCallGraph annotations accept an optional
boolean parameter called enabled whose
default value is true. When this parameter is set to false, it
instructs the translator not to secure the given
field or method, or to output a list of methods
not to secure in the case of the call
graph.
The need to annotate an object to avoid protection seems paradoxical since objects that are not annotated at all are not secured. However using markers (see Section 2.5, “Markers”), it is possible to use automated tools to produce lists of fields or methods selected for protection and thus the same object may have several annotations. Adding a disabled annotation is useful when an automatically selected object must not be changed by the translator for a reason not taken into account by the tool. If an object has at least one disabled annotation, the translator will not change it.
For example, if the method g from class
A is selected by a tool that automates call
protection but must remain unchanged because it is called through
reflection in some part of the application, the following annotation
makes sure the translator does not apply call protection to it:
class A {
@SecureMethod(enabled=false)
public int g() {
...
}
...
}
Listing fields and methods in a markers file is an alternative to using annotations and the only possible method for Java 1.4. It can also be used to accept the output of a tool that selects fields or methods automatically for protection.
To mark a field or a method, its containing class or interface name, its name and its signature must be listed on one line separated by spaces. The class name and the signature must follow the form defined in the Java Virtual Machine specification. Lines starting with a + character correspond to enabled annotations while lines starting with a - character correspond to disabled ones. If the + or - character is omitted, + is assumed by default.
Lines starting with a # character are ignored and can be used to add comments.
For example, to produce the same result as the following annotations:
package com.validy.sample;
import com.validy.technology.annotation.*;
public class Test {
    @SecureField
    private int value;
    @SecureMethod
    public final void f(String name) {
    }
    @SecureMethod(enabled=false)
    public int g() {
        return value;
    }
}
the following lines should be added to the markers file:
+ com/validy/sample/Test value I
+ com/validy/sample/Test f (Ljava/lang/String;)V
# g is called through reflection in ...
- com/validy/sample/Test g ()I
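The rules above (one entry per line, optional + or - prefix, # for comments, a disabled mark overriding any enabled one) can be summarized by a small reader. This is a sketch written for this explanation, not the parser used by the translator: the class and method names are hypothetical, and skipping blank lines and rejecting lines that do not have exactly three items are assumptions.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical reader for the markers format described above.
class MarkersFile {

    // Returns the "class name signature" entries that remain selected for protection.
    static Set<String> enabledEntries(List<String> lines) {
        Set<String> enabled = new LinkedHashSet<>();
        Set<String> disabled = new HashSet<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // comment (blank lines are skipped by assumption)
            }
            boolean on = true; // + is assumed when the prefix is omitted
            if (line.startsWith("+") || line.startsWith("-")) {
                on = line.charAt(0) == '+';
                line = line.substring(1).trim();
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed marker: " + raw);
            }
            String entry = parts[0] + " " + parts[1] + " " + parts[2];
            (on ? enabled : disabled).add(entry);
        }
        // One disabled mark is enough to leave the field or method unchanged.
        enabled.removeAll(disabled);
        return enabled;
    }
}
```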
Serialization is a way of flattening the state of objects to
produce a data stream. This stream can then be stored on disk or
transmitted over the network and later parsed to reconstruct a copy
of the original objects. The Java language's builtin support for
serialization is described in the Java
Object Serialization Specification. Even though it can be
customized by implementing special methods
(writeObject and
readObject), the support for serialization
is largely declarative in nature:
classes whose instances need to be serialized are simply
marked by having them inherit from the
java.io.Serializable
interface,
fields in these classes that should not be serialized are marked with the transient modifier.
The Java runtime is responsible for the actual process of serializing and deserializing object instances. It works by inspecting the definition of classes at runtime using reflection to decide what should be put in the output stream and what should be done with the content of the input stream. The declarative nature of Java serialization makes it easy to use even if its interaction with class evolution requires careful planning. The Validy SoftNaos translator strives to retain this simplicity. No extra work is required to support the serialization of classes with secured fields or methods. However in the absence of runtime support, the translator must use the customization hooks provided by the Java language to implement the secure serialization and deserialization of fields stored inside the token. The rest of this section presents this implementation and its limitations. It assumes some familiarity with Java object serialization.
When one or more fields are marked for storage inside the
secure token, the translator removes their definition from the class
and they are replaced by a pointer to token memory named
k$this[9]. The Java runtime does not have enough information to
be able to serialize or deserialize the object. Therefore, the
translator must provide special support for serialization by:
overriding the default definition of persistent fields
through the declaration of a custom
serialPersistentFields (or the
modification of an existing declaration),
implementing custom writeObject
and readObject methods (or modifying
existing implementations).
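For readers less familiar with these hooks, the sketch below shows the same two mechanisms written by hand with the standard serialization API. It is not the code generated by the translator: the class is hypothetical, a transient int stands in for the token memory referenced by k$this, and the 4 byte block is a placeholder for the ciphered block produced by the token.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical class whose only persistent field lives outside the object.
class SecuredCounter implements Serializable {
    private static final long serialVersionUID = 1L;

    // Custom definition of the persistent fields: "count" is stored as an opaque block.
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("count", byte[].class)
    };

    // Stand-in for the value held in token memory (assumption for this sketch).
    private transient int hiddenCount;

    SecuredCounter(int count) { hiddenCount = count; }

    int count() { return hiddenCount; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("count", toBlock(hiddenCount));
        out.writeFields();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        hiddenCount = fromBlock((byte[]) fields.get("count", null));
    }

    // Placeholder encoding (big-endian bytes), standing in for the ciphered block.
    private static byte[] toBlock(int v) {
        return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
    }

    private static int fromBlock(byte[] b) {
        return ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16) | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
    }

    // Serializes and deserializes a copy in memory, for demonstration.
    static SecuredCounter copy(SecuredCounter original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (SecuredCounter) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that the name listed in serialPersistentFields does not have to match a field that still exists in the class, which is what allows a removed field to keep its place in the stream.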
When a secured field is serialized, its value is first loaded from the token memory to a token register, concatenated with a random nonce, checksummed, ciphered, and the resulting block is retrieved from the token to be put in the output stream. By default, when a field is deserialized, the ciphered block is sent to the token, deciphered, the checksum is verified, the nonce discarded, and if the value has not been tampered with, it is stored in the token memory at the proper location. The operations performed on the value of a field before serialization (nonce, checksum, cipher) are added to:
hide the actual value of the field when it is retrieved and stored outside of the token,
make it impossible to know whether a field has changed value or not between two serializations of the same object,
make it difficult for an attacker to tamper with the serialized value to try to alter the behavior of the secured application.
These operations use instructions from the token virtual machine and benefit from the same protection as other instructions generated by the translator (linked using tags and ciphered). The key used to cipher and decipher serialized values is stored inside the token at customization time and is independent from the one used to decipher instructions. Because of the nonce and checksum, a byte, short or int value that occupies one 32 bit word in the memory of the token is serialized as an 8 byte array.
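The sequence nonce, checksum, cipher can be sketched on the main processor with standard Java classes to show how a 32 bit value ends up as an 8 byte block. Everything specific in this sketch is an assumption made for illustration: the layout (4 byte value, 2 byte nonce, 2 byte checksum), the truncated CRC-32 checksum, and the use of triple DES in ECB mode, chosen only because its block is 64 bits wide. The algorithms actually used inside the token are not described in this document, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.zip.CRC32;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical sketch; the real operations run inside the token.
class SealedFieldSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Stand-in for the serialization key stored in the token at customization time.
    static SecretKey newKey() {
        try {
            return KeyGenerator.getInstance("DESede").generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // value | nonce | checksum, then ciphered as a single 8 byte block.
    static byte[] seal(int value, SecretKey key) {
        short nonce = (short) RANDOM.nextInt(1 << 16);
        ByteBuffer plain = ByteBuffer.allocate(8).putInt(value).putShort(nonce);
        plain.putShort(checksum(plain.array(), 6));
        return crypt(Cipher.ENCRYPT_MODE, key, plain.array());
    }

    // Deciphers the block, verifies the checksum, discards the nonce.
    static int open(byte[] block, SecretKey key) {
        ByteBuffer plain = ByteBuffer.wrap(crypt(Cipher.DECRYPT_MODE, key, block));
        if (plain.getShort(6) != checksum(plain.array(), 6)) {
            throw new IllegalArgumentException("Checksum mismatch: block was altered");
        }
        return plain.getInt(0);
    }

    // CRC-32 over the first bytes, truncated to 16 bits (assumption).
    private static short checksum(byte[] data, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, length);
        return (short) crc.getValue();
    }

    private static byte[] crypt(int mode, SecretKey key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("DESede/ECB/NoPadding");
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because a fresh nonce is drawn for each call to seal, two blocks produced for the same value are expected to differ, which corresponds to the second goal listed above.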
By default, streams of objects that were produced before the application was transformed and contain the value of secured fields in clear can be read but the secured fields are not initialized and trying to access them later on triggers an error in the secure token. There are two ways to handle this problem:
a custom readObject method can be
defined that initializes the secured fields with a default value
before calling
defaultReadObject.
an option can be passed on the command line or as
an attribute of the
Ant task to change the default behavior. When this option is
specified, the implementation of
readObject is altered to accept either
ciphered blocks or clear values for secured fields in the input
stream. While ciphered blocks are handled as described above,
clear values are transmitted to the token as is and stored
directly in memory. This makes it possible to deserialize
objects that were serialized before the application was
transformed by the translator at the cost of extra
storage[10].
When this option is used, the protection against tampering becomes ineffective since an attacker can always remove the ciphered block and set the clear value in the serialized stream before it is deserialized.
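The first approach can look like the sketch below. The Settings class, its default value of 1, and the SerializationDemo helper are hypothetical, and the local SecureField declaration is a stand-in so that the sample compiles without vldy-tech-annot.jar. The comments describe the behavior expected after translation, as stated above; run as plain Java, defaultReadObject simply overwrites the default with the value found in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for com.validy.technology.annotation.SecureField (assumption).
@interface SecureField {}

// Hypothetical serializable class with one secured field.
class Settings implements Serializable {
    private static final long serialVersionUID = 1L;

    @SecureField
    private int level;

    Settings(int level) { this.level = level; }

    int level() { return level; }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        // Give the secured field a default first, so that it is initialized
        // even when the stream predates the transformation of the application.
        level = 1;
        in.defaultReadObject();
    }
}

// Hypothetical helper that serializes and deserializes an object in memory.
class SerializationDemo {
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```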
When a field declared in a non serializable class is secured,
the translator removes the declaration of this field from the
bytecode of the application and all accesses to the field are
performed by ciphered instructions of the virtual machine. The only
remaining information is the presence of a
k$this field that indicates the presence
of at least one secured field in the class. However if the class is
serializable, the name and type of the field are declared in the
serialPersistentFields field and appear
in the class constructor's bytecode.
When custom readObject and
writeObject methods are defined, their
special handling by the translator depends on locating the call to
defaultReadObject or
readFields and
defaultWriteObject or
putFields respectively. If one of these
methods is not called, a warning is issued, the methods are handled
normally, and the fields will be serialized in clear.
The current version of the translator has no support for
externalization or secure custom serialization. If an application
class implements
java.io.Externalizable and its
implementation of writeExternal reads the
value of a secured field to stream it, the translator generates code
that retrieves the value of the field in clear from the token. A
future version of the translator may add support for an
assembly-level interface to the virtual machine that would let the
developer insert checksumming and ciphering instructions “by
hand” but still have the translator do register allocation,
set the right tag checks, and cipher the instructions.
The following missing functionalities may be added in future versions of the translator depending on demand:
possibility to remove the secure annotation on a field and still be able to read streams where its value was stored as a ciphered block,
control of code generation for backward compatibility at the class or field level using annotations.
[6] it is expected that future improved versions of the tools will be able to suggest or automatically select good candidates by analyzing the application.
[7] unless you are willing and authorized to distribute modified copies of third party libraries
[8] in the extends or implements relationship
[9] because it references an external resource (the secure
token heap), k$this is marked with
the transient modifier. The translator
generates the code that recreates it upon
deserialization.
[10] both the clear and the ciphered value of each secured
field must be declared in
serialPersistentFields for the
application to be able to retrieve either the clear or the
ciphered value of the field. Because of the way the Java
runtime handles this declaration, both values will be stored
in the output stream when an object is serialized by the
transformed application. The clear value of the field will
always be zero and the ciphered block will contain the
correct value.