Java Memory Model

The Java memory model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language. In single-threaded (sequential) execution, stores to memory occur in some given order (not necessarily the same as program order), and it is obvious that reads should observe the latest write in that order. That is why people usually encounter memory models only for multi-threaded programs, where this question becomes complicated.

The Java Memory Model was an ambitious undertaking: it was the first time a programming language specification attempted to provide consistent semantics for concurrency across a variety of architectures. However, the original memory model (in Java versions 1.0 through 1.4) was widely considered broken by the Java community. It was updated in version 1.5 as part of JSR-133.

Most modern computer architectures have a tiered memory structure. Cache memory (SRAM) is very fast and is physically located close to the processor (L1 sits on the CPU chip closest to the processor cores, L2 sits between the CPU and RAM, and L3/L4, when present, are farther from the cores), whereas accessing main memory involves a data transfer over the memory bus and is significantly slower.

The memory hierarchy, from fastest to slowest, looks roughly like: registers → L1 cache → L2 cache → L3/L4 cache → main memory (DRAM) → disk.

Clearly, as we go down the memory hierarchy, reads and writes become more and more expensive. Hence, for performance reasons, each thread is allowed to copy the state of variables from main memory (DRAM) into cache and work on the cached copy. The cache is guaranteed to synchronize with main memory only when there is a memory barrier enforcing a happens-before guarantee (e.g., synchronized, volatile).

Say two threads, T1 and T2, are executing simultaneously on a multi-core machine. They run on different processors and have dedicated parts of the cache reserved for their use. They also happen to mutate the same object, obj1, and each processor fetches obj1 into its respective cache. So now there are three copies of obj1: one in main memory, one in the cache region for T1, and one in the cache region for T2. This immediately creates a problem: a mutation done by T1 on its copy of obj1 is not visible to T2. This is a classic case of lost updates.

This can be resolved by adding memory barriers, which force the cache to be reloaded from main memory upon entering the barrier and force the cache to be flushed to main memory upon barrier exit. In other words, we explicitly instruct the JVM (and, through it, the hardware) to perform reads and writes of the object against main memory. In Java, this is sometimes called the "guarantee against caching".
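A minimal sketch of this visibility problem and its fix (class and variable names are illustrative): without the volatile modifier on `ready`, the reader thread may cache the flag and spin forever; declaring it volatile inserts the memory barrier described above, so the write becomes visible and the reader is also guaranteed to see the earlier write to `payload`.

```java
public class VisibilityDemo {
    private static volatile boolean ready = false; // the memory barrier: remove volatile and the reader may hang
    private static int payload = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the writer's store becomes visible */ }
            // the volatile read of ready guarantees we also see payload = 42
            System.out.println("payload = " + payload);
        });
        reader.start();
        payload = 42;   // ordinary write ...
        ready = true;   // ... published by the volatile write
        reader.join();
    }
}
```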

When we execute a piece of code, the order in which the instructions execute on the CPU isn't necessarily the same as the order in which they appear in the code. This is because the JVM, the compiler, and even the processor are allowed to reorder instructions (or even eliminate some of them) in order to optimize performance.


public class CodeReordering {
    public static void main(String[] args) {
        int i = 5;
        System.out.println(i);
        i = 10;
        int j = 6;
        System.out.println(j);
        System.out.println(i);
    }
}

For example, in the above code the assignment of i to 10 might be executed after the assignment of j to 6. The JVM notices that the value of i is not used immediately and can defer that statement. When it encounters the second print of i, it performs the deferred assignment before printing.

Note that reordering maintains "as-if-serial" semantics within a single thread: two statements that depend on each other keep their happens-before relation, but two unrelated statements can be executed out of order. The JVM specification allows instruction reordering as long as it does not affect the execution of that thread in isolation. This guarantee holds only for a thread running in isolation; it does not hold when multiple threads are running and working on the same resources.

Let's look at the following code snippet:
class SomeClass {
    private Resource resource = null;
    public Resource getResource() {
        if (resource == null) {
            synchronized (this) {
                if (resource == null)
                    resource = new Resource();
            }
        }
        return resource;
    }
}
This is a common locking idiom known as double-checked locking, where we check a condition twice: once before synchronizing and a second time after acquiring the lock. Unfortunately, it doesn't work.

resource = new Resource() might look like an atomic instruction, but it actually involves the following steps:
  1. allocate memory
  2. create the new Resource instance in that memory
  3. initialize its fields to default values
  4. run the constructor, which includes running its parent constructors, and so on
  5. assign the reference to the newly constructed object to the variable resource.
The Java Memory Model allows these steps to be executed in any order in the absence of a guarantee against reordering.
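The decomposition above roughly corresponds to the JVM bytecode emitted for resource = new Resource() (a simplified sketch; actual bytecode includes stack-management details):

new           #Resource          // steps 1-3: allocate memory, default-initialize fields
dup
invokespecial #Resource.<init>   // step 4: run the constructor chain
putfield      #resource          // step 5: publish the reference

Note that the reference produced by `new` exists before `invokespecial` runs, which is what makes the publish-before-construction reordering possible.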

Let's consider a situation where resource is still null and two threads, ThreadA and ThreadB, are executing this code. ThreadA enters first and is executing the statement resource = new Resource(), whereas ThreadB has just started executing the getResource() method. Consider the effect on memory of this initialization: memory for the new Resource object is allocated, the constructor for Resource is called to initialize the member fields of the new object, and the field resource of SomeClass is assigned a reference to the newly created object.

However, since ThreadB is not executing inside a synchronized block, it may see these memory operations in a different order than the one ThreadA executes them in. The compiler (or hardware) is free to reorder them so that ThreadB sees: allocate memory, assign the reference to resource, then call the constructor. Suppose ThreadB comes along after the memory has been allocated and the resource field has been set, but before the constructor has run. It sees that resource is not null, skips the synchronized block, and returns a reference to a partially constructed Resource! Needless to say, the result is neither expected nor desired.

Let's take a look at another example:

public class CodeReordering {
    int A = 0;
    int B = 0;
    int r1 = 0;
    int r2 = 0;

    public static void main(String[] args) throws InterruptedException {
        final CodeReordering reorderObj = new CodeReordering();
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                reorderObj.r2 = reorderObj.A;
                reorderObj.B = 1;
            }
        });
        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                reorderObj.r1 = reorderObj.B;
                reorderObj.A = 2;
            }
        });
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("r1 : " + reorderObj.r1 + " r2 : " + reorderObj.r2);
    }
}


It's perfectly possible for the output to be r1 = 1 and r2 = 2. This can happen when the code inside each thread is reordered. The reordering has no impact when each thread runs in isolation, but when they run in parallel it leaves bugs in the program that are difficult to track, difficult to reproduce, and difficult to fix.

Synchronized Blocks and Methods

Synchronization ensures that memory writes made by a thread before or during a synchronized block are made visible in a predictable manner to other threads that synchronize on the same monitor. When we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory so that writes made by this thread become visible to other threads. Before we can enter a synchronized block, we acquire the monitor, which has the effect of invalidating the local processor cache so that variables are reloaded from main memory. We are then able to see all of the writes made visible by the previous release of the monitor.
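A short sketch of this release/acquire pairing (class name and counts are illustrative): because both increment() and get() synchronize on the same monitor, every completed increment is visible to any later reader, and the final read is guaranteed to see all 40,000 updates.

```java
public class SyncCounter {
    private int count = 0;

    public synchronized void increment() { count++; } // monitor release publishes the write
    public synchronized int get() { return count; }   // monitor acquire reloads from main memory

    public static void main(String[] args) throws InterruptedException {
        final SyncCounter c = new SyncCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) c.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(c.get()); // always 40000; without synchronized, updates could be lost
    }
}
```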

Synchronization also provides some guarantees against reordering.
Statements before or after a synchronized block may be moved inside it (provided they are not in a different synchronized block, of course). This is sometimes called the "roach motel" principle: instructions can move into a synchronized block, but not out of it. So normal accesses can be moved from after a monitor exit to before it, and from before a monitor enter to after it.
Also, statements inside a synchronized block may be reordered among themselves, as long as the result is sequentially consistent (i.e., the effect is not visible to other code that shares a happens-before relationship with the block).

Let's look at the following code:
int a = 1; // statement 1
int b = 2; // statement 2
synchronized (lock) {
    int c = 3; // statement 3
    int d = 4; // statement 4
    int e = 5; // statement 5
}
int f = 6; // statement 6
int g = 7; // statement 7

statements (3, 4, 5) can be mutually reordered
statements (1, 2) can be mutually reordered
statements (6, 7) can be mutually reordered
statements (1, 2) can be reordered with (6, 7)
statements (3, 4, 5) can't be moved out of the synchronized block, i.e., past the monitor enter or the monitor exit
statements (1, 2) and (6, 7) may, however, be moved into the synchronized block, per the roach motel principle

Volatile Variables

The JVM guarantees that if a variable is declared volatile, every thread reading the variable reads its value from main memory. Just like synchronization, volatile creates a memory barrier. So when a thread reads a volatile variable, its local processor cache is refreshed with the values from main memory (including the values of non-volatile variables). Similarly, when a thread writes to a volatile variable, changes in the local cache are propagated to main memory. Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire.
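This release/acquire effect means plain data can be safely published through a volatile field. A sketch (names are illustrative): the writer fully populates an ordinary array and then stores it into a volatile field; the volatile write acts as a release and the reader's volatile read as an acquire, so the reader is guaranteed to see the fully populated array, never a partially written one.

```java
public class VolatilePublish {
    private static volatile int[] shared; // the volatile field used for publication

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            int[] data = new int[3];
            data[0] = 1; data[1] = 2; data[2] = 3; // plain, non-volatile writes
            shared = data;                         // volatile write = release
        });
        writer.start();
        int[] seen;
        while ((seen = shared) == null) { /* spin on the volatile read (acquire) */ }
        writer.join();
        System.out.println(seen[0] + seen[1] + seen[2]); // guaranteed to print 6
    }
}
```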

There's a difference between the old (1.4) and new (1.5) memory models when it comes to reordering of volatile variables. The old memory model said a volatile variable could not be reordered with another volatile variable, but could be reordered with non-volatile accesses. Under the new memory model, a volatile access cannot be reordered with surrounding memory operations at all. So the double-checked locking idiom is fixed now, as long as the variable is declared volatile:

class SomeClass {
    private volatile Resource resource = null;
    public Resource getResource() {
        if (resource == null) {
            synchronized (this) {
                if (resource == null)
                    resource = new Resource();
            }
        }
        return resource;
    }
}
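A commonly used alternative to volatile double-checked locking is the initialization-on-demand holder idiom (the Resource stand-in below is illustrative): the JVM guarantees that class initialization is lazy and happens exactly once under an internal lock, so no explicit volatile or synchronized is needed.

```java
class Resource { } // stand-in for the real resource being lazily created

public class HolderDemo {
    // Holder is not initialized until getResource() first touches it
    private static class Holder {
        static final Resource INSTANCE = new Resource();
    }

    public static Resource getResource() {
        return Holder.INSTANCE; // triggers Holder's class initialization exactly once
    }

    public static void main(String[] args) {
        // both calls return the same, fully constructed instance
        System.out.println(getResource() == getResource()); // prints true
    }
}
```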

Final Variables

Before we look at the provisions for final variables in the Java Memory Model, we first need to understand that constructors are not sacrosanct: they are not treated specially by either the JIT compiler or the CPU. Hence, in the absence of a memory barrier, it is perfectly legal to reorder instructions from the constructor with instructions that come after the constructor.

So, if we look at the following code:
public class Dummy {
    int x = 1;
    static Dummy d;

    public Dummy() {
        x = 3;
    }

    static void writer() {
        d = new Dummy();
    }

    static void reader() {
        if (d != null) {
            int i = d.x;
        }
    }
}

There's always a possibility that a thread executing Dummy.reader() will read i as 1 and not 3, because the statement x = 3 may effectively execute after the statement d = new Dummy() (so d isn't null, but x is still 1). This means a reference to an object that is only partially constructed (from a functional point of view) is allowed to escape, which is definitely problematic.

Under the old memory model, the semantics of final fields were essentially unspecified, allowing JVM implementors to decide whether to reorder and/or cache final variables.

Under the new memory model, if final variables are initialized in the constructor, then the address of the enclosing object is not allowed to escape until all final fields have been initialized and the changes in the local cache have been pushed to main memory.

So if we look at a modified version of the above code :
public class Dummy {
    final int x;
    int y = 0;
    static Dummy d;

    public Dummy() {
        x = 3;
        y = 5;
    }

    static void writer() {
        d = new Dummy();
    }

    static void reader() {
        if (d != null) {
            int i = d.y;
            int j = d.x;
        }
    }
}

Now we can never have a situation where d is not null but x is 0, because the reference isn't allowed to escape until x has been initialized. However, d being non-null while y is still 0 is still a possibility.
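Making y final as well extends the guarantee to both fields. A sketch (the class is renamed SafeDummy here to distinguish it from the example above; the single-threaded main only demonstrates the code compiles and runs, since the stale-read scenario itself is a cross-thread effect):

```java
public class SafeDummy {
    final int x; // both fields final: the final-field freeze covers them both
    final int y;
    static SafeDummy d;

    public SafeDummy() {
        x = 3;
        y = 5;
    }

    static void writer() {
        d = new SafeDummy();
    }

    static void reader() {
        SafeDummy local = d;
        if (local != null) {
            // any thread that sees a non-null d is guaranteed x == 3 and y == 5
            System.out.println(local.x + " " + local.y);
        }
    }

    public static void main(String[] args) {
        writer();
        reader(); // prints 3 5
    }
}
```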


