CopyOnWriteArrayList and CopyOnWriteArraySet

CopyOnWriteArrayList

CopyOnWriteArrayList uses an interesting technique to make it thread-safe without a need for synchronization - all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array. This is ordinarily too costly, but may be more efficient than alternatives when traversal operations vastly outnumber mutations, and is useful when you cannot or don't want to synchronize traversals.

CopyOnWriteArrayList maintains it's internal state by using an Object[] as the inner data structure

 private transient volatile Object[] array;

With every mutation by a thread, this array has to be reinitialized. Hence it's declared as volatile to ensure happens-before guarantee.

CopyOnWriteArrayList also maintains a ReentrantLock which is which used to ensure mutual exclusion and atomicity while performing any mutation operation

 final transient ReentrantLock lock = new ReentrantLock();

add
To add an element first the existing array elements are copied into a new array. The new array is declared with length as length of existing array + 1. The element to add is appended at the end of the new array.

public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

Lets say we have an existing CopyOnWriteArrayList object containing 5 elements - [1,2,3,4,5]. Now two threads T1 and T2 are invoking add(6) and add(7) on the object. The lock will ensure mutual exclusion and depending on which thread CPU picks up first, either T1 can get lock and add element while T2 is waiting or vice versa. Once the first thread is done, the second thread will create another copy of the array, insert the element and assign the reference to the new array (Since the array is declared as volatile, the second thread is guaranteed to pick the new array created by the first thread and not any stale value). So depending on which thread gets picked up first, the final state may be either [1,2,3,4,5,6,7] or
[1,2,3,4,5,7,6] but there will never be an issue with lost updates

Note that Arrays.copyof( ..) is an O(n) operation. So this will be very inefficient for large arrays.

The same logic will also apply in case of addition at a particular index. the values upto that position will have to be copied to the new array as they are while the values from that index to end will be copied after shifting one position to the right. Finally the new element will be copied at the desired index

public void add(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        if (index > len || index < 0)
            throw new IndexOutOfBoundsException("Index: "+index+
                                                ", Size: "+len);
        Object[] newElements;
        int numMoved = len - index;
        if (numMoved == 0)
            newElements = Arrays.copyOf(elements, len + 1);
        else {
            newElements = new Object[len + 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index, newElements, index + 1,
                             numMoved);
        }
        newElements[index] = element;
        setArray(newElements);
    } finally {
        lock.unlock();
    }
}

update
The set method works very similarly as add. The array elements are copied to a new array (of same size) and the new array is then updated. The critical section is protected by ReentrantLock to ensure no interleaving

public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        E oldValue = get(elements, index);

        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

delete
The remove operation is very similar. To remove the element at index i, we declare a new array of length 1 less than original array length, copy elements at position 0 to i-1 as they are and then copy elements at position i+1 to end from source array to one position shifted to the left in the new array.

public E remove(int index) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        E oldValue = get(elements, index);
        int numMoved = len - index - 1;
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        else {
            Object[] newElements = new Object[len - 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index + 1, newElements, index,
                             numMoved);
            setArray(newElements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

fetch
Random access is straightforward and totally lock free.

public E get(int index) {
    return get(getArray(), index);
}
private E get(Object[] a, int index) {
    return (E) a[index];
}

iteration
Just like other concurrent Collection classes, iterators in CopyOnWriteArrayList are designed to be weakly consistent. The snapshot style iterator method uses a reference to the state of the array at the point that the iterator was created. This array never changes during the lifetime of the iterator, so interference is impossible and the iterator is guaranteed not to throw ConcurrentModificationException. The iterator will not reflect additions, removals, or changes to the list since the iterator was created. Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException.

CopyOnWriteArraySet

CopyOnWriteArraySet is a Set that uses an internal CopyOnWriteArrayList for all of its operations. Thus, it shares the same basic properties:
- It is best suited for applications in which set sizes generally stay small, read-only operations vastly outnumber mutative operations, and you need to prevent interference among threads during traversal.
- It is thread-safe.
- Mutative operations (add, set, remove, etc.) are expensive since they usually entail copying the entire underlying array.
- Iterators do not support the mutative remove operation.
- Traversal via iterators is fast and cannot encounter interference from other threads. Iterators rely on unchanging snapshots of the array at the time the iterators were constructed.

A CopyOnWriteArraySet internally uses a CopyOnWriteArrayList as the underlying data structure.

     private final CopyOnWriteArrayList<E> al;

All internals operations are delegated to the corresponding operations on the CopyOnWriteArraySet.

Since Sets can't contain duplicate elements unlike lists, add(E e) in CopyOnWriteArraySet involves a call to the addIfAbsent(E e) method of CopyOnWriteArrayList which first tries to check if the element is already present or not and invokes add only if the element doesn't exist already.

public boolean addIfAbsent(E e) {
    Object[] snapshot = getArray();
    return indexOf(e, snapshot, 0, snapshot.length) >= 0 ? false :
        addIfAbsent(e, snapshot);
}

In the above code, what if another thread adds a the same element after current thread has executed the statement but yet to execute the second statement ?

To avoid this situation, the addIfAbsent(E, Object[]) method performs another check (after acquiring lock) if the snapshot is same the current version of the underlying array. If they are not the same objects , the check (whether the element already exists or not) is performed again (Because its guarded by a lock, there's no scope for race condition here) and the element is added only if it doesn't exist already.

private boolean addIfAbsent(E e, Object[] snapshot) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] current = getArray();
        int len = current.length;
        if (snapshot != current) {
            // Optimize for lost race to another addXXX operation
            int common = Math.min(snapshot.length, len);
            for (int i = 0; i < common; i++)
                if (current[i] != snapshot[i] && eq(e, current[i]))
                    return false;
            if (indexOf(e, current, common, len) >= 0)
                    return false;
        }
        Object[] newElements = Arrays.copyOf(current, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

comments powered by Disqus