Hoist access to volatile variables out of loop in ConcurrentDictionary #34198

Closed
@kunalspathak

Description

There are a few places in ConcurrentDictionary where we call AcquireAllLocks but still access volatile variables such as _tables._countPerLock, _tables._buckets and _tables._locks inside a loop.

Examples include CopyTo, GetCountInternal, GetKeys and GetValues.

Is there a reason for this, or can we cache those variables outside the loop and use the cached copies inside it? Currently, on ARM64, the JIT generates expensive memory barrier instructions for volatile accesses. Because the volatile variables are accessed inside the loop, those instructions execute on every iteration. Caching the variables would improve the performance of these APIs on ARM64.

Below is an example of the machine code we generate for the GetCountInternal method before and after caching the volatile variable outside the loop.
Before:
[image: before]

After: I made the following change and got the generated code shown after it. This gave an approx. 30% win in the Dictionary.Count benchmark.

private int GetCountInternal()
{
    int count = 0;
    int[] countPerLocks = _tables._countPerLock;

    // Compute the count, we allow overflow
    for (int i = 0; i < countPerLocks.Length; i++)
    {
        count += countPerLocks[i];
    }

    return count;
}

[image: after]
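
For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch of the two variants side by side. It does not use the real ConcurrentDictionary internals: `StripedCounter`, `_gate`, `CountWithVolatileReads` and `CountWithCachedLocal` are hypothetical names, and a single `lock` stands in for AcquireAllLocks. It also assumes that every writer that replaces the array holds the same lock, which is the condition under which reading the field once should be equivalent to reading it on each iteration.

```csharp
using System;

// Hypothetical stand-in for the dictionary's internal state; not the real Tables type.
sealed class StripedCounter
{
    private readonly object _gate = new object(); // stands in for AcquireAllLocks
    private volatile int[] _countPerLock;          // volatile field, as described in the issue

    public StripedCounter(int[] counts) => _countPerLock = counts;

    // Reads the volatile field in the loop condition and in the loop body.
    public int CountWithVolatileReads()
    {
        lock (_gate)
        {
            int count = 0;
            for (int i = 0; i < _countPerLock.Length; i++)
            {
                count += _countPerLock[i];
            }
            return count;
        }
    }

    // Reads the volatile field once and iterates over the local copy.
    public int CountWithCachedLocal()
    {
        lock (_gate)
        {
            int[] countPerLock = _countPerLock; // the only volatile read
            int count = 0;
            for (int i = 0; i < countPerLock.Length; i++)
            {
                count += countPerLock[i];
            }
            return count;
        }
    }
}

static class Program
{
    static void Main()
    {
        var counter = new StripedCounter(new[] { 3, 0, 5, 2 });
        Console.WriteLine(counter.CountWithVolatileReads()); // prints 10
        Console.WriteLine(counter.CountWithCachedLocal());   // prints 10
    }
}
```

Both methods return the same result; the difference is only in how many volatile reads the loop performs, which is what affects the generated ARM64 code.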
