WL#2595: kernel-independent atomic operations

Affects: Server-5.5   —   Status: Complete

Different systems, and newer versions of glibc, define atomic_add() in places
other than <asm/atomic.h>, which is the only place we look.

For example, it is in <machine/atomic.h> on FreeBSD.

QNX defines it in <atomic.h>.

Some versions of glibc may define it in 

Novell NetWare's LibC includes it in 

Also, Windows has the Interlocked*() functions which we may be able to use.
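For illustration, a minimal sketch (not a committed design) of how the operations
we need could map onto the Win32 Interlocked*() calls; the wrapper names below are
made up for this sketch:

#include <windows.h>

/* Sketch only: all three Interlocked calls exist in the Win32 API,
   operate on 32-bit values, and return the old value of *var. */

LONG add32(volatile LONG *var, LONG incr)
{
  return InterlockedExchangeAdd(var, incr);
}

LONG swap32(volatile LONG *var, LONG newval)
{
  return InterlockedExchange(var, newval);
}

LONG cas32(volatile LONG *var, LONG cmp, LONG newval)
{
  /* stores newval only if *var == cmp; returns the old value */
  return InterlockedCompareExchange(var, newval, cmp);
}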

We could also potentially implement these in assembler code for platforms that 
don't already define them in headers.

[serg]: we should probably implement these in assembler even for platforms that
define them in headers - notably Linux/FreeBSD - because user applications are
not supposed to include kernel headers (such as <asm/atomic.h> and
<machine/atomic.h>).
Kernel headers may depend on the particular kernel, and atomic operations will
work incorrectly if the binary is used with a different kernel [1]. Kernel
headers may refer to other kernel symbols/macros [2]. Kernel headers may contain
#warning/#error to discourage/prevent usage outside of the kernel [3].
Another solution is to use a user-level portable atomic operation library [4].
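As an illustration of the assembler approach, a minimal GCC inline-assembly sketch
for x86 that avoids kernel headers entirely; the function names are placeholders,
not the final API:

#include <stdint.h>

/* Atomic fetch-and-add: adds v to *a and returns the value *a held
   before the addition. "lock xadd" makes the read-modify-write atomic
   on SMP; the "memory" clobber stops the compiler from reordering
   surrounding accesses across it. */
static inline int32_t my_atomic_add32(volatile int32_t *a, int32_t v)
{
  __asm__ __volatile__("lock; xaddl %0, %1"
                       : "+r" (v), "+m" (*a)
                       :
                       : "memory");
  return v;
}

/* Atomic compare-and-swap: if *a == cmp, store set; in all cases
   return the value *a held before the operation. */
static inline int32_t my_atomic_cas32(volatile int32_t *a, int32_t cmp,
                                      int32_t set)
{
  int32_t old;
  __asm__ __volatile__("lock; cmpxchgl %2, %1"
                       : "=a" (old), "+m" (*a)
                       : "r" (set), "0" (cmp)
                       : "memory");
  return old;
}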

The best solution seems to be implementing our own atomic framework instead of
taking an existing library (e.g. libatomic by HP [4]).

The most important argument: all other implementations of atomic operators (many
projects use their own implementations - mostly they're all similar) implement
atomic operators for a "generic" platform with mutexes, along the lines of

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void atomic_add(atomic_t *var)
{
  pthread_mutex_lock(&mutex);
  (*var)++;
  pthread_mutex_unlock(&mutex);
}

In libatomic [4] there's one global mutex; in smarter implementations there's an
array of mutexes, and the variable's address is hashed to pick a mutex.
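A sketch of that hashed-mutex fallback, with illustrative names and an arbitrary
table size:

#include <pthread.h>
#include <stdint.h>

#define ATOMIC_MUTEX_SLOTS 64

/* One-time init (e.g. from atomic_initialize()) must call
   pthread_mutex_init() on every slot before first use. */
static pthread_mutex_t atomic_mutexes[ATOMIC_MUTEX_SLOTS];

static pthread_mutex_t *mutex_for(volatile void *addr)
{
  /* drop the low bits (identical because of alignment), then hash */
  uintptr_t h = (uintptr_t) addr >> 4;
  return &atomic_mutexes[h % ATOMIC_MUTEX_SLOTS];
}

static int32_t generic_atomic_add32(volatile int32_t *var, int32_t incr)
{
  pthread_mutex_t *m = mutex_for(var);
  int32_t old;
  pthread_mutex_lock(m);
  old = *var;
  *var += incr;
  pthread_mutex_unlock(m);
  return old;
}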

No implementation allows one to lock the mutex explicitly with

  atomic_mutex_lock(mutex);
  ... accessing many atomic variables
  atomic_mutex_unlock(mutex);

which would save many lock/unlock calls in the "generic" implementation (and
atomic_mutex_lock/unlock would be a no-op in the "real" implementation).

We want to be able to do that.


1. https://bugzilla.altlinux.org/show_bug.cgi?id=5884
2. http://bugs.mysql.com/bug.php?id=7970
3. production:/usr/include/asm/atomic.h
4. http://www.hpl.hp.com/research/linux/atomic_ops/index.php4

API:

1. to lock atomic variables manually (for "generic" version)

atomic_rwlock_t name1,name2, ....
atomic_rwlock_destroy(name)
atomic_rwlock_init(name)
atomic_rwlock_rdlock(name)
atomic_rwlock_wrlock(name)
atomic_rwlock_rdunlock(name)
atomic_rwlock_wrunlock(name)

2. actual atomic operations in the form

atomic_<op><N>(atomic_<N>_t *var [, second_arg], atomic_rwlock_t name)

where <N> is 8,16,32,64; <op> is add,swap,cas,load,store,
which gives
which gives

atomic_add8	atomic_add16	atomic_add32	atomic_add64
atomic_swap8	atomic_swap16	atomic_swap32	atomic_swap64
atomic_cas8	atomic_cas16	atomic_cas32	atomic_cas64
atomic_load8	atomic_load16	atomic_load32	atomic_load64
atomic_store8	atomic_store16	atomic_store32	atomic_store64

atomic types

atomic_8_t atomic_16_t atomic_32_t atomic_64_t

"second_arg" is present for add,swap,cas,store, and absent for load.

name is an atomic_rwlock created with atomic_declare_rwlock,
or 0 if there's no need to lock (if the lock was taken explicitly)
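A hypothetical usage sketch of the calls above; the variable and function names
are invented here, and the exact argument order and return values are still open:

/* All names below (connection_count, connection_lock, ...) are hypothetical. */
static atomic_32_t     connection_count;
static atomic_rwlock_t connection_lock;

void init_counters(void)
{
  atomic_rwlock_init(connection_lock);
}

void on_connect(void)
{
  /* one isolated update: pass the rwlock and let the wrapper take it
     (a no-op in the "real" implementation) */
  atomic_add32(&connection_count, 1, connection_lock);
}

int32_t current_connections(void)
{
  return atomic_load32(&connection_count, connection_lock);
}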

3. initialization code

atomic_initialize()

It performs whatever initialization is necessary. In particular, it checks that
this code can run on this architecture (that is, P6 code is not running on an
i80386, UP code on an SMP machine, and so on).
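A sketch of where that check could sit at startup; the return-value convention
shown is an assumption, this WL does not define one:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  /* Assumed convention: a non-zero return means this binary cannot run
     on this CPU/SMP configuration (e.g. P6 code on an i80386, or a UP
     build on an SMP box). */
  if (atomic_initialize())
  {
    fprintf(stderr, "unsupported CPU/SMP configuration for this binary\n");
    exit(1);
  }
  /* ... normal startup continues here ... */
  return 0;
}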

4. in debug builds every atomic var remembers what rwlock it is used with. It
asserts that only one rwlock is used for any single atomic var. It may also
assert that no lock is taken recursively.
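A sketch of one way to implement that debug check, with hypothetical field and
macro names (DBUG_OFF is the usual MySQL marker for non-debug builds):

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef DBUG_OFF                        /* debug build */
typedef struct {
  int32_t          val;
  atomic_rwlock_t *used_with;           /* rwlock first seen for this var */
} atomic_32_t;

#define ATOMIC_CHECK_LOCK(VAR, LOCK)                          \
  do {                                                        \
    if ((VAR)->used_with == NULL)                             \
      (VAR)->used_with = (LOCK);                              \
    assert((VAR)->used_with == (LOCK)); /* same lock each time */ \
  } while (0)
#else                                   /* release build: plain value */
typedef struct { int32_t val; } atomic_32_t;
#define ATOMIC_CHECK_LOCK(VAR, LOCK) do {} while (0)
#endif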

1. The trick is to convert

atomic_rwlock_t name1,name2, ..., nameN;

to a no-op. Konstantin suggested:

typedef struct {} atomic_rwlock_t;
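A sketch of how the two builds could resolve the same declarations and calls; the
mode macro is hypothetical, and note that an empty struct is a GCC extension in C
(a portable version may need a dummy char member):

#include <pthread.h>

#ifdef MY_ATOMIC_MODE_RWLOCKS           /* hypothetical mode macro */
/* "generic" mode: the lock is a real pthread rwlock */
typedef pthread_rwlock_t atomic_rwlock_t;
#define atomic_rwlock_init(L)      pthread_rwlock_init(&(L), NULL)
#define atomic_rwlock_destroy(L)   pthread_rwlock_destroy(&(L))
#define atomic_rwlock_wrlock(L)    pthread_rwlock_wrlock(&(L))
#define atomic_rwlock_wrunlock(L)  pthread_rwlock_unlock(&(L))
#else
/* native mode: declarations stay legal but empty, and every lock
   operation compiles away to nothing */
typedef struct {} atomic_rwlock_t;
#define atomic_rwlock_init(L)      do {} while (0)
#define atomic_rwlock_destroy(L)   do {} while (0)
#define atomic_rwlock_wrlock(L)    do {} while (0)
#define atomic_rwlock_wrunlock(L)  do {} while (0)
#endif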

2. To complete this task it's enough to implement the framework and the generic
(pthread rwlocks), dummy (one CPU, no synchronisation), and x86 modes. Support
for Sparc, MIPS, and other CPUs will be added as separate WL tasks.
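A sketch of how the framework header could pick one of those modes at compile
time; the file and macro names here are illustrative only:

/* my_atomic.h (sketch) - select exactly one backend */
#if defined(MY_ATOMIC_MODE_DUMMY)
#include "atomic/nolock.h"     /* one CPU: plain loads/stores, no fences */
#elif defined(__i386__) || defined(__x86_64__)
#include "atomic/x86-gcc.h"    /* lock-prefixed instructions via inline asm */
#else
#include "atomic/rwlock.h"     /* generic: pthread rwlocks around each op */
#endif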