|
| Research on Transactional Memory supported by NSF CAREER Award CCF 03-47260 and a gift from Intel.
|
With the advent of commodity multiprocessing, the challenges of
parallel programming have attracted a lot of attention. In
particular, locks as a means of concurrency control have a number of
known drawbacks: a trade-off between simplicity and concurrency, error
prone, potential for deadlock, lack of composition, priority
inversion, etc. To remedy these limitations, Transactional Memory
(TM) was proposed (by Herlihy & Moss, 1993).
|
Hardware Transactional Memory
|
In building a hardware transactional memory (HTM), the main challenge
is in building a system that provides full-speed execution for simple,
reasonably-sized transactions, while providing support for rich
transaction semantics (arbitrary sizes, fancy nesting, debugging,
etc.) with reasonable cost and without sacrificing any other machine
properties (e.g., performance isolation, virtualizability).
Our most recent work rectifies the problems with Hybrid TM (use a best
effort HTM and handle the rest in a software TM). By providing
efficient hardware fine-grain memory protection, we extend an STM to
be "strongly atomic" (i.e., so that conflicts between transactions and
non-transactional code will be detected). In this way, we can avoid
instrumenting the hardware transactions and the overhead doing so
introduces. As a result, we achieve a TM system with performance
close to an "ideal" HTM by composing two general-purpose hardware
mechanisms: best-effort atomic execution (also useful for SLE and speculative optimization) and fine-grain
hardware memory protection (also useful for debugging, invariant
monitoring, etc). In this work, we also characterize the requirements
for contention management and abort policies for achieving good
performance in a hybrid TM.
Using Hardware Memory Protection to Build a High-Performance, Strongly Atomic Hybrid Transactional Memory
Lee Baugh, Naveen Neelakantam, and Craig Zilles (ISCA 2008)
In earlier work, we identified the challeneges to providing performance
isolation in Hardware TM systems:
Challenges to
Providing Performance Isolation in Transactional Memories
Craig Zilles and David Flint. (WDDD 2005)
We also explored how to extend a virtualizing hardware TM with hooks
to software to provide a cost-effective (in terms of hardware) means
to providing richer transactional semantics:
Extending Hardware Transactional Memory to Support Non-busy Waiting and Non-transactional Actions
Craig Zilles and Lee Baugh (Transact 2006)
|
|
Software and Hybrid Transactional Memory
|
One concern in software transactional memory (STM), has been
minimizing the overhead. One common optimization has been the
implementation of tagless ownership tables, where a whole slice of
memory is locked at a time, potentially leading to alias-based false
conflicts. In the paper below, we show that the likelihood of
aliasing grows superlinearly with both the transaction footprint and
concurrency. As a result, those STMs that implement the tagless
optimization may be sacrificing concurrency (generally the goal
of TM) for lower overhead.
Transactional Memory and the Birthday Paradox
Craig Zilles and Ravi Rajwar (SPAA 2007)
Implications of False Conflict Rate Trends for Robust Software Transactional Memory
Craig Zilles and Ravi Rajward (IISWC 2007)
|
Transactional Memory Workloads
|
A significant remaining challenge for research in TM is the lack of TM
workloads. Universally proposed TM systems are evaluated with either
"transactionalized" versions of existing lock-based programs or
microbenchmarks (e.g., "hashtable"). Neither class of programs are
likely to be representive of the class of programs that TM systems are
intended to evoke: concurrent versions of what were previously
sequential programs. In an effort to rectify this problem, I organized a Workshop on Transactional Memory
Workloads that was held in conjunction with PLDI 2006.
To reason about the use of I/O within transactions, in spite of the
lack of large scale TM workloads, we performed an analysis of conventionally
locked programs and characterized the I/O performed in critical sections.
We find that: 1) I/O are frequently in critical regions for data protection
reasons, 2) most I/O's can be handled by one the previously proposed
techniques for transparently handling I/O, 3) that I/Oing critical
sections have much longer run-times than most critical sections, and 4)
there is little concurrency between I/Oing critical sections.
An Analysis of I/O and Syscalls in Critical Sections and Their Implications for Transactional Memory
Lee Baugh and Craig Zilles (Transact 2007)
One of the workloads that we've tried to parallelize with transactions
is the Python interpreter. While the Python programming
language provides threads as part of the language, its canonical
implementation (CPython) provides only limited concurrency as bytecode
execution is performed while holding a "global interpreter lock."
With the expectation that the bytecodes would likely be embarrassingly
parallel, one of my students undertook replacing this overly
conservative concurrency control mechanism with the optimistic
execution and fine-grain conflict detection provided by wrapping each
bytecode with a (hardware) transaction. Our lessons learned were
two-fold: 1) it was remarkably easy to eliminate the false conflicts
resulting from the interpreter's use of global variables, and 2) it
was remarkably hard to correctly deal with the bytecodes (and
therefore the transactions) that resulted in system calls and I/O, in
part because they may be encapsulated in native code. In hindsight,
we recognized that exposing Python's concurrency would have been much
easier by programming to a hardware abstraction that speculatively
executed lock-based critical sections (i.e., Speculative Lock Elision
(SLE)), which would have transparently handled the system call and I/O
issues correctly. We believe that this conclusion likely generalizes
to many of the workloads that have been executed on TM prototypes.
Hardware Transactional Memory Support for Lightweight Dynamic Language Evolution
Nicholas Riley and Craig Zilles (DLS 2006)
|
|
Some Challenges Facing Transactional Memory
|