High performance is obtained by:
using GLib's g_mem_chunk and fast non-blocking allocation algorithms where possible to minimize dynamic memory allocation.
providing extremely lightweight links between plugins. Data travels through the pipeline with minimal overhead: in a typical pipeline, passing data between plugins involves only a pointer dereference.
providing a mechanism to work directly on the target memory. A plugin can, for example, write directly to the X server's shared memory space. Buffers can also point to arbitrary memory, such as a sound card's internal hardware buffer.
using refcounting and copy-on-write to minimize the use of memcpy. Sub-buffers efficiently split buffers into manageable pieces.
using cothreads to minimize threading overhead. Cothreads are a simple and fast user-space method for switching between subtasks. Cothreads were measured to consume as little as 600 CPU cycles.
allowing hardware acceleration by using specialized plugins.
using a plugin registry that holds the specifications of the plugins, so that loading a plugin can be delayed until it is actually used.
keeping all critical data passing free of locks and mutexes.
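The refcounting, copy-on-write, and sub-buffer ideas from the list above can be illustrated with a small sketch. This is not the framework's buffer API: the type Buf and the functions buf_new, buf_ref, buf_unref, buf_sub, and buf_make_writable are hypothetical names chosen for this example, and the plain integer refcount assumes single-threaded use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration type; NOT the real buffer API. */
typedef struct Buf Buf;
struct Buf {
    int refcount;         /* number of holders (not thread-safe in this sketch) */
    unsigned char *data;  /* start of the visible region */
    size_t size;          /* length of the visible region in bytes */
    Buf *parent;          /* non-NULL for a sub-buffer; the parent owns the memory */
};

/* Allocate a zero-filled buffer with one reference. Returns NULL on failure. */
Buf *buf_new(size_t size) {
    Buf *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->data = calloc(size ? size : 1, 1);
    if (!b->data) { free(b); return NULL; }
    b->refcount = 1;
    b->size = size;
    b->parent = NULL;
    return b;
}

/* Take an additional reference; no data is copied. */
Buf *buf_ref(Buf *b) {
    assert(b && b->refcount > 0);
    b->refcount++;
    return b;
}

/* Drop one reference; free the buffer when the last holder lets go. */
void buf_unref(Buf *b) {
    assert(b && b->refcount > 0);
    if (--b->refcount > 0) return;
    if (b->parent) buf_unref(b->parent);  /* sub-buffer: release its hold on the parent */
    else free(b->data);                   /* top-level buffer: free the memory itself */
    free(b);
}

/* Create a sub-buffer: a view into the parent's memory, without memcpy.
 * Returns NULL if the requested range does not fit or allocation fails. */
Buf *buf_sub(Buf *parent, size_t offset, size_t size) {
    if (offset > parent->size || size > parent->size - offset) return NULL;
    Buf *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->refcount = 1;
    b->data = parent->data + offset;
    b->size = size;
    b->parent = buf_ref(parent);          /* keeps the parent's memory alive */
    return b;
}

/* Copy on write: return a buffer that is safe to modify.
 * Consumes the caller's reference on success. The bytes are copied only
 * when the memory is shared (extra references, or b is a sub-buffer).
 * On allocation failure returns NULL and the caller keeps its reference. */
Buf *buf_make_writable(Buf *b) {
    if (b->refcount == 1 && b->parent == NULL) return b;  /* sole owner: no copy */
    Buf *copy = buf_new(b->size);
    if (!copy) return NULL;
    memcpy(copy->data, b->data, b->size);
    buf_unref(b);
    return copy;
}
```

In this sketch a reader only ever calls buf_ref and buf_unref, so passing a buffer along costs a counter increment rather than a copy. A writer calls buf_make_writable first and pays for a memcpy only when someone else can still see the same memory.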