
Support for Latest Intel Architectures - Intel® Xeon® Processors and Intel® Xeon Phi™ Coprocessors
Choosing the right parallel programming models for your application now sets a path toward taking full advantage of multicore and many-core performance. Implement parallelism for today’s architectures and be ready for future ones.
Improved Flow Graph Feature
Additional exception safety and the ability to iterate over graph nodes are now included in the flow graph feature. This improves the usability and reliability of the flow graph, making it applicable to more use cases.
Conditional Numerical Reproducibility
The new Intel® TBB template function parallel_deterministic_reduce addresses the inherent non-associativity of floating-point arithmetic, so that repeated runs of the same reduction produce the same result.
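The idea behind a reproducible reduction can be sketched with the C++ standard library alone. Floating-point addition is not associative, so a parallel sum changes when the input is split differently. In the sketch below, the chunk boundaries depend only on a fixed chunk size, and the partial sums are combined in index order, so the thread count cannot affect the result. The names `deterministic_sum` and `make_sample` are hypothetical, and this is an illustration of the principle, not Intel TBB's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` reproducibly.
// Chunk boundaries depend only on `chunk`, never on `num_threads`,
// and the partial sums are combined left to right in index order.
double deterministic_sum(const std::vector<double>& data,
                         std::size_t chunk, unsigned num_threads) {
    if (chunk == 0) chunk = 1;
    if (num_threads == 0) num_threads = 1;
    const std::size_t num_chunks = (data.size() + chunk - 1) / chunk;
    std::vector<double> partial(num_chunks, 0.0);

    // Worker w handles chunks w, w + num_threads, w + 2 * num_threads, ...
    auto worker = [&](unsigned w) {
        for (std::size_t c = w; c < num_chunks; c += num_threads) {
            const std::size_t begin = c * chunk;
            const std::size_t end = std::min(begin + chunk, data.size());
            double s = 0.0;
            for (std::size_t i = begin; i < end; ++i) s += data[i];
            partial[c] = s;  // each slot is written by exactly one thread
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < num_threads; ++w) pool.emplace_back(worker, w);
    for (auto& t : pool) t.join();

    double total = 0.0;
    for (double p : partial) total += p;  // fixed combine order
    return total;
}

// Hypothetical sample data: terms of very different magnitude,
// so the rounded sum is sensitive to evaluation order.
std::vector<double> make_sample(std::size_t n) {
    std::vector<double> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = 1.0 / static_cast<double>(i + 1);
    return v;
}
```

Calling `deterministic_sum(v, 64, 1)` and `deterministic_sum(v, 64, 4)` on the same input performs the same additions in the same order, so the two results compare equal bit for bit. Changing the chunk size may change the result, which is why the reproducibility is conditional.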
Additional C++11 Support
Intel is committed to supporting the C++11 standard, and this release adds more C++11 support. Intel® TBB can be used with C++11 compilers and supports lambda expressions.
New Examples and Documentation
A new HTML & CHM TBB Reference Manual makes it easier to find the answers you need. New examples demonstrate how to use major new features including logic_sim for the flow graph.
Parallel Algorithms - Generic implementation of common parallel performance patterns
Generic implementations of parallel patterns such as parallel loops, flow graphs, and pipelines can be an easy way to achieve a scalable parallel implementation without developing a custom solution from scratch.
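To make "generic parallel loop" concrete, here is a deliberately simplified stand-in written with the C++ standard library only. It is a template that takes an index range and a callable body, typically a C++11 lambda. The name `simple_parallel_for` is hypothetical, and the sketch uses naive static partitioning with one thread per block. A library implementation adds the load balancing and thread reuse that this sketch leaves out, which is the custom work the paragraph above says you can avoid.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical, simplified parallel loop: calls body(i) for every i in
// [first, last). The range is split statically into contiguous blocks,
// one block per thread.
template <typename Body>
void simple_parallel_for(std::size_t first, std::size_t last,
                         unsigned num_threads, const Body& body) {
    if (first >= last) return;
    if (num_threads == 0) num_threads = 1;
    const std::size_t n = last - first;
    const std::size_t block = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> pool;
    for (std::size_t begin = first; begin < last; begin += block) {
        const std::size_t end = std::min(begin + block, last);
        pool.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();
}

// Example use: fill a vector with squares. The iterations are independent,
// so they can run in any order on any thread.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out(n);
    simple_parallel_for(0, n, 4, [&out](std::size_t i) {
        out[i] = static_cast<int>(i * i);
    });
    return out;
}
```

The caller writes only the loop body; how the range is divided among threads stays inside the template, so the partitioning strategy can change without touching application code.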
Scheduler - Engine that manages parallel tasks and task groups
The Intel® TBB task scheduler supports task-based programming and uses task stealing for dynamic workload balancing – a scalable, higher-level alternative to managing OS threads manually. The implementation supports C++ exceptions, task and task group priorities, and cancellation, all of which are essential for large and interactive parallel C++ applications.
Concurrent Containers - Generic implementation of common idioms for concurrent access
Intel® TBB concurrent containers are a scalable alternative to serial data containers. Serial data structures (such as C++ STL containers) often require a global lock to protect them from concurrent access and modification. Concurrent containers allow multiple threads to access and update items in the container concurrently, maximizing the amount of parallel work and improving the application’s scalability.
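The difference between one global lock and finer-grained protection can be sketched with the standard library. The hypothetical `StripedCounter` below hashes each key to one of several independently locked shards, so threads that touch different shards do not wait on each other. This illustrates the general idea of reducing lock contention; it is not a description of how Intel TBB containers are implemented, and the shard count of 16 is an arbitrary assumption.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical striped map of counters: each key belongs to one shard,
// and each shard has its own mutex.
class StripedCounter {
public:
    void increment(const std::string& key) {
        Shard& s = shard_for(key);
        std::lock_guard<std::mutex> guard(s.mutex);
        ++s.counts[key];
    }

    long get(const std::string& key) {
        Shard& s = shard_for(key);
        std::lock_guard<std::mutex> guard(s.mutex);
        auto it = s.counts.find(key);
        return it == s.counts.end() ? 0 : it->second;
    }

private:
    static constexpr std::size_t kStripes = 16;  // assumed shard count

    struct Shard {
        std::mutex mutex;
        std::unordered_map<std::string, long> counts;
    };

    Shard& shard_for(const std::string& key) {
        return shards_[std::hash<std::string>{}(key) % kStripes];
    }

    std::array<Shard, kStripes> shards_;
};

// Runs `num_threads` workers that each increment every key `reps` times.
void count_in_parallel(StripedCounter& counter, unsigned num_threads, int reps) {
    const std::vector<std::string> keys = {"alpha", "beta", "gamma", "delta"};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < num_threads; ++t) {
        pool.emplace_back([&counter, &keys, reps] {
            for (int r = 0; r < reps; ++r)
                for (const auto& k : keys) counter.increment(k);
        });
    }
    for (auto& t : pool) t.join();
}
```

With a single global lock, every `increment` would serialize on the same mutex regardless of key. Here only updates that land in the same shard contend.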
Synchronization Primitives - Exception-safe locks, mutexes, condition variables, and atomic operations
Intel® TBB provides a comprehensive set of synchronization primitives with different qualities that are applicable to common synchronization strategies. Exception-safe lock implementations help avoid deadlocks in C++ programs that use C++ exceptions. Using Intel® TBB atomic variables instead of a C-style atomic API reduces the risk of data races.
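Why exception safety matters for locks can be shown with a small sketch that uses the standard library's scoped guard as a stand-in. The guard releases the mutex in its destructor, so the lock is dropped even when the protected code throws. The `Account` type and both functions are hypothetical names introduced for illustration.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical shared state guarded by a mutex.
struct Account {
    std::mutex mutex;
    int balance = 100;
};

// Withdraws `amount`, throwing if funds are insufficient.
// The guard's destructor releases the mutex on both the normal and the
// exceptional path, so a throw cannot leave the lock held.
void withdraw(Account& a, int amount) {
    std::lock_guard<std::mutex> guard(a.mutex);
    if (amount > a.balance) throw std::runtime_error("insufficient funds");
    a.balance -= amount;
}

// Forces one failed withdrawal, then performs a valid one on the same
// account. The second call can only acquire the mutex because the first
// call released it while unwinding. Single-threaded demo, so reading
// `balance` without the lock here is safe.
int retry_after_failure(Account& a, int amount) {
    try {
        withdraw(a, a.balance + 1);  // always exceeds the balance
    } catch (const std::runtime_error&) {
        // expected: the mutex is already unlocked at this point
    }
    withdraw(a, amount);
    return a.balance;
}
```

Had `withdraw` used manual `lock()` and `unlock()` calls, the throw would skip the `unlock()` and the next caller would block on a mutex that is never released.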
Scalable Memory Allocators - Scalable memory manager and false-sharing free memory allocator
The scalable memory allocator avoids scalability bottlenecks by minimizing access to a shared memory heap via per-thread memory pool management. Special management of large (≥8KB) blocks allows more efficient resource usage, while still offering scalability and competitive performance. The cache-aligned memory allocator avoids false sharing by not allowing allocated memory blocks to split a cache line.
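False sharing occurs when threads update distinct variables that happen to sit on the same cache line. The sketch below shows the remedy in its simplest form, using only standard C++: each per-thread counter is aligned and padded to a full cache line. The 64-byte line size is an assumption that holds on many current x86 processors but is not universal, and `PaddedCounter`, `same_cache_line`, and `count_with_padding` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // assumption: 64-byte cache lines

// alignas rounds both the alignment and the size of the struct up to a
// whole cache line, so adjacent array elements never share a line.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

// Returns true if the two addresses fall on the same (assumed) cache line.
bool same_cache_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}

// Each of four threads increments only its own padded counter.
// Returns the sum of all counters after the threads have joined.
long count_with_padding(long increments_per_thread) {
    constexpr unsigned kThreads = 4;
    PaddedCounter counters[kThreads];
    std::thread pool[kThreads];
    for (unsigned t = 0; t < kThreads; ++t) {
        pool[t] = std::thread([&counters, t, increments_per_thread] {
            for (long i = 0; i < increments_per_thread; ++i) ++counters[t].value;
        });
    }
    long total = 0;
    for (unsigned t = 0; t < kThreads; ++t) {
        pool[t].join();
        total += counters[t].value;
    }
    return total;
}
```

The padding trades memory for isolation: each counter occupies a full line, so one thread's writes do not invalidate the line another thread is working on. A cache-aligned allocator applies the same principle to dynamically allocated blocks.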
Applicable to Various Application Domains
The Intel® TBB flow graph as well as generic template functions are customizable to a wide variety of problems.
User-Defined Tasks
When an algorithm cannot be expressed with high-level Intel® TBB constructs, the user can choose to create arbitrary task trees. Tasks can be spawned for better locality and performance or enqueued to maintain FIFO-like order and ensure starvation-resistant execution.
Dynamic Task Scheduling
Intel® TBB allows a developer to think about parallelism at a higher level and avoid dealing with the low-level details of threading. A developer expresses a parallel model in terms of parallel tasks and relies on Intel® TBB to execute them efficiently by dynamically detecting the appropriate number of threads. This makes Intel® TBB based solutions independent of the number of CPUs and allows performance to scale as the number of cores grows in the future.
Support for Various Types of Parallelism
The Intel® TBB task scheduler and parallel algorithms support nested and recursive parallelism as well as running parallel constructs side by side. This is useful for introducing parallelism gradually and lets different components of an application implement parallelism independently.
Co-existence with Other Threading Packages
Intel® TBB is designed to co-exist with other threading packages and technologies (Intel® Cilk™ Plus, Intel® OpenMP, OS threads, etc.). Different components of Intel® TBB can be used independently and mixed with other threading technologies.
Compiler-independent Solution
Intel® TBB is a library solution and can be used in software projects built by multiple compilers, across numerous platforms.
Royalty-free Distribution
Redistribute unlimited copies of the Intel® TBB libraries and header files with your application.
Open Source Version
Available for download from threadingbuildingblocks.org. The broad support from an involved community provides developers access to additional platforms and operating systems.