Technology Overview
RapidMind Architecture
RapidMind is a development and runtime platform that enables single-threaded, manageable applications to take full advantage of multi-core processors. With RapidMind, developers continue to write code in standard C++ and use their existing skills, tools, and processes. The RapidMind platform then parallelizes the application across multiple cores and manages its execution.
API
- Intuitive, integrates with existing C++ compilers, and requires no new tools or workflow
Platform
- Code Optimizer analyzes and optimizes computations to remove overhead
- Load Balancer plans and synchronizes work to keep all cores fully utilized
- Data Manager reduces data bottlenecks
- Logging/Diagnostics detects and reports performance bottlenecks
Processor Support Modules
- x86 processors from AMD and Intel
- ATI/AMD and NVIDIA GPUs
- Cell Blade, Cell Accelerator Board, PS3
The API
The RapidMind API is discussed in detail in the next section. It provides an intuitive interface for creating RapidMind-enabled applications entirely in C++ using existing tools and compilers. RapidMind is a solution for developers of both new and existing applications. To achieve results quickly, developers can RapidMind-enable only the most critical parts of their applications; the remainder of their code is unaffected.
The Platform
The RapidMind platform does the low-level heavy lifting required to achieve high performance on multi-core processors. In addition to adapting the computation to the capabilities of the specific processor in the deployment target, it manages communication and data flow between the host processor and target device(s). It handles memory transfers and load balancing, leaving the developer free to focus on high-level algorithms and design.
Processor Support Modules
The RapidMind Platform provides a set of backends. Each provides services that support the execution of RapidMind programs on a particular processor. The developer does not have to deal with the details of each processor, and is free to write portable applications that work on a variety of processor targets.
- The x86 backend executes RapidMind programs on x86 CPUs from Intel and AMD
- The GPU backend executes RapidMind programs on a variety of Graphics Processing Units (GPUs) from both ATI and NVIDIA
- The Cell BE backend executes RapidMind programs on the SPEs of the Cell Broadband Engine (Cell BE)
- The Debug backend executes RapidMind programs on the host processor, compiling programs with a C compiler
Parallelizing Your Application with RapidMind
The RapidMind platform enables the developer to easily and quickly parallelize applications.
- An application is expressed by the developer as a sequence of functions applied to arrays
- The RapidMind platform automatically divides the application’s data and computation among the cores for processing
- The SPMD stream processing model used by the RapidMind platform can easily scale to a large number of cores while maintaining the simplicity of a single thread of control
- The RapidMind runtime component is embedded in the resulting application and dynamically manages and optimizes the workload on the target processor
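The idea of applying one function to an array while the data is divided among cores can be sketched in plain standard C++. This is only an illustration of the SPMD pattern described above, not the RapidMind API: the helper name apply_parallel, its parameters, and the static chunking scheme are assumptions made for the example, and a real runtime would also balance load dynamically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not a RapidMind call): applies `kernel` to every
// element of `in`, splitting the index range into contiguous chunks and
// giving each chunk to its own worker thread.
template <typename Kernel>
std::vector<float> apply_parallel(const std::vector<float>& in, Kernel kernel,
                                  unsigned workers = std::thread::hardware_concurrency()) {
    if (workers == 0) workers = 1;  // hardware_concurrency() may report 0
    const std::size_t n = in.size();
    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<float> out(n);
    std::vector<std::thread> pool;

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no elements left for further workers
        // Every worker runs the same kernel on a disjoint slice, so the
        // writes to `out` never overlap and need no locking.
        pool.emplace_back([&in, &out, kernel, begin, end] {
            for (std::size_t i = begin; i < end; ++i) out[i] = kernel(in[i]);
        });
    }
    for (auto& t : pool) t.join();
    assert(out.size() == in.size());
    return out;
}
```

The caller still sees a single thread of control: a call such as apply_parallel(data, [](float x) { return 2.0f * x + 1.0f; }) looks like one function applied to one array, and the division of work stays inside the helper.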
Optionally: Accelerating to GPUs or the Cell
Accelerators such as the GPU or Cell provide an opportunity for further performance enhancements beyond that of the primary x86 processor (the host). If the hardware is available, you can achieve an additional speedup using accelerators without changing your application logic. The RapidMind platform automatically manages the movement of data and computation between the accelerator and the host.
Process
When using the RapidMind Multi-core Development Platform, developers continue to program in C++. After identifying the components of their application to accelerate, the overall process of integration is as follows:
1. Replace types: The developer replaces numerical types representing floating point numbers and integers with the equivalent RapidMind platform types.
2. Capture computations: While the application is running, the sequences of numerical operations it invokes can be captured, recorded, and dynamically compiled into a program object by the RapidMind platform.
3. Stream execution: The RapidMind platform runtime is used for managed parallel execution of program objects on the target hardware platform, which can be a GPU, the Cell processor, or a multicore CPU.
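The three steps above can be illustrated with a toy recorder written in standard C++. It is a sketch of the general capture-and-replay technique under simplifying assumptions, not the RapidMind API: the names Value, Program, capture, and execute are invented for this example, the "compiled" program is just an instruction list that is interpreted, and execution is serial rather than parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy illustration only: none of these names come from the RapidMind API.
enum class Op { Input, Const, Add, Mul };

struct Instr {
    Op op;
    std::size_t a, b;  // operand registers (unused for Input and Const)
    float k;           // literal value (used only by Const)
};

// A recorded "program object": instruction i writes its result to register i.
struct Program {
    std::vector<Instr> code;
    std::size_t result = 0;  // register that holds the final output

    std::size_t emit(Op op, std::size_t a = 0, std::size_t b = 0, float k = 0.0f) {
        code.push_back(Instr{op, a, b, k});
        return code.size() - 1;
    }

    // Evaluate the recorded operations for one input element.
    float run(float x) const {
        std::vector<float> reg(code.size());
        for (std::size_t i = 0; i < code.size(); ++i) {
            const Instr& in = code[i];
            switch (in.op) {
                case Op::Input: reg[i] = x; break;
                case Op::Const: reg[i] = in.k; break;
                case Op::Add:   reg[i] = reg[in.a] + reg[in.b]; break;
                case Op::Mul:   reg[i] = reg[in.a] * reg[in.b]; break;
            }
        }
        return reg[result];
    }
};

// Step 1 (replace types): a stand-in for float. Arithmetic on a Value
// records an instruction instead of computing a number.
struct Value {
    Program* prog;
    std::size_t reg;
};

inline Value operator+(Value x, Value y) {
    assert(x.prog == y.prog);  // operands must belong to the same capture
    return Value{x.prog, x.prog->emit(Op::Add, x.reg, y.reg)};
}
inline Value operator*(Value x, Value y) {
    assert(x.prog == y.prog);
    return Value{x.prog, x.prog->emit(Op::Mul, x.reg, y.reg)};
}
inline Value literal(Program* p, float k) {
    return Value{p, p->emit(Op::Const, 0, 0, k)};
}
inline Value operator+(Value x, float k) { return x + literal(x.prog, k); }
inline Value operator*(float k, Value x) { return literal(x.prog, k) * x; }

// Step 2 (capture computations): run `body` once on a symbolic input and
// keep the operations it performed.
template <typename F>
Program capture(F body) {
    Program p;
    Value input{&p, p.emit(Op::Input)};
    p.result = body(input).reg;
    return p;
}

// Step 3 (stream execution): apply the recorded program to each element.
// A real runtime would distribute this loop across cores or an accelerator.
inline std::vector<float> execute(const Program& p, const std::vector<float>& data) {
    std::vector<float> out;
    out.reserve(data.size());
    for (float x : data) out.push_back(p.run(x));
    return out;
}
```

For example, capture([](Value x) { return 2.0f * x + 1.0f; }) records the multiply and add once, and execute can then replay that recording over any array of floats. The kernel body reads like ordinary arithmetic; only the numeric type changed.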
Programming with RapidMind
The RapidMind Multi-core Development Platform
This document provides an overview of the RapidMind platform: its uses and benefits, how it works, and how software developers can integrate it into their application.
Build high-performance apps for multicore processors.
The RapidMind Multi-core Development Platform provides a simple single-source mechanism to develop portable high-performance applications for multicore processors. In particular, you can use it to develop applications that fully exploit the power of the Cell Broadband Engine™ (Cell/B.E.) processor’s unique architecture by writing only one, single-threaded C++ program using an existing C++ compiler. In this article, author Michael McCool takes you on a guided tour of the RapidMind Multi-core Development Platform.
This article is available from the IBM website.

RapidMind: C++ Meets Multicore - Making the most of multiple cores
RapidMind is a framework for expressing data-parallel computations from within C++ and executing them on multicore processors.
This article appeared in the July 2007 edition of Dr. Dobb’s Journal. Download the full article as a PDF.
