The computer industry is approaching a formidable obstacle course in which anyone wishing to drive advances in computing technology must carefully negotiate several segments of computing. Consumers want better battery life, size, and weight for their laptops, tablets, and smartphones. Likewise, data center power demands and cooling costs continue to rise. At the same time, we constantly demand higher performance to enable compelling new user experiences. We want to access our devices through more natural interfaces (speech and gesture), and we also want these devices to manage ever-expanding volumes of data (home movies, pictures, and a world of content available in the cloud).

To deliver such new user experiences, programmer productivity is another essential element. It needs to be simple and easy for software developers to tap into new capabilities through powerful, familiar programming models. It is increasingly important that software be supported across a broad spectrum of devices; developers cannot sustain today’s trend of rewriting code for an ever-expanding number of platforms. To navigate this complicated set of requirements, the computer industry needs to adopt a new, more efficient approach to computer architecture: an approach that delivers improvement across all four of these vectors: programmability, power, performance, and portability.

Heterogeneous Computing as a Solution

Heterogeneous computing offers an answer to these challenges: in simple words, specialized computing for specialized workloads, software, and usage models. It refers to systems that use more than one kind of processor or core.

Types of processing units/cores and how they work together in heterogeneous computing

Each type of processing unit/core is designed for a specific kind of task. The main types are described below:

CPU (Central Processing Unit): A CPU consists of a few cores optimized for sequential serial processing. It is built for low-latency execution of branch-heavy, general-purpose code rather than for bulk floating-point throughput.

GPU (Graphics Processing Unit): A GPU has a massively parallel architecture consisting of thousands of smaller, more efficient cores designed to handle many tasks simultaneously. It delivers high floating-point throughput, which makes it well suited to data-parallel workloads.
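
The contrast between the two designs shows up directly in their programming models. The following pure-Python sketch (all names are illustrative, not any vendor's API) mimics the GPU style: one small "kernel" is applied independently to every element, which is exactly what lets thousands of simple cores work in parallel on real hardware.

```python
# Illustrative sketch of the data-parallel (GPU-style) programming model.
# On a real GPU each kernel invocation would run on its own hardware thread;
# here we emulate the launch with a plain loop.

def saxpy_kernel(i, a, x, y):
    """Compute one output element: out[i] = a * x[i] + y[i]."""
    return a * x[i] + y[i]

def launch(kernel, n, *args):
    """Emulate launching `n` parallel threads of `kernel`."""
    return [kernel(i, *args) for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = launch(saxpy_kernel, len(x), 2.0, x, y)
# out == [12.0, 24.0, 36.0, 48.0]
```

Because each element is computed independently, the loop inside `launch` could be split across any number of cores with no coordination; that independence is what GPU architectures exploit.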

APU (Accelerated Processing Unit): An APU combines a CPU and a GPU on a single chip. Because the two processors share one die, they can communicate faster, giving greater processing power and performance, but an APU is not a replacement for a standalone GPU or CPU.
DSP (Digital Signal Processor): DSPs are designed to quickly perform a large number of numeric operations repeatedly on a series of data samples, which makes them ideal for processing streaming digital signals. For these workloads, their power consumption is typically far lower than that of other processing units.
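
The kind of repetitive per-sample arithmetic a DSP excels at can be sketched in plain Python as a finite impulse response (FIR) filter, where every incoming sample triggers the same multiply-accumulate loop. The coefficients below are arbitrary illustrative values; real DSPs implement this pattern in dedicated multiply-accumulate hardware.

```python
from collections import deque

def fir_filter(samples, coeffs):
    """Apply an FIR filter: each output is a weighted sum of the most
    recent len(coeffs) input samples -- the multiply-accumulate pattern
    that DSP hardware is built around."""
    history = deque([0.0] * len(coeffs), maxlen=len(coeffs))
    out = []
    for s in samples:
        history.appendleft(s)           # newest sample first
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out

# A 3-tap smoothing filter (illustrative coefficients, exact in binary):
smoothed = fir_filter([4.0, 8.0, 4.0], [0.5, 0.25, 0.25])
# smoothed == [2.0, 5.0, 5.0]
```

The same few instructions run for every sample, with no branching, which is why a DSP can execute this far more efficiently than a general-purpose core.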

FPGA (Field Programmable Gate Array): An FPGA is entirely different from a CPU, GPU, DSP, or any other processing unit. With very few exceptions, every chip inside your computer is hard-wired (at the time of manufacturing) to perform one set of functions: your CPU can only do exactly what Intel or AMD designed it to do, and you cannot turn it into a GPU. But you can take an FPGA, program it to perform one set of functions (say, graphics), and then reprogram it to handle another type of workload (say, sorting through databases). The main advantage of an FPGA, beyond its customizability, is very high performance for the specific task it has been configured for.

Different use cases require different combinations of processing units/cores to strike the required balance between the four vectors: power, performance, programmability, and portability.

Use case 1

Qualcomm is leaning into its Hexagon DSP. The reason: once the DSP is programmed for a task, it will generally perform that function at a much lower power draw, making Qualcomm's processors efficient in their segment of the market (handheld devices, tablets, and smart devices).

Use case 2

Providing processors for servers has long been one of Intel's core businesses. Intel Xeon chips are good all-round chips, but there are plenty of cases where another, more workload-specific chip might make sense, and people soon started looking at cheaper, low-power chips for web servers. Rather than creating a generic solution, Intel created a new processor combining a Xeon core with an FPGA. The FPGA can be programmed to behave like a more workload-specific chip, similar to an ASIC (Application-Specific Integrated Circuit). Use of FPGAs is gaining pace as they are adopted by giants like Microsoft, IBM, and Facebook and by large data centers, with proven gains in performance and power consumption.

Use case 3

Machines used for deep learning, analytics, and engineering often rely on GPU-accelerated computing: the use of a graphics processing unit (GPU) together with a CPU. The compute-intensive portions of the application are offloaded to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run much faster.
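
The host/device split described above can be sketched as follows. This is a minimal pure-Python illustration (all function names are hypothetical): the host code keeps control flow and I/O, and hands only the compute-intensive kernel to whichever backend is available, falling back to the CPU when no accelerator is present.

```python
# Minimal sketch of the offload pattern behind GPU-accelerated computing.
# All names here are hypothetical illustrations, not a real GPU API.

def cpu_dot(a, b):
    """Fallback: run the compute-intensive kernel on the CPU."""
    return sum(x * y for x, y in zip(a, b))

def pick_backend():
    """In a real application this would probe for an accelerator
    (e.g. via a GPU library); this sketch always returns the CPU path."""
    return cpu_dot

def accelerated_dot(a, b):
    kernel = pick_backend()   # offload decision made by the host
    return kernel(a, b)       # the rest of the program stays on the CPU

result = accelerated_dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
# result == 32.0
```

The key design point is that only the hot inner kernel is dispatched; the surrounding application logic is unchanged, which is why users just see the same program run faster.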

These capabilities have made heterogeneous computing a buzzword in technology circles for quite some time now!

The author is a Senior Solutions Architect at Xavient Information Systems.

