OpenCL: what is it, and what is this CPU and GPU API for?

First of all, it should be clarified that OpenCL is not a type of hardware but software: an API that applications use to communicate with the GPU through a software-level abstraction of the GPU itself. What sets it apart from the other APIs that applications use to talk to the GPU is that it is not a graphics API but a compute API, aimed above all at scientific computing.

GPU for computing

What is GPU computing? Keep in mind that GPUs can run programs called shaders, which manipulate the characteristics of the various primitives in the 3D pipeline, whatever their form. A processor, however, only ever sees a series of binary data, so when a shader unit executes a shader program it is simply processing a set of data, and it can therefore be used to process any type of data.

In the mid-2000s, the arrival of GPUs with unified shaders opened up the possibility of using them in markets beyond PC gaming. The main one was scientific computing, which led to the launch of the NVIDIA Tesla range in 2007.

What distinguishes these GPUs from those used for games is their ability to work efficiently with double-precision floating point, which games do not need but which is essential across the sciences, whether for astronomical calculations or for developing a new generation of drugs.
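To see why double precision matters for this kind of work, here is a minimal sketch in plain Python (not OpenCL code). It emulates single precision by rounding through the standard `struct` module; the helper names `to_single`, `add_single` and `add_double`, and the sample values, are made up for this illustration.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add_single(a: float, b: float) -> float:
    """Add two numbers the way a single-precision unit would, rounding the result."""
    return to_single(to_single(a) + to_single(b))

def add_double(a: float, b: float) -> float:
    """Add two numbers in double precision (Python's native float)."""
    return a + b

distance = 16_777_216.0  # 2**24, an illustrative large quantity in arbitrary units
step = 1.0               # a small increment added to it

print(add_single(distance, step))  # 16777216.0: the increment is lost
print(add_double(distance, step))  # 16777217.0: the increment is kept
```

Single precision carries a 24-bit significand, so above 2**24 it can no longer represent every whole number. A simulation that adds many small steps to a large value would silently stop advancing, which a game can usually tolerate but a scientific calculation cannot.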

OpenCL, an API for GPU computing

Until OpenCL emerged, graphics APIs were designed only for rendering graphics, not for computing, so they were not fully efficient at running non-graphics algorithms on a GPU. The solution? Developing an API for computing, which was named OpenCL, short for Open Computing Language.

But how does OpenCL differ from other APIs? OpenCL can run on any kind of processor, not only GPUs: if we want to run OpenCL code on a CPU, we can, and it can also run on DSPs, FPGAs, neural network accelerators, and a long list of other devices.

The reason is that its model is based on distributed computing: there is a host unit, which is the CPU, and a series of processing units, which can be GPUs, DSPs, FPGAs, and so on, to which the host sends the tasks to be performed. Each task runs on a processing element, and when it finishes, the result and/or a confirmation that the task has been carried out is sent back to the host. Each task behaves like a separate instance of the program, that is, a thread with its own program counter.
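The host and device relationship described above can be sketched in plain Python. This is only a simulation of the idea, not the OpenCL API: `vector_add_kernel` and `enqueue` are hypothetical names, a thread pool stands in for the device, and each task (a "work-item" in OpenCL terminology) receives its own index.

```python
from concurrent.futures import ThreadPoolExecutor

def vector_add_kernel(global_id, a, b, out):
    """Kernel body: each task handles exactly one element, chosen by its index."""
    out[global_id] = a[global_id] + b[global_id]

def enqueue(kernel, global_size, *buffers, workers=4):
    """Hypothetical host-side helper: launch one task per index and wait for all of them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(kernel, gid, *buffers) for gid in range(global_size)]
        for future in futures:
            future.result()  # completion confirmation; re-raises any error from the task

# Host side: prepare the buffers, send the work, read back the result.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * len(a)
enqueue(vector_add_kernel, len(a), a, b, out)
print(out)  # [11, 22, 33, 44]
```

The point of the design is that the kernel contains no loop over the data. The host decides how many tasks to launch, and the device is free to run them in any order or in parallel.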

OpenCL is not a graphics API

It should be noted that OpenCL does not control the graphics pipeline and therefore is not used to render graphics, since many of the functions found in OpenGL and in other APIs such as Direct3D and Vulkan are not in OpenCL. That said, OpenCL was originally designed to interoperate with OpenGL and is now designed to work with Vulkan, the Khronos Group’s current graphics API.

Another difference is the programming language used to write shader programs. Graphics APIs use high-level shader languages, such as GLSL for OpenGL and Vulkan or HLSL for DirectX.

OpenCL, on the other hand, uses dialects of general-purpose languages such as C and C++ rather than graphics-specific ones. This makes it easier to port programs and algorithms written in these languages to OpenCL, so that they can run on every type of device that supports the API and take advantage of a versatility that the more limited shader languages lack.

How is OpenCL used in everyday applications?

OpenCL is widely used in some PC applications, especially multimedia ones. When, for example, we tell Photoshop to apply an image filter, the work can be done through OpenCL, and the algorithm runs on the most suitable hardware that supports the API, so if a better-suited component is installed, the OpenCL part will run on it.

Other everyday applications that use OpenCL are software implementations of video codecs such as AV1, HEVC, and H.264, some of which rely on OpenCL for the same reasons mentioned above. This lets them run on whatever processor is available, so developers do not have to check whether the hardware includes a dedicated video codec and optimize for it.

Interestingly, OpenCL is also the reason the VGA-based 2D hardware has disappeared from graphics cards: although it seems contradictory, it is much better to render the 2D graphical interface of an operating system through GPU compute.

DirectX compute and NVIDIA’s boycott with CUDA

OpenCL use is in decline, especially since DirectX 11 added Compute Shaders to its repertoire and Apple introduced its own Metal API. The emergence of graphics APIs with partial compute support is what made OpenCL less important.

The gradual abandonment of OpenCL began with the introduction of Compute Shaders. The latest widely used version of the standard is 1.2, which is very rudimentary compared to what other APIs can do: it lacks features such as shared virtual memory and SPIR-V, which allows better interaction with Vulkan.

But CUDA is OpenCL’s main enemy. NVIDIA has dominated the world of high-performance GPUs for years, and it took the opportunity to build a lot of scientific computing around CUDA rather than OpenCL, since CUDA ties programs to NVIDIA hardware. NVIDIA was able to do this because its NVIDIA Tesla range faced a complete lack of competition.

How did NVIDIA boycott OpenCL? It never officially supported the OpenCL 2.0 enhancements, even though equivalent features were available in CUDA, so its Tesla, Quadro, and GeForce GPUs were left without full OpenCL 2.0 support.

Third time lucky?

In the end, to avoid the definitive collapse of OpenCL, the entire API had to be rethought for its third version. In that release, much of what was part of the core of OpenCL 2.x has been demoted to optional extensions, so the base hardware no longer needs to support it. It is therefore possible to run OpenCL 3.0 on hardware with OpenCL 1.2 drivers and add only the extensions we want to use, a way to get around NVIDIA’s boycott.
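With so much functionality now optional, an application has to check what a driver reports before relying on it. The sketch below, in plain Python, shows one way to do that, assuming the extension list has already been obtained as a space-separated string (the form in which an OpenCL device query such as CL_DEVICE_EXTENSIONS returns it). The `reported` string is a made-up sample and `has_extension` is a hypothetical helper, not part of any OpenCL binding.

```python
def has_extension(extensions: str, name: str) -> bool:
    """Return True if `name` appears as a whole entry in a space-separated extension list."""
    return name in extensions.split()

# Made-up sample of what a driver might report; real devices will differ.
reported = "cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_byte_addressable_store"

# Pick a code path based on what is actually available.
use_double = has_extension(reported, "cl_khr_fp64")
use_spirv = has_extension(reported, "cl_khr_il_program")

print(use_double)  # True
print(use_spirv)   # False
```

Splitting the string before testing avoids false positives from substring matches, for example a query for a name that is only a prefix of a longer extension name.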

All in all, the problem OpenCL faces is that, apart from scientific computing, GPU compute is used most in video games, especially for physics calculations and collision detection. Since Compute Shaders exist in both Vulkan and DirectX, OpenCL is relegated to scientific computing, which is currently the absolute domain of CUDA.

One market the API could have reached and succeeded in is that of Raspberry Pi-type embedded devices, but version 2.0 left them out because it focused too much on scientific computing. Version 3.0 is not designed to bring OpenCL to embedded systems, which would readily adopt it for a multitude of applications; instead it seeks to win a war that is already lost, and within the Khronos Group itself there is already competition to OpenCL in the form of Vulkan.