Internal communication and data transfer in the CPU

There is a cost that usually goes unnoticed, because it does not seem directly related to the number of calculations a processor can perform or the number of elements it contains. In recent years, however, it has become the number one problem for hardware architects, and the one least discussed in the specialized computing press, even though it is the elephant in the room: the energy cost of transferring data within a processor.

Internal communication and data transfer

The simplest way to transmit data from one electronic system to another is to use a transmitter and a receiver that exchange signal pulses continuously, with the clock signal setting the tempo of each bit sent to the receiver, as if it were a metronome. The wiring provides the channels through which these signals travel.

If we want to transmit data in both directions, it is enough to place a transmitter and a receiver in each direction, and if we want to transmit several bits in parallel, it is enough to add as many transmitters and receivers as there are bits.
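The scheme described above can be sketched in a few lines of Python. This is only a toy model: the function names (`serialize`, `transmit`, `deserialize`) and the `lanes` parameter are made up for illustration, and the "clock" is simply a loop counter in which each tick moves one bit per wire.

```python
def serialize(value, width=8):
    """Split an integer into bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def deserialize(bits):
    """Rebuild the integer from bits received least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def transmit(bits, lanes=1):
    """Send bits over `lanes` parallel wires, one bit per wire per clock tick.

    Returns the received bits and the number of clock ticks that were needed.
    """
    received = []
    ticks = 0
    for start in range(0, len(bits), lanes):
        # On each tick of the shared clock, every lane carries one bit.
        received.extend(bits[start:start + lanes])
        ticks += 1
    return received, ticks


payload = 0b10110010
bits = serialize(payload)

serial_bits, serial_ticks = transmit(bits, lanes=1)
parallel_bits, parallel_ticks = transmit(bits, lanes=8)

print(deserialize(serial_bits) == payload, serial_ticks)      # True 8
print(deserialize(parallel_bits) == payload, parallel_ticks)  # True 1
```

With a single wire the eight bits take eight ticks; with eight wires they arrive in one tick, at the price of eight times as many transmitters, receivers and wires.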

Sounds simple, doesn't it? But we are missing one thing: the cost of moving information from one part of the chip to another. It is as if we were told that a fleet of trucks can carry so many kilos of goods, while everyone had forgotten the cost of fuel because, up to a certain point, it had been completely marginal.

A little history: the end of Dennard scaling

Dennard scaling came to an end in the mid-2000s, around the 65 nm node, but its end still has significant consequences today. Stated as a law, it can be summed up as follows:

If the transistor dimensions and the voltage are scaled down together when moving to a new manufacturing node, the power consumption per unit area should remain the same.
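The statement above can be checked with the usual approximation for dynamic power, P ≈ C·V²·f. That formula is brought in here for illustration and is not part of the original text, as are the symbols: k is the scaling factor between nodes, C the switched capacitance, V the voltage, f the clock frequency and A the area.

```latex
\begin{align*}
P &\approx C V^{2} f \\
C' &= \frac{C}{k}, \quad V' = \frac{V}{k}, \quad f' = k f, \quad A' = \frac{A}{k^{2}} \\
P' &\approx \frac{C}{k} \cdot \frac{V^{2}}{k^{2}} \cdot k f = \frac{P}{k^{2}} \\
\frac{P'}{A'} &= \frac{P / k^{2}}{A / k^{2}} = \frac{P}{A}
\end{align*}
```

Power per transistor falls by the same factor as its area, so power per unit area stays constant, but only as long as the voltage keeps scaling down with the dimensions.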

How was the end of Dennard scaling reached? Simply put, the designers of the various microprocessors pushed their designs to much higher clock speeds than they should have, and they hit a wall. This forced them to change course: from 2005 onwards, performance per watt started popping up all over the marketing slides as the new performance metric.

From then on, the engineers' obsession was reversed: they went from largely ignoring energy consumption to wanting to increase the number of operations per watt a processor could perform. In the midst of this shift, however, the energy cost of data communication was overlooked, because for a long time it had been almost negligible. That has not been the case for some time.

The internal communication bottleneck

To understand the communication problem, we have to keep in mind that by increasing the number of elements in a processor, we also increase the number of communication channels they need, which is 2n, where n is the number of participants in the communication.
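As a rough illustration, the 2n figure can be read as one transmit channel and one receive channel per participant towards a shared interconnect. That reading is an assumption, not something the text spells out. The sketch below compares it with a fully connected layout in which every participant has a direct one-way channel to every other one; that second case is added only for comparison, and both function names are hypothetical.

```python
def shared_interconnect_channels(n):
    """One transmit and one receive channel per participant: 2n."""
    return 2 * n


def point_to_point_channels(n):
    """A direct one-way channel for every ordered pair: n * (n - 1)."""
    return n * (n - 1)


for n in (2, 4, 8, 16, 64):
    print(n, shared_interconnect_channels(n), point_to_point_channels(n))
```

Under either assumption the channel count only grows with the number of participants, and with direct links it grows much faster than the number of cores.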

As a result, adding more elements to a processor also increases the number of communication channels and the energy they draw. To keep power consumption stable, the clock speed has to be kept lower; the trade-off is that, as the number of cores grows, energy consumption does not stay flat but keeps rising.

The interconnects that link the different cores grow over time because configurations become more complex. The amount of information they carry and the energy they consume to carry it keep increasing, taking up an ever larger share of the energy budget of processors, whether CPU or GPU. This poses a challenge, especially in GPUs, which are built from dozens or even hundreds of cores.

So what’s the problem with internal communication in processors?

The problem engineers now face is that, in a simple addition, the cost of adding the operands is negligible compared to the cost of moving the two operands to where they are added. The question is therefore no longer whether a simple ALU working on data in its registers can reach a certain performance rate, but whether that rate can be sustained when the data comes from farther away, in which case it ends up consuming more energy.
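A toy model makes the imbalance easier to see. The two energy constants below are placeholders chosen purely for illustration, not measurements of any real chip, and the linear "energy per bit per millimetre" wire model is an assumption as well.

```python
# Placeholder figures for illustration only; they are NOT measured values.
ADD_ENERGY_PJ = 1.0              # hypothetical cost of one addition in the ALU
WIRE_ENERGY_PJ_PER_BIT_MM = 0.5  # hypothetical cost of moving one bit by 1 mm


def transfer_energy_pj(bits, distance_mm):
    """Energy to move `bits` bits over `distance_mm` of on-chip wire."""
    return bits * distance_mm * WIRE_ENERGY_PJ_PER_BIT_MM


def add_with_fetch_energy_pj(operand_bits, distance_mm):
    """Return (total, movement) energy for fetching two operands and adding them."""
    movement = 2 * transfer_energy_pj(operand_bits, distance_mm)
    return ADD_ENERGY_PJ + movement, movement


for distance_mm in (0, 1, 5, 20):
    total, movement = add_with_fetch_energy_pj(64, distance_mm)
    share = movement / total
    print(f"{distance_mm:>2} mm: total {total:7.1f} pJ, movement share {share:.0%}")
```

Whatever the real constants are, the shape of the result is the same: the addition itself is a fixed cost, while the movement term grows with operand width and distance, so it dominates as soon as the data is not already next to the ALU.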

This means that designs that seem feasible on paper, given the computational capacity involved, have to be ruled out once the logistics of the data and the energy it consumes are examined.

How has the energy consumed changed over time?

With regard to the energy consumption of internal communication, we have to separate the energy cost into two blocks. On the one hand, there is the energy cost of computation, which followed Dennard scaling and whose improvement has slowed down in recent years, although it still progresses steadily, which translates into higher performance.

On the other hand, we have the cost of communication. This is not the cost of communication between the RAM and the processor, but how much energy it costs to transmit a block of data.

We cannot forget that computation and data transfer are linked to each other. Poor communication will result in low computing power, and low computing power will largely waste an efficient communication structure.
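One common way to express this link is a roofline-style bound, in which throughput is limited by whichever is exhausted first: the compute units or the data supply. This model is brought in here for illustration and is not from the original text. The function name and the example numbers are hypothetical.

```python
def attainable_ops_per_s(peak_ops_per_s, bytes_per_s, ops_per_byte):
    """Throughput is capped by compute or by data supply, whichever is lower."""
    return min(peak_ops_per_s, bytes_per_s * ops_per_byte)


# Strong compute, weak communication: the data supply sets the limit.
print(attainable_ops_per_s(peak_ops_per_s=1000, bytes_per_s=100, ops_per_byte=2))  # 200

# Weak compute, strong communication: most of the bandwidth sits unused.
print(attainable_ops_per_s(peak_ops_per_s=100, bytes_per_s=1000, ops_per_byte=2))  # 100
```

In the first case 80% of the compute capacity is idle; in the second, the interconnect could feed twenty times more work than the processor can do.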

The future is in infrastructure

While the talk is of faster and faster processors, with more cores and therefore more power, all the effort is going into improving energy efficiency when transmitting data, because this is the next bottleneck that designers will run into when trying to increase the performance of the different processors.

This is why new ways of organizing a processor are in development and becoming more and more common: scaling a processor in the conventional way is no longer possible without the specter of power consumption due to data transfer appearing.

For now, monolithic designs hold up, but we do not know how far they can scale.