The Chinese Tianhe-1A supercomputer: It’s the interconnect, stupid!

by Kurt Shuler, on Nov 16, 2010

The whiny vinegar boils down to this: “That 2.6 petaflop/sec LINPACK score isn’t that great because they didn’t do anything special with the cores, just used existing Intel and Nvidia products with a fancy new interconnect. Anybody could have done that.”

That statement is correct in 3 out of 4 ways:

  1. The Chinese team did use 14,336 standard Xeon X5670 processors and 7,168 Nvidia Tesla M2050 GPUs.
  2. They did come up with a fancy new interconnect.
  3. And yes, anybody could have done that (but didn’t).

However, the fourth claim is wrong: their score is truly awesome, nearly 50% higher than that of the second-place system, Oak Ridge National Laboratory’s Cray Jaguar.

What is most spectacular to me is that the engineering team at the National Supercomputing Center in Tianjin did what good engineering teams should do: find the bottleneck in a system and work to remove it.

So rather than invent a new core for the sake of glory, they analyzed current supercomputers to determine where the bottleneck was and worked to find a better way. It seems InfiniBand was gating performance, so the Chinese team tackled the interconnect. According to EE Times, they invented an interconnect chipset called Galaxy that pumps 160 gigabits per second.

What does this have to do with system-on-chip interconnects? Plenty.

When system-on-chip designers set out to meet requirements with a new design or derivative, they often forget to reexamine past designs to understand what was gating performance or which features customers wanted but could not get. We tend to jump straight into the trap of “it will have this latest ARM core” and “there’s an update to the graphics core we can add” before we look at the system as a whole to understand what is most important. In other words, we often don’t understand the degrees of freedom we should exercise when developing a new product because we are blinded by the “sexy” stuff we could do.

The interconnect isn’t sexy (well, I think it is, but most people don’t). But if it is gating performance, or using too much power, or increasing the size of your die, then it is a bottleneck. Removing that interconnect bottleneck is just as important as, or maybe even more important than, adding that fancy new IP core that everyone is all excited about.
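To put rough numbers on that argument, here is a minimal Amdahl's-law-style sketch. The runtime split (40% compute, 60% waiting on the interconnect) and the 2x speedups are made-up figures for illustration only, not measurements from Tianhe-1A or any real chip, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(fractions, speedups):
    """Amdahl-style estimate: each component's share of the original
    runtime is divided by that component's own speedup factor."""
    assert abs(sum(fractions.values()) - 1.0) < 1e-9, "shares must sum to 1"
    new_time = sum(share / speedups.get(name, 1.0)
                   for name, share in fractions.items())
    return 1.0 / new_time


# Hypothetical runtime breakdown (assumed, not measured):
# the workload spends most of its time waiting on the interconnect.
baseline = {"compute": 0.4, "interconnect": 0.6}

# Option A: drop in a core that is twice as fast.
faster_core = overall_speedup(baseline, {"compute": 2.0})

# Option B: keep the same cores, double the interconnect throughput.
faster_fabric = overall_speedup(baseline, {"interconnect": 2.0})

print(f"2x faster core:         {faster_core:.2f}x overall")    # 1.25x
print(f"2x faster interconnect: {faster_fabric:.2f}x overall")  # 1.43x
```

Under these assumed numbers, the unglamorous interconnect upgrade beats the shiny new core. Which way it goes depends entirely on where the time is actually being spent, which is why you measure first.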

So next time you set out to design a new system-on-chip, think about the success of the Chinese supercomputer team. You just might be missing something.



EE Times, “Interconnect pushed China super to #1”. October 28, 2010.

“Top100爆冷门 天河一号力压星云再夺魁” [“Top100 upset: Tianhe-1 beats Nebulae to take the crown again”]. October 28, 2010.

EE Times, “China expands presence on Top 500 list”. November 14, 2010.

New York Times, “China Wrests Supercomputer Title From U.S.”. October XX, 2010.