As large supercomputers keep getting larger,Sunnyvale, California-based Cerebras has been taking a different approach. Instead of connecting more and more GPUs together, the company has been squeezing as many processors as it can onto one giant wafer. The main advantage is in the interconnects—by wiring processors together on-chip, the wafer-scale chip bypasses many of the computational speed lossesthat come from many GPUs talking to each other, as well as losses from loading data to and from memory.
Now, Cerebras has flaunted the advantages of their wafer-scale chips in two separate but related results. First, the company demonstrated that its second generation wafer-scale engine, WSE-2,was significantly faster than world’s fastest supercomputer, Frontier, in molecular dynamics calculations—the field that underlies protein folding, modeling radiation damage in nuclear reactors, and other problems in material science. Second, in collaboration with machine learning model optimization company Neural Magic, Cerebras demonstrated that a sparse large language model could perform inference at one-third of the energy cost of a full model without losing any accuracy. Although the results are in vastly different fields, they were both possible because of the interconnects and fast memory access enabled by Cerebras’ hardware.
To read more, click here.