Intel supercomputer: Aurora is to deliver 2 ExaFLOPS, ZettaFLOPS as early as 2027


At the start of Intel ON, Intel CEO Pat Gelsinger almost exuberantly highlighted the Aurora supercomputer, which is now supposed to become faster still: it is to deliver two ExaFLOPS instead of one. After years of delays, Intel wants to bring the project to a conciliatory conclusion. At the same time, however, a new bout of megalomania is emerging.

Only the name “Aurora” has remained

ComputerBase has repeatedly reported on the problem child Aurora. When the project was announced in 2015, Intel was supposed to supply 10 nm CPUs and Knights Hill accelerators for a supercomputer launching in 2018. Then came the big swan song: Knights Hill and the entire product family were canceled, and Aurora never arrived.

But Intel wanted to keep the prestigious contract: in 2019, a completely redesigned Aurora system with Xe GPUs was promised for 2021/2022. That system, too, struggled with delays, and contractual penalties are now due. In its Q2/2021 quarterly report, Intel told analysts that it would have to set aside 300 million US dollars for them at the end of the year.

Over 2 ExaFLOPS performance (Image: Intel)

But now Intel suddenly speaks of an “over-fulfillment” of the contract, “yet again”, as Gelsinger claims. How did that come about?

Ponte Vecchio does more than expected

When it announced the new Xe GPUs in 2019, Intel designed the supercomputer so that it would reach the one-ExaFLOPS mark with certainty, providing many additional nodes and space for extra CPUs and GPUs. Three years ago, the manufacturer simply did not know how good or bad Ponte Vecchio would ultimately turn out to be, and the GPUs in Aurora do most of the work.

The HPC accelerator chip now appears to be performing better than expected; Intel speaks of being “well ahead of those performance objectives”. As a result, the reserve capacity is not needed to reach the 1-ExaFLOPS mark, but is instead being used to build a system with over 2 ExaFLOPS of peak performance.

Installation of the supercomputer has finally begun. The shell of the hall is almost finished, and the system is to follow, in any case this year, as Intel wants to prove with first pictures. The system should be ready for use in 2022. In the end, Aurora could be the fastest supercomputer in the world, if only for a short time.

Preparations for Aurora (Image: Intel)

The first ExaScale supercomputer comes from China

The first ExaScale supercomputer in the US will most likely not even be Aurora, but AMD’s Frontier. And in the global race, other systems may already have reached the goal: as The Next Platform explains, China has apparently had two ExaScale systems in operation since the spring, it just has not publicized the fact.

ZettaScale as early as 2027!?

Only recently, Intel’s new head of the supercomputing division claimed that Intel no longer wanted to be in the limelight when it came to supercomputers. At Intel ON, the U-turn came just days later: after the first hint less than two weeks ago that Intel was aiming for ZettaScale within this decade, the year 2027 has now been fixed, and Intel of course wants to be first (and back in the spotlight).

Right now, and especially after the recent problems, this sounds anything but realistic and almost like megalomania, as even Raja Koduri admits on Twitter. To reach the goal, further massive innovations would be necessary, unless the system were simply to be scaled up by a factor of several hundred, consuming gigawatts of electricity in the process, as The Next Platform also explains.
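A quick back-of-the-envelope calculation illustrates the scale of the problem (a rough sketch, assuming the power budget of around 60 MW that has been cited for the 2-ExaFLOPS Aurora):

1 ZettaFLOPS = 1,000 ExaFLOPS, i.e. 500 times the performance of a 2-ExaFLOPS Aurora
500 × 60 MW ≈ 30 GW

Naive scaling would therefore require the output of dozens of large power plants, which is why Koduri’s tweets below revolve around a roughly 1,000-fold gain in computational energy efficiency instead.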

It will be interesting to see what performance Intel ultimately delivers in 2027, at what price, and at what power consumption.

Update 10/29/2021 10:47 a.m.

Intel’s Raja Koduri has once again weighed in on the topic in several Twitter postings. Since Intel has not delivered recently, he understands the skepticism and the laughter at the new targets, even if the latter naturally annoys him, as the course of the tweets makes clear. In several postings, he then explains in more detail how the goal could be achieved. For the sake of clarity, they are reproduced together here.

Thanks for the feedback. Given our past decade of not so stellar execution on HPC, we did expect to be laughed at..fair enough .. Starting SC’21 we will begin rolling out more details on our near and medium term plans.

I have been framing the need for 1000x more compute for rapidly evolving AI models by 2025 for the past couple of years. My talks at hot-chips 2020 and Samsung forum highlighted the need for the whole eco-system coming together to accelerate towards this goal.

1000x in which workload is an important detail … but not the top order bit IMO, as you can always narrow down to a “benchmark” to make the goal either “easy” or “hard”.

As highlighted in the “no transistor left behind talk” .. there’s 65,000x opportunity today with sw-hw co-optimization today without any new physics or crazy hw and this path will be exploited by many as we already see evidence with M1 etc

The 1000x framing we are looking at doesn’t count these single node level hw-sw co-optimizations. Those will be on top of the basic hardware + system arch targets we set. As you can imagine there are a lot of internal debates on choices of the workloads and scale

So why declare this now? The new intel that pat aspires to be an “all-in” culture..we declared our commitment for open – Pat opening this up definitely makes our (engineers) jobs easier to collaborate with other key external players who are also inspired by these targets.

And the price of an open approach to any forward looking roadmap is – being laughed at ..

If the world gets to 1000x computational energy efficiency on key workloads one way or another by 2027, it’s a small price to pay!

And even more important than the goals – just one or two fundamentally new things created during this journey will help other moon and beyond shots.

Raja Koduri
