
Source: HPCwire
For a decade, beginning in 2007, AMD appeared to have been demolished by Intel in both the PC processor and server markets. And then AMD reappeared, almost like the phoenix from its ashes. It has been a spectacular climb over the past four years, and it started with the introduction of a new server processor called EPYC.
EPYC is now in its third generation, and AMD engineers in India have played a central role in every one. The first EPYC, based on an x86 core architecture called Zen, was developed from the ground up with significant contributions from the Indian teams in Bangalore and Hyderabad. The updated second generation was released two years later and set more than 170 world records in data center CPU performance, security and scalability. The third generation, AMD's latest 7-nanometer chip, code-named Milan, was released earlier this year and is believed to be the fastest server processor in the world. The critical aspects, both the hardware and the software, were built in India.
Designing Hardware
Jaya Jagadish is the Country Head of AMD India. She is also Corporate Vice President of Silicon Design Engineering and leads the core team at AMD India responsible for developing the CPU, or processor core, and the system on chip (SoC). “The SoC contains several building blocks called IPs. The SoC team takes all the different IPs, connects them, and puts them together. Then we check the whole system to make sure it is doing what it is supposed to do. We also do a lot of stress tests to make sure we have a system that can be used in high-performance computers,” says Jagadish.
She says several of the critical IPs in the EPYC SoC were developed in India. Once the chip is designed, all tests are done with simulations. Then the chip is manufactured and tested again by a huge team. This level of testing is conducted in the US, with the Indian team constantly submitting new patterns for testing and reviewing the logs sent back from the US for errors and anomalies. A further level of testing, known as board-level testing, is carried out in India.
The hardest part of the whole process, Jagadish says, was the final design phase, when the pandemic hit and the lockdown took place. They had to shut down the campus overnight and move to working from home. The last three months of the design phase are very critical for the SoC team, as the design is so massive that it takes about a week to load everything onto a computer and run it. Any mistake can lead to weeks of delays. “But the passion and commitment of the team were just incredible, and we were able to complete our work from home all over the country,” she says.
It is extremely difficult to find people with experience in CPU design at this high level, so, Jagadish says, AMD is generally looking for people with a strong technology foundation, who are willing to invest time in learning.
Building Software
Jay Hiremath is Corporate Vice President of Software and Software Engineering at AMD India. He has been with the company for 28 years and says these are the most exciting times for AMD. Hiremath's team has full ownership of CPU software development and platform engineering. Team members build the entire software stack used to optimize the server processor to deliver world-record benchmark performance.
Hiremath says his team’s main job is to get the most performance out of the processors. “We have the best processor in its class, but if we can’t use the processing power of those processors, then the hardware is not good. This is where software comes in,” he says. The team works on toolchains (compilers, libraries and profilers), debugging tools, system management, platform drivers and machine learning. Benchmark statistics determine the average selling price of a processor.
As a result, the team needed to optimize performance for benchmarks and then optimize more broadly for each application the processor was intended for. “Whether the high-performance computing segment, the enterprise segment or the cloud segment, each has its own requirements and quirks, and that requires customization in terms of performance tuning. This is why the right design of the software is so crucial,” says Hiremath.
Today, most new supercomputers use AMD’s second and third generation server chips. Powered by a custom AMD EPYC CPU and an AMD Radeon Instinct GPU, Frontier will be the world’s first exascale supercomputer. It will be capable of performing a billion billion (one quintillion) operations per second. Today’s fastest supercomputers solve petascale problems. Frontier will deliver more than 1 exaflops of peak processing power, that is, more than 1,000 petaflops.