U.S. Angles to Retake Supercomputer Lead

Release time:2017-11-30
author:Ameya360
source: R. Colin Johnson

  The latest Top500 list of the world’s fastest supercomputers turns the spotlight on China, which overtook the United States in the total number of ranked systems and which scored the top two fastest installations on the list. Announcements from IBM, Intel, and Advanced Micro Devices, however, position the U.S. industry for a comeback. Rather than target systems that test well on the Top500’s distributed-memory version of the Linpack benchmarks (High Performance Linpack), the companies aim to render those measurements irrelevant on their way to beating China to exascale computing.

  China not only captured first and second place in the ranking of the fastest installed systems but also took the largest share of ranked installations and the aggregate performance lead, according to the November 2017 Top500 list, the 50th published since the ranking debuted in June 1993. According to the Top500 organization, “There is no system from the USA under the Top3. #1 and #2 are installed in China ... the USA decreased to a new record low of 143 [installed Top500-ranked systems] from 169 six months ago. The number of systems installed in China increased to a new record high of 202, compared to 160 on the last list. China now clearly shows a substantially larger number of installations than the USA. China now is also pulling ahead of the USA in the performance category, with China holding 35.4% of the overall installed performance, while the USA is second, with 29.6%.”
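  For readers who want to check the arithmetic behind those counts, the short Python sketch below works out the six-month change and the share of the list for each country. The system counts come from the Top500 statement quoted above; the total of 500 entries is implied by the list's name, and the function and variable names are illustrative only.

```python
# System counts quoted from the November 2017 Top500 statement.
TOTAL_SYSTEMS = 500  # assumption: the list ranks exactly 500 systems

counts = {
    "China": {"previous": 160, "current": 202},
    "USA": {"previous": 169, "current": 143},
}


def summarize(counts, total=TOTAL_SYSTEMS):
    """Return the change in ranked systems and the share of the list per country."""
    rows = {}
    for country, c in counts.items():
        rows[country] = {
            "change": c["current"] - c["previous"],
            "share_pct": round(100 * c["current"] / total, 1),
        }
    return rows


summary = summarize(counts)
for country, row in summary.items():
    print(f"{country}: {row['change']:+d} systems, {row['share_pct']}% of the list")
```

  On these figures China gained 42 systems and the USA lost 26, and China's 202 entries amount to roughly 40.4 percent of the list by count. That is a separate measure from the 35.4 percent share of installed performance quoted above.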

  “The high-performance computing landscape is evolving at a furious pace that some are describing as an important inflection point,” Dave Turek, IBM’s vice president for high-performance computing (HPC) and OpenPOWER, wrote in a recent blog. “Realizing that these demands could only be addressed by an open ecosystem, IBM partnered with industry leaders Google, Mellanox, Nvidia, and others to form the OpenPOWER Foundation, dedicated to stewarding the Power CPU architecture into the next generation.”

  IBM’s silicon contribution will be its Power9 processor, housing up to 24 cores with up to 8 billion FinFET transistors cast in 14-nanometer CMOS, 120 megabytes of shared level-three cache, eight-way simultaneous multithreading, and 230 gigabytes/second of bandwidth to memory. Its architecture, to be showcased at Oak Ridge and Lawrence Livermore National Labs, will pack thousands of Nvidia Volta graphics processing units (GPUs) aimed at boosting overall performance beyond that of China’s home-brewed Sunway CPUs.

  IBM is banking mostly on its supercomputer data-centric architecture, which spreads out the processing power by embedding the processors at the locations where the data resides. This approach, according to Turek, yields a speedup of 5 to 10 times for the hardest applications: analytics; modeling; visualization; simulation; and artificial intelligence (AI), especially deep learning.

  To address the specific architectural needs of AI, IBM has redesigned the data flow of its new Power9 processor to dovetail with massive numbers of GPUs and Nervana coprocessors. By scaling TensorFlow and Caffe across 256 Nvidia Tesla GPUs, IBM has been able to reduce deep learning times from 16 days to seven hours. The company aims to balloon this strategy to as many as 100 times more GPUs spread across 50,000 nodes by 2021, thereby achieving exascale computing (a billion billion calculations per second) before China does.
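  The figures in that claim can be put in perspective with a little arithmetic. The Python sketch below computes the speedup implied by going from 16 days to seven hours, and the average rate each of 50,000 nodes would need to sustain for the machine as a whole to reach one exaflop. It assumes "a billion billion calculations per second" means 10^18 operations per second spread evenly across nodes. The article does not say what hardware the 16-day baseline ran on, so the speedup is not a per-GPU scaling efficiency. All names are illustrative.

```python
HOURS_PER_DAY = 24
EXASCALE_OPS_PER_SEC = 10**18  # "a billion billion calculations per second"


def speedup(baseline_days, scaled_hours):
    """Ratio of the baseline training time to the scaled-out training time."""
    return (baseline_days * HOURS_PER_DAY) / scaled_hours


def ops_per_node(total_ops_per_sec, nodes):
    """Average rate each node must sustain, assuming an even split (a simplification)."""
    return total_ops_per_sec / nodes


training_speedup = speedup(baseline_days=16, scaled_hours=7)
per_node = ops_per_node(EXASCALE_OPS_PER_SEC, nodes=50_000)

print(f"Training speedup: about {training_speedup:.1f}x")
print(f"Needed per node: {per_node / 1e12:.0f} trillion operations per second")
```

  Sixteen days is 384 hours, so the quoted reduction is a speedup of roughly 55 times, and an exascale machine built from 50,000 nodes needs each node to average 20 trillion operations per second.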

  “Power9 is loaded with industry-leading new technologies designed for AI to thrive,” IBM Fellow Brad McCredie, vice president of cognitive systems development, wrote in his blog. “With Power9, we’re moving to a new, off-chip era, with advanced accelerators like GPUs and FPGAs [field-programmable gate arrays] driving modern workloads, including AI.”

  McCredie claims that the Power9 will form the basis of a commercial platform with “giant hose” bandwidth to its GPUs using OpenCAPI. The same OpenCAPI hose will also enable coherent FPGAs to obsolete Top500’s parallel version of the Linpack measurements, which center on systems that merely amass millions of CPUs. Instead, true cognitive benchmarks will enable deep learning metrics that show the superiority of the PowerAI platform on distributed deep learning (DDL) benchmarks, IBM says.

  Intel likewise is rejecting the mere amassing of CPUs in its Xeon Phi evolution, according to Trish Damkroger, vice president of Intel’s Data Center Group and general manager of its Technical Computing Initiative. The company is ditching the planned Knights Hill version in favor of “a new platform and new microarchitecture specifically designed for exascale,” Damkroger wrote in her blog. “Combined with our comprehensive HPC solutions portfolio, spanning compute, storage, I/O, and software, the updated roadmap is well poised to energize the exascale revolution.”

  Intel is concentrating on extending its current Scalable System Framework (SSF), combined with new add-on accelerators that outperform a mere matrix of CPUs. It is also diversifying its Select Solutions with on-chip FPGAs and fat-pipe interfaces to 3-D memory cubes. By offering targeted solutions optimized for Big Data, deep learning AI, and other next-generation workloads, the company hopes to obsolete the Top500’s parallel-CPU version of Linpack, as well as make its CORAL (Collaboration of Oak Ridge, Argonne, and Livermore Labs) systems at the U.S. Argonne National Laboratory the first exascale supercomputers worldwide.

  Likewise, AMD recently announced its reentry into the exascale supercomputer race, by virtue of its new Epyc processors, Infinity interconnection fabric, and Radeon Instinct GPUs. The Epyc optimizes floating-point unit performance, as opposed to the wider vector processors of the Intel Xeon. AMD has already announced supercomputer design wins at Hewlett Packard Enterprise; Supermicro; Penguin Computing; Tyan; ASUS; Gigabyte Technology; BOXX; EchoStreams; and Dell, which will add Epyc servers to its PowerEdge line.

  AMD has also collaborated with Inventec to produce the Project 47 supercomputer, which has four times as many Radeon Instinct GPUs as Epyc processors and is due to be delivered in the first quarter. And, as tradition would have it, AMD is pricing its solutions below those of Intel and IBM.

