#mlperf
Link
NVIDIA's MLPerf Training V4.0 is out. It is mostly NVIDIA H100 and H200 so if you are looking to com...
Text
Micron 6500 ION SSD: Training AI with 256 Accelerators
Micron 6500 ION SSD
Micron recently released MLPerf Storage v0.5 results for the Micron 9400 NVMe SSD. Those results show how well this high-performance NVMe SSD serves as a local cache in an AI server. Most AI training data, however, lives on shared storage rather than in a local cache. For SC23, the same MLPerf Storage AI workload was therefore tested on a WEKA storage cluster powered by 30TB Micron 6500 ION SSDs.
The goal was to learn how the MLPerf Storage AI workload scales on a high-performance software-defined storage (SDS) solution. WEKA is a distributed, parallel filesystem built for AI workloads. The results are insightful: they point to the need for huge throughput in future AI storage systems and help with sizing recommendations for current-generation AI systems.
Let’s quickly review MLPerf Storage first
MLCommons develops and maintains six benchmark suites, along with open datasets, to support the development of future state-of-the-art models. The MLPerf Storage Benchmark Suite is the most recent addition to the MLCommons benchmark collection.
MLPerf Storage aims to tackle several issues related to the storage workload of AI training systems, including the limited size of available datasets and the high expense of AI accelerators.
See these earlier blog entries for a detailed analysis of the benchmark and the workload produced by MLPerf Storage:
Regarded as the best PCIe Gen4 SSD for AI storage, the Micron 9400 NVMe SSD
MLPerf Storage on the Micron 9400 NVMe SSD: storage for AI training
Let’s now discuss the test WEKA cluster
Earlier this year, their colleague Sujit wrote a post outlining the cluster’s performance in synthetic workloads.
The cluster comprises six storage nodes, each configured as follows:
Supermicro AS-1115CS-TNR
Single-socket AMD EPYC 9554P processor
64 cores, 3.1 GHz base, 3.75 GHz boost
384 GB Micron DDR5 DRAM
10 Micron 6500 ION SSDs, 30TB each
400 GbE networking
In total, this cluster provides 838TB of capacity and can reach 200 GB/s for workloads with a high queue depth.
Let’s now take a closer look at this cluster’s MLPerf Storage performance
A brief note: these results are unvalidated because they have not been submitted to MLPerf Storage for review. The MLPerf Storage benchmark is also changing between v0.5 and the next version, planned for the first 2024 release. The values shown here use the same methodology as the v0.5 release: a shared barrier between the accelerators in a client, and an independent dataset for each client.
In v0.5, the MLPerf Storage benchmark emulates NVIDIA V100 accelerators. The NVIDIA DGX-2 server has 16 V100 accelerators, so this testing reports the number of clients the WEKA cluster supports, with each client emulating 16 V100 accelerators, like an NVIDIA DGX-2.
MLPerf Storage v0.5 also implements two models, Unet3D and BERT. Testing shows that BERT does not generate significant storage traffic, so the testing here concentrates on Unet3D.
Micron 6500 ION SSD specs
This graphic shows the overall throughput to the storage system for a given number of client nodes. Recall that each node has 16 emulated accelerators. For a given number of nodes and accelerators to count as a “success,” accelerator utilization must stay above 90%. Below 90%, the accelerators are spending too much time idle, waiting for data.
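The pass criterion described above can be sketched as a simple check. This is a minimal illustration, assuming accelerator utilization is the fraction of the run that the emulated accelerators spend computing rather than waiting on data; the function names and sample timings are hypothetical and are not taken from the benchmark's own code.

```python
AU_THRESHOLD = 0.90  # pass mark stated in the text


def accelerator_utilization(compute_seconds: float, total_seconds: float) -> float:
    """Fraction of the run spent computing (assumed definition)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return compute_seconds / total_seconds


def passes(compute_seconds: float, total_seconds: float) -> bool:
    """True when utilization is above the 90% threshold."""
    return accelerator_utilization(compute_seconds, total_seconds) > AU_THRESHOLD


# Hypothetical timings: 95 s of compute in a 100 s run, then 85 s in 100 s.
print(passes(95.0, 100.0))
print(passes(85.0, 100.0))
```

Adding clients raises the demand on storage; once the storage system can no longer keep up, the time spent waiting grows and the run drops below the threshold.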
The six-node WEKA storage cluster supports 16 clients, each emulating 16 accelerators, for a total of 256 emulated accelerators and a throughput of 91 GB/s.
At 16 V100 GPUs per system, this is equivalent to 16 NVIDIA DGX-2 systems, a remarkably large number of AI systems to be served by a six-node WEKA cluster.
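For sizing purposes, the figures above can be broken down per accelerator, per client, and per storage node. The sketch below only divides the numbers quoted in the text and assumes the load is spread evenly, which the source does not state.

```python
CLIENTS = 16                   # client nodes the cluster sustained
ACCELERATORS_PER_CLIENT = 16   # emulated V100s per client
STORAGE_NODES = 6              # WEKA storage nodes
TOTAL_THROUGHPUT_GBS = 91.0    # GB/s reported for the whole cluster

total_accelerators = CLIENTS * ACCELERATORS_PER_CLIENT

# Even-split assumption: every accelerator, client and node carries an equal share.
per_accelerator = TOTAL_THROUGHPUT_GBS / total_accelerators
per_client = TOTAL_THROUGHPUT_GBS / CLIENTS
per_storage_node = TOTAL_THROUGHPUT_GBS / STORAGE_NODES

print(f"{total_accelerators} emulated accelerators")
print(f"{per_accelerator:.3f} GB/s per emulated V100")
print(f"{per_client:.2f} GB/s per client")
print(f"{per_storage_node:.2f} GB/s per storage node")
```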
The V100 is a PCIe Gen3 GPU, and NVIDIA’s GPU generations are advancing far faster than PCIe and platform generations. In a single-node system, testing found that an emulated NVIDIA A100 GPU runs this workload four times faster.
From this, one can estimate that the same WEKA deployment would support eight DGX A100 systems (each with eight A100 GPUs) at the same maximum throughput of 91 GB/s.
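The estimate above follows from simple arithmetic. The sketch assumes that an accelerator running the workload four times faster also consumes data four times faster, so the same 91 GB/s feeds a quarter as many GPUs; that linear relationship is an assumption used for illustration.

```python
V100_SUPPORTED = 256          # emulated V100s sustained at 91 GB/s
A100_SPEEDUP = 4              # A100 vs. V100 on this workload, per the text
GPUS_PER_DGX_A100 = 8
TOTAL_THROUGHPUT_GBS = 91.0

# Assumption: data demand scales linearly with accelerator speed.
a100_supported = V100_SUPPORTED // A100_SPEEDUP
dgx_a100_systems = a100_supported // GPUS_PER_DGX_A100
per_a100 = TOTAL_THROUGHPUT_GBS / a100_supported

print(f"{a100_supported} A100 GPUs, i.e. {dgx_a100_systems} DGX A100 systems")
print(f"{per_a100:.2f} GB/s per A100")
```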
Future AI training servers, such as those based on the H100 / H200 (PCIe Gen5) and X100 (PCIe Gen6), are expected to demand extremely high throughput.
As of right now, the Micron 6500 ION SSD and WEKA storage offer the ideal balance of scalability, performance, and capacity for your AI workloads.
Read more on Govindhtech.com
Text
"Peak Training: Blackwell Pushes Past New Limits in MLPerf Training Performance"
Generative artificial intelligence (AI) is becoming ever more advanced, and its applications span not only text but also computer code, protein chains, summaries, video, and even 3D graphics. Efficiently training the large language models (LLMs) at the heart of these technologies requires accelerated computing at data center scale. It is in precisely this area that NVIDIA's platforms,…
Photo
NVIDIA Blackwell Revolutionizes AI Training with Doubling Performance in MLPerf v4.1
Text
Nvidia's MLPerf submission shows B200 offers up to 2.2x training performance of H100
http://securitytc.com/TG92RS
Text
Industry trend|Snapdragon 8 Extreme Edition is launched: computing power reaches 80TOPS! Edge-side AI sets the mobile phone industry on fire
On October 22, Beijing time, Qualcomm released its new generation flagship mobile phone processor, Snapdragon 8 Elite (hereinafter referred to as Snapdragon 8 Extreme Edition) at the Snapdragon Summit. This processor is said to bring laptop-level functions to smartphones. It is Qualcomm's most powerful and fastest mobile system-on-chip to date.
Qualcomm said that the platform uses the second-generation customized Qualcomm Oryon CPU, Qualcomm Adreno GPU and enhanced Qualcomm Hexagon NPU for the first time to achieve multi-modal generative AI applications on the terminal side. These technologies also enable many other experiences, including imaging functions using the company's powerful AI-ISP, next-generation gaming experience and ultra-fast web browsing.
At the summit, OpenAI CEO Sam Altman, Microsoft CEO Satya Nadella, and Meta CEO Mark Zuckerberg also expressed their support online.
1. Self-developed Oryon CPU, the strongest performance of current smartphones
Public information shows that the Snapdragon 8 Extreme Edition is manufactured using TSMC's 3nm process. Compared with the Snapdragon 8 Gen3, the CPU performance is improved by 45%, the energy efficiency is improved by 44%, and the CPU has a maximum frequency of 4.32GHz, exceeding the MediaTek Dimensity 9400 and Apple A18 Pro.
The Snapdragon 8 Extreme Edition is equipped with a new Oryon CPU. As Qualcomm's "world's fastest mobile CPU", the Oryon CPU does not directly adopt the architecture of computer chips, but consists of two "super cores" with a main frequency of 4.32GHz and six "performance cores" with an operating frequency of up to 3.53GHz, integrating the X80 5G modem RF system.
https://s1.iotexpo.com.cn/ue/24/23/6386529120819322604470021.png
The new Adreno GPU design has also changed significantly, with dedicated memory allocated to each slice, achieving a 40% increase in single-core performance, a 42% increase in multi-core performance, and a 40% reduction in power consumption. It also improves ray tracing performance by 35%.
In addition, Qualcomm has reserved 12MB of memory for the GPU, which reduces accesses to system memory during data transfers and can also reduce power consumption, latency, and system memory usage.
In terms of network connection performance, the Snapdragon 8 Extreme Edition is equipped with a Snapdragon X80 5G modem, with a peak download speed of up to 10 Gbps and a theoretical maximum upload speed of up to 3.5Gbps. This is a groundbreaking modem that supports 6Rx for smartphones for the first time and is also equipped with a dedicated AI tensor accelerator, which can optimize data transmission speed and reduce latency through AI.
2. Cooperation with Zhipu and Tencent Hunyuan on end-side AI
Qualcomm has been deeply involved in end-side AI for many years. The Qualcomm AI engine based on heterogeneous architecture has successfully implemented large models, AIGC and other applications on the end-side with its performance and flexibility.
The Snapdragon 8 Extreme Edition is equipped with Qualcomm's fastest Hexagon NPU to date, offering 80 TOPS of AI computing power, a 46% performance improvement, and a 45% energy efficiency improvement. Together with comprehensive improvements to the CPU, GPU, and memory, this lifts the platform's overall AI performance by 45%.
In the MLPerf benchmarks, the improvement over the third-generation Snapdragon 8 reached 104% (EDSR super-resolution).
Qualcomm is clearly expanding the capabilities of its AI engine to support multimodal generation tasks. Both the larger ("big cup") and smaller ("small cup") multimodal models can run smoothly on its SLM at up to 70 tokens per second.
In terms of imaging, the Snapdragon 8 Extreme Edition is equipped with an enhanced image signal processor (ISP) that is more deeply integrated with the new Hexagon NPU. This means photography gains more AI capabilities: HDR effects are taken to a higher level, skin tones and sky colors look more natural, and autofocus improves. In addition, Qualcomm has introduced chip-level semantic segmentation for photos and videos, as well as the ability to remove objects from videos.
At the Snapdragon Summit, Qualcomm Technologies also announced cooperation with Zhipu and Tencent Hunyuan.
Qualcomm and Zhipu cooperated on the GLM-4V end-side vision model, which was deeply adapted and optimized for inference on the Snapdragon 8 Extreme Edition, supporting a rich multi-modal interaction method. Leveraging the powerful end-side AI performance of the Snapdragon 8 Extreme Edition and the performance optimization brought by the Qualcomm AI software stack to the model, the GLM-4V end-side vision model can run at a high speed of more than 70 tokens/second on the end-side.
The GLM-4V-Mini, GLM-4V-Nano end-side vision model and the GLM-4-9B model will soon be launched on the Qualcomm AI Hub, and commercial mobile phones equipped with the Snapdragon 8 Extreme Edition can support them.
Qualcomm and Tencent Hunyuan have cooperated to implement the terminal-side deployment of the 7B and 3B versions of the Tencent Hunyuan model based on the Snapdragon 8 Extreme Edition mobile platform, further expanding the application and popularization of generative AI technology on the end-side.
In terms of game performance, Qualcomm and NetEase cooperated to create an innovative "Naraka: Bladepoint" mobile game experience based on Qualcomm Snapdragon 8 Extreme Edition chip, using a series of Snapdragon Elite Gaming features, and bringing a new upgraded AI teammate function on the terminal side.
3. Mobile phone manufacturers release new phones in succession
Qualcomm revealed that leading OEM manufacturers and smartphone brands such as ASUS, Honor, iQOO, Motorola, Nubia, OnePlus, OPPO, Red Magic, Redmi, realme, Samsung, vivo, Xiaomi and ZTE will release terminal devices equipped with Snapdragon 8 Extreme Edition in the next few weeks.
In addition, Xiaomi Group Senior Vice President Zeng Xuezhong, Honor CMO Guo Rui, and Samsung Mobile Experience President Lu Taiwen attended the press conference in person to support Qualcomm.
At the Snapdragon Summit, Zeng Xuezhong, Xiaomi Group Senior Vice President and President of the International Department, announced that the Xiaomi 15 series will be the world's first phone line to launch with the Snapdragon 8 Extreme Edition, with the release set for the end of this month. The Xiaomi 15 series with the Snapdragon 8 Extreme Edition reportedly achieves a 29.7% reduction in power consumption and a 3°C drop in temperature at full frame rate.
Honor CMO Guo Rui showed the real photos of the Magic7 series. The back of the device adopts an octagonal camera Deco design, the middle frame is a metal right-angle edge, and the front is a centered dual-hole screen that supports 3D face recognition + ultrasonic screen fingerprint recognition.
Device manufacturers also sent congratulatory messages and shared news of upcoming phones equipped with the Snapdragon 8 Extreme Edition:
ASUS Co-CEO Xu Xianyue:
"ROG 9 is equipped with the excellent performance of Snapdragon 8® Extreme Edition mobile platform, integrating innovative terminal-side generative AI and gaming capabilities, bringing changes to consumer experience."
The ROG 9 series includes two models, ROG 9 and ROG 9 Pro. It is reported that ASUS will launch ROG 9 in November 2024 and release the Pro model in the first quarter of 2025.
Zhao Ming, CEO of Honor Terminal Co., Ltd.:
"Our upcoming flagship product, the Honor Magic 7 series, is equipped with Qualcomm Technologies' most advanced Snapdragon 8 Extreme Edition mobile platform. We are very excited to launch the industry's first AI Agent for an open ecosystem, and for the first time introduce the on-device generative AI driven by NPU computing to portraits and games."
Honor will hold a press conference for the Honor Magic 7 and Magic 7 Pro on October 30, 2024. This series of mobile phones will be the first to be equipped with the new Honor MagicOS 9.0 operating system.
Duan Yaohui, senior vice president of OPPO, said:
"OPPO has maintained a close cooperation with Qualcomm Technologies for many years, bringing users many innovative experiences including on-device generative AI. We are very much looking forward to the release of OnePlus 13, which will not only be equipped with the Snapdragon 8 Extreme Edition mobile platform, but also the flagship masterpiece that will open a new decade for the OnePlus brand."
The release of the OnePlus 13 mobile phone is scheduled for 4 pm on October 31. This phone is called OnePlus's "first flagship of the new decade" and is nicknamed "Thirteen Spices" internally.
Xu Qi, Vice President of realme and President of China:
"The release of the Snapdragon 8® Extreme Edition mobile platform has once again refreshed the performance boundaries of mobile phone products, injecting unprecedented power into realme's latest flagship products. I believe that the outstanding technical upgrades of the Snapdragon 8® Extreme Edition mobile platform will bring users amazing performance and enable a rich experience across imaging functions and terminal-side generative AI."
realme will release the GT7 Pro this month, with an AnTuTu score of 3.02 million. The GT7 Pro is equipped with a high-specification micro-quad-curved screen of about 6.8 inches, and the display effect is excellent.
Yu Hang, co-founder of Nubia:
"As the pioneer of the e-sports gaming phone category, Red Magic has always been committed to breaking through the performance limits and will take it as its responsibility to create professional-quality e-sports equipment for users. The new Red Magic 10 series will be released soon. The series will be equipped with the Snapdragon 8® Extreme Edition mobile platform and will officially meet with everyone in November. We firmly believe that with the excellent performance of the Snapdragon 8® Extreme Edition, the Red Magic 10 series will bring consumers an unprecedented gaming experience."
Shi Yujian, senior vice president and chief technology officer of vivo:
"With the new slicing architecture and other rich features from Qualcomm Adreno GPU, iQOO 13 will bring consumers an excellent new gaming experience and stunning visual effects. We will work together to promote the e-sports experience into a new era."
iQOO 13 will be the first to be equipped with Qualcomm Snapdragon 8 Extreme Edition processor, and equipped with a 2K ultra-narrow edge straight screen with a refresh rate of 144Hz. It is estimated to be released in China in October and land in the Indian market on December 3.
Luo Wei, Vice President of ZTE and Director of Product Center of ZTE Terminal Business Unit:
"We are pleased to announce that the nubia Z series flagship phone based on the Snapdragon 8 Extreme Edition mobile platform will be launched soon. This new phone not only continues nubia's professional imaging genes, but also upgrades performance, design and system experience."
Lu Weibing, Partner, Group President, and President of Mobile Phone Department of Xiaomi Group:
"The Xiaomi 15 series will soon launch the flagship "core king" of Snapdragon 8 Extreme Edition, bringing amazing performance, excellent energy efficiency and terminal-side multi-modal generative AI application support, opening a new era of terminal-side generative AI." It is reported that Xiaomi will launch the Xiaomi 15 series on October 28.
At this Snapdragon Summit, Qualcomm CEO Cristiano Amon devoted much of his speech to explaining Qualcomm's view of AI trends. In his view, AI capabilities will become the most important user experience on future mobile phones. Users will gradually move away from traditional apps toward AI applications, and the status of traditional killer applications will be threatened.
"Qualcomm is transforming into an interconnected computing company dominated by AI processors." With the Snapdragon 8 Extreme Edition, the stage is set, and the war for AI mobile phones has officially begun.
This article is from Ulink Media, Shenzhen, China, the organizer of IOTE EXPO (IoT Expo in China)
Text
The First AI Benchmarks Pitting AMD Against Nvidia
Rated horsepower for a compute engine is an interesting intellectual exercise, but it is where the rubber hits the road that really matters. We finally have the first benchmarks from MLCommons, the vendor-led testing organization that has put together the suite of MLPerf AI training and inference benchmarks, that pit the AMD Instinct “Antares” MI300X GPU against Nvidia’s “Hopper” H100 and H200 and the “Blackwell” B200 GPUs.
Text
NVIDIA Blackwell sets a new standard for generative AI with MLPerf
Blackwell sets a new standard for generative artificial intelligence from @nvidiadc with its #MLPerf inference debut.
As companies rush to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models (LLMs) is one challenge, but delivering real-time LLM-based services is another. In the latest round of MLPerf testing, Inference v4.1, NVIDIA platforms delivered…
Text
NVIDIA Blackwell B200 quadruples H100 performance!
NVIDIA has published the first MLPerf 4.1 results for its Blackwell B200 processor. The results show that the Blackwell GPU delivers four times the performance of the H100, which is based on the Hopper architecture. However, there are some important points to keep in mind when evaluating these results. Here are the details…
NVIDIA Blackwell B200 is 4 times faster than H100
According to NVIDIA's results, Blackwell…
Photo
MLCommons Releases Latest MLPerf Tiny Benchmark Results for On-Device TinyML
Text
MLPerf Mettle Tested: NVIDIA’s Speed Revolution
Passing the Exam: NVIDIA Speeds up MLPerf Standards Generative AI Training
NVIDIA H100 Tensor Core GPUs broke previous records in the most recent industry-standard tests thanks to their unparalleled scaling and software advancements.
The most recent MLPerf industry benchmarks demonstrate how NVIDIA’s AI technology has elevated the standard for high speed computing and AI training.
One record and milestone in generative AI stands out among the many: NVIDIA Eos, an AI supercomputer powered by 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking, completed a training benchmark based on a GPT-3 model with 175 billion parameters, trained on one billion tokens, in just 3.9 minutes.
That is nearly a three-fold speedup over 10.9 minutes, the record NVIDIA set when the test was introduced less than six months earlier.
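The size of that gain can be checked directly from the two training times. The sketch uses only the figures quoted above; the variable names are illustrative.

```python
previous_minutes = 10.9  # record when the test was first offered
current_minutes = 3.9    # Eos result in this round

speedup = previous_minutes / current_minutes
time_saved = 1 - current_minutes / previous_minutes

print(f"speedup: {speedup:.2f}x")
print(f"training time reduced by {time_saved:.0%}")
```

The ratio comes out a little under 3, which is why the gain is described as roughly three-fold.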
The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service. By extrapolation, Eos could now train the full model in just eight days, 73 times faster than a previous state-of-the-art system that used 512 A100 GPUs.
Faster training speeds up time-to-market, saves energy, and lowers costs. This heavy lifting is what makes large language models widely available, so any organization can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs.
1,024 NVIDIA Hopper architecture GPUs established a record in a new generative AI test this round, finishing a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes.
Given that generative AI is the most revolutionary technology of our time, MLPerf further solidifies its position as the industry standard for evaluating AI performance by incorporating these two tests.
System Sizing Takes Off
The latest results were due in part to the use of the largest number of accelerators ever applied to an MLPerf benchmark. The 10,752 H100 GPUs far surpassed the 3,584 Hopper GPUs NVIDIA used for AI training in June.
Thanks in part to software improvements, the 3x scaling in GPU count delivered a 2.8x scaling in performance, a 93% efficiency rate.
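The efficiency figure is the ratio of the performance gain to the increase in GPU count. A minimal sketch, using the GPU counts and the 2.8x figure from the text (the function name is hypothetical):

```python
def scaling_efficiency(gpus_before: int, gpus_after: int, speedup: float) -> float:
    """Observed speedup divided by the ideal (linear) speedup."""
    ideal_speedup = gpus_after / gpus_before
    return speedup / ideal_speedup


efficiency = scaling_efficiency(gpus_before=3584, gpus_after=10752, speedup=2.8)
print(f"GPU count ratio: {10752 / 3584:.1f}x")
print(f"scaling efficiency: {efficiency:.0%}")
```

Perfectly linear scaling would give 100%; anything lost to communication and synchronization overhead shows up as a lower value.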
Given that LLMs are expanding by an order of magnitude annually, efficient scalability is a fundamental prerequisite for generative AI. The most recent outcomes demonstrate NVIDIA’s capacity to handle this extraordinary task for even the biggest data centers on the planet.
The achievement stems from a full-stack platform of innovations in accelerators, systems, and software that both Eos and Microsoft Azure used in the latest round.
Eos and Azure each used 10,752 H100 GPUs in separate submissions. They achieved within 2% of the same performance, demonstrating the efficiency of NVIDIA AI in both data center and public cloud deployments.
Eos is used by NVIDIA for a variety of vital tasks. It contributes to the advancement of projects like NVIDIA DLSS, which is AI-powered software for cutting-edge computer graphics, and NVIDIA Research projects like ChipNeMo, which are generative AI tools for GPU design.
Progress in All Workloads
NVIDIA made advancements in generative AI and achieved numerous new marks in this round.
H100 GPUs, for instance, were 1.6 times faster than in the prior round at training recommender models, which are widely used to help people find what they’re looking for on the internet. On the computer vision model RetinaNet, performance increased by 1.8 times.
Both scalable hardware and software advancements were responsible for these increases.
Once more, NVIDIA was the only company to run every MLPerf test. In all nine benchmarks, H100 GPUs showed the highest scalability and fastest performance.
Accelerations result in reduced expenses, quicker time to market, and energy savings for customers that are training large LLMs or tailoring them with frameworks like NeMo to meet their unique business requirements.
This time, eleven system manufacturers including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Lenovo, QCT, and Supermicro used the NVIDIA AI platform in their submissions.
Partners of NVIDIA take part in MLPerf because they are aware of its value as a tool for clients assessing AI systems and suppliers.
Benchmarks for HPC Expand
The performance of H100 GPUs in MLPerf HPC, a different benchmark for AI-assisted simulations on supercomputers, was up to twice that of NVIDIA A100 Tensor Core GPUs in the previous HPC round. The outcomes demonstrated improvements of up to 16 times since the 2019 MLPerf HPC round one.
The benchmark added a new test that trains OpenFold, a model that predicts a protein’s three-dimensional structure from its amino acid sequence. OpenFold can complete in minutes crucial healthcare work that previously took researchers weeks or months.
Since most drugs act on proteins, the cellular machinery that helps control many biological processes, an understanding of a protein’s structure is essential to quickly discovering effective medications.
H100 GPUs trained OpenFold in 7.5 minutes in the MLPerf HPC test. The OpenFold test represents part of the full AlphaFold training process, which two years ago took 11 days using 128 accelerators.
The OpenFold model and the training software developed by NVIDIA will soon be accessible in NVIDIA BioNeMo, a generative AI drug discovery platform.
In this round, several partners made submissions on the NVIDIA AI platform. They included Dell Technologies and the supercomputing centers at Clemson University, the Texas Advanced Computing Center, and, with support from Hewlett Packard Enterprise (HPE), Lawrence Berkeley National Laboratory.
Benchmarks With Wide Support
Industry and academia have broadly supported the MLPerf benchmarks since their introduction in May 2018. Supporters include Amazon, Arm, Baidu, Google, Harvard, HPE, Intel, Lenovo, Meta, Microsoft, NVIDIA, Stanford University, and the University of Toronto.
Because MLPerf tests are transparent and objective, users can rely on the results to make informed purchasing decisions.
The MLPerf repository hosts all of the tools that NVIDIA used, enabling developers to get identical top-notch outcomes. NGC, NVIDIA’s software hub for GPU applications, hosts containers that are regularly updated with these software enhancements.
Read more on Govindhtech.com
Text
Inflection-2.5: The Powerhouse LLM Rivaling GPT-4 and Gemini
Inflection AI has been making waves in the field of large language models (LLMs) with their recent unveiling of Inflection-2.5, a model that competes with the world’s leading LLMs, including OpenAI’s GPT-4 and Google’s Gemini.
Inflection AI’s rapid rise has been further fueled by a massive $1.3 billion funding round, led by industry giants such as Microsoft, NVIDIA, and renowned investors including Reid Hoffman, Bill Gates, and Eric Schmidt. This significant investment brings the total funding raised by the company to $1.525 billion.
In collaboration with partners CoreWeave and NVIDIA, Inflection AI is building the largest AI cluster in the world, comprising an unprecedented 22,000 NVIDIA H100 Tensor Core GPUs. This colossal computing power will support the training and deployment of a new generation of large-scale AI models, enabling Inflection AI to push the boundaries of what is possible in the field of personal AI.
The company’s groundbreaking work has already yielded remarkable results, with the Inflection AI cluster, currently comprising over 3,500 NVIDIA H100 Tensor Core GPUs, delivering state-of-the-art performance on the open-source benchmark MLPerf. In a joint submission with CoreWeave and NVIDIA, the cluster completed the reference training task for large language models in just 11 minutes, solidifying its position as the fastest cluster on this benchmark.
This achievement follows the unveiling of Inflection-1, Inflection AI’s in-house large language model (LLM), which has been hailed as the best model in its compute class. Outperforming well-known models such as GPT-3.5, LLaMA, Chinchilla, and PaLM-540B on a wide range of benchmarks commonly used for comparing LLMs, Inflection-1 enables users to interact with Pi, Inflection AI’s personal AI, in a simple and natural way, receiving fast, relevant, and helpful information and advice.
Inflection AI’s commitment to transparency and reproducibility is evident in the release of a technical memo detailing the evaluation and performance of Inflection-1 on various benchmarks. The memo reveals that Inflection-1 outperforms models in the same compute class, defined as models trained using at most the FLOPs (floating-point operations) of PaLM-540B.
The success of Inflection-1 and the rapid scaling of the company’s computing infrastructure, fueled by the substantial funding round, highlight Inflection AI’s unwavering dedication to delivering on its mission of creating a personal AI for everyone. With the integration of Inflection-1 into Pi, users can now experience the power of a personal AI, benefiting from its empathetic personality, usefulness, and safety standards.
Inflection-2.5
Inflection-2.5 is now available to all users of Pi, Inflection AI’s personal AI assistant, across multiple platforms, including the web (pi.ai), iOS, Android, and a new desktop app. This integration marks a significant milestone in Inflection AI’s mission to create a personal AI for everyone, combining raw capability with their signature empathetic personality and safety standards.
A Leap in Performance
Inflection AI’s previous model, Inflection-1, utilized approximately 4% of the training FLOPs (floating-point operations) of GPT-4 and exhibited an average performance of around 72% compared to GPT-4 across various IQ-oriented tasks. With Inflection-2.5, Inflection AI has achieved a substantial boost in Pi’s intellectual capabilities, with a focus on coding and mathematics.
The model’s performance on key industry benchmarks demonstrates its prowess, showcasing over 94% of GPT-4’s average performance across various tasks, with a particular emphasis on excelling in STEM areas. This remarkable achievement is a testament to Inflection AI’s commitment to pushing the technological frontier while maintaining an unwavering focus on user experience and safety.
Coding and Mathematics Prowess
Inflection-2.5 shines in coding and mathematics, demonstrating over a 10% improvement over Inflection-1 on BIG-Bench-Hard, a subset of challenging problems for large language models. Two coding benchmarks, MBPP+ and HumanEval+, reveal massive improvements over Inflection-1, solidifying Inflection-2.5’s position as a force to be reckoned with in the coding domain.
On the MBPP+ benchmark, Inflection-2.5 outperforms its predecessor by a significant margin, exhibiting a performance level comparable to that of GPT-4, as reported by DeepSeek Coder. Similarly, on the HumanEval+ benchmark, Inflection-2.5 demonstrates remarkable progress, surpassing the performance of Inflection-1 and approaching the level of GPT-4, as reported on the EvalPlus leaderboard.
Industry Benchmark Dominance
Inflection-2.5 stands out in industry benchmarks, showcasing substantial improvements over Inflection-1 on the MMLU benchmark and the GPQA Diamond benchmark, renowned for its expert-level difficulty. The model’s performance on these benchmarks underscores its ability to handle a wide range of tasks, from high school-level problems to professional-level challenges.
Excelling in STEM Examinations
The model’s prowess extends to STEM examinations, with standout performance on the Hungarian Math exam and Physics GRE. On the Hungarian Math exam, Inflection-2.5 demonstrates its mathematical aptitude by leveraging the provided few-shot prompt and formatting, allowing for ease of reproducibility.
In the Physics GRE, a graduate entrance exam in physics, Inflection-2.5 reaches the 85th percentile of human test-takers in maj@8 (majority vote at 8), solidifying its position as a formidable contender in the realm of physics problem-solving. Furthermore, the model approaches the top score in maj@32, exhibiting its ability to tackle complex physics problems with remarkable accuracy.
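The maj@8 and maj@32 figures refer to majority voting: the model is sampled k times per question and the most frequent answer is the one that gets scored. The sketch below illustrates the idea only. The function names, the tie-breaking rule, and the toy answers are assumptions made for illustration, not Inflection AI’s evaluation code.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def maj_at_k(samples_per_question, references, k):
    """Fraction of questions whose majority answer over the first k samples
    matches the reference answer (hypothetical helper, for illustration)."""
    if len(samples_per_question) != len(references) or not references:
        raise ValueError("need one non-empty sample list per reference")
    correct = 0
    for samples, reference in zip(samples_per_question, references):
        if majority_vote(samples[:k]) == reference:
            correct += 1
    return correct / len(references)


# Toy data: two multiple-choice questions, eight sampled answers each.
toy_samples = [
    ["B", "A", "B", "C", "B", "A", "B", "D"],
    ["A", "C", "C", "A", "C", "C", "D", "C"],
]
toy_references = ["B", "A"]
toy_score = maj_at_k(toy_samples, toy_references, k=8)
```

Raising k generally makes the vote more stable, which is one reason a maj@32 score can differ from a maj@8 score for the same model.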
Enhancing User Experience
Inflection-2.5 not only upholds Pi’s signature personality and safety standards but elevates its status as a versatile and invaluable personal AI across diverse topics. From discussing current events to seeking local recommendations, studying for exams, coding, and even casual conversations, Pi powered by Inflection-2.5 promises an enriched user experience.
With Inflection-2.5’s powerful capabilities, users are engaging with Pi on a broader range of topics than ever before. The model’s ability to handle complex tasks, combined with its empathetic personality and real-time web search capabilities, ensures that users receive high-quality, up-to-date information and guidance.
User Adoption and Engagement
The impact of Inflection-2.5’s integration into Pi is already evident in the user sentiment, engagement, and retention metrics. Inflection AI has witnessed a significant acceleration in organic user growth, with one million daily and six million monthly active users exchanging more than four billion messages with Pi.
On average, conversations with Pi last 33 minutes, with one in ten lasting over an hour each day. Furthermore, approximately 60% of people who interact with Pi in a given week return the following week, showcasing higher monthly stickiness than leading competitors in the field.
Technical Details and Benchmark Transparency
In line with Inflection AI’s commitment to transparency and reproducibility, the company has provided comprehensive technical results and details on the performance of Inflection-2.5 across various industry benchmarks.
For example, on the corrected version of the MT-Bench dataset, which addresses issues with incorrect reference solutions and flawed premises in the original dataset, Inflection-2.5 demonstrates performance in line with expectations based on other benchmarks.
Inflection AI has also evaluated Inflection-2.5 on HellaSwag and ARC-C, common sense and science benchmarks reported by a wide range of models, and the results showcase strong performance on these saturating benchmarks.
It is important to note that while the evaluations provided represent the model powering Pi, the user experience may vary slightly due to factors such as the impact of web retrieval (not used in the benchmarks), the structure of few-shot prompting, and other production-side differences.
Conclusion
Inflection-2.5 represents a significant leap forward in the field of large language models, rivaling the capabilities of industry leaders like GPT-4 and Gemini while utilizing only a fraction of the computing resources. With its impressive performance across a wide range of benchmarks, particularly in STEM areas, coding, and mathematics, Inflection-2.5 has positioned itself as a formidable contender in the AI landscape.
The integration of Inflection-2.5 into Pi, Inflection AI’s personal AI assistant, promises an enriched user experience, combining raw capability with empathetic personality and safety standards. As Inflection AI continues to push the boundaries of what is possible with LLMs, the AI community eagerly anticipates the next wave of innovations and breakthroughs from this trailblazing company.
Inflection AI’s visionary approach extends beyond mere model development, as the company recognizes the importance of pre-training and fine-tuning in creating high-quality, safe, and useful AI experiences. As a vertically integrated AI studio, Inflection AI handles the entire process in-house, from data ingestion and model design to high-performance infrastructure.
Text
MLPerf training tests put Nvidia ahead, Intel close, and Google well behind
https://spectrum.ieee.org/generative-ai-training
Link
Intel, Nvidia and Google have made significant strides in recent months that enable faster LLM training results in the MLPerf Training 3.1 benchmarks. #AI #ML #Automation
Text
MLPerf 3.1 adds large language model benchmarks for inference
September 11, 2023 1:01 PM
MLCommons is expanding its suite of MLPerf AI benchmarks with the addition of testing for large language models (LLMs) for inference and a new benchmark that measures the performance of storage systems for machine learning (ML) workloads. MLCommons is a supplier…