Hewlett Packard Enterprise (NYSE: HPE) announced that it is removing barriers for enterprises to easily build and train machine learning models at scale, to realize value faster, with the new HPE Machine Learning Development System. The new system, which is purpose-built for AI, is an end-to-end solution that integrates a machine learning software platform, compute, accelerators, and networking to develop and train more accurate AI models faster, and at scale.
The HPE Machine Learning Development System builds on HPE’s strategic investment in acquiring Determined AI to combine its robust machine learning (ML) platform, now formally called the HPE Machine Learning Development Environment, with HPE’s world-leading AI and high performance computing (HPC) offerings. With the new HPE Machine Learning Development System, users can cut the typical time-to-value for building and training machine learning models from weeks or months to days.
“Enterprises seek to incorporate AI and machine learning to differentiate their products and services, but are often confronted with complexity in setting up the infrastructure required to build and train accurate AI models at scale,” said Justin Hotard, executive vice president and general manager, HPC and AI, at HPE. “The HPE Machine Learning Development System combines our proven end-to-end HPC solutions for deep learning with our innovative machine learning software platform into one system, to provide a performant out-of-the-box solution to accelerate time to value and outcomes with AI.”
Removing barriers to realize full potential of AI with complete machine learning solution
Organizations have yet to reach maturity in their AI infrastructure, which, according to IDC, is the most significant and costly investment required for enterprises that want to speed up their experimentation or prototyping phase to develop AI products and services. Typically, adopting AI infrastructure to support model development and training at scale requires a complex, multi-step process involving the purchase, setup, and management of a highly parallel software ecosystem and infrastructure spanning specialized compute, storage, interconnect, and accelerators.
The HPE Machine Learning Development System helps enterprises bypass the high complexity associated with adopting AI infrastructure by offering the only solution that combines software, specialized computing such as accelerators, networking, and services, allowing enterprises to immediately begin efficiently building and training optimized machine learning models at scale.
Gaining accurate models to unlock value faster with the HPE Machine Learning Development System
The system also helps improve model accuracy faster with state-of-the-art distributed training, automated hyperparameter optimization, and neural architecture search, which are key to machine learning algorithms.
Compared to a public cloud provider’s system scaling up to 32 GPUs, the HPE Machine Learning Development System delivers approximately 90% scaling efficiency for workloads such as Natural Language Processing (NLP) and Computer Vision, owing primarily to differences in the communication fabric and their effect on these workloads. Additionally, the system delivers 5.7 times faster throughput than a cloud provider system on 32 GPUs on the NLP workload and is slightly faster on the Computer Vision workload.1
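The announcement does not say how scaling efficiency was measured. As a rough illustration only, the sketch below assumes the common definition: throughput measured on N GPUs divided by N times the single-GPU throughput. The function name and the throughput figures are hypothetical and are not taken from HPE's benchmark.

```python
def scaling_efficiency(throughput_n: float, throughput_1: float, n_gpus: int) -> float:
    """Return the fraction of ideal linear speedup achieved on n_gpus.

    Assumed definition (not stated in the announcement):
        efficiency = throughput_n / (n_gpus * throughput_1)
    """
    if n_gpus <= 0 or throughput_1 <= 0:
        raise ValueError("n_gpus and throughput_1 must be positive")
    return throughput_n / (n_gpus * throughput_1)


# Hypothetical figures, for illustration only (samples per second).
single_gpu_throughput = 100.0
cluster_throughput = 2880.0  # measured on 32 GPUs in this made-up example

efficiency = scaling_efficiency(cluster_throughput, single_gpu_throughput, 32)
print(f"Scaling efficiency on 32 GPUs: {efficiency:.0%}")
```

Under this definition, 100% means throughput grows perfectly linearly with GPU count. Communication overhead between GPUs is one typical reason real systems fall below that.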
Speeding up POC to production with ready-to-use, AI model development and training solution
The HPE Machine Learning Development System is offered as one integrated solution that provides preconfigured, fully installed AI infrastructure for turnkey model development and training at scale. As part of the offering, HPE Pointnext Services will provide onsite installation and software setup, allowing users to immediately implement and train machine learning models for faster and more accurate insights from their data.
The HPE Machine Learning Development System is offered as a small building block, with options to scale up. The small configuration includes the following:
- Innovative machine learning platform with the HPE Machine Learning Development Environment to enable enterprises to rapidly develop, iterate, and scale high-quality models from POC to production
- Optimized AI infrastructure using the HPE Apollo 6500 Gen10 system to provide massive, specialized computing capabilities to train and optimize AI models, starting with eight 80 GB NVIDIA A100 Tensor Core GPUs for accelerated compute
- Fine-grained centralized monitoring and management for optimal performance with HPE Performance Cluster Manager, a system management software solution
- Management stack to control and manage system components using HPE ProLiant DL325 servers and a 1 Gb Ethernet Aruba CX 6300 switch
- High-performance networking for compute and storage communications using NVIDIA InfiniBand HDR switches and HCAs