In Latest Research, HPE Uses Blockchain Infrastructure for Distributed Machine Learning Models
This article is based on the research articles 'HPE MACHINE LEARNING DEVELOPMENT SYSTEM' and 'HPE uses blockchain for distributed machine learning models'. All credit for this research goes to the researchers.
HPE has unveiled two new AI products. One builds and trains large-scale machine learning (ML) models. The other is a decentralized ML system that allows remote or edge facilities to share model updates.
HPE Machine Learning Development System
The HPE Machine Learning Development System (MLDS) is a hardware and software platform. It integrates the HPE Machine Learning Development Environment with HPE compute infrastructure to provide a system that can reduce the typical time to value of building and training machine learning models from weeks to days. It is delivered as an integrated solution with pre-configured infrastructure suitable for building ML models, allowing customers to focus on training models rather than setting up infrastructure. The platform includes GPUs and software tools that help ML engineers scale their workflows automatically.
The base architecture is built on HPE Apollo 6500 Gen10 server nodes with eight 80 GB Nvidia A100 GPUs and Nvidia Quantum InfiniBand networking. The Apollo nodes offer up to 4 TB of RAM and 30 TB of local NVMe scratch storage, with optional HPE Parallel File System Storage. For system control, additional ProLiant DL325 servers act as service nodes and connect to the corporate network through an Aruba CX 6300M switch.
The system starts with four nodes and can be expanded. The software stack runs on Red Hat Enterprise Linux and includes the Machine Learning Development Environment and HPE Performance Cluster Manager for provisioning, managing, and monitoring server nodes.
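As a quick sanity check on the figures above, the following minimal Python sketch totals GPUs and GPU memory for a given node count. It assumes each node carries eight 80 GB GPUs, as the hardware description suggests; the names `NodeSpec` and `cluster_totals` are hypothetical and are not part of any HPE tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures taken from the article (assumed to apply to every node)."""
    gpus_per_node: int = 8    # eight Nvidia A100 GPUs per node
    gpu_memory_gb: int = 80   # 80 GB of memory per GPU


def cluster_totals(nodes: int, spec: NodeSpec = NodeSpec()) -> dict:
    """Return the total GPU count and aggregate GPU memory for `nodes` nodes."""
    gpus = nodes * spec.gpus_per_node
    return {"nodes": nodes, "gpus": gpus, "gpu_memory_gb": gpus * spec.gpu_memory_gb}


base = cluster_totals(4)
print(base)  # {'nodes': 4, 'gpus': 32, 'gpu_memory_gb': 2560}
```

Under these assumptions, the four-node starting configuration works out to 32 GPUs, which matches the GPU count quoted for HPE's internal benchmark.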
Internal testing using client workloads found that HPE MLDS with 32 GPUs is up to 5.7x faster at natural language processing than a comparable platform with the same GPUs that did not have the optimized interconnect provided by HPE.
It is now available for purchase worldwide.
HPE Swarm Learning
HPE's other AI announcement is HPE Swarm Learning, a decentralized machine learning framework for edge or dispersed sites developed by Hewlett Packard Labs.
Swarm Learning does not send data back to a centralized location such as a data center, where a master ML model would be updated and the changes distributed. Instead, a group of distributed nodes shares any updated parameters that each node's ML model has learned while running.
The centralized approach can be inefficient and costly if large amounts of data need to be transferred back to a central site. It may also violate privacy and data ownership laws that limit data sharing. In some cases, moving data from the edge to the core has compliance and GDPR implications, so moving everything to a central location is not straightforward.
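To make the idea of sharing parameters instead of data concrete, here is a minimal Python sketch in which each site contributes only its model parameters and the peers combine them by weighted averaging. This is plain parameter averaging used for illustration; it is not HPE's actual merge algorithm or API, and the names `Params` and `merge_parameters` are hypothetical.

```python
from typing import Dict, List

# A model's parameters as named lists of floats (hypothetical representation).
Params = Dict[str, List[float]]


def merge_parameters(peer_params: List[Params], weights: List[float]) -> Params:
    """Weighted average of the peers' parameters.

    Only parameters cross site boundaries; the training data never leaves
    the site that produced them.
    """
    if not peer_params or len(peer_params) != len(weights):
        raise ValueError("need one weight per peer")
    total = sum(weights)
    merged: Params = {}
    for name, values in peer_params[0].items():
        merged[name] = [
            sum(w * p[name][i] for p, w in zip(peer_params, weights)) / total
            for i in range(len(values))
        ]
    return merged


# Two sites that trained the same model shape on their own private data.
site_a = {"w": [1.0, 2.0], "b": [0.0]}
site_b = {"w": [3.0, 4.0], "b": [1.0]}

merged = merge_parameters([site_a, site_b], weights=[1.0, 1.0])
print(merged)  # {'w': [2.0, 3.0], 'b': [0.5]}
```

The weights could, for example, reflect how many samples each site trained on, so that larger sites influence the merged model more. That choice is an assumption of this sketch rather than something the article specifies.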
HPE Swarm Learning, on the other hand, allows models to be trained locally, with what those models learn, rather than the data itself, being shared between nodes. This involves establishing a peer-to-peer network between the nodes and ensuring that model parameters can be transferred securely. Security is achieved through blockchain technology, commonly used in cryptocurrency systems to ensure that transactions cannot be tampered with, or that any tampering is immediately apparent.
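The tamper-evidence property mentioned above can be illustrated with a simple hash chain, the basic building block of a blockchain. The sketch below is a generic illustration using Python's standard `hashlib` and `json` modules, not HPE's implementation. The record fields (`node`, `round`, `params_digest`) and the function names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, prev_hash: str, payload: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block's contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict], payload: Dict) -> None:
    """Append a record that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({"index": index, "prev": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash; any edited block breaks the chain."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Each record stores a digest of a parameter update, not the update itself.
chain: List[Dict] = []
for node, blob in [("site-a", b"weights-a"), ("site-b", b"weights-b")]:
    append_block(chain, {"node": node, "round": 1,
                         "params_digest": hashlib.sha256(blob).hexdigest()})

print(verify_chain(chain))          # True
chain[0]["payload"]["round"] = 99   # simulate tampering with an old record
print(verify_chain(chain))          # False
```

Because each block's hash covers the previous block's hash, altering any earlier record changes every hash that follows, so peers that recompute the chain notice the change immediately. A real blockchain adds a consensus mechanism among peers on top of this structure, which is beyond the scope of this sketch.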
There are a variety of business scenarios where an ML model is deployed across multiple sites and a simple way to keep all of the models up to date and in sync would be beneficial. One such use is fraud detection in financial services, where HPE Swarm Learning, in conjunction with the company's data analytics platform, detects unusual behavior in credit card transactions. Together, the two technologies can improve accuracy when training machine learning models on massive amounts of financial data from bank branches spread across a large area.
Manufacturing is another common edge use case, where predictive maintenance using machine learning can prevent unexpected machine downtime. Swarm Learning can increase accuracy by combining what models learn from sensor data at different production sites.
Swarm Learning is delivered as a containerized Swarm Learning library that can run on Docker in virtual machines and is hardware independent. The platform is now available in most countries.