Deliver GPU-Accelerated AI and Machine Learning Workloads on Demand
Artificial Intelligence and Machine Learning products and solutions are quickly becoming commonplace and are shaping our computing experiences like no other time in history. Interactive speech assistants (e.g., Alexa, Google Home), visual search, and recommendation engines are just a few of the consumer applications available today on our phones, websites, and e-commerce platforms. The impact of machine learning is broadening with enterprise applications in health sciences (e.g., IBM Watson), finance, security, data centers, and cyber surveillance. General-purpose CPUs cannot deliver the user responsiveness and inference latency required by complex deep learning and AI workloads. That is because, unlike GPUs built for this purpose, general-purpose CPUs are not designed to rapidly perform parallel operations on large amounts of data, such as multiplying matrices with tens or hundreds of thousands of elements. Running large data sets through the same algorithm, both for training and for inference, is a very common operation in machine learning and deep learning applications.
ZeroStack’s GPU-as-a-Service capability automatically detects GPUs and makes them available in the ZeroStack environment. To maximize utilization of this powerful resource, cloud admins can configure, scale, and apply fine-grained access control to GPU resources for end users. Users can enable GPU acceleration, deploy new machine learning and deep learning workloads with tools such as TensorFlow and Caffe, and give their apps dedicated access to multiple GPU resources for order-of-magnitude improvements in inference latency and user responsiveness.
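As an illustration, the snippet below is a minimal sketch of how an end user might confirm GPU acceleration from inside a ZeroStack-provisioned instance. It assumes TensorFlow 1.x with GPU support is already installed in the guest; the framework version, matrix sizes, and device names are assumptions for the example, not part of the ZeroStack product itself.

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# List the GPU devices visible to TensorFlow inside the guest instance.
gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
print("GPUs visible to TensorFlow:", gpus)

# Pin a large matrix multiplication to the first GPU; log_device_placement
# prints which device each op actually ran on, confirming GPU acceleration.
with tf.device("/gpu:0"):
    a = tf.random_normal([4096, 4096])
    b = tf.random_normal([4096, 4096])
    c = tf.matmul(a, b)

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    sess.run(c)
```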
ZeroStack recommends the following hardware and software specs for servers hosting the GPU cards:
- CPU: Intel Xeon E5-2630 v4 10-core processor with virtualization (VT-x) and IOMMU (VT-d) support, or a similar AMD CPU with AMD-V and AMD-Vi support
- RAM: Minimum 80 GB DDR4 2133 MHz (128 GB recommended for deep learning apps)
- Motherboard: PCIe 3.0 compliant motherboard; check for GPU compatibility and BIOS options
- Storage: At least 2 TB HDD (7200 RPM) + 1 TB SSD
- OS (Windows): Windows Server 2012 / 2012 R2
- OS (Linux): RHEL, CentOS, or Ubuntu 16.04
- NVIDIA CUDA drivers and libraries
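Once a server meeting these specs is provisioned, a quick way to confirm that the CUDA driver libraries are installed and that the GPUs are visible is a sketch like the one below, which queries the CUDA driver API through Python's ctypes. The library name and the assumption that libcuda.so.1 is on the loader path describe a typical Linux guest image, not ZeroStack-specific tooling.

```python
import ctypes

# Load the CUDA driver API library installed alongside the NVIDIA driver.
cuda = ctypes.CDLL("libcuda.so.1")

# A CUresult return code of 0 (CUDA_SUCCESS) means the call succeeded.
if cuda.cuInit(0) != 0:
    raise RuntimeError("cuInit failed -- is the NVIDIA driver loaded in this guest?")

count = ctypes.c_int()
if cuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
    raise RuntimeError("cuDeviceGetCount failed")

print("CUDA-capable GPUs visible:", count.value)
```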