Artificial intelligence and machine learning products and solutions are quickly becoming commonplace and are shaping our computing experiences like never before. Interactive speech assistants (e.g., Alexa, Google Home), visual search, and recommendation engines are just a few of the consumer applications available today on our phones, websites, and e-commerce platforms. The impact of machine learning is broadening with enterprise applications in health sciences (e.g., IBM Watson), finance, security, data centers, and cyber surveillance.
These AI applications and solutions are now more viable than ever thanks to the availability of modern machine learning and deep learning tools such as TensorFlow and Caffe, and to GPUs built specifically to perform parallel operations on large amounts of data, e.g., multiplying matrices with tens or hundreds of thousands of elements. Processing large data sets through the same hypothesized algorithm, both for learning and for inference, is a fairly common operation in machine learning and deep learning applications.
However, one significant challenge remains: deploying, configuring, and running these complex tools while managing their interdependencies, versioning, and compatibility with servers and GPUs. For example, to run TensorFlow, users must ensure they have the correct BIOS version on their server, compatible Windows or Linux drivers, and the CUDA libraries that match the specific GPU and server on which they want to run their AI workload. If any of these are misconfigured or the versions are incompatible, the AI application will not function correctly or will perform very poorly.
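To make the dependency problem concrete, here is a minimal sketch of the kind of driver/CUDA compatibility check that must pass before a workload like TensorFlow will run. The version pairs in the table are illustrative examples, not an official compatibility matrix; consult NVIDIA's release notes for the authoritative minimums.

```python
# Sketch of a driver-vs-CUDA compatibility check.
# The minimum-driver entries below are illustrative, not an official matrix.
MIN_DRIVER_FOR_CUDA = {
    "8.0": (367, 48),
    "9.0": (384, 81),
    "10.0": (410, 48),
}

def driver_supports_cuda(driver_version: str, cuda_version: str) -> bool:
    """Return True if the installed driver meets the minimum for this CUDA release."""
    required = MIN_DRIVER_FOR_CUDA.get(cuda_version)
    if required is None:
        raise ValueError(f"unknown CUDA version: {cuda_version}")
    # Compare dotted versions component-wise, e.g. "384.81" -> (384, 81)
    installed = tuple(int(part) for part in driver_version.split("."))
    return installed >= required

print(driver_supports_cuda("384.81", "9.0"))  # True: exactly the minimum
print(driver_supports_cuda("367.48", "9.0"))  # False: driver too old
```

A platform that automates this check across BIOS, driver, and CUDA versions removes an entire class of hard-to-diagnose failures.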
ZeroStack provides single-click deployment of the TensorFlow deep learning tool set, taking care of all the OS and CUDA library dependencies. Furthermore, users can enable GPU acceleration with dedicated access to multiple GPU resources for order-of-magnitude lower inference latency and improved user responsiveness.
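After such a deployment, a quick sanity check is to confirm that TensorFlow can actually see the GPUs it was given. The snippet below assumes the TensorFlow 2.x API; on a CPU-only or misconfigured host, the device list is simply empty.

```python
# Verify that TensorFlow was built with GPU support and can see a device.
# Assumes TensorFlow 2.x; falls back gracefully if TensorFlow is absent.
try:
    import tensorflow as tf
    gpus = tf.config.list_physical_devices("GPU")
    print(f"GPUs visible to TensorFlow: {len(gpus)}")
except ImportError:
    print("TensorFlow is not installed on this host")
```

An empty list usually points back to the driver/CUDA mismatches described above rather than to the application itself.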
ZeroStack’s GPU capability automatically detects GPUs and makes them available for users to run their AI applications. To maximize utilization of this powerful resource, cloud admins can configure and scale GPU resources and grant end users fine-grained access to them.
ZeroStack recommends the following hardware specs for servers hosting the GPU cards:
- CPU: Intel Xeon E5-2630 v4 (10-core) with virtualization (VT-x) and IOMMU (VT-d) support, or a similar AMD CPU with AMD-V and AMD-Vi support
- RAM: Minimum 80 GB DDR4 2133 MHz (128 GB recommended for deep learning apps)
- Motherboard: PCIe 3.0 compliant motherboard; check for GPU compatibility and BIOS options
- Storage: At least 2 TB HDD (7200 RPM) + 1 TB SSD
- OS: Windows Server 2012 / 2012 R2
- OS: RHEL, CentOS, or Ubuntu 16.04
- CUDA libraries
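The CPU requirement above hinges on two hardware flags: "vmx" (Intel VT-x) or "svm" (AMD-V), which Linux exposes in /proc/cpuinfo. The sketch below parses that text; note that IOMMU (VT-d / AMD-Vi) support is a chipset/BIOS feature and is not visible in the CPU flags.

```python
# Detect the hardware-virtualization flags required by the spec list:
# "vmx" (Intel VT-x) or "svm" (AMD-V), as reported in /proc/cpuinfo.
def virtualization_flags(cpuinfo_text: str) -> set:
    """Return the virtualization-related CPU flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            found |= {"vmx", "svm"} & set(line.split())
    return found

# Example against a trimmed /proc/cpuinfo excerpt:
sample = "processor\t: 0\nflags\t\t: fpu vme de vmx sse sse2\n"
print(virtualization_flags(sample))  # {'vmx'}
```

On a real host, pass the contents of /proc/cpuinfo to the same function; an empty result means the CPU (or the BIOS) does not expose hardware virtualization.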