September 13th, 2018 by Adam Armstrong
NVIDIA Launches AI Data Center Platform
At GTC in Tokyo, NVIDIA announced the launch of its new AI data center platform, the NVIDIA TensorRT Hyperscale Platform. NVIDIA states that the platform delivers the most advanced inference acceleration for voice, video, image, and recommendation services. The platform leverages the company’s new Tesla T4 GPUs along with a comprehensive set of new inference software.
Data centers now process all types of queries, including voice, translations, images, videos, and various social media interactions. To address these different queries—each of which may require a different type of neural network—organizations need to leverage AI. NVIDIA’s new TensorRT Hyperscale Platform is a combination of hardware and software aimed at addressing these needs. Leveraging Tesla T4 GPUs, based on the company’s Turing architecture, the new platform is designed to deliver high performance with low latency for end-to-end applications.
Key elements include:
- NVIDIA Tesla T4 GPU – Featuring 320 Turing Tensor Cores and 2,560 CUDA cores, this new GPU provides breakthrough performance with flexible, multi-precision capabilities, from FP32 to FP16 to INT8, as well as INT4. Packaged in an energy-efficient, 75-watt, small PCIe form factor that easily fits into most servers, it offers 65 teraflops of peak FP16 performance, 130 TOPS for INT8, and 260 TOPS for INT4.
- NVIDIA TensorRT 5 – An inference optimizer and runtime engine, NVIDIA TensorRT 5 supports Turing Tensor Cores and expands the set of neural network optimizations for multi-precision workloads.
- NVIDIA TensorRT inference server – This containerized microservice software enables applications to use AI models in data center production. Freely available from the NVIDIA GPU Cloud container registry, it maximizes data center throughput and GPU utilization, supports all popular AI models and frameworks, and integrates with Kubernetes and Docker.
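To make the multi-precision point above concrete, here is a minimal pure-Python sketch of symmetric INT8 quantization—the kind of reduced-precision arithmetic the T4's Tensor Cores accelerate in hardware. This is an illustration only, not NVIDIA's implementation; real inference stacks such as TensorRT calibrate quantization scales per tensor or per channel from representative data rather than using a single max-based scale as shown here.

```python
def quantize_int8(values):
    """Map floats onto the signed 8-bit range [-127, 127] with a single scale.

    Illustrative symmetric quantization: one scale for the whole tensor,
    derived from its largest absolute value.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [x * scale for x in q]

# Hypothetical weight values, chosen only for illustration.
weights = [0.91, -0.42, 0.057, -1.27, 0.333]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)

# Round-trip error stays within half a quantization step (scale / 2),
# which is why INT8 inference can preserve accuracy while doubling
# throughput relative to FP16 on hardware like the T4.
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
print(q, round(max_err, 4))
```

The storage and bandwidth savings fall out directly: each INT8 value occupies one byte instead of FP32's four, and the T4's rated 130 TOPS at INT8 versus 65 teraflops at FP16 reflects the same halving of bits yielding a doubling of peak throughput.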