VAST Data Deep Learning Data Platform – Built for AI

by Harold Fritts

VAST Data introduced a data computing platform designed to be the foundation of AI-assisted discovery. The VAST Data Platform is the latest offering unifying storage, database, and virtualized compute engine services in a scalable system built from the ground up for the future of AI.

The VAST Data Platform was built to encompass the volumes of global data generated and processed in real time, including unstructured and structured data such as video, imagery, free text, data streams, and instrument data. This approach aims to close the gap between event-driven and data-driven architectures with the ability to access and process data in any private or major public cloud data center, embed queryable semantic layers into the data to better understand natural data, and compute on data in real time, continuously and recursively, with each interaction.

Beyond Large Language Models to AI-Assisted Discovery

Generative AI and Large Language Models (LLMs) introduced the world to the early capabilities of artificial intelligence; however, LLMs are limited to performing routine tasks like business reporting or reciting already-known information. Only when machines can recreate the process of discovery by capturing, synthesizing, and learning from data will the true promise of AI be realized. This level of specialization can now be achieved in a matter of days rather than decades.

AI-driven discovery will accelerate the search for solutions to our biggest challenges: finding treatments for diseases and cancers, tackling climate change, developing innovative approaches to agriculture, and uncovering new fields of science and mathematics. Existing data platforms are popular with global enterprises because they dramatically reduce infrastructure deployment complexity for business intelligence and reporting applications. However, they do not yet meet the needs of new deep-learning applications.

The next generation of AI infrastructure must deliver parallel file access, GPU-optimized performance for neural network training and inference on unstructured data, and a global namespace spanning hybrid multi-cloud and edge environments, all unified within one easy-to-manage offering that enables federated deep learning.

DASE: The Heart of VAST Data Platform

From its beginning, VAST has put natural data, rich metadata, functions, and triggers at the center of the VAST Disaggregated Shared-Everything (DASE) distributed systems architecture. By eliminating tradeoffs of performance, capacity, scale, simplicity, and resilience, DASE has laid the data foundation for deep learning, making it possible to train models on the entirety of an enterprise’s data. Because customers can add logic to the system, machines can continuously and recursively enrich and understand data from the natural world.

The new announcements from VAST lay out a roadmap for accelerating training workflows. For large enterprises, a rapid implementation path for generative AI is paramount. VAST plans to help achieve this by running transformer-type functions on objects stored on its platform. Take, for example, random distortions applied to a set of training images: the functions coming to the VAST platform would allow training data to be transformed as it is needed, rather than preprocessed ahead of time at the cost of consuming more storage.
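The storage trade-off in the image example can be illustrated with a minimal Python sketch. It is not VAST's API: the names `random_distort` and `training_stream` are hypothetical, and small lists of grayscale pixel values stand in for stored image objects. The point is only that distorted variants are generated when they are requested, so the store holds the originals and nothing else.

```python
import random
from typing import Iterator, List

Image = List[List[int]]  # grayscale pixels, 0-255; a stand-in for a stored object


def random_distort(image: Image, rng: random.Random) -> Image:
    """Return a randomly flipped, brightness-shifted copy; the original is untouched."""
    flip = rng.random() < 0.5
    shift = rng.randint(-20, 20)
    rows = [row[::-1] if flip else row[:] for row in image]
    # Clamp each shifted pixel back into the valid 0-255 range.
    return [[min(255, max(0, p + shift)) for p in row] for row in rows]


def training_stream(stored: List[Image], epochs: int, seed: int = 0) -> Iterator[Image]:
    """Yield one freshly distorted variant per stored image per epoch."""
    rng = random.Random(seed)
    for _ in range(epochs):
        for image in stored:
            yield random_distort(image, rng)


# Two stored originals serve six training samples without writing any copies.
stored_images = [[[10, 20, 30], [40, 50, 60]], [[200, 210, 220], [230, 240, 250]]]
variants = list(training_stream(stored_images, epochs=3))
print(len(stored_images), "stored,", len(variants), "served")
```

Preprocessing the same three epochs ahead of time would mean storing six extra images here, and the gap grows with every additional epoch or distortion type.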

VAST's promise of faster training workflows opens a new horizon for generative AI within enterprises that require high fidelity, rapid retraining response, and complex modeling. Highly regulated industries stand to benefit enormously. Analysts can leverage VAST’s capabilities to run transformer functions on objects, generating detailed models that would be time- and space-consuming to create manually. Real-time generation and alteration of graphical elements can enhance the creative workflow as well, allowing for a more dynamic and interactive design process.

Unified Global DataStore, DataBase, And AI Computing Engine

The VAST DataStore is a scalable storage architecture for unstructured data that eliminates storage tiering. VAST engineered the DataStore first, as the foundation of its platform, to capture and serve data from the natural world. The VAST DataStore is an enterprise network-attached storage platform built to meet the needs of robust AI computing architectures, such as NVIDIA DGX SuperPOD AI supercomputers and big-data and HPC platforms.

The efficiency of the exabyte-scale DataStore brings archive economics to flash infrastructure, making it suitable for archive applications. Resolving the cost of flash storage is critical to laying the foundation for deep learning for enterprise customers as they look to train models on their proprietary data assets.

VAST DataBase

VAST DataBase has been introduced to apply structure to unstructured natural data. By combining the characteristics of a database, a data warehouse, and a data lake all in one simple, distributed, and unified database management system, VAST has resolved the tradeoffs between transactions (to capture and catalog natural data in real-time) and analytics (to analyze and correlate data in real-time). Designed for rapid data capture and fast queries at any scale, the VAST DataBase breaks the barriers of real-time analytics from the event stream to the archive.

With a foundation for synthesized structured and unstructured data, the VAST Data Platform makes it possible to refine and enrich raw unstructured data into structured, queryable information with support for functions and triggers. The VAST DataEngine is a global function execution engine that consolidates data centers and cloud regions into one global computational framework. The engine supports popular programming languages, such as SQL and Python. It introduces an event notification system and materialized, reproducible model training, making it easier to manage AI pipelines.
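The general pattern of functions and triggers turning raw objects into queryable rows can be sketched as follows. This is an illustration under stated assumptions, not the DataEngine interface: an in-memory SQLite table stands in for the structured side, a plain dictionary stands in for the object store, and `on_object_created`, `enrich`, and `put_object` are hypothetical names.

```python
import sqlite3
from typing import Callable, Dict, List

# Hypothetical stand-ins: SQLite plays the structured database,
# a dict plays the unstructured object store.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE notes (path TEXT PRIMARY KEY, word_count INTEGER, mentions_gpu INTEGER)"
)
objects: Dict[str, str] = {}
triggers: List[Callable[[str], None]] = []


def on_object_created(func: Callable[[str], None]) -> Callable[[str], None]:
    """Register func to run whenever a new object is written."""
    triggers.append(func)
    return func


@on_object_created
def enrich(path: str) -> None:
    """Derive structured columns from the raw text and store them as a row."""
    text = objects[path]
    db.execute(
        "INSERT OR REPLACE INTO notes VALUES (?, ?, ?)",
        (path, len(text.split()), int("gpu" in text.lower())),
    )


def put_object(path: str, text: str) -> None:
    """Write an object, then notify every registered trigger."""
    objects[path] = text
    for func in triggers:
        func(path)


put_object("notes/a.txt", "Training ran on eight GPU nodes")
put_object("notes/b.txt", "Archive tier retired")

# The raw text is now queryable through ordinary SQL.
rows = db.execute("SELECT path FROM notes WHERE mentions_gpu = 1").fetchall()
print(rows)
```

Each write produces an event, the registered function runs in response, and the derived row becomes available to SQL queries without a separate batch job.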

VAST DataSpace

The final element of the VAST Data Platform strategy is the VAST DataSpace. This global namespace permits every location to store, retrieve, and process data from any location with high performance while enforcing strict consistency across every access point. With the DataSpace, the VAST Data Platform is deployable in on-premises data centers and edge environments. It now also extends DataSpace access into leading public cloud platforms, including AWS, Microsoft Azure, and Google Cloud.

This global, data-defined computing platform takes a new approach to marrying unstructured data with structured data by storing, processing, and distributing that data from a single, unified system.

The VAST DataStore, DataBase, and DataSpace are generally available within the VAST Data Platform today. The VAST DataEngine will be made available in 2024.

Learn more by visiting VAST’s BuildBeyond.ai.
