September 17th, 2015 by Adam Armstrong
Hedvig Incorporated Company Overview
As more companies look to software-defined storage (SDS) and software-defined data centers (SDDC), they have to rethink their own data center architecture. A growing number of vendors and startups are releasing products that challenge traditional storage (software-defined, hyper-converged), but these products run into various limitations. This is where Hedvig enters the picture with its highly scalable SDS that is simple and flexible but, most importantly, elastic.
Avinash Lakshman founded Hedvig Incorporated (H.E.D.V.I.G. = Hyperscale Elastic Distributed Virtual Intelligent Granular) in 2012. That name may ring a few bells, as Lakshman co-invented Amazon Dynamo in 2004. Lakshman worked at Amazon from 2004 until 2007, the year he invented Apache Cassandra. Initially built to power the Inbox Search feature at Facebook, Apache Cassandra can search huge amounts of data with terrific fault tolerance, and it has gone on to be used by giants such as Apple and Wikipedia. It was through the creation of Cassandra that Lakshman got the idea behind Hedvig.
Since coming out of stealth in March of this year, Hedvig has already accomplished a few substantial feats, including landing on CRN’s emerging vendor list, being named a TiE50 winner out of 2,716 candidates, and securing over $30 million in funding. Its current customers include Intuit Inc., Dovilo, Van Dijk Education, and Paul Hastings LLP.
Hedvig’s product is its Distributed Storage Platform, which Hedvig calls rearchitected storage. While plenty of new SDS platforms are emerging across the market, Hedvig sets itself apart with its patented distributed systems technology. The technology comprises three primary components: Hedvig Storage Service, Hedvig Storage Proxy, and Hedvig Virtual Disk.
Hedvig Storage Service
Hedvig states that it can transform everyday, existing hardware into modern storage. Hedvig’s software can be deployed on any x86 or ARM server, as well as in cloud environments. This lets customers use commodity hardware (SSD, HDD, or both) or, in the case of cloud environments, their deployment of choice. Hedvig Storage Service writes the data directly to the storage media. It captures all random writes coming into the system and orders them sequentially in a log-structured format, so that what gets flushed to disk is a sequential write. This lets the Distributed Storage Platform ingest data at a high rate while optimizing disk utilization.
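The log-structured approach described above can be illustrated with a small sketch. This is not Hedvig's implementation; the class name, the in-memory "log", and the flush threshold are all hypothetical and chosen only to show how writes to random addresses can be buffered and then appended in one sequential pass.

```python
from dataclasses import dataclass, field


@dataclass
class LogStructuredStore:
    """Toy model (hypothetical): random writes become sequential log appends."""
    flush_threshold: int = 4                      # assumed batch size
    buffer: list = field(default_factory=list)    # in-memory write buffer
    log: list = field(default_factory=list)       # stands in for the on-disk log
    index: dict = field(default_factory=dict)     # block address -> position in log

    def write(self, block_addr: int, data: bytes) -> None:
        # Writes arrive in any address order and are only buffered here.
        self.buffer.append((block_addr, data))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Append the whole batch to the end of the log in arrival order.
        for block_addr, data in self.buffer:
            self.index[block_addr] = len(self.log)   # newest position wins
            self.log.append((block_addr, data))
        self.buffer.clear()

    def read(self, block_addr: int):
        # Check unflushed writes first (newest first), then the log via the index.
        for addr, data in reversed(self.buffer):
            if addr == block_addr:
                return data
        pos = self.index.get(block_addr)
        return None if pos is None else self.log[pos][1]


store = LogStructuredStore()
for addr, data in [(907, b"a"), (12, b"old"), (455, b"c"), (12, b"new")]:
    store.write(addr, data)
```

In this sketch an overwrite never modifies data in place: the second write to address 12 is simply appended, and the index is updated to point at the newer entry.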
Replication and protection of data happen in the Hedvig Storage Service, which uses a combination of synchronous and asynchronous replication to distribute data across the cluster, supporting up to four active data centers and up to six copies of the data in a single cluster. This distribution helps Hedvig manage disk failure: data is automatically read from other replicas across the cluster, and all relevant nodes and disks are then used for a faster rebuild. Hedvig states that this rebuild won’t impact primary I/O, and that even in the event of a node failure, reads and writes continue as usual using the remaining replicas.
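To make the idea of reading from surviving replicas concrete, here is a minimal sketch. It assumes a simple hash-based placement over a sorted node list, which is a stand-in and not Hedvig's actual placement or replication logic; the function names and node names are hypothetical.

```python
import hashlib


def place_replicas(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Pick `replication_factor` distinct nodes for a key (assumed placement scheme)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds node count")
    ordered = sorted(nodes)
    # Hash the key to choose a starting node, then take the next nodes in order.
    start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(ordered)
    return [ordered[(start + i) % len(ordered)] for i in range(replication_factor)]


def pick_healthy_replica(key: str, replicas: list[str], failed: set[str]) -> str:
    """Return the first replica that has not failed, so the read can continue."""
    for node in replicas:
        if node not in failed:
            return node
    raise RuntimeError(f"no healthy replica for {key!r}")


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
replicas = place_replicas("vdisk1/block42", nodes, replication_factor=3)

# Simulate losing the first replica: the read falls through to the next one.
survivor = pick_healthy_replica("vdisk1/block42", replicas, failed={replicas[0]})
```

With a replication factor of three, the read only fails if all three chosen nodes are down at once.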
Hedvig Storage Proxy
The second component of the Distributed Storage Platform is the Storage Proxy. The Storage Proxy is an abstraction that is presented via either a VM or a container at the application tier. This gives each physical host access to storage and enables the Distributed Storage Platform to operate in existing environments. Users won’t have to change hypervisors, guest VMs, OSes, or applications, so there is no need to learn new processes or spend time adopting them. The Storage Proxy can present block, file, or object storage to any compute environment.
The Storage Proxy presents virtual disks as locally mounted storage and traps local I/O, converting traffic to the Hedvig RPC protocol for communication with the underlying storage cluster. The proxy provides a client-side cache using SSDs and PCIe devices. Here the data is deduplicated before being transmitted over network links. The Storage Proxy also enables high availability through an active/passive HA pair: if the active instance of the Storage Proxy is lost, it automatically switches over to the passive instance.
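Deduplicating before data crosses the network can be sketched as follows. This is an illustration under assumptions, not the Hedvig RPC protocol: the class name, the fixed chunk size, and the use of SHA-256 fingerprints are all hypothetical choices, and the byte counter stands in for an actual network transfer.

```python
import hashlib


class DedupSender:
    """Toy proxy-side dedup: only chunks with unseen fingerprints are 'sent'."""

    def __init__(self, chunk_size: int = 4096):   # assumed fixed chunk size
        self.chunk_size = chunk_size
        self.seen: set[str] = set()               # fingerprints already transmitted
        self.bytes_sent = 0                       # stands in for network traffic

    def send(self, payload: bytes) -> list[str]:
        """Return the ordered fingerprint list that describes the payload."""
        recipe = []
        for off in range(0, len(payload), self.chunk_size):
            chunk = payload[off:off + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self.seen:
                self.seen.add(fingerprint)
                self.bytes_sent += len(chunk)     # new data: would go over the wire
            recipe.append(fingerprint)            # duplicates cost only a reference
        return recipe


sender = DedupSender(chunk_size=4)
recipe = sender.send(b"AAAABBBBAAAAAAAA")   # four chunks, two of them unique
```

Here the 16-byte payload contains only two distinct 4-byte chunks, so the sketch counts 8 bytes as transmitted while the recipe still references all four chunks.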
Hedvig Virtual Disk
Storage is presented as a scalable abstraction called the Virtual Disk. In a matter of seconds, users can create, provision, or remove an unlimited number of Virtual Disks. This can be done through the CLI, GUI, or via API calls directly to the cluster. Users can configure the Virtual Disks with several attributes, including name, description, size, block size, disk type, residence (HDD or flash), raw device mapping, clustered file system, client-side caching, compression, deduplication, replication policy (agnostic, rack aware, or datacenter aware), and replication factor. Users can also set up unlimited snapshots and clones of Virtual Disks.
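The attribute list above maps naturally onto a configuration record. The sketch below is hypothetical and is not Hedvig's API or CLI: the class and field names, the defaults, the units, and the `disk_type` values are assumptions, and the 1 to 6 bound on replication factor is inferred from the "up to six copies" figure mentioned earlier.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicationPolicy(Enum):
    AGNOSTIC = "agnostic"
    RACK_AWARE = "rack aware"
    DATACENTER_AWARE = "datacenter aware"


@dataclass(frozen=True)
class VirtualDiskSpec:
    """Hypothetical record of the Virtual Disk attributes listed in the article."""
    name: str
    size_gb: int                          # assumed unit
    description: str = ""
    block_size: int = 4096                # assumed default, in bytes
    disk_type: str = "block"              # assumed value; real options not given
    residence: str = "HDD"                # "HDD" or "flash"
    raw_device_mapping: bool = False
    clustered_file_system: bool = False
    client_side_caching: bool = False
    compression: bool = False
    deduplication: bool = False
    replication_policy: ReplicationPolicy = ReplicationPolicy.AGNOSTIC
    replication_factor: int = 3           # assumed default

    def __post_init__(self):
        if self.size_gb <= 0:
            raise ValueError("size must be positive")
        if self.residence not in ("HDD", "flash"):
            raise ValueError("residence must be 'HDD' or 'flash'")
        if not 1 <= self.replication_factor <= 6:   # assumed bound, see lead-in
            raise ValueError("replication factor must be between 1 and 6")


spec = VirtualDiskSpec(
    name="vm-datastore-01",
    size_gb=500,
    residence="flash",
    deduplication=True,
    replication_policy=ReplicationPolicy.RACK_AWARE,
)
```

Validating the record at construction time means a bad combination is rejected before any request would be sent to a cluster.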
Deployment Options And Use Cases
The Hedvig Distributed Storage Platform supports two different types of deployment, and both can be used within the same cluster. In a hyperscale deployment, the Storage Service runs on commodity servers and the Storage Proxy runs on application hosts. In a hyperconverged deployment, the Storage Service and Storage Proxy run on the same server. Users can set up a hyperscale system, where compute and storage scale separately, or a hyperconverged system, where they scale together.
The use cases best suited to the Hedvig Distributed Storage Platform are Server Virtualization, Private Cloud, and Big Data. Hedvig suits Server Virtualization because it supports multiple hypervisors with no changes needed to the OS, and companies can use commodity hardware. For Private Cloud, Hedvig already provides cloud-like provisioning and lets businesses pay as they grow, similar to the cloud. And Hedvig’s elastic architecture is a good fit for the elastic nature of Big Data.