by Josh Linden

Cloudian HyperStore Review

Cloudian’s HyperStore is a software-defined storage platform built on native support for the Amazon S3 API, plus integrations with several other cloud storage APIs. Cloudian offers HyperStore in the form of storage appliances and a HyperStore Operating Environment software solution for commodity hardware. HyperStore's tight cloud integrations underpin the platform's tiering, backup, replication, cold storage, and other functionality. They also allow Cloudian to support a variety of deployment options that can take advantage of a combination of local, remote, and cloud storage.

The HyperStore scale-out architecture distributes all data, metadata, configurations and operations across the cluster and supports deployment across multiple datacenters. HyperStore leverages object streaming and dynamic auto-tiering functionality to move data between the on-premises cloud and remote cloud storage services at scales up to thousands of servers and hundreds of petabytes of data in multiple data centers.

HyperStore implements the 51 operations required to meet the standard for "advanced" Amazon S3 compatibility, which allows developers and administrators to deploy storage that works natively with the Amazon S3 SDK. HyperStore was also developed from the ground up with support for multi-tenant deployments, along with the QoS, billing, and reporting features that resellers and service providers need to put the platform to use in managed service provider environments. Users can run Hadoop analytics directly on HyperStore software and appliances as well.
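In practice, S3 compatibility means a client should only need to be pointed at a different endpoint. The sketch below builds an AWS Signature Version 4 presigned GET URL using only the Python standard library, so the signing steps are visible. It assumes the deployment accepts Signature Version 4 and path-style URLs; the host name s3.hyperstore.example.com, the region name, the bucket, and the credentials are all hypothetical placeholders rather than values from a real HyperStore instance.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key, msg):
    """HMAC-SHA256 of a text message, returned as raw bytes."""
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def presign_get(endpoint_host, bucket, key, access_key, secret_key,
                amz_date, region="us-east-1", expires=3600):
    """Build a SigV4 presigned GET URL (path-style) for an S3-compatible host.

    amz_date is a UTC timestamp such as '20160101T000000Z'.
    """
    date = amz_date[:8]
    scope = f"{date}/{region}/s3/aws4_request"
    path = "/" + quote(f"{bucket}/{key}", safe="/-_.~")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: URI-encoded and sorted by parameter name.
    query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    canonical_request = "\n".join(
        ["GET", path, query, f"host:{endpoint_host}", "", "host",
         "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request".
    signing_key = ("AWS4" + secret_key).encode()
    for part in (date, region, "s3", "aws4_request"):
        signing_key = _hmac(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode(), hashlib.sha256
    ).hexdigest()
    return f"https://{endpoint_host}{path}?{query}&X-Amz-Signature={signature}"


# Hypothetical endpoint and placeholder credentials.
url = presign_get("s3.hyperstore.example.com", "backups", "db/dump.tar",
                  "ACCESS_KEY_PLACEHOLDER", "SECRET_KEY_PLACEHOLDER",
                  amz_date="20160101T000000Z")
print(url)
```

Nothing in the signing logic is specific to Amazon's hosts, which is why existing S3 tools can usually be redirected by changing the endpoint alone. In day-to-day use an S3 SDK would handle all of this internally.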

One of the use cases for deploying HyperStore is to take advantage of CloudBerry’s backup functionality. CloudBerry Managed Backup can run on Amazon EC2 servers, with HyperStore storage integrating with Amazon S3, Amazon Glacier, Google Nearline, Windows Azure, OpenStack, and other cloud storage providers. Service providers using HyperStore with CloudBerry’s Managed Backup solution can also provide web access to users or customers for data stored in the HyperStore instance.

Cloudian has recently released HyperStore Connect for Files, a new feature that enables native support for SMB, NFS, and FTP. This support means that file-based storage is now plug-and-play with Cloudian HyperStore. HyperStore Connect for Files runs on top of a single global HyperStore object storage deployment, reducing costs and management complexity, and has two modules. Access Point is designed to be stateless and acts as a server that clients connect to in order to translate files to objects. Global View Manager delivers a global namespace and global file locking for distributed collaboration across locations.

To prepare this overview of the HyperStore platform, we worked with HyperStore in its software appliance form in addition to a small DIY configuration in our lab. The software appliance can be used to establish a HyperStore instance once it is deployed to a minimum of three Red Hat or CentOS server nodes. Current Cloudian hardware offerings include HyperStore FL3000 rack appliances, which feature eight storage nodes in 3U. Each 4U expansion unit can be deployed with up to 480TB. Cloudian's list price for a 12U, 576TB HyperStore appliance deployment with five years of support is $324,000.
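For a rough sense of scale, the quoted list price can be reduced to a cost per terabyte. The sketch below uses only the figures from the paragraph above ($324,000, 576TB, five years). It assumes the 576TB is raw capacity and that 1TB = 1,000GB, and it ignores capacity lost to replication or erasure coding as well as power and networking, so the result is a lower bound on the real cost per usable terabyte.

```python
# Back-of-the-envelope cost from the quoted list price.
LIST_PRICE_USD = 324_000   # 12U appliance deployment with five years of support
RAW_CAPACITY_TB = 576      # assumed raw, before replication or erasure coding
SUPPORT_YEARS = 5

usd_per_tb = LIST_PRICE_USD / RAW_CAPACITY_TB
usd_per_tb_year = usd_per_tb / SUPPORT_YEARS
# Assumes decimal units: 1 TB = 1,000 GB.
usd_per_gb_month = usd_per_tb_year / 12 / 1000

print(f"${usd_per_tb:,.2f} per raw TB")            # $562.50 per raw TB
print(f"${usd_per_tb_year:,.2f} per raw TB/year")  # $112.50 per raw TB/year
print(f"${usd_per_gb_month:.6f} per raw GB/month")
```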

Cloudian HyperStore Hardware Specifications and Options

  • HyperStore FL3000 µNode Chassis
    • Function: Data Serving
    • Form Factor: 3U
    • Drives: 16xSSDs (2 per µNode)
    • Modules: 8xµNode
    • Connectivity: 16x10G SFP+, 8x1GbE IPMI
    • Dimensions (LxWxH): 589mm x 438.4mm x 132.5mm 23.2” x 17.26” x 5.22”
    • Weight: 88lbs (39.92 kg)
    • Drive Size: 480GB SSD MLC
    • Power Supply: (2) 1620W Output @ 180-240V 10.5-8A, 50-60Hz
    • Cooling: 4x 8cm 11K RPM, 4-pin PWM cooling fans
    • RoHS: Compliant
    • Disk Zones: 1 Zone per Node
  • HyperStore Expansion Shelf
    • Function: Data Storage
    • Form Factor: 4U
    • Drives: 60xHDDs
    • Modules: (2) Hot-swappable SAS Interface Modules (SIM) & (4) Hot-swappable Internal SAS Interface Modules (ISIM)
    • Connectivity: 2x4-port 6Gb/s mini-SAS ports
    • Dimensions (LxWxH): 1103.1mm x 447mm x 175.3mm 43.43” x 17.60” x 6.90”
    • Weight: 187.39 lbs (85 kg) with HDDs
    • Drive Size: 2, 4, 6, 8 TB SATA 7200rpm
    • Power Supply: (2) 1400W high efficiency redundant PSUs 200-240VAC, 50/60 Hz
    • Cooling: (7+1) Rotors redundant fan modules per system
    • RoHS: Compliant
    • Disk Zones: 2 or 4 Zones per shelf
  • HyperStore FL3020 µNode
    • Data Disks per Node: 15
    • Supported Drive Types: 2, 4, 5, 6 TB
    • Max Capacity/Node: 30TB, 60TB, 75TB, 90TB
    • CPU Type: Intel E5-2640 v2, 2GHz, 8 cores
    • Memory: 64GB
    • Connectivity: 2x10GbE SFP+ Port, 1x1GbE IPMI LAN Port
    • HyperStore OS Disks: 2x480GB SSDs
    • Disk Connectivity: 2x6Gbps SAS Ports
    • KVM: 1xVGA, 1xCOM, and 2xUSB 2.0 (with KVM dongle) ports
    • Switch: Power
  • HyperStore FL3050 µNode
    • Data Disks per Node: 30
    • Supported Drive Types: 2, 4, 5, 6 TB
    • Max Capacity/Node: 60TB, 120TB, 150TB, 180TB
    • CPU Type: Intel E5-2640 v2, 2GHz, 8 cores
    • Memory: 128GB
    • Connectivity: 2x10GbE SFP+ Port, 1x1GbE IPMI LAN Port
    • HyperStore OS Disks: 2x480GB SSDs
    • Disk Connectivity: 2x6Gbps SAS Ports
    • KVM: 1xVGA, 1xCOM, and 2xUSB 2.0 (with KVM dongle) ports
    • Switch: Power
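
The Max Capacity/Node figures in the list above are simply the data disks per node multiplied by the drive size. A quick check, using only values from the specification list (these are raw figures, and usable capacity will depend on the storage policy):

```python
# Data disks per node and supported drive sizes (TB), from the spec list.
NODES = {
    "FL3020": {"data_disks": 15, "drive_sizes_tb": [2, 4, 5, 6]},
    "FL3050": {"data_disks": 30, "drive_sizes_tb": [2, 4, 5, 6]},
}


def max_capacity_tb(data_disks, drive_sizes_tb):
    """Raw capacity per node (TB) for each supported drive size."""
    return [data_disks * size for size in drive_sizes_tb]


for model, spec in NODES.items():
    print(model, max_capacity_tb(**spec))
# FL3020 [30, 60, 75, 90]
# FL3050 [60, 120, 150, 180]

# A 60-drive expansion shelf filled with 8TB drives reaches the 480TB ceiling.
shelf_max_tb = 60 * 8
print("Expansion shelf:", shelf_max_tb, "TB")  # Expansion shelf: 480 TB
```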

Operating System and Environment

HyperStore uses web-based administration for system and cluster monitoring and data management, as well as to provide management interfaces for users, groups, rating plans, Quality of Service controls, and billing. RESTful API options are available for integration with other provisioning, authentication, and billing systems.

HyperStore’s access management system provides identity and security workflows for users and administrators, including management of billing and charge-back policies for service providers. Multiple credentials per user are supported along with configurable group- and user-based QoS quotas for storage and bandwidth in multi-tenant clouds.

HyperStore deployments utilize up to three distributed filesystems: the Cassandra File System (CASSANDRA), the HyperStore File System (HFS), and Erasure Code (EC). CASSANDRA is used for metadata indexes and also to optimize storage of small files. The HyperStore File System is the data storage layer, and it can use either replication or erasure coding to store objects. Administrators can choose the storage method for each pool of storage (called a bucket). Erasure coding provides high data durability and availability with minimal space overhead (as low as 20% in some deployments). The tradeoff is higher latency when accessing objects and more processing. Erasure coding is commonly used for large backup and archive workloads, while replication is used when faster access is required and for cross-region replication. Schedule-based automatic transition (Cloudian’s term for auto-tiering) is available from HyperStore storage to Amazon S3 storage, Amazon Glacier storage, a remote HyperStore deployment, or a third-party HyperStore service.
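The space tradeoff between replication and erasure coding comes down to simple arithmetic: a K+M erasure code stores M parity fragments for every K data fragments, so the extra space is M/K of the object size, while N-way replication costs N-1 full extra copies. The sketch below computes that overhead and then demonstrates recovery from a lost fragment with a toy 2+1 scheme built on XOR parity. The XOR scheme is an illustrative stand-in chosen for brevity, not Cloudian's actual coding algorithm, and the 10+2 and three-replica policies are example values rather than documented HyperStore defaults.

```python
def ec_overhead(k, m):
    """Extra space used by a K+M erasure code, as a fraction of object size."""
    return m / k


def replication_overhead(copies):
    """Extra space used by keeping `copies` full replicas."""
    return float(copies - 1)


print(f"EC 2+1:     {ec_overhead(2, 1):.0%} overhead")        # 50%
print(f"EC 10+2:    {ec_overhead(10, 2):.0%} overhead")       # 20%
print(f"3 replicas: {replication_overhead(3):.0%} overhead")  # 200%


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_plus_1(data):
    """Split data into two equal fragments plus one XOR parity fragment."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length (toy scheme only)
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    return [d0, d1, xor_bytes(d0, d1)]


def recover(fragments, length):
    """Rebuild the original bytes when at most one fragment is None."""
    d0, d1, parity = fragments
    if d0 is None:
        d0 = xor_bytes(d1, parity)
    elif d1 is None:
        d1 = xor_bytes(d0, parity)
    return (d0 + d1)[:length]


payload = b"object stored in a bucket"
frags = encode_2_plus_1(payload)
frags[1] = None  # simulate losing one data fragment
assert recover(frags, len(payload)) == payload
```

The recovery step is also where the extra latency and processing come from: a degraded read has to fetch the surviving fragments and recompute the missing one, whereas a replicated object can be served from any intact copy.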

HyperStore protects data at rest with AES-256 server-side encryption and data in transit with SSL encryption via HTTPS. The HyperStore File System also incorporates three optional compression schemes: snappy, lz4, and zlib. Snappy emphasizes speed over compression, with a rated compression throughput of 250MB/s or more and a decompression rate of 500MB/s or more. lz4 features a lower compression ratio than zlib with a rated compression speed of 400MB/s per core. zlib offers a medium compression ratio and speed with a high decompression rate.
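The ratio-versus-speed tradeoff between compression schemes is easy to observe with Python's built-in zlib module. This sketch compresses a synthetic, highly repetitive payload at a fast and a thorough zlib level and reports the ratio and throughput for each. Snappy and lz4 are left out because they require third-party packages. The payload and the chosen levels are arbitrary assumptions, and the measured numbers say nothing about HyperStore's own implementation or the rated speeds quoted above.

```python
import time
import zlib

# Synthetic log-like payload (roughly 2 MB); real data compresses differently.
payload = b"".join(
    f"2016-01-01T00:00:{i % 60:02d} GET /bucket/object-{i} 200\n".encode()
    for i in range(40_000)
)


def measure(level):
    """Return (compression ratio, MB/s) for one zlib compression level."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    assert zlib.decompress(compressed) == payload  # lossless round trip
    ratio = len(payload) / len(compressed)
    mb_per_s = len(payload) / 1e6 / elapsed
    return ratio, mb_per_s


for level in (1, 9):
    ratio, speed = measure(level)
    print(f"zlib level {level}: ratio {ratio:.1f}x, {speed:.0f} MB/s")
```

Running the same measurement on a sample of the data actually headed for a bucket is a reasonable way to decide whether compression is worth enabling for that workload.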

In addition to HyperStore’s built-in functionality, HyperStore users can leverage several applications available from Cloudian partner CloudBerry Lab. The most notable of these offerings are the members of the CloudBerry Backup family, but CloudBerry also offers applications for simplified cloud-based file management and for mounting cloud storage as network drives, as well as CloudBerry Box, a bi-directional Dropbox-like tool to synchronize data across remote computers via a cloud storage account.

CloudBerry Backup includes scheduled and real-time backups, encryption and compression, bandwidth throttling, and block-level backup, as well as backup for MS SQL Server, MS Exchange, VMware, and Hyper-V. With the edition for Managed Service Providers, providers can monitor user backup history from a Managed Backup control panel and create sub-admin accounts with limited permissions.

Management

When signed in as the administrator, the main screen of the GUI shows the region across the top. Beneath the region is the capacity managed, with blue showing the amount used and green the amount free. Directly beneath the capacity managed is the cluster health, including any alerts currently active (there were some in our deployment). To the right of these are the transactions per second and throughput measured in KB/s; in both charts PUTs are blue and GETs are green. Along the bottom of the screen is the number of users, groups, objects, nodes, and datacenters managed, along with the software version.

Along the top of the screen are tabs for the main screen, analytics, buckets & objects, users & groups, cluster, alerts, settings, and help. To set up users and groups, admins click on the Users & Groups tab. Through this tab, a new group can be added and a specific QoS set for each group.

Admins can also create rating plans within this tab. The rating plans are created for the purpose of billing reports. Admins can also check on the account activity for a set group or user.

Through the Analytics tab, admins can review cluster usage, including region capacity consumption over time in GB, object transactions per second, and throughput in KB/s. Capacity Explorer shows capacity usage through a graphical representation. Again for billing reasons, admins can check usage by user, and they can also search for specific objects.

Through the Buckets & Objects tab, admins can upload, create, or search for buckets and objects. The available buckets are listed off to the left hand side. Underneath the search is the list of the objects, their size, and when they were last modified.

The Cluster tab gives admins several different looks at their cluster through a set of sub-tabs. The Data Centers sub-tab shows the cluster by region along with the health of the cluster: green means clear and amber means an alert. Each hexagon represents a node within the cluster.

There is a Nodes Status tab that shows overall status, indicating aspects such as the percentage of disk space used, the percentage of CPU utilization, detailed info about the disks being used, information on memory usage, as well as service status, and event lists.

The Node Activity tab gives admins a graphical representation of a particular operational aspect of the node. Users have several options, including CPU Utilization, Disk Available, Disk Reads, Disk Writes, Network Throughput (outgoing), Network Throughput (incoming), Transactions (Get), Transactions (Put), Request Throughput (Get), Request Throughput (Put), Average Request Latency (Get), Average Request Latency (Put), Admin Memory Heap Usage, Cassandra Memory Heap Usage, HyperStore Memory Heap Usage, and S3 Memory Heap Usage.

The Advanced settings let admins put a node into maintenance, disable disks, collect diagnostics, and uninstall a node.

The Cluster Config tab allows admins to view their cluster information as well as update their license, view and edit their cluster configuration settings, and set up auto-tiering. For auto-tiering, customers are expected to have a single Amazon account.

The Storage Policy tab enables admins to set up storage policies, including erasure coding schemes such as EC 2+1. This tab has point-and-click data distribution and a drop-down for selecting the erasure coding K+M value. After selecting these, admins can assign the policy by region and by data center within each region. Once the data centers are assigned, users can set data and metadata consistency levels as well as group visibility.

Notification Rules allow an email to be sent to a specific address for specific items within the node. Admins add the email address where they wish to receive the notification and then select which item they would like to be notified about through the given rules. For example, if they want to be notified when the cluster exceeds 90% CPU utilization, they can set that up through this tab.

The final sub-tab within the Cluster tab shows repair status and repair history.

The Alerts tab indicates when there has been an issue or a change within the system. Alerts can be sorted by node or region and cleared by acknowledging them.

Conclusion

HyperStore is a storage platform that exemplifies the opportunities made possible by the convergence of widely available cloud storage and open APIs. The HyperStore software appliance allows administrators to deploy a fully compliant S3 object storage cloud across commodity server hardware, with the option to scale seamlessly with the addition of new commodity hardware or purpose-built HyperStore hardware appliances.

By building HyperStore from the ground up for interoperability with Amazon S3 and other cloud platforms, Cloudian can offer customers solutions that make the most of third-party cloud providers for tiering, backup, replication, and other functions without having to manage two different storage environments with different architectures or management paradigms. This also means that HyperStore administrators and users can make use of HyperStore's native support for the well-established ecosystem of S3 applications.

The Bottom Line

HyperStore brings the power and flexibility of on-premises S3 cloud storage to commodity hardware, along with API integrations to make the most of offsite cloud storage from Amazon, Google, Microsoft, and others.

