Qumulo Shift - File to Object in One Easy Leap

Enterprise infrastructure is increasingly synonymous with hybrid cloud architecture. Today’s enterprise IT service delivery requires hybrid designs that leverage the best characteristics of “traditional” (on-premises) and public cloud solutions to meet SLA commitments to the business. To this end, p1 seeks out technologies with outstanding performance characteristics for on-prem workloads and software features that deliver seamless integration with public cloud platforms. One vendor we believe has adopted the “hybrid cloud is the way forward” approach in its product development strategy is Qumulo. The latest proof of its support for hybrid architecture is a new software tool called Qumulo Shift. In this post, we’ll provide an overview of what Shift is today and a few of the anticipated future features. We’ll also touch on what it is not. Finally, we’ll provide a link to a how-to video for setting up a Shift replication job.

Shift – A Work in Progress

Shift is a free software feature available in Qumulo’s QFS that replicates native file data to AWS S3 in object format. It’s a powerful but simple way to stage data sets in order to leverage the vast portfolio of AWS cloud services and applications. It is clear that Shift is a key piece of Qumulo’s data placement toolset. It is also evidence of Qumulo’s commitment to letting customers put their data wherever it best serves their needs, even if that means helping them move the data off their Qumulo storage platform. Shift treats data in S3 as an independent entity, not as an extension of your on-prem storage, and access to it is decoupled from the local cluster.

The first public release of Shift came in July 2020 with QFS 3.1.4, and it’s important to note up front that it is a work in progress. Qumulo’s approach has been to release well-tested basic features and then layer on functionality over time until they become full-featured components of the overall software suite. With Shift, the strategy is the same, mirroring Qumulo’s early releases of sync, snapshots and quotas. We fully expect Shift to grow.

Presently, Shift is limited to a one-time replication of a directory structure on a Qumulo storage array to a folder in an S3 bucket. The relationships are 1-to-1, but multiple relationships are supported. Files are stored as unmodified objects in S3 and are usable by any tool that supports the AWS S3 API or CLI. This includes most AWS services as well as third-party products like MSP360 and Cyberduck. Shift uses S3 metadata tags to store a checksum of each file to ensure data integrity after conversion. Today, there’s a 5 GB file size limit on objects moved into S3 by Shift, but Qumulo is working to increase that.
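Because of the 5 GB limit, it can be worth scanning a directory tree before setting up a replication relationship. The following is a minimal Python sketch of such a pre-flight check, not part of Shift itself. The function name find_oversized_files is our own, and we assume the limit means 5 GiB (5 * 1024**3 bytes); adjust the constant if Qumulo defines it differently.

```python
import os

# Assumption: the "5 GB" limit is interpreted as 5 GiB. Adjust if needed.
SHIFT_MAX_OBJECT_BYTES = 5 * 1024**3


def find_oversized_files(root, limit=SHIFT_MAX_OBJECT_BYTES):
    """Return sorted (relative_path, size_in_bytes) pairs for files larger than limit."""
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symbolic links; only regular files in the tree are measured.
                continue
            size = os.path.getsize(path)
            if size > limit:
                oversized.append((os.path.relpath(path, root), size))
    return sorted(oversized)
```

Calling find_oversized_files on the mounted source directory returns the files that would exceed the limit, so they can be split, excluded or handled another way before the job is created.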

The Roadmap

On the development roadmap are features such as:

Scheduled replication
Repeat transfer of data from QFS file to S3
Setup of transfer windows for performance control
Checks for re-transmission of objects using the checksum metadata tag
The ability to store ownership and permissions info as metadata tags on the S3 objects
Export of objects back into QFS file from S3, closing the data transport loop
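
To illustrate the idea behind checksum-based re-transmission checks, here is a small Python sketch under stated assumptions. Qumulo has not told us which checksum algorithm or metadata key Shift uses, so SHA-256 and the key name "checksum-sha256" are hypothetical, and the object metadata is represented as a plain dictionary rather than fetched from S3.

```python
import hashlib

# Hypothetical metadata key; the actual tag name used by Shift is not documented here.
CHECKSUM_KEY = "checksum-sha256"


def file_sha256(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_retransmit(local_path, object_metadata):
    """Return True if the object is missing a checksum or it differs from the local file."""
    stored = object_metadata.get(CHECKSUM_KEY)
    return stored is None or stored != file_sha256(local_path)
```

With a check like this, a repeat transfer could skip files whose stored checksum already matches the local copy and resend only new or changed files.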

In addition, we’ve suggested customer-friendly features such as:

The ability to add user-defined tags to objects when creating a replication relationship
Support for authorization via AWS Roles to comply with AWS security best practices for Qumulo clusters running in EC2
The ability to define S3 storage classes when creating a replication relationship

A couple of points should be made about what Shift is not today. Shift is not a backup strategy, at least not yet. There’s currently no easy way to return objects to files on a Qumulo cluster. There’s no data catalog and no concept of incremental transfers. It’s also not a fully featured archive solution, but with further feature development it could become a key part of a larger solution.

In summary, Shift is the beginning of a robust distributed data strategy for Qumulo and a foundation for future features, which could include data protection. For now, its primary value is easy conversion of files to objects, and for that alone it is valuable.

To see Shift in action, follow along in our video, where we walk through creating a replication relationship and moving data from an on-premises Qumulo cluster into an S3 bucket. Thanks for your consideration, and if you’d like to learn more, please reach out to the outstanding engineering team at p1 Technologies.