OpenIO

Industry: Information technology, Data storage, Data processing
Founded: 2015
Headquarters: France
Area served: Worldwide
Key people: Laurent Denel (CEO), Jean-François Smigielski (CTO)
Products: OpenIO Object Storage
Website: www.openio.io

OpenIO was founded in 2015 by Laurent Denel (CEO) and six co-founders to offer an object storage solution for building hyper-scalable IT infrastructures for a wide range of applications. [1] OpenIO builds on open source software, developed since 2006, that is based on a grid technology enabling dynamic behaviors and supporting heterogeneous hardware. [2] In October 2017, OpenIO completed a $5 million funding round. [3] In October 2019, OpenIO announced that it had crossed the terabit-per-second barrier in a benchmark run on infrastructure provided by Criteo. [4]

Product

OpenIO is a software-defined object storage solution that supports S3 and can be deployed on-premises, in the cloud or at the edge, on any mix of hardware. It was designed from the outset for performance and cost-efficiency at any scale [5], and it has been optimized for Big Data, HPC and AI workloads. [6]

OpenIO stores objects in a flat structure within a massively distributed directory with indirections. This allows the data query path to be independent of the number of nodes, so performance is not affected as capacity grows. Servers are organized as a massively distributed grid of nodes in which each node takes part in directory and storage services. This ensures that there is no single point of failure and that new nodes are automatically discovered and immediately available without the need to rebalance data. [7]
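The idea of a lookup path that does not depend on cluster size can be illustrated with a small sketch. It is not OpenIO's implementation: the class `Directory`, the function `shard_of`, the constant `PREFIX_BYTES` and the node names are all hypothetical, and the only point is that a name is resolved in a fixed number of steps regardless of how many nodes exist.

```python
import hashlib

PREFIX_BYTES = 2  # hypothetical: hash bytes used to choose a directory shard


def shard_of(container: str) -> int:
    """Map a container name to a directory shard via a fixed-size hash prefix."""
    digest = hashlib.sha256(container.encode("utf-8")).digest()
    return int.from_bytes(digest[:PREFIX_BYTES], "big")


class Directory:
    """Two levels of indirection: shard -> directory node, container -> storage nodes."""

    def __init__(self):
        self.shard_owner = {}      # shard id -> directory node name
        self.container_nodes = {}  # container name -> list of storage node names

    def register(self, container, directory_node, storage_nodes):
        """Record which directory node and storage nodes serve a container."""
        self.shard_owner[shard_of(container)] = directory_node
        self.container_nodes[container] = list(storage_nodes)

    def locate(self, container):
        """Resolve a container with two dictionary lookups, whatever the cluster size."""
        directory_node = self.shard_owner[shard_of(container)]
        return directory_node, self.container_nodes[container]


directory = Directory()
directory.register("photos", "dir-node-1", ["storage-a", "storage-b"])
location = directory.locate("photos")
```

Adding more storage nodes to this toy model only adds dictionary entries; the number of steps in `locate` stays the same.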

The software is built on top of ConsciousGrid, a technology designed to optimize data placement based on real-time metrics and to allow storage devices to be added or removed while automatically limiting the impact on performance and load. [8] [9] For data protection, OpenIO offers synchronous and asynchronous replication with multiple copies, as well as an erasure coding implementation based on Reed-Solomon that can be deployed in a single data center, in geo-distributed clusters, or in stretched clusters. [10] [11]
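The trade-off between multiple copies and Reed-Solomon erasure coding can be made concrete with a short calculation. The parameters below (3 copies, 6 data fragments plus 3 parity fragments) are illustrative assumptions, not values taken from OpenIO's documentation, and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per usable byte when keeping `copies` full copies."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per usable byte with k data and m parity fragments."""
    return (data_fragments + parity_fragments) / data_fragments


# Assumed example policies (not from the source):
three_copies = replication_overhead(3)   # tolerates the loss of 2 copies
rs_6_plus_3 = erasure_overhead(6, 3)     # tolerates the loss of any 3 fragments
```

Under these assumptions, three-way replication stores 3 bytes per usable byte, while the 6+3 scheme stores 1.5 bytes per usable byte and survives one more loss, at the cost of encoding and reconstruction work.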

The software includes a feature called GridForApps that captures all events occurring in the cluster and can pass them up the stack or to applications running on OpenIO nodes. This enables event-driven computing directly within the storage infrastructure. [12] [13]
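The general pattern of event-driven processing can be sketched as a publish/subscribe hub. This is a generic illustration only and does not use or reproduce the GridForApps API: `EventBus`, `index_metadata`, the event name `object.created` and the payload fields are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe hub (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event type -> list of callables

    def subscribe(self, event_type, handler):
        """Register a handler to be called for every event of this type."""
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        """Call every handler registered for event_type and collect the results."""
        return [handler(payload) for handler in self._handlers[event_type]]


def index_metadata(event):
    """Example handler: build a small index record from an object-created event."""
    return {"key": event["object"], "size": event["size"]}


bus = EventBus()
bus.subscribe("object.created", index_metadata)
results = bus.emit("object.created", {"object": "photos/cat.jpg", "size": 2048})
```

In a storage cluster the handler would run close to the data, for example to index metadata or trigger analysis when an object is written, which is the kind of use the cited blog posts describe.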

OpenIO has native object APIs and SDKs for Python, C and Java. It also provides an HTTP REST API and is highly compatible with the Amazon S3 API and the OpenStack Swift API. [7] The company also offers a proprietary file system connector for accessing data stored in an OpenIO object store through file access methods. It is based on FUSE and presents a POSIX file system that can be shared over local networks via NFS, SMB and FTP. [14]

OpenIO is compatible with x86 and ARMv7/ARMv8 servers running Linux [15] and has low hardware requirements. [16] It can also be installed on Raspberry Pi devices [17] [18] and on storage drives with an embedded server. [19] [20] [21]

The open source code is available on GitHub. The server code is licensed under AGPLv3 and the client code under LGPLv3.

Performance

OpenIO claims to have reached a write speed of 1.372 Tbps (171 GB/s) on a cluster of 350 physical machines. [22] The benchmark, conducted under production conditions on standard hardware (commodity servers with 7200 rpm HDDs), consisted of backing up a 38 PB Hadoop data lake using the DistCp command. [23] According to analysts [24], this level of performance marks the arrival of a new generation of object storage technologies oriented toward high performance and hyper-scalability.
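The reported figures can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 Tb = 10^12 bits, 1 PB = 10^15 bytes) and an even spread of load across machines. The transfer-time estimate additionally assumes the peak rate was sustained for the whole 38 PB, which the source does not state, so it is a lower bound rather than a reported result.

```python
PEAK_TBPS = 1.372    # reported peak write speed, terabits per second
MACHINES = 350       # reported cluster size
DATASET_PB = 38      # reported data lake size, petabytes

# Convert terabits per second to bytes per second (8 bits per byte).
bytes_per_second = PEAK_TBPS * 1e12 / 8

# Aggregate rate in GB/s; the article rounds this to 171 GB/s.
aggregate_gb_per_s = bytes_per_second / 1e9

# Average per-machine rate in MB/s, assuming an even spread of load.
per_machine_mb_per_s = bytes_per_second / MACHINES / 1e6

# Assumption: peak rate held for the full dataset (not claimed by the source).
min_transfer_hours = DATASET_PB * 1e15 / bytes_per_second / 3600

print(f"{aggregate_gb_per_s:.1f} GB/s aggregate")
print(f"{per_machine_mb_per_s:.0f} MB/s per machine")
print(f"{min_transfer_hours:.1f} hours at sustained peak")
```

The aggregate works out to about 171.5 GB/s, or roughly 490 MB/s per machine, consistent with the rounded figure quoted in the text.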

References

  1. "OpenIO Object Storage Overview". OpenIO Official Website. Retrieved 2019-09-10.
  2. "The History Boys: Object storage ... from the beginning". The Register. Retrieved 2017-10-05.
  3. Dillet, Romain. "OpenIO raises $5 million to build your own Amazon S3 on any storage device". TechCrunch. Retrieved 2017-10-26.
  4. "OpenIO To Lead In Big Data Storage, After Achieving Ultra-High Performance". OpenIO Object Storage. Retrieved 2019-11-22.
  5. "Openio's objective is opening up object storage space". The Register. Retrieved 2017-10-06.
  6. "OpenIO | High Performance Object Storage for Big Data and AI". OpenIO Official Website. Retrieved 2019-09-30.
  7. "OpenIO Core Concepts". OpenIO Documentation. Retrieved 2016-11-11.
  8. "OpenIO Object Storage for Big Data". OpenIO Official Website. Retrieved 2019-09-30.
  9. "Why We Designed an Object Store with a Conscience". OpenIO Blog. 2017-07-18. Retrieved 2019-10-01.
  10. "OpenIO Data Management Features". OpenIO Documentation. Retrieved 2019-09-30.
  11. "OpenIO Storage Policies". OpenIO Documentation. Retrieved 2019-10-01.
  12. "Simple Metadata Indexing through Grid for Apps". OpenIO Blog. Retrieved 2017-10-06.
  13. "Detect patterns in pictures at scale using Tensorflow and OpenIO GridForApps". OpenIO Blog. Retrieved 2017-10-06.
  14. "OpenIO File System Connector (OIO-FS) Architecture". OpenIO Documentation. Retrieved 2019-10-01.
  15. "OpenIO Supported Linux Distributions". OpenIO Documentation. Retrieved 2019-10-01.
  16. "OpenIO Sizing Guide". OpenIO Documentation. Retrieved 2019-10-01.
  17. "Raspberry Pi in the data center: A unique option for object storage and edge computing". TechRepublic. Retrieved 2017-08-28.
  18. "OpenIO on a Raspberry Pi". OpenIO Documentation. Retrieved 2019-10-01.
  19. "Open source object storage startup OpenIO adds hardware". SearchCloudStorage. Retrieved 2017-08-28.
  20. "OpenIO wants to turn your spinning rust into object storage nodes". The Register. Retrieved 2017-10-06.
  21. "Is this the real life? Is this just fantasy? Self-processing flash drives, we'll need more capacity". The Register. Retrieved 2017-10-06.
  22. "OpenIO 'solves' the problem with object storage hyperscalability – Blocks and Files".
  23. "Terabit Challenge | OpenIO Object Storage".
  24. "S3, file access and high performance… this is not your old object storage".