Alluxio

Original author(s): Haoyuan Li
Developer(s): UC Berkeley AMPLab
Initial release: April 8, 2013
Stable release: v2.9.4 / June 11, 2024 [1]
Repository: https://github.com/Alluxio/alluxio
Written in: Java
Operating system: macOS, Linux
Available in: Java
License: Apache License 2.0
Website: www.alluxio.io

Alluxio is an open-source virtual distributed file system (VDFS). Initially a research project named "Tachyon", Alluxio was created at the University of California, Berkeley's AMPLab as part of Haoyuan Li's Ph.D. thesis, [2] advised by Professors Scott Shenker and Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License.
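The idea of a common interface over several storage systems can be illustrated with a toy sketch. This is not Alluxio's API: the class names (StorageBackend, InMemoryBackend, VirtualFileSystem), the mount prefixes, and the file contents are all hypothetical, and in-memory dictionaries stand in for real storage systems.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical common interface that every backing store implements."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real storage system; keeps files in a dictionary."""

    def __init__(self, files: dict[str, bytes]):
        self._files = files

    def read(self, path: str) -> bytes:
        return self._files[path]


class VirtualFileSystem:
    """Routes a path in one namespace to the backend mounted at its prefix."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix: str, backend: StorageBackend) -> None:
        self._mounts[prefix.rstrip("/")] = backend

    def read(self, path: str) -> bytes:
        # Try the longest prefix first so nested mounts resolve correctly.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._mounts[prefix].read(path[len(prefix):] or "/")
        raise FileNotFoundError(path)


# Two independent stores appear under a single namespace.
vfs = VirtualFileSystem()
vfs.mount("/warehouse", InMemoryBackend({"/sales.csv": b"id,amount\n1,10\n"}))
vfs.mount("/logs", InMemoryBackend({"/app.log": b"started\n"}))
```

The caller only ever uses `vfs.read(...)` and does not need to know which backend holds a given path, which is the property the paragraph above describes.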

Data-driven applications, such as data analytics, machine learning, and AI, use APIs provided by Alluxio (such as the Hadoop HDFS API, S3 API, and FUSE API) to access data from various storage systems. Frameworks that run on top of Alluxio include Apache Spark, [3] Presto, TensorFlow, Trino, [4] Apache Hive, and PyTorch.[citation needed]
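A FUSE-style interface presents a file system as an ordinary directory, so an application can read data with standard file calls rather than a storage-specific client. The sketch below illustrates only that access pattern. It assumes nothing about how Alluxio is mounted: a temporary directory stands in for the mount point, and the function name read_dataset and the path datasets/train.csv are hypothetical.

```python
import tempfile
from pathlib import Path


def read_dataset(mount_root: Path, relative_path: str) -> bytes:
    """Read a file below a mounted directory using ordinary file I/O."""
    return (mount_root / relative_path).read_bytes()


# A temporary directory stands in for the mount point (an assumption made
# so the example runs anywhere; no real mount is involved).
with tempfile.TemporaryDirectory() as tmp:
    mount_root = Path(tmp)
    (mount_root / "datasets").mkdir()
    (mount_root / "datasets" / "train.csv").write_bytes(b"x,y\n1,2\n")
    sample = read_dataset(mount_root, "datasets/train.csv")
```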

Alluxio can be deployed on-premises, in the cloud (e.g. Microsoft Azure, AWS, Google Compute Engine), or in a hybrid cloud environment. It can run on bare metal or in containerized environments such as Kubernetes, Docker, and Apache Mesos.

Alluxio is also used to accelerate AI workloads, particularly model training and model distribution. Model training often requires access to large datasets stored across multiple platforms, including on-premises and cloud storage. Alluxio provides a unified data layer that caches frequently accessed data in memory, which reduces data retrieval latency and relieves I/O bottlenecks during training. It also supports model distribution by providing data access across heterogeneous storage systems, making it easier to share models and datasets between different computational frameworks and environments. Through integration with AI frameworks such as TensorFlow and PyTorch, Alluxio provides fast access to data regardless of where the data is physically stored.
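The caching behaviour described above can be sketched as a small read-through cache: the first read of a key goes to the slow backing store, and repeated reads are served from memory. This is an illustration under stated assumptions, not Alluxio's implementation. The class name ReadThroughCache is hypothetical, capacity is counted in entries rather than bytes, and least-recently-used eviction is chosen here only for simplicity.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """In-memory cache placed in front of a slower fetch function."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int):
        self._fetch = fetch          # called only on a cache miss
        self._capacity = capacity    # maximum number of cached entries
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        data = self._fetch(key)
        self._entries[key] = data
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return data


# A dictionary stands in for remote storage.
remote = {"a": b"1", "b": b"2", "c": b"3"}
cache = ReadThroughCache(remote.__getitem__, capacity=2)
for key in ["a", "a", "b", "c", "a"]:
    cache.read(key)
```

In this run only the second read of "a" is a hit. Reading "c" pushes the cache past its capacity and evicts "a", so the final read of "a" goes back to the backing store.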

History

Alluxio was started by Haoyuan Li at UC Berkeley's AMPLab in 2013 and open sourced in 2014. Alluxio had more than 1,000 contributors in 2018, [5] making it one of the most active projects in the data ecosystem.

Version  Original release date  Latest version  Release date  Status
0.2      2013-04-08             0.2.1           2013-04-25    Old version, not maintained
0.3      2013-10-21             0.3.0           2013-10-21    Old version, not maintained
0.4      2014-02-02             0.4.1           2014-02-25    Old version, not maintained
0.5      2014-07-20             0.5.0           2014-07-20    Old version, not maintained
0.6      2015-03-01             0.6.4           2015-04-23    Old version, not maintained
0.7      2015-07-17             0.7.1           2015-08-10    Old version, not maintained
0.8      2015-10-21             0.8.2           2015-11-10    Old version, not maintained
1.0      2016-02-23             1.0.1           2016-03-27    Old version, not maintained
1.1      2016-06-06             1.1.1           2016-07-04    Old version, not maintained
1.2      2016-07-17             1.2.0           2016-07-17    Old version, not maintained
1.3      2016-10-05             1.3.0           2016-10-05    Old version, not maintained
1.4      2017-01-12             1.4.0           2017-01-12    Old version, not maintained
1.5      2017-06-11             1.5.0           2017-06-11    Old version, not maintained
1.6      2017-09-24             1.6.1           2017-11-02    Old version, not maintained
1.7      2018-01-14             1.7.1           2018-03-26    Old version, not maintained
1.8      2018-07-07             1.8.2           2019-08-05    Old version, still maintained
2.0      2019-06-27             2.0.1           2019-09-03    Old version, still maintained
2.1      2019-11-06             2.1.2           2020-02-04    Old version, still maintained
2.2      2020-03-11             2.2.2           2020-06-24    Old version, still maintained
2.3      2020-06-30             2.3.0           2020-06-30    Old version, still maintained
2.4      2020-10-19             2.4.1           2020-11-20    Old version, still maintained
2.5      2021-03-10             2.5.0           2021-03-10    Old version, still maintained
2.6      2021-06-23             2.6.2           2021-09-17    Old version, still maintained
2.7      2021-11-16             2.7.4           2022-04-19    Old version, still maintained
2.8      2022-05-04             2.8.1           2022-08-17    Old version, still maintained
2.9      2022-11-16             2.9.3           2023-03-27    Latest version

Enterprises that use Alluxio

The following is a list of notable enterprises that have used or are using Alluxio:

References

  1. "Releases · Alluxio/alluxio". github.com. Retrieved 2025-02-09.
  2. Li, Haoyuan (7 May 2018). Alluxio: A Virtual Distributed File System (Technical report). EECS Department, University of California, Berkeley. UCB/EECS-2018-29.
  3. "Running Spark on Alluxio - Alluxio v2.9.5 (stable)". Alluxio. Retrieved 14 February 2025.
  4. "Alluxio file system support — Trino 470 Documentation". trino.io. Retrieved 14 February 2025.
  5. "Open HUB Alluxio development activity".
  6. "This New Open Source Project Is 100X Faster than Spark SQL In Petabyte-Scale Production".
  7. "Making the Impossible Possible with Tachyon: Accelerate Spark Jobs from Hours to Seconds".
  8. "China Unicom's big bet on open source".
  9. "Operationalizing Machine Learning—Managing Provenance from Raw Data to Predictions".
  10. "Cray Analytics and Alluxio – Wrangling Enterprise Storage". Archived from the original on 2019-07-14. Retrieved 2019-02-19.
  11. "Alluxio's Use and Practice in Didi".
  12. "Data Transformation in Financial Services".
  13. "ArcGIS and Alluxio - Using Alluxio to enhance ArcGIS data capability and get faster insights from all your data".
  14. "Huawei hugs open-sourcey Alluxio: Thanks for the memories". The Register.
  15. "How Alluxio is Accelerating Apache Spark Workloads". Archived from the original on 2019-07-14. Retrieved 2019-02-19.
  16. "Getting Started with Tachyon by Use Cases".
  17. "Using Alluxio as a fault-tolerant pluggable optimization component of JD.com's compute frameworks".
  18. "World's Largest Computer Maker Lenovo Selects Alluxio for Data Management of Worldwide Smartphone Data".
  19. "Enhancing the Value of Alluxio with Samsung NVMe SSDs".
  20. "Tencent Delivering Customized News to Over 100 Million Users per Month with Alluxio".
  21. "The Practice of Alluxio in Near Real-Time Data Platform at VIPShop".
  22. "Bringing Data to Life - Data Management and Visualization Techniques".