Haoyuan Li

Alma mater	UC Berkeley (Ph.D.) ; Cornell University (M.S.) ; Peking University (B.S.)
Known for	Alluxio
Scientific career
Fields	Computer Science
Thesis	Alluxio: A Virtual Distributed File System (2018)
Doctoral advisor	Ion Stoica ; Scott Shenker
Website	haoyuanli.com

Last updated January 25, 2026

Haoyuan (H.Y.) Li is a computer scientist and entrepreneur specializing in distributed systems, big data, and cloud computing. He is best known for proposing the Virtual Distributed File System (VDFS)^[1] and creating Alluxio, an open-source data orchestration system. He is the founder, chairman, and CEO of Alluxio, Inc.,^[2]^[3] a company commercializing the Alluxio data orchestration technology. He is also an adjunct professor at Peking University and a frequent conference speaker on AI, big data, cloud computing, and open source.

Biography

Li was born and raised in China. He attended Peking University, where he received a BS in Computer Science. While there, he represented Peking University in programming contests, placing 11th worldwide (bronze medal) in ACM ICPC 2005 and 13th worldwide in 2006. He then studied at Cornell University, where he received an MS in Computer Science.

He received his Computer Science PhD^[1] from the UC Berkeley AMPLab, under the supervision of Prof. Ion Stoica and Prof. Scott Shenker. During his PhD, he co-created the Alluxio (a.k.a. Tachyon) open-source project,^[4] which was commercialized by San Francisco Bay Area venture-backed company Alluxio, Inc.^[1]^[5]^[6]^[7]^[8]^[9] He was a co-founder of Alluxio, Inc.

During his PhD, he also co-created the Apache Spark Streaming project^[10] and became an Apache Spark committer.^[11]

References

[1] Li, Haoyuan (7 May 2018). Alluxio: A Virtual Distributed File System (Technical report). EECS Department, University of California, Berkeley. UCB/EECS-2018-29.
[2] "Alluxio launches its memory-centric storage system for big data workloads". techcrunch.com. TechCrunch. 26 October 2016.
[3] Woodie, Alex (3 July 2019). "Celebrating Data Independence". datanami.com. Tabor Communications.
[4] Li, Haoyuan; Ghodsi, Ali; Zaharia, Matei; Shenker, Scott; Stoica, Ion. "Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks" (PDF).
[5] Gage, Deborah (17 March 2015). "Andreessen Horowitz Invests $7.5M in Big-Data Startup Tachyon". wsj.com. The Wall Street Journal.
[6] Brust, Andrew (15 July 2019). "Alluxio 2.0 seeks to unify fragmented data ecosystem". ZDNet. CBS Interactive.
[7] Gillin, Paul (11 July 2019). "Alluxio's data orchestration platform now spans multiple clouds". siliconangle.com. SiliconANGLE Media Inc.
[8] Mellor, Chris (12 July 2019). "You need access to those big data silos – fast? No problem, says Alluxio". blocksandfiles.com. Blocks & Files.
[9] Wells, Joyce (11 July 2019). "Breaking Down Data Silos with Data Orchestration". dbta.com. Information Today Inc.
[10] Zaharia, Matei; Das, Tathagata; Li, Haoyuan; Hunter, Timothy; Shenker, Scott; Stoica, Ion. "Discretized Streams: Fault-Tolerant Streaming Computation at Scale" (PDF).
[11] "Apache Spark Committer List".

