Matei Zaharia

Born: 1984 or 1985 (age 39–40)
Education: UC Berkeley (Ph.D.); University of Waterloo (BMath)
Known for: Apache Spark
Awards: ACM Doctoral Dissertation Award (2014); Presidential Early Career Award for Scientists and Engineers (2019); SIGOPS Mark Weiser Award (2023)
Scientific career
Fields: Computer science
Institutions: UC Berkeley; Stanford University; Databricks
Thesis: An Architecture for Fast and General Data Processing on Large Clusters (2013)
Doctoral advisors: Ion Stoica; Scott Shenker
Website: people.eecs.berkeley.edu/~matei/

Matei Zaharia (born 1984 or 1985) [1] is a Romanian-Canadian computer scientist, educator, and the creator of Apache Spark. [2] [3] [4]

As of April 2022, Forbes ranked him and Ion Stoica as the 3rd-richest people in Romania with a net worth of $1.6 billion. [5]

Biography

Zaharia graduated from secondary school at Jarvis Collegiate Institute before enrolling as an undergraduate at the University of Waterloo. [6] He was a gold medalist at the International Collegiate Programming Contest in 2005, where his University of Waterloo team placed fourth in the world and first in North America. [7] During his undergraduate studies, he also contributed substantially to the water rendering physics of the now open-source game 0 A.D. [8] He also helped mod the Age of Mythology scenario Norse Wars, which was later re-adapted into the Age of Empires III scenario Fort Wars. [9] In 2009, while at the University of California, Berkeley's AMPLab, he created Apache Spark as a faster alternative to MapReduce. [10] He received the 2014 ACM Doctoral Dissertation Award for his PhD research on large-scale computing. [11]

In 2013, Zaharia co-founded Databricks, where he serves as chief technology officer. [3]

He joined the faculty of MIT in 2015, and then became an assistant professor of computer science at Stanford University in 2016.

In 2019, Zaharia received the Presidential Early Career Award for Scientists and Engineers. [6]

In 2019, he was leading the MLflow project at Databricks while continuing to teach. [12] [13] [14]

In 2023, he joined the faculty of the University of California, Berkeley as an associate professor. [15]

Related Research Articles

Srinivasan Keshav: Canadian computer scientist

Srinivasan Keshav is a computer scientist who is currently the Robert Sansom Professor of Computer Science at the University of Cambridge.

Scott J. Shenker is an American computer scientist, and professor of computer science at the University of California, Berkeley. He is also the leader of the Extensible Internet Group at the International Computer Science Institute in Berkeley, California.

David Ross Cheriton is a Canadian computer scientist, businessman, philanthropist, and venture capitalist. He is a computer science professor at Stanford University, where he founded and leads the Distributed Systems Group.

Gary Miller (computer scientist): American computer scientist

Gary Lee Miller is an American computer scientist who is a professor of computer science at Carnegie Mellon University. In 2003 he won the ACM Paris Kanellakis Award for the Miller–Rabin primality test. He was made an ACM Fellow in 2002 and won the Knuth Prize in 2013.

Michael Jay Franklin is an American software entrepreneur and computer scientist specializing in distributed and streaming database technology. He is Liew Family Chair of Computer Science and chairman for the Department of Computer Science at the University of Chicago.

Scott Aaronson: American computer scientist (born 1981)

Scott Joel Aaronson is an American theoretical computer scientist and Schlumberger Centennial Chair of Computer Science at the University of Texas at Austin. His primary areas of research are computational complexity theory and quantum computing.

Ion Stoica: Romanian–American computer scientist

Ion Stoica is a Romanian–American computer scientist specializing in distributed systems, cloud computing and computer networking. He is a professor of computer science at the University of California, Berkeley and co-director of AMPLab. He co-founded Conviva and Databricks with other original developers of Apache Spark.

Data science: Field of study to extract insights from data

Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processing, scientific visualization, algorithms and systems to extract or extrapolate knowledge and insights from potentially noisy, structured, or unstructured data.

Apache Spark: Open-source data analytics cluster computing framework

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

Databricks: American software company

Databricks, Inc. is a global data, analytics, and artificial intelligence company founded by the original creators of Apache Spark.

Apache Mesos: Software to manage computer clusters

Apache Mesos is an open-source project to manage computer clusters. It was developed at the University of California, Berkeley.

Matei Zaharia Boilă was a conservative Romanian politician, who later became a Greek Catholic priest. Boilă was influenced by the activity of his great uncle on his mother's side of the family, Iuliu Maniu, a Prime Minister of Romania. He represented the Christian Democratic National Peasants' Party (CDNPP) in the Senate between 1992 and 2000.

The ACM SIGOPS Mark Weiser Award is awarded to an individual who has shown creativity and innovation in operating system research. The recipients began their career no earlier than 20 years prior to nomination. The special-interest-group-level award was created in 2001 and is named after Mark Weiser, the father of ubiquitous computing.

Ali Ghodsi: Swedish computer scientist

Ali Ghodsi is a Swedish-American computer scientist and entrepreneur of Persian origin, specializing in distributed systems and big data. He is a co-founder and CEO of Databricks and an adjunct professor at UC Berkeley. He coauthored several influential papers, including Apache Mesos and Apache Spark SQL.

AMPLab was a University of California, Berkeley lab focused on big data analytics, located in Soda Hall. The name stands for the Algorithms, Machines and People Lab. It began publishing papers in 2008 and was officially launched in 2011. The AMPLab was co-directed by Professors Michael J. Franklin, Michael I. Jordan, and Ion Stoica.

Reza Zadeh is an American computer scientist and technology executive working on machine learning. He is adjunct professor at Stanford University, CEO of Matroid, and a founding team member at Databricks. His work focuses on machine learning, distributed computing, and discrete applied mathematics. His awards include a KDD Best Paper Award and the Gene Golub Outstanding Thesis Award at Stanford.

Reynold Xin is a computer scientist and engineer specializing in big data, distributed systems, and cloud computing. He is a co-founder and Chief Architect of Databricks. He is best known for his work on Apache Spark, a leading open-source Big Data project. He was designer and lead developer of the GraphX, Project Tungsten, and Structured Streaming components and he co-designed DataFrames, all of which are part of the core Apache Spark distribution; he also served as the release manager for Spark's 2.0 release.

The ACM Doctoral Dissertation Award is awarded annually by the Association for Computing Machinery to the authors of the best doctoral dissertations in computer science and computer engineering. The award is accompanied by a prize of US$20,000 and winning dissertations are published in the ACM Digital Library. Honorable mentions are awarded $10,000. Financial support is provided by Google. The number of awarded dissertations may vary year-to-year.

Haoyuan (H.Y.) Li is a computer scientist and entrepreneur specializing in distributed systems, big data, and cloud computing. He is best known for proposing Virtual Distributed File System (VDFS), and creating an open-source data orchestration system, Alluxio. He is the Founder, Chairman, and CEO of Alluxio, Inc, a company commercializing the Alluxio Data Orchestration Technology. He is also an adjunct professor at Peking University. He is a frequent speaker on the topic of AI, big data, cloud computing, and open source at conferences.

Mosharaf Chowdhury is a Bangladeshi-American computer scientist known for his contributions to the fields of computer networking and large-scale systems for emerging machine learning and big data workloads. He is an Associate Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor and leads SymbioticLab. He is the creator of coflow and the co-creator of Apache Spark.

References

  1. Cai, Kenrick (21 May 2021). "Accidental Billionaires: How Seven Academics Who Didn't Want To Make A Cent Are Now Worth Billions". Forbes.
  2. Fiscutean, Andrada (August 20, 2019). "Why the US has lost to Russia in these top coding trials for almost a decade". ZDNet.
  3. "Meet the 'nerdiest rock star': Matei Zaharia co-creator of Apache Spark | Computing". computing.co.uk. 2015-10-29. Retrieved 2019-12-03.
  4. Piatetsky, Gregory (May 2015). "Exclusive Interview: Matei Zaharia, creator of Apache Spark, on Spark, Hadoop, Flink, and Big Data in 2020".
  5. "Cei mai bogaţi oameni din lume în 2022. Şase români în topul Forbes". Adevărul (in Romanian). 6 April 2022.
  6. Iyer, Kavya (July 26, 2019). "Twelve Stanford researchers receive Presidential Early Career Award for Scientists and Engineers". Stanford Daily.
  7. Zaharia, Matei. "Programming Contest Resources". cs.stanford.edu. Retrieved 2020-04-22.
  8. "The Story of 0 A.D." Play0ad.
  9. "Fort Wars Overview".
  10. Woodie, Alex (March 8, 2019). "A Decade Later, Apache Spark Still Going Strong". Datanami.
  11. "Matei Zaharia receives ACM Doctoral Dissertation award". MIT EECS. April 28, 2015. Archived from the original on 2015-07-09.
  12. Brust, Andrew (June 6, 2019). "AI gets rigorous: Databricks announces MLflow 1.0". ZDNet.
  13. Anadiotis, George. "Unifying cloud storage and data warehouses: Delta Lake project hosted by the Linux Foundation". ZDNet. Retrieved 2019-12-03.
  14. Woodie, Alex (2019-12-02). "Will Databricks Build the First Enterprise AI Platform?". Datanami. Retrieved 2019-12-03.
  15. Leven, Rachel (12 September 2023). "CDSS welcomes seven new faculty to the college community". University of California, Berkeley.