Sean Kandel

Occupation: Chief Technical Officer

Sean Kandel is the Chief Technical Officer of Trifacta, which he co-founded with Joseph M. Hellerstein and Jeffrey Heer. He is known for developing new tools for data transformation and discovery and is a co-developer of Data Wrangler, an interactive tool for data cleaning and transformation. [1] [2] [3] [4]

Education and Research

Kandel graduated from Stanford University in 2013 with a Ph.D. in Computer Science. As a Ph.D. student in the Visualization Group at Stanford, he designed and built interactive tools for data analysis, management, and visualization.

His thesis, "Interactive systems for data transformation and assessment", was completed under primary advisor Jeffrey Heer. [5] While at Stanford, he published multiple research papers and articles with Trifacta co-founders Jeffrey Heer and Joseph Hellerstein on big data analysis, data quality assessment, and visualization for data transformation, as well as other big data research. [6] [7] [8] Kandel’s major research contribution to date has been as co-developer of Data Wrangler, a research initiative between Stanford and the University of California, Berkeley. [3] The project led to the company Trifacta, which eventually sold Data Wrangler as a commercial product. [2] [5] [6] [7]

Awards and recognition

In 2017, Kandel was named to Silicon Valley’s 40 Under 40 list, which recognizes young business leaders who are impacting their industries and their communities. Import.io also included him in its list of 40 Data Mavericks Under 40. [9]

He frequently presents at big data conferences, including Strata World and Hadoop Users Group UK, on topics such as data lineage, data transformation, machine learning and semi-structured data, big data project success, and other industry subjects. [10]

Related Research Articles

Ben Shneiderman is an American computer scientist, a Distinguished University Professor in the University of Maryland Department of Computer Science, which is part of the University of Maryland College of Computer, Mathematical, and Natural Sciences at the University of Maryland, College Park, and the founding director (1983-2000) of the University of Maryland Human-Computer Interaction Lab. He conducted fundamental research in the field of human–computer interaction, developing new ideas, methods, and tools such as the direct manipulation interface, and his eight rules of design.

In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integration and data management tasks such as data wrangling, data warehousing, data integration and application integration.

Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics. The goal of data wrangling is to assure quality and useful data. Data analysts typically spend the majority of their time in the process of data wrangling compared to the actual analysis of the data.

Raymond Paul "Raymie" Stata is an American computer engineer and business executive.

Jock D. Mackinlay is an American information visualization expert and Vice President of Research and Design at Tableau Software. With Stuart Card, George G. Robertson and others he invented a number of information visualization techniques.

Joseph M. Hellerstein is an American professor of Computer Science at the University of California, Berkeley, where he works on database systems and computer networks. He co-founded Trifacta with Jeffrey Heer and Sean Kandel in 2012, which stemmed from their research project, Wrangler.

Pentaho is business intelligence (BI) software that provides data integration, OLAP services, reporting, information dashboards, data mining and extract, transform, load (ETL) capabilities. Its headquarters are in Orlando, Florida. Pentaho was acquired by Hitachi Data Systems in 2015 and in 2017 became part of Hitachi Vantara.

Data-centric programming language defines a category of programming languages where the primary function is the management and manipulation of data. A data-centric programming language includes built-in processing primitives for accessing data stored in sets, tables, lists, and other data structures and databases, and for specific manipulation and transformation of data required by a programming application. Data-centric programming languages are typically declarative and often dataflow-oriented, and define the processing result desired; the specific processing steps required to perform the processing are left to the language compiler. The SQL relational database language is an example of a declarative, data-centric language. Declarative, data-centric programming languages are ideal for data-intensive computing applications.

Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Built chiefly by contributions from developers from MapR, Drill is inspired by Google's Dremel system. Tomer Shiran is the founder of the Apache Drill Project. It was designated an Apache Software Foundation top-level project in December 2016.

Jean-Daniel Fekete is a French computer scientist.

Trifacta is a privately owned software company headquartered in San Francisco with offices in Bengaluru, Boston, Berlin and London. The company was founded in October 2012 and primarily develops data wrangling software for data exploration and self-service data preparation on cloud and on-premises data platforms.

Platfora, Inc. is a big data analytics company based in San Mateo, California. The firm’s software works with the open-source software framework Apache Hadoop to assist with data analysis, data visualization, and sharing.

Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems. These characteristics can include size or amount of data, completeness of the data, correctness of the data, possible relationships amongst data elements or files/tables in the data.

Data lineage includes the data origin, what happens to it, and where it moves over time. Data lineage provides visibility and simplifies tracing errors back to the root cause in a data analytics process.

Michael Bostock is an American computer scientist and data-visualisation specialist. He is one of the co-creators of Observable and noted as one of the key developers of D3.js, a JavaScript library used for producing dynamic, interactive, online data visualizations. He was also involved in the preceding Protovis framework.

Jeffrey Michael Heer is an American computer scientist best known for his work on information visualization and interactive data analysis. He is a professor of computer science & engineering at the University of Washington, where he directs the UW Interactive Data Lab. He co-founded Trifacta with Joe Hellerstein and Sean Kandel in 2012.

Data blending is a process whereby big data from multiple sources are merged into a single data warehouse or data set. It concerns not merely the merging of different file formats or disparate sources of data but also different varieties of data. Data blending allows business analysts to cope with the expansion of data that they need to make critical business decisions based on good quality business intelligence.

Guided analytics is a sub-field at the interface of visual analytics and predictive analytics focused on the development of interactive visual interfaces for business intelligence applications. Such interactive applications serve the analyst to take important decisions by easily extracting information from the data.

Vega and Vega-Lite are visualization tools implementing a grammar of graphics, similar to ggplot2. The Vega and Vega-Lite grammars extend Leland Wilkinson's Grammar of Graphics by adding a novel grammar of interactivity to assist in the exploration of complex datasets.

Leilani Marie Battle is an American assistant professor of computer science at the University of Washington's Paul G. Allen School of Computer Science & Engineering, where she is also a co-director of the UW Interactive Data Lab. She is known for her research into graphical visualization of database systems that involve both small-scale and large-scale multi-dimensional data, and for her research into predictive prefetching to speed up database queries.

References

  1. "Sean Kandel | MapR". mapr.com. Retrieved 2017-10-12.
  2. Shoemaker, Amanda (2015-10-15). "Trifacta Wrangler | Products | Trifacta". Trifacta. Retrieved 2017-10-12.
  3. "Data Wrangler". vis.stanford.edu. Retrieved 2017-10-12.
  4. Hellerstein, Joseph; Heer, Jeffrey; Rattenbury, Tye; Kandel, Sean; Carreras, Connor. Principles of Data Wrangling.
  5. "Interactive systems for data transformation and assessment". purl.stanford.edu. Retrieved 2017-10-12.
  6. Heer, Jeffrey; Kandel, Sean (September 2012). "Interactive Analysis of Big Data". XRDS. 19 (1): 50–54. doi:10.1145/2331042.2331058. ISSN 1528-4972. S2CID 15335750.
  7. Kandel, Sean; Parikh, Ravi; Paepcke, Andreas; Hellerstein, Joseph M.; Heer, Jeffrey (2012). "Profiler". Proceedings of the International Working Conference on Advanced Visual Interfaces. AVI '12. New York, NY, USA: ACM. pp. 547–554. doi:10.1145/2254556.2254659. ISBN 9781450312875. S2CID 2804799.
  8. Heer, Jeffrey; Hellerstein, Joseph M.; Kandel, Sean. "Predictive Interaction for Data Transformation" (PDF). SemanticScholar. S2CID 17577930. Archived from the original (PDF) on 2017-10-13. Retrieved 2017-10-12.
  9. "40 Under 40". www.bizjournals.com. Retrieved 2017-10-12.
  10. Hadoop Users Group UK (2015-03-18), Sean Kandel - Data profiling: Assessing the overall content and quality of a data set, retrieved 2017-10-12