Gregory Piatetsky-Shapiro

Gregory Piatetsky-Shapiro in NYC

Gregory I. Piatetsky-Shapiro (born 7 April 1958) is a data scientist and the co-founder of the KDD conferences, and co-founder and past chair of the Association for Computing Machinery SIGKDD group for Knowledge Discovery, Data Mining and Data Science. [1] He is the founder and president of KDnuggets, [2] a discussion and learning website for Business Analytics, Data Mining and Data Science.

Early life

A Jewish refugee from the Soviet Union, Gregory Piatetsky was born in Moscow, Russia, to Inna Mogilevskaya and mathematician Ilya Piatetski-Shapiro. In 1970 he was admitted to Physics-Mathematics School No. 2, a leading mathematics school in Moscow. [3] [4]

In March 1974, Piatetsky emigrated to Israel with his family, where he studied mathematics and computer science at Tel Aviv University and, for one semester, at Technion. [5] He subsequently earned M.S. (1979) and Ph.D. (1984) degrees from the Courant Institute at New York University. [6]

In 1984, his first paper was published in SIGMOD, proving that secondary index selection is NP-complete by a reduction from the set cover problem. [7] In his dissertation, he proved that the greedy method for set cover achieves at least 1 − 1/e ≈ 63% of the optimal. [8]
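As an illustration of the greedy method mentioned above, the following is a minimal sketch of the textbook greedy heuristic for set cover. It is not taken from the dissertation; the function name `greedy_cover` and the example data are assumptions made for this illustration only.

```python
from typing import Dict, List, Set


def greedy_cover(universe: Set[int], subsets: Dict[str, Set[int]]) -> List[str]:
    """Textbook greedy heuristic: repeatedly pick the subset that covers
    the largest number of still-uncovered elements.

    Hypothetical helper for illustration; ties are broken by subset name.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Iterate names in sorted order so tie-breaking is deterministic.
        best = max(sorted(subsets), key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the given subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Illustrative data (assumed, not from the source).
example_universe = {1, 2, 3, 4, 5, 6}
example_subsets = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 5},
    "C": {3, 4, 6},
    "D": {5, 6},
}
print(greedy_cover(example_universe, example_subsets))
```

In this example the heuristic first takes "A" (four new elements) and then "D" (the two remaining ones). The 1 − 1/e figure is classically stated for the variant in which a fixed number of sets is chosen and the number of covered elements is compared with the best possible choice.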

Career

He joined GTE Laboratories, where he worked on intelligent interfaces relating to databases. In 1989, he proposed a new project at GTE called "Knowledge Discovery in Databases". The project created advanced prototypes, including KEFIR (Key Findings Reporter), [9] a system for analysis and summarization of key changes in large databases, which was a forerunner of systems like Google Analytics Intelligence. A KEFIR prototype was applied to GTE health care data and received GTE's highest technical award. [10]

In 1997, he left GTE to join Knowledge Stream Partners (KSP), where he was Director and later Vice President and Chief Scientist. [11] In April 2000, KSP was acquired by Xchange, Inc., [12] where Piatetsky served as VP and Chief Scientist. [11]

Piatetsky left Xchange in May 2001 to become a self-employed consultant and focus on KDnuggets. [13]

KDD and SIGKDD

In 1989, Piatetsky organized the first workshop on Knowledge Discovery in Databases (KDD-89), held at IJCAI-1989 in Detroit, MI. [1] This workshop had over 60 attendees, including researchers Ross Quinlan and Jaime Carbonell. [citation needed]

Piatetsky organized the next two KDD workshops, in 1991 and 1993. [1] With Usama Fayyad and Ramasamy (Sam) Uthurusamy, he expanded the workshops into an annual international conference on data mining and was the General Chair of the KDD-98 conference. [14] He chaired the KDD Steering Committee until 1998, when the SIGKDD group was formed within ACM to run the annual KDD conference and to promote research in knowledge discovery and data mining. He served as Director of SIGKDD from 2001 to 2005 and as SIGKDD Chair from 2005 to 2009. [15]

In 1997, Piatetsky and Ismail Parsa initiated the KDD Cup competition, which was the world's first open data mining contest. [16]

The annual ACM SIGKDD conference is the leading research conference on knowledge discovery and data mining, according to Microsoft Academic Search [17] and Google Scholar. [18] The 21st ACM SIGKDD conference was held in Sydney, Australia, in August 2015.

KDnuggets

In 1993, Piatetsky started Knowledge Discovery Nuggets (KDnuggets) as a newsletter to connect researchers who attended the KDD-93 workshop. With the emergence of the Internet and Mosaic, he and Chris Matheus created the website Knowledge Discovery Mine, [19] hosted at GTE Labs. The newsletter served as an unofficial publication of the KDD workshops. When Piatetsky left GTE Labs, he created the KDnuggets website, [20] with the mission of covering the field with short, concise "nuggets". The resource started as a directory for data mining and data science, covering software, jobs, academic positions, CFPs (calls for papers), companies, courses, datasets, education, meetings, publications and webcasts.

KDnuggets' main focus is to cover the fields of Business Analytics, Data Mining, and Data Science, including interviews with key leaders. It offers a free data mining course for advanced undergraduates or first-year graduate students. [21]

@KDnuggets Twitter was

In February 2015, Piatetsky and Data ScienceTech Institute announced a partnership and he became an Honorary Member of its Scientific Advisory Board. [22]

Research and publications

In 1991, Piatetsky and William (Bud) Frawley edited their first book, Knowledge Discovery in Databases. In 1996, Piatetsky, Usama Fayyad, Padhraic Smyth and Ramasamy Uthurusamy edited a follow-up, Advances in Knowledge Discovery and Data Mining. [23]

Piatetsky also helped launch and co-edit the Data Mining and Knowledge Discovery journal. [citation needed] He authored 9 edited books and collections and over 60 technical papers, articles and book chapters, mostly focusing on data mining and knowledge discovery. [citation needed]

Recognition

References

  1. "Dr. Gregory Piatetsky-Shapiro - SIGKDD Service Award". ACM SIGKDD. Retrieved 2015-09-22. Gregory Piatetsky-Shapiro has received the first ACM SIGKDD Service award for starting the KDD conferences and contributions to the KDD community, including KDnuggets newsletter. Dr. Piatetsky-Shapiro is the founder of the Knowledge Discovery in Database conference series (KDD, now the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).
  2. "Gregory Piatetsky-Shapiro". www.kdnuggets.com.
  3. "Ilya Piatetski-Shapiro, In Memoriam" (PDF), Notices of the American Mathematical Society, 57 (10): 1260–1275, 2010
  4. Tel Aviv University obituary Archived 2009-12-29 at the Wayback Machine
  5. Expert interview: Exciting and Worrisome Advances in Artificial Intelligence (Fetched on 2020-02-26)
  6. NYU CS PhDs thesis list
  7. Accurate estimation of the number of tuples satisfying a condition
  8. A self-organizing database system - a different approach to query optimization
  9. Matheus, Christopher J.; Piatetsky-Shapiro, Gregory; McNeill, Dwight. "Key Findings Reporter for Analysis of Health-Care Information". CiteSeerX 10.1.1.57.445.
  10. Journeys to Data Mining. Springer, Berlin, Heidelberg. 2012. pp. 173–196. ISBN 978-3-642-28046-7.
  11. "Gregory Piatetsky-Shapiro". www.kdnuggets.com. Retrieved 2018-02-22.
  12. "Yahoo - Exchange Applications, Now Doing Business as Xchange, Inc., Acquires eCRM Firm Knowledge Stream Partners for $52 million". www.kdnuggets.com.
  13. "About KDnuggets, Analytics, Big Data, Data Mining and Data Science leader". www.kdnuggets.com. Retrieved 2018-02-22.
  14. "KDD-98 Schedule". www.kdnuggets.com. Retrieved 2018-03-24.
  15. SIGKDD Membership. "About SIGKDD". kdd.org.
  16. SIGKDD Blog. "SIGKDD : KDD Cup 1997 : Direct marketing for lift curve optimization". www.kdd.org. Retrieved 2018-03-24.
  17. "Top conferences in data mining". Microsoft Academic Search. Archived from the original on 2015-09-17. Retrieved 2015-09-22.
  18. "Data Mining & Analysis". Google Scholar. Retrieved 2015-09-22. 2. ACM SIGKDD International Conference on Knowledge discovery and data mining (Ranked #1 is a journal, not a conference.)
  19. KDD Nugget 94:8
  20. "Machine Learning, Data Science, Big Data, Analytics, AI". www.kdnuggets.com.
  21. "Data Mining Course". www.kdnuggets.com.
  22. "Data ScienceTech Institute celebrates Dr Gregory Piatetsky-Shapiro as Honorary Member of Our Scientific Advisory Board".
  23. Fayyad, Usama M.; Piatetsky-Shapiro, Gregory; Smyth, Padhraic; Uthurusamy, Ramasamy (1996-02-01). Advances in knowledge discovery and data mining. American Association for Artificial Intelligence. ISBN 0262560976.
  24. Wu, Xindong (2007-09-28). "2007 IEEE ICDM Outstanding Service Award: Dr. Gregory Piatetsky-Shapiro". IEEE ICDM. Retrieved 2015-09-22. Dr. Piatetsky-Shapiro is the founder of the Knowledge Discovery in Database conference series (KDD, now the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).