Gerald Friedland

Gerald Friedland is a German-American computer scientist and author specializing in multimedia computing, machine learning, and artificial intelligence. He is a principal scientist at Amazon Web Services and a professor at the Electrical Engineering and Computer Science Department of the University of California, Berkeley. He focuses on AutoML and generative AI. His work has advanced large-scale multimedia analysis, privacy-aware AI, and explainable machine learning. [1] [2]

Education

Friedland completed his education in Germany, earning his Abitur in 1998. [3] He received a Master of Science in Computer Science with a minor in Linguistics from Freie Universität Berlin in 2002. [4] His master’s thesis, "Towards a Generic Cross Platform Media Editor: An Editing Tool for E-Chalk," was recognized as the best computer science master’s thesis in German-speaking countries by the German Association for Computer Science. [5]

In 2006, Friedland earned his Ph.D. in Computer Science from Freie Universität Berlin, graduating summa cum laude. His dissertation, "Adaptive Audio and Video Processing for Electronic Chalkboard Lectures," was nominated for the university's Ernst-Reuter Award. [6] [7]

Career

Friedland began his academic career as a Research Associate in the AI group at Freie Universität Berlin, working under Raúl Rojas from 2002 to 2006. During this time, he developed the "Simple Interactive Object Extraction (SIOX)" algorithm, [8] now widely used in open-source tools like GIMP and Blender, and conducted research on lecture webcasting technologies. [9]

From 2006 to 2016, Friedland worked full-time at the International Computer Science Institute (ICSI) in Berkeley, California. He held a series of roles there, from postdoctoral researcher to Principal Investigator to Director of the Audio and Multimedia group. As a Principal Data Scientist at Lawrence Livermore National Laboratory (2016–2019), Friedland led a team addressing challenges in explainable AI. [10]

In 2014, he founded Audeme, a company developing cloud-independent speech recognition hardware. [11] In 2019, he co-founded Brainome, Inc., where he worked full-time as CTO until 2022, leading the development of no-code machine learning solutions based on the information-theoretic view of machine learning described in his book Information-Driven Machine Learning: Data Science as an Engineering Discipline. [4] [12]

Friedland served as director of conferences for ACM SIGMM (2017–2021), program co-chair for ACM Multimedia (2017), and associate editor for IEEE Multimedia Magazine and ACM Transactions on Multimedia Computing. [13] [14]

Research

Friedland is a computer scientist specializing in machine learning and in the processing and analysis of multimedia data. [15] He is best known as the original author of the widely used "Simple Interactive Object Extraction" image and video segmentation algorithm, [16] [8] [17] [18] [19] [9] [20] [21] created as part of his PhD thesis, [22] [23] and as the co-author of a textbook on Multimedia Computing. [24] He also led the initiative to create and release the YFCC100M corpus (see also: List of datasets for machine learning research), [25] [26] [27] the largest freely available research corpus of consumer-produced videos and images. He co-founded the field of geolocation estimation for images and videos, sometimes also referred to as placing. [28] [29] [30] Friedland has also frequently uncovered privacy risks in multimedia publishing practice [31] [32] [33] [34] [35] [36] [37] [38] and heads the development of the teachingprivacy.org [39] portal, which provides educational materials for use in US high schools as part of AP Computer Science Principles and the Code.org initiative. Friedland is also the co-creator of MOVI, an open-source speech recognition board that allows the creation of cloudless voice interfaces [40] for Internet of Things devices.

Awards

Publications

Friedland has authored six books.

He has also published over 100 peer-reviewed journal and conference articles on topics ranging from machine learning to multimedia computing. [15]

References

  1. "Gerald Friedland | EECS at UC Berkeley".
  2. "Gerald Friedland".
  3. "Refubium - Suche".
  4. "Brainome launches product to optimize machine learning development process". ZDNet.
  5. "Error".
  6. "Entropy discussion group". Berkeley Institute for Data Science. 23 August 2019.
  7. Friedland, Gerald "Information-Driven Machine Learning: Data Science as an Engineering Discipline", Springer-Nature, January 2024.
  8. "SIOX".
  9. "Fiji plugin based on the SIOX project to segment color images: Fiji/Siox_Segmentation". GitHub. June 2019.
  10. "Gerald Friedland | ICSI". www.icsi.berkeley.edu. Retrieved 2024-12-19.
  11. "An interview with Bertrand and Gerald of Audeme | The Amp Hour Electronics Podcast". theamphour.com. 2015-07-16. Retrieved 2024-12-19.
  12. Woodie, Alex (2020-11-04). "Brainome Right-Sizes Your Data Before ML Training". BigDATAwire. Retrieved 2024-12-19.
  13. "New SIGMM Leadership Announced | ACM SIGMM - the Special Interest Group on Multimedia". www.sigmm.org. Retrieved 2024-12-19.
  14. "Gerald Friedland - Home". Author DO Series. Retrieved 2024-12-19.
  15. Google Scholar list of publications: https://scholar.google.com/citations?user=iBl-QgEAAAAJ
  16. "Algorithm - What are the standard techniques for removing a segmentation (Such as a human or bird) from a video?".
  17. "Using GIMP's Foreground select tool". 31 August 2013.
  18. "Paintshopprotutorials.co.uk".
  19. "Kutout - an application for cutting out images | Hook - Labs". Archived from the original on 2017-07-24. Retrieved 2017-07-16.
  20. "SIOX: Simple Interactive Object Extraction".
  21. Shoou Jiah Yiu, Gerald Friedland: "Method and system for identifying objects in images" US Patent Application US20170132469A1
  22. Gerald Friedland: "Adaptive Audio- und Videoverarbeitung für elektronische Kreidetafelvorlesungen", Freie Universitaet Berlin, October 2006. http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000002354
  23. Gerald Friedland: "Adaptive Audio and Video Processing for Electronic Chalkboard Lectures", Lulu Publishing, ISBN 978-1430303886, December 2006. 2016 reprint: ISBN 978-3-659-97771-8, Lambert Publishing, November 2016.
  24. Friedland, Gerald and Jain, Ramesh "Multimedia Computing", Cambridge University Press, October 2014.
  25. Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, Li-Jia Li. "YFCC100M: The New Data in Multimedia Research". Communications of the ACM, Vol. 59 No. 2, Pages 64-73
  26. YFCC100M: YFCC100M
  27. The Multimedia Commons
  28. Gerald Friedland, Oriol Vinyals, and Trevor Darrell: "Multimodal Location Estimation", in Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, October 2010, pp. 1245-1251.
  29. Choi, Jaeyoung and Friedland, Gerald "Multimodal Location Estimation of Videos and Images", Springer Publishing, October 2014.
  30. Nils Peters, Howard Lei, Gerald Friedland: "Room identification using acoustic features in a recording", US Patent US20140161270A1
  31. Web Photos That Reveal Secrets, Like Where you Live (New York Times, Aug 11, 2010)
  32. Tips to Turn Off Geo-Tagging on Your Cell Phone (ABC News, Aug 20, 2010)
  33. Could you fall victim to crime simply by geotagging location info to your photos? (Digital Trends, Jul 22, 2013)
  34. Ways to Avoid Email Tracking (New York Times, Dec 25, 2014)
  35. BodyWorn, the police-worn camera that aims to reduce crime (Fox News, May 19, 2015)
  36. Paris ISIS Attacks: Tech Industry Says 'Anti-Terror' Back Doors Would Make US Less Safe (International Business Times, Nov 18, 2015)
  37. Why our Crazy Smart AI still sucks at Transcribing our Speech (Wired Magazine, Apr 8, 2016)
  38. Transcribing Audio Sucks—So Make Machines Like Trint Do It (Wired Magazine, Apr 26, 2017)
  39. "Teaching Privacy".
  40. Gerald Friedland, Bertrand Irissou: "Method of facilitating construction of a voice dialog interface for an electronic system", US Patent Application US15382163.