Gerald Friedland

Gerald Friedland (born 1978, Berlin) is a Principal Scientist at Amazon Web Services and an adjunct professor at the Electrical Engineering and Computer Science Department of the University of California, Berkeley. [1] [2]

Education

Gerald Friedland completed his master's and doctoral degrees in computer science at the Free University of Berlin in 2002 and 2006, respectively. [3] His PhD advisor was Raúl Rojas. He then moved to the International Computer Science Institute, where he completed a postdoc under Nelson Morgan before staying on as a research scientist and group leader. [4] He later worked as a Principal Data Scientist at Lawrence Livermore National Lab before co-founding Brainome, Inc. [5] He is a faculty fellow of the Berkeley Institute for Data Science, where he has run a discussion group [6] since 2018 on the implications of using information theory as a universal tool for modeling. This work resulted in a book published in 2024. [7]

Career

Friedland is a computer scientist specializing in machine learning and the processing and analysis of multimedia data. [8] He is best known as the original author of the widely used "Simple Interactive Object Extraction" image and video segmentation algorithm, [9] [10] [11] [12] [13] [14] [15] [16] created as part of his PhD thesis, [17] [18] and as the co-author of a textbook on multimedia computing. [19] He also led the initiative to create and release the YFCC100M corpus (see also: List of datasets for machine learning research), [20] [21] [22] the largest freely available research corpus of consumer-produced videos and images. He co-founded the field of geolocation estimation for images and videos, sometimes also referred to as placing. [23] [24] [25] Friedland also frequently uncovers privacy risks in multimedia publishing practice [26] [27] [28] [29] [30] [31] [32] [33] and heads the development of the teachingprivacy.org [34] portal, which provides educational materials for use in US high schools as part of AP Computer Science Principles and the Code.org initiative. Friedland is also the co-creator of MOVI, an open-source speech recognition board that allows the creation of cloudless voice interfaces [35] for Internet of Things devices.

References

  1. "Gerald Friedland | EECS at UC Berkeley".
  2. "Gerald Friedland".
  3. "Refubium - Suche".
  4. "Error".
  5. "Brainome launches product to optimize machine learning development process". ZDNet.
  6. "Entropy discussion group". 23 August 2019.
  7. Friedland, Gerald "Information-Driven Machine Learning: Data Science as an Engineering Discipline", Springer-Nature, January 2024.
  8. Google Scholar list of publications: https://scholar.google.com/citations?user=iBl-QgEAAAAJ
  9. "Algorithm - What are the standard techniques for removing a segmentation (Such as a human or bird) from a video?".
  10. "SIOX".
  11. "Using GIMP's Foreground select tool". 31 August 2013.
  12. "Paintshopprotutorials.co.uk".
  13. "Kutout - an application for cutting out images | Hook - Labs". Archived from the original on 2017-07-24. Retrieved 2017-07-16.
  14. "Fiji plugin based on the SIOX project to segment color images: Fiji/Siox_Segmentation". GitHub. June 2019.
  15. "SIOX: Simple Interactive Object Extraction".
  16. Shoou Jiah Yiu, Gerald Friedland: "Method and system for identifying objects in images" US Patent Application US20170132469A1
  17. Gerald Friedland: "Adaptive Audio- und Videoverarbeitung für elektronische Kreidetafelvorlesungen", Freie Universitaet Berlin, October 2006. http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000002354
  18. Gerald Friedland: "Adaptive Audio and Video Processing for Electronic Chalkboard Lectures", Lulu Publishing, ISBN 978-1430303886, December 2006. 2016 reprint: ISBN 978-3-659-97771-8, Lambert Publishing, November 2016.
  19. Friedland, Gerald and Jain, Ramesh "Multimedia Computing", Cambridge University Press, October 2014.
  20. Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, Li-Jia Li. "YFCC100M: The New Data in Multimedia Research". Communications of the ACM, Vol. 59 No. 2, Pages 64-73
  21. YFCC100M
  22. The Multimedia Commons
  23. Gerald Friedland, Oriol Vinyals, and Trevor Darrell: "Multimodal Location Estimation", in Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, October 2010, pp. 1245-1251.
  24. Choi, Jaeyoung and Friedland, Gerald "Multimodal Location Estimation of Videos and Images", Springer Publishing, October 2014.
  25. Nils Peters, Howard Lei, Gerald Friedland: "Room identification using acoustic features in a recording", US Patent US20140161270A1
  26. Web Photos That Reveal Secrets, Like Where you Live (New York Times, Aug 11, 2010)
  27. Tips to Turn Off Geo-Tagging on Your Cell Phone (ABC News, Aug 20, 2010)
  28. Could you fall victim to crime simply by geotagging location info to your photos? (Digital Trends, Jul 22, 2013)
  29. Ways to Avoid Email Tracking (New York Times, Dec 25, 2014)
  30. BodyWorn, the police-worn camera that aims to reduce crime (Fox News, May 19, 2015)
  31. Paris ISIS Attacks: Tech Industry Says 'Anti-Terror' Back Doors Would Make US Less Safe (International Business Times, Nov 18, 2015)
  32. Why our Crazy Smart AI still sucks at Transcribing our Speech (Wired Magazine, Apr 8, 2016)
  33. Transcribing Audio Sucks—So Make Machines Like Trint Do It (Wired Magazine, Apr 26, 2017)
  34. "Teaching Privacy".
  35. Gerald Friedland, Bertrand Irissou: "Method of facilitating construction of a voice dialog interface for an electronic system", US Patent Application US15382163.