Keras

Original author(s): François Chollet
Developer(s): ONEIROS
Initial release: 27 March 2015
Stable release: 3.7.0 / 26 November 2024 [1]
Written in: Python
Platform: Cross-platform
Type: Frontend for TensorFlow, JAX or PyTorch (and more)
License: Apache 2.0
Website: keras.io

Keras is an open-source library that provides a Python interface for artificial neural networks. Keras began as independent software, was then integrated into the TensorFlow library, and later added support for more frameworks. "Keras 3 is a full rewrite of Keras [and can be used] as a low-level cross-framework language to develop custom components such as layers, models, or metrics that can be used in native workflows in JAX, TensorFlow, or PyTorch — with one codebase." [2] Keras 3 is the default Keras version for TensorFlow 2.16 onwards, but Keras 2 can still be used. [3]
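
A minimal sketch of such a cross-framework component, assuming Keras 3 is installed: the ScaledDense layer below is a hypothetical example, but because it relies only on the backend-agnostic keras.ops namespace, the same class runs unchanged under JAX, TensorFlow, or PyTorch.

```python
import keras
from keras import ops


# A hypothetical backend-agnostic layer: using keras.ops instead of
# tf.*, jax.numpy.*, or torch.* lets one codebase serve all backends.
class ScaledDense(keras.layers.Layer):
    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.scale = scale

    def build(self, input_shape):
        # Create the kernel once the input dimensionality is known.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
            trainable=True,
        )

    def call(self, inputs):
        # ops.matmul and ops.relu dispatch to the active backend.
        return ops.relu(ops.matmul(inputs, self.w) * self.scale)
```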

History

The name 'Keras' derives from the Ancient Greek word κέρας (kéras), meaning 'horn'. [4]

Designed to enable fast experimentation with deep neural networks, Keras focuses on being user-friendly, modular, and extensible. It was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System), [5] and its primary author and maintainer is François Chollet, a Google engineer. Chollet is also the author of the Xception deep neural network model. [6]

Up until version 2.3, Keras supported multiple backends, including TensorFlow, Microsoft Cognitive Toolkit, Theano, and PlaidML. [7] [8] [9]

As of version 2.4, only TensorFlow was supported. Starting with version 3.0 (as well as its preview version, Keras Core), however, Keras has become multi-backend again, supporting TensorFlow, JAX, and PyTorch. [10] It now also supports OpenVINO.
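
In Keras 3, the backend is selected through the KERAS_BACKEND environment variable, which must be set before the library is imported; a minimal sketch:

```python
import os

# Choose the backend before importing keras ("tensorflow" is the
# default; "jax" and "torch" are the other documented options).
os.environ["KERAS_BACKEND"] = "jax"

import keras

print(keras.backend.backend())  # -> "jax"
```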

Features

Keras contains numerous implementations of commonly used neural-network building blocks such as layers, objectives, activation functions, and optimizers, together with a host of tools for working with image and text data that simplify programming deep neural networks. [11] The code is hosted on GitHub, and community support channels include the GitHub issues page and a Slack channel.
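
As an illustration, the snippet below assembles several of these building blocks (layers, an activation function, an optimizer, and an objective) into a small classifier; the layer sizes are arbitrary and chosen only for the example.

```python
import keras

# A small fully connected classifier built from standard Keras blocks.
model = keras.Sequential([
    keras.Input(shape=(784,)),                     # e.g. flattened 28x28 images
    keras.layers.Dense(128, activation="relu"),    # hidden layer + activation
    keras.layers.Dense(10, activation="softmax"),  # 10-class output
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # optimizer
    loss="sparse_categorical_crossentropy",               # objective
    metrics=["accuracy"],
)
model.summary()
```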

In addition to standard neural networks, Keras supports convolutional and recurrent neural networks, as well as common utility layers such as dropout, batch normalization, and pooling. [12]
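
A sketch combining these layer types in one small convolutional model (shapes chosen arbitrarily for illustration):

```python
import keras
from keras import layers

# A small convolutional classifier using the utility layers named above.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),  # convolution
    layers.BatchNormalization(),                          # batch normalization
    layers.MaxPooling2D(pool_size=2),                     # pooling
    layers.Dropout(0.25),                                 # dropout
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```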

Keras allows users to deploy deep models on smartphones (iOS and Android), on the web, and on the Java Virtual Machine. [8] It also supports distributed training of deep-learning models on clusters of graphics processing units (GPUs) and tensor processing units (TPUs). [13]
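
With the TensorFlow backend, one way to express such distributed training is the tf.distribute API; the sketch below assumes a multi-GPU host, and tf.distribute.TPUStrategy plays the analogous role on TPUs.

```python
import tensorflow as tf

# MirroredStrategy replicates the model across all local GPUs and
# aggregates their gradients; TPUStrategy is the TPU counterpart.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# model.fit(...) would now shard each training batch across the replicas.
```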

Related Research Articles

Theano is a Python library and optimizing compiler for manipulating and evaluating mathematical expressions, especially matrix-valued ones. In Theano, computations are expressed using a NumPy-esque syntax and compiled to run efficiently on either CPU or GPU architectures.
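
A minimal sketch of that workflow, assuming the (now archived) Theano package: symbolic variables are declared with NumPy-like syntax and then compiled into a callable function.

```python
import theano
import theano.tensor as T

# Declare symbolic inputs with NumPy-like syntax.
x = T.dmatrix("x")
y = T.dmatrix("y")
z = x + y ** 2  # builds an expression graph; nothing is computed yet

# Compile the graph into an optimized function for CPU or GPU.
f = theano.function([x, y], z)

print(f([[1.0, 2.0]], [[3.0, 4.0]]))  # [[10. 18.]]
```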

Torch is an open-source machine learning library, a scientific computing framework, and a scripting language based on Lua. It provides LuaJIT interfaces to deep learning algorithms implemented in C. It was created by the Idiap Research Institute at EPFL. Torch development moved in 2017 to PyTorch, a port of the library to Python.

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.

TensorFlow is a software library for machine learning and artificial intelligence. It can be used across a range of tasks, but is used mainly for training and inference of neural networks. It is one of the most popular deep learning frameworks, alongside others such as PyTorch and PaddlePaddle. It is free and open-source software released under the Apache License 2.0.

Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and in 2018 made them available for third-party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.

spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

Caffe is a deep learning framework originally developed at the University of California, Berkeley. It is open source, under a BSD license, and is written in C++ with a Python interface.

PyTorch is a machine learning library based on the Torch library, used for applications such as computer vision and natural language processing. Originally developed by Meta AI, it is now part of the Linux Foundation umbrella. It is one of the most popular deep learning frameworks, alongside others such as TensorFlow and PaddlePaddle, and is free and open-source software released under the modified BSD license. Although the Python interface is more polished and the primary focus of development, PyTorch also has a C++ interface.

SqueezeNet is a deep neural network for image classification released in 2016. SqueezeNet was developed by researchers at DeepScale, the University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters while achieving competitive accuracy. Their best-performing model achieved the same accuracy as AlexNet on ImageNet classification with a model roughly 510 times smaller.

PlaidML is a portable tensor compiler. Tensor compilers bridge the gap between the universal mathematical descriptions of deep learning operations, such as convolution, and the platform- and chip-specific code needed to perform those operations with good performance. Internally, PlaidML uses the Tile eDSL to generate OpenCL, OpenGL, LLVM, or CUDA code. It enables deep learning on devices where the available computing hardware is either not well supported or the available software stack contains only proprietary components. For example, it does not require CUDA or cuDNN on Nvidia hardware, while achieving comparable performance.
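
Historically, PlaidML could be plugged in as a backend for Keras 2.x via the plaidml-keras package; a sketch of that documented setup, assuming plaidml-keras is installed:

```python
import os

# Route Keras (<= 2.3) through PlaidML so models run via OpenCL
# on GPUs without CUDA/cuDNN support (requires plaidml-keras).
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"

import keras  # Keras now executes on the PlaidML backend
```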

Inception is a family of convolutional neural networks (CNNs) for computer vision, introduced by researchers at Google in 2014 as GoogLeNet. The series was historically important as an early CNN that separates the stem, body, and head (prediction), an architectural design that persists in all modern CNNs.

Flux is an open-source machine-learning software library and ecosystem written in Julia. Its current stable release is v0.15.0. It has a layer-stacking-based interface for simpler models and strong support for interoperability with other Julia packages rather than a monolithic design. For example, GPU support is implemented transparently by CuArrays.jl. This is in contrast to some other machine learning frameworks, such as TensorFlow.jl, which are implemented in other languages with Julia bindings and are thus limited by the functionality present in the underlying implementation, often in C or C++. Flux joined NumFOCUS as an affiliated project in December 2021.

OpenVINO is an open-source software toolkit for optimizing and deploying deep learning models. It enables programmers to develop scalable and efficient AI solutions with relatively few lines of code. It supports several popular model formats and categories, such as large language models, computer vision, and generative AI.

fast.ai is a non-profit research group focused on deep learning and artificial intelligence. It was founded in 2016 by Jeremy Howard and Rachel Thomas with the goal of democratizing deep learning, which they pursue by providing a massive open online course (MOOC) named "Practical Deep Learning for Coders", which has no prerequisites other than knowledge of the programming language Python.

Owl Scientific Computing is a software system for scientific and engineering computing developed in the Department of Computer Science and Technology, University of Cambridge. The System Research Group (SRG) in the department recognises Owl as one of the representative systems developed in SRG in the 2010s. The source code is licensed under the MIT License and can be accessed from the GitHub repository.

Graph neural networks (GNN) are specialized artificial neural networks that are designed for tasks whose inputs are graphs.

XLA is an open-source compiler for machine learning developed by the OpenXLA project. It is designed to improve the performance of machine learning models by optimizing their computation graphs at a lower level, making it particularly useful for large-scale computations and high-performance machine learning models.
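
For instance, JAX uses XLA as its compiler. In the minimal sketch below, jax.jit hands the traced computation graph to XLA, which can fuse the element-wise operations into a single compiled kernel.

```python
import jax
import jax.numpy as jnp

@jax.jit  # trace the function and compile it with XLA
def fused(x):
    # tanh, multiply, and add can be fused into one XLA kernel
    return jnp.tanh(x) * 2.0 + 1.0

print(fused(jnp.arange(4.0)))
```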

The Latent Diffusion Model (LDM) is a diffusion model architecture developed by the CompVis group at LMU Munich.

MobileNet is a family of convolutional neural network (CNN) architectures designed for image classification, object detection, and other computer vision tasks. They are designed for small size, low latency, and low power consumption, making them suitable for on-device inference and edge computing on resource-constrained devices like mobile phones and embedded systems. They were originally designed to be run efficiently on mobile devices with TensorFlow Lite.

References

  1. "Release 3.7.0". 26 November 2024. Retrieved 25 December 2024.
  2. "Keras: Deep Learning for humans". keras.io. Retrieved 2024-04-30.
  3. "What's new in TensorFlow 2.16". Retrieved 2024-04-30.
  4. Keras Team. "Keras documentation: About Keras 3". keras.io. Retrieved 2024-02-10.
  5. "Keras Documentation". keras.io. Retrieved 2016-09-18.
  6. Chollet, François (2016). "Xception: Deep Learning with Depthwise Separable Convolutions". arXiv:1610.02357 [cs.CV].
  7. "Keras backends". keras.io. Retrieved 2018-02-23.
  8. "Why use Keras?". keras.io. Retrieved 2020-03-22.
  9. "R interface to Keras". keras.rstudio.com. Retrieved 2020-03-22.
  10. Chollet, François; Usui, Lauren (2023). "Introducing Keras Core: Keras for TensorFlow, JAX, and PyTorch". keras.io. Retrieved 2023-07-11.
  11. Ciaramella, Alberto; Ciaramella, Marco (2024). Introduction to Artificial Intelligence: From Data Analysis to Generative AI. ISBN 9788894787603.
  12. "Core - Keras Documentation". keras.io. Retrieved 2018-11-14.
  13. "Using TPUs | TensorFlow". TensorFlow. Archived from the original on 2019-06-04. Retrieved 2018-11-14.