Chainer

Original author(s): Seiya Tokui
Developer(s): Community, Preferred Networks, Inc.
Initial release: June 9, 2015 [1] [2]
Stable release: 7.8.1 [3] / January 5, 2022
Repository: github.com/chainer/chainer
Written in: Python
Platform: Cross-platform
Available in: Python
Type: Deep learning library
License: MIT
Website: chainer.org

Chainer is an open-source deep learning framework written purely in Python on top of the NumPy and CuPy libraries. Development is led by the Japanese venture company Preferred Networks, in partnership with IBM, Intel, Microsoft, and Nvidia. [4] [5] [6] [7]


Chainer is notable for its early adoption of the "define-by-run" scheme, as well as for its performance on large-scale systems. [1] The first version was released in June 2015, and the framework has gained wide popularity in Japan since then. [1] [2] In 2017, KDnuggets listed it among the top 10 open-source machine learning Python projects. [8]

In December 2019, Preferred Networks announced that it was moving its development effort from Chainer to PyTorch and that, after the release of v7, it would provide only maintenance patches. [9]

Define-by-run

Chainer was the first deep learning framework to introduce the define-by-run approach. [10] [11] The traditional procedure for training a network had two phases: first, define the fixed connections between the mathematical operations (such as matrix multiplications and nonlinear activations) in the network; then, run the actual training calculation. This is called the define-and-run or static-graph approach. Theano and TensorFlow are among the notable frameworks that took it. In the define-by-run or dynamic-graph approach, by contrast, the connections in a network are not fixed before training starts: the network is determined during training, as the actual calculation is performed.
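
The difference is visible in a few lines of Chainer code. The following is a minimal sketch with arbitrary values: each function call both computes a result and records a node in the graph that backward() later traverses.

    import numpy as np
    import chainer
    import chainer.functions as F

    x = chainer.Variable(np.array([[1.0, 2.0]], dtype=np.float32))
    W = chainer.Variable(np.array([[0.5], [0.5]], dtype=np.float32))

    y = F.matmul(x, W)    # running this line appends a matmul node to the graph
    z = F.sum(F.relu(y))  # each call extends the graph as it executes

    z.backward()          # gradients flow through whatever graph was actually built
    print(x.grad)         # [[0.5, 0.5]]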

One advantage of this approach is that it is intuitive and flexible. [12] If the network has complicated control flows, such as conditionals and loops, the define-and-run approach needs specially designed operations for such constructs. In the define-by-run approach, on the other hand, the programming language's native constructs, such as if statements and for loops, can describe such flows directly. This flexibility is especially useful for implementing recurrent neural networks. [13] [14]
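
For example, the depth of the toy network below is decided by an ordinary Python loop at run time; the layer sizes and the class name are hypothetical, chosen only for illustration.

    import chainer
    import chainer.functions as F
    import chainer.links as L

    class DynamicNet(chainer.Chain):
        def __init__(self):
            super().__init__()
            with self.init_scope():
                self.layer = L.Linear(10, 10)  # one layer, applied a variable number of times

        def forward(self, x, n_steps):
            h = x
            for _ in range(n_steps):  # a plain Python loop sets the depth
                h = F.relu(self.layer(h))
            if n_steps > 3:           # a plain if statement alters the graph
                h = F.dropout(h)
            return h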

Another advantage is ease of debugging. [12] In the define-and-run approach, if an error (such as a numerical error) occurs during the training calculation, it is often difficult to inspect the fault, because the code that defines the network is separated from the place where the error actually occurs. In the define-by-run approach, the calculation can simply be suspended with the language's built-in debugger, and the data flowing through the network code can be inspected directly.
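
Because the forward pass is ordinary Python, a standard tool such as pdb from the standard library can pause it mid-computation. The snippet below, a hypothetical two-layer model, marks where a breakpoint would expose the intermediate activation.

    import numpy as np
    import chainer
    import chainer.functions as F
    import chainer.links as L

    class Net(chainer.Chain):
        def __init__(self):
            super().__init__()
            with self.init_scope():
                self.l1 = L.Linear(4, 8)
                self.l2 = L.Linear(8, 2)

        def forward(self, x):
            h = F.relu(self.l1(x))
            # import pdb; pdb.set_trace()  # pause here to inspect h.data, h.shape, ...
            return self.l2(h)

    y = Net().forward(np.zeros((1, 4), dtype=np.float32))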

Define-by-run has gained popularity since its introduction by Chainer and is now implemented in many other frameworks, including PyTorch [15] and TensorFlow. [12]

Extension libraries

Chainer has four extension libraries: ChainerMN, ChainerRL, ChainerCV, and ChainerUI. ChainerMN enables Chainer to be used on multiple GPUs with performance significantly faster than other deep learning frameworks. [1] A supercomputer running Chainer on 1,024 GPUs processed 90 epochs of the ImageNet dataset on a ResNet-50 network in 15 minutes, four times faster than the previous record held by Facebook. [16] [17] ChainerRL adds state-of-the-art deep reinforcement learning algorithms, ChainerCV provides tools for computer vision tasks, and ChainerUI is a management and visualization tool.
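
At the API level, ChainerMN follows a data-parallel MPI pattern. The following is a hedged sketch of that pattern, not a complete training script; make_model() is a hypothetical helper standing in for any function that returns a Chainer model.

    import chainer
    import chainermn

    comm = chainermn.create_communicator()  # one MPI process per worker
    device = comm.intra_rank                # map each process to a local GPU

    model = make_model()                    # hypothetical helper returning a Chain
    model.to_gpu(device)

    # Wrapping a standard optimizer makes it average gradients across workers.
    optimizer = chainermn.create_multi_node_optimizer(
        chainer.optimizers.SGD(), comm)
    optimizer.setup(model)
    # ...the usual Chainer training loop follows unchanged...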

Applications

Chainer is used as the framework for PaintsChainer, a service that automatically colorizes black-and-white, line-only draft drawings with minimal user input. [18] [19]


References

  1. "Big-in-Japan AI code 'Chainer' shows how Intel will gun for GPUs". The Register. 2017-04-07. Retrieved 2017-12-24.
  2. "Deep Learning のフレームワーク Chainer を公開しました" [We have released Chainer, a deep learning framework] (in Japanese). 2015-06-09. Retrieved 2017-12-24.
  3. "Release 7.8.1". 5 January 2022. Retrieved 3 October 2022.
  4. "Chainer Homepage" . Retrieved 2017-12-24.
  5. "IBM Wants to be "Red Hat" of Deep Learning". HPCwire. 2017-01-26. Retrieved 2017-09-08.
  6. "Intel Collaborating with Preferred Networks in Japan on Deep Learning". 2017-04-06. Retrieved 2017-12-24.
  7. "Microsoft partners with Preferred Networks to bring Chainer deep learning technology to Azure - MSPoweruser". MSPoweruser. 2017-05-23. Retrieved 2017-09-08.
  8. "Top 20 Python Machine Learning Open Source Projects". KDnuggets. 2017-11-24.
  9. "Preferred Networks Migrates its Deep Learning Research Platform to PyTorch". Preferred Networks, Inc. 2019-12-05. Retrieved 2019-12-27.
  10. Tokui, Seiya; et al. (2015). "Chainer: a next-generation open source framework for deep learning". 29th Annual Conference on Neural Information Processing Systems (NIPS). 5.
  11. Shimada, Naoki (September 14, 2017). Deep Learning with Chainer. Gijutsu-Hyohron. p. 61. ISBN 978-4774191867.
  12. "Eager Execution: An imperative, define-by-run interface to TensorFlow". Google Research Blog.
  13. "Deep Learning With Dynamic Computation Graphs (ICLR 2017)". Metadata.
  14. Hido, Shohei (8 November 2016). "Complex neural networks made easy by Chainer". O'Reilly Media. Retrieved 26 June 2018.
  15. Perez, Carlos E. (20 January 2017). "PyTorch, Dynamic Computational Graphs and Modular Deep Learning". Medium.
  16. "Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes" (PDF). Retrieved 2017-12-24.
  17. Greene, Tristan (20 November 2017). "Facebook's nerds bested by Japan's in the race to train AI". The Next Web. Retrieved 24 November 2017.
  18. Know, Now You (2017-02-15). "This neural network-based software will add colour to your drawings for free". Techly. Retrieved 2017-09-08.
  19. "Drawing app "pixiv Sketch" and automatic coloring service "PaintsChainer" collaborate to provide a new function for automatic coloring of illustrations!". 2017-05-24. Retrieved 2017-12-24.