Deeplearning4j

Eclipse Deeplearning4j
Original author(s) Alex D. Black, Adam Gibson, Vyacheslav Kokorin, Josh Patterson
Developer(s) Konduit K. K. and contributors
Preview release
1.0.0-beta7 / 13 May 2020 [1]
Repository
Written in Java, CUDA, C, C++
Operating system Linux, macOS, Windows, Android, iOS
Platform CUDA, x86, ARM, PowerPC
Available in English
Type Natural language processing, deep learning, machine vision, artificial intelligence
License Apache License 2.0
Website www.deeplearning4j.org

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). [2] [3] It is a framework with wide support for deep learning algorithms. [4] Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder, and recursive neural tensor network, as well as word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark. [5]

Deeplearning4j is open-source software released under Apache License 2.0, [6] developed mainly by a machine learning group headquartered in San Francisco. [7] It is supported commercially by the startup Skymind, which bundles DL4J, TensorFlow, Keras and other deep learning libraries in an enterprise distribution called the Skymind Intelligence Layer. [8] Deeplearning4j was contributed to the Eclipse Foundation in October 2017. [9] [10]

Introduction

Deeplearning4j relies on the widely used programming language Java, though it is compatible with Clojure and includes a Scala application programming interface (API). It is powered by its own open-source numerical computing library, ND4J, and works with both central processing units (CPUs) and graphics processing units (GPUs). [11] [12]

Deeplearning4j has been used in several commercial and academic applications. The code is hosted on GitHub. [13] A support forum is maintained on Gitter. [14]

The framework is composable, meaning shallow neural nets such as restricted Boltzmann machines, convolutional nets, autoencoders, and recurrent nets can be added to one another to create deep nets of varying types. It also has extensive visualization tools, [15] and a computation graph. [16]
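Composability can be illustrated with a minimal sketch in plain Java. This is not Deeplearning4j's API: the class `LayerStack`, the `dense` helper, and the ReLU activation are assumptions made for illustration. Each layer is a function from one activation vector to the next, and a deeper net is a list of such functions applied in order.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class LayerStack {

    /**
     * Hypothetical dense layer with ReLU activation.
     * weights[j] holds the weights feeding output unit j.
     */
    public static UnaryOperator<double[]> dense(double[][] weights, double[] bias) {
        return input -> {
            double[] out = new double[weights.length];
            for (int j = 0; j < weights.length; j++) {
                double sum = bias[j];
                for (int i = 0; i < input.length; i++) {
                    sum += weights[j][i] * input[i];
                }
                out[j] = Math.max(0.0, sum); // ReLU
            }
            return out;
        };
    }

    /** Applies the layers in order, feeding each output into the next layer. */
    public static double[] forward(List<UnaryOperator<double[]>> layers, double[] input) {
        double[] activation = input;
        for (UnaryOperator<double[]> layer : layers) {
            activation = layer.apply(activation);
        }
        return activation;
    }

    public static void main(String[] args) {
        // Two small layers stacked into one network; the weights are made up.
        List<UnaryOperator<double[]>> net = List.of(
                dense(new double[][] {{1, 1}, {1, -1}}, new double[] {0, 0}),
                dense(new double[][] {{2, 1}}, new double[] {1}));
        System.out.println(forward(net, new double[] {2, 3})[0]);
    }
}
```

Adding depth in this sketch means appending another function to the list, which is the sense of "composable" used above.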

Distributed

Training with Deeplearning4j can take place in a cluster. Neural nets are trained in parallel via iterative reduce, which works on Hadoop-YARN and on Spark. [7] [17] Deeplearning4j also integrates with CUDA kernels to conduct pure GPU operations, and works with distributed GPUs.
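The reduce step of such a scheme can be sketched in plain Java, assuming that each worker trains on its own shard of the data and that the combined model is the element-wise mean of the workers' parameters. The class and method names are hypothetical, and the sketch leaves out the cluster communication that Hadoop or Spark would handle.

```java
import java.util.Arrays;
import java.util.List;

public class ParameterAveraging {

    /** Returns the element-wise mean of the workers' parameter vectors. */
    public static double[] average(List<double[]> workerParams) {
        if (workerParams.isEmpty()) {
            throw new IllegalArgumentException("need at least one worker");
        }
        int n = workerParams.get(0).length;
        double[] mean = new double[n];
        for (double[] params : workerParams) {
            if (params.length != n) {
                throw new IllegalArgumentException("parameter vectors differ in length");
            }
            for (int i = 0; i < n; i++) {
                mean[i] += params[i]; // accumulate the sum first
            }
        }
        for (int i = 0; i < n; i++) {
            mean[i] /= workerParams.size();
        }
        return mean;
    }

    public static void main(String[] args) {
        // Made-up parameters after three workers each trained on a separate shard.
        List<double[]> workers = List.of(
                new double[] {1.0, 4.0},
                new double[] {2.0, 5.0},
                new double[] {3.0, 6.0});
        // In an iterative scheme the mean would be sent back to every worker
        // as the starting point for the next round.
        System.out.println(Arrays.toString(average(workers)));
    }
}
```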

Scientific computing for the JVM

Deeplearning4j includes an n-dimensional array class using ND4J that allows scientific computing in Java and Scala, similar to the functions that NumPy provides to Python. ND4J is effectively a library for linear algebra and matrix manipulation intended for use in production environments.
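The idea behind an n-dimensional array class can be shown with a small sketch that stores all elements in one flat buffer and maps an index tuple to an offset through strides. This is a plain-Java illustration, not ND4J's implementation or API; the class name `StridedArray` and the row-major layout are assumptions.

```java
import java.util.Arrays;

public class StridedArray {
    private final double[] data;  // flat backing buffer
    private final int[] shape;    // extent of each axis
    private final int[] strides;  // elements to skip per step along each axis

    public StridedArray(int... shape) {
        this.shape = shape.clone();
        this.strides = new int[shape.length];
        int size = 1;
        // Row-major order: the last axis is contiguous in memory.
        for (int axis = shape.length - 1; axis >= 0; axis--) {
            strides[axis] = size;
            size *= shape[axis];
        }
        this.data = new double[size];
    }

    /** Converts an index tuple such as (i, j, k) into a position in the flat buffer. */
    private int offset(int... index) {
        if (index.length != shape.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        int off = 0;
        for (int axis = 0; axis < shape.length; axis++) {
            if (index[axis] < 0 || index[axis] >= shape[axis]) {
                throw new IndexOutOfBoundsException("axis " + axis);
            }
            off += index[axis] * strides[axis];
        }
        return off;
    }

    public double get(int... index) {
        return data[offset(index)];
    }

    public void set(double value, int... index) {
        data[offset(index)] = value;
    }

    public int[] strides() {
        return strides.clone();
    }

    public static void main(String[] args) {
        StridedArray a = new StridedArray(2, 3, 4);
        a.set(7.5, 1, 2, 3);
        System.out.println(Arrays.toString(a.strides()) + " " + a.get(1, 2, 3));
    }
}
```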

DataVec vectorization library for machine-learning

DataVec vectorizes various file formats and data types using an input/output format system similar to Hadoop's use of MapReduce; that is, it turns various data types into columns of scalars termed vectors. DataVec is designed to vectorize CSVs, images, sound, text, video, and time series. [18] [19]
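As a rough illustration of vectorization, the sketch below turns one CSV record into a numeric vector. It does not use DataVec's API: the class `CsvVectorizer`, the assumption that the last field is a categorical label, and the one-hot encoding of that label are all choices made for this example.

```java
import java.util.Arrays;
import java.util.List;

public class CsvVectorizer {

    /**
     * Converts a CSV line whose last field is a category into a vector:
     * the numeric fields are parsed as doubles, the category is one-hot encoded.
     */
    public static double[] vectorize(String csvLine, List<String> categories) {
        String[] fields = csvLine.split(",");
        int numeric = fields.length - 1;
        double[] vector = new double[numeric + categories.size()];
        for (int i = 0; i < numeric; i++) {
            vector[i] = Double.parseDouble(fields[i].trim());
        }
        int category = categories.indexOf(fields[numeric].trim());
        if (category < 0) {
            throw new IllegalArgumentException("unknown category: " + fields[numeric]);
        }
        vector[numeric + category] = 1.0; // one-hot position for the label
        return vector;
    }

    public static void main(String[] args) {
        // Made-up record: two measurements followed by a class label.
        List<String> labels = List.of("setosa", "versicolor");
        System.out.println(Arrays.toString(vectorize("5.1, 3.5, setosa", labels)));
    }
}
```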

Text and NLP

Deeplearning4j includes a vector space modeling and topic modeling toolkit, implemented in Java and integrating with parallel GPUs for performance. It is designed to handle large text sets.

Deeplearning4j includes implementations of term frequency–inverse document frequency (tf–idf), deep learning, Mikolov's word2vec algorithm, [20] doc2vec, and GloVe, reimplemented and optimized in Java. It relies on t-distributed stochastic neighbor embedding (t-SNE) for word-cloud visualizations.
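For readers unfamiliar with tf–idf, a minimal plain-Java sketch follows. It is not Deeplearning4j's implementation; it assumes the common textbook variant in which term frequency is the term's share of the document's tokens and inverse document frequency is the natural logarithm of the number of documents divided by the number of documents containing the term.

```java
import java.util.List;

public class TfIdf {

    /** Fraction of the document's tokens that equal the term. */
    public static double tf(String term, List<String> doc) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }

    /** log(N / df), where df is the number of documents containing the term. */
    public static double idf(String term, List<List<String>> corpus) {
        long df = corpus.stream().filter(doc -> doc.contains(term)).count();
        if (df == 0) {
            return 0.0; // term absent from the corpus; avoid dividing by zero
        }
        return Math.log((double) corpus.size() / df);
    }

    public static double tfIdf(String term, List<String> doc, List<List<String>> corpus) {
        return tf(term, doc) * idf(term, corpus);
    }

    public static void main(String[] args) {
        // Tiny made-up corpus of pre-tokenized documents.
        List<List<String>> corpus = List.of(
                List.of("deep", "learning"),
                List.of("java", "learning"));
        // "learning" occurs in every document, so its weight is zero;
        // "deep" occurs in only one, so it is weighted higher.
        System.out.println(tfIdf("learning", corpus.get(0), corpus));
        System.out.println(tfIdf("deep", corpus.get(0), corpus));
    }
}
```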

Real-world use cases and integrations

Real-world use cases for Deeplearning4j include network intrusion detection and cybersecurity, fraud detection for the financial sector, [21] [22] anomaly detection in industries such as manufacturing, recommender systems in e-commerce and advertising, [23] and image recognition. [24] Deeplearning4j has integrated with other machine-learning platforms such as RapidMiner, Prediction.io, [25] and Weka. [26]

Machine Learning Model Server

Deeplearning4j serves machine-learning models for inference in production using the free developer edition of SKIL, the Skymind Intelligence Layer. [27] [28] A model server serves the parametric machine-learning models that make decisions about data. It is used for the inference stage of a machine-learning workflow, after data pipelines and model training. A model server is the tool that allows data science research to be deployed in a real-world production environment.

What a Web server is to the Internet, a model server is to AI. Where a Web server receives an HTTP request and returns data about a Web site, a model server receives data and returns a decision or prediction about that data. For example, when sent an image, a model server might return a label for that image, identifying faces or animals in photographs.
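The analogy can be made concrete with a toy sketch built on the JDK's bundled `com.sun.net.httpserver` package. It is unrelated to SKIL: the class `TinyModelServer`, the `/predict` path, the comma-separated request format, and the fixed linear weights standing in for a trained model are all assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class TinyModelServer {
    // Hypothetical fixed parameters standing in for a trained model.
    private static final double[] WEIGHTS = {0.8, -0.5};
    private static final double BIAS = 0.1;

    /** Parses a body such as "1.0,0.5" and returns a label from a linear decision rule. */
    public static String predict(String body) {
        String[] parts = body.trim().split(",");
        if (parts.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " features");
        }
        double score = BIAS;
        for (int i = 0; i < WEIGHTS.length; i++) {
            score += WEIGHTS[i] * Double.parseDouble(parts[i].trim());
        }
        return score >= 0 ? "positive" : "negative";
    }

    /** Starts an HTTP server that answers each request to /predict with a label. */
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/predict", exchange -> {
            String request = new String(
                    exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            int status = 200;
            String reply;
            try {
                reply = predict(request);
            } catch (IllegalArgumentException e) { // also covers malformed numbers
                status = 400;
                reply = "bad request";
            }
            byte[] response = reply.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, response.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(response);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080); // port chosen arbitrarily for the example
    }
}
```

With the server running, posting the body `1.0,0.5` to `http://localhost:8080/predict` should return the label `positive` under these made-up weights. A production model server would add model loading, batching, and versioning, none of which is shown here.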

The SKIL model server is able to import models from Python frameworks such as TensorFlow, Keras, Theano and CNTK, overcoming a major barrier in deploying deep learning models.

Benchmarks

Deeplearning4j is as fast as Caffe for non-trivial image recognition tasks using multiple GPUs. [29] For programmers unfamiliar with high-performance computing (HPC) on the JVM, there are several parameters that must be adjusted to optimize neural network training time. These include setting the heap space, choosing the garbage collection algorithm, employing off-heap memory, and pre-saving data (pickling) for faster ETL. [30] Together, these optimizations can lead to a 10x acceleration in performance with Deeplearning4j.
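To show what heap size and off-heap memory mean in practice, the sketch below reads the JVM's maximum heap and allocates a buffer outside the garbage-collected heap using the JDK's direct buffers. It does not use Deeplearning4j's own memory management; the class name `OffHeapBuffer` is hypothetical, and the HotSpot flags named in the comments (`-Xmx`, `-XX:+UseG1GC`) are given only as examples of the kind of setting involved, not as recommended values.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class OffHeapBuffer {

    /** Allocates room for the given number of floats outside the Java heap. */
    public static FloatBuffer allocate(int count) {
        return ByteBuffer.allocateDirect(count * Float.BYTES)
                .order(ByteOrder.nativeOrder()) // match the platform's byte order
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        // Maximum heap size, controlled on HotSpot by a flag such as -Xmx4g.
        // The collector is chosen by a flag such as -XX:+UseG1GC.
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MiB): " + maxHeapMiB);

        // Data placed in a direct buffer does not count against the heap
        // and is not moved by the garbage collector.
        FloatBuffer features = allocate(1_000);
        features.put(0, 3.14f);
        System.out.println("direct: " + features.isDirect());
    }
}
```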

API Languages: Java, Scala, Python, Clojure & Kotlin

Deeplearning4j can be used via multiple API languages including Java, Scala, Python, Clojure and Kotlin. Its Scala API is called ScalNet. [31] Keras serves as its Python API. [32] Its Clojure wrapper is known as DL4CLJ. [33] The core languages performing the large-scale mathematical operations necessary for deep learning are C, C++ and CUDA C.

TensorFlow, Keras & Deeplearning4j

TensorFlow, Keras and Deeplearning4j work together. Deeplearning4j can import models from TensorFlow and other Python frameworks if they have been created with Keras. [34]

References

  1. "Releases · eclipse/deeplearning4j". github.com. Retrieved 2021-04-03.
  2. Metz, Cade (2014-06-02). "The Mission to Bring Google's AI to the Rest of the World". Wired.com. Retrieved 2014-06-28.
  3. Vance, Ashlee (2014-06-03). "Deep Learning for (Some of) the People". Bloomberg Businessweek. Archived from the original on June 4, 2014. Retrieved 2014-06-28.
  4. Novet, Jordan (2015-11-14). "Want an open-source deep learning framework? Take your pick". VentureBeat. Retrieved 2015-11-24.
  5. "Adam Gibson, DeepLearning4j on Spark and Data Science on JVM with nd4j, SF Spark @Galvanize 20150212". SF Spark Meetup. 2015-02-12. Retrieved 2015-03-01.
  6. "Github Repository". GitHub. April 2020.
  7. "deeplearning4j.org".
  8. "Skymind Intelligence Layer Community Edition". Archived from the original on 2017-11-07. Retrieved 2017-11-02.
  9. "Eclipse Deeplearning4j Project Page". 22 June 2017.
  10. "Skymind's Deeplearning4j, the Eclipse Foundation, and scientific computing in the JVM". Jaxenter. 13 November 2017. Retrieved 2017-11-15.
  11. Novet, Jordan (September 28, 2016). "Deep learning startup Skymind raises $3 million, launches Intelligence Layer distribution". VentureBeat.
  12. Novet, Jordan (June 2, 2014). "Skymind launches with open-source, plug-and-play deep learning features for your app". VentureBeat.
  13. "deeplearning4j/deeplearning4j". 29 April 2023. Retrieved 29 April 2023 via GitHub.
  14. "Element". app.gitter.im. Retrieved 29 April 2023.
  15. "Deeplearning4j Visualization Tools". Archived from the original on 2017-08-10. Retrieved 2016-08-17.
  16. "Deeplearning4j Computation Graph". Archived from the original on 2017-08-10. Retrieved 2016-08-17.
  17. "Iterative reduce". GitHub. 15 March 2020.
  18. "DataVec ETL for Machine Learning". Archived from the original on 2017-10-02. Retrieved 2016-09-18.
  19. "Anomaly Detection for Time Series Data with Deep Learning". InfoQ. Retrieved 29 April 2023.
  20. "Google Code Archive - Long-term storage for Google Code Project Hosting". code.google.com. Retrieved 29 April 2023.
  21. "Archived copy". Archived from the original on 2016-03-10. Retrieved 2016-02-22.
  22. "skymind.ai". skymind.ai. Retrieved 29 April 2023.
  23. "Archived copy". Archived from the original on 2016-03-10. Retrieved 2016-02-22.
  24. "skymind.ai". skymind.ai. Retrieved 29 April 2023.
  25. "DeepLearning4J(Stable) | RapidMiner China". www.rapidminerchina.com. Archived from the original on 18 May 2016. Retrieved 22 May 2022.
  26. "Home - WekaDeeplearning4j". deeplearning.cms.waikato.ac.nz. Retrieved 29 April 2023.
  27. "Products". Archived from the original on 2017-09-21. Retrieved 2017-09-20.
  28. "Model Server for Deep Learning and AI - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-09-21. Retrieved 2017-09-20.
  29. "GitHub - deeplearning4j/Dl4j-benchmark: Repo to track dl4j benchmark code". GitHub. 19 December 2019.
  30. "Deeplearning4j Benchmarks - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-08-09. Retrieved 2017-01-30.
  31. "Scala, Spark and Deeplearning4j - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-02-25. Retrieved 2017-02-25.
  32. "Running Keras with Deeplearning4j - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-02-25. Retrieved 2017-02-25.
  33. "Deep Learning with Clojure - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-02-25. Retrieved 2017-02-25.
  34. "Tensorflow & Deeplearning4j - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM". Archived from the original on 2017-09-08. Retrieved 2017-09-07.