SqueezeNet

SqueezeNet
Original author(s)	Forrest Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, Bill Dally, Kurt Keutzer
Initial release	22 February 2016
Stable release	v1.1 (June 6, 2016)
Repository	github.com/DeepScale/SqueezeNet
Type	Deep neural network
License	BSD license

Last updated December 13, 2024

SqueezeNet is a deep neural network for image classification released in 2016. It was developed by researchers at DeepScale, the University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters while still achieving competitive accuracy. Their best-performing model matched the accuracy of AlexNet on ImageNet classification with a model file that, after model compression, was 510 times smaller.^[1]

Version history

SqueezeNet was originally released on February 22, 2016.^[2] This original version of SqueezeNet was implemented on top of the Caffe deep learning software framework. Shortly thereafter, the open-source research community ported SqueezeNet to a number of other deep learning frameworks. On February 26, 2016, Eddie Bell released a port of SqueezeNet for the Chainer deep learning framework.^[3] On March 2, 2016, Guo Haria released a port of SqueezeNet for the Apache MXNet framework.^[4] On June 3, 2016, Tammy Yang released a port of SqueezeNet for the Keras framework.^[5] In 2017, companies including Baidu, Xilinx, Imagination Technologies, and Synopsys demonstrated SqueezeNet running on low-power processing platforms such as smartphones, FPGAs, and custom processors.^[6]^[7]^[8]^[9]

As of 2018, SqueezeNet ships "natively" as part of the source code of a number of deep learning frameworks such as PyTorch, Apache MXNet, and Apple CoreML.^[10]^[11]^[12] In addition, third-party developers have created implementations of SqueezeNet that are compatible with frameworks such as TensorFlow.^[13] The following table summarizes the frameworks that support SqueezeNet.

Framework	SqueezeNet Support	References
Apache MXNet	Native	^[11]
Apple CoreML	Native	^[12]
Caffe2	Native	^[14]
Keras	3rd party	^[5]
MATLAB Deep Learning Toolbox	Native	^[15]
ONNX	Native	^[16]
PyTorch	Native	^[10]
TensorFlow	3rd party	^[13]
Wolfram Mathematica	Native	^[17]

Relationship to other networks

AlexNet

SqueezeNet was originally described in the paper "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size".^[1] AlexNet is a deep neural network that has 240 MB of parameters, whereas SqueezeNet has just 5 MB of parameters. A model this small fits more easily into computer memory and is easier to transmit over a computer network. However, SqueezeNet is not a "squeezed version of AlexNet"; it is an entirely different DNN architecture from AlexNet.^[18] What SqueezeNet and AlexNet have in common is that both achieve approximately the same level of accuracy when evaluated on the ImageNet image classification validation dataset.
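
The size figures above can be cross-checked with a little arithmetic. The following Python sketch uses only the numbers quoted in this article (240 MB, 5 MB, and the 510x ratio); the one added assumption is that uncompressed weights are stored as 32-bit floats, which is why `BYTES_PER_PARAM` is set to 4. The helper name `approx_param_count` is made up for this illustration.

```python
# Sizes quoted in the article, in megabytes.
ALEXNET_MB = 240.0
SQUEEZENET_MB = 5.0
QUOTED_COMPRESSED_RATIO = 510.0  # the "510 times smaller" figure from the introduction

# Assumption (not stated in the article): weights are stored as 32-bit floats.
BYTES_PER_PARAM = 4


def approx_param_count(size_mb, bytes_per_param=BYTES_PER_PARAM):
    """Estimate how many parameters fit in a parameter file of the given size."""
    return size_mb * 1_000_000 / bytes_per_param


# Ratio of the uncompressed parameter files.
size_ratio = ALEXNET_MB / SQUEEZENET_MB

# File size that a 510x reduction relative to AlexNet would imply.
implied_compressed_mb = ALEXNET_MB / QUOTED_COMPRESSED_RATIO

print(f"AlexNet / SqueezeNet size ratio: {size_ratio:.0f}x")
print(f"Approx. AlexNet parameters:      {approx_param_count(ALEXNET_MB):,.0f}")
print(f"Approx. SqueezeNet parameters:   {approx_param_count(SQUEEZENET_MB):,.0f}")
print(f"Size implied by a 510x ratio:    {implied_compressed_mb:.2f} MB")
```

Under these assumptions the uncompressed ratio comes out at 48x, in line with the rounded "50x" in the paper's title, and a 510x reduction implies a file of roughly 0.47 MB, consistent with the "<0.5MB" claim.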

Model compression

Model compression (e.g. quantization and pruning of model parameters) can be applied to a deep neural network after it has been trained.^[19] In the SqueezeNet paper, the authors demonstrated that a model compression technique called Deep Compression can be applied to SqueezeNet to further reduce the size of the parameter file from 5 MB to 500 KB.^[1] Deep Compression has also been applied to other DNNs, such as AlexNet and VGG.^[20]
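
To make the two techniques named above concrete, here is a minimal Python sketch of magnitude pruning followed by uniform quantization on a toy list of weights. It is a simplified illustration, not the Deep Compression pipeline: the cited paper's title lists pruning, trained quantization, and Huffman coding, whereas this sketch swaps in plain uniform quantization and skips retraining and entropy coding. The function names and the example weights are made up for this illustration.

```python
def prune_by_magnitude(weights, keep_fraction):
    """Zero out all but (roughly) the largest-magnitude fraction of weights."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep = max(1, round(len(weights) * keep_fraction))
    # Magnitude of the keep-th largest weight; ties may keep a few extra.
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_uniform(weights, bits):
    """Snap each nonzero weight to the nearest of 2**bits evenly spaced levels."""
    nonzero = [w for w in weights if w != 0.0]
    if not nonzero:
        return list(weights)
    lo, hi = min(nonzero), max(nonzero)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    # Pruned (zero) weights stay zero so the sparsity pattern is preserved.
    return [0.0 if w == 0.0 else lo + round((w - lo) / step) * step
            for w in weights]


# Toy "trained" weights (hypothetical values).
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.6]

pruned = prune_by_magnitude(weights, keep_fraction=0.5)
quantized = quantize_uniform(pruned, bits=2)

print("pruned:   ", pruned)
print("quantized:", [round(q, 3) for q in quantized])
```

After pruning, only the nonzero weights and their positions need to be stored; after quantization, each surviving weight can be stored as a short index into a table of at most 2**bits values rather than as a full floating-point number.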

Variants

Some of the members of the original SqueezeNet team have continued to develop resource-efficient deep neural networks for a variety of applications. A few of these works are noted in the following table. As with the original SqueezeNet model, the open-source research community has ported and adapted these newer "squeeze"-family models for compatibility with multiple deep learning frameworks.


DNN Model	Application	Original Implementation	Other Implementations
SqueezeDet^[21]^[22]	Object Detection on Images	TensorFlow^[23]	Caffe,^[24] Keras^[25]^[26]^[27]
SqueezeSeg^[28]	Semantic Segmentation of LIDAR	TensorFlow^[29]
SqueezeNext^[30]	Image Classification	Caffe^[31]	TensorFlow,^[32] Keras,^[33] PyTorch^[34]
SqueezeNAS^[35]^[36]	Neural Architecture Search for Semantic Segmentation	PyTorch^[37]

In addition, the open-source research community has extended SqueezeNet to other applications, including semantic segmentation of images and style transfer.^[38]^[39]^[40]

References

[1] Iandola, Forrest N; Han, Song; Moskewicz, Matthew W; Ashraf, Khalid; Dally, William J; Keutzer, Kurt (2016). "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size". arXiv:1602.07360 [cs.CV].
[2] "SqueezeNet". GitHub. 2016-02-22. Retrieved 2018-05-12.
[3] Bell, Eddie (2016-02-26). "An implementation of SqueezeNet in Chainer". GitHub. Retrieved 2018-05-12.
[4] Haria, Guo (2016-03-02). "SqueezeNet for MXNet". GitHub. Retrieved 2018-05-12.
[5] Yang, Tammy (2016-06-03). "SqueezeNet Keras Implementation". GitHub. Retrieved 2018-05-12.
[6] Chirgwin, Richard (2017-09-26). "Baidu puts open source deep learning into smartphones". The Register. Retrieved 2018-04-07.
[7] Bush, Steve (2018-01-25). "Neural network SDK for PowerVR GPUs". Electronics Weekly. Retrieved 2018-04-07.
[8] Yoshida, Junko (2017-03-13). "Xilinx AI Engine Steers New Course". EE Times. Retrieved 2018-05-13.
[9] Boughton, Paul (2017-08-28). "Deep learning computer vision algorithms ported to processor IP". Engineer Live. Retrieved 2018-04-07.
[10] "squeezenet.py". GitHub: PyTorch. Retrieved 2018-05-12.
[11] "squeezenet.py". GitHub: Apache MXNet. Retrieved 2018-04-07.
[12] "CoreML". Apple. Retrieved 2018-04-10.
[13] Poster, Domenick. "Tensorflow implementation of SqueezeNet". GitHub. Retrieved 2018-05-12.
[14] Inkawhich, Nathan. "SqueezeNet Model Quickload Tutorial". GitHub: Caffe2. Retrieved 2018-04-07.
[15] "SqueezeNet for MATLAB Deep Learning Toolbox". Mathworks. Retrieved 2018-10-03.
[16] Fang, Lu. "SqueezeNet for ONNX". Open Neural Network eXchange.
[17] "SqueezeNet V1.1 Trained on ImageNet Competition Data". Wolfram Neural Net Repository. Retrieved 2018-05-12.
[18] "SqueezeNet". Short Science. Retrieved 2018-05-13.
[19] Han, Song; Mao, Huizi; Dally, William J. (2016-02-15). "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding". arXiv:1510.00149.
[20] Han, Song (2016-11-06). "Compressing and regularizing deep neural networks". O'Reilly. Retrieved 2018-05-08.
[21] Wu, Bichen; Wan, Alvin; Iandola, Forrest; Jin, Peter H.; Keutzer, Kurt (2016). "SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving". arXiv:1612.01051 [cs.CV].
[22] Nunes Fernandes, Edgar (2017-03-02). "Introducing SqueezeDet: low power fully convolutional neural network framework for autonomous driving". The Intelligence of Information. Retrieved 2019-03-31.
[23] Wu, Bichen (2016-12-08). "SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving". GitHub. Retrieved 2018-12-26.
[24] Kuan, Xu (2017-12-20). "Caffe SqueezeDet". GitHub. Retrieved 2018-12-26.
[25] Padmanabha, Nischal (2017-03-20). "SqueezeDet on Keras". GitHub. Retrieved 2018-12-26.
[26] Ehmann, Christopher (2018-05-29). "Fast object detection with SqueezeDet on Keras". Medium. Retrieved 2019-03-31.
[27] Ehmann, Christopher (2018-05-02). "A deeper look into SqueezeDet on Keras". Medium. Retrieved 2019-03-31.
[28] Wu, Bichen; Wan, Alvin; Yue, Xiangyu; Keutzer, Kurt (2017). "SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud". arXiv:1710.07368 [cs.CV].
[29] Wu, Bichen (2017-12-06). "SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud". GitHub. Retrieved 2018-12-26.
[30] Gholami, Amir; Kwon, Kiseok; Wu, Bichen; Tai, Zizheng; Yue, Xiangyu; Jin, Peter; Zhao, Sicheng; Keutzer, Kurt (2018). "SqueezeNext: Hardware-Aware Neural Network Design". arXiv:1803.10615 [cs.CV].
[31] Gholami, Amir (2018-04-18). "SqueezeNext". GitHub. Retrieved 2018-12-29.
[32] Verhulsdonck, Tijmen (2018-07-09). "SqueezeNext Tensorflow: A tensorflow Implementation of SqueezeNext". GitHub. Retrieved 2018-12-29.
[33] Sémery, Oleg (2018-09-24). "SqueezeNext, implemented in Keras". GitHub. Retrieved 2018-12-29.
[34] Lu, Yi (2018-06-21). "SqueezeNext.PyTorch". GitHub. Retrieved 2018-12-29.
[35] Shaw, Albert; Hunter, Daniel; Iandola, Forrest; Sidhu, Sammy (2019). "SqueezeNAS: Fast neural architecture search for faster semantic segmentation". arXiv:1908.01748 [cs.LG].
[36] Yoshida, Junko (2019-08-25). "Does Your AI Chip Have Its Own DNN?". EE Times. Retrieved 2019-09-12.
[37] Shaw, Albert (2019-08-27). "SqueezeNAS". GitHub. Retrieved 2019-09-12.
[38] Treml, Michael; et al. (2016). "Speeding up Semantic Segmentation for Autonomous Driving". NIPS MLITS Workshop. Retrieved 2019-07-21.
[39] Zeng, Li (2017-03-22). "SqueezeNet Neural Style on PyTorch". GitHub. Retrieved 2019-07-21.
[40] Wu, Bichen; Keutzer, Kurt (2017). "The Impact of SqueezeNet" (PDF). UC Berkeley. Retrieved 2019-07-21.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
