statsml/compress-net-notes


This is a collection of papers on reducing model size and on ASIC/FPGA accelerators for machine learning, especially for deep-neural-network applications. (Inspired by Embedded-Neural-Network.)

You can use the following materials as your entry point:

Terminologies

  • Structural pruning (compression): compress CNNs by removing "less important" filters; see the sketch below.
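
To make the idea concrete, here is a minimal sketch of filter pruning, not taken from the listed papers: rank a convolutional layer's filters by L1 norm and keep only the largest ones. The weight layout (num_filters, in_channels, k, k) and the keep_ratio parameter are illustrative assumptions.

```python
import numpy as np

def prune_filters_by_l1(weights: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Keep the filters with the largest L1 norms (illustrative sketch).

    `weights` is assumed to be laid out as (num_filters, in_channels, k, k).
    """
    # L1 norm of each filter, summed over all of its coefficients.
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    num_keep = max(1, int(weights.shape[0] * keep_ratio))
    # Indices of the filters with the largest norms, re-sorted to keep order.
    keep = np.sort(np.argsort(norms)[-num_keep:])
    return weights[keep]

# Example: drop half the filters of a random 8-filter 3x3 conv layer.
w = np.random.randn(8, 3, 3, 3)
print(prune_filters_by_l1(w, keep_ratio=0.5).shape)  # (4, 3, 3, 3)
```

Note that after removing a filter, the corresponding input channel of the next layer must be removed as well; this sketch omits that bookkeeping.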

Network Compression

Reduce Precision

The paper "Deep neural networks are robust to weight binarization and other non-linear distortions" showed that DNNs can be robust to more than just weight binarization.
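
As a quick illustration of the kind of distortion studied there, below is a minimal sketch of deterministic weight binarization with a per-tensor scale, in the spirit of BinaryConnect/XNOR-Net; the scaling choice is an assumption for illustration.

```python
import numpy as np

def binarize_weights(weights: np.ndarray) -> np.ndarray:
    """Deterministic sign binarization with a per-tensor scale.

    Each weight becomes +alpha or -alpha, where alpha is the mean
    absolute value of the tensor (np.sign maps exact zeros to 0;
    real implementations usually send them to +1).
    """
    alpha = np.abs(weights).mean()
    return alpha * np.sign(weights)

w = np.random.randn(4, 4)
print(binarize_weights(w))
```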

Linear Quantization
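
This section has no notes yet; as a placeholder, here is a minimal sketch of uniform (linear) quantization, where floating-point values are mapped to evenly spaced integer levels. The 8-bit width and per-tensor min/max range are illustrative assumptions.

```python
import numpy as np

def linear_quantize(x: np.ndarray, num_bits: int = 8):
    """Uniform affine quantization over the tensor's [min, max] range."""
    qmin, qmax = 0, 2**num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)  # assumes x is not constant
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return q.astype(np.uint8), scale, zero_point

def linear_dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(5).astype(np.float32)
q, s, z = linear_quantize(x)
print(x)
print(linear_dequantize(q, s, z))  # close to x, up to quantization error
```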

Non-linear Quantization
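
Also without notes yet; one common non-linear scheme is logarithmic (power-of-two) quantization, where levels are spaced exponentially rather than evenly, so multiplications can become bit shifts in hardware. The 4-bit exponent budget below is an illustrative assumption.

```python
import numpy as np

def log_quantize(x: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Round each magnitude to the nearest power of two (non-linear levels)."""
    sign = np.sign(x)  # exact zeros stay zero
    exp = np.round(np.log2(np.abs(x) + 1e-12))
    # Clip exponents to the range representable with `num_bits` values.
    exp = np.clip(exp, -(2**(num_bits - 1)), 2**(num_bits - 1) - 1)
    return sign * 2.0**exp

x = np.random.randn(5)
print(x)
print(log_quantize(x))
```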

Reduce Number of Operations and Model Size

Exploiting Activation Statistics

  • To be updated.

Network Pruning

Network pruning: a large fraction of the weights in a network is redundant and can be removed (i.e., set to zero); the sketch below shows the simplest, magnitude-based version.
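
The sketch is illustrative, not from a specific paper, and the 90% sparsity target is an assumed parameter: zero out the weights with the smallest absolute values.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.9) -> np.ndarray:
    """Set the smallest-magnitude weights to zero (unstructured pruning)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask

w = np.random.randn(1000)
w_sparse = magnitude_prune(w, sparsity=0.9)
print((w_sparse == 0).mean())  # ~0.9
```

In practice pruning is interleaved with retraining so that the remaining weights can compensate for the removed ones.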

Bayesian network pruning

  • [1711]. Interpreting Convolutional Neural Networks Through Compression - [notes][arXiv]
  • [1705]. Structural compression of convolutional neural networks based on greedy filter pruning - [notes][arXiv]

Compact Network Architectures
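
A representative example is the MobileNet-style depthwise separable convolution, which replaces a standard k x k convolution with a per-channel depthwise convolution followed by a 1x1 pointwise convolution. A quick parameter count (illustrative arithmetic, with assumed layer sizes) shows the saving:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise k x k conv (one filter per channel) plus 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Example: a 3x3 layer with 256 input and 256 output channels.
print(conv_params(3, 256, 256))                 # 589824
print(depthwise_separable_params(3, 256, 256))  # 67840, roughly 8.7x fewer
```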

Knowledge Distillation
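
The canonical formulation (Hinton et al., 2015) trains a small student to match the temperature-softened output distribution of a large teacher. Below is a minimal sketch of the soft-target loss; the temperature T=4 is an assumed hyperparameter, and in practice this term is combined with the usual hard-label cross-entropy.

```python
import numpy as np

def softmax(z: np.ndarray, T: float = 1.0) -> np.ndarray:
    z = z / T
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T: float = 4.0) -> float:
    """Cross-entropy between softened teacher and student distributions.

    The T**2 factor keeps gradient magnitudes comparable across
    temperatures, as suggested in the original paper.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum() * T**2)

teacher = np.array([5.0, 1.0, -2.0])  # teacher logits
student = np.array([3.0, 2.0, -1.0])  # student logits
print(distillation_loss(student, teacher))
```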

A Bit of Hardware

Contributors
