Paper Reading Help

Accelerating Convolutional Networks via Global & Dynamic Filter Pruning

Abstract

  • Most approaches prune filters in a layer-wise, fixed manner, which can neither dynamically recover a previously removed filter nor jointly optimize the pruned network across layers.

  • In this paper, we propose a novel global & dynamic pruning (GDP) scheme to prune redundant filters for CNN acceleration.

  • In particular, GDP first globally prunes unsalient filters across all layers, using a global discriminative function based on prior knowledge of each filter.

  • Second, it dynamically updates the filter saliency over the whole pruned sparse network and recovers mistakenly pruned filters, followed by a retraining phase to improve model accuracy.

  • Specifically, we effectively solve the corresponding non-convex optimization problem of the proposed GDP via stochastic gradient descent with greedy alternative updating (a high-level sketch of this loop follows).
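
Read as an algorithm, the bullets above describe an alternating loop: score all filters globally, mask out the unsalient ones, take gradient steps on the masked network, and periodically re-score so that a mistakenly pruned filter can come back. The sketch below is a minimal Python rendering of that loop; every name in it (gdp_prune, saliency_fn, masked_grad_fn, keep_ratio, ...) is a placeholder of mine rather than the paper's API, and updating even the currently masked filters is an assumption based on the recovery behaviour described above.

    def gdp_prune(filters, batches, saliency_fn, masked_grad_fn,
                  keep_ratio=0.5, mask_every=100, lr=1e-2):
        # filters:        dict {(layer, k): filter tensor, e.g. a NumPy array}
        # batches:        iterable of mini-batches
        # saliency_fn:    filters -> {(layer, k): scalar saliency score}
        # masked_grad_fn: (filters, mask, batch) -> per-filter gradients of the
        #                 loss computed on the masked (pruned) network
        def global_mask():
            # Score every filter of every layer at once and keep the top fraction.
            scores = saliency_fn(filters)
            keep = max(1, int(round(keep_ratio * len(scores))))
            cutoff = sorted(scores.values(), reverse=True)[keep - 1]
            return {key: 1.0 if scores[key] >= cutoff else 0.0 for key in filters}

        mask = global_mask()
        for step, batch in enumerate(batches):
            grads = masked_grad_fn(filters, mask, batch)
            for key in filters:
                # Every filter is updated, including those the mask zeroes out,
                # so a mistakenly pruned filter can drift back into the salient set.
                filters[key] = filters[key] - lr * grads[key]
            if (step + 1) % mask_every == 0:
                # Dynamic part: rebuild the global mask from the updated filters.
                mask = global_mask()
        return filters, mask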

Global Dynamic Pruning

The Proposed Pruning Scheme

Then, the conventional convolutional layer computation can be rewritten with a binary global mask applied to the filters: the mask entry is 1 if the k-th filter is salient in the l-th layer, and 0 otherwise, and the rewritten expression uses the Khatri-Rao product operator.
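
A per-filter way of writing this rewrite (here Z_{l+1} for the output feature map, X_l^{(i,j)} for the flattened input patch at location (i, j), and f for the activation are my notation; the paper's compact form applies the mask to all filters at once, which is where the Khatri-Rao product appears):

    Z_{l+1}(i, j, k) = f\big( m_l^{k} \, \langle W_l^{k*},\, X_l^{(i,j)} \rangle \big),
    \qquad
    m_l^{k} =
    \begin{cases}
      1, & \text{if the } k\text{-th filter of layer } l \text{ is salient,} \\
      0, & \text{otherwise.}
    \end{cases}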

We propose to solve the following optimization problem:
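
A plausible shape for that problem, assembled from the terms defined just below (the loss L, the training data (X, Y), the sparsity budget β, and the total filter count K are my notation, and the exact formulation in the paper may differ):

    \min_{W^{*},\, m}\; \mathcal{L}\big(Y,\; f(X;\; m \odot W^{*})\big)
    \quad \text{s.t.} \quad m = h(W^{*}), \qquad \|m\|_0 \le \beta, \qquad m \in \{0, 1\}^{K}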

The problem is NP-hard because of the || · ||_0 operator.

h(·) is a global discriminative function that determines the saliency values of the filters; it depends on prior knowledge of W^∗.

Each output entry of h(·) is binary: 1 if the corresponding filter is salient, and 0 otherwise.

Solver

Since every filter has a mask, we update W^∗ through the masked network, which is what allows a mistakenly pruned filter to be recovered later.

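One plausible form of this update (a reconstruction: η(t) denotes a step-size schedule, and taking the gradient with respect to the masked filters rather than the raw filters is my assumption):

    W^{*} \;\leftarrow\; W^{*} \;-\; \eta(t)\,
    \frac{\partial\, \mathcal{L}\big(Y,\; f(X;\; m \odot W^{*})\big)}{\partial\, (m \odot W^{*})}

Under this form the masked filters still receive non-zero gradient, so a filter that is currently pruned keeps moving and can re-enter the salient set when the mask is recomputed.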

The Global Mask

The global mask update is simplified to a per-filter saliency computation: since the filter W^{k∗}_l is a d^2C_{l−1}-dimensional vector, a function is constructed to measure the saliency score of each filter.
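
The notes do not record which saliency function the paper uses, so the score below is only an illustrative stand-in (the l2 norm of the flattened d^2 C_{l−1}-dimensional filter); the point of the sketch is the global selection, where one threshold is computed over all layers at once.

    import numpy as np

    def filter_saliency(filters):
        # filters: dict {(layer, k): np.ndarray of shape (d, d, C_prev)}.
        # Returns one scalar per filter. The l2 norm is an illustrative
        # stand-in, NOT the paper's discriminative function h(.).
        return {key: float(np.linalg.norm(w.reshape(-1))) for key, w in filters.items()}

    def build_global_mask(filters, keep_ratio=0.5):
        # Keep the top `keep_ratio` fraction of filters across ALL layers at
        # once, instead of pruning a fixed number per layer.
        scores = filter_saliency(filters)
        keep = max(1, int(round(keep_ratio * len(scores))))
        cutoff = sorted(scores.values(), reverse=True)[keep - 1]
        return {key: 1.0 if scores[key] >= cutoff else 0.0 for key in filters}

Because the cutoff is computed over every layer's scores together, a layer full of redundant filters can lose many of them while a layer whose filters all score highly keeps them, which a fixed per-layer pruning ratio cannot express.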
