Accelerating Convolutional Networks via Global & Dynamic Filter Pruning

Abstract

most approaches tend to prune filters in a layer-wise fixed manner, which is incapable of dynamically recovering the previously removed filter, as well as jointly optimize the pruned network across layers.
In this paper, we propose a novel global & dynamic pruning (GDP) scheme to prune redundant filters for CNN acceleration.
In particular, GDP first globally prunes the unsalient filters across all layers by proposing a global discriminative function based on prior knowledge of each filter.
Second, it dynamically updates the filter saliency all over the pruned sparse network, and then recovers the mistakenly pruned filter, followed by a retraining phase to improve the model accuracy.
Specially, we effectively solve the corresponding non-convex optimization problem of the proposed GDP via stochastic gradient descent with greedy alternative updating.

Then, conventional convolution layer computation could be rewritten like this:

if the k-th filter is salient in the l-th layer, and 0 otherwise.

denotes the Khatri-Rao product operator.

We propose to solve the following optimization problem:

The problem is NP-hard, because of the || · ||_0 operator

h(·) is a global discriminative function to determine the saliency values of filters, which depends on the prior knowledge of W^∗ .

The output entry of function h(·) is binary, i.e., to be 1 if the corresponding filter is salient, and 0 otherwise.

Since every filter has a mask, we update W∗ as below:

simplified as

Then,

Since the filter W^{k∗} _l

is a d^2C_{l−1}-dimensional vector, we construct a function to measure the saliency score of a filter.

Last modified: 10 March 2024