Abstract: In NN pruning, preserving the "important" weights is not what matters; the weights themselves are not important. What mainly determines accuracy is the network architecture left after pruning.
1) training a large, over-parameterized model is often not necessary to obtain an efficient final model,
2) learned “important” weights of the large model are typically not useful for the small pruned model,
3) the pruned architecture itself, rather than a set of inherited “important” weights, is more crucial to the efficiency in the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.
The typical pruning pipeline: train a large, over-parameterized network → prune the "unimportant" weights or channels → fine-tune the pruned network. A sketch follows.
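A minimal sketch of the three-stage pipeline in PyTorch, using the built-in magnitude-pruning utility as a stand-in for whichever pruning criterion a given method actually uses; the `train` helper, `trainloader`, and the epoch counts are hypothetical placeholders:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
import torchvision.models as models

model = models.resnet18(num_classes=10)

# Stage 1: train the large, over-parameterized model.
# train(model, trainloader, epochs=160)            # hypothetical helper

# Stage 2: prune, e.g. zero out the 50% smallest-magnitude weights per conv layer.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")              # bake the mask into the weights

# Stage 3: fine-tune the pruned model briefly at a low learning rate.
# train(model, trainloader, epochs=40, lr=1e-3)    # hypothetical helper
```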
Conclusion 1 of the paper: fine-tuning the inherited weights adds essentially nothing; retraining the pruned architecture from scratch with random initialization reaches the same accuracy. A sketch of this counter-experiment follows.
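A sketch of the scratch-retraining baseline, continuing from the pruned `model` above and staying in the unstructured setting (for structured/channel pruning one would instead instantiate a thinner network and randomly initialize it): keep only the pruned connectivity pattern, discard the learned values, and train with the full schedule rather than a short fine-tune.

```python
import copy

import torch

# Record the sparsity pattern of the pruned model.
scratch_model = copy.deepcopy(model)
masks = {name: (param != 0).float()
         for name, param in scratch_model.named_parameters()
         if name.endswith("weight")}

# Re-initialize every layer that defines its own initializer.
for m in scratch_model.modules():
    if hasattr(m, "reset_parameters"):
        m.reset_parameters()                 # discard the inherited values

# Re-impose the pruned structure on the fresh random weights.
with torch.no_grad():
    for name, param in scratch_model.named_parameters():
        if name in masks:
            param.mul_(masks[name])

# train(scratch_model, trainloader, epochs=160)   # hypothetical full schedule
```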
The remaining experiments follow the same pattern: the authors sweep through all the mainstream pruning methods, run the same comparison on each, and the conclusion holds throughout.
This methodology can in turn be used to evaluate which pruning method produces the better network architecture: train each pruned architecture from scratch and compare the resulting accuracies, as sketched below.
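A hedged sketch of that idea: score a pruning method by the accuracy its architecture reaches when trained from scratch. `prune_fn`, `train`, and `evaluate` are hypothetical placeholders, not an API from the paper.

```python
import copy

def score_pruning_method(prune_fn, base_model, trainloader, testloader):
    pruned = prune_fn(copy.deepcopy(base_model))  # architecture under test
    for m in pruned.modules():
        if hasattr(m, "reset_parameters"):
            m.reset_parameters()                  # drop the inherited weights
    train(pruned, trainloader, epochs=160)        # hypothetical full schedule
    return evaluate(pruned, testloader)           # accuracy = architecture score
```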