Abstract: In NN pruning, preserving the "important" weights is not what matters; accuracy is determined mainly by the network architecture that remains after pruning.
1) training a large, over-parameterized model is often not necessary to obtain an efficient final model,
2) the learned "important" weights of the large model are typically not useful for the small pruned model,
3) the pruned architecture itself, rather than a set of inherited "important" weights, is what matters most for the efficiency of the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.
Conclusion 1: Fine-tuning the inherited weights provides little benefit; retraining the pruned architecture from scratch (with random initialization) reaches the same accuracy.
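The findings above can be sketched in a few lines. Below is a minimal illustration (my own code, not the paper's) of unstructured magnitude pruning with NumPy: the mask derived from the trained weights is kept as the "architecture", while the inherited weight values are discarded and replaced with a fresh random initialization before retraining.

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Binary mask that keeps the largest-magnitude entries.

    `sparsity` is the fraction of weights to remove (0.5 drops half).
    """
    k = int(weights.size * (1.0 - sparsity))  # number of weights to keep
    if k == 0:
        return np.zeros_like(weights, dtype=np.float32)
    threshold = np.sort(np.abs(weights).ravel())[::-1][k - 1]
    return (np.abs(weights) >= threshold).astype(np.float32)

rng = np.random.default_rng(0)
w_trained = rng.standard_normal((4, 4))       # stand-in for trained weights
mask = magnitude_prune_mask(w_trained, sparsity=0.5)

# The paper's point: keep only the mask (the pruned structure),
# throw away the inherited values, and re-initialize randomly
# before training from scratch.
w_scratch = rng.standard_normal((4, 4)) * mask
```

Here `magnitude_prune_mask` is a hypothetical helper for illustration; the paper evaluates both structured and unstructured pruning methods, and the scratch-trained masked network is the baseline that matches fine-tuning.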