What this paper is about
Convolutional neural networks are commonly developed at a fixed resource budget and then scaled up for better accuracy when more resources are available.[S1] This paper reports a systematic study of model scaling and finds that carefully balancing network depth, width, and resolution can lead to better performance.[S1] Based on that observation, it proposes a scaling method that uniformly scales depth, width, and resolution using a simple compound coefficient.[S1] The paper demonstrates the effectiveness of this compound scaling method by scaling up MobileNets and ResNet models.[S1] It also uses neural architecture search to design a new baseline network, then scales that baseline to obtain a family of models called EfficientNets.[S1] The paper states that the EfficientNet family achieves much better accuracy and efficiency than previous convolutional networks.[S1] Its headline result is that EfficientNet-B7 achieves a state-of-the-art 84.3% top-1 accuracy on ImageNet.[S1]
Core claims to remember
Convolutional networks are often built for a fixed resource budget and scaled up later if more resources become available.[S1] The paper argues that model scaling can be studied systematically rather than treated as an ad hoc step after a model is designed.[S1] Its central finding is that carefully balancing depth, width, and resolution leads to better performance.[S1] The proposed compound scaling method uniformly scales all three dimensions using a simple compound coefficient.[S1] The method is demonstrated by scaling up MobileNets and ResNet architectures.[S1] Separately, the paper uses neural architecture search to produce a new baseline network and scales it into the EfficientNet family.[S1] The paper states that these EfficientNets deliver much better accuracy and efficiency than previous convolutional networks.[S1] EfficientNet-B7 reaches 84.3% top-1 accuracy, which the paper describes as state of the art.[S1]
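The compound scaling idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation. The exponential form (each dimension multiplied by a fixed base raised to the coefficient) and the base values 1.2, 1.1, and 1.15 come from the full paper's description of its EfficientNet baseline, not from this summary. The function name `compound_scale` and the baseline numbers in the usage loop are hypothetical.

```python
# Assumed base values (depth, width, resolution), as reported in the full
# paper for its EfficientNet baseline; they are not stated in this summary.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(base_depth, base_width, base_resolution, phi):
    """Scale depth, width, and resolution together with one coefficient phi.

    Each dimension is multiplied by its own base raised to phi, so a single
    number controls how much larger the whole network becomes.
    """
    depth = round(base_depth * ALPHA ** phi)
    width = round(base_width * BETA ** phi)
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution


if __name__ == "__main__":
    # Hypothetical baseline: 18 layers, 64 channels, 224-pixel input.
    for phi in range(4):
        print(phi, compound_scale(18, 64, 224, phi))
```

With phi set to 0 the baseline is returned unchanged. Larger values grow all three dimensions at once instead of growing only one of them.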
Limitations and caveats
The demonstrations of compound scaling on existing architectures summarized here cover MobileNets and ResNet models only.[S1] The EfficientNet results rest on a second step, in which neural architecture search designs a new baseline network before it is scaled, so those results reflect the baseline design as well as the scaling method.[S1] The headline result is a single figure, 84.3% top-1 accuracy for EfficientNet-B7.[S1] The contribution is framed in terms of scaling convolutional networks across depth, width, and resolution, so the claims should be read within that scope.[S1]
How to apply this in study or projects
Start with the paper’s description of the common workflow, in which convolutional networks are developed at a fixed resource budget and scaled up when more resources are available.[S1] Trace the systematic study of model scaling and the finding that balancing depth, width, and resolution can improve performance.[S1] Write down the definition of compound scaling: uniformly scaling depth, width, and resolution using a simple compound coefficient.[S1] Examine how the method is applied to scale up MobileNets and ResNet.[S1] Separate the paper’s two routes to larger models: scaling existing architectures, and using neural architecture search to design a new baseline that is then scaled into the EfficientNet family.[S1] Record the headline metric for EfficientNet-B7, the stated 84.3% top-1 accuracy, along with the paper’s characterization of it as state of the art.[S1]
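When working through these steps in a project, it can help to estimate how compute grows as the coefficient increases. The sketch below is a rough aid under stated assumptions, not the paper's code. It assumes, as the full paper does for regular convolutions, that FLOPs grow linearly with depth and quadratically with width and resolution. The default base values 1.2, 1.1, and 1.15 are those the full paper reports for its EfficientNet baseline and do not appear in this summary. Both function names are hypothetical.

```python
def compound_multipliers(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for coefficient phi.

    The default bases are assumed values taken from the full paper.
    """
    return alpha ** phi, beta ** phi, gamma ** phi


def flops_multiplier(depth_mult, width_mult, res_mult):
    """Approximate FLOPs growth for a convolutional network.

    Assumes cost is linear in depth and quadratic in width and resolution.
    """
    return depth_mult * width_mult ** 2 * res_mult ** 2


if __name__ == "__main__":
    for phi in range(4):
        growth = flops_multiplier(*compound_multipliers(phi))
        print(f"phi={phi}: roughly {growth:.2f}x the baseline FLOPs")
```

Under these assumptions each unit increase in the coefficient roughly doubles the compute estimate. That gives a quick way to pick a coefficient that fits a given resource budget.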