What this paper is about
“YOLOv3: An Incremental Improvement” presents updates to YOLO in the form of several small design changes intended to improve the detector.[S1] The authors also train a new network, which they describe as “pretty swell”: a little bigger than the previous one but more accurate.[S1] The paper states that the updated system remains fast despite the accuracy improvements.[S1] At a 320×320 input size, it reports a runtime of 22 ms at 28.2 mAP.[S1] It compares this result to SSD, stating that YOLOv3 is as accurate as SSD but three times faster at that setting.[S1] The paper also discusses results under what it calls the old “.5 IOU mAP detection metric,” and states that YOLOv3 is quite good when evaluated that way.[S1] It reports 57.9 mAP@50 in 51 ms on a Titan X.[S1] For comparison, it cites RetinaNet at 57.5 mAP@50 in 198 ms, describing the performance as similar but with YOLOv3 3.8 times faster.[S1] The code is online at https://pjreddie.com/yolo/.[S1]
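The speed claim in the RetinaNet comparison can be checked with simple arithmetic. The sketch below uses only the two runtimes quoted above; the function name `speedup` is a hypothetical helper introduced for illustration, not something from the paper or its code.

```python
# Runtimes quoted in the summary above for the mAP@50 comparison (milliseconds).
YOLOV3_MS = 51.0
RETINANET_MS = 198.0


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline.

    Hypothetical helper for illustration only.
    """
    if candidate_ms <= 0:
        raise ValueError("candidate_ms must be positive")
    return baseline_ms / candidate_ms


ratio = speedup(RETINANET_MS, YOLOV3_MS)
print(f"YOLOv3 vs. RetinaNet: {ratio:.2f}x faster")  # prints 3.88x
```

The ratio works out to roughly 3.88, which is consistent with the 3.8 times figure the paper quotes if that figure is truncated rather than rounded.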
Core claims to remember
The paper’s theme is incremental improvement: it explicitly presents its contributions as “a bunch of little design changes” that make YOLO better.[S1] It trains a new network, summarized as slightly larger than the previous version but more accurate.[S1] The resulting model remains fast, a point the paper reinforces by giving runtimes alongside accuracy metrics.[S1] At an input size of 320×320, YOLOv3 runs in 22 ms and reaches 28.2 mAP.[S1] The paper states that this result is as accurate as SSD and three times faster.[S1] It calls out the “old .5 IOU mAP detection metric” and states that YOLOv3 is quite good under that metric.[S1] Under mAP@50, YOLOv3 achieves 57.9 in 51 ms on a Titan X.[S1] For context, the paper gives a RetinaNet reference point of 57.5 mAP@50 in 198 ms and describes this as similar performance, with YOLOv3 3.8 times faster.[S1] The paper states that all the code is online at its project page, which it presents as the usual practice for the project.[S1]
Limitations and caveats
The speed numbers are tied to particular evaluation settings: the 22 ms, 28.2 mAP point is reported explicitly for a 320×320 input size.[S1] The timing for the mAP@50 comparison is reported “on a Titan X,” which ties that latency to specific hardware.[S1] The paper also distinguishes between metrics, reporting “28.2 mAP” at 320×320 separately from results under the “old .5 IOU mAP detection metric,” that is, mAP@50.[S1] Finally, the paper states that the new network is a little bigger than the previous one, a tradeoff reported alongside the accuracy gains.[S1]
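To make the “.5 IOU” part of that metric concrete, here is a minimal sketch of intersection over union (IoU) for two axis-aligned boxes and of the 0.5 threshold test. The box layout `(x_min, y_min, x_max, y_max)`, the function names, and the inclusive `>=` comparison are all assumptions made for illustration; evaluation toolkits differ on such details, and this is not the paper’s evaluation code.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Treat a prediction as a match when IoU reaches the threshold.

    The inclusive comparison is an assumption; some toolkits use a strict one.
    """
    return iou(pred, truth) >= threshold


# A prediction covering twice the area of the ground truth: IoU is 0.5.
print(is_match_at_50((0, 0, 10, 10), (0, 0, 10, 5)))    # True
# A prediction shifted diagonally by half its size: IoU is 25/175.
print(is_match_at_50((0, 0, 10, 10), (5, 5, 15, 15)))   # False
```

Under this reading, mAP@50 counts a detection as correct once its overlap with the ground truth reaches one half, which is why scores on that metric sit on a different scale from the 28.2 mAP figure and belong in separate notes.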
How to apply this in study or projects
Read the paper’s list of “little design changes” and translate each into a one-line statement of what was altered, since the paper presents its updates in that form.[S1] Record the 320×320 operating point as a single reference row with input size, runtime in milliseconds, and mAP, since the paper reports it as “22 ms at 28.2 mAP.”[S1] Write down the comparison to SSD at 320×320 as two paired statements, since the paper reports “as accurate as SSD” and “three times faster.”[S1] Keep notes separated by metric, named exactly as the paper names them, since it calls out “the old .5 IOU mAP detection metric” and also reports a “mAP@50” value.[S1] Reproduce the Titan X comparison as two system lines, one for YOLOv3 at “57.9 mAP@50 in 51 ms” and one for RetinaNet at “57.5 mAP@50 in 198 ms,” since the paper provides those exact paired numbers.[S1] Use the project URL as the canonical location for implementation details, since the paper states that all the code is online at https://pjreddie.com/yolo/.[S1]
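One way to keep these reference rows organized is a small record type with the metric name stored alongside each score. The sketch below is a note-taking aid under stated assumptions: the class `OperatingPoint`, its field names, and the helper `rows_for_metric` are hypothetical. Only the numbers come from the summary above, and fields the summary does not state are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OperatingPoint:
    """One reported accuracy/speed point (hypothetical note-taking record)."""
    system: str
    metric: str                       # kept exactly as named in the notes
    score: float
    runtime_ms: float
    input_size: Optional[str] = None  # None when not stated in the summary
    hardware: Optional[str] = None    # None when not stated in the summary


ROWS: List[OperatingPoint] = [
    OperatingPoint("YOLOv3", "mAP", 28.2, 22.0, input_size="320x320"),
    OperatingPoint("YOLOv3", "mAP@50", 57.9, 51.0, hardware="Titan X"),
    # Hardware for the RetinaNet figure is not stated in the summary.
    OperatingPoint("RetinaNet", "mAP@50", 57.5, 198.0),
]


def rows_for_metric(rows: List[OperatingPoint], metric: str) -> List[OperatingPoint]:
    """Return only the rows reported under the given metric name."""
    return [r for r in rows if r.metric == metric]


# Compare systems only within a single metric.
for row in rows_for_metric(ROWS, "mAP@50"):
    print(f"{row.system:<10} {row.score:>5.1f} {row.metric:<7} {row.runtime_ms:>6.0f} ms")
```

Filtering by metric before comparing makes it harder to set the 28.2 mAP figure against the 57.9 mAP@50 figure by accident.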