Normalized Anchor Box Coordinates
The multibox_prior function returns anchor box coordinates in normalized form: x-axis values are divided by the image width and y-axis values by the image height. The coordinates are therefore expressed as fractions of the image size, nominally in the range [0, 1], and do not depend on the absolute pixel dimensions of the input image; an anchor box that extends past the image border can produce values slightly outside this range. Each anchor box is stored as four values, the upper-left (x, y) corner followed by the lower-right (x, y) corner, in this normalized coordinate system. To recover pixel coordinates for tasks such as visualization, the normalized coordinates are multiplied element-wise by a scaling tensor bbox_scale of the form (w, h, w, h), where w and h are the image width and height in pixels.
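The rescaling step can be illustrated with a minimal sketch in plain Python. The helper name to_pixel_coords, the image size, and the box values are hypothetical and chosen only for illustration; in a tensor library the same result would come from a single broadcast multiplication by bbox_scale.

```python
def to_pixel_coords(normalized_boxes, width, height):
    """Scale (x1, y1, x2, y2) boxes from normalized to pixel coordinates.

    Hypothetical helper for illustration; not part of any library.
    """
    # Same layout as the boxes: x values scale by width, y values by height.
    bbox_scale = (width, height, width, height)
    return [
        tuple(coord * scale for coord, scale in zip(box, bbox_scale))
        for box in normalized_boxes
    ]


# Made-up normalized boxes: (upper-left x, upper-left y, lower-right x, lower-right y).
normalized_boxes = [(0.25, 0.5, 0.75, 1.0), (0.0, 0.0, 0.5, 0.25)]

# Assumed image size of 728 x 561 pixels (width x height).
pixel_boxes = to_pixel_coords(normalized_boxes, width=728, height=561)
print(pixel_boxes)
```

Because the scale follows the (w, h, w, h) layout, each x value is multiplied by the width and each y value by the height, so the same normalized boxes can be drawn on an image of any resolution.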
Tags
D2L
Dive into Deep Learning @ D2L