Concept
Indexing Anchor Box Coordinates
The output tensor containing all generated anchor boxes for an image typically has an initial shape of (batch size, total anchor boxes, 4), where the last axis holds the four coordinates of each box. To easily access the anchor boxes centered on a specific pixel, this tensor can be reshaped to (image height, image width, anchor boxes per pixel, 4). The reshape is valid when every pixel is assigned the same number of anchor boxes, so that total anchor boxes = image height × image width × anchor boxes per pixel, and when the batch dimension is 1. Once reshaped, the coordinates of any individual anchor box can be retrieved directly by indexing into the tensor with its (y, x) spatial location and its index among the anchor boxes assigned to that pixel.
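The reshape-and-index step can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the D2L implementation: `make_dummy_anchors` is a hypothetical stand-in for a real anchor generator, and it assumes the boxes are stored row by row over pixels, with the boxes belonging to one pixel kept contiguous. Each dummy "box" stores (y, x, k, 0) rather than real corner coordinates, so the result of a lookup is easy to check.

```python
import numpy as np


def make_dummy_anchors(height, width, boxes_per_pixel):
    """Hypothetical anchor generator returning shape (1, h * w * n, 4).

    Each box holds (y, x, k, 0) instead of real coordinates, purely so
    that indexing results can be verified by eye.
    """
    ys, xs, ks = np.meshgrid(
        np.arange(height),
        np.arange(width),
        np.arange(boxes_per_pixel),
        indexing="ij",
    )
    boxes = np.stack([ys, xs, ks, np.zeros_like(ys)], axis=-1)  # (h, w, n, 4)
    # Flatten to the "initial" layout: (batch size = 1, total anchor boxes, 4).
    flat = boxes.reshape(1, height * width * boxes_per_pixel, 4)
    return flat.astype(np.float32)


height, width, boxes_per_pixel = 3, 4, 5
anchors = make_dummy_anchors(height, width, boxes_per_pixel)

# Reshape so the anchors can be addressed by pixel location.
# This assumes a batch dimension of 1 and the row-major ordering described above.
grid = anchors.reshape(height, width, boxes_per_pixel, 4)

# Anchor box with index 3 among those centered on pixel (y=2, x=1).
box = grid[2, 1, 3]
```

If the generator orders its boxes differently (for example, grouping by box shape before pixel position), the target shape and axis order of the reshape would need to change accordingly.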
Updated 2026-05-20
Tags
D2L
Dive into Deep Learning @ D2L