A Realistic Protocol for Evaluation of Weakly Supervised Object Localization

Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Eric Granger

LIVIA, ILLS, Dept. of Systems Engineering, ETS Montreal, Canada

WACV 2025

Abstract

Weakly Supervised Object Localization (WSOL) allows training deep learning models for classification and localization (LOC) using only global class-level labels. The absence of bounding box (bbox) supervision during training raises challenges in the literature for hyper-parameter tuning, model selection, and evaluation. WSOL methods rely on a validation set with bbox annotations for model selection, and a test set with bbox annotations for threshold estimation for producing bboxes from localization maps. This approach, however, is not aligned with the WSOL setting as these annotations are typically unavailable in real-world scenarios. Our initial empirical analysis shows a significant decline in LOC performance when model selection and threshold estimation rely solely on class labels and the image itself, respectively, compared to using manual bbox annotations. This highlights the importance of incorporating bbox labels for optimal model performance. In this paper, a new WSOL evaluation protocol is proposed that provides LOC information without the need for manual bbox annotations. In particular, we generated noisy pseudo-boxes from a pretrained
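To make the threshold-estimation step concrete, the following is a minimal sketch of how a bbox can be produced from a localization map and how a threshold can be chosen against reference boxes. It is an illustration under stated assumptions, not the authors' implementation: the function names (`bbox_from_map`, `iou`, `select_threshold`) are hypothetical, the box is taken around all pixels above the threshold (a simplification; selecting a single connected region is another common choice), and the reference boxes may be either manual annotations or noisy pseudo-boxes.

```python
import numpy as np


def bbox_from_map(loc_map, threshold):
    """Tight box (x0, y0, x1, y1), inclusive pixel coordinates, around all
    pixels whose activation is >= threshold. Returns None if no pixel passes.
    Simplification: all foreground pixels are used, not one connected region."""
    mask = loc_map >= threshold
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def iou(a, b):
    """Intersection over union of two boxes with inclusive pixel coordinates."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0 + 1) * max(0, iy1 - iy0 + 1)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def select_threshold(loc_maps, ref_boxes, candidates):
    """Pick the candidate threshold with the highest mean IoU against the
    reference boxes (hypothetical helper; ref_boxes may be pseudo-boxes)."""
    best_t, best_score = None, -1.0
    for t in candidates:
        scores = []
        for loc_map, ref in zip(loc_maps, ref_boxes):
            box = bbox_from_map(loc_map, t)
            # A map with no pixel above the threshold counts as a miss.
            scores.append(iou(box, ref) if box is not None else 0.0)
        mean_score = float(np.mean(scores))
        if mean_score > best_score:
            best_t, best_score = t, mean_score
    return best_t, best_score
```

The same scoring loop could, in principle, also rank checkpoints for model selection by replacing the candidate thresholds with candidate models; whether the reference boxes are manual or pseudo-boxes only changes what is passed as `ref_boxes`.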

Impact of Noisy Bounding Boxes on Model Selection

Experiments and Results

Acknowledgements

This research was supported by the Natural Sciences and Engineering Research Council of Canada. We also thank the Digital Research Alliance of Canada for their computing resources.