Existing measures for evaluating the performance of tracking algorithms are difficult to interpret, which makes it hard to identify the best approach for a particular situation. As we show, a dummy algorithm which does not actually track scores well under most existing measures. Although some measures characterize specific error sources quite well, combining them into a single aggregate measure for comparing approaches or tuning parameters is not straightforward. In this work we propose ‘mean time between failures’ as a viable summary of solution quality- especially when the goal is to follow objects for as long as possible. In addition to being sensitive to all tracking errors, the performance numbers are directly interpretable: how long can an algorithm operate before a mistake has likely occurred (the object is lost, its identity is confused, etc.)? We illustrate the merits of this measure by assessing solutions from different algorithms on a challenging dataset.
The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.