Abstract
In this paper, we propose metric Hashing Forests (mHF), a supervised variant of random forests tailored for the task of nearest neighbor retrieval through hashing. This is achieved by training independent hashing trees that parse and encode the feature space such that local class neighborhoods are preserved and encoded with similar compact binary codes. At each internal node, locality preserving projections are employed to project the data to a latent subspace in which separability between dissimilar points is enhanced. We then define an oblique split that maximally preserves this separability and facilitates defining local neighborhoods of similar points. By incorporating an inverse-lookup search scheme within the mHF, we avoid exhaustive pairwise neuron similarity comparisons, which allows for scalability to massive databases with little additional time overhead. Exhaustive experimental validations on 22,265 neurons curated from over 120 different archives demonstrate the superior retrieval performance and classification precision of mHF in contrast to state-of-the-art hashing and metric learning based methods. We conclude that the proposed method can be utilized effectively for similarity-preserving retrieval and categorization in large neuron databases.
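To make the hashing-and-lookup pipeline summarized above concrete, the following minimal Python sketch shows one hashing tree with oblique (linear) splits and an inverse-lookup table. It is an illustration under simplifying assumptions, not the authors' implementation: the class and function names (`HashingTree`, `build_inverse_lookup`) are hypothetical, random projection vectors stand in for the learned LPP-derived directions, and a single tree is shown rather than a forest of independently trained trees.

```python
import numpy as np
from collections import defaultdict

class HashingTree:
    """Toy hashing tree: each internal node holds an oblique split (w, t).

    Traversing a sample to a leaf yields one bit per level; the concatenated
    bits form the sample's compact binary code. The projections here are
    random placeholders, not the LPP-derived splits described in the paper.
    """

    def __init__(self, dim, depth, rng):
        self.depth = depth
        n_nodes = 2 ** depth - 1                      # complete binary tree of internal nodes
        self.w = rng.normal(size=(n_nodes, dim))      # placeholder projection per node
        self.t = np.zeros(n_nodes)                    # placeholder thresholds

    def encode(self, x):
        """Return the binary code of sample x as a string of 0/1 bits."""
        bits, node = [], 0
        for _ in range(self.depth):
            go_right = x @ self.w[node] > self.t[node]
            bits.append('1' if go_right else '0')
            node = 2 * node + (2 if go_right else 1)  # child index in array layout
        return ''.join(bits)


def build_inverse_lookup(tree, database):
    """Map each binary code to the indices of database samples hashed to it."""
    table = defaultdict(list)
    for idx, x in enumerate(database):
        table[tree.encode(x)].append(idx)
    return table


# Usage: encode a toy database of feature vectors once, then retrieve
# candidate neighbors for a query with a single table lookup instead of
# pairwise similarity comparisons against every database entry.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 16))                # toy neuron feature vectors
tree = HashingTree(dim=16, depth=8, rng=rng)
lookup = build_inverse_lookup(tree, database)
query = rng.normal(size=16)
candidates = lookup.get(tree.encode(query), [])       # entries sharing the query's code
print(len(candidates))
```

In the full method, several such trees would be trained independently and their codes combined, and the per-node splits would be learned so that samples from the same class fall into the same buckets; the lookup step itself, however, remains a constant-time table access as sketched here.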