Paula Harder, Venkatesh Ramesh, et al.
EGU 2023
Building feature extraction approaches that can effectively characterize natural environment sounds is challenging because of their dynamic nature. In this paper, we develop a framework for extracting features and obtaining semantic inferences from such data. In particular, we propose a new pooling strategy for deep architectures that preserves the temporal dynamics in the resulting representation. By constructing an ensemble of semantic embeddings, we employ an l1-reconstruction-based prediction algorithm to estimate the relevant tags. We evaluate our approach on challenging environmental sound recognition datasets and show that the proposed features outperform traditional spectral features.
Leonid Karlinsky, Joseph Shtok, et al.
AAAI 2021
Karthikeyan Natesan Ramamurthy, Kush R. Varshney, et al.
SSP 2014
Abhishek Kumar, Kahini Wadhawan, et al.
NeurIPS 2018