Publications

Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization