Trilingual semantic embeddings of visually grounded speech with self-attention mechanisms

Publication
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
