SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation

Xutai Ma¹, Juan Pino², Philipp Koehn¹
¹Johns Hopkins University, ²Facebook


We investigate how to adapt simultaneous text translation methods, such as wait-$k$ and monotonic multihead attention, to end-to-end simultaneous speech translation by introducing a pre-decision module. We analyze in detail the latency-quality trade-offs of combining fixed and flexible pre-decision with fixed and flexible policies. We also design a novel computation-aware latency metric, adapted from Average Lagging.
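
For readers unfamiliar with the text-side baselines named above, the following is a minimal sketch of the standard wait-$k$ read/write schedule and of the original, non-computation-aware Average Lagging as they are commonly defined for text translation. It does not implement the pre-decision module or the computation-aware metric proposed in this paper. The function names `wait_k_delays` and `average_lagging` are illustrative choices, and the sketch assumes token-level source and target sequences of known, non-zero length.

```python
def wait_k_delays(src_len, tgt_len, k):
    """Return g(i) for i = 1..tgt_len under a wait-k policy.

    g(i) is the number of source tokens read before target token i is
    written: the policy first reads k tokens, then alternates one write
    and one read until the source is exhausted.
    """
    return [min(k + i - 1, src_len) for i in range(1, tgt_len + 1)]


def average_lagging(delays, src_len):
    """Standard Average Lagging for text, given delays g(1..|Y|).

    Assumes a non-empty `delays` list and src_len > 0 (illustrative
    simplification; no input validation is done here).
    """
    tgt_len = len(delays)
    gamma = tgt_len / src_len  # target-to-source length ratio
    # tau: first target index at which the whole source has been read;
    # fall back to the last target index if that never happens.
    tau = next(
        (i for i, g in enumerate(delays, start=1) if g >= src_len),
        tgt_len,
    )
    lag = sum(delays[i - 1] - (i - 1) / gamma for i in range(1, tau + 1))
    return lag / tau


if __name__ == "__main__":
    delays = wait_k_delays(src_len=10, tgt_len=10, k=3)
    print(delays)
    print(average_lagging(delays, src_len=10))
```

With equal source and target lengths, a wait-$k$ schedule lags the source by $k$ tokens at every step before the source is exhausted, so the Average Lagging of the example schedule equals $k$.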