arxiv:2312.13585

Speech Translation with Large Language Models: An Industrial Practice

Published on Dec 21, 2023

Authors:

Rong Ye, Shanbo Cheng

Abstract

LLM-ST, a speech translation model combining a pre-trained LLM with a speech encoder and multi-task instruction tuning, achieves superior performance with CoT prompting on English and Chinese datasets.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Given the great success of large language models (LLMs) across various tasks, in this paper, we introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained LLM. By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations, even from long audio inputs. Furthermore, our findings indicate that the implementation of Chain-of-Thought (CoT) prompting can yield advantages in the context of LLM-ST. Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST, establishing a new benchmark in the field of speech translation. Demo: https://speechtranslation.github.io/llm-st/.
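
The abstract does not give the prompt format, so the following is only a minimal sketch of what Chain-of-Thought prompting could look like for speech translation: the model is asked to write the transcript first and the translation second, and the reply is split into those two fields. The function names, the `STResult` type, and the `Transcript:` / `Translation:` reply layout are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class STResult:
    """Hypothetical container for the two stages of a CoT-style reply."""
    transcript: str
    translation: str


def build_cot_prompt(src_lang: str, tgt_lang: str) -> str:
    """Build an assumed instruction that asks for a transcript before the translation."""
    return (
        f"First transcribe the {src_lang} speech, then translate it into {tgt_lang}.\n"
        "Answer in the form:\n"
        "Transcript: <text>\n"
        "Translation: <text>"
    )


def parse_cot_output(text: str) -> STResult:
    """Split a reply in the assumed two-line layout into its fields."""
    transcript, translation = "", ""
    for line in text.splitlines():
        if line.startswith("Transcript:"):
            transcript = line[len("Transcript:"):].strip()
        elif line.startswith("Translation:"):
            translation = line[len("Translation:"):].strip()
    return STResult(transcript, translation)
```

In a real system the prompt would be paired with speech-encoder features rather than sent as text alone; that coupling is left out here because the abstract does not describe it.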

Community

ReneeYe

Paper author Dec 22, 2023

Webpage and more examples: https://speechtranslation.github.io/llm-st/
:)
