
S2SGreedySearcher : Do not continue decoding when EOS token was generated for all samples from a batch #1899

Merged
Adel-Moumen merged 1 commit into speechbrain:develop from Jeronymous:efficient-seq2seq on Mar 24, 2023

Conversation

@Jeronymous

Contributor

This can speed things up when validating a seq2seq model.

@TParcollet

Collaborator

@Adel-Moumen what do you think? Could it be added to the minor release?

@Adel-Moumen Adel-Moumen left a comment

Collaborator

LGTM! Thanks for spotting/solving the bug!

@Adel-Moumen Adel-Moumen merged commit ddeafbb into speechbrain:develop Mar 24, 2023
@TParcollet

Collaborator

This PR will ship in today's new release. Perfect timing, haha.

@mravanelli

mravanelli commented Mar 24, 2023 via email

Collaborator

@Adel-Moumen

Collaborator

Yes @mravanelli, I conducted the experiment using Whisper on LibriSpeech.

I was already aware of this bug and had tried a similar approach that only partially addressed it. The current fix fully resolves the problem by tracking which samples in the batch have reached the EOS token and stopping once all of them have.
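
To illustrate the idea for readers, here is a minimal pure-Python sketch of greedy decoding with an early exit. It is not the SpeechBrain implementation: `greedy_decode`, `make_scripted_step`, and the scripted toy "model" are hypothetical names made up for this example, and a real searcher would work on tensors rather than lists.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_decode(
    step_fn: Callable[[int, List[List[int]]], Sequence[int]],
    batch_size: int,
    eos_index: int,
    max_steps: int,
) -> Tuple[List[List[int]], int]:
    """Greedy decoding that stops once every sample has emitted EOS.

    step_fn is a hypothetical stand-in for one decoder step: given the
    step number and the hypotheses so far, it returns one token per sample.
    """
    hyps: List[List[int]] = [[] for _ in range(batch_size)]
    finished = [False] * batch_size  # per-sample "EOS seen" flags
    steps_run = 0

    for t in range(max_steps):
        next_tokens = step_fn(t, hyps)
        steps_run += 1
        for i, tok in enumerate(next_tokens):
            if finished[i]:
                continue  # ignore anything produced after EOS
            if tok == eos_index:
                finished[i] = True
            else:
                hyps[i].append(tok)
        # Early exit: no need to run the remaining steps.
        if all(finished):
            break

    return hyps, steps_run


def make_scripted_step(scripts: List[List[int]], eos_index: int):
    """Toy 'model' that replays fixed token scripts, then emits EOS."""
    def step(t: int, hyps: List[List[int]]) -> List[int]:
        return [s[t] if t < len(s) else eos_index for s in scripts]
    return step


EOS = 0
scripts = [[5, 6, EOS], [7, EOS], [8, 9, 3, EOS]]
hyps, steps = greedy_decode(make_scripted_step(scripts, EOS), 3, EOS, max_steps=50)
print(hyps, steps)
```

In this toy run the loop ends after the longest sample emits EOS, instead of running all 50 allowed steps. The saving grows with the gap between the maximum decoding length and the actual length of the hypotheses.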

@mravanelli

mravanelli commented Mar 24, 2023 via email

Collaborator

@Adel-Moumen

Collaborator

Yep.

Just so everyone knows about this change:

With Whisper Tiny on LibriSpeech test-clean, we went from 41 min 4 s to 2 min 36 s. That is roughly a 16x speedup. Both runs gave the same WER of 7.50.
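
As a quick check on the two wall-clock times quoted above, the ratio can be computed directly. The variable names are only illustrative.

```python
# Wall-clock times reported in the comment above, converted to seconds.
before_s = 41 * 60 + 4   # 41 min 4 s
after_s = 2 * 60 + 36    # 2 min 36 s

speedup = before_s / after_s
print(f"{before_s} s -> {after_s} s, speedup = {speedup:.1f}x")
```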

@Jeronymous Jeronymous deleted the efficient-seq2seq branch March 31, 2023 16:05
4 participants