Attempt to fix warning calls with Logger by Adel-Moumen · Pull Request #2932 · speechbrain/speechbrain

Adel-Moumen · 2025-05-31T11:19:21Z

What does this PR do?

Attempt to fix warnings #2046 and #2925.

Unfortunately, all my attempts to fix this problem failed. Each worker seems to call the warning_once method and trigger the message. Even when we try to cache the message. My understanding is that each worker is created a new logger, which doesn't share the same cached message. I decided to fallback to logger.debug to avoid having this messages. And now, thinking about it, it might make more sense. We should only inspect the different between output tokens and target vocab size if we are seeing NaNs.

If we agree to merge this PR, we will need to re-run #2925 to remove the outputs.

Before submitting

Did you read the contributor guideline?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
Review the self-review checklist to ensure the code is ready for review

pplantinga

I tried this using the speaker-id template, and the dataloader still seems to spit out multiple messages, and I wasn't able to figure out why.

Maybe its not worth fixing this bug, as it seems a bit complicated and maybe not worth it for the couple of errors we have in the dataloader?

pplantinga · 2025-05-31T22:39:18Z

 )
 from speechbrain.utils.logger import get_logger

+warnings.simplefilter("once")


I'm not sure this is something we should be changing globally, because this will change the warning behavior of everyone who imports speechbrain. With the lazy loading, it might not matter as much, but it would be a very weird and hard to debug error if someone's warning behavior changed from importing speechbrain and then further lazy loading the dataio encoder.

Plus, the warnings module is never used below so it doesn't even have an effect.

pplantinga · 2025-05-31T23:08:37Z

                )
        else:
-            logger.warning_once(
+            logger.debug(


This avoids spamming the console and pushes the spam to the log, but now doesn't print on the console at all, which I'm not sure we want. The underlying issue is still there, if I change this to "warning_once" again, then it still prints a bunch of times. It seems as though the multi-threading of the dataloader is creating new loggers each time.

pplantinga · 2025-05-31T23:11:54Z

-        This method is decorated with `functools.lru_cache(None)`, ensuring that the warning
-        message is logged only once regardless of how many times the method is called.
+        This method uses a set to track issued warnings, ensuring that the same warning is not logged multiple times.
+        This is a workaround for the fact that `functools.lru_cache(None)` is not thread-safe.


This approach still doesn't seem thread-safe, or at least when I tried it with the tutorial it still print multiple messages.

Adel-Moumen · 2025-06-03T08:52:29Z

I tried this using the speaker-id template, and the dataloader still seems to spit out multiple messages, and I wasn't able to figure out why.

Maybe its not worth fixing this bug, as it seems a bit complicated and maybe not worth it for the couple of errors we have in the dataloader?

Yeah. I don'tthink its worth it. I can try to revisit this PR latter on as I am bit surprised by how difficult it is to solve this problem haha.

Adel-Moumen added 10 commits May 31, 2025 12:10

fix warnings

93c6373

only main proc

8baeb77

test

f94aff2

remove test

9d9c45d

remove dynamic message

b93383e

switch to warnings.warn

3e221ec

warnings once

46d2664

switch to debug mode

9f8e199

remove unused var

67f44c2

precommit

aa0d6a0

Adel-Moumen marked this pull request as ready for review May 31, 2025 13:28

Adel-Moumen requested a review from pplantinga May 31, 2025 13:28

pplantinga reviewed May 31, 2025

View reviewed changes

Adel-Moumen closed this Jun 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Attempt to fix warning calls with Logger#2932

Attempt to fix warning calls with Logger#2932
Adel-Moumen wants to merge 10 commits into
speechbrain:developfrom
Adel-Moumen:fix_warning_calls

Adel-Moumen commented May 31, 2025 •

edited

Loading

Uh oh!

pplantinga left a comment

Uh oh!

pplantinga May 31, 2025

Uh oh!

pplantinga May 31, 2025

Uh oh!

pplantinga May 31, 2025

Uh oh!

pplantinga May 31, 2025

Uh oh!

Adel-Moumen commented Jun 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Adel-Moumen commented May 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

PR review

Uh oh!

pplantinga left a comment

Choose a reason for hiding this comment

Uh oh!

pplantinga May 31, 2025

Choose a reason for hiding this comment

Uh oh!

pplantinga May 31, 2025

Choose a reason for hiding this comment

Uh oh!

pplantinga May 31, 2025

Choose a reason for hiding this comment

Uh oh!

pplantinga May 31, 2025

Choose a reason for hiding this comment

Uh oh!

Adel-Moumen commented Jun 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Adel-Moumen commented May 31, 2025 •

edited

Loading