Attempt to fix warning calls with Logger#2932
Conversation
pplantinga
left a comment
There was a problem hiding this comment.
I tried this using the speaker-id template, and the dataloader still seems to spit out multiple messages, and I wasn't able to figure out why.
Maybe its not worth fixing this bug, as it seems a bit complicated and maybe not worth it for the couple of errors we have in the dataloader?
| ) | ||
| from speechbrain.utils.logger import get_logger | ||
|
|
||
| warnings.simplefilter("once") |
There was a problem hiding this comment.
I'm not sure this is something we should be changing globally, because this will change the warning behavior of everyone who imports speechbrain. With the lazy loading, it might not matter as much, but it would be a very weird and hard to debug error if someone's warning behavior changed from importing speechbrain and then further lazy loading the dataio encoder.
There was a problem hiding this comment.
Plus, the warnings module is never used below so it doesn't even have an effect.
| ) | ||
| else: | ||
| logger.warning_once( | ||
| logger.debug( |
There was a problem hiding this comment.
This avoids spamming the console and pushes the spam to the log, but now doesn't print on the console at all, which I'm not sure we want. The underlying issue is still there, if I change this to "warning_once" again, then it still prints a bunch of times. It seems as though the multi-threading of the dataloader is creating new loggers each time.
| This method is decorated with `functools.lru_cache(None)`, ensuring that the warning | ||
| message is logged only once regardless of how many times the method is called. | ||
| This method uses a set to track issued warnings, ensuring that the same warning is not logged multiple times. | ||
| This is a workaround for the fact that `functools.lru_cache(None)` is not thread-safe. |
There was a problem hiding this comment.
This approach still doesn't seem thread-safe, or at least when I tried it with the tutorial it still print multiple messages.
Yeah. I don'tthink its worth it. I can try to revisit this PR latter on as I am bit surprised by how difficult it is to solve this problem haha. |
What does this PR do?
Attempt to fix warnings #2046 and #2925.
Unfortunately, all my attempts to fix this problem failed. Each worker seems to call the
warning_oncemethod and trigger the message. Even when we try to cache the message. My understanding is that each worker is created a new logger, which doesn't share the same cached message. I decided to fallback tologger.debugto avoid having this messages. And now, thinking about it, it might make more sense. We should only inspect the different between output tokens and target vocab size if we are seeing NaNs.If we agree to merge this PR, we will need to re-run #2925 to remove the outputs.
Before submitting
PR review
Reviewer checklist