
ENH: add _clear_internal_caches function to simplify memory leak debugging #31321

Open
MaartenBaert wants to merge 1 commit into numpy:main from MaartenBaert:clear-internal-caches

@MaartenBaert

Contributor

PR summary

This PR adds a _clear_internal_caches function to the multiarray module, analogous to sys._clear_internal_caches in CPython.
I added this to simplify debugging the memory leak in #31320 and thought it might make sense to submit it as a separate PR. I'm not sure where this function best belongs or which other caches in NumPy may need clearing; I only included the two that showed up as false positives in my own memory leak testing.
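To illustrate why a cache can look like a leak and what clearing it means, here is a minimal sketch of a bucketed free-list cache with a clear function. It is not the code from this PR: every `demo_` name, the bucket count, and the capacity are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen for illustration only. */
#define DEMO_NBUCKETS 4
#define DEMO_BUCKET_CAP 8

typedef struct {
    size_t count;                    /* number of cached blocks in this bucket */
    void *blocks[DEMO_BUCKET_CAP];   /* blocks kept alive for later reuse */
} demo_bucket;

static demo_bucket demo_cache[DEMO_NBUCKETS];

/* Return a block to the cache if there is room, otherwise free it. */
static void
demo_cache_free(void *ptr, size_t bucket)
{
    assert(bucket < DEMO_NBUCKETS);
    demo_bucket *b = &demo_cache[bucket];
    if (b->count < DEMO_BUCKET_CAP) {
        /* The block stays allocated, so a leak checker still sees it. */
        b->blocks[b->count++] = ptr;
        return;
    }
    free(ptr);
}

/* Free every cached block and return how many blocks were released. */
static size_t
demo_clear_internal_caches(void)
{
    size_t released = 0;
    for (size_t i = 0; i < DEMO_NBUCKETS; i++) {
        demo_bucket *b = &demo_cache[i];
        while (b->count > 0) {
            free(b->blocks[--b->count]);
            released++;
        }
    }
    return released;
}
```

Calling the clear function before taking a memory snapshot would remove the cached blocks from the report, which is the kind of false positive described above.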

AI Disclosure

The code was written by Claude Sonnet 4.6.

…gging

This is analogous to sys._clear_internal_caches in cpython
@seberg

seberg commented Apr 28, 2026

Member

We do have a couple more caches like this, or Python object caches. I'm OK with adding such a function if it helps you, especially if you want to use it for testing downstream.
I suppose it would make most sense if it covers most such caches.

If we want to semi-expose it, we should probably import it to a module other than _multiarray_umath and document it very briefly. lib.array_utils doesn't feel quite right, but maybe almost better than some others?

@ngoldbaum do you have a quick thought?

@ngoldbaum

Member

Maybe it would be better to simply disable these caches with an environment variable that gets checked once when NumPy is imported? That would add one extra comparison with an integer in the allocation fast path but I doubt that's a big deal.
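As a rough sketch of that idea (not NumPy's actual allocator), the flag could be read once at initialization and then tested with a single integer comparison on the fast path. The `demo_` names are hypothetical, the cache assumes all blocks have the same size, and treating the value "0" as "disabled" is an assumption, since the semantics of the variable are not specified in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag: 1 = caches enabled (default), 0 = disabled. */
static int demo_use_internal_caches = 1;

/* Interpret the variable's value; NULL (unset) means enabled.
 * Assumption: only the exact string "0" disables the caches. */
static int
demo_parse_cache_flag(const char *value)
{
    if (value == NULL) {
        return 1;
    }
    return strcmp(value, "0") != 0;
}

/* Intended to be called once, e.g. during module initialization. */
static void
demo_init_cache_flag(void)
{
    demo_use_internal_caches =
        demo_parse_cache_flag(getenv("NPY_USE_INTERNAL_CACHES"));
}

/* Toy single-size cache: every block is assumed to have the same size. */
#define DEMO_CAP 8
static void *demo_slots[DEMO_CAP];
static size_t demo_nslots = 0;

static void *
demo_alloc(size_t size)
{
    assert(size > 0);
    /* The extra cost on the fast path is this one integer test. */
    if (demo_use_internal_caches && demo_nslots > 0) {
        return demo_slots[--demo_nslots];
    }
    return malloc(size);
}

static void
demo_free(void *ptr)
{
    if (demo_use_internal_caches && demo_nslots < DEMO_CAP) {
        demo_slots[demo_nslots++] = ptr;
        return;
    }
    free(ptr);  /* cache disabled or full: release immediately */
}
```

With the flag set to 0, every free goes straight to `free`, so a leak checker sees no cached blocks at all.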

We already have a USE_ALLOC_CACHE preprocessor directive:

#ifdef Py_GIL_DISABLED
# define USE_ALLOC_CACHE 0
/*
* The cache makes ASAN use-after-free or MSAN use-of-uninitialized-memory
* warnings less useful.
*/
#elif defined(__has_feature)
# if __has_feature(address_sanitizer) || __has_feature(memory_sanitizer)
# define USE_ALLOC_CACHE 0
# endif
#endif
#ifndef USE_ALLOC_CACHE
# define USE_ALLOC_CACHE 1
#endif

Of course that doesn't help anyone who can't build their own NumPy, but also I'd be surprised if anyone doing this sort of memory leak debugging couldn't build their own NumPy. I'd be fine with adding a more generic NPY_USE_INTERNAL_CACHES environment variable and using that for all the caches that one might want to disable for memory debugging.

@MaartenBaert

Contributor Author

> I'd be fine with adding a more generic NPY_USE_INTERNAL_CACHES environment variable and using that for all the caches that one might want to disable for memory debugging.

That works for me! Though if USE_ALLOC_CACHE is exposed as a compile-time option, that works too.


3 participants