Thread limit introspection API, part 1: API scope#213
Conversation
There was a problem hiding this comment.
I find it a bit disturbing that calling threadpoolctl.threadpool_info can temporarily mutate the libraries global state in a non-thread safe way.
The alternative would be to hardcode expected scopes in LibController for all known libraries (and return "unknown" otherwise) and only call _determine_api_scope in tests or by passing a non-default kwarg value: threadpool_ctl.threadpool_info(api_scopes="effective") (and api_scopes="expected" would be the default) for instance.
|
Another interesting finding by looking at the CI results: BLIS is |
|
I think it's useful to have it in |
|
I am fine with introducing an extra CLI flag to enable the extra inspection work. |
…ected outcomes across a variety of libraries.
|
My feeling is:
|
| which can be a process-wide change. As such, it is not always thread-safe. | ||
| An attempt will be made to restore all settings to their previous state, | ||
| but the result may be subtly different, e.g. if "unset" has different | ||
| semantics than "set to the default returned value". |
There was a problem hiding this comment.
Thinking about it a bit more, we could make this inspection side effect free by calling this function in an isolated via subprocess.run if we really wanted.
The problem would be to make sure that the same native threadpool libraries that are loaded in the current process at the time of the inspection call are also loaded in the subprocess. To do so we could manually call ctypes.CDLL but that might be a bit brittle.
There was a problem hiding this comment.
Thinking out loud... also relevant to the "should we hardcode this if introspection isn't available" question...
My current thought is that introspection is a useful debugging tool, but not something that would actually be used. Whatever it does, the library is stuck with it. And so I think it's OK if it's just opt-in, mostly for CLI users doing bug reports or when doing performance debugging.
And so something simpler and minimalist seems sufficient.
I may be wrong, it may be that knowing which it is will be helpful. And if so we can maybe go for more elaborate approaches like subprocess later on.
|
|
||
| - "api_scope": When setting the number of threads, what is affected. | ||
| Possible values are "process", "current_thread". | ||
| """ |
There was a problem hiding this comment.
When extra_info is False I would be in favor of retrieving the expected info from a statically hardcoded data based of known semantics for common BLAS and OpenMP implementation and return UNKNOWN otherwise.
Then we can have tests to check that threadpool_info(extra_info=True) always returns the same as threadpool_info(extra_info=False) on all the environments tested by our CI.
There was a problem hiding this comment.
If we do so, we should probably rename extra_info to inspect_scope or something like that.
There was a problem hiding this comment.
See my comments about subprocess implementation elsewhere, I think this is probably unnecessary.
There was a problem hiding this comment.
Basically I am thinking of this for now at least mostly as debugging and bug report tool, rather than something the API will use.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
2e6e276 to
95d77a0
Compare
That's a fun error. Probably due to conda/conda#16327 |
|
I addressed or replied to all review comments. CI is broken for now, will try again later. |
Fixes #211.
Add info to determine the scope of the thread-limit-setting API (thread local, process-wide, unknown). Introspecting the semantics of the thread pool are going to be a separate PR, assuming that can be done reasonably.