Thread limit introspection API, part 1: API scope #213

Open
itamarst wants to merge 17 commits into joblib:master from itamarst:211-thread-limit-introspection-api
@itamarst

Contributor

Fixes #211.

Add info to determine the scope of the thread-limit-setting API (thread-local, process-wide, or unknown). Introspecting the semantics of the thread pool is going to be a separate PR, assuming that can be done reasonably.

@ogrisel ogrisel left a comment

Contributor

I find it a bit disturbing that calling threadpoolctl.threadpool_info can temporarily mutate the libraries' global state in a non-thread-safe way.

The alternative would be to hardcode the expected scopes in LibController for all known libraries (returning "unknown" otherwise) and to call _determine_api_scope only in tests or when a non-default kwarg value is passed: for instance threadpoolctl.threadpool_info(api_scopes="effective"), with api_scopes="expected" as the default.
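A minimal sketch of that alternative, to make the proposal concrete. Everything here is hypothetical: `EXPECTED_API_SCOPES`, `api_scope` and the `probe` callable are illustrative names, not threadpoolctl's API, and the only table entry taken from this thread is BLIS (the other one is a placeholder).

```python
# Hypothetical sketch: none of these names are threadpoolctl's actual API.
EXPECTED_API_SCOPES = {
    # Reported later in this thread from the CI results.
    "blis": "current_thread",
    # Placeholder entry for illustration only; real entries would have to be
    # verified for each library.
    "example_blas": "process",
}


def api_scope(internal_api, api_scopes="expected", probe=None):
    """Return the scope of the thread-limit API for one library.

    "expected" consults the static table and has no side effects.
    "effective" calls ``probe``, which may temporarily mutate global state.
    """
    if api_scopes == "expected":
        return EXPECTED_API_SCOPES.get(internal_api, "unknown")
    if api_scopes == "effective":
        if probe is None:
            raise ValueError("a probe callable is required for 'effective'")
        return probe()
    raise ValueError(f"unknown api_scopes value: {api_scopes!r}")
```

With this shape, the default call path never touches native library state; only callers that explicitly ask for "effective" pay the side-effect risk.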

@ogrisel

ogrisel commented Jun 26, 2026

Contributor

Another interesting finding from the CI results: BLIS is current_thread-scoped, contrary to other BLAS implementations.

@itamarst

Contributor Author

I think it's useful to have it in python -m threadpoolctl for debugging and user bug reports, so maybe have the side-effect-y introspection gated behind a keyword argument to info()? Or it can just be some add-on processing that the command-line version does using lower-level functions.

@ogrisel

ogrisel commented Jun 30, 2026

Contributor

I am fine with introducing an extra CLI flag to enable the extra inspection work.

@itamarst

Contributor Author

My feeling is:

  • As you said, it shouldn't be on by default in the Python API because of the risk of side effects.
  • It should be on by default in the CLI: it won't actually run any user code and it's just for debugging, so it may as well give all the available info.
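The two defaults above can be sketched as follows. This is only an illustration under assumptions: the `--inspect-scope` flag, the `inspect_scope` keyword and `threadpool_info_sketch` are made-up names, not the options the PR or the real CLI defines.

```python
import argparse


def threadpool_info_sketch(inspect_scope=False):
    """Stand-in for the Python API: inspection is opt-in (default False)."""
    return {"inspect_scope": inspect_scope}


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m threadpoolctl")
    # Hypothetical flag: on by default in the CLI, with --no-inspect-scope
    # to turn it off. BooleanOptionalAction requires Python 3.9+.
    parser.add_argument(
        "--inspect-scope",
        action=argparse.BooleanOptionalAction,
        default=True,
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return threadpool_info_sketch(inspect_scope=args.inspect_scope)
```

The asymmetry lives entirely in the defaults: the library function stays side-effect free unless asked, while the command-line entry point opts in on the user's behalf.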

@itamarst itamarst marked this pull request as ready for review June 30, 2026 20:25
Comment thread threadpoolctl.py
which can be a process-wide change. As such, it is not always thread-safe.
An attempt will be made to restore all settings to their previous state,
but the result may be subtly different, e.g. if "unset" has different
semantics than "set to the default returned value".
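To illustrate why the docstring warns about thread safety, here is a rough sketch of how such a probe could work: change the limit in one thread, observe it from another, then restore it. This is an assumption about the general technique, not the PR's actual `_determine_api_scope` code; the two fake controllers are pure-Python stand-ins for native libraries.

```python
import threading


def _read_in_new_thread(getter):
    """Call ``getter`` in a fresh thread and return what it saw."""
    result = []
    worker = threading.Thread(target=lambda: result.append(getter()))
    worker.start()
    worker.join()
    return result[0]


def determine_api_scope(get_num_threads, set_num_threads):
    """Probe whether setting the limit here is visible from another thread."""
    original = get_num_threads()
    probe_value = 1 if original != 1 else 2
    before = _read_in_new_thread(get_num_threads)
    set_num_threads(probe_value)  # Temporarily mutates library state.
    try:
        after = _read_in_new_thread(get_num_threads)
    finally:
        # Best-effort restore; "set back to the old value" may not be
        # identical to "never set" in a real library.
        set_num_threads(original)
    return "process" if after != before else "current_thread"


class ProcessWideFake:
    """Fake library whose limit is shared by all threads."""

    def __init__(self):
        self._n = 4

    def get(self):
        return self._n

    def set(self, n):
        self._n = n


class ThreadLocalFake:
    """Fake library whose limit is stored per thread."""

    def __init__(self):
        self._local = threading.local()

    def get(self):
        return getattr(self._local, "n", 4)

    def set(self, n):
        self._local.n = n
```

Between the `set_num_threads(probe_value)` call and the restore, any other thread using a process-scoped library would see the probe value, which is the side effect being discussed.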

Contributor

Thinking about it a bit more, we could make this inspection side-effect free by calling this function in an isolated subprocess via subprocess.run if we really wanted.

The problem would be to make sure that the same native threadpool libraries that are loaded in the current process at the time of the inspection call are also loaded in the subprocess. To do so, we could manually call ctypes.CDLL, but that might be a bit brittle.
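A rough sketch of the loading half of that idea, assuming the parent already knows the file paths of its loaded libraries. The function name and child script are hypothetical, and the scope probing itself is left out; the child only reports which paths `ctypes.CDLL` managed to load.

```python
import json
import subprocess
import sys

# Script run in the child interpreter: try to load each path given on the
# command line and report success or failure as JSON on stdout.
_CHILD_SCRIPT = """
import ctypes, json, sys

results = {}
for path in sys.argv[1:]:
    try:
        ctypes.CDLL(path)
        results[path] = True
    except OSError:
        results[path] = False
print(json.dumps(results))
"""


def load_libraries_in_subprocess(library_paths):
    """Load the given shared libraries in a fresh Python process.

    Nothing is loaded or mutated in the calling process.
    """
    completed = subprocess.run(
        [sys.executable, "-c", _CHILD_SCRIPT, *library_paths],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)
```

The brittleness mentioned above shows up here: a library that loads fine as a dependency of an extension module in the parent may fail to load, or resolve to different dependencies, when opened directly by path in the child.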

Contributor Author

Thinking out loud... also relevant to the "should we hardcode this if introspection isn't available" question...

My current thought is that introspection is a useful debugging tool, but not something the API would actually use. Whatever a given native library does, threadpoolctl is stuck with it. And so I think it's OK if it's just opt-in, mostly for CLI users filing bug reports or doing performance debugging.

And so something simpler and minimalist seems sufficient.

I may be wrong; it may be that knowing which scope applies will turn out to be helpful. If so, we can go for more elaborate approaches like the subprocess one later on.

Comment thread threadpoolctl.py

- "api_scope": When setting the number of threads, what is affected.
Possible values are "process", "current_thread".
"""

Contributor

When extra_info is False, I would be in favor of retrieving the expected info from statically hardcoded data based on the known semantics of common BLAS and OpenMP implementations, and returning UNKNOWN otherwise.

Then we can have tests to check that threadpool_info(extra_info=True) always returns the same as threadpool_info(extra_info=False) on all the environments tested by our CI.
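Such a consistency check could look roughly like this. It is a sketch under assumptions: `find_scope_mismatches` is a hypothetical helper, the `extra_info` keyword and `"api_scope"` key follow the names used in this PR discussion, and the fake info function stands in for the real one so the example runs anywhere.

```python
def find_scope_mismatches(threadpool_info):
    """Compare static and inspected scopes, keyed by library file path.

    Returns a dict mapping each disagreeing path to a
    (static_scope, inspected_scope) pair; empty when everything agrees.
    """
    static = {
        module["filepath"]: module["api_scope"]
        for module in threadpool_info(extra_info=False)
    }
    inspected = {
        module["filepath"]: module["api_scope"]
        for module in threadpool_info(extra_info=True)
    }
    return {
        path: (static.get(path), inspected.get(path))
        for path in static.keys() | inspected.keys()
        if static.get(path) != inspected.get(path)
    }


def fake_threadpool_info(extra_info=False):
    """Stand-in returning the same scope in both modes."""
    return [
        {
            "filepath": "/fake/libblis.so",
            "internal_api": "blis",
            "api_scope": "current_thread",
        }
    ]
```

In a CI test, the assertion would simply be that `find_scope_mismatches(...)` returns an empty dict on every tested environment.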

Contributor

If we do so, we should probably rename extra_info to inspect_scope or something like that.

@itamarst itamarst Jul 1, 2026

Contributor Author

See my comments about the subprocess implementation elsewhere; I think this is probably unnecessary.

Contributor Author

Basically, for now at least, I am thinking of this mostly as a debugging and bug-report tool, rather than something the API will use.

@itamarst itamarst force-pushed the 211-thread-limit-introspection-api branch from 2e6e276 to 95d77a0 July 1, 2026 15:23
@itamarst

itamarst commented Jul 1, 2026

Contributor Author

++ /home/runner/miniconda3/bin/conda shell.posix reactivate
Traceback (most recent call last):
  File "/home/runner/miniconda3/bin/conda", line 12, in <module>
    from conda.cli import main
ModuleNotFoundError: No module named 'conda'

That's a fun error.

Probably due to conda/conda#16327

@itamarst

itamarst commented Jul 1, 2026

Contributor Author

I addressed or replied to all review comments. CI is broken for now, will try again later.

Development

Successfully merging this pull request may close these issues.

API to introspect the scope of what thread pool limit set/get API actually does
