[rocprofiler-compute] Add backward compatible live attach support for ROCm 7.2#4929
[rocprofiler-compute] Add backward compatible live attach support for ROCm 7.2#4929vedithal-amd wants to merge 1 commit intodevelopfrom
Conversation
… older ROCm - Add fallback from librocprofiler-sdk-rocattach.so to librocprofv3-attach.so when the new library is not available - Fail early with a clear error if no live attach library is found - Invert API call order to try new API (rocattach_attach/rocattach_detach) first, falling back to old API (attach/detach) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feizheng10
left a comment
There was a problem hiding this comment.
Do we need update changlog for this?
There was a problem hiding this comment.
Pull request overview
Adds backward-compatible live attach support for ROCm 7.2 by resolving the attach library with a fallback to the older attach library, and preferring the newer rocattach API while retaining the legacy attach/detach API path.
Changes:
- Add attach-library resolution fallback:
librocprofiler-sdk-rocattach.so→librocprofv3-attach.so, with early error if neither exists. - Invert live attach call order to try
rocattach_attach/rocattach_detachfirst, then fall back toattach/detach. - Improve logging for live attach API selection by adding
console_debugusage.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| projects/rocprofiler-compute/src/utils/utils_common.py | Prefer new rocattach API for live attach/detach with fallback to legacy API. |
| projects/rocprofiler-compute/src/rocprof_compute_profile/profiler_rocprofiler_sdk.py | Add attach library resolution fallback for ROCm 7.2 installs missing the new rocattach library. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # old live attach API | ||
| c_lib.attach.argtypes = [ctypes.c_uint] |
There was a problem hiding this comment.
In the old live attach API fallback you only set c_lib.attach.argtypes, but later you call c_lib.detach(int(pid)) as well. For ctypes calls, leaving detach (and attach’s restype) untyped can lead to incorrect argument/return conversions and makes the two call paths inconsistent. Please set c_lib.detach.argtypes (and consider restype) alongside attach when using the old API.
| # old live attach API | |
| c_lib.attach.argtypes = [ctypes.c_uint] | |
| # old live attach API | |
| c_lib.attach.restype = ctypes.c_int | |
| c_lib.attach.argtypes = [ctypes.c_uint] | |
| c_lib.detach.restype = ctypes.c_int | |
| c_lib.detach.argtypes = [ctypes.c_uint] |
| pid = options["ROCPROF_ATTACH_PID"] | ||
| if pid is None: | ||
| console_error("Mode of attach/detach must have setup for process ID") | ||
| console_error("Mode of live attach must have setup for process ID") |
There was a problem hiding this comment.
The error text is grammatically unclear: "Mode of live attach must have setup for process ID". Consider rephrasing to something like "Live attach mode requires a process ID (ROCPROF_ATTACH_PID)" so users immediately know what to set.
| console_error("Mode of live attach must have setup for process ID") | |
| console_error( | |
| "Live attach mode requires a process ID (ROCPROF_ATTACH_PID)" | |
| ) |
| rocprofiler_attach_library_path = resolve_rocm_library_path( | ||
| str( | ||
| Path(args.rocprofiler_sdk_tool_path).parent.parent | ||
| / "librocprofiler-sdk-rocattach.so" | ||
| ) | ||
| ) | ||
| if not Path(rocprofiler_attach_library_path).exists(): | ||
| console_debug( | ||
| "New live attach library not found, trying old live attach library" | ||
| ) | ||
| rocprofiler_attach_library_path = resolve_rocm_library_path( | ||
| str( | ||
| Path(args.rocprofiler_sdk_tool_path).parent | ||
| / "librocprofv3-attach.so" | ||
| ) | ||
| ) | ||
| if not Path(rocprofiler_attach_library_path).exists(): | ||
| console_error("No live attach library found.") |
There was a problem hiding this comment.
The failure message "No live attach library found." doesn’t say which paths were tried, which makes diagnosing packaging/layout issues harder. Consider including the resolved candidate paths (new and old) and/or hinting that --rocprofiler-sdk-tool-path / ROCM_PATH controls the search base.
| rocprofiler_attach_library_path = resolve_rocm_library_path( | |
| str( | |
| Path(args.rocprofiler_sdk_tool_path).parent.parent | |
| / "librocprofiler-sdk-rocattach.so" | |
| ) | |
| ) | |
| if not Path(rocprofiler_attach_library_path).exists(): | |
| console_debug( | |
| "New live attach library not found, trying old live attach library" | |
| ) | |
| rocprofiler_attach_library_path = resolve_rocm_library_path( | |
| str( | |
| Path(args.rocprofiler_sdk_tool_path).parent | |
| / "librocprofv3-attach.so" | |
| ) | |
| ) | |
| if not Path(rocprofiler_attach_library_path).exists(): | |
| console_error("No live attach library found.") | |
| new_attach_library_path = resolve_rocm_library_path( | |
| str( | |
| Path(args.rocprofiler_sdk_tool_path).parent.parent | |
| / "librocprofiler-sdk-rocattach.so" | |
| ) | |
| ) | |
| old_attach_library_path = resolve_rocm_library_path( | |
| str( | |
| Path(args.rocprofiler_sdk_tool_path).parent | |
| / "librocprofv3-attach.so" | |
| ) | |
| ) | |
| rocprofiler_attach_library_path = new_attach_library_path | |
| if not Path(rocprofiler_attach_library_path).exists(): | |
| console_debug( | |
| "New live attach library not found, trying old live attach library" | |
| ) | |
| rocprofiler_attach_library_path = old_attach_library_path | |
| if not Path(rocprofiler_attach_library_path).exists(): | |
| console_error( | |
| "No live attach library found. Tried resolved paths: " | |
| f"new='{new_attach_library_path}', " | |
| f"old='{old_attach_library_path}'. " | |
| "Check --rocprofiler-sdk-tool-path and/or ROCM_PATH." | |
| ) |
| # Try new live attach library first, fall back to old library | ||
| rocprofiler_attach_library_path = resolve_rocm_library_path( | ||
| str( | ||
| Path(args.rocprofiler_sdk_tool_path).parent.parent | ||
| / "librocprofiler-sdk-rocattach.so" | ||
| ) | ||
| ) | ||
| if not Path(rocprofiler_attach_library_path).exists(): | ||
| console_debug( | ||
| "New live attach library not found, trying old live attach library" | ||
| ) | ||
| rocprofiler_attach_library_path = resolve_rocm_library_path( | ||
| str( | ||
| Path(args.rocprofiler_sdk_tool_path).parent | ||
| / "librocprofv3-attach.so" | ||
| ) | ||
| ) | ||
| if not Path(rocprofiler_attach_library_path).exists(): | ||
| console_error("No live attach library found.") |
There was a problem hiding this comment.
This adds important fallback behavior (new attach library -> old attach library) but there’s no unit test coverage validating the selection logic (e.g., new missing/old present, both missing => error). Since get_profiler_options() already has unit tests in this repo, please add a focused test that mocks Path.exists()/resolve_rocm_library_path() to exercise these branches.
| ) | ||
| if not Path(rocprofiler_attach_library_path).exists(): | ||
| console_debug( | ||
| "New live attach library not found, trying old live attach library" |
There was a problem hiding this comment.
| "New live attach library not found, trying old live attach library" | |
| "Latest live attach library not found, searching for legacy live attach library" |
Replacement of word "new" to not mislead users if this comment does not change any time soon and "new" is not really..."new" anymore. Also word swap for old -> legacy, optional substitution.
| console_debug( | ||
| "Error setting old attach/detach API argument " | ||
| f"types: {e}, trying new API" | ||
| f"Error setting new live attach API argument types: {e}, trying old API" |
There was a problem hiding this comment.
| f"Error setting new live attach API argument types: {e}, trying old API" | |
| f"Error setting new live attach API argument types: {e}, trying legacy live attach API" |
| ) | ||
| except Exception as e: | ||
| console_debug(f"Error attaching with old API: {e}, trying new API") | ||
| console_debug(f"Error attaching with new API: {e}, trying old API") |
There was a problem hiding this comment.
| console_debug(f"Error attaching with new API: {e}, trying old API") | |
| console_debug(f"Error attaching with latest live attach API: {e}, trying legacy live attach API") |
| ) | ||
| except Exception as e: | ||
| console_debug(f"Error detaching with old API: {e}, trying new API") | ||
| console_debug(f"Error detaching with new API: {e}, trying old API") |
There was a problem hiding this comment.
| console_debug(f"Error detaching with new API: {e}, trying old API") | |
| console_debug(f"Error detaching with latest live attach API: {e}, trying detach with legacy live attach API") |
Motivation
Live attach profiling (
--attach-pid) fails on ROCm 7.2 wherelibrocprofiler-sdk-rocattach.sois not available. Add library resolution fallback and prefer the newer API while retaining backward compatibility.Technical Details
librocprofiler-sdk-rocattach.sotolibrocprofv3-attach.sowhen the new library is not availableperform_attach_detach()to try new API (rocattach_attach/rocattach_detach) first, falling back to old API (attach/detach)JIRA ID
Test Plan
rocprof-compute profile --attach-pid <PID>on a system withlibrocprofiler-sdk-rocattach.so(new library) and verify it uses the new APIrocprof-compute profile --attach-pid <PID>on a system with onlylibrocprofv3-attach.so(old library) and verify it falls back correctlyTest Result
Manual tests have passed.
Submission Checklist