[Cpp API Compatibility] Remove deepep legacy APIs #78549
SigureMo merged 11 commits into PaddlePaddle:develop
Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Pull request overview
This PR continues the C++ API compatibility cleanup by removing DeepEP-specific legacy stream/event APIs from the compat headers, aiming to converge on a PyTorch-shaped surface.
Changes:
- Tighten CUDA stream pool initialization by adding error checking, and align the `getStreamFromPool` overload shape to support an `int priority` entry point.
- Remove deprecated legacy APIs that exposed raw CUDA streams (`CUDAStream::raw_stream()`, `Event::record(cudaStream_t)`, and `Tensor::record_stream(cudaStream_t / at::cuda::CUDAStream)`).
- Expose `c10::Stream` into `namespace at` to support `at::Stream` usage.
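The overload shape described in the first bullet can be sketched without CUDA. The following is a minimal illustration, not the code from this PR: `MockStream`, the `sketch` namespace, and the fixed priority constants are all hypothetical stand-ins, and the real header is expected to query the priority range from the CUDA runtime (with the result checked via `C10_CUDA_CHECK`) rather than hard-code it.

```cpp
#include <algorithm>

namespace sketch {

// Hypothetical stand-in for a pooled stream; it only records the chosen priority.
struct MockStream {
  int priority;
};

// Assumed priority range for this sketch. In CUDA, lower numbers mean higher
// priority, and the real range is reported by the runtime at initialization.
constexpr int kLowestPriority = 0;    // default priority
constexpr int kHighestPriority = -1;  // most urgent priority (assumption)

// Primary entry point: takes an integer priority and clamps it into range.
inline MockStream getStreamFromPool(int priority) {
  const int clamped =
      std::max(kHighestPriority, std::min(priority, kLowestPriority));
  return MockStream{clamped};
}

// Convenience overload: maps the boolean flag onto the int-priority entry point.
inline MockStream getStreamFromPool(bool isHighPriority) {
  return getStreamFromPool(isHighPriority ? kHighestPriority : kLowestPriority);
}

}  // namespace sketch
```

With this shape, `sketch::getStreamFromPool(true)` and `sketch::getStreamFromPool(-1)` select the same priority, so callers that pass a flag and callers that pass a number share one code path.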
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| paddle/phi/api/include/compat/c10/cuda/CUDAStream.h | Adds C10_CUDA_CHECK for priority range init; adds priority-based getStreamFromPool overload and maps bool overload to it. |
| paddle/phi/api/include/compat/c10/core/Stream.h | Introduces at::Stream aliasing/export of c10::Stream. |
| paddle/phi/api/include/compat/c10/core/Event.h | Removes legacy record(cudaStream_t) overload. |
| paddle/phi/api/include/compat/ATen/ops/record_stream.h | Removes legacy record_stream overloads taking at::cuda::CUDAStream and cudaStream_t. |
| paddle/phi/api/include/compat/ATen/core/TensorBody.h | Removes corresponding legacy record_stream declarations from Tensor. |
    namespace at {
    using c10::Stream;
    }
Adding `namespace at { using c10::Stream; }` here conflicts with an existing `at::Stream` definition in `paddle/phi/api/include/compat/ATen/core/TensorBody.h` (currently `using Stream = c10::Stream;`). Since `TensorBody.h` includes `<c10/core/Stream.h>`, this will trigger a redefinition error when compiling any translation unit that includes `ATen/core/TensorBody.h`.
To fix: keep a single canonical `at::Stream` definition (either remove the `Stream` alias from `TensorBody.h`, or drop this new `at::Stream` export and rely on the existing one).
Suggested change (remove these lines):

    namespace at {
    using c10::Stream;
    }
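One way to follow the reviewer's advice is to export `at::Stream` in exactly one header and have every other header include it. The sketch below assumes a simplified, hypothetical `c10::Stream` (the real class carries device and stream-id information) and collapses the headers into a single file for illustration; `stream_id` is a made-up helper.

```cpp
#include <type_traits>

// Hypothetical stand-in for the contents of <c10/core/Stream.h>.
namespace c10 {
class Stream {
 public:
  explicit Stream(int id) : id_(id) {}
  int id() const { return id_; }

 private:
  int id_;
};
}  // namespace c10

// The single canonical export. Other headers would include this one
// instead of declaring their own at::Stream alias.
namespace at {
using c10::Stream;
}  // namespace at

// Both spellings must name the same type.
static_assert(std::is_same<at::Stream, c10::Stream>::value,
              "at::Stream must be the same type as c10::Stream");

// Code written against at::Stream accepts a c10::Stream unchanged.
inline int stream_id(const at::Stream& s) { return s.id(); }
```

Keeping the export in one place means the choice between a using-declaration and an alias is made once, and no second header can disagree with it.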
/re-run all-failed
/re-run all-failed
dc7610a to f71e518
/re-run all-failed
Codecov Report
✅ All modified and coverable lines are covered by tests.

| | develop | #78549 | +/- |
|---|---|---|---|
| Coverage | ? | 100.00% | |
| Files | ? | 1 | |
| Lines | ? | 6 | |
| Branches | ? | 0 | |
| Hits | ? | 6 | |
| Misses | ? | 0 | |
| Partials | ? | 0 | |
f71e518 to 4f6abb2
I merged develop to confirm that, now that #78389 has landed, CI can monitor this PR's changes; this CI run should fail. Once PaddlePaddle/PaddleFleet#712 is merged, rerun, and CI should pass again.
OK, it failed as expected. The conflicts can be resolved now.
/re-run all-failed
https://github.com/PaddlePaddle/Paddle/actions/runs/24225463645/job/70726471802?pr=78631
It did fail. I'll see whether we can first merge a PR that skips the test on Mac-CPU.
/re-run all-failed
Last time I checked out DeepGEMM for PaddleFleet. Let's check out DeepEP once PFCCLab/DeepEP#11 is merged.
24247eb to 87acb07
/re-run all-failed
87acb07 to c4e8072
/re-run all-failed
…de and may cause confusion
c4e8072 to 0a92651
This reverts commit 0a92651.
/re-run all-failed
4b86d21 to ae1fdd3
/re-run all-failed
/re-run all-failed
/re-run all-failed
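Has it been confirmed that this no longer affects the Hybrid EP build? Also, does the DeepEP build currently depend on any of the symbols touched here? Are there any ABI incompatibilities?
I haven't verified the hybrid-ep build locally yet. deep-ep built fine before; I'll build it again to verify. The ABI compatibility check passed at 0a92651.
OK. Once Hybrid EP is verified, this can be merged. Hybrid EP was merged before the holiday but was temporarily reverted because of a Paddle CI problem. #78846 fixed that, and it will be merged again after the holiday.
Strange, the DeepEP changes were applied to Hybrid EP as well. Doesn't Hybrid EP report the same error? This should be solvable by setting
That should work. The local hybrid-ep build uses a prepared build.sh script that already sets PADDLE_CUDA_ARCH_LIST. deep-ep did not detect the PADDLE_CUDA_ARCH_LIST environment variable, so it inferred the architectures automatically.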
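The behavior described in the last comment (use `PADDLE_CUDA_ARCH_LIST` when it is set, otherwise infer) can be sketched as follows. This is only an illustration of the fallback logic: the actual DeepEP and Hybrid EP build scripts are not shown in this thread, the helper names `resolveArchList` and `archListFromEnvironment` are hypothetical, and the value format of the variable is not specified here, so the string is passed through unparsed.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: prefer an explicit value, otherwise use the inferred one.
// `env_value` is the raw result of a getenv call and may be null or empty.
inline std::string resolveArchList(const char* env_value,
                                   const std::string& inferred) {
  if (env_value != nullptr && env_value[0] != '\0') {
    return std::string(env_value);  // explicit setting wins
  }
  return inferred;  // variable unset or empty: fall back to inference
}

// Convenience wrapper that reads the variable named in the discussion.
inline std::string archListFromEnvironment(const std::string& inferred) {
  return resolveArchList(std::getenv("PADDLE_CUDA_ARCH_LIST"), inferred);
}
```

Separating the decision from the `getenv` call keeps the fallback testable without modifying the process environment.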

PR Category
Execute Infrastructure
PR Types
Deprecations
Description
Split from #78484
Does this change affect precision?
No