Fix ROCm discovery on Ubuntu 26 #934

Draft
luraess wants to merge 2 commits into main from lr/ub26
luraess commented Jun 19, 2026

Member

Ubuntu 26 ships ROCm as an apt package that installs into a location AMDGPU.jl does not yet expect, and with modified LLD naming, which causes the issues reported in #920 (comment).

luraess commented Jun 19, 2026

Member Author

Ubuntu 26.04 installs ROCm 7.1.0 into /usr/lib/x86_64-linux-gnu/ (Debian multiarch) rather than /opt/rocm/, causing all ROCm components to be reported as unavailable.

Two fixes:

  • check_rocm_path: probe Debian/Ubuntu multiarch subdirectories (lib/<triplet>) so check_rocm_path("/usr") resolves correctly on Ubuntu 26.04.
  • find_ld_lld: replace split-by-position version parsing with a regex, fixing a warning caused by Ubuntu's lld emitting "Ubuntu LLD 21.0.0 ..." instead of "AMD LLD X.Y.Z ...".
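
The two fixes above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the AMDGPU.jl implementation: the helper names (`candidate_lib_dirs`, `find_library`, `parse_lld_version`) and the triplet list are hypothetical, and the only version-string shapes assumed are the two quoted in the description.

```python
import os
import re

# Hypothetical triplet list; x86_64-linux-gnu is the only one named in this PR.
MULTIARCH_TRIPLETS = ("x86_64-linux-gnu",)


def candidate_lib_dirs(prefix):
    """Return library directories to probe under `prefix`.

    The plain `lib` layout comes first, followed by the Debian/Ubuntu
    multiarch layout `lib/<triplet>`.
    """
    dirs = [os.path.join(prefix, "lib")]
    dirs += [os.path.join(prefix, "lib", t) for t in MULTIARCH_TRIPLETS]
    return dirs


def find_library(prefix, name):
    """Return the first existing path to `name` under `prefix`, or None."""
    for d in candidate_lib_dirs(prefix):
        path = os.path.join(d, name)
        if os.path.isfile(path):
            return path
    return None


# Anchor on the literal "LLD" token instead of a word position, so the
# vendor prefix ("AMD", "Ubuntu", ...) does not affect parsing.
_LLD_VERSION = re.compile(r"LLD (\d+)\.(\d+)\.(\d+)")


def parse_lld_version(output):
    """Extract (major, minor, patch) from an lld version string, or None."""
    m = _LLD_VERSION.search(output)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())
```

With this sketch, `parse_lld_version("Ubuntu LLD 21.0.0")` yields `(21, 0, 0)`, and a string of the form "AMD LLD X.Y.Z" parses the same way. Returning `None` for unrecognized output leaves the decision to warn or fall back to the caller.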

github-actions bot left a comment

Contributor

AMDGPU.jl Benchmarks

| Benchmark suite | Current: 5da2d37 | Previous: 756602c | Ratio |
|---|---|---|---|
| amdgpu/synchronization/context/device | 670 ns | 600 ns | 1.12 |
| amdgpu/synchronization/stream/blocking | 270 ns | 240 ns | 1.13 |
| amdgpu/synchronization/stream/nonblocking | 390 ns | 340 ns | 1.15 |
| array/accumulate/Float32/1d | 86161 ns | 86251 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 405695 ns | 393845 ns | 1.03 |
| array/accumulate/Float32/dims=1L | 135282 ns | 131681 ns | 1.03 |
| array/accumulate/Float32/dims=2 | 127641 ns | 103022 ns | 1.24 |
| array/accumulate/Float32/dims=2L | 2805630 ns | 2827930 ns | 0.99 |
| array/accumulate/Int64/1d | 101512 ns | 96412 ns | 1.05 |
| array/accumulate/Int64/dims=1 | 287204 ns | 285244 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 166992 ns | 160812 ns | 1.04 |
| array/accumulate/Int64/dims=2 | 127752 ns | 120772 ns | 1.06 |
| array/accumulate/Int64/dims=2L | 2984092 ns | 3014433 ns | 0.99 |
| array/broadcast | 55101 ns | 128932 ns | 0.43 |
| array/construct | 1650 ns | 1680 ns | 0.98 |
| array/copy | 40031 ns | 39371 ns | 1.02 |
| array/copyto!/cpu_to_gpu | 114432 ns | 114832 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 184063 ns | 152432 ns | 1.21 |
| array/copyto!/gpu_to_gpu | 66940 ns | 88321 ns | 0.76 |
| array/iteration/findall/bool | 180313 ns | 181912 ns | 0.99 |
| array/iteration/findall/int | 189253 ns | 190933 ns | 0.99 |
| array/iteration/findfirst/bool | 118742 ns | 114451 ns | 1.04 |
| array/iteration/findfirst/int | 115342 ns | 116331 ns | 0.99 |
| array/iteration/findmin/1d | 169232 ns | 166203 ns | 1.02 |
| array/iteration/findmin/2d | 155882 ns | 156173 ns | 1.00 |
| array/iteration/logical | 351165 ns | 346025 ns | 1.01 |
| array/iteration/scalar | 292754 ns | 289864 ns | 1.01 |
| array/permutedims/2d | 73431 ns | 64761 ns | 1.13 |
| array/permutedims/3d | 74201 ns | 73791 ns | 1.01 |
| array/permutedims/4d | 76741 ns | 76481 ns | 1.00 |
| array/random/rand/Float32 | 50831 ns | 51540 ns | 0.99 |
| array/random/rand/Int64 | 57701 ns | 56210 ns | 1.03 |
| array/random/rand!/Float32 | 86451 ns | 142162 ns | 0.61 |
| array/random/rand!/Int64 | 93101 ns | 141832 ns | 0.66 |
| array/random/randn/Float32 | 99532 ns | 86921 ns | 1.15 |
| array/random/randn!/Float32 | 167102 ns | 152202 ns | 1.10 |
| array/reductions/mapreduce/Float32/1d | 132822 ns | 132902 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1 | 94811 ns | 95052 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 773211 ns | 777081 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 96132 ns | 96731 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=2L | 301244 ns | 299584 ns | 1.01 |
| array/reductions/mapreduce/Int64/1d | 132672 ns | 133322 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1 | 95232 ns | 78081 ns | 1.22 |
| array/reductions/mapreduce/Int64/dims=1L | 781741 ns | 783471 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 96332 ns | 96252 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 297824 ns | 308254 ns | 0.97 |
| array/reductions/reduce/Float32/1d | 132732 ns | 132802 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1 | 94522 ns | 94832 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 773801 ns | 774621 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 96842 ns | 96802 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 294304 ns | 307245 ns | 0.96 |
| array/reductions/reduce/Int64/1d | 133122 ns | 129672 ns | 1.03 |
| array/reductions/reduce/Int64/dims=1 | 94951 ns | 78151 ns | 1.21 |
| array/reductions/reduce/Int64/dims=1L | 782072 ns | 781931 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 95571 ns | 96192 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2L | 298104 ns | 298414 ns | 1.00 |
| array/reverse/1d | 44211 ns | 44380 ns | 1.00 |
| array/reverse/1dL | 75291 ns | 74131 ns | 1.02 |
| array/reverse/1dL_inplace | 168092 ns | 108282 ns | 1.55 |
| array/reverse/1d_inplace | 78201 ns | 86471 ns | 0.90 |
| array/reverse/2d | 51710 ns | 50661 ns | 1.02 |
| array/reverse/2dL | 101402 ns | 100341 ns | 1.01 |
| array/reverse/2dL_inplace | 103552 ns | 117622 ns | 0.88 |
| array/reverse/2d_inplace | 112932 ns | 95391 ns | 1.18 |
| array/sorting/1d | 349634 ns | 341945 ns | 1.02 |
| integration/byval/reference | 39080 ns | 38830 ns | 1.01 |
| integration/byval/slices=1 | 40310 ns | 40880 ns | 0.99 |
| integration/byval/slices=2 | 130932 ns | 158462 ns | 0.83 |
| integration/byval/slices=3 | 238703 ns | 238013 ns | 1.00 |
| integration/volumerhs | 5040093 ns | 4942659 ns | 1.02 |
| kernel/indexing | 62441 ns | 43630 ns | 1.43 |
| kernel/indexing_checked | 61321 ns | 128022 ns | 0.48 |
| kernel/launch | 1290 ns | 1290 ns | 1 |
| kernel/rand | 197003 ns | 106671 ns | 1.85 |
| latency/import | 1489766868 ns | 1501349912 ns | 0.99 |
| latency/precompile | 11930072145 ns | 12041117438 ns | 0.99 |
| latency/ttfp | 10850312414 ns | 10491950084 ns | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.
