Enable GPU-accelerated nucleotide ungapped prefilter by KimBioInfoStudio · Pull Request #1081 · soedinglab/MMseqs2

KimBioInfoStudio · 2026-03-01T05:49:37Z

Summary

Map nucleotide PSSM rows from NucleotideMatrix encoding (A=0, C=1, T=2, G=3, X=4) to ConvertAA_20 positions (A=0, C=1, G=5, T=16, X=20), so existing 21-row GPU kernels score nucleotide sequences correctly
Always allocate 21-row PSSM for GPU path regardless of alphabet size
Allow GPU nucleotide search in ungapped prefilter mode (reject only gapped rescore mode)
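
The row remapping described in the summary can be sketched as below. This is a minimal illustration, not the code from the PR: the helper name `remapNucleotidePssm`, the row-major layout (one row per alphabet letter, one column per query position), and the zero fill for unused rows are all assumptions. Only the index pairs (A 0→0, C 1→1, T 2→16, G 3→5, X 4→20) come from the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row indices as stated in the PR summary:
//   NucleotideMatrix order: A=0, C=1, T=2, G=3, X=4
//   ConvertAA_20 positions: A=0, C=1, G=5, T=16, X=20
constexpr std::size_t kNucRows = 5;
constexpr std::size_t kGpuRows = 21;
constexpr std::array<std::size_t, kNucRows> kNucToAA20 = {0, 1, 16, 5, 20};

// Hypothetical helper: expands a row-major (5 x queryLen) nucleotide PSSM
// into a row-major (21 x queryLen) PSSM. Rows that no nucleotide maps to
// are filled with `fill` (zero here, which is an assumption).
std::vector<std::int8_t> remapNucleotidePssm(const std::vector<std::int8_t>& nucPssm,
                                             std::size_t queryLen,
                                             std::int8_t fill = 0) {
    assert(nucPssm.size() == kNucRows * queryLen);
    std::vector<std::int8_t> gpuPssm(kGpuRows * queryLen, fill);
    for (std::size_t row = 0; row < kNucRows; ++row) {
        const std::size_t dstRow = kNucToAA20[row];
        for (std::size_t col = 0; col < queryLen; ++col) {
            gpuPssm[dstRow * queryLen + col] = nucPssm[row * queryLen + col];
        }
    }
    return gpuPssm;
}
```

Because the output always has 21 rows, a consumer that hardcodes a 21-row layout can read it without knowing that the input alphabet had only five letters.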

Approach

The GPU database (makepaddedseqdb) encodes sequences using SubstitutionMatrix::aa2num which follows ConvertAA_20 alphabetical order. The GPU PSSM kernels use hardcoded 21-row shared memory. Rather than modifying CUDA kernels or database encoding, we remap the nucleotide PSSM rows in ungappedprefilter.cpp to match ConvertAA_20 positions. This keeps full compatibility with getUnpadded(), gpuserver, and the existing protein GPU path.
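
To see where the target positions come from, the sketch below indexes letters by their position in the alphabetically ordered 20 amino acid one-letter codes followed by X. The function `aa20Index` and the string `kAA20Order` are hypothetical names used for illustration and are not the `SubstitutionMatrix::aa2num` table itself. The fallback to X for characters outside the table is also an assumption.

```cpp
#include <cassert>
#include <string>

// The 20 standard amino acid one-letter codes in alphabetical order, then X.
const std::string kAA20Order = "ACDEFGHIKLMNPQRSTVWYX";

// Hypothetical lookup: returns the position of `c` in kAA20Order.
// Characters outside the table fall back to X (index 20) in this sketch.
int aa20Index(char c) {
    const std::string::size_type pos = kAA20Order.find(c);
    return pos == std::string::npos ? 20 : static_cast<int>(pos);
}
```

Under this ordering the nucleotide letters land on A=0, C=1, G=5, T=16 and X=20, which matches the positions quoted in the summary and explains why the PSSM rows, rather than the database encoding, are the cheaper side to adjust.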

2 files changed, 29 insertions, 11 deletions. Zero CUDA kernel or libmarv modifications.

Test plan

Tested on DGX Spark (Blackwell B200, CUDA 13.0, aarch64):

Nucleotide GPU vs CPU: 30,000 scores identical (1000 queries × 10K targets, --comp-bias-corr 0 --mask 0)
Nucleotide small dataset: perfect match scores verified (e.g. 16bp exact match = 32)
Protein GPU regression: no change (patched vs unpatched GPU output identical)
Performance: 6.1× speedup over 20-core CPU (1000 queries × 10K targets)

Benchmark (1000q × 10K targets)	GPU	CPU (20 cores)	Speedup
Nucleotide ungapped prefilter	940ms	5780ms	6.1×
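
The "16bp exact match = 32" check in the test plan can be reproduced on the CPU with a small reference scorer. This is a sketch under stated assumptions: the match score of +2 is inferred from 32 / 16, the mismatch score of -3 is a placeholder, and `ungappedDiagonalScore` is a hypothetical function that scores only the main diagonal as a best-scoring local segment. It is not the GPU kernel or the MMseqs2 CPU implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference scorer: best-scoring ungapped segment along the
// main diagonal of query vs. target. The running score is clamped at zero
// so a bad prefix cannot penalize a later good segment.
int ungappedDiagonalScore(const std::string& query, const std::string& target,
                          int match = 2, int mismatch = -3) {
    const std::size_t len = std::min(query.size(), target.size());
    int best = 0;
    int running = 0;
    for (std::size_t i = 0; i < len; ++i) {
        running += (query[i] == target[i]) ? match : mismatch;
        running = std::max(running, 0);
        best = std::max(best, running);
    }
    return best;
}
```

With these assumed scores, a 16 bp sequence aligned against itself gives 16 × 2 = 32, the value reported in the test plan.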

🤖 Generated with Claude Code

Map nucleotide PSSM rows to ConvertAA_20 positions so the existing GPU kernels (which expect 21-row amino acid encoding) can score nucleotide sequences without any CUDA kernel changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

KimBioInfoStudio mentioned this pull request Mar 9, 2026

GPU prefilter produces incorrect results when masking is enabled #1083

Open

KimBioInfoStudio wants to merge 1 commit into soedinglab:master from KimBioInfoStudio:gpu-nucleotide-search
