Pull requests: ggml-org/llama.cpp

Pull requests list

Q5_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2)
Labels: ggml
#19707 opened Feb 18, 2026 by Manogna-Sree

Q6_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2)
Labels: ggml
#19706 opened Feb 18, 2026 by Manogna-Sree

ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support.
Labels: documentation, ggml
#19700 opened Feb 18, 2026 by yomaytk

Add Mistral Voxtral Mini 4B Realtime 2602 4B streaming ASR support
Labels: examples, model, python
#19698 opened Feb 17, 2026 by Acceldium

New option GGML_CUDA_FORCE_CUBLAS_COMPUTE_32F to use fp32 as compute type in cuBLAS
Labels: documentation, ggml, Nvidia GPU
#19697 opened Feb 17, 2026 by wallentri88

server : fix V-L embedding model support
#19694 opened Feb 17, 2026 by oliveagle

test(server): add multi-image and no-image vision API tests
Labels: examples, python, server
#19691 opened Feb 17, 2026 by jorgeutd
3 tasks done

model : Add tokenizer from LFM2.5-Audio-1.5B
Labels: model, python
#19687 opened Feb 17, 2026 by tdakhran

CUDA: fix kernel selection logic for tile FA
Labels: ggml, Nvidia GPU
#19686 opened Feb 17, 2026 by JohannesGaessler

DOCS: Fix broken links for preparing models in Backends
Labels: Ascend NPU, documentation, SYCL
#19684 opened Feb 17, 2026 by MaciejDromin

model: support GLM-OCR
Labels: examples, model, python
#19677 opened Feb 16, 2026 by ngxson

Add Pylint workflow for Python code analysis
Labels: devops
#19671 opened Feb 16, 2026 by kerrrang9214-tech (Draft)

Allow partial success of seq_rm for hybrid memory
#19670 opened Feb 16, 2026 by Nekotekina

Add Kimi Linear to unified delta net
Labels: model
#19668 opened Feb 16, 2026 by ymcki

models : dedup qwen35 graphs
Labels: model
#19660 opened Feb 16, 2026 by ggerganov (Draft)
2 tasks

avx2: compute ksigns instead of loading from table
Labels: ggml
#19657 opened Feb 16, 2026 by dfriehs

common : fix Step-3.5-Flash format detection and thinking support
Labels: testing
#19635 opened Feb 15, 2026 by jesseposner

Vulkan Scalar Flash Attention Refactor
Labels: ggml, Vulkan
#19625 opened Feb 14, 2026 by 0cc4m (Draft)