Pull requests: ggml-org/llama.cpp
Pull requests list
Q5_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2)
ggml
changes relating to the ggml tensor library for machine learning
#19707
opened Feb 18, 2026 by
Manogna-Sree
Q6_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2)
ggml
changes relating to the ggml tensor library for machine learning
#19706
opened Feb 18, 2026 by
Manogna-Sree
common : fix gpt-oss Jinja error with content and thinking on tool-call messages
#19704
opened Feb 18, 2026 by
abhijitb11
ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support.
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#19700
opened Feb 18, 2026 by
yomaytk
Add Mistral Voxtral Mini 4B Realtime 2602 4B streaming ASR support
examples
model
Model specific
python
python script changes
#19698
opened Feb 17, 2026 by
Acceldium
New option GGML_CUDA_FORCE_CUBLAS_COMPUTE_32F to use fp32 as compute type in cuBLAS
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19697
opened Feb 17, 2026 by
wallentri88
test(server): add multi-image and no-image vision API tests
examples
python
python script changes
server
#19691
opened Feb 17, 2026 by
jorgeutd
3 tasks done
model : Add tokenizer from LFM2.5-Audio-1.5B
model
Model specific
python
python script changes
#19687
opened Feb 17, 2026 by
tdakhran
CUDA: fix kernel selection logic for tile FA
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19686
opened Feb 17, 2026 by
JohannesGaessler
DOCS: Fix broken links for preparing models in Backends
Ascend NPU
issues specific to Ascend NPUs
documentation
Improvements or additions to documentation
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#19684
opened Feb 17, 2026 by
MaciejDromin
Add Pylint workflow for Python code analysis
devops
improvements to build systems and github actions
#19671
opened Feb 16, 2026 by
kerrrang9214-tech
Draft
Add Kimi Linear to unified delta net
model
Model specific
#19668
opened Feb 16, 2026 by
ymcki
llama : use output_resolve_row() in get_logits_ith/get_embeddings_ith
#19663
opened Feb 16, 2026 by
danbev
avx2: compute ksigns instead of loading from table
ggml
changes relating to the ggml tensor library for machine learning
#19657
opened Feb 16, 2026 by
dfriehs
common : fix Step-3.5-Flash format detection and thinking support
testing
Everything test related
#19635
opened Feb 15, 2026 by
jesseposner
[server] save generated text for the /slots endpoint (for LLAMA_SERVER_SLOTS_DEBUG=1)
examples
server
#19622
opened Feb 14, 2026 by
matteoserva
Fix gpt-oss tool calling: pass tool args and tool responses as json
#19620
opened Feb 14, 2026 by
matteoserva
fix: GLM 4.5 streaming tool-call parsing + grammar error handling
examples
server
testing
Everything test related
#19612
opened Feb 14, 2026 by
Gunther-Schulz