Branch                        | Commit   | Last commit message                                                                             | Date
gg/per-layer-kv               | fc5f3346 | readme : add API change notice                                                                  | Dec 07, 2023
mixtral                       | e1241d9b | metal : switch to execution barriers + fix one of the barriers                                  | Dec 13, 2023
ceb/fix-cuda-warning-flags    | c8554b80 | Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags | Dec 13, 2023
ceb/fix-badspecial-silentfail | b0547d21 | gguf-py : fail fast on nonsensical special token IDs                                            | Dec 15, 2023
gg/phi-2                      | d2f1e0da | Merge branch 'cuda-cublas-opts' into gg/phi-2                                                   | Dec 17, 2023
pr/4484                       | f86b9d15 | lookup : minor                                                                                  | Dec 17, 2023
gg/swiftui-bench              | 86506662 | llama.swiftui : improve bench                                                                   | Dec 17, 2023
ceb/fix-logit-check           | 1b058171 | decode : fix logits_valid for old API                                                           | Dec 17, 2023
gg/phi-2-2                    | a462159c | cuda : ggml_cuda_op_mul_mat_cublas support F32 precision                                        | Dec 18, 2023
gg/plamo-test                 | 3c734f49 | plamo : testing                                                                                 | Dec 18, 2023
gg/cublas-f32                 | a40f6110 | ggml : force F32 precision for ggml_mul_mat                                                     | Dec 19, 2023
ceb/fix-draft-model-default   | 7c87353e | common : remove incorrect --model-draft default                                                 | Dec 21, 2023
gg/ggml_scale                 | ab1b7516 | Merge branch 'master' into gg/ggml_scale                                                        | Dec 21, 2023
gg/test-arm                   | f32f30bc | test                                                                                            | Dec 26, 2023
gg/gpu-prec-tests             | f64e4f04 | ggml : testing GPU FP precision via quantized CPY                                               | Dec 30, 2023
gg/hf-auto-dl                 | 120a1a55 | llama : auto download HF models if URL provided                                                 | Jan 02, 2024
gg/avoid-mutex                | b5af7ad8 | llama : refactor quantization to avoid <mutex> header                                           | Jan 02, 2024
cuda-cublas-opts              | 4cc78d38 | ggml : force F32 precision for ggml_mul_mat                                                     | Jan 02, 2024
gg/metal-opt-mul-mat-id       | 9f51f3e6 | metal : opt mul_mm_id                                                                           | Jan 02, 2024
gg/remove-gqa-check-4657      | 7cfde781 | llama : remove redundant GQA check                                                              | Jan 06, 2024