Active branches
gg/phi-2-2 · a462159c · cuda : ggml_cuda_op_mul_mat_cublas support F32 precision · Dec 18, 2023
ceb/fix-logit-check · 1b058171 · decode : fix logits_valid for old API · Dec 17, 2023
gg/swiftui-bench · 86506662 · llama.swiftui : improve bench · Dec 17, 2023
pr/4484 · f86b9d15 · lookup : minor · Dec 17, 2023
gg/phi-2 · d2f1e0da · Merge branch 'cuda-cublas-opts' into gg/phi-2 · Dec 17, 2023
ceb/fix-badspecial-silentfail · b0547d21 · gguf-py : fail fast on nonsensical special token IDs · Dec 15, 2023
ceb/fix-cuda-warning-flags · c8554b80 · Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags · Dec 13, 2023
mixtral · e1241d9b · metal : switch to execution barriers + fix one of the barriers · Dec 13, 2023
gg/per-layer-kv · fc5f3346 · readme : add API change notice · Dec 07, 2023
gg/quantum-k-cache · af99c6fb · llama : remove memory_f16 and kv_f16 flags · Dec 05, 2023
gg/pad-kv-cache · 3cb1c348 · metal : try to improve batched decoding · Dec 01, 2023
gg/soft-max-ext · eb594c0f · alloc : fix build with debug · Dec 01, 2023
ceb/libstdcpp-assertions · 5b74310e · build : enable libstdc++ assertions for debug builds · Nov 30, 2023
assert-restore-abort · bb39b879 · ggml : restore abort() in GGML_ASSERT · Nov 27, 2023
gg/fix-cpu-blas · 87f4102a · llama : revert n_threads_batch logic · Nov 27, 2023
ceb/perf-faster-multigpu · 6272b676 · use stride=128 if built for tensor cores · Nov 27, 2023
lookahead · 8d8b76d4 · lookahead : add comments · Nov 26, 2023
server-oai-compat · 21b70bab · straightforward /v1/models endpoint · Nov 24, 2023
kv-cache-opts · f8e9f114 · common : add -dkvc arg for enabling kv cache dumps · Nov 23, 2023
ceb/fix-yarn-neox · f8249026 · YaRN : correction to GPT-NeoX implementation · Nov 15, 2023