deploy · dab42893 · scripts : working curl pipe · Oct 31, 2023
test-mmv · 29fe5169 · wip · Oct 31, 2023
llama-refactor · afb39292 · Merge branch 'master' into llama-refactor · Oct 31, 2023
llm-reuse-constants · 7420bef8 · wip wip wip · Nov 01, 2023
llm-build-context · a8796f96 · llm : cleanup + comments · Nov 01, 2023
metal-soft-max · 46868a49 · metal : multi-simd softmax · Nov 01, 2023
revert-pool · 3ef358ff · Revert "cuda : use CUDA memory pool with async memory allocation/deallocation... · Nov 04, 2023
fix-tensor-split-zero · 47d604fa · fix issues · Nov 05, 2023
llama-metadata · d0445a2e · better documentation · Nov 10, 2023
ceb/fix-yarn-neox · f8249026 · YaRN : correction to GPT-NeoX implementation · Nov 15, 2023
kv-cache-opts · f8e9f114 · common : add -dkvc arg for enabling kv cache dumps · Nov 23, 2023
server-oai-compat · 21b70bab · straightforward /v1/models endpoint · Nov 24, 2023
lookahead · 8d8b76d4 · lookahead : add comments · Nov 26, 2023
ceb/perf-faster-multigpu · 6272b676 · use stride=128 if built for tensor cores · Nov 27, 2023
gg/fix-cpu-blas · 87f4102a · llama : revert n_threads_batch logic · Nov 27, 2023
assert-restore-abort · bb39b879 · ggml : restore abort() in GGML_ASSERT · Nov 27, 2023
ceb/libstdcpp-assertions · 5b74310e · build : enable libstdc++ assertions for debug builds · Nov 30, 2023
gg/soft-max-ext · eb594c0f · alloc : fix build with debug · Dec 01, 2023
gg/pad-kv-cache · 3cb1c348 · metal : try to improve batched decoding · Dec 01, 2023
gg/quantum-k-cache · af99c6fb · llama : remove memory_f16 and kv_f16 flags · Dec 05, 2023