deploy · dab42893 · scripts : working curl pipe · Oct 31, 2023
test-mmv · 29fe5169 · wip · Oct 31, 2023
llama-refactor · afb39292 · Merge branch 'master' into llama-refactor · Oct 31, 2023
llm-reuse-constants · 7420bef8 · wip wip wip · Nov 01, 2023
llm-build-context · a8796f96 · llm : cleanup + comments · Nov 01, 2023
metal-soft-max · 46868a49 · metal : multi-simd softmax · Nov 01, 2023
revert-pool · 3ef358ff · Revert "cuda : use CUDA memory pool with async memory allocation/deallocation... · Nov 04, 2023
fix-tensor-split-zero · 47d604fa · fix issues · Nov 05, 2023
llama-metadata · d0445a2e · better documentation · Nov 10, 2023
ceb/fix-yarn-neox · f8249026 · YaRN : correction to GPT-NeoX implementation · Nov 15, 2023
kv-cache-opts · f8e9f114 · common : add -dkvc arg for enabling kv cache dumps · Nov 23, 2023
server-oai-compat · 21b70bab · straightforward /v1/models endpoint · Nov 24, 2023
lookahead · 8d8b76d4 · lookahead : add comments · Nov 26, 2023
ceb/perf-faster-multigpu · 6272b676 · use stride=128 if built for tensor cores · Nov 27, 2023
gg/fix-cpu-blas · 87f4102a · llama : revert n_threads_batch logic · Nov 27, 2023
assert-restore-abort · bb39b879 · ggml : restore abort() in GGML_ASSERT · Nov 27, 2023
ceb/libstdcpp-assertions · 5b74310e · build : enable libstdc++ assertions for debug builds · Nov 30, 2023
gg/soft-max-ext · eb594c0f · alloc : fix build with debug · Dec 01, 2023
gg/pad-kv-cache · 3cb1c348 · metal : try to improve batched decoding · Dec 01, 2023
gg/quantum-k-cache · af99c6fb · llama : remove memory_f16 and kv_f16 flags · Dec 05, 2023