Active branches
gg/phi-2-2 · a462159c · cuda : ggml_cuda_op_mul_mat_cublas support F32 precision · Dec 18, 2023
ceb/fix-logit-check · 1b058171 · decode : fix logits_valid for old API · Dec 17, 2023
gg/swiftui-bench · 86506662 · llama.swiftui : improve bench · Dec 17, 2023
pr/4484 · f86b9d15 · lookup : minor · Dec 17, 2023
gg/phi-2 · d2f1e0da · Merge branch 'cuda-cublas-opts' into gg/phi-2 · Dec 17, 2023
ceb/fix-badspecial-silentfail · b0547d21 · gguf-py : fail fast on nonsensical special token IDs · Dec 15, 2023
ceb/fix-cuda-warning-flags · c8554b80 · Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags · Dec 13, 2023
mixtral · e1241d9b · metal : switch to execution barriers + fix one of the barriers · Dec 13, 2023
gg/per-layer-kv · fc5f3346 · readme : add API change notice · Dec 07, 2023
gg/quantum-k-cache · af99c6fb · llama : remove memory_f16 and kv_f16 flags · Dec 05, 2023
gg/pad-kv-cache · 3cb1c348 · metal : try to improve batched decoding · Dec 01, 2023
gg/soft-max-ext · eb594c0f · alloc : fix build with debug · Dec 01, 2023
ceb/libstdcpp-assertions · 5b74310e · build : enable libstdc++ assertions for debug builds · Nov 30, 2023
assert-restore-abort · bb39b879 · ggml : restore abort() in GGML_ASSERT · Nov 27, 2023
gg/fix-cpu-blas · 87f4102a · llama : revert n_threads_batch logic · Nov 27, 2023
ceb/perf-faster-multigpu · 6272b676 · use stride=128 if built for tensor cores · Nov 27, 2023
lookahead · 8d8b76d4 · lookahead : add comments · Nov 26, 2023
server-oai-compat · 21b70bab · straightforward /v1/models endpoint · Nov 24, 2023
kv-cache-opts · f8e9f114 · common : add -dkvc arg for enabling kv cache dumps · Nov 23, 2023
ceb/fix-yarn-neox · f8249026 · YaRN : correction to GPT-NeoX implementation · Nov 15, 2023