cuda-70b-2 · f7bb5e91 · CUDA: GQA implementation · Jul 22, 2023
ggml-backends · d273bfd2 · allocator: cleanup, more comments · Jul 22, 2023
ggml-backends-metal · d45c1631 · metal : rewrite to fit new backend interface correctly (WIP) · Jul 20, 2023
refactor-mpi · 04923631 · mpi : fix after master merge · Jul 09, 2023
llama_server_completions · 26cc1bd7 · llama : uniform variable names + struct init · Jul 05, 2023
llama_server_timings · ff6e39f1 · use javascript generators as much cleaner API · Jul 05, 2023
test-mac-os-ci · f46db27e · ci : disable FMA on Mac OS · Jul 05, 2023
try-fix-metal · 5cc672a9 · metal : try to utilize more of the shared memory using smaller views · Jun 26, 2023
avoid-gnu-source · 78fafcaf · ggml : do not use _GNU_SOURCE gratuitously · Jun 25, 2023
fix_clblast · 20054a38 · Fix directory name · May 27, 2023
chunks · a1cdd29c · ggml : rms_norm in chunks · May 20, 2023
steering · 95dc4d72 · Merge 'origin/master' into steering · May 19, 2023
f16c · 40ec4882 · ggml : use F16C conversion when available · May 17, 2023
dequantize-matmul-3-gg · a3e6d622 · cuda : alternative q4_q8 kernel · May 12, 2023
remove-vzip · e116eb63 · ggml : speed-up Q5_0 + Q5_1 at 4 threads · May 11, 2023
jed/spm-clblast · 4baa8563 · Fix build · May 06, 2023
ci_cublas · 31ff9e2e · ci : add cublas to windows release · May 03, 2023
q4_3-range-fix · 102cd980 · ggml : Q4_3c using 2x "Full range" approach · Apr 23, 2023
q4_0-q4_2-range-fix · 71e6ae37 · ggml : continue from #729 (wip) · Apr 22, 2023
gg/rmse_quantization · a0242a83 · Minor, plus rebase on master · Apr 22, 2023