fix-convert-modelname · 284870c8 · Merge branch 'master' into fix-convert-modelname · May 14, 2024
sl/async-weight-copy · 5de9b743 · sched : support async weight copy · May 16, 2024
gg/test-bench · a085a832 · tmp · May 16, 2024
gg/test-embd · 6b2f4964 · wip · May 17, 2024
ci-android · 007f2ece · cmake : provide binary dir · May 18, 2024
gg/kv-determinism · a041ced0 · wip · May 20, 2024
sl/dio-test · e9095e60 · async direct io per tensor test · May 22, 2024
compilade/gguf-py-fix-old-numpy · 8334b5be · gguf-py : do not use internal numpy types · May 22, 2024
sl/cuda-uma · 518b7526 · cuda uma test · May 23, 2024
compilade/gguf-py-fix-q-shape · c5fe1d6c · gguf-py : remove unused import · May 23, 2024
7507-main-intel-dockerfile · dd14d818 · Update main-intel.Dockerfile base image to 2024.1.0 · May 24, 2024
sycl-refactor · 50dffa13 · seperate dpct helper functions · May 24, 2024
compilade/lazier-moe-convert-hf · 11f78c6a · convert-hf : adapt ArcticModel to use yield too · May 25, 2024
gg/metal-disable-fa-256 · 1c6cde92 · metal : disable FA kernel for HS=256 · May 27, 2024
fix_q_xxs_mul_mat · 4b177010 · Fix q_xxs using mul_mat_q · May 27, 2024
compilade/refactor-kv-cache-gg · ddc59e8e · wipwipwiwpip · May 27, 2024
sl/cuda-fattn-par-test · 1ca802a3 · parallelize fattn compilation test · May 28, 2024
gg/cache-token-to-piece · 8a8f8b95 · llama : print a log of the total cache size · May 29, 2024
sl/blas-backend · d7cc6bc0 · Merge branch 'master' into sl/blas-backend · May 31, 2024
sycl-global-variables · d32a8f61 · backup · May 31, 2024