master (default, protected)
3d7ebf63 · Vulkan Mixture of Experts (MoE) support (#7628) · Jun 03, 2024

sycl-remove-global-variables
9d5fc839 · replace global variables with context[2/2] · Jun 03, 2024

compilade/refactor-kv-cache
8fb57ac0 · llama : use im2col and mul_mat to perform convolution for Mamba · Jun 03, 2024

gg/rope-refactor
ddac1ef6 · cuda : fix array size + indents · Jun 02, 2024

compilade/convert-hf-model-part-prefix
3af93718 · convert-hf : match model part name prefix and suffix · Jun 01, 2024

gg/gpt-params-refactor
f3256085 · common : rework usage print (wip) · May 31, 2024

sl/rpc-backend-cpy
5f8720fb · add rpc-server to Makefile · May 31, 2024

gg/server-update-js
956af155 · server : update js · May 31, 2024

gg/ci-loongson
77c16ee0 · tests : disable json test due to lack of python on the CI node · May 31, 2024

sycl-global-variables
d32a8f61 · backup · May 31, 2024

sl/blas-backend
d7cc6bc0 · Merge branch 'master' into sl/blas-backend · May 31, 2024

gg/cache-token-to-piece
8a8f8b95 · llama : print a log of the total cache size · May 29, 2024

sl/cuda-fattn-par-test
1ca802a3 · parallelize fattn compilation test · May 28, 2024

compilade/refactor-kv-cache-gg
ddc59e8e · wipwipwiwpip · May 27, 2024

fix_q_xxs_mul_mat
4b177010 · Fix q_xxs using mul_mat_q · May 27, 2024

gg/metal-disable-fa-256
1c6cde92 · metal : disable FA kernel for HS=256 · May 27, 2024

compilade/lazier-moe-convert-hf
11f78c6a · convert-hf : adapt ArcticModel to use yield too · May 25, 2024

sycl-refactor
50dffa13 · seperate dpct helper functions · May 24, 2024

7507-main-intel-dockerfile
dd14d818 · Update main-intel.Dockerfile base image to 2024.1.0 · May 24, 2024

compilade/gguf-py-fix-q-shape
c5fe1d6c · gguf-py : remove unused import · May 23, 2024