gg/kv-compress
14d75706 · llama : add llama_kv_cache_compress (EXPERIMENTAL) · Feb 27, 2024
ik/i-quants-64
f0cbb6dd · iq1_s: turn off SIMD implementation for QK_K = 64 (it does not work) · Feb 28, 2024
gg/fix-starcoder2
9862d59c · llama : change starcoder2 rope type · Mar 01, 2024
ceb/convert-vocab-fallback
f8ab5391 · convert : update help string · Mar 01, 2024
ik/iq3_s_faster
d4dfc250 · Fix ARM_NEON · Mar 02, 2024
ceb/convert-hf-refactor
0b673ca1 · s/_MODEL_CLASSES/_model_classes/ · Mar 02, 2024
ci/server/fix-slow-test
eb0bf32c · server: tests: schedule slow dispatch only on release or on demand · Mar 02, 2024
gg/fix-embeddings-wip
4ec0e9ab · wip · Mar 04, 2024
ik/iq3_s_multiplier
31cecc87 · iq3_s_mult_shuffle: use lookup table on Metal · Mar 05, 2024
revert-5901-fix_set_gpu
b5b02703 · Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" · Mar 07, 2024
gg/bert-f16
0ba20ed9 · llama : compute BERT graph with F16 K, V · Mar 07, 2024
gritlm-pr
b54afce9 · mostly style fixes; fix KQ_mask comment · Mar 09, 2024
sycl_q3s_q1s
989e15b3 · Merge branch 'master' into sycl_q3s_q1s · Mar 11, 2024
gg/try-fix-sycl-iq1_s
76be02ae · sycl : fix grid type · Mar 11, 2024
ik/even_better_iq1s
5440a127 · iq1_s: fix dequantize on the CPU · Mar 11, 2024
ik/try_fix_iq1s_sycl
9f805264 · Attempt 2 · Mar 12, 2024
gg/metal-embed
abf0afd0 · ci : fix iOS builds to use embedded library · Mar 14, 2024
gg/repeng
0a9bc301 · control-vectors : minor code style updates · Mar 14, 2024
jg/flash-attn
7fca4586 · pragma unroll, use_mask template parameter · Mar 19, 2024
compilade/fix-server-tests-penalty
9a424a38 · server : fix tests expecting old repeat penalty · Mar 19, 2024