Tags
Tags mark specific points in history as important.
b3072 · 549279d8 · llama : avoid double token-to-piece cache (#7654) · Jun 03, 2024
b3071 · 9e405b6e · kompute : implement op_getrows_f32 (#6403) · Jun 03, 2024
b3070 · 3413ae21 · fix bug introduced in using calloc (#7701) · Jun 02, 2024
b3067 · 9422c5e3 · [SYCL] Update rpc-server.cpp to include SYCL backend (#7682) · Jun 02, 2024
b3066 · e141ce62 · Fix FlashAttention debug test, FP32 assert (#7684) · Jun 01, 2024
b3065 · 2e666832 · server : new UI (#7633) · Jun 01, 2024
b3063 · 750f60c0 · CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681) · Jun 01, 2024
b3058 · 30e238b2 · Improve HIP compatibility (#7672) · May 31, 2024
b3056 · 0c27e6f6 · ggml : fix loongson compile warnings (#7537) · May 31, 2024
b3051 · 5921b8f0 · llama : cache llama_token_to_piece (#7587) · May 31, 2024
b3046 · 9c4c9cc8 · Move convert.py to examples/convert-legacy-llama.py (#7430) · May 30, 2024
b3045 · 59b0d077 · faster avx512 exp implementation (#7551) · May 30, 2024
b3044 · d5c05821 · ggml : fix loongarch build (O2 issue) (#7636) · May 30, 2024
b3042 · 3854c9d0 · [SYCL] fix intel docker (#7630) · May 30, 2024
b3040 · 55d62262 · metal : remove invalid asserts (#7617) · May 29, 2024
b3039 · 975ec63f · metal : add missing asserts (#7617) · May 29, 2024
b3038 · fb76ec31 · ggml : fix YARN + add tests + add asserts (#7617) · May 29, 2024
b3037 · cce3dcff · cuda : non-cont concat support (#7610) · May 29, 2024
b3036 · 210d9917 · llama-bench : add support for the RPC backend (#7435) · May 29, 2024
b3035 · 87bdf2a1 · ggml : use atomic_flag for critical section (#7598) · May 29, 2024