ceb/bert · 7286b83d · BERT WIP · Feb 06, 2024
gg/convert-fix-byte-tokens · adcf16fd · py : fix empty bytes arg · Feb 05, 2024
ik/ggml-quants-cpp · 91c453fb · One cannot possibly be defining static_assert in a C++ compilation · Feb 05, 2024
gg/flash-attn-interleave-cc · 49a483e0 · wip · Feb 04, 2024
gg/flash-attn-32x8 · a647257b · cuda : express strides with helper constants · Feb 04, 2024
gg/flash-attn-cuda · b957b8f5 · cuda : add flash_attn kernel (wip) · Feb 01, 2024
flash-attn-cuda · ac26f270 · cuda : increase C to 128 for better performance · Feb 01, 2024
gg/flash-attn-mask-f16 · 1ad42b1f · ggml : ggml_soft_max uses F16 mask · Jan 31, 2024
ik/fix_iq3xxs_metal · 719a0871 · iq3_xxs: forgotten update of the grid points · Jan 30, 2024
gg/flash-attn-simd · 2bf91c53 · metal : clean up · Jan 25, 2024
gg/flash-attn-wip3 · 6ccbd177 · wip · Jan 24, 2024
gg/flash-attn-wip4 · da23b56f · wip : no ic 8 step · Jan 24, 2024
gg/flash-attn-wip2 · 06c2d0d1 · wip · Jan 23, 2024
gg/flash-attn-online · a9681feb · ggml : online attention (CPU) · Jan 20, 2024
ceb/fix-msvc-build · 32a392fe · try a differerent fix · Jan 19, 2024
ceb/restore-convert · 4a3bc152 · py : linting with mypy and isort · Jan 19, 2024
ceb/nomic-vulkan-fix-add · 14532151 · kompute : fix ggml_add kernel · Jan 19, 2024
ik/faster_hellaswag · ccc78a20 · hellaswag: speed up even more by parallelizing log-prob evaluation · Jan 18, 2024
gg/imatrix-gpu-4931 · 2917e6b5 · Merge branch 'master' into gg/imatrix-gpu-4931 · Jan 17, 2024
gg/fix-spm-added-tokens-dict-4958 · 23742deb · py : fix padded dummy tokens (I hope) · Jan 17, 2024
Page 1 of 13