norm-quants-rebase · b4e70822 · metal : add poc for normalized Q4_0 and Q4_1 · Aug 30, 2023
norm-quants · 8c2b8812 · cuda : poc for norm quants (only -b 1 works) · Aug 30, 2023
speculative · 847896ab · speculative : add --draft CLI arg · Sep 03, 2023
speculative-grammar · c79d130f · make : fix speculative build · Sep 04, 2023
metal-cont-bug · f3a84b2e · llama : better express the KV cache dependencies in the graph · Sep 04, 2023
build-metal-default · 30ac7a41 · gitignore : metal · Sep 04, 2023
metal-fix-norm · 2f689dee · metal : minor · Sep 07, 2023
fix-rocm-shared-lib-build · 61436803 · Compile ggml-rocm with -fpic when building shared library · Sep 13, 2023
mul-mat-pad · e7e7b114 · llama : remove experimental stuff · Sep 14, 2023
fix-cmake-out-of-source-install · c2217ca2 · Fix llama.h location when built outside of root directory · Sep 14, 2023
support-starcoder-fix · 92a4f868 · llama : make starcoder graph build more consistent with others · Sep 15, 2023
custom-attention-mask-no-roped-cache · 784d14ed · llama : store non-RoPEd K cache (WIP) · Sep 17, 2023
cam-simple-fix · 72e7ef4e · simple : fixes · Sep 26, 2023
custom-attention-mask · c5650ed4 · server : avoid context swaps by shifting the KV cache · Sep 28, 2023
fix-sessions · 5418932b · llama : fix comments for llama_kv_cache API · Oct 03, 2023
server-parallel · 5ab6c213 · server-parallel : add "--reverse-prompt" + compiler warning fixes · Oct 06, 2023
gguf-fix-publish · ba44776d · bump version · Oct 07, 2023
metal-improve-batching · 6b9554a7 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Oct 08, 2023
fix-refact · acead654 · Merge branch 'master' into fix-refact · Oct 08, 2023
fix-kv-cache-access · ee268b54 · llama : no longer perform uninitialized access to the KV cache · Oct 08, 2023