Tags
Tags mark specific points in a repository's history as important, such as release builds.
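A minimal sketch of how tags like the ones listed below are created and inspected with plain git; the repository path and commit message here are illustrative, not taken from the project:

```shell
set -e
# Work in a throwaway repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "initial commit"

# A lightweight tag simply marks this point in history.
git tag b1783
# An annotated tag additionally stores a message, tagger, and date.
git tag -a b1784 -m "release build"

# List tags matching a pattern, as the page below does for b17xx builds.
git tag --list 'b17*'
```

Lightweight tags are enough for build markers; annotated tags are preferred for releases because they carry their own metadata.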
b1821 · 43f76bf1 · main : print total token count and tokens consumed so far (#4874) · Jan 11, 2024
b1820 · 2f043328 · server : fix typo in model name (#4876) · Jan 11, 2024
b1819 · 2a7c94db · metal : put encoder debug group behind a define (#4873) · Jan 11, 2024
b1818 · 64802ec0 · sync : ggml · Jan 11, 2024
b1810 · 5c1980d8 · server : fix build + rename enums (#4870) · Jan 11, 2024
b1808 · 57d016ba · llama : add additional suffixes for model params (#4834) · Jan 10, 2024
b1807 · 329ff615 · llama : recognize 1B phi models (#4847) · Jan 10, 2024
b1806 · d34633d8 · clip : support more quantization types (#4846) · Jan 10, 2024
b1803 · 36e5a08b · llava-cli : don't crash if --image flag is invalid (#4835) · Jan 09, 2024
b1796 · 18c2e175 · ggml : fix vld1q_s8_x4 32-bit compat (#4828) · Jan 09, 2024
b1795 · 8f900abf · CUDA: faster softmax via shared memory + fp16 math (#4742) · Jan 09, 2024
b1794 · 1fc2f265 · common : fix the short form of `--grp-attn-w`, not `-gat` (#4825) · Jan 08, 2024
b1792 · dd5ae064 · SOTA 2-bit quants (#4773) · Jan 08, 2024
b1791 · 668b31fc · swift : exclude ggml-metal.metal from the package (#4822) · Jan 08, 2024
b1789 · 52531fdf · main : add self-extend support (#4815) · Jan 08, 2024
b1788 · b0034d93 · examples : add passkey test (#3856) · Jan 08, 2024
b1786 · 226460cc · llama-bench : add no-kv-offload parameter (#4812) · Jan 07, 2024
b1785 · d5a410e8 · CUDA: fixed redundant value dequantization (#4809) · Jan 07, 2024
b1784 · 9dede37d · llama : remove unused vars (#4796) · Jan 07, 2024
b1783 · 3c36213d · llama : remove redundant GQA check (#4796) · Jan 07, 2024