text-generation-webui/modules
oobabooga 7618f3fe8c Add --gptq-preload for 4-bit offloading (#460)
This now works on a 4 GB card:

```
python server.py --model llama-7b-hf --gptq-bits 4 --gptq-pre-layer 20
```
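As a rough illustration of what the command above asks for, the sketch below assumes that `--gptq-pre-layer N` keeps the first N transformer layers on the GPU and leaves the remaining ones on the CPU. That reading of the flag is an assumption, and `split_layers` is a hypothetical helper that does not appear in this repository. The layer count of 32 is the number of transformer layers in LLaMA-7B.

```python
def split_layers(num_layers: int, pre_layer: int) -> dict[str, list[int]]:
    """Hypothetical helper: put the first `pre_layer` layers on the GPU, the rest on the CPU.

    This mirrors the assumed meaning of --gptq-pre-layer; it is not the project's code.
    """
    if num_layers < 0 or pre_layer < 0:
        raise ValueError("num_layers and pre_layer must be non-negative")
    # Never place more layers on the GPU than the model actually has.
    cut = min(pre_layer, num_layers)
    return {
        "gpu": list(range(cut)),
        "cpu": list(range(cut, num_layers)),
    }


# LLaMA-7B has 32 transformer layers; 20 matches the value used in the command above.
placement = split_layers(num_layers=32, pre_layer=20)
print(len(placement["gpu"]), "layers on GPU,", len(placement["cpu"]), "layers on CPU")
```

Under this assumption, lowering the value moves more layers to the CPU, trading generation speed for a smaller VRAM footprint.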
2023-03-20 16:30:56 -03:00