1 cpu core - 160 Tokens per second

72,142 views • 1 year ago • via X (Twitter)

11 comments

nisten - e/acc • 1 year ago

On a 3-year-old base-model M1 MacBook Air.

nisten - e/acc • 1 year ago

Can't make this up, these are the actual training params for this particular checkpoint: 269 steps, 2469 sequence length, training at 4.2 it/s on 22420 MB of RAM on an NVIDIA L4, and it converged at 142 steps. This is art at this point.
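As a rough illustration of what those figures imply (not part of the original comment), the wall-clock time can be estimated from the step counts and the quoted throughput. This sketch assumes one "it" equals one optimizer step, which the comment does not state; with gradient accumulation the numbers would differ. The function name `seconds_for` is made up for the example.

```python
# Rough wall-clock estimate from the figures quoted in the comment.
# Assumption (not stated in the source): one "it" equals one optimizer step.
ITERATIONS_PER_SECOND = 4.2
TOTAL_STEPS = 269
CONVERGED_AT_STEP = 142


def seconds_for(steps: int, its_per_second: float = ITERATIONS_PER_SECOND) -> float:
    """Return the time in seconds needed to run `steps` iterations."""
    return steps / its_per_second


if __name__ == "__main__":
    print(f"to convergence: {seconds_for(CONVERGED_AT_STEP):.1f} s")
    print(f"full run:       {seconds_for(TOTAL_STEPS):.1f} s")
```

Under that assumption the run would reach step 142 in roughly half a minute and finish all 269 steps in about a minute.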

nisten - e/acc • 1 year ago

wget

nisten - e/acc • 1 year ago

This is in 8-bit, by the way. It actually did 198 tps in mixed BitNet mode, but it was rambling a lot at a 76 MB GGUF size, and that was a BitNet training run.

nisten - e/acc • 1 year ago

If someone wants to code-review: I implemented a whole new optimizer from @cognitivecompai to train this, and I'm getting some of the best conversational results so far. Also, Eric's looking for help to do a PR for HF transformers & axolotl. Training run code here

nisten - e/acc • 1 year ago

It looks like the optimizer step may not be applied properly.

nisten - e/acc • 1 year ago

If you need to reproduce the results, I uploaded an 8-bit .gguf of the training checkpoint from this video for you to try. Careful, it's still very finicky. You may have to start prompts with "Human: How do raccons meow". The base model is untrained.

Maziyar PANAHI • 1 year ago

This is a lie! 🔥 don’t believe it people! ❤️ PS: when do you release? @nisten 🤣

nisten - e/acc • 1 year ago

Try the checkpoint:

./llama-cli -n 768 -fa -b 768 --min-p 0.3 --top-p 0.85 -ctk q8_0 -ctv q8_0 --keep -1 -p "You're a Nasa jpl engineer teaching huamn about cats in space." -m model.gguf --temp 1.69 -ngl 0 -t 1 -co -cnv -n 2000 --reverse-prompt "Assistant:"
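For anyone scripting this, here is a minimal sketch (not from the original comment) that assembles the same llama-cli arguments as a Python list, with a short note on each flag. The helper name `build_llama_cli_args` is hypothetical. The command above passes `-n` twice (768, then 2000); which value takes effect is not stated in the thread, so the sketch keeps a single `-n` and assumes 2000. The flag spellings are copied from the comment, and newer llama.cpp builds may expect different forms.

```python
import shlex


def build_llama_cli_args(model_path: str, prompt: str, n_predict: int = 2000) -> list[str]:
    """Assemble the llama-cli argument list from the comment (hypothetical helper)."""
    return [
        "./llama-cli",
        "-m", model_path,            # path to the .gguf checkpoint
        "-p", prompt,                # initial prompt
        "-n", str(n_predict),        # max tokens to generate (assumed: 2000, the later of the two -n values)
        "-b", "768",                 # batch size for prompt processing
        "-fa",                       # enable flash attention
        "-ctk", "q8_0",              # KV cache type for keys
        "-ctv", "q8_0",              # KV cache type for values
        "--keep", "-1",              # keep all tokens of the initial prompt
        "--min-p", "0.3",            # min-p sampling threshold
        "--top-p", "0.85",           # nucleus (top-p) sampling threshold
        "--temp", "1.69",            # sampling temperature
        "-ngl", "0",                 # offload 0 layers to the GPU, i.e. run on CPU
        "-t", "1",                   # use a single thread
        "-co",                       # colorize output
        "-cnv",                      # conversation mode
        "--reverse-prompt", "Assistant:",  # return control to the user at this string
    ]


if __name__ == "__main__":
    args = build_llama_cli_args(
        "model.gguf",
        "You're a Nasa jpl engineer teaching huamn about cats in space.",
    )
    # Print a shell-quoted command line; pass `args` to subprocess.run to execute it.
    print(shlex.join(args))
```

Keeping the arguments as a list avoids shell-quoting problems with the prompt string, and `-ngl 0` together with `-t 1` is what pins the run to a single CPU core.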

atharva • 1 year ago

so goooood, trying this out tonight

nisten - e/acc • 1 year ago

Release is coming, just hang on. Multiple new merges are going on at the same time for the llama.cpp and Aphrodite Engine implementations.
