1 cpu core - 160 Tokens per second
72,142 views • 1 year ago • via X (Twitter)
11 comments

on a 3 year old base model m1 macbook air

can't make this up, these are the actual training params for this particular checkpoint: 269 steps, 2469 sequence length, training at 4.2 it/s on 22420mb of ram on an nvidia L4 and it converged at 142 steps. This is art at this point.
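For a rough sense of scale, the quoted numbers can be turned into wall-clock time and tokens seen. This is only a back-of-the-envelope sketch: it assumes one "it" equals one training step and a batch size of 1 with every sequence at the full 2469 tokens, neither of which is stated in the post. The helper names are made up for illustration.

```python
# Back-of-the-envelope arithmetic from the numbers quoted in the post.
TOTAL_STEPS = 269      # from the post
CONVERGED_AT = 142     # from the post
SEQ_LEN = 2469         # from the post
ITERS_PER_SEC = 4.2    # from the post
BATCH_SIZE = 1         # ASSUMPTION: batch size is not stated in the post


def seconds_for(steps, iters_per_sec=ITERS_PER_SEC):
    """Wall-clock seconds for `steps`, assuming one iteration per step."""
    return steps / iters_per_sec


def tokens_seen(steps, seq_len=SEQ_LEN, batch_size=BATCH_SIZE):
    """Upper bound on tokens processed, assuming full-length sequences."""
    return steps * seq_len * batch_size


if __name__ == "__main__":
    for label, steps in (("full run", TOTAL_STEPS), ("convergence", CONVERGED_AT)):
        print(f"{label}: {seconds_for(steps):.1f} s, "
              f"up to {tokens_seen(steps):,} tokens")
```

Under those assumptions the whole run is about a minute of compute, and convergence lands at roughly half of that.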

wget

This is in 8-bit, by the way. It actually did 198 tps in mixed bitnet mode, but it was rambling a lot at a 76 MB gguf size, and that was a bitnet training run

If someone wants to code-review: I implemented a whole new optimizer from @cognitivecompai to train this and I'm getting some of the best conversational results so far. Also, Eric's looking for help to do a PR for hf transformers & axolotl. Training run code here

it looks like the optimizer step may not be being applied properly
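One generic way to check a suspicion like this is to snapshot the parameters, run a single update, and see whether anything moved. The sketch below is not the optimizer from this thread; it uses a toy SGD step on a plain Python list as a stand-in, and every function name is hypothetical. The broken variant shows one common failure mode, where the update is computed but written to a local copy.

```python
import copy


def sgd_step(params, grads, lr):
    """Toy SGD update that mutates `params` in place."""
    for i, g in enumerate(grads):
        params[i] -= lr * g


def broken_step(params, grads, lr):
    """Buggy variant: rebinds the local name, so the caller's list is unchanged."""
    params = [p - lr * g for p, g in zip(params, grads)]


def max_abs_change(before, after):
    """Largest absolute per-parameter difference between two snapshots."""
    return max(abs(a - b) for a, b in zip(after, before))


def step_was_applied(step_fn, params, grads, lr=0.1, tol=0.0):
    """Run one step on a copy of `params` and report whether anything moved."""
    working = copy.copy(params)
    snapshot = copy.copy(working)
    step_fn(working, grads, lr)
    return max_abs_change(snapshot, working) > tol


if __name__ == "__main__":
    params = [0.5, -1.0, 2.0]
    grads = [0.1, -0.2, 0.0]
    print("sgd_step applied:", step_was_applied(sgd_step, params, grads))
    print("broken_step applied:", step_was_applied(broken_step, params, grads))
```

In a real training loop the same before/after comparison would be done on the model's weight tensors around the optimizer call, with non-zero gradients confirmed first.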

If you need to reproduce results, I uploaded an 8-bit .gguf of the training checkpoint from this video for you to try. Careful, it's still very finicky. You may have to start prompts with "Human: How do raccoons meow". The base model is untrained.

This is a lie! 🔥 don’t believe it people! ❤️ PS: when do you release? @nisten 🤣

Try the checkpoint: ./llama-cli -n 768 -fa -b 768 --min-p 0.3 --top-p 0.85 -ctk q8_0 -ctv q8_0 --keep -1 -p "You're a Nasa jpl engineer teaching huamn about cats in space." -m model.gguf --temp 1.69 -ngl 0 -t 1 -co -cnv -n 2000 --reverse-prompt "Assistant:"
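For anyone scripting the run instead of pasting the one-liner, here is a hedged sketch that builds the same argument list in Python, with a comment per flag. The flag spellings are copied from the command in this thread; newer llama.cpp builds may have changed some of them (for example how `-fa` is passed). The command gives `-n` twice (768 and 2000); the sketch keeps only 2000 on the assumption that the later value wins. The function name and default paths are placeholders.

```python
import os
import subprocess

# Prompt copied verbatim from the command in the thread (typos included),
# since a checkpoint this finicky may be sensitive to the exact wording.
PROMPT = "You're a Nasa jpl engineer teaching huamn about cats in space."


def build_llama_cli_args(model_path="model.gguf", binary="./llama-cli"):
    """Return the argv list for the run described in the thread."""
    return [
        binary,
        "-m", model_path,              # path to the .gguf checkpoint
        "-p", PROMPT,                  # initial prompt
        "-n", "2000",                  # max tokens to generate (ASSUMPTION: later -n wins)
        "-b", "768",                   # batch size for prompt processing
        "-fa",                         # flash attention, as spelled in the thread
        "--min-p", "0.3",              # min-p sampling threshold
        "--top-p", "0.85",             # nucleus sampling threshold
        "--temp", "1.69",              # sampling temperature
        "-ctk", "q8_0",                # quantize the K cache to q8_0
        "-ctv", "q8_0",                # quantize the V cache to q8_0
        "--keep", "-1",                # keep all initial prompt tokens on context overflow
        "-ngl", "0",                   # offload no layers to the GPU
        "-t", "1",                     # one CPU thread, matching the "1 cpu core" claim
        "-co",                         # colored output
        "-cnv",                        # conversation (chat) mode
        "--reverse-prompt", "Assistant:",  # hand control back when this string appears
    ]


if __name__ == "__main__":
    args = build_llama_cli_args()
    if os.path.isfile(args[0]) and os.path.isfile("model.gguf"):
        subprocess.run(args, check=True)
    else:
        print("llama-cli or model.gguf not found; command would be:")
        print(" ".join(args))
```

The two settings that matter for the headline number are `-t 1` and `-ngl 0`, which pin generation to a single CPU thread with no GPU offload.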

so goooood, trying this out tonight

release is coming, just hang on, multiple new merges going on at the same time for the llama.cpp and aphrodite engine implementations


