GRPO Fine Tuning on Llama 3.1 8B
This is the second iteration of this blog post. I wrote the entire thing up, and didn't actually save the bed I had put it into. It must have gotten closed of...
A few months ago I met a friend for drinks at the pub. While waiting, I dove into this Anthropic article on agents. I took considerable notes because a, I fo...
Everyone on the internet is a fucking retard except me. And it has bothered me so much the last few months as I try and fit my mental models to the incorrect...
OPEN SOURCE/OFFENSIVE SECURITY TOOLING aka “for the greater good I wish to harm your ability to make a living!” Despite living on the stupid platform, I r...
Note from the future - I have gotten approx 18 months of amusement out of this bot, but the Twitter API no longer seems to allow free use at all. I am receiv...