Surya

@sdand • 15,797 subscribers

@llmh

Shorts

made an app that guesses where you are in the world from just a picture, using image embeddings trained on street view data. first time using SwiftUI and building consumer apps in general. TestFlight below

9,600,988 views

Just got GPT-3.5-turbo-instruct and turbo function calling to create math animations (manim) just from text. Now you can easily ask it to graph or answer questions, and you'll get a beautifully rendered animation explaining the concept. It works thanks to few-shot prompting 🎯

52,416 views

Videos

Can computer-use models play games now, one-shot? I gave Claude Opus 4.5 a simple prompt like "play league of legends" and it starts clicking and typing around my computer pretty effectively, even though it doesn't win due to latency. More interestingly, between Minecraft, finding car insurance, and booking flights I noticed some emergent behavior: persistently maximizing EV even if that involves shortcuts

411,680 views • 7 months ago

introducing vmux: `vmux run` bundles and runs your code in a long-running cloudflare container and gives you a preview url in under 5s. it's a drop-in for uv run - close your laptop, run a tinker training job, come back and reattach via tmux!

215,070 views • 6 months ago

night vision with the apple vision pro, no cameras needed

810,694 views • 2 years ago

claude playing TFT -- i ran it for two games and it improved in-context from 0 to 3/30 rounds won. it figured out "3-starring" on its own (buying pairs to upgrade units), which is a core mechanic of this game, and it hasn't been instructed on how to play besides being asked to "play tft and win". comparing this cua agent to OpenAI Five's result from 2017: that was a different game and a different model (a big LSTM), had access to Dota's API, and outputted a discrete action at 4.6hz. this cua agent runs at 0.15hz, considering it takes anywhere from 2-15 seconds to think (0.07hz without), and is just pixels in, any mouse/keyboard out... and (potentially) wasn't trained on TFT specifically, but not sure. despite this, 2-15s seems to be fine for a turn-based game like TFT, but in prior videos, league/minecraft were considerably harder for a cua agent, since by the time you take a screenshot the game state may have changed: you've wasted 7 seconds thinking, then another 7 seconds thinking about the new game state, and end up catastrophically failing like that. the intention here is not to play TFT or league with a bot. please don't use it as such

222,132 views • 7 months ago

made a site to run deepseek r1 for free -- locally in your browser, no downloads, no servers, purely using WebGPU. r1-web is open source, made in america, run on american servers, available below forever

440,183 views • 1 year ago

I made a site that uses WebGPU to run Qwen3 0.6B with thinking locally, directly in your browser, no installation or servers necessary, and it runs offline. Available forever for free and open source:

167,509 views • 8 months ago

I made an RL policy that guesses where a picture was taken without GPS data. It continuously learns, updating its weights with every use in realtime -- over the weekend it improved 13.9% with <100 images. Best of all, it does this without ever storing any image data. link below

167,896 views • 9 months ago

To solve AGI, we must first solve Geoguessr. For that I built vlm-gym, a simple RL gym written from scratch in JAX for Qwen3VL-4B (released yesterday), and added Geospot, an RL environment for geolocation, and learned that VLMs can learn how to geoguess. More:

140,348 views • 9 months ago

inspired by SDPO, i made continualcode -- a minimal claude code that learns from your corrections in real-time, built on tinker. when you deny a diff, the model uses your correction as context to teach itself, takes a gradient step on LoRA, and retries with updated weights. claude code but it updates the model weights!

82,719 views • 5 months ago

Introducing vmux - incredibly fast, stateful cloud sandboxes for coding agents. for the first time you get persistent GPU/CPU sandboxes via Modal/CF, backed by Durable Objects, to stream logs live, get native preview URLs, and attach a real shell. spin up a notebook or train nanogpt via codex - with a Modal sandbox spun up in seconds

75,417 views • 5 months ago

i cloned tiktok but the feed algo is recursively updated by a coding model and each video is sourced from wikipedia

25,555 views • 3 months ago

I made an app that lets Claude control my computer this Christmas. It started making music and can play Minecraft (at 4x speed)

Surya Dantuluri

36,548 views • 6 months ago

deploy to preview in under 3 seconds cold with vmux! i told a friend i didn't have any modal credits left on saturday; by monday night i rolled my own cpu modal - seems prevalent: if llms can code, you should give them easy access (besides ssh) to provision, run code, and deploy containers fast

Surya Dantuluri

12,275 views • 6 months ago
