How well can Qwen3.5 models debug code? I built BugFind-15: 15 buggy snippets across Python, JS, Rust, and Go. A Docker sandbox compiles and validates every fix. It includes two trap scenarios where the code is correct and the model must resist "fixing" it. Tested every Qwen3.5 size from 0.8B to...
35,006 views • 2 months ago • via X (Twitter)
