🚀 New Amazon Q Developer agent for software development is available to customers: This agent is based on a new agent architecture that has posted exciting SWE-bench scores (on the full and verified benchmarks), which measure AI models’ ability to resolve real-world coding problems. Interesting aspect of Q...

28,946 views • 1 year ago • via X (Twitter)

Related Videos

Introducing ALE-Bench, ALE-Agent! Towards Automating Long-Horizon Algorithm Engineering for Hard Optimization Problems

Blog: Paper:

ALE-Bench is a coding benchmark primarily focused on hard optimization (NP-hard) problems. We developed this benchmark with AtCoder Inc., a leading coding contest platform company. What makes ALE-Bench unique is its focus on hard optimization problems that demand long-horizon and creative reasoning. It’s open-ended, in the sense that true optima are out of reach (NP-hard) and scores can continuously improve. We believe this benchmark has the potential to become one of the key benchmarks for reasoning and coding in the next generation.

ALE-Agent is our end-to-end agent that we specifically designed for this challenging domain. In fact, our ALE-Agent has already built an impressive track record in the wild! In May 2025, our agent participated in a live AtCoder Heuristic Competition (AHC), alongside 1,000 other participants in real-time. AHC is considered to be one of the most challenging coding competitions in this domain. Our ALE-Agent achieved an impressive ranking of 21st out of 1,000 human participants in the competition (top 2%), marking a turning point for AI discovery of solutions to hard optimization problems with a wide spectrum of important real world applications such as logistics, routing, packing, factory production planning, power-grid balancing.

We look forward to applying this technology to real industrial optimization opportunities. Building on the insights from this study, Sakana AI will continue to tackle the challenge of developing AI with even greater algorithm engineering capabilities.

ALE-Bench Dataset: ALE-Bench Code:

This research was conducted in collaboration with AtCoder Inc. (AtCoder). We are deeply grateful for their outstanding expertise and contributions in optimization and algorithms, which were invaluable in providing data, analyzing results, and enabling our AI agent’s participation in their contests.

Sakana AI

237,195 views • 11 months ago

We are excited to announce a powerful step for the future of FOMO! Taking a page out of Virtuals' book on BASE, FOMO will be releasing the ability for future projects to be paired in $FOMO in the coming weeks. This is the biggest release we have ever announced.

Launch your AI Agent Token + $FOMO trading pair

Every individual agent token is paired with the $FOMO token in its liquidity pool. When launching an agent on you will need $FOMO tokens, which are used to create the liquidity pool. This process creates deflationary pressure for FOMO and the entire agent ecosystem. When creating your agent and token, you will have the option to pair your launch with FOMO or SOL, as our goal is not to alienate any project, but rather to invite the best communities, CTOs and builders to launch with us.

If you decide to pair your project with FOMO, you in turn get full marketing and dev support once your project graduates the bonding curve and reaches Raydium. Further, as an added incentive, as our revenue grows we will be using part of the funds to support projects that have paired in FOMO. And devs who launch tokens paired in FOMO will earn fees from their AI Agent token launch.

Building the most robust agents using our framework will catapult us to being one of the most prominent standards of the Solana ecosystem. Not only have we developed our own core infrastructure, but we also pull from some of the best repos and developer talent in all of AI, not just blockchain. Our team comprises 9 world-class artificial intelligence engineers, PhDs in mathematics and engineering from the top companies on the cutting edge of AI. The future of AI Agents will be on Solana and we will help lead the way.

FOMO

129,858 views • 1 year ago

i just built a 4-agent software team. everything runs from Telegram and gets managed on a kanban board. a project manager who plans the work, a backend developer, a frontend developer, and a tester. the PM reads a goal, breaks it into linked tasks, and assigns each to the right agent. the thing that makes them a team instead of four strangers is a shared kanban board. every task is a row that survives crashes, and when an agent finishes, it writes a summary of what it built and what the next agent needs to know. the next agent reads that summary before it starts. so the frontend developer never has to guess the API shape, and the tester knows exactly what to verify. the hardest part was not the coordination. it was building an agent that could actually act like a backend engineer. a backend engineer stands up a database, wires auth, manages storage, deploys functions, and keeps all of it consistent while the rest of the team builds on top. an agent doing this from scratch drowns. it burns its context window remembering which tables exist and which endpoint it created three steps ago, and the work degrades fast. so the backend agent needs a backend built for agents, not for humans clicking through a dashboard. that is where InsForge came in. it is an open-source, agent-native backend, and i added it to my backend developer agent as a skill. a skill is a step-by-step guide that teaches the agent how to do a specific kind of work. with InsForge installed, the agent stopped improvising infrastructure and followed a reliable path: create the project, define the database, set up auth, deploy functions. to test the whole team, i had them build a working Google Docs clone, AI features included. the backend agent spun up the full service on its own. database tables, user auth, document handling, and edge functions running real TypeScript, all in one dashboard. the frontend agent read that summary and built the UI on top of it, and the tester closed the loop. 
the result was a backend an agent could reason about end to end, instead of one it kept getting lost inside. if you are building an AI backend engineer, InsForge is worth a look, it's 100% open-source. InsForge GitHub: (don't forget to star 🌟) the full article on Hermes Kanban: Mission Control for your Agents is quoted below.
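The shared board described above (every task is a row that survives crashes, and each finished task leaves a summary for the next agent) can be sketched roughly as follows. This is a minimal illustration only, not the Hermes Kanban implementation: the SQLite storage, the `tasks` table, its column names, and the helper functions are all hypothetical, and it assumes each task depends on at most one upstream task.

```python
import sqlite3

# Hypothetical schema: one row per task. Table and column names are
# illustrative and are not taken from Hermes Kanban or InsForge.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    assignee   TEXT NOT NULL,
    depends_on INTEGER REFERENCES tasks(id),
    status     TEXT NOT NULL DEFAULT 'todo',
    summary    TEXT
);
"""

def open_board(path=":memory:"):
    """Open the board. Pass a file path so rows outlive an agent crash."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def add_task(conn, title, assignee, depends_on=None):
    """PM step: create a task, optionally linked to an upstream task."""
    cur = conn.execute(
        "INSERT INTO tasks (title, assignee, depends_on) VALUES (?, ?, ?)",
        (title, assignee, depends_on),
    )
    conn.commit()
    return cur.lastrowid

def next_task(conn, assignee):
    """Return (id, title, upstream_summary) for the first ready task, or None.

    A task is ready when it has no dependency or its dependency is done.
    """
    return conn.execute(
        """
        SELECT t.id, t.title, dep.summary
        FROM tasks AS t
        LEFT JOIN tasks AS dep ON dep.id = t.depends_on
        WHERE t.assignee = ? AND t.status = 'todo'
          AND (t.depends_on IS NULL OR dep.status = 'done')
        ORDER BY t.id
        LIMIT 1
        """,
        (assignee,),
    ).fetchone()

def finish_task(conn, task_id, summary):
    """Agent step: mark the task done and leave a handoff summary."""
    conn.execute(
        "UPDATE tasks SET status = 'done', summary = ? WHERE id = ?",
        (summary, task_id),
    )
    conn.commit()

# Usage: the frontend task stays blocked until the backend task is finished.
board = open_board()
api = add_task(board, "Build documents API", "backend")
ui = add_task(board, "Build editor UI", "frontend", depends_on=api)
blocked = next_task(board, "frontend")   # None while the API task is open
finish_task(board, api, "POST /documents returns {id, title}")
ready = next_task(board, "frontend")     # now includes the backend summary
```

In this sketch the handoff is just a column on the upstream row, so the frontend agent gets the API shape in the same query that hands it the task. A real board would likely also need task claiming and multiple dependencies per task.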

Akshay 🚀

114,135 views • 3 days ago