
Sakana AI
@SakanaAILabs • 62,421 subscribers
Sakana AI is an AI R&D company based in Tokyo. We want to develop AI solutions for Japan’s needs, and democratize AI in Japan. https://t.co/OWL1EpmWtn
Shorts
Videos

Sakana AIは、日本の美を学んだAIとして、浮世絵風画像生成モデルEvo-Ukiyoeと、浮世絵カラー化モデルEvo-Nishikieを公開します。 ブログ → Evo-Ukiyoe デモ → Evo-Nishikie デモ → 浮世絵は日本の代表的な美術として世界的に人気があるため、画像生成の世界でも「浮世絵」というキーワードがプロンプトによく使われています。しかし生成画像は日本風イラストレーションになりがちで、あまり浮世絵らしくありません。そこで、Sakana AIは、実際の浮世絵により近い画像を生成するため、立命館大学アート・リサーチセンター(ARC)立命館大学アート・リサーチセンター (Art Research Center) のご協力をいただき、ARC所蔵浮世絵作品のデジタル画像を学習した画像生成モデルを開発しました。 今回公開するモデルは、プロンプトから画像を生成するEvo-Ukiyoeと、古典籍の挿絵をカラー化するEvo-Nishikieモデルです。これらのモデルが、歴史や文化を学ぶための新たなコンテンツ作成に利用され、浮世絵に関する興味を増すことにつながり、日本や世界の人々が浮世絵や日本文化に興味を持つきっかけを生み出すことを期待しています。 なお、Evo-Nishikieでカラー化した古典籍『絵本玉かつら』の全丁は、Center for Open Data in the Humanities (CODH) のページで公開しています。 また、モデルは以下のページで公開しています。 Evo-Ukiyoe モデル → Evo-Nishikie モデル →
Sakana AI16,694,267 views • 1 year ago

We’re excited to introduce KAME: Tandem Architecture for Enhancing Knowledge in Real-Time Speech-to-Speech Conversational AI, accepted at #ICASSP2026! 🐢 Blog Paper Can a speech AI think deeply without pausing to process? In real conversation, we don’t wait until we’ve fully worked out what we want to say—we start talking, and our thoughts catch up as the sentence unfolds. Fast speech-to-speech models achieve this, but their reasoning tends to stay shallow. Cascaded pipelines that route through a knowledgeable LLM are smarter, but the added latency breaks the flow—they fall back to "think, then speak." In our new paper, we propose a way to break this trade-off. We call it KAME (Turtle in Japanese). A speech-to-speech model handles the fast response loop and starts replying immediately. In parallel, a backend LLM runs asynchronously, generating response candidates that are continuously injected as "oracle" signals in real time. This shifts the AI paradigm from "think, then speak" to "speak while thinking." The backend LLM is completely swappable. You can plug in GPT-4.1, Claude Opus, or Gemini 2.5 Flash depending on the task without changing the frontend. In our experiments, Claude tended to score higher on reasoning, while GPT did better on humanities questions. Try the model yourself here:
Sakana AI290,088 views • 1 month ago

What happens when you put competing neural networks in a Petri Dish and start changing the rules while they adapt? Last year we released Petri Dish NCA, where neural nets are the organisms that learn during simulation. Today we're releasing Digital Ecosystems: a browser-based platform for interactive artificial life research. The setup: several small CNNs share a 2D grid, each seeing only a 3x3 neighborhood. No global plan. They compete for territory by attacking neighbours and defending against incoming attacks, learning via gradient descent online while the simulation runs. What we didn't expect was the role of the learning itself. Gradient descent isn't just optimising each species' strategy. Instead, it acts to stabilize the whole system during simulation. Species that overextend get pushed back by the loss. Species that stagnate get nudged to grow. This means you can push parameters toward edge-of-chaos regimes: a zone characterised by emergent complexity. Letting the neural networks learn acts to hold the complex system together while you explore and interact. The platform lets you steer all of this interactively. You can draw walls to create niches, erase parts of the system online, and tune 40+ system parameters to explore the most interesting configurations. We find it mesmerizing to watch species carve out territories and reorganise when you perturb them. Everything runs client-side in your browser, no install needed. Blog: Code:
Sakana AI255,798 views • 1 month ago

Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels. The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch. Our system is also able to produce highly optimized CUDA kernels that are much faster than existing CUDA kernels commonly used in production. We believe that fundamentally, AI systems can and should be as resource-efficient as the human brain, and that the best path to achieve this efficiency is to use AI to make AI more efficient! We are excited to publish our paper, The AI CUDA Engineer: Agentic CUDA Kernel Discovery, Optimization and Composition. We also release a dataset of over 17,000 verified CUDA kernels produced by The AI CUDA Engineer. Paper: Kernel Archive Webpage: HuggingFace Dataset: The AI CUDA Engineer utilizes evolutionary LLM-driven code optimization to autonomously improve the runtime of machine learning operations. Our system is not only able to convert PyTorch code into CUDA kernels, but through the use of evolution, it can also optimize the runtime performance of CUDA kernels, fuse multiple operations, and even discover novel solutions for writing efficient CUDA operations by learning from past innovations! We believe The AI CUDA Engineer opens a new era of AI-driven acceleration of AI and automated inference time optimization. We (Robert Lange, Aaditya Prasad 🇺🇸, Suuun, Maxence Faldor, Yujin Tang, hardmaru) are excited to continue Sakana AI's mission of leveraging AI to improve AI.
Sakana AI1,149,274 views • 1 year ago

江戸時代の古文風テキストで会話できるチャットボット「からまる」を公開 ブログ: デモ: Sakana AIが江戸時代のテキストで学習した「からまる」は現代日本語で質問すると、江戸時代の世界観と当時の古文風テキストで回答してくれます。 「からまる」は、独自に構築した江戸テキストデータセットを元に、一貫して江戸時代の世界観を反映したテキストで回答します。このデータセットは、数千点以上の江戸時代の書物などを元に構築したもので、人間が作成したデータに加えて、今までテキスト化されてこなかった1,000冊以上の書物にもAIくずし字OCRを適用し、新たなデータを作成しました。この膨大なデータをもとに、江戸の世界に関して「からまる」が何を記憶し、回答できるようになったか、ぜひデモで会話しながら確かめてください。 また、本モデルは、分野に特化した大規模言語モデルの一例として、数千万文字規模の継続学習でも十分に有用な成果が得られることを示しています。これは、他の分野での同様のニーズへの展開可能性を示唆しています。 現代の知識を持ちながら江戸時代の世界観で自然に応答することは、人間には非常に困難です。しかし「からまる」は、それを実現し、時代を超えて過去の文化を身近に体感できるため、研究や教育分野での幅広い活用が期待されます。
Sakana AI658,044 views • 1 year ago

この度、新手法「TAID」を用いて学習された小規模日本語言語モデル「TinySwallow-1.5B」を公開しました。 私たちは、大規模言語モデル(LLM)の知識を効率的に小規模モデルへ転移させる新しい知識蒸留手法「TAID (Temporally Adaptive Interpolated Distillation)」を開発しました。この手法では、小規模モデルの学習進度に合わせて大規模モデルの知識を転移させることで、効果的な知識転移を実現します。この研究は機械学習分野の国際会議ICLR 2025に採択されました。 論文: GitHub: そして、TAIDを用いて32BパラメータのLLMから約1/20の大きさの1.5Bパラメータの小規模言語モデルへ知識転移を行い、同規模のモデルの中で最高性能となる日本語モデル「TinySwallow-1.5B」を作り出すことに成功しました。 小規模サイズである「TinySwallow-1.5B」は、外部APIなどを介さずお手元のスマートフォンやPCで完結したチャットが可能です。下記のウェブアプリのリンクから、ブラウザ上で動作するチャットアプリをお試しいただけます。 デモ: GitHub: モデル:
Sakana AI559,935 views • 1 year ago

Introducing Digital Red Queen (DRQ): Adversarial Program Evolution in Core War with LLMs Blog: Core War is a programming game where self-replicating assembly programs, called warriors, compete for control of a virtual machine. In this dynamic environment, where there is no distinction between code and data, warriors must crash opponents while defending themselves to survive. In this work, we explore how LLMs can drive open-ended adversarial evolution of these programs within Core War. Our approach is inspired by the Red Queen Hypothesis from evolutionary biology: the principle that species must continually adapt and evolve simply to survive against ever-changing competitors. We found that running our DRQ algorithm for longer durations produces warriors that become more generally robust. Most notably, we observed an emergent pressure towards convergent evolution. Independent runs, starting from completely different initial conditions, evolved toward similar general-purpose behaviors—mirroring how distinct species in nature often evolve similar traits to solve the same problems. Simulating these adversarial dynamics in an isolated sandbox offers a glimpse into the future, where deployed LLM systems might eventually compete against one another for computational or physical resources in the real world. This project is a collaboration between MIT and Sakana AI led by Akarsh Kumar Full Paper (Website): Full Paper (arxiv): Code:
Sakana AI143,164 views • 4 months ago

Can LLMs invent better ways to train LLMs? At Sakana AI, we’re pioneering AI-driven methods to automate AI research and discovery. We’re excited to release DiscoPOP: a new SOTA preference optimization algorithm that was discovered and written by an LLM! Our method leverages LLMs to propose and implement new preference optimization algorithms. We then train models with those algorithms and evaluate their performance, providing feedback to the LLM. By repeating this process for multiple generations in an evolutionary loop, the LLM discovers many highly-performant and novel preference optimization objectives! Paper: GitHub: Model: We proudly collaborated with the University of Oxford (Foerster Lab for AI Research) and Cambridge University (Mihaela van der Schaar) on this groundbreaking project. Looking ahead, we envision a future where AI-driven research reduces the need for extensive human intervention and computational resources. This will accelerate scientific discoveries and innovation, pushing the boundaries of what AI can achieve.
Sakana AI555,782 views • 2 years ago

Introducing Continuous Thought Machines New Blog: Modern AI is powerful, but it’s still distinct from human-like flexible intelligence. We believe neural timing is key. Our Continuous Thought Machine is built from the ground up to use neural dynamics as a powerful representation for intelligence. Thought takes time, and reasoning is a process. Biological brains inspire us with their complex neural activity, where neural timing is critical to intelligence. We’re exploring how to bring that power to AI. The Continuous Thought Machine (CTM) incorporates neuron-level temporal processing and neural synchronization, moving beyond current AI limitations. Our approach has two core innovations: (1) neuron-level temporal processing, where each neuron uses unique parameters to process a history of incoming signals for fine-grained temporal dynamics, and (2) neural synchronization, used as a direct latent representation to modulate data and produce outputs, encoding information directly in the timing of neural activity. Learn more about our approach: Interactive Report: Full Paper: GitHub :
Sakana AI289,548 views • 1 year ago

GPT-5 on Sudoku-Bench 🧩 Since releasing Sudoku-Bench in May 2025, when no LLM could solve a classic 9x9 puzzle, we've been evaluating the latest generation of models. GPT-5 now leads our leaderboard with 33% puzzles solved--approximately 2x the previous leader--and is the first LLM we've tested to solve a 9x9 Sudoku variant. However, with 67% of the much harder puzzles remaining unsolved, Sudoku-Bench continues to present significant challenges for AI reasoning. Modern Sudoku variants require models to first understand novel rulesets through meta-reasoning, then maintain global consistency across long reasoning chains. Our experiments with GRPO fine-tuning on Qwen2.5-7b and "Thought Cloning" (training on expert human reasoning from Cracking the Cryptic) show that current approaches still struggle with the spatial reasoning and creative "break-in" points that human solvers use naturally. We believe new approaches are required to solve our benchmark. These results highlight persistent gaps between computational problem-solving and human-like reasoning, particularly in tasks requiring integrated mathematical logic, spatial awareness, and creative insight. Read more about our update here: 🔗 Blogpost →
Sakana AI154,512 views • 6 months ago

Introducing SoftMatcha 2: A Fast and Soft Pattern Matcher for Trillion-Scale Pre-Training Corpora What lies within a trillion-scale pre-training corpus? Can you truly guarantee your benchmarks are uncontaminated simply because there are no exact string matches? Alongside several research institutions in Japan, Sakana AI is proud to have collaborated in the development of SoftMatcha 2, an ultra-fast and flexible search tool that enables search over trillion-scale natural language corpora in under 0.3 seconds, even while handling semantic variations (substitution, insertion, and deletion). No existing tool meets all these criteria, including infini-gram-mini (EMNLP’25 Best Paper) or the original SoftMatcha (ICLR’25). Our approach employs string matching based on suffix arrays that scales well with corpus size. To mitigate the combinatorial explosion induced by the semantic relaxation of queries, our method is built on two key algorithmic ideas: fast exact lookup enabled by a disk-aware design, and dynamic corpus-aware pruning. As a practical application, we demonstrate that SoftMatcha 2 identifies potential benchmark contamination in pre-training corpora that existing exact-match approaches miss. You can try searching through a 100B-scale corpus via our online demo. The system remains blazingly fast even on trillion-token corpora, so we encourage you to host it yourself for larger scales. Demo: Paper: Code: This work is a collaboration with researchers from the University of Tokyo, NII, Kyoto University, SOKENDAI, NINJAL, Tohoku University, and RIKEN.
Sakana AI88,647 views • 3 months ago

Introducing ALE-Bench, ALE-Agent! Towards Automating Long-Horizon Algorithm Engineering for Hard Optimization Problems Blog: Paper: ALE-Bench is a coding benchmark primarily focused on hard optimization (NP-hard) problems. We developed this benchmark with AtCoder Inc., a leading coding contest platform company. What makes ALE-Bench unique is its focus on hard optimization problems that demand long-horizon and creative reasoning. It’s open-ended, in the sense that true optima are out of reach (NP-hard) and scores can continuously improve. We believe this benchmark has the potential to become one of the key benchmarks for reasoning and coding in the next generation. ALE-Agent is our end-to-end agent that we specifically designed for this challenging domain. In fact, our ALE-Agent has already built an impressive track record in the wild! In May 2025, our agent participated in a live AtCoder Heuristic Competition (AHC), alongside 1,000 other participants in real-time. AHC is considered to be one of the most challenging coding competitions in this domain. Our ALE-Agent achieved an impressive ranking of 21st out of 1,000 human participants in the competition (top 2%), marking a turning point for AI discovery of solutions to hard optimization problems with a wide spectrum of important real world applications such as logistics, routing, packing, factory production planning, power-grid balancing. We look forward to applying this technology to real industrial optimization opportunities. Building on the insights from this study, Sakana AI will continue to tackle the challenge of developing AI with even greater algorithm engineering capabilities. ALE-Bench Dataset: ALE-Bench Code: This research was conducted in collaboration with AtCoder Inc. (AtCoder). We are deeply grateful for their outstanding expertise and contributions in optimization and algorithms, which were invaluable in providing data, analyzing results, and enabling our AI agent’s participation in their contests.
Sakana AI237,195 views • 11 months ago

We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨ The CTM is an AI that mimics biological brains by using neural dynamics & synchronization to think over time. It can solve complex mazes by building internal maps, gaze around images to classify them, and learn algorithms—all emergent from its core design. This is just the beginning. A hint of what we're exploring next… (video attached!) The team: Luke Darlow Ciaran Sebastian Risi Jeffrey Seely Llion Jones
Sakana AI168,099 views • 8 months ago

Introducing Reinforcement-Learned Teachers (RLTs): Transforming how we teach LLMs to reason with reinforcement learning (RL). Blog: Paper: Traditional RL focuses on “learning to solve” challenging problems with expensive LLMs and constitutes a key step in making student AI systems ultimately acquire reasoning capabilities via distillation and cold-starting. Enter our RLTs—a new class of models prompted with not only a problem’s question but also its solution, and directly trained to generate clear, step-by-step “explanations” to teach their students. Remarkably, an RLT with only 7B parameters produces superior results when distilling and cold-starting students in competitive and graduate-level reasoning tasks than orders-of-magnitude larger LLMs. RLTs are as effective even when distilling 32B students, much larger than the teacher itself—unlocking a new standard for efficiency in developing reasoning language models with RL. Code:
Sakana AI179,125 views • 11 months ago

Introducing Petri Dish Neural Cellular Automata (PD-NCA) 🦠 The search for open-ended complexification, a north star of Artificial Life (ALife) simulations, is a question that fascinates us deeply. In this work we explore the role of continual adaptation in ALife simulation, where the cellular automata in our system do not rely on a fixed set of parameters, but rather learn continuously during the simulation itself. Our Petri Dish Neural Cellular Automata (PD-NCA) is a new ALife substrate that consists of a differentiable world where multiple NCA learn to self-replicate and grow via ongoing gradient descent. Every individual is constantly trying to grow, all the while learning to adapt and out-compete its neighbors. PD-NCA allows for complex, emergent behaviors like cyclic dynamics, territorial defense, and spontaneous cooperation. The video below shows the sheer variety and complexity that unfolds during several different simulations (each colour is a different NCA).
Sakana AI105,032 views • 7 months ago

Introducing CycleQD: A population-based model merging via Quality Diversity CycleQD builds on our model merging research, advancing two fronts: evolving a swarm of specialized agents to complement one another, and laying the groundwork for life-long learning by enabling diverse, adaptable skill acquisition at the population-level.
Sakana AI121,513 views • 1 year ago
0:25
Sensitive content
This media may contain sensitive content.