Introducing ml-intern, the agent that just automated the post-training team at Hugging Face.

It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt; it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates, and builds deeply research-backed... models for any use case. All built on the Hugging Face ecosystem.

It can pull off crazy things:

We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper, found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score from 10% to 32% on GPQA in under 10h. Claude Code's best: 22.99%.

In healthcare settings, it inspected the available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then it upsampled them 50x for training. Beat Codex on HealthBench by 60%.

For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs, watched rewards climb and then collapse, and ran ablations until it succeeded.

All fully backed by papers, autonomously.

How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arXiv and reads them fully, walks citation graphs, pulls datasets referenced in methodology sections
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains

ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.

Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI:
Web + mobile:

And the best part? We also provisioned $1k of GPU resources and Anthropic credits for the quickest among you to use.

Aksel

3,115 subscribers

1,260,736 views • 1 month ago • via X (Twitter)

Health & Wellness, Education, Science & Technology

Related videos

OpenAI's Deep Research is getting a run for its money. Deep Lake was just released, and it's a different take on an AI system that can do deep research on your own data. You can use Deep Lake to build AI search with reasoning on your private and public data. (Look at the attached videos to get an idea of how it works.) If you want to research proprietary and sensitive data, Deep Research won't help you because it's limited to public data. Deep Lake, however, will allow you to use your private data. On top of that, Deep Lake supports multi-modal retrieval from the ground up. It uses vision language models for data ingestion and retrieval so that you can connect any data (PDFs, images, videos, structured data, etc.) You can even use mixed-data queries! Deep Lake can search your data from S3, Dropbox, and GCP. It learns from your queries over time, making the results as relevant to your work as possible!

Santiago

171,340 views • 1 year ago

Chat with papers: for any arXiv link to an HF paper, you can now chat using HuggingChat. All Hugging Face Papers now include a built-in assistant, powered by HuggingChat and the Hugging Face MCP server. It helps you quickly understand papers by answering questions, summarizing key ideas, and providing context as you browse the latest research.

AK

39,895 views • 5 months ago

🚨 JUST IN: CHINA just released an AI EMPLOYEE that works 24X7 on its own. 100% OPEN SOURCE. It researches, codes, builds websites, creates slide decks, and generates videos. All by itself. All on your computer. It's called DeerFlow. You give it a task. It makes a plan, spins up its own team of sub-agents, and gets to work. You come back and there's a finished deliverable waiting. Not a draft. Not a summary. The actual thing. Not a chatbot. Not a research assistant. An AI with its own computer that works while you sleep. Here's what it does on its own: → Spawns multiple sub-agents in parallel, each tackling a different piece of your task, then combines everything into one finished output → Writes real code, runs it, reads the results, and fixes its own mistakes without asking you once → Builds slide decks, websites, full research reports, and data dashboards from scratch → Remembers you across sessions. Your writing style. Your tech stack. Your preferences. Gets better every time. → Reads files you upload, works with them inside its own filesystem, hands you clean finished outputs → Searches the web, runs commands, calls any tool you plug in Here's how it thinks: You give one instruction. The lead agent makes a plan. Sub-agents fan out and work in parallel. Results come back. Everything gets synthesized. You get a deliverable. A single research task might split into a dozen sub-agents, each exploring a different angle, then converge into one finished website with generated visuals. Here's the wildest part: DeerFlow 2.0 launched on February 28th 2026 and hit number 1 on all of GitHub Trending the same day. Version 2.0 was a complete rewrite. Zero shared code with version 1. Because users kept using it for things the team never intended. Data pipelines. Dashboards. Entire content workflows. The community told them what it needed to become. So they burned it down and rebuilt it. 22.7K GitHub stars. 2.7K forks. Built by ByteDance 100% Open Source. MIT License.

Kanika

735,266 views • 2 months ago

You can now make your own shadcn/ui and use it in v0. Create it on open it in v0, and use it as the foundation for your app.

v0

56,019 views • 6 months ago

I decided to share part of a prompt you can use to research any protocol in seconds using INFINIT Intelligence by INFINIT. As an example, I used Silo Labs, where I currently farm most of my stablecoin yields. 🔖 Bookmark this + read until the end for a bonus. Just replace [PROTOCOL NAME] with any protocol you want to research. Prompt: "Conduct thorough research on [PROTOCOL NAME] and answer the following questions: - What is the project building? - What problem does it solve, and for whom? - What makes it different from others? - What blockchain is it built on? - Is there a token? What’s its purpose? - How does the protocol work technically? - How does it make money or sustain itself? - What’s the staking model, emissions, burns, and treasury? - Who are the founders, and is the team public and credible? - Who funded it? Who holds most tokens? - Are users, TVL, and volume growing? - How strong is the ecosystem around it? - Who are the main competitors, and how does it compare? - Is the protocol audited? Any past hacks? - What are the best ways to use the protocol (strategies)? - What’s coming next? Key milestones or launches? - What are the biggest risks? - How strong is the community and social traction?" Bonus: Quote this post with a reason why you like INFINIT, and I’ll DM you the full version of the prompt within the next 24 hours.

Keno

15,876 views • 11 months ago

3 weeks since ml-intern launched and we just hit 1M messages exchanged. that's 3.3 agent-years of ML research in 21 days. 2 months' worth of research every day. 17,383 training jobs total. talk about AI acceleration. here's some of what people built: Carlos Miguel Patiño replicated the full DeepSeek v4 architecture and pre+post trained a 100M MoE from scratch. → it landed a third-place submission on the Keller Jordan optimizer competition. autoresearch on SOTA territory. Lewis Tunstall got the intern to convert Alec Radford's cool new talkie-lm 1930 model to work with transformers. tokenizer, chat template, model conversion etc all one-shotted by ml-intern. someone drafted an entire PhD dissertation chapter on context-aware agentic cyber defense with 16 research subagents. and someone used it to crack a Paul Jankura kernel optimization take-home. (we don't know how to feel about this one 👀 ) just getting started →

Aksel

35,091 views • 1 month ago
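
As a rough consistency check on the launch figures quoted in the ml-intern post above (1M messages, 3.3 agent-years, 21 days, 17,383 training jobs), the per-day rates can be worked out directly. This is only a back-of-the-envelope sketch: the post does not say how 3.3 agent-years is derived from the message count, so the code takes the quoted totals at face value, and all variable names are illustrative.

```python
# Totals as quoted in the post; nothing here is measured independently.
AGENT_YEARS = 3.3
DAYS = 21
MESSAGES = 1_000_000
TRAINING_JOBS = 17_383
MONTHS_PER_YEAR = 12


def per_day(total: float, days: int = DAYS) -> float:
    """Average amount per calendar day over the quoted window."""
    return total / days


# Convert agent-years to agent-months, then spread over the 21 days.
research_months_per_day = per_day(AGENT_YEARS * MONTHS_PER_YEAR)
messages_per_day = per_day(MESSAGES)
jobs_per_day = per_day(TRAINING_JOBS)

print(f"{research_months_per_day:.2f} agent-months of research per day")
print(f"{messages_per_day:,.0f} messages per day")
print(f"{jobs_per_day:,.0f} training jobs per day")
```

The first figure comes out a little under two months per day, which matches the post's "2 months' worth of research every day" after rounding.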

Dexter vs. Claude Code I ran tests overnight and Dexter came out ahead on complex financial tasks that required deep research. Dexter won on: • speed (by 92%) • cost (by 26%) • correctness (by 31%) I use Claude Code often, so this was fun to see. A key challenge for CC is that it relies on web search for financial data. Most of what it finds comes from news sites, blogs, and other secondary sources. Dexter uses primary source data from Financial Datasets, so the performance gap makes sense. Plenty of room to improve on Dexter. The gap will only grow from here. Evals from vals. Report coming next.

virat

26,638 views • 6 months ago

🐯 as soon as we finished our [training] completion ceremony, i talked a lot with jihoon [on the phone], i think we talked for almost an hour 🐯 we were like, "you went through that too?" "i did too," and such... it also kinda differs a bit depending on the training camp

🌌

14,863 views • 6 months ago

We are entering an extremely exciting era for open-weight models. Kimi K2.6 now feels like a top agentic model. I took it for a spin via Fireworks AI fast inference APIs. Kimi K2.6 has impressive agentic capabilities, design skills, and the ability to synthesize large amounts of information. I built a little Skill that produces survey papers on any AI research topic you want. (see example in the clip) You can use the skill to tell your agent to generate a survey on whatever topic and watch it go to work. The artifact was fully generated by Kimi.ai's Kimi K2.6. It's cheap and fast. Next step for me is to explore ways to continue integrating the capabilities of these models on use cases like automating my LLM knowledge bases and augmenting my agent memory capabilities. Stay tuned for more.

elvis

47,678 views • 1 month ago

ANTHROPIC JUST TURNED AI AGENTS INTO GIT REPOS Anthropic shipped "ant" - a CLI that runs every Claude API endpoint straight from your terminal. The headline isn't the terminal access. It's that you can now version-control an AI agent as YAML in Git and have CI sync it to the Claude Platform, the same way you ship code. - Every API resource is a subcommand: messages, models, files, agents, sessions - Define an agent in a YAML file, check it into your repo, and keep it in sync with one update command - Spin up a session, send it an event, then pull every event and tool call back from the same CLI - Claude Code knows how to drive ant out of the box - it shells out and reads the results with no glue code Agents just stopped being prompts you babysit and became infrastructure you deploy.

BuBBliK

199,701 views • 17 days ago

Placing objects sounds simple… until robots have to do it. This method makes it simple, fast & reliable. [Github ⬇️] Robotic object placement is tough, especially with stacking, hanging, or insertion. AnyPlace is a new two-stage method that uses only synthetic data and a vision-language model to teach robots where and how to place objects; even in the real world. Why this works ✅ Finds the right spot with help from vision-language models ✅ Handles stacking, insertion, and hanging with no real-world training ✅ Trained on synthetic data using Blender and IsaacSim ✅ Works in the real world without fine-tuning It shows that smart use of simulation and language models can make robotic placement tasks easier, faster, and more reliable. Github: Paper: Thank you for sharing Animesh Garg !

Ilir Aliu - eu/acc

22,843 views • 1 year ago

BREAKING: Anthropic just dropped Opus 4.8, and it is a MONSTER. We've been testing it for about a week at Every 📧, and our verdict is they could've just called it Opus 5, it's that good. Here's our vibe check: - Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63, a hair higher than GPT-5.5's score of 62 and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results. - Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark, which measures models on real-world writing tasks we do all the time, like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had a higher incidence of AI-isms; we found best results with high. - Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark. - Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic. THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude. Anthropic is back baby! Read the rest on Every 📧:

Dan Shipper 📧

350,939 views • 22 days ago

MiniMax is the James Bond of AI agents. It uses the world's first open-weight model (MiniMax-M1), and it squeezes every bit of power from it. The agent takes a prompt and does more than any other agent in the market right now: 1. It can do Deep Research 2. It can write code 3. It can design web pages 4. It can build 3D models I built 5 different experiences using MiniMax and recorded them for you:

Santiago

44,730 views • 1 year ago

Fine-tune DeepSeek-OCR on your own language! (100% local) DeepSeek-OCR is a 3B-parameter vision model that achieves 97% precision while using 10× fewer vision tokens than text-based LLMs. It handles tables, papers, and handwriting without killing your GPU or budget. Why it matters: Most vision models treat documents as massive sequences of tokens, making long-context processing expensive and slow. DeepSeek-OCR uses context optical compression to convert 2D layouts into vision tokens, enabling efficient processing of complex documents. The best part? You can easily fine-tune it for your specific use case on a single GPU. I used Unsloth to run this experiment on Persian text and saw an 88.26% improvement in character error rate. ↳ Base model: 149% character error rate (CER) ↳ Fine-tuned model: 60% CER (57% more accurate) ↳ Training time: 60 steps on a single GPU Persian was just the test case. You can swap in your own dataset for any language, document type, or specific domain you're working with. I've shared the complete guide in the next tweet - all the code, notebooks, and environment setup ready to run with a single click. Everything is 100% open-source!

Akshay 🚀

126,036 views • 7 months ago
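
The character error rate (CER) figures in the DeepSeek-OCR post above are easier to read with the metric written out. A minimal sketch, assuming the common definition of CER as character-level edit distance divided by reference length (the post does not say which implementation was used): under this definition, a model that emits a lot of extra text can score above 100%, which is how a 149% baseline is possible. The function names are illustrative, not taken from Unsloth or DeepSeek-OCR.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error removed."""
    return (before - after) / before


print(cer("kitten", "sitting"))         # 3 edits over 6 reference characters
print(cer("abc", "abcxyzxyz"))          # 6 spurious characters over 3: above 100%
print(relative_reduction(1.49, 0.60))   # rounded CERs from the post
```

With the rounded values from the post (149% and 60%), the absolute drop is about 89 percentage points and the relative reduction is about 60%. The post's own 88.26% and 57% figures presumably come from unrounded values or a slightly different definition, which it does not spell out.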

You can now use Qwen3-VL in Jan. Find the GGUF model on Hugging Face, click "Use this model" and select Jan, or copy the model link and paste it into Jan Hub. Thanks Qwen 🧡

👋 Jan

44,150 views • 7 months ago

I just built a Meta Ads diagnostic in Claude Code that tells you WHY your account broke, not just what changed 🤯 It spins up a team of agents that each investigate a different reason performance dropped, then argue against each other to kill the wrong answer before it ever reaches you. All inside Claude Code. Perfect for DTC brands and agencies who panic-kill creative the second CPA spikes. If you've watched ROAS fall off a cliff and opened Ads Manager with ten tabs going, you already know what happens next. Your gut says "creative fatigue." You kill your best-performing ad. A week later performance is still broken, because that was never the problem. Guessing wrong is the most expensive move in paid social. This workflow ends the guessing: → One agent investigates each competing theory — creative fatigue, budget and delivery changes, traffic quality, offer and seasonality → Each one is blind to the others, reasoning only from its own slice of the data so they can't bias each other → A refuter agent then attacks every surviving theory and tries to kill it → A theory only stands if the data can't disprove it → You get a ranked diagnosis: the real cause, the evidence for and against it, and the one move to make this week No anchoring on the first obvious answer. No killing winning creative on a hunch. No "here's what happened" reports that never tell you why. What you get: → Every theory tested in parallel instead of one biased guess → An adversarial pass that kills the wrong answer before you act on it → A ranked diagnosis with confidence levels and evidence both ways → A reusable workflow you drop next month's export into and re-run Built 100% in Claude Code with the new dynamic workflows. The first account I ran it on looked like textbook creative fatigue. The workflow disagreed, and traced the real cause to a budget change that had doubled spend and flooded delivery with junk traffic. 
I put together a full playbook with the exact workflow, the prompt, and how to run it on your own account. Want it for free? > Like this post > Comment "META" And I'll send it over (must be following so I can DM)

I just built a Meta Ads diagnostic in Claude Code that tells you WHY your account broke, not just what changed 🤯 It spins up a team of agents that each investigate a different reason performance dropped, then argue against each other to kill the wrong answer before it ever reaches you. All inside Claude Code. Perfect for DTC brands and agencies who panic-kill creative the second CPA spikes. If you've watched ROAS fall off a cliff and opened Ads Manager with ten tabs going, you already know what happens next. Your gut says "creative fatigue." You kill your best-performing ad. A week later performance is still broken, because that was never the problem. Guessing wrong is the most expensive move in paid social. This workflow ends the guessing: → One agent investigates each competing theory — creative fatigue, budget and delivery changes, traffic quality, offer and seasonality → Each one is blind to the others, reasoning only from its own slice of the data so they can't bias each other → A refuter agent then attacks every surviving theory and tries to kill it → A theory only stands if the data can't disprove it → You get a ranked diagnosis: the real cause, the evidence for and against it, and the one move to make this week No anchoring on the first obvious answer. No killing winning creative on a hunch. No "here's what happened" reports that never tell you why. What you get: → Every theory tested in parallel instead of one biased guess → An adversarial pass that kills the wrong answer before you act on it → A ranked diagnosis with confidence levels and evidence both ways → A reusable workflow you drop next month's export into and re-run Built 100% in Claude Code with the new dynamic workflows. The first account I ran it on looked like textbook creative fatigue. The workflow disagreed, and traced the real cause to a budget change that had doubled spend and flooded delivery with junk traffic. 
I put together a full playbook with the exact workflow, the prompt, and how to run it on your own account. Want it for free? > Like this post > Comment "META" And I'll send it over (must be following so I can DM)
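The investigate-then-refute loop described in the post can be sketched roughly as below. This is not the author's playbook, only a minimal illustration: the `Finding` class, the scoring functions, the 0..1 scores and the sample numbers are all made up for the example, and plain Python functions stand in for the agents.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    theory: str
    support: float        # 0..1, evidence for the theory (hypothetical scale)
    against: float = 0.0  # 0..1, evidence found by the refuter

    @property
    def confidence(self) -> float:
        return max(0.0, self.support - self.against)


def investigate(theories: Dict[str, Callable[[dict], float]],
                slices: Dict[str, dict]) -> List[Finding]:
    """Run every investigator in parallel, each seeing only its own data slice."""
    def run(name: str) -> Finding:
        return Finding(name, theories[name](slices[name]))
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run, theories))


def refute(findings: List[Finding],
           refuter: Callable[[str, dict], float],
           full_data: dict) -> List[Finding]:
    """Attack every theory; keep only those the data cannot disprove, ranked."""
    survivors = []
    for f in findings:
        f.against = refuter(f.theory, full_data)
        if f.confidence > 0:
            survivors.append(f)
    return sorted(survivors, key=lambda f: f.confidence, reverse=True)


# Toy stand-ins for the agents (assumptions, not real diagnostics).
theories = {
    "creative fatigue": lambda s: 0.6 if s["frequency"] > 3 else 0.2,
    "budget change": lambda s: 0.9 if s["spend_change"] >= 2.0 else 0.1,
}
slices = {
    "creative fatigue": {"frequency": 3.5, "ctr_change": 0.0},
    "budget change": {"spend_change": 2.0},
}


def toy_refuter(theory: str, data: dict) -> float:
    # Fatigue is hard to defend if click-through rate did not actually drop.
    if theory == "creative fatigue" and data["creative fatigue"]["ctr_change"] >= 0:
        return 0.7
    return 0.1


ranked = refute(investigate(theories, slices), toy_refuter, slices)
for f in ranked:
    print(f.theory, round(f.confidence, 2))
```

With these made-up inputs the fatigue theory is dropped by the refuter and the budget change is the only survivor, which mirrors the anecdote in the post.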

Mike Futia

12,371 views • 16 days ago

this AI agent builds and sells info products on full autopilot. here's how:

- scan subreddits like r/anxiety, r/solotravel, r/socialskills, r/overthinking every few hours
- find the fears people keep posting about over and over
- generate a short PDF guide that actually helps them through it
- spin up a landing page with payments built in
- scan Reddit 24/7 for people posting about that exact problem and drop helpful comments pointing them to the guide
- run completely hands off

it finds the pain, builds the product and finds the customers. fully automated

reply "AGENT" + RT and I'll send you a free guide so you can set it up too (must be following so I can DM)
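the "find the fears people keep posting about over and over" step could look something like this. just a sketch: the `recurring_themes` helper, the keyword lists and the sample titles are invented for illustration, and fetching posts from Reddit is left out entirely.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def recurring_themes(titles: Iterable[str],
                     themes: Dict[str, List[str]],
                     min_posts: int = 2) -> List[Tuple[str, int]]:
    """Count how many post titles mention each theme; keep the ones that recur."""
    counts: Counter = Counter()
    for title in titles:
        text = title.lower()
        for theme, keywords in themes.items():
            if any(keyword in text for keyword in keywords):
                counts[theme] += 1
    return [(theme, n) for theme, n in counts.most_common() if n >= min_posts]


# Made-up sample titles and keyword lists (assumptions for the example).
sample_titles = [
    "Scared of flying alone for the first time",
    "How do I stop overthinking every text I send",
    "First solo trip and terrified of the flight",
    "Overthinking ruined my week again",
]
sample_themes = {
    "fear of flying": ["flying", "flight"],
    "overthinking": ["overthinking"],
    "public speaking": ["presentation", "speech"],
}

print(recurring_themes(sample_titles, sample_themes))
```

plain keyword matching is the simplest thing that works here; anything smarter (clustering, embeddings) would slot into the same function.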


Chris

24,175 views • 2 months ago

Meet Stable Audio 3.0, the open-weight model family built for artistic experimentation.

This is our open invitation to experiment with generative audio. We believe the best innovations are still waiting to be built.

The 4-1-1 on 3.0:

📣 You own your outputs, and can distribute and commercialize them under the Stability AI Community License (up to $1 million in revenue).
🎵 New and improved capabilities include variable-length generation up to six minutes, and full song composition on portable devices, no GPU required.
✅ Trained on a fully licensed dataset.
🎨 You can customize the models on your own library with support for LoRA training, which we’ve documented for the first time.

More on the models 👇


Stability AI

154,029 views • 1 month ago

subagents are just recursive agents where you can apply different prompts + models depending on the task. since they’re just a primitive, Cursor cli can actually spawn subagents by calling cursor-agent in headless mode via shell commands.

that’s what makes the cli so nice. you can extend it, experiment, and have a lot of fun exploring orchestration patterns.

here’s one way to do it w. dynamic model selection:

1. create a subagents.mdc rule
2. drop in:

```
---
alwaysApply: true
---
ALWAYS spawn subagents by running `cursor-agent -p [task] --output-format=text --force --model [model]` in the terminal.
Each subagent should return a summary of the changes it made.
Subagents should be used for ALL tasks

You can adopt a fan-out pattern where you spawn subagents to perform parallel isolated tasks, and then fan-in the results.

Use the following models:
- `--model gpt-5` for reasoning, researching, and planning
- `--model sonnet-4` for implementation
```

3. start cursor cli and try it out

you can also adjust the rule to be more explicit when it should use subagents, when not to, which models when etc.
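the fan-out / fan-in pattern from the rule can also be driven from a script instead of a rule file. a rough sketch, assuming `cursor-agent` is on your PATH and accepts the flags exactly as quoted in the rule above; the helper names (`build_command`, `run_cli`, `fan_out`) and the sample tasks are made up here, and the runner is swappable so you can dry-run without calling the CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def build_command(task: str, model: str) -> List[str]:
    # Flags copied from the rule in the post; nothing else is assumed about the CLI.
    return ["cursor-agent", "-p", task, "--output-format=text", "--force", "--model", model]


def run_cli(cmd: List[str]) -> str:
    """Run one subagent and return its text summary (raises if the command fails)."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def fan_out(jobs: List[Tuple[str, str]],
            runner: Callable[[List[str]], str] = run_cli,
            max_workers: int = 4) -> List[str]:
    """Spawn one subagent per (task, model) pair in parallel, then fan in the summaries."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: runner(build_command(*job)), jobs))


# Hypothetical tasks; the models follow the split suggested in the rule.
jobs = [
    ("plan the refactor of the auth module", "gpt-5"),
    ("implement the planned refactor", "sonnet-4"),
]

# Dry run: print the commands instead of executing them.
for line in fan_out(jobs, runner=lambda cmd: " ".join(cmd)):
    print(line)
```

drop the `runner=` argument to actually execute the commands. results come back in the same order as `jobs`, so the fan-in step can just zip them together.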


eric zakariasson

54,017 views • 10 months ago