We’ve released our full paper on the Stable Audio model 💿 arXiv: Code: Metrics: Demo: 🧵

Stable Audio

2,727 subscribers

132,152 views • 2 years ago • via X (Twitter)

Art • Science & Technology • Music

9 comments

Stable Audio • 2 years ago

We present the ‘Stable Audio AudioSparx 1.0’ model that can generate long-form, variable-length stereo music and sounds at 44.1kHz. It’s capable of rendering stereo signals of up to 95 sec at 44.1kHz in 8 sec on an A100 GPU 🚀

Stable Audio • 2 years ago

Not to brag, but Stable Audio outperforms AudioLDM2 and MusicGen—check out the metrics in the paper. That’s not all, it’s great at generating music. Have a listen below ⬇️

Stable Audio • 2 years ago

Stable Audio can generate long-form music with structure (intro, development and outro) from text prompts.

Stable Audio • 2 years ago

It can generate stereo sound effects from text prompts.

Stable Audio • 2 years ago

It's also very good at generating music loops.

Stable Audio • 2 years ago

Great work @zqevans @jordiponsdotme @dadabots @drscotthawley @ODDsWithTheReal

Ivan Rubachev • 2 years ago

Cool! Do you plan on releasing weights in the future? Or maybe including this to the

thecollabagepatch • 2 years ago

i am obsessed with the continuations that #musicgen is capable of generating based upon my input audio. hoping stable audio is working this in as well text prompting isn't very fun for musicians

neil turkewitz • 2 years ago

How is it trained? Do you only use materials with the consent of the original creator? Your parent company @StabilityAI has argued that they can use creative works to train AI models without consent based on “fair use.” Is that how this product was created? @ednewtonrex

Related videos

how i found bro after he fell in love with a new model architecture paper on arxiv

tender

12,965 views • 8 days ago

[1/2] We’ve released the code for #pix2pixturbo and #CycleGANTurbo. These conditional GANs are able to adapt a text-to-image model such as SD-Turbo for both paired and unpaired image translation with a single step (0.11 sec on A100 and 0.29 sec on A6000). Try our code and the Gradio demo. Paper: Code: Demo: This is a joint work with Gaurav Parmar (the leading author), Taesung Park, and Srinivasa Narasimhan. This work shows that a pre-trained one-step model can be easily adapted to conditional GANs frameworks for downstream image editing and synthesis tasks. #Edges2Cats

Jun-Yan Zhu

36,488 views • 2 years ago

Food AI is about to have its ChatGPT moment. Our first paper is now on arXiv: Epicure. For decades, food has been treated as too human, too sensory, too cultural, and too computationally intensive to model properly. We broke that assumption. 🧵

Josef Chen

126,470 views • 2 months ago

Today, we are releasing Stable Video Diffusion, our first foundation model for generative AI video based on the image model, Stable Diffusion. As part of this research preview, the code, weights, and research paper are now available. Additionally, today you can sign up for our waitlist to access a new upcoming web experience featuring a Text-To-Video interface. To access the model & sign up for our waitlist, visit our website here:

Stability AI

1,024,438 views • 2 years ago

📢📢📢 𝐓𝐫𝐢𝐚𝐧𝐠𝐥𝐞 𝐒𝐩𝐥𝐚𝐭𝐭𝐢𝐧𝐠+: 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐭𝐢𝐚𝐛𝐥𝐞 𝐑𝐞𝐧𝐝𝐞𝐫𝐢𝐧𝐠 𝐰𝐢𝐭𝐡 𝐎𝐩𝐚𝐪𝐮𝐞 𝐓𝐫𝐢𝐚𝐧𝐠𝐥𝐞𝐬. – Project: – ArXiv: – ⚠️Code released on October 8th

Andrea Tagliasacchi @CVPR

52,773 views • 8 months ago

Someone just built a Claude skill that turns any arxiv paper into working code. It's called paper2code. Drop an arxiv URL and you get a full implementation back with every line cited to the exact paper section and equation. 100% Open Source.

Oliver Prompts

31,073 views • 2 months ago

📦 Fully open-source under Apache 2.0 — the FULL stack: ✅ Base model (256p) ✅ Distilled model ✅ Super-resolution model (540p & 1080p) ✅ Inference code 🤗 Models: 🎮 Demo: 💻 Code: 📄 Paper: More videos below! Go build something amazing 🚀

Pengfei Liu

11,997 views • 3 months ago

📢New paper We are announcing ReVISE, the first universal audio-visual speech enhancement model powered by SSL. paper: demo: w/ Yossi Adi @TalRemez BowenShi Jacob Donley

Wei-Ning Hsu

46,382 views • 3 years ago

🏆 Thrilled to announce we reached 1st position in the Algonauts 2025 Competition with our 1B model of the brain watching movies! 📄Paper: 🧑‍💻Code: 💿Data: ⚔️Challenge: 👇Thread:

Stéphane d'Ascoli

58,310 views • 10 months ago

We trained a foundation model on 18 million heart ultrasound videos to predict structure instead of pixels. Introducing EchoJEPA, the first foundation-scale JEPA for medical video. Paper: Code: 🧵 1/n

Alif Munim (d/acc)

590,373 views • 4 months ago

Ever suspected a paper you’re reading is AI slop? You can now turn on AI detection mode on alphaXiv to visualize what is written by an AI and what is not. Now available for every research paper indexed on arXiv. Integrated with the latest detection model from 🚀

alphaXiv

97,491 views • 2 months ago

Introducing autoresearch for arXiv papers Change 'arxiv' to 'autoarxiv' in any paper URL An agent deploys to resolve setup issues on the codebase, run a minimal reproduction, and estimate full replication cost. Read more below

alphaXiv

477,228 views • 10 days ago

We release a new urban simulator, MetaUrban, to support research on AI agents for micromobility. The work will be presented at #ICLR2025, and the demo code can run on any laptop. Webpage: Code: Paper:

Bolei Zhou

30,334 views • 1 year ago

I made a Claude Code skill that turns any arxiv paper into working code. Every line traces back to the paper section it came from & any implementation detail the paper skips will be flagged, and not assumed. open sourcing it -

pdawg

206,286 views • 2 months ago

🚀 We open-sourced LongLive — interactive, real-time long-video generation. 👥Generates video in real time as users enter text prompts. ⚡️20.7 FPS on a single H100,⏱️up to 240s per clip. 🎬Fine-tunes SOTA short-video models (e.g., Wan) into long-video generators. 🌍One step closer to World Models. All code for training & inference, model weights, demo page, and videos released! Paper: Code: Model: Demo Page: Introduction Video:

Yukang Chen

11,752 views • 9 months ago

Stable Diffusion generates beautiful images, but can it be used for open-world recognition? Try Demo! Our #CVPR2023 paper shows that the pre-trained diffusion model indeed is a good image parser, allows for open-vocabulary segmentation and detection.

Xiaolong Wang

241,225 views • 3 years ago

our speakers are booked and busy this weekend: 💿 K A C E Y 💿 beabadoobee feat. The Marías 💿 #MilesMinnick x #LulDreDay 💿 #KimGordon 💿 Jack Harlow & more on RELEASED →

YouTube Music

35,304 views • 3 months ago

🚨Thrilled to present VisualAgentBench (VAB) with Yu Gu and Tianjie, where we enable both TRAINING & TESTING of visual foundation agents across 5 different environments! In all 17 large multimodal models (LMMs) are tested. Find our paper, data, and more insights below 👇 Paper: Code & Data: Thanks AK for sharing on today’s arxiv on HF!

Xiao Liu (Shaw)

22,594 views • 1 year ago

MolmoAct 2 artifacts have been downloaded 400K+ times in under 1 month. Today we're opening up the full code & training data: everything you need to fine-tune or build on our fully open robotics foundation model. 🧵

Ai2

17,719 views • 1 month ago

Woow Nvidia has just released a 2.6B open-source world model 🔥 You can turn a single image, text prompt and trajectory into controllable worlds... And on a single GPU! - Code available on GitHub - Paper as well on arxiv You can use it for many things like embodied AI and robotics research, simulations, etc. Because it can run on a single GPU (like an RTX 5090 or H100) it makes world models accessible to basically everyone!

Paul Couvert

173,241 views • 1 month ago