This is probably the most complex workflow I’ve ever built, using only open-source tools. It took me four days. It takes four inputs, including author, title, and style, and generates a full animated visual story in one click in ComfyUI. There are still some bugs, but here’s the first preview.

Here’s a quick breakdown:
- The four inputs are sent to LLMs with precise instructions to generate three things: prompts for images and image modifications, prompts for animations, and prompts for music.
- All voices are generated from the text and timed precisely, because they determine the length of each animation segment.
- The first image and video are generated to serve as the title, and also as the guide for all other images created for the video.
- Titles and subtitles are also added automatically in Comfy.
- I also developed a lot of custom nodes for minor frame calculations, mostly to match audio and video.
- The full system is a large loop that, for each line of text, generates an image and then a video from that image. The loop was the hardest part to build; it lets the workflow process either a 20-second video or a 2-minute video with the same input.
- Multiple combinations of LLMs try to understand the text as well as possible, in order to provide the best prompts for images and video.
- The final video is assembled entirely within ComfyUI.
- The music is generated based on the LLM output and matches the exact timing of the full animation.
- Done!

For reference, this workflow uses a lot of models and only works on an RTX 6000 Pro with plenty of RAM.

My goal is not to replace humans. As I’ll try to explain later, this workflow is highly controlled and can be adapted or reworked at any point by real artists! My aim was to create a tool that can animate text in one go, allowing the AI some freedom while keeping a strict flow.

I don’t know yet how I’ll share this workflow with people. I still need to polish it properly, but maybe through Patreon. Anyway, I hope you enjoy my research, and let’s always keep pushing further! :)
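To make the audio-driven timing concrete, here is a minimal Python sketch of the kind of frame calculation the post describes, where each narrated line's duration decides how many frames its animation segment gets. This is not the author's custom node code: the `Segment` class, the `plan_segments` function, and the 24 fps default are all assumptions made for illustration. Rounding is done on the cumulative timeline, so small per-segment rounding errors should not add up over a long video.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str          # the narrated line (hypothetical field names)
    start_frame: int   # first frame of this segment, inclusive
    frame_count: int   # number of frames to generate for this segment


def plan_segments(lines, durations_s, fps=24):
    """Turn per-line narration durations (in seconds) into frame ranges.

    Segment boundaries are rounded on the cumulative timeline rather than
    per segment, so the total frame count stays aligned with the total
    audio length.
    """
    if len(lines) != len(durations_s):
        raise ValueError("need exactly one duration per line")
    segments = []
    elapsed_s = 0.0
    start = 0
    for text, duration in zip(lines, durations_s):
        elapsed_s += duration
        end = round(elapsed_s * fps)  # boundary on the shared timeline
        segments.append(Segment(text, start, end - start))
        start = end
    return segments


if __name__ == "__main__":
    # Illustrative durations only; real values would come from the TTS output.
    for seg in plan_segments(["line one", "line two", "line three"],
                             [1.02, 1.02, 1.02]):
        print(seg)
```

With three 1.02-second lines at 24 fps, rounding each segment separately would give 24 frames each (72 in total), while the cumulative approach yields 73 frames, which matches the rounded length of the full audio track.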

Lovis Odin

9,093 subscribers

58,769 views • 10 months ago • via X (Twitter)

Science & Technology, Education, Art

Similar Videos

I'm playing around with generative AI tools and stitching them together into visual stories. Here I took the first few sentences of Pride and Prejudice and made them into a video. The gen stack used for this one: - Anthropic Claude took the first chapter and generated the scenes and the individual prompts for the image generator. - Ideogram took the prompts and generated the images - Luma took the images and animated them - for narration - VEED | AI Video Creation to stitch it together (Many of these choices are just what I happened to use for this one while exploring a bunch of things). Anyway, honestly it was pretty messy, there is a ton of copy-pasting between all of the tools, and even this little video with 3 scenes took me about an hour. There is a huge storytelling opportunity here for whoever can make this convenient. Who is building the first 100% AI-native movie maker?

Andrej Karpathy

608,746 views • 2 years ago

Tested this concepting workflow. ▶️ Converting a level design image to a 3D asset gallery image ▶️ Re-generating segmented subjects into separate high-quality images ▶️ Using these images to generate the 3D meshes. First time using 'for loops' in ComfyUI; it works great, and the prompts are updated for every segment. I used the Nano Banana 2 API node in ComfyUI to generate the final high-quality images. Those SAM3 nodes are crazy btw, just set some points and it creates a mask! For the 3D model generation I used the new Tripo Smart Mesh tool, which is fast! The output is not perfect yet. I think in a lot of cases it's not production-ready right now, but that depends on the input image and the goal as well. For the pre-production / concepting phase it's perfect, and for far-distance meshes it’s fine. I’m sure these tools will keep improving; curious what we can work with in 3-6 months. Minimal cleanup and scaling in Blender, and tested the meshes in UEFN / Unreal Engine. After creating the ComfyUI workflow it took less than an hour to do everything.

Jerome | InsaneUnreal

17,267 views • 4 months ago

Seedance 2.0 is allowing us to enter a new era of music video creation. Here is how I created HONEY. It was a quick test to see how well this workflow holds up. 🐝 1 - Write your song and generate the music with Suno 5.5. 2 - Use an image generator of your choice. For HONEY I combined Grok Imagine for aesthetics and Nano Banana Pro for refined editing. 3 - In CapCut I import my audio and just save out a blank video containing the audio. This step is important because this video file containing audio will now be used with Seedance 2.0 as a video reference with Omni. This allows the AI to apply automatic and realistic lipsync and movement to the music, it's extremely powerful! 4 - Once I have both my image and my video with audio as reference, I use Seedance 2.0 Omni and upload my starting image and then the video reference with the audio. 5 - From here I'm simply prompting like normal, specifying what's happening in my scene with detailed instructions, mentioning multi shots and camera angle changes, and then specifying that the person is singing along to the song. I type out the lyrics that are present to get better lipsync accuracy. 6 - Once I have generated a video and like the result, I do video to video, so I upload the video that just got generated, type "The scene continues", and prompt new actions to take place. This allows you to expand on a narrative. These new shots can be used as B-ROLL, and since I uploaded my video as reference I have full consistency of everything it saw in the video. This is also extremely powerful. 7 - This is actually the most difficult part. Edit in CapCut. This is where you need to understand pacing and shot selection from all the scenes you generated to bring it all together. You must be strategic with the editing. Good luck! I'll probably record a video tutorial at some point as it's easier to see what is being done.

Travis Davids

19,081 views • 3 months ago

✨ I made my first video game with ChatGPT: 1) ChatGPT generates a text-based adventure game with DALL-E 3 generating images for it 2) Every time you play, the game is different because it generates the story and images live 3) The images from DALL-E are sent to Runway, which turns images into video 4) The text is sent to ElevenLabs, which turns the text adventure into a pirate narrator voice 5) It's merged into a video 6) Interactive buttons are overlaid The game is called: 🐒🏝️🇳🇱The Secret of Monkey Island: Amsterdam (unofficial) And you can play it here: (video + TTS + buttons don't work automatically yet, for now it's manual, but text + img works; I'm building an interface for it now)

@levelsio

2,724,956 views • 2 years ago

Here is a step-by-step introduction to building a workflow with a custom AI agent that uses MCP. I explain every component in the video: 1. Building the MCP server 2. Building the agent and an MCP client 3. Building a workflow that uses the agent The goal is simple: Generate a dialogue between two people and make one yell and the other answer with sarcasm. Kestra is sponsoring this video. They are an open-source orchestration platform (repo link below), and I used them to build the workflow that connects every component.

Santiago

51,527 views • 1 year ago

This video was made almost entirely by AI. I used ChatGPT to write a script, Midjourney to create reference images, Runway Gen-1 to apply the style of the images to my source video, and Boomy AI for the music. Workflow breakdown w/ comparisons in thread. 🧵

Nick St. Pierre

2,279,571 views • 3 years ago

"This is how GPT-4 sees and hears itself" I used GPT-4 to describe itself. Then I used its description to generate an image, a video based on this image, and a soundtrack. Tools I used: GPT-4, Midjourney, Kaiber AI, Mubert, RunwayML. I used GPT-4's description of itself as the prompt for text-to-image, image-to-video, and text-to-music. I put the video and sound together in RunwayML.

Kris Kashtanova

1,233,452 views • 3 years ago

I'm a bit confused... Google's Veo 2 is the best video model in text-to-video. But on the other hand... The newly released image-to-video for Veo 2 (on Freepik (now Magnific) and @FAL) feels underwhelming. Input images generated with Runway Frames. Here it is compared to Luma

Alex Patrascu

85,445 views • 1 year ago

Most AI video tools still feel like traditional editors with a few AI features added in. You still end up adjusting timelines, fixing frames, and spending time editing. But Hailuo AI felt different to me. I gave it a simple prompt and one image, and it generated a full video on its own. No complicated workflow, no manual editing, just idea to video in minutes. The text-to-video results are surprisingly good, image-to-video transitions look smooth, and the overall output feels cinematic. What I liked most is the speed. You can quickly test ideas without getting stuck in editing. Try it here: #Hailuo

Markandey Sharma

32,326 views • 2 months ago

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike. More details and examples of what Movie Gen can do ➡️ 🛠️ Movie Gen models and capabilities Movie Gen Video: A 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt. Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment. Precise video editing: Using a generated or existing video and accompanying text instructions as an input, it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes. Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video. We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

AI at Meta

2,264,864 views • 1 year ago

Our very own Alexander Chen created "Hello, world", a short film inspired by the technology of his childhood, with Google Labs Flow. Here's a closer look at how he brought his vision to life: 1) All of the clips were generated in Flow using text-to-image prompts, except for the close-up shots, where he used image-to-video prompts to ensure consistency and accuracy (the images used were generated by Imagen). 2) To evoke the exact feelings of his childhood home, he used Google Street View for visual references, then asked Google Gemini to write descriptions and prompts, then uploaded those into Flow, which is what produced such vivid and accurate renderings in the film. 3) Every sound used in the video is generated by Veo 3. However, to blend things together, he used separate video editing software to extend the head and tail of each clip over the preceding and following clips, occasionally crossfading the audio to help with transitions.

Google AI

94,790 views • 1 year ago

A student built a whole faceless passive business by creating AI backyards. I kept seeing these AI backyard builds hitting 50M views and thought it took months of editing, but I was wrong. The creators aren't starting with a dirt lot. They start with the perfect final image and make the AI work backward. And the secret is stupidly simple: generate the final picture first, and let the AI reverse-engineer it. Here is the exact 3-step system you can use to build this: - The Blueprint: Upload your finished backyard to the "Restoration Timelapse" GPT. It reverse-engineers the final image into text prompts for the empty lot. - The Setup: Paste those prompts into Dzine. This generates your "before" images with perfectly matched geometry. - The Animation: Upload your empty lot and finished yard into Kling 3.0 to animate the build. Once Kling spits out the video file, drop it into CapCut, keep the raw construction audio, and export. I broke down the complete, step-by-step architecture with GPT + GROK + CAPCUT in my full guide below 👇

Spivach

17,421 views • 1 month ago

Hedra’s Character-3 is a game-changer for AI storytelling! 🔥✨ Upload an image, add speech and movement prompts, or even drop in an audio file, and it generates the full video for you. The process is super easy: Image: Freepik Audio: ElevenLabs (you can record yourself) Video: Hedra Video prompt: female reporter excitedly reporting the news, holds a microphone, wearing a professional suit, delivering a live broadcast, natural movements You can also use Hedra to generate audio and images! All in one 🚀🔥 What do you think of the result!?

Amira Zairi

76,116 views • 1 year ago

Using Nano Banana Pro + Sora, I can now create a short video in just about one minute. I’ve been using Kling for a while to make short videos, and while the results can be great, the workflow has always felt a bit heavy for me. Previously, I had to export the video frame by frame as images, then send those keyframes into Kling’s first/last frame mode to generate the in-betweens. It worked, but it took a lot of time and small steps. Recently I realized I don’t necessarily need to do that anymore: now I can use Nano Banana Pro to generate the images, then send them directly to Sora to create the short video—no need to manually extract a specific frame. Another bonus is that Sora also handles basic sound design for you, which saves me from an extra round of editing and makes the whole process feel much lighter. This idea actually came from a post I saw this morning on Twitter by SD | AI Animation Storyteller, which gave me a new way of thinking about my workflow. I’m still experimenting, but so far it’s been a big improvement in efficiency for my use case. I’ve put the detailed steps I’m using in the comments in case anyone wants to try something similar.

underwood

46,400 views • 7 months ago

It took me less than 10 minutes to create this scene. The longest part was finding the style for my initial image. I used Grok Imagine's extension video to ensure perfect continuity of style, voice, and music. Then I animated it, prompting in my native language without asking for any dialogue. Grok understood everything I asked and created a dialogue between the children in perfect French. I later posted the same children with a prompt in English, and this time, the dialogue was in English. I used Grok Imagine to create the image, animate it, and then extend the video to 30 seconds. I then started again with an image that I modified through editing and used it to tell another story.

Déborah

500,949 views • 4 months ago

I asked Garry Tan how to use meta prompting to get better at AI: "My partners at YC Jared Friedman and Pete Koomen showed me how to do this. You can take almost anything that you do all the time and just drop it into a context window. And then say, “Here’s a bunch of inputs and outputs." And maybe you also add a bunch of notes. And then you tell it, “Write me a prompt that can act as an agent that takes this input and makes this output over here.” You can do this for almost any type of knowledge work. And you can even introspect. "What are things you notice that I did to convert this from the input to the output?”. And then you can just start using the prompt. Initially, it’s going to suck. Because it’s just not that smart yet. But what’s funny is now, I also use it to Iterate my writing. You can be very direct, "I would never say that", "Don’t say it like this", or "Oh, you used the long word there, use the short word". Just speak to it conversationally. And then when you're happy with the output, you can use that new output to make a new prompt. "Based on this conversation, give me a better initial prompt that incorporates all the things we talked about." And you can do this with literally everything. And in theory, there’s so much it applies to that people do day-to-day. You could use it for tweets. You could use it for editing podcasts. You can use it for pretty much everything. I have a folder of prompts that I use all the time. My YouTube prompt is on v27 or something. I'll go through this process with all the different max models. I'll use GPT 5.2 Pro. I’ll use Grok. I'll use Claude. Then, I’ll take all the outputs from all the models and put them into Claude and say "Here’s my prompt, here’s the output from four LLMs, including yourself. Rate each response and tell me what the pros and cons of each approach are." And I usually say "give it to me in numbered form". And then you can agree with one, disagree with two, tell it three is this or that. 
And then after that, you say, 'Given all of this, synthesize it.'"

The Peel

51,632 views • 4 months ago

THIS IS CRAZY. I connected ElevenLabs to Lovable and built an AI storytelling app in UNDER 20 minutes. Enter a kid's name, pick a theme. The app writes a personalized bedtime story, narrates it with ElevenLabs, and plays ambient sounds that match the story world.
→ Text to Speech narration with emotion
→ Sound effects generated on the fly
→ Download the full story as audio
Comment "VOICE" and I'll send you the full workflow + prompts. 👇

Prajwal Tomar

40,502 views • 2 months ago

Here's my first series of Anime created by Generative AI. Vidu 2.0 is a groundbreaking advancement for storytelling in the Anime style. #vidu @Viduforhuman

Hi everyone 😊 😉 Here's my first AI-created Anime series. The name of my anime is Kuro & Yuki. I won't spoil the story for you, but it begins with these two boys locked up in what appears to be a highly secure institute/prison. Yuki has reached the age where he has awakened his power. While wandering the corridors of the institute, he meets Kuro, a boy with autism. It's a fleeting encounter as they end up being separated. As the story unfolds, you'll understand who they really are and why they are locked up.

I don't plan on making episode 2 yet, even though I've grown attached to my characters' story. I'll continue the story only if a lot of people are interested and want to know what happens next. Otherwise, I'll explore other universes and styles of Anime 😉

Tools:
- Epidemic Sound (for sound effects)
- Vidu AI (to transform frames into animation)
- ElevenLabs (for certain expressions)

When I say Vidu 2.0 is a game-changer, I mean it. I had already made my animation with the previous version of Vidu, but when they gave me access to the beta, I urgently changed my plans and recreated all the animations. Let me tell you, it's a whole new level. With my first animation, it was ultra frustrating to bring my ideas to life! Really! And since Vidu couldn't handle many image styles, I clearly couldn't bring my ideas to life. But with Vidu 2.0, I enjoy it much more! For the first time, I really get to bring my ideas to life! Until now, I always had to make compromises. Of course, Vidu is still not perfect, and there are still many obstacles, but it is a truly magnificent advancement! (And it surpasses all the Anime-style AIs I've tested recently.)

Moreover, Vidu 2.0 is fantastic for special effects, but also for embedding elements into the video (like a hand that appears and interacts with the character, or another character; and the best part is that they perfectly match the style of the original images!)

* The animations/images and voices are AI-generated
* The sound design is traditional and done by me
* The story was entirely written by me (I don't use AI to create my stories, it's important to me to write them myself)

The voices were generated thanks to Nijivoice (for secondary characters) and Hailuo Audio (for main characters). I thank Yachimat (yachimat - AI Short Anime) for introducing me to Nijivoice, it's very generous of him! During the beta, we didn't consume credits; otherwise, this animation would have cost me around 20,000 credits, ha ha.

Vidu 2.0, with its superb stability and fidelity to the style of images, offers brand new horizons for storytelling! One problem with Vidu 2.0 is that for now, you have to manually extend videos (by exporting the last frame). As you can see, this allowed me to create long scenes with different actions, and everything fits together perfectly! There are still many obstacles to storytelling, such as character consistency (I use the image to video function; I produce my images with Nijijourney), and it's always laborious to have a consistent character! The same goes for backgrounds.

Here are the points I've identified that would facilitate storytelling. For Vidu:
- Function to extend videos (you can already do this manually by exporting the last frame)
- Function to invent the beginning or end of a video using a single frame (you can already do this manually by putting a completely white or completely black image as the first or last frame; this technique also allows you to have very dynamic results with Vidu 2.0!)
- It's difficult to obtain facial expressions for certain image styles (the same image style that the previous version couldn't handle at all).
- More dynamism

There's always room for improvement, but Vidu 2.0 has truly opened up exciting new possibilities for creative storytelling.

ai aiart aianimation aianime ainews anime animenews aitools aitool

Naegiko

66,107 views • 1 year ago

The biggest mistake people make with Seedance 2.0 is writing prompts at all. Sounds strange, but the model wasn't built for describing things in words - it was built for multimodal direction: up to 12 references at once, combining images, video, and audio. Each reference type controls a different layer: the image sets the style, the video defines the camera movement, the audio sets the rhythm of the scene. When you combine all three instead of typing "camera slowly pushes in, tense atmosphere" - the model understands it directly, with no interpretation and no guessing involved. Text is the weakest control tool available here. And most users are stuck using exactly that.

Zentrix⌚️

72,877 views • 1 day ago