"MeshFlow: Efficient Artistic Mesh Generation via MeshVAE and Flow-based Diffusion Transformer" TL;DR: learns a continuous mesh latent space and generates vertices and connectivity in parallel with flow matching, producing quality 3D meshes up to 18× faster than autoregressive.

Alexandre Morgand

2,132 subscribers

42,638 views • 10 days ago • via X (Twitter)

Related videos

MotionStreamer announced on Hugging Face: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

AK

43,387 views • 1 year ago

Oh wow! TRELLIS 2 just dropped and it's a huge leap for image → 3D mesh generation. 📦 Compact structured latents = editable, efficient meshes 🔍 Sharper topology + better materials 💨 Faster inference with higher quality Links 👇

Will Eastcott

21,235 views • 6 months ago

Scaling up GANs for Text-to-Image Synthesis: we present our 1B-parameter GigaGAN, achieving lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates 512px outputs at 0.13s, orders of magnitude faster than diffusion and autoregressive models, and inherits the disentangled, continuous, and controllable latent space of GANs. abs: project page:

AK

278,115 views • 3 years ago

3D Gaussian Splats are cool, but they're static (Part 25). DG-Mesh can reconstruct high-quality and time-consistent 3D meshes from a single video and track mesh vertices over time, which enables texture editing on dynamic objects.

Dreaming Tulpa 🥓👑

79,033 views • 2 years ago

ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion

AK

12,327 views • 4 months ago

MinerU-Diffusion: a 2.5B diffusion-based OCR model that replaces slow autoregressive decoding with parallel block-wise diffusion, achieving up to 3.2x faster inference while improving robustness on complex documents with tables, formulas, and layouts.

DailyPapers

15,304 views • 2 months ago

"Fast 4D Mesh Generation by Spatio-Temporal Attention Chains" TL;DR: training-free spatio-temporal attention chains generate topology-consistent 4D meshes from video 13× faster while improving temporal correspondence quality

Alexandre Morgand

13,643 views • 27 days ago

"An hour of planning can save you 10 hours of doing." ✨📝 Planned Diffusion 📝 ✨ makes a plan before parallel dLLM generation. Planned Diffusion runs 1.2-1.8× faster than autoregressive and an order of magnitude faster than diffusion, while staying within 0.9–5% AR quality.

Daniel Israel

38,699 views • 8 months ago

Instant Skinned Gaussian Avatars for Web, Mobile and VR Applications. Short summary: In our system, we animate a background 3D mesh and have the Gaussian splats follow the mesh's vertices. During preprocessing, splats are assigned to mesh vertices, and their relative transformations are stored. Once this data is saved, you can instantly use it in your applications without further preprocessing. At runtime, we animate the background 3D mesh, update the Gaussian splats in parallel, and re-sort all Gaussian splats every frame based on the viewer's perspective.

MrNeRF

66,016 views • 8 months ago

[NeurIPS '24] DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation. Abstract (excerpt): We introduce DreamMesh4D, a novel framework that combines mesh representation with sparse-controlled deformation technique to generate high-quality 4D object from a monocular video. To overcome the limitation of classical texture representation, we bind Gaussian splats to the surface of the triangular mesh for differentiable optimization of both the texture and mesh vertices. In particular, DreamMesh4D begins with a coarse mesh provided by a single image based 3D generation method. Sparse points are then uniformly sampled across the surface of the mesh, and are used to build a deformation graph to drive the motion of the 3D object for the sake of computational efficiency and providing additional constraint. For each step, transformations of sparse control points are predicted using a deformation network, and the mesh vertices as well as the bound surface Gaussians are deformed via a geometric skinning algorithm. The skinning algorithm is a hybrid approach combining LBS (linear blending skinning) and DQS (dual-quaternion skinning), mitigating drawbacks associated with both approaches. The static surface Gaussians and mesh vertices as well as the dynamic deformation network are learned via reference view photometric loss, score distillation loss as well as other regularization losses in a two-stage manner. Extensive experiments demonstrate that our method outperforms prior video-to-4D generation methods in terms of rendering quality and spatial-temporal consistency.

MrNeRF

12,323 views • 1 year ago

3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion. demo: model: 3DTopia-XL scales high-quality 3D asset generation using Diffusion Transformer (DiT) built upon an expressive and efficient 3D representation, PrimX. The denoising process takes 5 seconds to generate a 3D PBR asset from text/image input which is ready for the graphics pipeline to use.

AK

87,086 views • 1 year ago

We introduce W.A.L.T, a diffusion model for photorealistic video generation. Our model is a transformer trained on image and video generation in a shared latent space. 🧵👇

Agrim Gupta

431,123 views • 2 years ago

🚀🚀🚀Hunyuan 3D Studio just leveled up to 1.1! We've integrated the art-grade 3D generative model, Hunyuan 3D-PolyGen 1.5, to deliver the industry's most advanced mesh quality directly to your workflow. 🎨 🖌️ Art-Grade Quad Mesh: We've pioneered an end-to-end native quad mesh generation method. Unlike older methods that generated only tri-meshes, PolyGen 1.5 directly learns quad topology to produce cleaner, continuous edge loops and superior wireframe quality. 🎮 Professional Viability: Achieving this topology standard makes models instantly production-ready for game artists, 3D designers, and developers across game development, animation, and VR content creation. ⚙️ Flexible Output: PolyGen 1.5 supports both Quad and Triangular Topology, ensuring viability for both soft-surface and hard-surface models in professional pipelines. PolyGen 1.5 sets a new SOTA in stability, detail, and wireframe quality. Explore Hunyuan 3D Studio 1.1 and see the results:

Tencent Hy

146,057 views • 6 months ago

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors. paper page: We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes generation from a single unposed image in the wild using both 2D and 3D priors. In the first stage, we optimize a neural radiance field to produce a coarse geometry. In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture. In both stages, the 3D content is learned through reference view supervision and novel views guided by a combination of 2D and 3D diffusion priors. We introduce a single trade-off parameter between the 2D and 3D priors to control exploration (more imaginative) and exploitation (more precise) of the generated geometry. Additionally, we employ textual inversion and monocular depth regularization to encourage consistent appearances across views and to prevent degenerate solutions, respectively. Magic123 demonstrates a significant improvement over previous image-to-3D techniques, as validated through extensive experiments on synthetic benchmarks and diverse real-world images.

AK

305,643 views • 3 years ago

📢MeshArt: Generating Articulated Meshes with Structure-guided Transformers Daoyi Gao generates articulated meshes with a hierarchical transformer, modeling articulation-aware structures that guide mesh synthesis. w/ Yawar Siddiqui Lei Li Project:

Angela Dai

18,840 views • 1 year ago

DeepMesh is out on Hugging Face: Auto-Regressive Artist-mesh Creation with Reinforcement Learning. Conditioned on point clouds and images, DeepMesh generates meshes with intricate details and precise topology, outperforming state-of-the-art methods in both precision and quality.

AK

109,174 views • 1 year ago

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement

AK

28,118 views • 1 year ago

We are thrilled to release the next leap in art-grade 3D generative models. Our single-click model pipeline gives unprecedented mesh outputs, with mesh parts-based topology. It is available now for all Cube tiers to start for free. ☑️ Our multi-stage hierarchical AI models produce a fully assembled 3D mesh with adaptive poly-counts, providing the clean, separated topology you need. ✅ A parts-based approach enables high-resolution meshes. 🌐 Quad and Triangle mesh support. Access now:

Common Sense Machines

420,323 views • 11 months ago

Nvidia presents LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

AK

113,506 views • 1 year ago