💫 It's fascinating that a single feed-forward pass through an LLM can replace a complex rendering pipeline, like Blender! Just feed it 3D shapes, xyz positions, and poses as tokens, and it spits out the image token-by-token. The dual, aka scene reconstruction, is also possible! 👇
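As a rough illustration of "3D shapes, xyz positions, and poses as tokens", here is a minimal sketch of flattening a scene into one token sequence that a language model could continue with image tokens. Everything in it is an assumption made for illustration: the `SceneObject` fields, the `<pos_*>`/`<rot_*>` token names, the 64-bin quantization, and the coordinate ranges are hypothetical and are not taken from the post or the authors' model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SceneObject:
    # Hypothetical container; field names are illustrative only.
    shape_tokens: List[int]               # discrete codes standing in for a tokenized 3D shape
    position: Tuple[float, float, float]  # xyz location, assumed to lie in [-1, 1]
    pose: Tuple[float, float, float]      # orientation as three angles in degrees, in [0, 360]


def quantize(value: float, low: float, high: float, bins: int) -> int:
    """Map a continuous value in [low, high] to an integer bin index in [0, bins - 1]."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * bins)
    return min(idx, bins - 1)  # the upper edge falls into the last bin


def serialize_scene(objects: List[SceneObject], bins: int = 64) -> List[str]:
    """Flatten a scene into a token list that ends with an image-start marker."""
    tokens = ["<scene>"]
    for obj in objects:
        tokens.append("<obj>")
        tokens += [f"<shape_{t}>" for t in obj.shape_tokens]
        tokens += [f"<pos_{quantize(v, -1.0, 1.0, bins)}>" for v in obj.position]
        tokens += [f"<rot_{quantize(v, 0.0, 360.0, bins)}>" for v in obj.pose]
        tokens.append("</obj>")
    # An autoregressive model would be asked to continue from here with image tokens.
    tokens += ["</scene>", "<image>"]
    return tokens


if __name__ == "__main__":
    chair = SceneObject(shape_tokens=[5, 17], position=(0.0, -1.0, 1.0), pose=(0.0, 90.0, 360.0))
    print(serialize_scene([chair]))
```

In this toy layout, the reconstruction direction would simply swap the order: image tokens first, then the model emits the `<obj>` blocks.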

Georgia Gkioxari

12,094 subscribers

44,553 views • 1 year ago • via X (Twitter)

Science & Technology

Comments: 7

Georgia Gkioxari • 1 year ago

The dual: Image goes in as tokens, and 3D shapes, xyz positions and poses come out token-by-token.

Georgia Gkioxari • 1 year ago

Read more here:

Mike Roberts • 1 year ago

This is exciting!! Congrats Georgia and team 🥳 Silly naive question: How should I think about this work in relation to that recent RenderFormer paper from a few weeks ago?

Georgia Gkioxari • 1 year ago

RenderFormer is awesome! Very, very similar in spirit -- use a neural net to predict the rendered image of an input 3D asset. Many differences in algorithm and in scope:

* Our model is fully autoregressive and performs next-token prediction. This token can be an image token, shape token, or text token. RenderFormer is vision-transformer style.
* We can do 3D-to-image (rendering), image-to-3D (reconstruction, recognition), image + 3D-to-image + 3D (instruction-following), all with the same framework enabled by the unified token-wise model. It's why we love tokens!
* For the rendering task, we emphasize compositionality (scenes composed of many objects) with control over the locations, poses and object types/shapes -- all specified in the input.

Our model in the end is dirt simple, just an LLM, but we found some things to be very critical: (1) how to best encode numbers to specific 3D locations and poses, (2) how to discretize/tokenize 3D shapes, which are inherently continuous, and (3) how to fuse the modalities.
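For readers wondering how one next-token predictor can emit "an image token, shape token, or text token", a minimal sketch of one common way to do it is a single shared vocabulary in which each modality owns a disjoint id range. This is an assumption for illustration, not a description of how the authors fuse modalities; the vocabulary sizes and the helper names `to_global` and `from_global` are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, Tuple

# Hypothetical per-modality vocabulary sizes; none of these numbers come from the thread.
VOCAB_SIZES: Dict[str, int] = {"text": 1000, "shape": 512, "image": 2048}

# Assign each modality a contiguous, non-overlapping range of ids in one shared vocabulary.
OFFSETS: Dict[str, int] = {}
_next_start = 0
for _name, _size in VOCAB_SIZES.items():
    OFFSETS[_name] = _next_start
    _next_start += _size
TOTAL_VOCAB = _next_start  # size of the model's single output softmax in this sketch


def to_global(modality: str, local_id: int) -> int:
    """Convert a modality-local token id into an id in the shared vocabulary."""
    if not 0 <= local_id < VOCAB_SIZES[modality]:
        raise ValueError(f"{local_id} is out of range for modality {modality!r}")
    return OFFSETS[modality] + local_id


def from_global(global_id: int) -> Tuple[str, int]:
    """Recover (modality, local id) from a shared-vocabulary id."""
    if not 0 <= global_id < TOTAL_VOCAB:
        raise ValueError(f"{global_id} is outside the shared vocabulary")
    names = list(OFFSETS)
    starts = [OFFSETS[n] for n in names]
    i = bisect_right(starts, global_id) - 1  # last range whose start is <= global_id
    return names[i], global_id - starts[i]


if __name__ == "__main__":
    gid = to_global("shape", 5)
    print(gid, from_global(gid))
```

With a layout like this, the model's output at each step is just an integer, and the range it falls in tells the decoder whether to route it to the text detokenizer, the shape decoder, or the image decoder.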

Bisheshwor Neupane • 1 year ago

Is it faster?

Krish Mehta • 1 year ago

Wow, feels like it could be applied to better part segmentation too?

honasu-san • 1 year ago

According to Apple, it must be an illusion.
