Introducing EAGLE, a new method for fast LLM decoding based on compression:
- 3x 🚀 than vanilla
- 2x 🚀 than Lookahead (on its benchmark)
- 1.6x 🚀 than Medusa (on its benchmark)
- provably maintains text distribution
- trainable (in 1~2 days) and testable on RTX 3090s
Playground: Blog: Code:
⚒️ First Principle: ... Compression! Yi Ma
We find that the sequence of second-top-layer features is compressible, so a small model can easily predict subsequent feature vectors from previous ones.
🙏 Acknowledge: This project is greatly inspired by the Medusa team (Tianle Cai @yli3521 Zhengyang Geng Hongwu Peng Tri Dao), the Lookahead team (Hao Zhang LMSYS Org), and others.
Joint work with Yuhui Li and Chao Zhang

Hongyang Zhang

2,795 subscribers

118,855 views • 2 years ago • via X (Twitter)
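
The EAGLE post above says the method provably maintains the text distribution but does not show how. The sketch below illustrates the standard speculative-sampling acceptance rule that draft-then-verify decoders commonly rely on for that guarantee. It is an assumption about the mechanism, not code from the EAGLE repository; the function names and the toy distributions `p` (target model) and `q` (draft model) are hypothetical.

```python
import random

def residual(p, q):
    """Normalized max(p - q, 0): the distribution to resample from on rejection."""
    diff = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    total = sum(diff)
    # If p == q nothing is ever rejected; fall back to p to avoid dividing by zero.
    return [d / total for d in diff] if total > 0 else list(p)

def verify_token(x, p, q, rng):
    """Accept draft token x with probability min(1, p[x] / q[x]); otherwise resample.

    Assumes x was drawn from q, so q[x] > 0.
    """
    if rng.random() < min(1.0, p[x] / q[x]):
        return x
    return rng.choices(range(len(p)), weights=residual(p, q))[0]

def output_distribution(p, q):
    """Exact distribution of the emitted token when the draft is sampled from q."""
    accept = [min(pi, qi) for pi, qi in zip(p, q)]  # q[x] * min(1, p[x] / q[x])
    reject_mass = 1.0 - sum(accept)
    res = residual(p, q)
    return [a + reject_mass * r for a, r in zip(accept, res)]

p = [0.6, 0.3, 0.1]  # toy target-model next-token distribution
q = [0.3, 0.5, 0.2]  # toy draft-model next-token distribution
out = output_distribution(p, q)  # should match p up to rounding
```

`output_distribution` computes the accept-or-resample mixture in closed form, so the claim can be checked without sampling: whatever the draft proposes, the emitted token follows the target distribution.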

Related videos

Jointly announcing EAGLE-3 with SGLang: Setting a new record in LLM inference acceleration!
- 5x 🚀 than vanilla (on HF)
- 1.4x 🚀 than EAGLE-2 (on HF)
- A record of ~400 TPS on Llama 3.1 8B with a single H100 (on SGLang)
- 1.65x 🚀 in latency even for large bs=64 (on SGLang)
- A new scaling law: more training data, better speedup
- Apache 2.0
Paper: Code: SGLang version:
⚒️ Takeaway: Introducing training-time test, a novel draft model training technique: we replace feature prediction with direct token prediction and shift from top-layer-only features to multi-layer feature fusion. This approach unlocks a new scaling law previously undiscovered in EAGLE and EAGLE-2.
🙏 Acknowledge: We would like to thank the SGLang team (zhyncs Lianmin Zheng Ying Sheng James Liu, Ke Bao, and others LMSYS Org) for their merge and careful evaluation of EAGLE-3 on SGLang.
🤝 Want to collaborate? We're a small academic group with limited GPU resources. If you're interested in supporting our next version of EAGLE or would like us to train a preliminary version tailored to a specific model, please get in touch!
Joint work with Yuhui Li, Fangyun Wei, and Chao Zhang

Hongyang Zhang

42,200 views • 1 year ago

Prediction markets are one of the leading Web3 niches in 2024! With about $4 Billion in trading volume, $192 Million in TVL and millions of users in 2024, prediction protocols have been showing tremendous growth and attracting user adoption in Web3. PolyMarket seems to be leading the pack, with over $175M in TVL and backing from Vitalik Buterin.
However, a majority of prediction markets, including PolyMarket, currently lack the flexibility and capital efficiency required for seamless transactions. They are also mostly focused on bringing in Web2 users, neglecting the need for markets that cater to short-term, high-risk investments from Web3 Degens. To tackle these flaws, there's a need for a revolutionary contender that understands the needs of Web3 Chads. That's where PredictHub comes in.
PredictHub is a prediction market that transforms real-world events into opportunities for everyone to participate and forecast. Launching on Arbitrum, PredictHub is already catching the attention of major players in the space by offering something PolyMarket and others don't: flexibility and incentives. By offering fast market updates and innovative prediction categories like ETF Forecasts, PredictHub is changing how we interact with prediction markets.
But here's where it gets more interesting: PredictHub offers users a unique point system, where you don't just make predictions, you also earn rewards. The more you predict, the more you earn. These rewards are 2-fold: Nova and Orbit Points. Nova Points are earned by traders based on their trading activity and their leaderboard ranking. Orbit Points, on the other hand, are earned by users who provide liquidity, based on their LP size and duration. Other point systems include PolyMarket User Points, Leaderboard Bonus and Market Multipliers. These rewards give users more of a competitive edge than other prediction markets.
Apart from these rewards, PredictHub features a unique 3-tier referral system that rewards you even more as you invite your friends. The more friends you bring, the greater the rewards. On top of this, PredictHub focuses on USDC and a wide range of yield-bearing assets like GLP, gUSDC, and sUSDe, enabling users to optimise their earnings while holding assets across networks. Exciting, right?
PredictHub is in its Testnet phase and you can start earning Points right away 🔅 Here's how to get started on PredictHub:
1. Go to
2. Request Faucet
3. Start making predictions and earning Points
Easy-Peasy ✅ More info is available from PredictHub.
All eyes are on PredictHub as the fix for the flaws of prediction market protocols. With its unique approach targeting untapped niches that most existing prediction markets have yet to explore, I believe the protocol has the potential to become a breakout success. I will be placing good Predictions to Position 🚀🚀🚀

InfoSpace OG

19,674 views • 1 year ago

We’re excited to introduce Text-to-LoRA: a Hypernetwork that generates task-specific LLM adapters (LoRAs) based on a text description of the task. Catch our presentation at #ICML2025! Paper: Code: Biological systems are capable of rapid adaptation, given limited sensory cues. For example, our human visual system can quickly adapt and tune its light sensitivity to our surroundings. While modern LLMs exhibit a wide variety of capabilities and knowledge, they remain rigid when adding task-specific capabilities. Traditionally, customizing these models requires gathering large datasets and performing often expensive, time-consuming fine-tuning for specific applications. To bypass these limitations, Text-to-LoRA (T2L) meta-learns a “hypernetwork” that takes in a text description of a desired task, as a prompt, and generates a task-specific LoRA that performs well on the task. In our experiments, we show that T2L can encode hundreds of existing LoRA adapters. While the compression is lossy, T2L maintains the performance of task-specifically tuned LoRA adapters. We also show that T2L can even generalize to unseen tasks given a natural language description of the tasks. Importantly, Text-to-LoRA is parameter-efficient. It generates LoRAs in a single, inexpensive step, based solely on a simple text description of the task. This approach is a step towards dramatically lowering the technical and computational barriers, allowing non-technical users to specialize foundation models using plain language, rather than needing deep technical expertise or large compute resources.

Sakana AI

403,159 views • 1 year ago
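
The Text-to-LoRA post above describes a hypernetwork that maps a task description to a task-specific LoRA. As a rough illustration of that data flow only, the Python sketch below uses a single random linear layer as the "hypernetwork" and a hand-written vector in place of a real text embedding. Every name, shape, and value here is hypothetical and is not taken from the T2L paper or code.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def make_hypernetwork(embed_dim, d_out, d_in, rank, rng):
    """Toy hypernetwork: one linear layer from task embedding to flattened LoRA factors."""
    n_params = rank * d_in + d_out * rank
    return [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(n_params)]

def generate_lora(hyper, task_embedding, d_out, d_in, rank):
    """Single forward pass: task embedding -> low-rank factors A (rank x d_in), B (d_out x rank)."""
    flat = [sum(w * e for w, e in zip(row, task_embedding)) for row in hyper]
    a_flat, b_flat = flat[:rank * d_in], flat[rank * d_in:]
    A = [a_flat[i * d_in:(i + 1) * d_in] for i in range(rank)]
    B = [b_flat[i * rank:(i + 1) * rank] for i in range(d_out)]
    return A, B

def apply_lora(W, A, B):
    """Adapted weight W + B A, the usual low-rank update form."""
    delta = matmul(B, A)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

rng = random.Random(0)
d_out, d_in, rank, embed_dim = 2, 3, 1, 4
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]       # frozen base weight (toy values)
hyper = make_hypernetwork(embed_dim, d_out, d_in, rank, rng)
task_embedding = [0.5, -1.0, 0.0, 2.0]       # stand-in for an encoded task description
A, B = generate_lora(hyper, task_embedding, d_out, d_in, rank)
W_adapted = apply_lora(W, A, B)
```

The point of the sketch is the cost profile the post highlights: producing an adapter is one cheap forward pass through the hypernetwork, with no per-task fine-tuning loop.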

We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency. Blog: Code: Like AlphaEvolve and its variants, our framework leverages LLMs to find state-of-the-art solutions to complex problems, but using orders of magnitude fewer resources! Many evolutionary AI systems are powerful but act like brute-force engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. We took inspiration from the efficiency of nature. ‘Shinka’ (進化) is Japanese for evolution, and we designed our system to be just as resourceful. On the classic circle packing optimization problem, ShinkaEvolve discovered a new state-of-the-art solution using only 150 samples. This is a big leap in efficiency compared to previous methods that required thousands of evaluations. We applied ShinkaEvolve to a diverse set of hard problems with real-world applications: 1/ AIME Math Reasoning: It evolved sophisticated agentic scaffolds that significantly outperform strong baselines, discovering an entire Pareto frontier of solutions trading performance for efficiency. 2/ Competitive Programming: On ALE-Bench (a benchmark for NP-Hard optimization problems), ShinkaEvolve took the best existing agent's solutions and improved them, turning a 5th place solution on one task into a 2nd place leaderboard rank in a competitive programming competition. 3/ LLM Training: We even turned ShinkaEvolve inward to improve LLMs themselves. It tackled the open challenge of designing load balancing losses for Mixture-of-Experts (MoE) models. It discovered a novel loss function that leads to better expert specialization and consistently improves model performance and perplexity. 
ShinkaEvolve achieves its remarkable sample-efficiency through three key innovations that work together: (1) an adaptive parent sampling strategy to balance exploration and exploitation, (2) novelty-based rejection filtering to avoid redundant work, and (3) a bandit-based LLM ensemble that dynamically picks the best model for the job. By making ShinkaEvolve open-source and highly sample-efficient, our goal is to democratize access to advanced, open-ended discovery tools. Our vision for ShinkaEvolve is to be an easy-to-use companion tool to help scientists and engineers with their daily work. We believe that building more efficient, nature-inspired systems is key to unlocking the future of AI-driven scientific research. We are excited to see what the community builds with it! Learn more in our technical report:

Sakana AI

359,537 views • 10 months ago
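
The ShinkaEvolve post above mentions a bandit-based LLM ensemble that dynamically picks the best model for the job. The post does not say which bandit rule is used, so the sketch below assumes plain UCB1 as one common choice. The model names and the fixed reward values (standing in for "how much a model's proposal improved fitness") are hypothetical.

```python
import math

def ucb1_select(counts, totals, t):
    """Pick the arm with the highest UCB1 score; try every untried arm first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        totals[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return scores.index(max(scores))

def run_bandit(reward_fns, rounds):
    """Repeatedly choose a model, observe its reward, and update the statistics."""
    counts = [0] * len(reward_fns)
    totals = [0.0] * len(reward_fns)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        totals[arm] += reward_fns[arm]()
        counts[arm] += 1
    return counts

# Hypothetical models with fixed average rewards; a real system would
# measure the fitness gain of each model's proposed program instead.
models = {"model_a": lambda: 0.2, "model_b": lambda: 0.8}
counts = run_bandit(list(models.values()), rounds=200)
```

The exploration bonus shrinks as an arm is pulled more often, so the weaker model still gets occasional queries while most of the budget goes to the stronger one.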

Transformer by hand ✍️ ~ 6 steps walkthrough below
Open the hood of a transformer and the parts list is overwhelming: embeddings, positional encoding, attention weighting, self-attention, cross-attention, multi-head attention, layer norm, skip connections, softmax, linear, Nx, shifted right, query, key, value, masking. Which of those actually make the car run? Two of them. Attention weighting and the feed-forward network. Everything else is an enhancement to make it run faster and longer, which is how we got from a car to a truck, and to the word "large" in large language model. So I drew and calculated those two parts entirely by hand.
Goal: push five features through one transformer block, filling in every cell yourself.
1. Given: Five positions of input features, arriving from the previous block.
2. Attention matrix: Let us feed all five features to a query-key module (QK) and read back an attention weight matrix, A. The details of that module are a post of their own.
3. Attention weighting: We multiply the input features by A to get the attention weighted features, Z. Still five positions. The effect is to combine features *across positions*, horizontally: X1 becomes X1 + X2, X2 becomes X2 + X3, and so on.
4. First layer: Let us feed all five weighted features into the first layer of the FFN. Multiply by the weights and biases. This time the combining happens *across feature dimensions*, vertically, and each feature grows from 3 numbers to 4. Note that every position goes through the same weight matrix. That is what "position-wise" means.
5. ReLU: We cross out the negatives. They become zeros.
6. Second layer: Let us bring it back down: 4 dimensions to 3. The output feeds the next block, which has a completely separate set of parameters, and the whole thing runs again.
You have just calculated a transformer block by hand. ✍️
The takeaway: the two parts are doing two different jobs, and neither one alone is enough. Attention mixes *across positions*, so a feature can see its neighbours. The FFN mixes *across feature dimensions*, so each position can think about itself. Horizontal, then vertical. Then that pattern repeats N times, each block with its own separate set of weights. That is the Nx from the list up top, and that is what makes the transformer run.
💾 Save this post! #AIbyHand #Transformers #DeepLearning

Tom Yeh

25,829 views • 16 days ago
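
The six steps in the "Transformer by hand" post above can be replayed in a few lines of plain Python. The sketch keeps the post's layout (one column per position, 3 feature rows growing to 4 and back to 3), but all numbers are made up for illustration: the attention matrix `A` is hard-coded to produce "X1 becomes X1 + X2" rather than computed by a query-key module, and the weights `W1`, `b1`, `W2`, `b2` are arbitrary.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transformer_block(X, A, W1, b1, W2, b2):
    # Step 3, attention weighting: mix across positions (columns), Z = X A.
    Z = matmul(X, A)
    # Step 4, first FFN layer: mix across feature dimensions (rows), 3 -> 4.
    H = [[h + b for h in row] for row, b in zip(matmul(W1, Z), b1)]
    # Step 5, ReLU: negatives become zeros.
    H = [[max(0, h) for h in row] for row in H]
    # Step 6, second FFN layer: back down, 4 -> 3.
    Y = [[y + b for y in row] for row, b in zip(matmul(W2, H), b2)]
    return Z, Y

# Step 1: five positions (columns), three features each (rows).
X = [[1, 0, 2, 1, 0],
     [0, 1, 1, 0, 2],
     [1, 1, 0, 2, 1]]
# Step 2: a fixed attention matrix; column j adds position j and position j + 1.
A = [[1, 0, 0, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 1, 1]]
W1 = [[1, 0, -1], [0, 1, 0], [-1, 1, 0], [1, 1, 1]]   # 4 x 3, shared by every position
b1 = [0, -1, 0, -2]
W2 = [[1, 0, 0, 1], [0, 1, 1, 0], [1, -1, 0, 0]]      # 3 x 4
b2 = [0, 0, 1]

Z, Y = transformer_block(X, A, W1, b1, W2, b2)
```

Because the same `W1` and `W2` multiply every column, the FFN is position-wise in the sense the post describes; only the multiplication by `A` lets one position see another.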

Building The On-Chain Cooperative 🟡 Welcome to the dawn of a new era in the crypto space, where the buzzword "community" is not just a hollow echo but a vibrant force that propels us towards a brighter future. Let's delve into the heart of MODE, the Onchain Cooperative that seeks to redefine the landscape of web3. What does MODE stand for? MODE stands for building an on-chain cooperative focused on sustainable growth and collective prosperity. At its core, MODE is guided by the principles of cooperation, shared incentives, and community-driven development. The goal is to shift from the "fat protocol" mentality where most value accrues to the blockchain/protocol itself, towards an ecosystem where builders, users, and applications can thrive together. What’s MODE's vision and mission in the web3 space? MODE's vision is to return to web3's founding promise - a future that is better for all, not just the individual. A world with aligned incentives that drive growth for everyone involved. A place with opportunities for all, not just the few. The mission is to pioneer the on-chain cooperative - where contributors are rewarded fairly based on the value they provide. Features like Sequencer Fee Sharing distribute a portion of fees to smart contract developers, incentivizing participation. The aim is to encourage collaboration instead of confrontation. Together, the MODE community can deliver new models for cooperation and shared prosperity in web3. Mode Network will solve many problems today in Web3: • Lack of incentives for developers: Developers creating decentralized apps (dApps) currently have few direct economic incentives to create and maintain their projects. Mode provides them with a steady source of income through fee-sharing. • Lack of collaboration: There are few incentives for blockchain projects to compete less and collaborate more for the benefit of the entire ecosystem. Mode's model encourages collaboration by aligning participants economically. 
• Excessive value accrual at the protocol layer: Mode aims for a more balanced model where the protocol's success is fueled by the success of application developers/builders and the wider community. Growth is a two-way street – "as we grow, you grow".
The MODE Pledge 💛 The promise of crypto and blockchain is a brighter future. One that is better for all, not just the individual. Where nothing is more important than community. We've strayed from this path. Entering a world of player vs player. Where value is extracted rather than shared. The game is zero sum rather than positive sum. And incentives are aligned with domination, rather than cooperation. Mode is the dawn of a new age, and a return to the promise of what can be. A world with aligned incentives that drive growth for builders, users and projects. A place with opportunities for all, rather than the few. Where we say goodbye to the 'fat protocol', and hello to the onchain cooperative. Join us on our mission to grow together.
If this vision for a community-powered web3 ecosystem resonates - where creators are rewarded for their contributions - you can join the MODE on-chain cooperative! Visit Join the discord community Follow Mode 🟡 Together, we can transform web3 into a positive-sum game that unlocks new possibilities for all. Where your growth fuels the growth of others. Let's build the on-chain cooperative!

ETHachi Uchiha | Crypto DEGENius

16,774 views • 2 years ago

BREAKING: Anthropic just dropped Opus 4.8, and it is a MONSTER
We've been testing it for about a week at Every 📧, and our verdict is they could've just called it Opus 5, it's that good. Here's our vibe check:
- Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63, a hair higher than GPT-5.5's score of 62 and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results.
- Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark, which measures models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had a higher incidence of AI-isms; we found best results with high.
- Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark.
- Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic.
THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude.
Anthropic is back baby! Read the rest on Every 📧:

Dan Shipper 📧

354,033 views • 2 months ago

Model-Free Reinforcement Learning (MFRL) has been alluring, especially with physics simulation now supercharged on GPUs. However, these methods use zeroth-order gradient estimates and are often not the best optimizers. Can we do better than PPO in continuous control for robotics? Turns out yes! 🥳
tl;dr: Faster, better RL than PPO in continuous control 💪
The answer lies in using more information from the simulation. We are already juicing the simulation on GPU, so why not use it for gradients as well? This has been a driving question in a series of our works.
We first studied this problem in our ICLR 2022 paper on Short Horizon Actor Critic (SHAC). Naive gradient-based methods get stuck in local minima and suffer from exploding/vanishing gradients. SHAC solved this problem with truncated rollouts and model-based value estimation, where the model is a differentiable simulator. This boosted sample efficiency and wall-clock time immensely, especially in high-dimensional systems such as humanoids. Yet, given enough compute, PPO often caught up.
Our follow-up paper on Adaptive Horizon Actor Critic (AHAC) at ICML 2024 discovers the cause and provides a fix. We find that even when given ground-truth dynamics, not all gradients are useful due to sample error. First-order model-based reinforcement learning methods that employ differentiable simulation provide gradients with reduced variance but are susceptible to bias in scenarios involving stiff dynamics, such as physical contact. We find that back-propagating through contact and long trajectories drastically reduces gradient accuracy. Using this insight, we propose AHAC, which dynamically adapts its rollout horizon to avoid differentiating through stiff contact. AHAC is a first-order model-based RL algorithm that learns high-dimensional tasks in minutes (wall clock) and outperforms PPO by 40%, even in the limit of data provided to PPO.
This work is led by Ignat Georgiev alongside Krishnan Srinivasan, Jie Xu, and Eric Heiden, with ample assistance from the Warp team at NVIDIA Robotics (Miles Macklin).
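
The contrast between zeroth-order and first-order gradients in the post above can be illustrated with a toy example. This is only a sketch under stated assumptions: the point-mass "simulation", the function names, and the Gaussian-perturbation estimator are all made up for illustration (the estimator is a generic zeroth-order stand-in, not PPO, and the first-order pass is not SHAC or AHAC).

```python
import random

def rollout_loss(theta, steps=5, dt=0.1):
    """Toy differentiable 'simulation' (hypothetical): a point mass pushed by a
    constant force theta. Loss is the squared distance to target position 1.0."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        vel += theta * dt
        pos += vel * dt
    return (pos - 1.0) ** 2

def first_order_grad(theta, steps=5, dt=0.1):
    """First-order gradient: carry d(pos)/d(theta) through the same rollout."""
    pos, vel = 0.0, 0.0
    dpos, dvel = 0.0, 0.0
    for _ in range(steps):
        vel += theta * dt
        dvel += dt
        pos += vel * dt
        dpos += dvel * dt
    return 2.0 * (pos - 1.0) * dpos

def zeroth_order_grad(theta, sigma=0.1, samples=16, rng=None):
    """Zeroth-order estimate: only loss values are used, never derivatives.
    Averages (f(theta + sigma*eps) - f(theta)) / sigma * eps over random eps."""
    rng = rng or random.Random(0)
    base = rollout_loss(theta)
    total = 0.0
    for _ in range(samples):
        eps = rng.gauss(0.0, 1.0)
        total += (rollout_loss(theta + sigma * eps) - base) / sigma * eps
    return total / samples
```

In this toy setting the first-order gradient is exact after a single rollout, while the zeroth-order estimate is noisy and only converges as the number of sampled rollouts grows. The toy has no contact or stiff dynamics, so it does not show the bias issue that the post says AHAC addresses.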

Animesh Garg

52,300 views • 2 years ago

Don't train the model, evolve the harness.
I read a brilliant blog post from Hugging Face where they took a frozen open model scoring 0% on a hard legal agent benchmark, left its weights alone, and let an automated loop rewrite only the code around it. That code layer is the harness, the runtime wrapper that feeds the model context, runs its tool calls, and decides when a run ends. By the time the loop finished, the system had essentially matched Sonnet 4.6 on the benchmark's headline metric, at roughly 7x lower cost per task. Zero weights changed.
The gain existed because of where the model was failing. The judge only grades files saved in the right place under the exact requested filename, and the model kept doing the legal analysis correctly, then saving it under the wrong name, dropping it in a scratch folder, or never writing it at all. So the 0% was never measuring legal reasoning. It was measuring the harness.
Hand-tuning that layer is slow and model-specific, so they automated it. A Claude proposer adds exactly one mechanism per iteration, and an outer loop keeps it only if it clearly beats the current best, so accepted mechanisms compound.
What the loop discovered says a lot about where agents actually fail.
→ The biggest single gain was file handling, not intelligence. An automatic step that lands the deliverable exactly where the judge expects it beat every prompt change, with zero extra model tokens.
→ Code fixes transferred across models, prompt playbooks did not. The same harness lifted a smaller model from the same family by 14 points, but the tuned prompts hurt a different model family on tasks it could already finish.
→ The harness mattered more than anything else. Same model, same judge, same tasks, and five different harnesses scored anywhere between 3.5% and 80.1%.
The gains do eventually flatten, and the remaining misses look like real capability gaps. At some point the wrapper runs out of tricks and the model has to carry the work.
But the lesson holds. A benchmark score measures the model and its harness together, and until the harness is fixed, it's impossible to know which one failed.
I highly recommend reading this:
I also wrote a deep dive on agent harness engineering a while back, covering the orchestration loop, tools, memory, context management, and everything that turns a stateless LLM into a capable agent. The article is quoted below.
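
The propose-and-keep loop described above (one mechanism per iteration, kept only if it clearly beats the current best) might look roughly like the sketch below. Everything here is an assumption for illustration: `evolve_harness`, `min_gain`, the mechanism names, and the toy scores are hypothetical and are not taken from the blog post.

```python
from typing import Callable, List, Tuple

Harness = List[str]  # a harness is modeled as the list of accepted mechanism names

def evolve_harness(
    propose: Callable[[Harness], str],
    evaluate: Callable[[Harness], float],
    iterations: int,
    min_gain: float = 1.0,
) -> Tuple[Harness, float]:
    """Greedy outer loop: try one new mechanism at a time and keep it only if
    the score improves on the current best by at least `min_gain`."""
    accepted: Harness = []
    best = evaluate(accepted)
    for _ in range(iterations):
        candidate = propose(accepted)      # exactly one new mechanism
        trial = accepted + [candidate]
        score = evaluate(trial)
        if score >= best + min_gain:       # "clearly beats the current best"
            accepted, best = trial, score  # accepted mechanisms compound
    return accepted, best

# Toy stand-ins with made-up numbers (not results from the benchmark).
TOY_GAINS = {"save_to_expected_path": 40.0, "longer_prompt": 0.2, "retry_tool_calls": 5.0}
_pool = iter(TOY_GAINS)

def toy_propose(accepted: Harness) -> str:
    return next(_pool)

def toy_evaluate(harness: Harness) -> float:
    return sum(TOY_GAINS[m] for m in harness)

harness, score = evolve_harness(toy_propose, toy_evaluate, iterations=3)
```

With these toy numbers the loop keeps the file-handling and retry mechanisms and rejects the prompt tweak, because its gain falls below the margin. In a real setup, `evaluate` would run the benchmark, which is far more expensive and noisier than this stand-in.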

Akshay 🚀

243,774 views • 1 month ago

FACES OF THE CRIMINALS AFTER NIGERIANS LIFE: Dear President @officialasiwajubat instead of addressing serious allegations brought forth by the Nigerian Senate regarding NNPC audited accounts, the NNPC board and its leadership are preparing to travel to Kigali, Rwanda, this Friday on five private jets arranged by the son-in-law of Vice President Atiku Abubakar. Their goal is to conduct a board retreat at an exclusive resort in the Kigali hills. Led by Musa Kida, the board seems to be unaware of the distress caused to Nigeria by NNPC’s poor performance over the years. Rather than concentrating on improving NNPC, this so-called board of experts appears more focused on enjoying their own privileges. It is a disappointing situation for Nigeria and a shame on the presidency🤮 The hypocrisy of the supposed NNPC Golden team. We fear that this new A team may surpass the excesses of Kyaris wasteful regime! The organization is planning 25 Board members on a weekend retreat, which appears to be more akin to a personal vacation than a necessary business trip. The contrast between the plan for the expensive “Board retreat “ while supposedly on a cost-cutting mission raises questions about the organization’s priorities and commitment to fiscal responsibility. NNPC’s use of resources- The trip expenses include hire of 5 private jets to convey the Board to and from Kigali, expensive hotels and hosting. We hear that the Boss has a thing for big bottom ladies - What have you heard about Rwandan women?but not at the nations cost Oga!Ogun go kee all of una and I ready for una, we don enter the same trouser like this, una go hear weeeen Given the current financial constraints and emphasis on cost-cutting measures, this expenditure seems unjustifiable. This lapse of judgment warrants a second look to ensure transparency and accountability. 
@officialasiwajubat get to work The accusations of funds mismanagement by the senate at a time Nigeria needs every fund it can get, this is supposed to be the dream team the presidency put together to fix NNPC, Really ?? Eni shey orire seh EFCC Nigeria They stole trillions, what are you waiting for? NNPC Limited you all go soon get sense,

Gistlovers.blog1

15,305 views • 1 year ago

1 Neural Network + Obsidian + Karpathy’s 1-file method = the most unhinged second brain build of 2026. It remembers everything you’ve ever done, and it costs $0 on top of what you already pay.
The base is Karpathy’s append and review: 1 giant note, new thoughts stack on top, old ones sink, and every few days you reread and pull the survivors back up. No folders, no tags, no plugins: the rereading IS the system, because review is what turns storage into thinking. The flaw: past 10,000 lines, no human rereads anything. That’s where the neural network takes over.
You keep the note in Obsidian: 1 vault, everything dumps to the top (ideas, links, meeting fragments, half-thoughts). You never organize, you only dump. It all lives as plain markdown on your own disk, and that detail is the whole trick. Because now you point Claude Code at the vault folder, and it reads every line you’ve ever written. “What did I think about pricing in March.” “Find the 3 ideas I keep circling.” “What did I drop that deserves a second look.” It answers from YOUR notes, with quotes, in 15 seconds.
Then once a week, 1 prompt closes the loop: read the last 7 days, surface the 5 entries worth pulling back up, flag anything that contradicts what I wrote a month ago. The model does the sinking and surfacing Karpathy did by hand, and the note stays alive instead of turning into a graveyard.
Week 1 feels like nothing. Week 4 you hit the first “I already solved this in January.” Month 3 you consult your past self more than Google.
Most second brains die in 11 days under 40 plugins and 200 folders. This one is 1 file and a loop, and it compounds because dumping takes 0 discipline. Notion stores what you thought. This thing argues back.
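
The "dump to the top, then review the last 7 days" mechanic can be sketched in a few lines. This is a minimal illustration, assuming a made-up entry format (`- YYYY-MM-DD text`, one entry per line) and hypothetical helper names; the post does not specify a format, and the model-driven review step is not shown.

```python
from datetime import date, timedelta
from typing import List

def dump(note: str, thought: str, today: date) -> str:
    """Prepend a dated entry so the newest thought sits at the top of the note."""
    return f"- {today.isoformat()} {thought}\n{note}"

def last_days(note: str, today: date, days: int = 7) -> List[str]:
    """Collect entries from the last `days` days, e.g. as input for a weekly review."""
    cutoff = today - timedelta(days=days)
    recent = []
    for line in note.splitlines():
        if not line.startswith("- "):
            continue  # not an entry in the assumed format
        stamp, _, text = line[2:].partition(" ")
        try:
            when = date.fromisoformat(stamp)
        except ValueError:
            continue  # undated line, skip it
        if when > cutoff:
            recent.append(text)
    return recent
```

The helpers work on plain strings, so reading and writing the actual markdown file in the vault is left to the caller.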

West Lord

24,679 views • 16 days ago

Had an awesome time with Mark Haywood and the $edm.v team on the site tour! The DFO permit is a near-term major catalyst which, once received, will be followed by the final production decision. The water at the polishing pond after the tailings pond is astoundingly clean and loaded with fish, and tested as much cleaner than the water of the local streams and rivers. DFO permits are very rarely not granted, and $EDM.v's thoroughness and hard work are sure to be rewarded. The provincial government of Nova Scotia is also very supportive of the project and is eager to see them enter production.
EDM's cash flow for next year will likely be greater than the current mcap, only to continue ramping up to double the current mcap or more. Very few juniors are this close to cash flow with such a small capex needed for restart, which will be financed with debt and paid back in around half a year. $edm.v should be 3-4x higher now, at 0.5x NPV, and even higher once the DFO permit is received this summer, at 0.7-0.8x NPV. Right now we are trading near 0.10-0.15x NPV when you adjust for the DMS update, current metal prices, and doubled gypsum prices, which is crazy cheap for how advanced the project is.
Another impression from the tour is how solid and well made the 2700tpd mill is. It was built by Imperial Oil, a multi-billion-dollar company, who put in a lot of extras. There are about 5-10m CAD in replacement and maintenance parts on site from the previous operators, still in original packaging.
All of this is just the base case projection; it doesn't include the $gold optionality. Once the mine is in production and the main pit is drained, they can go back to where the last producers were pulling ore and resume mining high-grade gold, which left off at 20k to 60k ounces a year based on a rough back calculation from the leftover concentrates.
This gold is proven via production and recovery, it's absolutely there, and the team is actively working on defining the extent and quantities of it. I have been adding to my position this week. $zinc $lead $gold $silver $gypsum $emo.v $fwz.v $teck $nexa $glen

Zachariah Loney

12,670 views • 1 month ago

introducing a new, very fun LLM benchmark: the Game-of-Life Bench!
the rules are simple: given an 8x8 grid following Conway's game of life rules, the goal is to create an initial pattern with at most 32 cells that can last the longest number of turns before dying/repeating.
some results to highlight (with caveats detailed below):
- gpt 5.1 lasts the longest with a 106 step run
- claude models are really bad at this! they refuse to reason about this task and score < 25 points
- deepseek r1 is the best open model with 102 steps.
why? because i wanted to create a benchmark that has (i think) no practicality, but is still fun to look at, cheap, and still measures something interesting. i also am a big fan of the game of life. its absurdly simple rules leading to intractability is extremely cool to me. also, i saw a lot of work with LLMs trying to "predict" the next state in Conway's game of life; i think game-of-life bench is more fun because it's pretty open ended and only asks the LLM for the initial state. i also think this could be an RL env? but idk why you would ever train on this task haha
i don't think this is a "serious" benchmark because it doesn't measure anything practical, but i still think it's a hard benchmark exactly because you can't predict what happens with your initial state many turns into the future; this is why i was initially expecting all LLMs to be bad at it, but turns out, some are clearly better than the others (the ordering may surprise you!)
reminder: this is still a work-in-progress;
(1) i am gpu-poor so could only do 10 runs for each model, even though total running cost is relatively low. maybe with some more credits i can run more seeds for each model.
(2) i handpicked models which i think are at the frontier right now, plus some others that were on my mind. so, if you'd like to see a model on here, let me know.
(3) i currently only do an 8x8 grid because i thought that by itself would be pretty hard for current LLMs, but of course we can increase grid sizes!
(4) the coolest thing is, i don't think we can calculate the max possible number of states (yay undecidability!) you can go without repeating, so this is essentially a no-ceiling task, which is pretty cool!
again, i did this mostly out of a desire to make LLMs do something fun. if this keeps me entertained for a few more days, i'd likely release a blog post on it. if it keeps me entertained for a week (and someone sponsors me), i'll put more work into it :P
lastly, this is fully open sourced, so feel free to run this on your own!
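
A scorer for the rules described above could be sketched as follows. This is not the benchmark's actual code: it assumes cells outside the 8x8 grid are always dead (the post does not say whether the grid wraps), and it counts the steps taken until the grid is empty or a previously seen state recurs. The names `step` and `lifetime` are made up for this sketch.

```python
from typing import FrozenSet, Iterable, Tuple

Cell = Tuple[int, int]

def step(live: FrozenSet[Cell], size: int = 8) -> FrozenSet[Cell]:
    """One Game of Life generation on a bounded grid (outside cells stay dead)."""
    counts = {}
    for r, c in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < size and 0 <= nc < size:
                    counts[(nr, nc)] = counts.get((nr, nc), 0) + 1
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return frozenset(
        cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)
    )

def lifetime(initial: Iterable[Cell], size: int = 8, max_cells: int = 32,
             max_steps: int = 100000) -> int:
    """Number of steps until the pattern dies out or repeats an earlier state."""
    live = frozenset(initial)
    if len(live) > max_cells:
        raise ValueError("too many starting cells")
    if any(not (0 <= r < size and 0 <= c < size) for r, c in live):
        raise ValueError("cell outside the grid")
    seen = {live}
    steps = 0
    while live and steps < max_steps:  # max_steps is only a safety guard
        live = step(live, size)
        steps += 1
        if live in seen:
            break
        seen.add(live)
    return steps
```

Under this scoring convention, a 2x2 block (a still life) scores 1 and a blinker scores 2, since both return to an earlier state almost immediately.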

Akshit

13,722 views • 5 months ago

🚨 RWA PAD PLATFORM LIVE AND FIRST PROJECTS COMPETE ON DEMO DAY 🚨
Hello EstateX Family, the RWA Pad platform is live. Because the tech team is still occupied, changes are still being made to the website and the content is not completely final.
To be able to participate in the projects that will be coming to RWA Pad, you need to hold or lock $ESX. The following tiers qualify, with Tier 5 having first access AND getting the best deal.
Tier 5: Unicorn Club (1M+ tokens)
Tier 4: 500,000 tokens
Tier 3: 100,000 tokens
Tier 2: 50,000 tokens
Tier 1: 10,000 tokens
This also means that new investors / their community need to buy $ESX to participate.
🚨 DEMO DAY ON THE 17TH OF FEBRUARY
On the 17th of February, the first projects will compete for a spot on the launchpad. These projects will bring their audience, and the judges will be our (famous) project partners and KOLs. On top of that, the EstateX Family will have a decisive vote on the projects that they want to see on the launchpad. In a battle-style format, they will compete, create content and viral moments, and bring their audience. A week later the first raise will happen on RWA Pad, followed by a buyback from the revenue generated.
No matter what happens, the team keeps showing up. On Monday we have the L1 chain Beta launch with a Graham AMA, Sky Villas opening up, other property payments opening up with continued sales, the demo day, and the RWA Pad raise, all followed by buybacks from the generated revenue. Make sure to watch today’s AMA, as we also discussed price action, the team, the switching of the legal framework to make property sales more efficient (hence the delay), how the team moves forward, and what’s upcoming.
🚨 REGISTER FOR YOUR TIER ON RWA PAD
You can register for your tier on RWA Pad now by selecting one of the five. Click register now on the site below.
*Content is not finalized yet and subject to change.
We are also curious if you want to see other sectors apart from RWA (like AI, perps, prediction markets, privacy tokens or anything else). Let us know below 👇

EstateX

83,238 views • 6 months ago

we sped up distributed inference by up to 5x with decentralized speculative decoding.
many don't realize that AI models normally generate text one single word at a time, waiting for the network after every word. speculative decoding changes this by using a "guess & confirm" system, similar to autocomplete.
how it's done:
1. draft locally (the guess): instead of waiting for the network, a tiny, fast model on your device guesses the next few words instantly.
2. confirm remotely (the check): the massive remote model doesn't generate from scratch; it just checks the draft. it looks at the guesses in a batch and says "yes, yes, no." you get multiple words in the time it usually takes to get one.
3. adaptive logic: dsd is smart. if the topic is creative, it lets the draft flow loose. if the topic is math or code, it checks more strictly. it balances speed and precision automatically so your inference feels almost instant.
find out more: paper: blog:
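
The "guess & confirm" steps above can be sketched with toy models. This is a simplified greedy variant under stated assumptions: `target_next` and `draft_next` are hypothetical functions that return the next token for a context, a draft token is accepted only if it equals the target's own choice, and the adaptive strictness from step 3 is not modeled. It is not the method from the paper.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]

def speculative_decode(target_next: NextToken, draft_next: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Return (tokens, rounds), where `rounds` counts verification passes,
    each of which would be one batched call to the remote target model."""
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < max_new:
        # 1. Draft locally: the small model guesses k tokens in a row.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. Confirm remotely: check each guess against the target's choice.
        rounds += 1
        accepted: List[int] = []
        for i in range(k):
            expected = target_next(tokens + draft[:i])
            if draft[i] == expected:
                accepted.append(draft[i])   # "yes"
            else:
                accepted.append(expected)   # "no": use the target's token and stop
                break
        else:
            # All k guesses accepted: the target's pass also yields one extra token.
            accepted.append(target_next(tokens + draft))
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new], rounds

# Toy models (made up): the target counts upward mod 10; the draft agrees
# except right after the token 2, where it guesses wrong.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10

def toy_draft(ctx: List[int]) -> int:
    return 9 if ctx[-1] == 2 else (ctx[-1] + 1) % 10

out, rounds = speculative_decode(toy_target, toy_draft, prompt=[0], max_new=8)
```

In this toy run the output matches what the target model alone would produce, but it takes 2 verification rounds instead of 8 sequential target calls. The saving depends entirely on how often the draft agrees with the target.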

Parallax

45,584 views • 6 months ago

INTRODUCING OCBTW: THE FIRST PANDA-BACKED ASSET A FIRST-OF-ITS-KIND 2-WAY NFT NFT SWAP We just deployed a smart contract on Bitcoin that allows you to mint "OCBTW", a limited edition generative art collection by tclow.sats, solely by swapping an Alkane Pandas ⬢ OCBTW has a supply of 200 unique pieces and is inspired by Oyl | Building Alkanes's signature hexagon logo that defines all native Alkane assets (e.g., $DIESEL). It is 100% on-chain generative art using three.js code via HTML ⬢ OCBTW can only be minted by swapping a Panda using the 2:70104 smart contract. After 200 swaps have been completed, on a first-come-first-served basis, the mint will conclude. However, OCBTW can be swapped back to a Panda at any time, freeing up supply for another Panda holder to collect an OCBTW piece from the contract. ⬢ OCBTW is thus backed entirely by Pandas, meaning each piece will never be worth less than a Panda. This is the first time on Bitcoin that an NFT is minted with another NFT. This may be the first time this has ever been done on any chain. ⬢ OCBTW also effectively locks up as many as 200 Pandas in perpetuity. Just like AP-69, this swap contract has no withdrawal function. The only way to get Pandas out of the contract is to swap your OCBTW back to a Panda. This operates on a last-in, first-out (LIFO) basis. ⬢ OCBTW can be minted by calling the 2:70104 contract using opcode 42 and including a Panda in the transaction, either by using or You can only mint 1 OCBTW per transaction. Rarity of Pandas has no effect on what OCBTW mint you will receive. ⬢ OCBTW can be viewed natively in browser on iDclub 💥 Building Alkanes at the following link: Please be patient after minting for their indexer to update after blocks clear. Please also note that the Ordiscan Alkanes indexer is currently down. ⬢ OCBTW is purely art. Art on Bitcoin. Forever.
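The swap rules in the post (a cap of 200, one Panda in per mint, swap-back on a last-in, first-out basis, no other withdrawal path) can be modeled with a simple stack. The sketch below is a toy model for illustration only: the class `SwapVault` and its methods are hypothetical names, and it is not the 2:70104 contract or any Alkanes API.

```python
from typing import List


class SwapVault:
    """Toy model of a capped, LIFO, two-way swap (hypothetical names)."""

    def __init__(self, max_supply: int = 200) -> None:
        self.max_supply = max_supply
        self._locked: List[str] = []  # stack of deposited Panda ids

    @property
    def outstanding(self) -> int:
        # One piece is outstanding for every Panda currently locked.
        return len(self._locked)

    def mint(self, panda_id: str) -> None:
        """Deposit one Panda and mint one piece, if supply remains."""
        if self.outstanding >= self.max_supply:
            raise RuntimeError("mint concluded: all pieces are outstanding")
        self._locked.append(panda_id)

    def swap_back(self) -> str:
        """Return a piece and receive the most recently deposited Panda."""
        if not self._locked:
            raise RuntimeError("no Pandas are locked")
        return self._locked.pop()  # LIFO: last in, first out


vault = SwapVault(max_supply=2)  # small cap to keep the example short
vault.mint("panda-17")
vault.mint("panda-42")
returned = vault.swap_back()
print(returned, vault.outstanding)
```

Because the stack is the only storage and `swap_back` is the only way to pop from it, the model reflects the post's claim that every outstanding piece stays backed by exactly one locked Panda, and that swapping back frees supply for another mint.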

Alkane Pandas

22,591 views • 11 months ago

Dynamic workflows are a generalization of harnesses, automations, loops, routing, and graphs. It's the most powerful feature I have built into my agent orchestrator. It supports all kinds of patterns that leverage different agent backends (claude, codex, pi, hermes,...). It's a meta-harness approach that unlocks new forms of test-time compute. Examples of use cases it supports: > LLM councils to get different perspectives from LLMs or plan more intensively > Dynamically routing tasks to different agents based on needs (e.g., cost efficiency and optimal intelligence) > Advisor/Judge + executor workflows and pretty much any complex graph-based pattern required by the task. I find it especially useful for long-running work and code reviewing. > Agent teams that talk to each other if needed for the task. I like to use this for AI editing, artifact creation, and other creative tasks. And I am sure it supports so many things that I haven't discovered yet. I got inspired by the dynamic workflow feature released by the Claude Code team. I had actually built it earlier this year but wanted to generalize it across different agent backends. I think this is going to become more popular in the coming days. I will share more of my findings soon.


elvis

31,939 views • 9 days ago

Ethereum Strategic Reserve & Launchpool Announcement 🚀 💰 Ethereum Strategic Reserve - 410 ETH StarHeroes DAO is thrilled to announce the creation of the Strategic ETH Reserve, seeded with 410 ETH (~$1.5M) from the DAO’s decentralized Treasury. Inspired by visionary moves from industry leaders like SharpLink Gaming and GameStop, this reserve positions StarHeroes as a pioneer in accumulating ETH while bolstering decentralized infrastructure. The reserve will support: ✅ Ethereum staking ✅ Restaking ✅ On-chain storage More than just accumulating ETH, this initiative is designed to enhance the utility and long-term value of the $STAR token, directly rewarding decentralized holders and strengthening the ecosystem. 🫂 Community Launchpool - 50 ETH Distribution From thrilling in-game seasons to competitive esports leagues and electrifying IRL grand finals, StarHeroes has always been about delivering epic rewards to our community. Now, we’re taking it to the next level with the Season 1 Launchpool, distributing 50 ETH (worth ~$185,000) to $STAR stakers! This is the first time that the entity holding the Strategic Reserve is directly transferring value to the Token holders! 🚀 50 ETH locked in a galactic vault, ready to be unlocked by staking $STAR! 📅 Launch Date: Tuesday, July 29, 2025, at 15:00 UTC—one day before Ethereum’s 10th birthday! Stay tuned for more details and get ready to stake your $STAR for out-of-this-world rewards!


StarHeroes

47,096 views • 1 year ago

Welcome to CyberNetwork, where the future of gaming meets the power of blockchain technology. CyberNetwork is revolutionising the gaming industry with a decentralised platform that offers unparalleled opportunities for gamers, developers, and investors. By leveraging blockchain technology, we ensure transparency, security, and fairness, transforming traditional gaming models into more immersive and rewarding experiences. CyberNetwork is built on SovereignChain technology from Multiversᕽ, featuring: ⚡️ 100K Transactions Per Second (TPS): Ensuring fast and seamless interactions. ⚡️ 1-Second Block Timing: Providing near-instantaneous transaction confirmations. ⚡️ 2-Second Finality: Guaranteeing rapid settlement and reduced waiting times. This robust infrastructure supports our mission to create a more equitable, transparent, and rewarding gaming ecosystem. Leveraging #MultiversX's Interoperability layer, CyberNetwork is able to seamlessly interact with major blockchains like #Bitcoin, #Ethereum, and #Solana. CyberNetwork Features: 1. Gaming Launchpad 🚀: Empowering game developers to launch and scale their projects. 2. Game Store 🏪: A comprehensive game distribution platform for discovering and purchasing games. 3. Digital Asset Marketplace 🎨: Facilitating the creation, buying, and selling of in-game assets. 4. DEX 💱: Enabling decentralized exchanges of gaming tokens and assets. 5. Gamer's Passport 🎮: A unified identity and achievement system across games. 6. Metaverse Services 🌌: A service for game developers to create high-density virtual worlds. CyberNetwork is set to transform how we experience and engage with games. In the upcoming days, we’ll be unveiling each of these exciting features one by one. Stay tuned for a journey of discovery! For more info: Telegram Discord #Web3Gaming #BlockchainGaming


CyberNetwork

74,805 views • 2 years ago

This guy built a visual scanner that reads 468 points on his face and 42 points on his hands from a regular webcam and turns them into a cloud of thousands of particles right between his palms. Inside, MediaPipe and TouchDesigner are linked: the first captures hands and face from the webcam with high accuracy, the second turns those coordinates into a live plane and feeds it into a POP system that instantly generates a swarm of particles in the shape of a head. No studio, no render farm, no VR headset. Just a laptop, a webcam, and 1 TouchDesigner session. Traditional VJ studios keep teams of 5 people on a setup with lighting, custom hardware, and commercial plugins, while his expenses are only a TouchDesigner subscription and a regular USB camera. One laptop runs MediaPipe and TouchDesigner simultaneously, holds the camera stream at 60 FPS without drops, and in parallel processes 468 face points + 21 points on each hand. The camera captures frame after frame, MediaPipe sends TouchDesigner the finger coordinates and face geometry in real time, and the POP operator inside the engine translates those numbers into thousands of particle points with colors from bright pink to gold. This setup immediately defines the role of the tool and the limits of its autonomy. It knows where the fingertips are at every moment of the frame. It knows how to read the face geometry at any angle to the camera. It knows how to draw a swarm of particles between them with the right color and contour.
→ MediaPipe pulls 468 points from the face and 21 points from each hand, 60 times per second → TouchDesigner receives those coordinates, builds a virtual rectangle between the fingertips, and feeds it into the POP system → POP generates thousands of particle points in the shape of a head, coloring them in a gradient from bright pink to gold → The HUD layer adds green corners and a blue neon frame, styling the image like an AR interface → All layers assemble into 1 real-time frame that projects back onto the video in the camera window → The final image is recorded to a file or broadcast to a projector for a live installation And only when the guy spreads his hands wider does the plane between the palms stretch; brings them together, it narrows. Otherwise the system runs on its own. And when he moves from his home room to a concert hall, the same laptop with the same webcam launches the same TouchDesigner session in just 5 minutes, without reconfiguration, without a new team, and without a single line of new code. In his work setup there is no studio of his own and no team for assembly. On the desk sits a laptop with a webcam, on top run MediaPipe and TouchDesigner with POP operators, and the same setup through a USB camera moves to any concert without a new configuration. Out of everything I have seen this year, this is the cleanest Creative Coding setup on 1 laptop: 0 render farms, 0 studio lighting, and between them 3 libraries, thousands of particle points, and 1 webcam.


Blaze

38,242 views • 2 months ago