TensorBlock

@tensorblock_aoi • 1,601 subscribers

the runtime protocol for intelligent systems

Shorts

TensorBlock 🤝 Z.ai GLM-4.5, most advanced LLM, is now live on TensorBlock Forge. Just connect your API key to access it seamlessly through Forge’s unified interface. This collaboration brings state-of-the-art AI capabilities to users—enabling smarter code generation, reasoning and stronger agentic functionalities. Learn more and get started here:

18,434 просмотров

Videos

LIVE

1.2k

Anya Rossi

sweetdream.ai

SweetDream.ai•Sponsored•Livecam

Streaming Now

Watch Anya Live

Anya is streaming live right now! Join her private show and enjoy exclusive content.

HD live stream

Exclusive private shows

1.2k viewers online

Current Status

Live

Private Show

Join now for exclusive access

Free preview available • Premium content

2:45

QwQ-32B (Qwen ) model sharding between M1 MacBook Pro (16GB) + RTX 4060 Ti. Enabling efficient inference through model quantization and cross-device parallel computing. Demonstrating production-ready performance on consumer hardware. Technical benchmarks coming soon.

TensorBlock

24,147 просмотров • 1 год назад

0:21

🚨 TensorBlock Forge is now live on Product Hunt! One API, all your AI models. OpenAI-compatible, privacy-first, and prod-ready. Help us hit the front page by leaving a comment and upvote!

TensorBlock

17,101 просмотров • 1 год назад

$Successfully deployed Deepseek R1 Distilled 70B (AWQ) across 8x NVIDIA RTX 3080 10G GPUs, achieving 60 tokens/s with full tensor parallelism via PCIe. Total hardware cost: $6,400 This demonstrates that consumer GPUs can deliver substantial ML inference capabilities at a fraction of the cost of datacenter hardware. For perspective, a single A100 80G costs $17,550, while a H100 80G runs $25,000. Our testing validates an often-overlooked opportunity: millions of idle crypto mining rigs could be repurposed into a powerful, distributed AI infrastructure. The performance-to-cost ratio of consumer GPUs, especially when properly optimized for tensor operations, presents a compelling case for decentralized AI compute. We're excited to share more insights as we continue pushing the boundaries of consumer hardware in AI workloads.$

1:20

Successfully deployed Deepseek R1 Distilled 70B (AWQ) across 8x NVIDIA RTX 3080 10G GPUs, achieving 60 tokens/s with full tensor parallelism via PCIe. Total hardware cost: $6,400 This demonstrates that consumer GPUs can deliver substantial ML inference capabilities at a fraction of the cost of datacenter hardware. For perspective, a single A100 80G costs $17,550, while a H100 80G runs $25,000. Our testing validates an often-overlooked opportunity: millions of idle crypto mining rigs could be repurposed into a powerful, distributed AI infrastructure. The performance-to-cost ratio of consumer GPUs, especially when properly optimized for tensor operations, presents a compelling case for decentralized AI compute. We're excited to share more insights as we continue pushing the boundaries of consumer hardware in AI workloads.

TensorBlock

16,219 просмотров • 1 год назад

Больше нет контента для загрузки

Live Cam

TensorBlock

Shorts

Videos

Watch Anya Live

QwQ-32B (Qwen ) model sharding between M1 MacBook Pro (16GB) + RTX 4060 Ti. Enabling efficient inference through model quantization and cross-device parallel computing. Demonstrating production-ready performance on consumer hardware. Technical benchmarks coming soon.

🚨 TensorBlock Forge is now live on Product Hunt! One API, all your AI models. OpenAI-compatible, privacy-first, and prod-ready. Help us hit the front page by leaving a comment and upvote!