正在加载视频...

视频加载失败

Introducing Parallax, the first fully distributed inference and serving engine for large language models. Try it now: 🧵

160,943 次观看 • 11 个月前 •via X (Twitter)

10 条评论

Gradient Network 的头像
Gradient Network11 个月前

AI is reaching a bottleneck. LLMs are reshaping how we think, build, and create, but their demand for tokens is outpacing what centralized infra can deliver. Chips saturated; Power grids strained; Intelligence remains locked behind high-cost silos. We need a new paradigm.

Gradient Network 的头像
Gradient Network11 个月前

Parallax reimagines model inference as a global, collaborative process, one where models are no longer chained to centralized infrastructure, but are instead recomposed, executed, and verified across a global mesh of compute.

Gradient Network 的头像
Gradient Network11 个月前

The engine introduces 3 foundational shifts: – Intelligence sovereignty: serve models from the hardware you trust – Composable inference: GPUs, Apple Silicon, desktops working in harmony – Latent compute: activate into the world’s untapped compute

Gradient Network 的头像
Gradient Network11 个月前

The Parallax Runtime Layer is the core orchestration engine for high-throughput, server-side LLM serving across distributed, heterogeneous networks. It delivers server-grade optimizations—from continuous batching to paged KV-cache—and is the first MLX-based framework to enable professional-grade inference on Apple Silicon. By unifying NVIDIA GPUs and Apple devices into a single compute fabric, Parallax brings frictionless decentralized AI to everyone.

Gradient Network 的头像
Gradient Network11 个月前

Parallax runs on a distributed architecture called the Swarm: a dynamic network of nodes that collaboratively serve LLMs. Each prompt is processed across heterogeneous nodes, with each handling a segment of the model. The result: real-time inference that is decentralized, fluid, and verifiable.

Gradient Network 的头像
Gradient Network11 个月前

Compared to Petals (BitTorrent-style serving), Parallax running Qwen2.5-72B on 2× RTX 5090s achieved: – 3.1× lower end-to-end latency, 5.3× faster inter-token latency – 2.9× faster time-to-first-token, 3.1× higher I/O throughput Results were consistent and showed great scalability across different input configurations, and this is just the beginning.

Gradient Network 的头像
Gradient Network11 个月前

Now live: a chatbot fully powered by Parallax. Every response is generated peer-to-peer with no centralized server involved. Experience decentralized LLM inference:

Gradient Network 的头像
Gradient Network11 个月前

The swarm is growing. Apply to join the Edge Host Pilot Program to scale the world’s intelligence:

Gradient Network 的头像
Gradient Network11 个月前

Parallax is a major step toward a future where intelligence is hosted, served, and owned by all. Together with Lattica, the decentralized AI base stack is taking shape. Parallax will be open-sourced soon. Read our full blog on Parallax 👇

rw./ 🌐 的头像
rw./ 🌐11 个月前

./ This is just the start… 👀🌐

相关视频