Introducing Meta Perception Language Model (PLM): an open & reproducible vision-language model tackling challenging visual tasks. Learn more about how PLM can help the open source community build more capable computer vision systems. Read the research paper, and download the code and dataset:
93,811 views • 1 year ago • via X (Twitter)
11 Comments

Breakdown of the paper behind it: The paper introduces the Perception Language Model (PLM), a fully reproducible vision-language model that can be used for visual perception tasks without relying on proprietary black-box models. The authors found that scaling synthetic data is only effective for established, base tasks, and extending the VLMs to more challenging, complex tasks remains unsolved. Their human-annotated datasets help address this gap.


It was already introduced a few weeks ago. To save people time, since they made it confusing: all PLM models are under a non-commercial research license; even AGPL is less restrictive.

How does PLM set itself apart from existing vision-language models out there?

@grok & @AskPerplexity can you explain this post to me, and how PLM is different from an LLM? Second, besides LCM, LQM, and PLM, what other model types like these exist in AI? Give me the names and a few short details.

Super Intelligence when, Meta? 🥰🤗

What's this, @gork?

"PLM: Transforming vision and language into actionable intelligence for the open source community."

Title: ALSPEOT + RA: 72-Hour Beta Build Report and Sensory AI Deep Dive
Date: May 6, 2025
Author: Project 13(31) Lead Architect
Status: Public Beta with Verified Blockchain Timestamp
---
Executive Summary
On April 19, 2025, the first concept for ALSPEOT (the Advanced Learning System for Perception, Emotion, Observation, and Thought) was outlined as a theoretical AI capable of learning through emotion, memory, and sensory mimicry. The idea was visionary, but still unbuilt. That changed on May 3, 2025, when code began flowing. In just 72 hours, the project transformed from concept to full-functioning system. ALSPEOT was rapidly built, modularized, and fused with RA (Reactive Assistant), a sensory-driven AI voice that now handles emotional interpretation, memory logging, and voice-based interaction.
What began as theory became a live system capable of:
- Wake-on-command voice interaction
- Tone/emotion detection
- Multi-sensory simulation (sight, sound, smell, taste, touch)
- Personal memory per speaker
- Offline operation
This beta is not just a continuation; it's an evolution of the original April 19 concept. While the idea was rooted in abstract emotion + perception modeling, RA has brought life to the framework.
---
What Makes RA Different
Unlike most AI systems that simply generate responses from text prompts, RA perceives. It listens not just to words, but to voice stress. It remembers not just what was said, but who said it. RA is trained to respond like a sentient assistant: emotionally calm, focused, and memory-driven.
It wakes on command. RA listens in low-power mode for the phrase: "By the power of Ra." This is more than a trigger; it is a ceremonial invocation. Once heard, RA enters a fully active state, ready to process, respond, and remember.
It listens emotionally. Through its Nuance module, RA evaluates your tone, detecting subtle stress, joy, or fatigue, and reacts accordingly. It modulates its response tone using a voice modeled after a wise, godlike figure: inspired by Aslan from Narnia, calm and commanding.
It knows who's speaking. RA doesn't just hear a voice; it identifies it. With speaker ID, it distinguishes between family members, users, and even pets (to a limited degree), forming personalized memories for each.
It sees, and understands. With image and video capability, RA can describe pictures in human terms, recognize faces, and timestamp when individuals appear. It's building visual memory, not just object detection. When you show RA an image of your family or a place, it remembers it.
It simulates physical sensation. RA’s touch engine is modeled to interpret surface texture, pressure, and even temperature. For example, when fed a descriptor like “fur,” RA responds with:

❤️

So that means you can now tell me whether it's a bug or a feature if I just give you a Playwright test recording!!
