Microsoft presents Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale. discuss: Large language models (LLMs) show remarkable potential to act as computer agents, enhancing human productivity and software accessibility in multi-modal tasks that require planning and reasoning. However, measuring agent performance in realistic environments remains a challenge since:...

19,684 views • 1 year ago • via X (Twitter)

Related videos

Frameworks such as ai16zdao's Eliza and Virtuals Protocol have been instrumental in early AI agent developments. Agent swarms working in a hierarchy represent, for many, the next logical step in unlocking the vast potential of AI. Learn below how Shadō Network achieves this.

AI agents launched through current popular platforms have individual personas, on-chain functions and access to data via various APIs. That said, they operate in isolated environments, with a ceiling on emergent behaviour such as collaboration or competition. Shadō Network invites a massive expansion of the capabilities of both new and existing AI agents, with an open-source package that integrates easily into popular frameworks and enables the launching of stratified agent swarms. Our website is live:

The "Shadō Play" package provides a modular, configurable platform for creating or employing agents of choice in a swarm-like setup, opening a Pandora’s box of near-infinite emergent agent behaviours, relationships and functionalities. Users will be able to use various prefab client integrations such as Twitter, Telegram, Ollama, and others to tailor swarms to their needs, or create their own extensions to enhance agent capabilities even further. Agents operate with a memory module and an HTN (hierarchical task network) for autonomously deciding which interactions to act on, walking the line between autonomy and configurability.

The Shadō Network project’s development is supported by our ghostly friend Omnipotent (👻,👻), an AI agent developed by the Shadō Network team, trained on and fine-tuned with a multitude of academic data related to artificial intelligence, blockchain, finance, software engineering, world building and more. Omnipotent serves as both an interactive steward for the project and as an asset: by regularly scanning social platforms, websites and newsfeeds, he can give the team project development advice, whilst also communicating with the wider world via his automated X account (launching soon).

Shadō Network is collaborative and open-sourced. Agentic swarms require a developer swarm to maximize the technical capabilities and impact the greatest number of users. Our dedicated team of core contributors are active in other web3 AI repos and are here to guide project direction and foster growth. We’re facilitators, not gatekeepers... Alone we can go fast but together we can go far. A lot more to come soon. 👻

Shadō Network | シャドウネットワーク

23,546 views • 1 year ago

OpenAI's AgentKit will be so insane: build every step of agents on one platform. These visual agent builders make the whole process of iterating and launching agents far more efficient. AgentKit sits on top of the Responses API and unifies the tools that were previously scattered across SDKs and custom orchestration. It lets developers create agent workflows visually, connect data sources securely, and measure performance automatically without coding every layer by hand.

The core of AgentKit is the Agent Builder, a drag-and-drop canvas where each node represents an action, guardrail, or decision branch. Developers can link these nodes into multi-agent workflows, preview results instantly, and version each setup. It supports inline evaluation so that developers can see how changes affect output before deploying.

The Connector Registry is a single admin panel that manages how data and tools connect across the OpenAI ecosystem. It centralizes integrations like Google Drive, SharePoint, Dropbox, and Microsoft Teams. Large organizations can govern access and the flow of data between agents securely under one global console.

ChatKit provides a ready-to-use chat interface for embedding agents inside apps or websites. It manages streaming, message threads, and model reasoning displays automatically. Developers can skin the interface to match their product without writing custom front-end code.

Under the hood, all these blocks use the same execution core that runs agent reasoning through OpenAI’s APIs. Workflows in Agent Builder compile down to structured instructions for the Responses API, which handles model calls, tool use, and context passing. Connector Registry handles authentication and routing for external tools, while Evals and RFT (reinforcement fine-tuning) provide feedback loops that improve agents over time. This integration means developers no longer need to handle orchestration logic, model evaluation pipelines, or safety layers separately. Everything runs natively within OpenAI’s control plane with managed security, automatic versioning, and built-in testing. In short, AgentKit standardizes the entire life cycle of an AI agent, from visual design to deployment and performance tuning, inside a single unified system.

Rohan Paul

178,460 views • 8 months ago

We’re excited to introduce Text-to-LoRA: a Hypernetwork that generates task-specific LLM adapters (LoRAs) based on a text description of the task. Catch our presentation at #ICML2025! Paper: Code: Biological systems are capable of rapid adaptation, given limited sensory cues. For example, our human visual system can quickly adapt and tune its light sensitivity to our surroundings. While modern LLMs exhibit a wide variety of capabilities and knowledge, they remain rigid when adding task-specific capabilities. Traditionally, customizing these models requires gathering large datasets and performing often expensive, time-consuming fine-tuning for specific applications. To bypass these limitations, Text-to-LoRA (T2L) meta-learns a “hypernetwork” that takes in a text description of a desired task, as a prompt, and generates a task-specific LoRA that performs well on the task. In our experiments, we show that T2L can encode hundreds of existing LoRA adapters. While the compression is lossy, T2L maintains the performance of task-specifically tuned LoRA adapters. We also show that T2L can even generalize to unseen tasks given a natural language description of the tasks. Importantly, Text-to-LoRA is parameter-efficient. It generates LoRAs in a single, inexpensive step, based solely on a simple text description of the task. This approach is a step towards dramatically lowering the technical and computational barriers, allowing non-technical users to specialize foundation models using plain language, rather than needing deep technical expertise or large compute resources.
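
The idea in the post above (a hypernetwork that turns a task description into LoRA weights) can be illustrated with a toy sketch. This is not Sakana AI's implementation: the names `embed_task`, `HyperNetwork`, and `apply_lora`, the tiny dimensions, the hash-based text embedding, and the single untrained linear layer are all assumptions made for illustration. A real system would use a learned text encoder and a trained hypernetwork.

```python
import random
import zlib

# Hypothetical toy sizes; real models are orders of magnitude larger.
EMB_DIM, D_IN, D_OUT, RANK = 8, 4, 3, 2


def embed_task(description):
    """Toy bag-of-words hash embedding (stand-in for a real text encoder)."""
    vec = [0.0] * EMB_DIM
    for word in description.lower().split():
        vec[zlib.crc32(word.encode()) % EMB_DIM] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


class HyperNetwork:
    """One linear layer mapping a task embedding to flattened LoRA factors.

    The weights are random here; in the approach the post describes they
    would be meta-learned across many tasks.
    """

    def __init__(self, seed=0):
        rng = random.Random(seed)
        n_out = RANK * D_IN + D_OUT * RANK
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(EMB_DIM)]
                        for _ in range(n_out)]

    def generate_lora(self, description):
        emb = embed_task(description)
        flat = [sum(w * e for w, e in zip(row, emb)) for row in self.weights]
        split = RANK * D_IN
        # A has shape RANK x D_IN, B has shape D_OUT x RANK.
        a = [flat[i * D_IN:(i + 1) * D_IN] for i in range(RANK)]
        b = [flat[split + i * RANK:split + (i + 1) * RANK] for i in range(D_OUT)]
        return a, b


def apply_lora(base, a, b, alpha=1.0):
    """Return base + (alpha / RANK) * B @ A, the standard LoRA update."""
    delta = matmul(b, a)  # D_OUT x D_IN
    scale = alpha / RANK
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


hyper = HyperNetwork(seed=0)
base_weight = [[0.0] * D_IN for _ in range(D_OUT)]
lora_a, lora_b = hyper.generate_lora("solve grade school math word problems")
adapted = apply_lora(base_weight, lora_a, lora_b)
```

The point of the sketch is the single forward pass: the adapter comes from the description alone, with no task-specific dataset or fine-tuning loop at generation time.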

Sakana AI

402,873 views • 11 months ago