We’re excited to introduce Text-to-LoRA: a hypernetwork that generates task-specific LLM adapters (LoRAs) based on a text description of the task. Catch our presentation at #ICML2025!

Paper: Code:

Biological systems are capable of rapid adaptation, given limited sensory cues. For example, our human visual system can quickly adapt and...

402,914 views • 1 year ago • via X (Twitter)

Related Videos

Microsoft presents Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Large language models (LLMs) show remarkable potential to act as computer agents, enhancing human productivity and software accessibility in multi-modal tasks that require planning and reasoning. However, measuring agent performance in realistic environments remains a challenge because (i) most benchmarks are limited to specific modalities or domains (e.g. text-only, web navigation, Q&A, coding) and (ii) full benchmark evaluations are slow (on the order of days) given the multi-step sequential nature of tasks. To address these challenges, we introduce the Windows Agent Arena: a reproducible, general environment focusing exclusively on the Windows operating system (OS), where agents can operate freely within a real Windows OS and use the same wide range of applications, tools, and web browsers available to human users when solving tasks. We adapt the OSWorld framework (Xie et al., 2024) to create 150+ diverse Windows tasks across representative domains that require agent abilities in planning, screen understanding, and tool usage. Our benchmark is scalable and can be seamlessly parallelized in Azure for a full benchmark evaluation in as little as 20 minutes. To demonstrate Windows Agent Arena's capabilities, we also introduce a new multi-modal agent, Navi. Our agent achieves a success rate of 19.5% in the Windows domain, compared to the 74.5% success rate of an unassisted human. Navi also demonstrates strong performance on another popular web-based benchmark, Mind2Web. We offer extensive quantitative and qualitative analysis of Navi's performance, and provide insights into the opportunities for future research in agent development and data generation using Windows Agent Arena.

AK

19,684 views • 1 year ago

Boom! Grok Tasks Make It One Of The Most POWERFUL Real-Time AI Systems In The World.

My How to Use Grok Tasks With Hidden Tools For Powerful Daily Output.

Grok Tasks are customizable AI workflows that integrate a variety of tools to streamline daily activities, from research and analysis to creative planning and problem-solving. I have been using them for quite some time, and because of the vital heartbeat of news and first-person data on X, it is the most powerful AI platform available. By combining Tasks with tools like web searches, X platform interactions, code execution, and media viewers, you can build efficient, automated processes. Tasks work by prompting Grok with a clear description of what you want to achieve; Grok then calls the necessary tools, in sequence or in parallel, to deliver results.

Here's a step-by-step guide to creating and using Grok Tasks:

Step 1: Define Your Task

Start by clearly outlining the daily activity or goal. Consider what inputs you have (e.g., a URL, a query, or an attachment) and what output you need (e.g., a summary, calculation, or visual analysis). Break it down into subtasks to identify tool needs. For example, if your task involves researching current events, note that you'll need search and browsing capabilities.

Step 2: Review Available Tools

Familiarize yourself with the tools Grok can access. Here's a quick overview:

- Code Execution: Run Python code for calculations, data processing, or simulations using libraries like numpy, pandas, or sympy.
- Browse Page: Fetch and summarize content from any website URL with custom instructions.
- Web Search: Perform general internet searches, returning results with optional operators like site:.
- Web Search With Snippets: Get quick, detailed excerpts from search results for fact-checking.
- X Keyword Search: Advanced search for X posts using operators like from:, since:, or filter:.
- X Semantic Search: Find semantically related X posts based on a query, with filters for dates or users.
- X User Search: Locate X users by name or handle.
- X Thread Fetch: Retrieve a full X post thread, including context like replies and parents.
- View Image: Analyze an image from a URL or conversation ID.
- View X Video: Extract frames and subtitles from an X-hosted video.
- Search PDF Attachment: Query a PDF file for relevant pages using keyword or regex modes.
- Browse PDF Attachment: View specific pages of a PDF with text and screenshots.

Select tools that align with your task. Aim for a mix to handle data gathering, processing, and visualization.

Step 3: Craft Your Prompt

Write a detailed prompt to Grok describing the task. Include:

- The overall goal.
- Specific steps or subtasks.
- References to tools if you want to guide the process (e.g., "Use web_search to find sources, then code_execution to analyze data").
- Any constraints, like dates or limits.

Example prompt: "Create a Grok Task for my morning routine: Search recent X posts about tech news using x_keyword_search, fetch a key thread with x_thread_fetch, and summarize with browse_page on linked articles."

Step 4: Submit and Interact

Send your prompt to Grok. It will process the task by calling tools as needed, often in parallel for efficiency. Review the output and refine with follow-up prompts if required (e.g., "Expand on that using view_image for visuals"). Iterate to fine-tune the workflow for reuse.

Step 5: Save and Reuse

Once refined, save the prompt as a template for future use. You can adapt it for similar tasks, making Grok Tasks a habitual part of your day.

Finding Grok Tasks

To discover existing Grok Tasks or inspiration for new ones, use X searches with tools like x_keyword_search or x_semantic_search (e.g., query: "Grok Tasks examples" with mode: Latest). Browse community-shared threads via x_thread_fetch, or use web_search to find tutorials on xAI features. You can also prompt Grok directly: "Show me popular Grok Tasks for productivity."

1 of 3
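The prompt structure from Step 3 (goal, steps, tool hints, constraints) and the reusable template from Step 5 can be sketched in a few lines of Python. This is only an illustration: `TaskPrompt` and its field names are hypothetical and not part of Grok or any xAI API. The code just assembles text to paste into Grok and calls no tools itself. The tool names in the example are the ones mentioned in the post.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPrompt:
    """Hypothetical container for the parts of a Step 3 prompt."""
    goal: str
    steps: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Build the plain-text prompt: goal, numbered steps, then constraints."""
        lines = [f"Goal: {self.goal}"]
        if self.steps:
            lines.append("Steps:")
            lines += [f"{i}. {s}" for i, s in enumerate(self.steps, start=1)]
        if self.constraints:
            lines.append("Constraints:")
            lines += [f"- {c}" for c in self.constraints]
        return "\n".join(lines)


# A reusable template for the morning-routine example from the post.
morning = TaskPrompt(
    goal="Summarize this morning's tech news from X",
    steps=[
        "Use x_keyword_search to find recent posts about tech news",
        "Use x_thread_fetch to retrieve one key thread",
        "Use browse_page to summarize the linked articles",
    ],
    constraints=["Only posts from the last 24 hours", "At most 5 bullet points"],
)
print(morning.render())
```

Keeping the goal, steps, and constraints as separate fields makes it easy to change one part, such as the date limit, and reuse the rest as Step 5 suggests.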

Brian Roemmele

152,242 views • 5 months ago