
elvis
@omarsar0 • 305,984 subscribers
Building self-improving AI @dair_ai • Prev: Meta AI | PhD • Learn about AI Agents for FREE here: https://t.co/P5SA9u54xO

LLM Wikis + HTML Artifacts are insanely powerful. You should seriously consider them in your workflows. LLM wikis capture all the important information that lets you and your agents do meaningful work. HTML artifacts present that information in interesting ways that let you take important actions alongside your agents. My HTML artifacts sit on top of my LLM wikis. They are dynamic and easy to extend as needs arise. I have hooked my artifacts up to talk to my agents, and the agents can talk to the artifacts in turn. This has allowed me to build powerful artifacts that reduce my inbox to zero, keep me updated on any topic of interest, prototype quickly, do deep research, design and trigger new experiments, generate figures to improve understanding, schedule research, search for relevant information, discover topics, and much more. What you see in the clip is not a website. It's a simple interactive HTML artifact. HTML artifacts are useful for designers, engineers, researchers, students, and anyone working with agents. Lastly, HTML doesn't replace Markdown. The two work much better together.
elvis • 244,834 views • 26 days ago
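A minimal sketch of the "HTML artifact on top of an LLM wiki" idea, assuming the wiki is just a mapping from entry titles to plain-text bodies. The `WIKI` contents and the `render_artifact` helper are hypothetical and only illustrate the layering; the post does not describe the actual implementation.

```python
from html import escape

# Hypothetical wiki entries; in practice these would be loaded from Markdown files.
WIKI = {
    "Inbox triage": "Rules the agent uses to sort <incoming> mail.",
    "Paper feed": "Curated arXiv papers and why they matter.",
}


def render_artifact(wiki: dict[str, str], title: str = "Wiki artifact") -> str:
    """Return one self-contained HTML page with a card per wiki entry."""
    # Escape all wiki text so entry content cannot break the page markup.
    cards = "\n".join(
        f'<section class="card"><h2>{escape(name)}</h2><p>{escape(body)}</p></section>'
        for name, body in sorted(wiki.items())
    )
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n{cards}\n</body></html>"
    )
```

Writing the returned string to a file such as `artifact.html` gives a single page that opens in any browser. Because the page is regenerated from the wiki, new entries show up without editing the HTML by hand.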

This is just mind-blowing stuff! I couldn't resist replicating this workflow to generate 3D biological structures. In a few minutes, I designed an artifact specifically built to generate these for any topic.
Stack:
- HTML Artifact to view diagrams
- Gemini Nano Pro for concept generation
- Tripo for generative 3D
- Codex for assembling everything
AI will exponentially accelerate learning and democratize high-quality education. Stay tuned! We have a few releases on this front.
elvis • 106,045 views • 24 days ago

arXiv Papers → LLM Artifacts
This is how I keep up with AI research now. It's like having access to the most personalized arXiv feed. Automations run every day to curate papers based on a set of rules and insights. Curated papers are indexed and power the artifacts. Agents convert papers to LLM wikis (based on Andrej Karpathy's idea), which means insights are indexed, easily searchable, and reusable. I feel like LLM Artifacts are the natural evolution of LLM Wikis. It's about making that knowledge actionable. Artifacts are customizable via agents. Artifacts can interact with agents and are dynamic in nature. Anything can be injected into the artifact as needed (insights, components, suggested experiments, action items, etc.). I can take action on artifact items with my agent orchestrator (an Electron app). So I can ask questions about any paper and automate experiments in the background right from within the artifact. This is more than a visual. It's not a single prompt. It's several proactive agents coordinating to surface interesting facts, knowledge, and insights that I can act on as a researcher. Agents are not just for generating useful artifacts; they are also useful for continuing to learn and staying on the cutting edge of knowledge. Stay tuned for more.
elvis • 58,154 views • 28 days ago
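A rough sketch of rule-based curation and indexing as described in the post above, assuming the "rules" are weighted keywords matched against titles and abstracts. The `RULES` table, the `Paper` record, and the threshold are all hypothetical; the real automations likely use richer signals.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str  # placeholder identifier, not a real arXiv id
    title: str
    abstract: str


# Hypothetical curation rules: keyword -> weight.
RULES = {"agent": 2.0, "retrieval": 1.0, "benchmark": 0.5}


def score(paper: Paper, rules: dict[str, float]) -> float:
    """Sum the weights of every rule keyword found in the title or abstract."""
    text = f"{paper.title} {paper.abstract}".lower()
    return sum(weight for keyword, weight in rules.items() if keyword in text)


def curate(papers: list[Paper], rules: dict[str, float], threshold: float = 1.0) -> list[Paper]:
    """Keep papers scoring at or above the threshold, highest score first."""
    kept = [p for p in papers if score(p, rules) >= threshold]
    return sorted(kept, key=lambda p: score(p, rules), reverse=True)


def build_index(papers: list[Paper], rules: dict[str, float]) -> dict[str, list[str]]:
    """Map each rule keyword to the ids of the papers that mention it."""
    index: dict[str, list[str]] = defaultdict(list)
    for paper in papers:
        text = f"{paper.title} {paper.abstract}".lower()
        for keyword in rules:
            if keyword in text:
                index[keyword].append(paper.paper_id)
    return dict(index)
```

Running `curate` once a day and passing its result to `build_index` gives a small keyword-to-paper lookup that an artifact or agent could query.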

I have been testing DeepSeek-V4-Pro with the Pi coding agent. I am blown away by how well it works out of the box. A few notes: I spent a few hours building an LLM wiki with an agent powered entirely by DeepSeek-V4-Pro on Fireworks AI inference. This is the first time I feel like there is an open-weight model that can reason at the level of Claude and Codex. And it does this cost-effectively, with support for a 1M-token context length. To be clear, I am using DeepSeek-V4-Pro inside Pi without any special configuration. It's exciting that a model can be plugged into a basic harness like Pi and just work. I've never seen that before. Most models require lots of configuration and setup. DeepSeek-V4-Pro is clearly good at agentic coding (probably the best among the open-weight models), but the model is also great at knowledge-intensive tasks where reasoning matters. The agent pulled agentic engineering best practices from different company docs (Anthropic, OpenAI, Google, Stripe, Meta, Modal, DeepSeek, Mistral, Cohere), searched and digested Reddit and HN threads, summarized arXiv papers, and surfaced trending GitHub repos. Then it distilled everything into actionable tips across categories. I love the wiki it built. The quality is really good. Here is a snapshot of what the wiki looks like: DeepSeek-V4-Pro handled the task without breaking stride: multi-step research queries, code generation for scaffolding, and context-heavy reasoning across disparate sources. For coding specifically, this is the first open-weight model that genuinely feels like a Codex or Claude Code experience. It is comparable in both capability and actual multi-turn agentic work. What made the loop feel so responsive was Fireworks' inference speed (the fastest in the market) and the fact that they actually validate models at the systems level before shipping. No corrupted reasoning traces. Just fast, reliable iteration.
The hybrid CSA and HCA attention design cuts the KV cache to just 10% and inference FLOPs by nearly 4x at 1M-token context. This is what makes the agent loop fast and cheap enough to run in practice. For devs who've been watching open-weight models close the gap but haven't found one that actually delivers in practice, this is the closest I've seen. Try it here:
elvis • 55,948 views • 1 month ago

YT Podcast → LLM Artifact
This is now my favorite way to consume podcasts: knowledge artifacts generated by agents. The agent (Opus 4.7) spots important insights, does deep analysis, and generates thought-provoking observations that really get me curious to research further. All the research goes into a self-improving wiki for later use by any of my agents. I am using ElevenLabs Scribe for diarization. Skills and scripts guide the artifact generation. Artifacts are just plain HTML + JS (e.g., Chart.js). You can go further on anything by selecting text and components (charts) and doing deeper research, as I show in the clip. I expect more people to use agents in this way.
elvis • 30,921 views • 1 month ago
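One way a script could turn diarized podcast segments into a chart for a plain HTML + JS artifact is sketched below. The `(speaker, start, end)` tuple shape is an assumption for illustration, not the actual ElevenLabs Scribe output format, and the speaker names and timings are made up. The emitted dictionary follows the general Chart.js configuration layout (`type`, `data.labels`, `data.datasets`).

```python
import json
from collections import defaultdict

# Hypothetical diarized segments: (speaker, start_seconds, end_seconds).
SEGMENTS = [
    ("host", 0.0, 30.0),
    ("guest", 30.0, 120.0),
    ("host", 120.0, 135.0),
]


def talk_time(segments):
    """Total speaking time in seconds for each speaker."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


def chart_config(totals):
    """Build a bar-chart configuration dictionary in the Chart.js layout."""
    labels = sorted(totals)
    return {
        "type": "bar",
        "data": {
            "labels": labels,
            "datasets": [
                {"label": "Talk time (s)", "data": [totals[s] for s in labels]}
            ],
        },
    }


# JSON string that a generation script could embed in the artifact's <script> tag.
config_json = json.dumps(chart_config(talk_time(SEGMENTS)))
```

Inside the artifact, the parsed JSON could be passed as the second argument to `new Chart(canvas, config)`.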

Finally got a chance to play around with Andrej Karpathy's LLM Council. I built it as a plugin inside Claude Code and hooked it up to OpenRouter for models. The AskUserQuestion tool came in handy for selecting the council and chairman. This is my first test, but I agree with Karpathy that the concept of LLM ensembles can be used beyond models that offer perspectives on interesting questions. I feel like this could have really cool applications in agentic coding. More on that soon. I built this as a plugin, so next I will be exploring other use cases around agentic coding, like evaluation, tool building, designing, and research. If there is enough interest, I will clean it up and push it out as an open plugin.
elvis • 79,529 views • 4 months ago
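A simplified sketch of the council-and-chairman pattern mentioned in the post above, assuming each model is just a function from a prompt to text. The stub models and the `run_council` helper are hypothetical; a real plugin would call hosted models (for example through OpenRouter) and may include extra steps, such as members reviewing each other's answers, that are omitted here.

```python
from typing import Callable

Model = Callable[[str], str]  # prompt in, text out


def run_council(question: str, council: dict[str, Model], chairman: Model) -> str:
    """Collect one answer per council member, then let the chairman synthesize."""
    # Stage 1: every council member answers the question independently.
    answers = {name: model(question) for name, model in council.items()}
    # Stage 2: the chairman sees all labeled answers and writes the final response.
    briefing = "\n".join(f"[{name}] {text}" for name, text in answers.items())
    return chairman(
        f"Question: {question}\nCouncil answers:\n{briefing}\nWrite the final answer."
    )


# Stub models standing in for real LLM calls (illustrative only).
def optimist(prompt: str) -> str:
    return "Yes, ship it."


def skeptic(prompt: str) -> str:
    return "No, add tests first."


def stub_chairman(prompt: str) -> str:
    # Count the labeled council answers included in the briefing.
    count = sum(line.startswith("[") for line in prompt.splitlines())
    return f"Synthesis of {count} answers"
```

Calling `run_council("Should we ship?", {"optimist": optimist, "skeptic": skeptic}, stub_chairman)` exercises the flow without any network access. Swapping the stubs for real model clients keeps the orchestration unchanged.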