Most devs think LLMs are only good for producing text. But the real value prop of LLMs is turning raw text/data into structured, searchable objects. If you've never heard of structured outputs, now's your chance:

Matt Pocock

282,876 subscribers

211,726 views • 9 months ago • via X (Twitter)

Science & Technology Education
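
The post above describes structured outputs as turning raw text into structured, searchable objects. A minimal TypeScript sketch of that idea follows. It makes no real model call: the `replies` array stands in for raw model output, and the `Invoice` shape, `isInvoice`, and `parseInvoice` are hypothetical names chosen for illustration, not anything shown in the video.

```typescript
// Hypothetical target shape, chosen only for illustration.
interface Invoice {
  vendor: string;
  total: number;
  currency: string;
}

// Type guard: checks that an unknown value matches the Invoice shape.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.vendor === "string" &&
    typeof v.total === "number" &&
    typeof v.currency === "string"
  );
}

// Parse one raw reply; return a typed object, or null if it is not
// valid JSON or does not match the expected shape.
function parseInvoice(raw: string): Invoice | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isInvoice(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

// Stand-ins for model replies (assumption: a real app would get these from an LLM API).
const replies: string[] = [
  '{"vendor":"Acme","total":120.5,"currency":"USD"}',
  "Sure! The vendor is Acme.",
  '{"vendor":"Globex","total":"n/a","currency":"EUR"}',
];

// Keep only replies that validated, then query them like any other typed data.
const invoices: Invoice[] = replies
  .map(parseInvoice)
  .filter((x): x is Invoice => x !== null);
const bigInvoices: Invoice[] = invoices.filter((i) => i.total > 100);
```

Once a reply has passed validation it can be filtered, sorted, or stored like any other typed value, which is the "searchable" part of the claim. Free-form text and wrongly typed fields are rejected rather than silently accepted.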

Related Videos

New package + paper drop 📄 - Introducing KGGen – a simple library to transform unstructured text into knowledge graphs. Text is abundant, but good knowledge graphs are scarce. Feed it raw text, and KGGen generates a structured network of entities and relationships. (1/7)

Belinda

11,041 views • 1 year ago

Most devs are relying heavily on LLMs. But a lot of them STILL don't know what tokens are. If that's you, no shame. Get caught up:

Matt Pocock

119,523 views • 9 months ago

New Short Course: Getting Structured LLM Output! Learn how to get structured outputs from your LLM applications in this course, built in partnership with .txt, and taught by Will Kurt, a Founding Engineer, and , Developer Relations Engineer.
It's challenging for software to automatically parse through an LLM's freeform text outputs. Structured outputs, like JSON, solve this by converting natural language into consistent, clear data that a machine can read and process. This course teaches you how to generate structured outputs while building several use cases, including a social media analysis agent.
You’ll learn about structured outputs and efficient ways to generate outputs in your defined schema or format. You’ll begin by using structured output APIs, then use re-prompting libraries like “instructor” to generate structured output. Finally, you’ll learn how constrained decoding works; this is a very clever technique in which constraints are applied on each subsequent token generated, blocking any tokens that don’t fit your defined schema.
In detail, you’ll:
- Learn why structured outputs are important, how they allow for scalable software development, and the different approaches to generate them, including vendor-provided APIs, re-prompting libraries, and structured generation.
- Build a simple social media agent using OpenAI’s structured output API, learn how to define a model's desired structured output using Pydantic, and perform basic programming with your outputs, such as importing structured data into a data frame using pandas.
- Learn how to use the open-source library "instructor," which checks the structured output of the model and re-prompts the model until it validates the desired output, and explore the limitations of this approach.
- Understand how structured generation by the “outlines” library works by modifying LLM logits, on a per-generated-token basis based on the desired format, to give a particular output structure.
- Learn how regular expressions, which outlines works with, are represented as finite-state machines, and how they can be used to develop a range of structured outputs beyond JSON.
By the end of this course, you’ll have broadened your knowledge of the approaches you can use to get structured outputs from your LLM applications. Please sign up here:

Andrew Ng

89,703 views • 1 year ago
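
The course description above says constrained decoding blocks any token that doesn't fit the defined format, and that regular expressions can be represented as finite-state machines. The toy TypeScript sketch below illustrates that idea only; it is not how the "outlines" library is implemented. The five-token vocabulary, the hand-written logits, and the state machine for the pattern `[0-9]+` are all assumptions made for the example.

```typescript
// Hand-written finite-state machine for the pattern [0-9]+ followed by end-of-sequence.
type State = "start" | "digits" | "done";

// Toy vocabulary (assumption: real tokenizers have tens of thousands of tokens).
const vocab: string[] = ["7", "4", "cat", " ", "<eos>"];

function isDigit(tok: string): boolean {
  return /^[0-9]$/.test(tok);
}

// Which tokens the pattern permits in each state.
function allowed(state: State, tok: string): boolean {
  if (state === "start") return isDigit(tok);
  if (state === "digits") return isDigit(tok) || tok === "<eos>";
  return false;
}

// State transition after emitting an allowed token.
function nextState(tok: string): State {
  return tok === "<eos>" ? "done" : "digits";
}

// Set the logit of every disallowed token to -Infinity so it can never be picked.
function maskLogits(state: State, logits: number[]): number[] {
  return logits.map((l, i) => (allowed(state, vocab[i]) ? l : -Infinity));
}

function argmax(xs: number[]): number {
  let best = 0;
  for (let i = 1; i < xs.length; i++) {
    if (xs[i] > xs[best]) best = i;
  }
  return best;
}

// Greedy decoding over pre-computed logits, one row per generation step.
function decode(stepLogits: number[][]): string {
  let state: State = "start";
  let out = "";
  for (const logits of stepLogits) {
    if (state === "done") break;
    const tok = vocab[argmax(maskLogits(state, logits))];
    state = nextState(tok);
    if (tok !== "<eos>") out += tok;
  }
  return out;
}

// Made-up logits in which the unconstrained favorite is always "cat".
const fakeLogits: number[][] = [
  [1.0, 0.5, 3.0, 0.2, 0.1],
  [0.3, 2.0, 2.5, 0.1, 1.0],
  [0.1, 0.2, 4.0, 0.3, 1.5],
];

const constrained = decode(fakeLogits);
```

Without the mask, greedy decoding would emit "cat" at every step. With it, only tokens the state machine permits can win, so the output is guaranteed to match the pattern regardless of what the model prefers.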

Less of a TS tip, more of a public safety announcement: The difference between key optional and value optional is ESSENTIAL in the world of LLMs. LLMs are addicted to { key?: string } and it is a KILLER

Matt Pocock

31,892 views • 1 year ago
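
The post above contrasts "key optional" with "value optional" types. A short TypeScript sketch of the difference follows. The type names and the `summary` field are hypothetical, and the closing note on why this matters for LLM output is one plausible reading, not something the post spells out.

```typescript
// Key optional: the property itself may be left out entirely.
type KeyOptional = { summary?: string };

// Value optional: the property must be present, but its value may be null.
type ValueOptional = { summary: string | null };

// Both of these compile.
const skipped: KeyOptional = {};
const explicit: ValueOptional = { summary: null };

// This would NOT compile, because the key is required:
// const missing: ValueOptional = {};

// At runtime the two cases are distinguishable with the `in` operator.
const skippedHasKey = "summary" in skipped;
const explicitHasKey = "summary" in explicit;
```

With a key-optional type, an object that simply omits the field is still valid, so an omission cannot be told apart from a deliberate "no value". A value-optional type forces the field to be written out, even if only as `null`.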

Glider makes smart contract code searchable. It turns millions of contracts into structured data you can query. breaks it down:

hexens

40,013 views • 7 months ago

Jensen Huang explains why structured data is the foundation of trustworthy AI.

Yahoo Finance

5,680,616 views • 3 months ago

We’re “pivoting” Elicit with GPT-4 😉 Elicit in 2022 took unstructured text in papers and structured it into a table. Elicit in 2023 will take this structured text and enable you to “pivot” it, grouping it by concepts. Sign up here:

Jungwon

248,241 views • 3 years ago

GLM-4.6V can accept multimodal inputs of various types and automatically generate high-quality, structured image-text interleaved content.

Z.ai

13,655 views • 6 months ago

react-native-ai now supports ↓ ◆ streaming & generating text ◆ generating structured objects ◆ generating speech ◆ generating transcriptions ◆ tool calling all running on-device, supported by  LLM🤯

Szymon Rybczak

39,477 views • 10 months ago

Cardinal is a document intelligence platform that turns the trickiest PDFs and scans into structured, LLM-ready data. Most of the world’s data is still locked in PDFs, and after trying every solution and finding none that worked, Harvard + MIT alums Devi and Jianna Liu set out to deliver the high-accuracy outputs LLMs need.

Y Combinator

43,450 views • 10 months ago

Building automations in 10 seconds. Is it really possible? You've heard of no-code automations (useful but takes a while to build). What if you could write text... And the automation was built on its own? It's called "text-to-automation" 👇

Aadit Sheth

106,150 views • 3 years ago

I didn’t write this. I just talked and Typeless transformed my voice into perfectly structured text in seconds. - No keyboard. - No friction. - Just pure ideas, captured as they come. If you think faster than you type, this is your superpower. 👉 Try Typeless: Let your voice do the writing.

Parul Gautam

83,262 views • 7 months ago

🐪camelAI is the deep research agent for your structured data, and the last BI tool you will ever need. It connects directly to your database, turning natural language questions into in-depth reports complete with beautiful visualizations in seconds.

Y Combinator

29,739 views • 1 year ago

What if you and your agent had all the data that always stays fresh? Structured, on demand, never stale. Introducing BigSet. Describe the data you need in plain English → get a structured dataset built from the live web, that refreshes regularly. It's live and open-source.

TinyFish

209,804 views • 29 days ago

Studies have shown ChatGPT outperforms human annotators for Structured Data by about 25 percentage points and costs 30x less. [1] In just 2 months, miners on SN33 running ChatGPT without optimization can’t survive. Today we announce SN33 is now ReadyAI to fully align with our mission 👇
SN33 is building a more performant and significantly cheaper alternative to Scale AI. Today structured data is produced primarily by human annotation services like Amazon’s Mechanical Turk and Scale AI. It is now more important than ever for every business and individual to make their data AI Ready. However, taking unstructured data and making it Structured Data using today’s tools is extremely costly. SN33 revolutionizes this process, unlocking immense opportunities for commercialization. We lay out the vision for it in this detailed blog post:
Validators TODAY can monetize access to this structured data pipeline independently, but we’re streamlining this process, launching a frontend soon that any validator can opt into to provide bandwidth.
We've received great feedback from the community, recognizing that what we're building goes far beyond Conversational AI. Building the world's largest annotated conversational dataset (which we've already accomplished) is just one of countless real-world applications for SN33's Structured Data pipeline. We're building a decentralized Scale AI, offering a full suite of Structured Data commodities, from text metadata tagging (available today) to fully customizable queries for company-specific data annotation use cases and image metadata tagging coming soon 👀.
Thanks for all the feedback! It has been invaluable so keep bringing it to us! 🙏 $TAO Openτensor Foundaτion
[1] “ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks” shows “The zero-shot accuracy of ChatGPT exceeds that of crowd-workers by about 25 percentage points on average [...] Moreover, the per-annotation cost of ChatGPT is less than $0.003, about thirty times cheaper than MTurk”

David Fields

13,638 views • 1 year ago

From fluid state to structured text. No idea where this would end. #troikatext #threejs #shaders

Felix Martinez

29,688 views • 1 year ago

Roman Nebo from Ghost Drive highlights the challenge of managing unstructured data as daily uploads surpass 2.5 quintillion bytes. GhostDrive, built on Filecoin, offers AI-powered tools to transform raw data into structured, secure assets.

Filecoin

15,309 views • 1 year ago

4/ Content Creation • DeepSeek-V3: Data-driven and structured for investors. • Qwen2.5: Narrative-driven and engaging, but less structured. Winner: DeepSeek-V3 (3-1)

Alex Prompter

14,440 views • 1 year ago

Introducing --agent flag in CodeRabbit CLI 🎉 The new --agent flag turns CodeRabbit into a tool your AI agent can use, providing structured JSON output instead of terminal text. Your agent writes code, CodeRabbit reviews it, reads the JSON, and fixes what's flagged.

CodeRabbit

22,404 views • 2 months ago

INTRODUCING Notte: Building the agentic internet with the strongest web browser for LLM agents. We transform ANY webpage into structured text, enabling better web understanding and navigation. Plug in any LLM to build your own AI agent

Notte

225,211 views • 1 year ago