Sam Altman says the next big thrust is making models handle much longer tasks. We have already moved from models handling 5-second coding tasks with GPT-3.5 to 5-hour tasks with GPT-5. The goal now is to integrate enterprise context so AI can handle tasks that require months or years.

Haider.

56,121 subscribers

122,717 views • 7 months ago • via X (Twitter)

Education, News & Politics, Science & Technology

Related videos

Sam Altman says Codex isn't far from week-long tasks and is already doing day-long runs. The jump in task length feels disorientingly fast. Reaching week-long tasks will require smarter models, longer context, and better memory.

Haider.

69,294 views • 8 months ago

Sam Altman says the leap from GPT-4 to GPT-5 will be as big as that of GPT-3 to 4 and the plan is to integrate the GPT and o series of models into one model that can do everything

Tsarathustra

224,102 views • 1 year ago

Sam Altman says the leap from GPT-4 to GPT-5 will be as big as that of GPT-3 to 4 and the plan is to integrate the GPT and o series of models into one model that can do everything - that’s the AGI

Chubby♨️

231,778 views • 1 year ago

OpenAI’s new GPT-5 Codex is impressive at coding tasks. But, in a huge codebase, intelligence isn’t the main bottleneck—context is.

Augment Code

1,188,147 views • 9 months ago

The Information Reporter Stephanie Palazzolo on OpenAI's GPT-5, a major step up for coding, and what it means for competitors. "GPT-5 is a step up on a lot of different domains...one area that really stood out to [sources] was with coding." "Not only is GPT-5 better on more academic...tasks, but it's also better on the more practical programming tasks...working with very large and complex code bases." "If GPT-5 is going to be significantly better at these more practical everyday programming tasks, that could prove to be bad news for Anthropic." Watch the full episode on

The Information

52,067 views • 11 months ago

Sam Altman says GPT-5 is not finished yet, but he expects it to be a "significant leap forward" and better at a wider variety of tasks than GPT-4

Tsarathustra

111,151 views • 2 years ago

OpenAI Lead Researcher Lukasz Kaiser: Reasoning models already handle most computer tasks (writing, coding, clicking), though they are still quirky. Competition among labs is speeding progress, with research improvements in the pipeline: "before 2030, much desk work will shift to AI."

Haider.

82,622 views • 8 months ago

Geoffrey Hinton says that in 2026, AI will gain the capability to replace many jobs. Every 7 months, AI doubles the length of tasks it can handle. It went from 1-minute coding to hour-long projects. Soon, it'll manage month-long software engineering tasks, and "then few people will be needed for SWE projects."

Haider.

66,008 views • 5 months ago

Sam Altman says OpenAI's Deep Research can do 5% of all tasks in the economy today and while it usually takes society two generations to adapt to change, it has never had to do so in 5-10 years as is happening now

Tsarathustra

185,410 views • 1 year ago

📁 Sam Altman says the real Turing test is when AI can do science, and with GPT-5 we’re already seeing the first signs. In two years, these models will make key discoveries that transform the world, because scientific progress is what drives humanity forward.

Jon Hernandez

27,052 views • 8 months ago

Ray Kurzweil says AI models will exceed human verbal ability in a year or two and the next step is large event models that enable robots to perform physical tasks

Tsarathustra

71,501 views • 2 years ago

🚀 MCP support is now available in Early Access. Connect the Rive Editor with AI tools to handle repetitive tasks, like creating complex View Models, State Machines, Layouts, Shapes, and more.

Rive

71,873 views • 1 year ago

Sam Altman says next year AI won’t just automate tasks... it’ll start solving problems entire teams struggle with.

Chubby♨️

68,455 views • 1 year ago

Sam Altman: Codex will handle multi-day tasks next year. Small scientific discoveries are likely by 2026 - and there’s much more to be excited about!

Chubby♨️

18,483 views • 7 months ago

We benchmarked leading multimodal foundation models (GPT-4o, Claude 3.5 Sonnet, Gemini, Llama, etc.) on standard computer vision tasks, from segmentation to surface normal estimation, using standard datasets like COCO and ImageNet. These models have made remarkable progress; however, it is unclear exactly where they stand in terms of understanding vision in detail, especially when it comes to tasks beyond question-answering. How well do they understand an object's segments or geometry? Our analyses yield an assessment that is quantitatively and qualitatively detailed and is compatible with evaluations developed in the field of computer vision over the past decades.
Observed trends:
🔹 The foundation models consistently underperform task-specific SOTA models across all tasks. However, they are respectable generalists, which is remarkable as they are presumably trained primarily on image-text-based tasks.
🔹 They perform semantic tasks notably better than geometric ones.
🔹 GPT-4o performs the best among non-reasoning models, getting the top position in 4 out of 6 tasks.
🔹 Reasoning models, e.g., o3, show improvements in geometric tasks.
🔹 The 'image generation' models, e.g., GPT-4o Image Generation, which have been natively trained multimodally, exhibit quirks, e.g., hallucinated objects, misalignment between the input and output, etc.
🔹 While the prompting techniques affect performance, better models exhibit less sensitivity to variations in prompts. We control for the variance introduced by the prompting methods in our experiments.
🌐 Detailed analyses, visualizations: ⌨️ code: 🧵 1/n

Amir Zamir

73,074 views • 11 months ago

Sam Altman says each increase in capability and decrease in cost creates massive new demand: "even 30 GW with today's models would saturate fast." If GPT-6 is 30 IQ points above GPT-5 and can work for days to months, the economic value will skyrocket.

Haider.

213,352 views • 8 months ago

Absolutely insane: "I don't think Codex is that far away from a week of work." It can already do "day long tasks now." Next steps: "Smarter models, long context, better memory."

Chubby♨️

28,413 views • 8 months ago

🕵️ Jules is an AI coding agent that can handle real coding challenges, improve and understand large codebases, and asynchronously tackle tasks to help you work more efficiently.

Google AI Developers

36,897 views • 1 year ago

🧠 Wintermute Alpha Challenge 2025 is live. Week 1 starts now with 5 case studies. Solve open-ended and coding tasks to climb the leaderboard.

Wintermute

36,196 views • 10 months ago