We have HOT3D! I've started using Claude to port more datasets into Rerun and exoego-forge. I'd been meaning to bring in Meta's HOT3D dataset for a while, and Claude made it far easier.

My goal is to take any egocentric dataset, exocentric dataset, or a combination of both and ingest it into a standardized schema. Getting everything into Rerun means we can easily query and transform data via the in-memory OSS server. This lets us express SQL-like queries such as: "Find me all frames that only contain left hands in the leftmost camera view." Most people think of Rerun as a viewer, but this is the actual superpower.

So far we have:
1. HOT3D
2. Hocap
3. UmeTrack
4. Assembly101
5. EgoDex

I'm planning to add more, and with every addition it gets easier as we build up agent skills and better code examples. I'm hoping to make adding new datasets almost fully automatic. The next few I'm looking at are Harmony4D and Aria Pilot Gen2. After we have enough samples, I'll work on bringing in all the different algorithms I've worked on to transform the data 🙂

Pablo Vela

2,652 subscribers

35,662 views • 3 months ago • via X (Twitter)
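To make the quoted query from the post concrete, here is a minimal sketch of the "only left hands in the leftmost camera view" filter. It uses Python's built-in `sqlite3` purely as a stand-in and does not use Rerun's query API. The `detections` table, its columns, and the camera name `cam_left` are all hypothetical and only illustrate the shape of such a query.

```python
import sqlite3

# Hypothetical schema (an assumption, not the exoego-forge schema):
# one row per detected hand, per frame, per camera.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (frame INTEGER, camera TEXT, hand TEXT)")
conn.executemany(
    "INSERT INTO detections VALUES (?, ?, ?)",
    [
        (0, "cam_left", "left"),
        (0, "cam_left", "right"),
        (1, "cam_left", "left"),
        (2, "cam_left", "right"),
        (3, "cam_right", "left"),
        (4, "cam_left", "left"),
    ],
)


def frames_with_only_left_hands(conn, camera):
    """Return frames where `camera` sees at least one hand and every hand is a left hand."""
    rows = conn.execute(
        """
        SELECT frame
        FROM detections
        WHERE camera = ?
        GROUP BY frame
        HAVING SUM(hand != 'left') = 0  -- no non-left hands in this frame
        ORDER BY frame
        """,
        (camera,),
    ).fetchall()
    return [frame for (frame,) in rows]


only_left = frames_with_only_left_hands(conn, "cam_left")
print(only_left)
```

The `GROUP BY` / `HAVING` pair is what expresses "only": a frame qualifies when none of its rows for that camera is a non-left hand.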

Related videos

Working on adding a new dataset to the lineup: I've ported ego-dex over to Rerun. With Rerun now stabilizing the RRD format between versions (0.23 -> 0.24), this is the perfect time to start encoding all of the datasets I've been using to RRD.
1. I'm starting with ego-dex and then adding others, such as HOCAP/Assembly 101.
2. I'm looking to see if it also makes sense to port to webdatasets RRD.
3. I've started visualizing confidence: green (high), yellow (medium), red (low).

More info on Friday.

Pablo Vela

34,253 views • 1 year ago
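The confidence color-coding mentioned in the ego-dex post above can be sketched as a simple threshold mapping. The post only names the three bands, so the threshold values and the function name below are assumptions chosen for illustration.

```python
# Hypothetical thresholds: the post names the color bands but not the cut-offs.
HIGH_THRESHOLD = 0.8
MEDIUM_THRESHOLD = 0.5

GREEN = (0, 255, 0)
YELLOW = (255, 255, 0)
RED = (255, 0, 0)


def confidence_to_rgb(confidence):
    """Map a confidence in [0, 1] to an RGB color: green (high), yellow (medium), red (low)."""
    if confidence >= HIGH_THRESHOLD:
        return GREEN
    if confidence >= MEDIUM_THRESHOLD:
        return YELLOW
    return RED


colors = [confidence_to_rgb(c) for c in (0.95, 0.6, 0.1)]
print(colors)
```

A per-keypoint list of colors like this is the kind of value one would attach to each logged keypoint so that low-confidence joints stand out in the viewer.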

Most people think Rerun is a visualization tool. In reality, it's a database masquerading as a visualizer. I wanted to showcase this functionality by building a full data pipeline consisting of: ingestion → baseline method → eval → finetuning for SLAM on egocentric data. I'll eventually extend this to the rest of my ego/exo datasets, but I wanted to start with a smaller set of datasets first.

Rerun allows you to expose your saved .rrd files to a catalog where you store datasets. You can query, filter, and join them like any database, using DataFusion under the hood. These are the same .rrd files that are generated whenever you visualize anything in Rerun and decide to save it to disk.

I brought 109 VSLAM-LAB sequences across 14 datasets into the Rerun catalog as an example. These include 7Scenes, Euroc, eth3d, and others. Now I can query them with segment_table, filter_segments, and filter_contents instead of parsing CSVs and YAML files. With a strong set of ground-truth datasets for SLAM, baseline additions become nearly automatic with agents like Opus/Codex.

This unification of data and visualization is imo the largest missing part for Physical AI. Visualization becomes a natural byproduct of having your data properly structured and queryable. The catalog API is what makes it a database, not just a viewer. I initially focused on VSLAM-LAB data, but I'll migrate all the egoexo data to this format in the coming days to really show just how useful this is.

Pablo Vela

34,937 views • 2 months ago

I've been on a SLAM/SFM kick. It's one of the more underexplored and lacking areas when it comes to human teleop/data collection, so I've brought Deep Patch Visual Odometry/SLAM over to Rerun and Gradio. With this example, we now have the following all integrated into rerun:
1. pycuvslam
2. pycolmap/glomap
3. mast3r-slam
4. dpvo/slam

The question becomes: which method should be used in which situations? They all make different trade-offs, with different camera requirements and throughput/accuracy. And what about when a new method comes out?

Now that I have several different methods, I plan to use VSLAM-LAB for evaluation. It uses prefix.dev to isolate the dependencies of each of these methods and easily compare them against each other. In particular, I'll be converting the data preprocessing, algorithm outputs, and evaluation into rerun recordings (rrd files). This will allow both programmatic querying of anything stored in the files (which method had the highest ATE-to-FPS ratio? Which dataset/sequence caused the most difficulty? etc.) and easy visual inspection, using the rerun server to link them all together.

Another really important side effect of this is how it impacts agents. As Karpathy said:
```
LLMs are exceptionally good at looping until they meet specific goals, and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria, and watch it go.
```
By having accuracy and throughput metrics deeply tied to human-inspectable artifacts, one can really accelerate agentic development with an actual understanding of how the method/data performs. I think this is another killer use case that I'll be really leaning into to make ingestion of new datasets/methods trivial with an agent.

I'm making it my mission for folks to understand that rerun as a visualization tool only scratches the surface of what its true benefit is: deep integration between data and visuals, with powerful query capabilities. I'll be focusing on the SLAM use case first and then bringing this into the full egocentric/exocentric data collection domain!

Pablo Vela

40,744 views • 2 months ago
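The accuracy and throughput metrics named in the SLAM post above (ATE and FPS) can be sketched in a few lines. This is a simplified illustration, not VSLAM-LAB's implementation: it assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame, whereas real evaluations usually align them first. The function names are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square translation error between two position sequences.

    Assumes both sequences have the same length, are matched pose-by-pose,
    and are already aligned to a common coordinate frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def frames_per_second(num_frames, elapsed_seconds):
    """Throughput of a method over one sequence."""
    return num_frames / elapsed_seconds


gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 0.0, 0.0)]

print(ate_rmse(est, gt), frames_per_second(300, 10.0))
```

Storing both numbers next to the recording they were computed from is what makes questions like "which sequence caused the most difficulty?" a query rather than a manual comparison.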

We have SLAM on the Robocap! 🎉 Visualized with Rerun, using NVIDIA AI Developer cuVSLAM for GPU-accelerated multicamera tracking. I basically wrote zero code myself and fully used Claude Code for this. It worked because I had so many existing examples to point to that it just wrote everything the way I would have.

A few technical wins:
1. Used rattler-build from prefix.dev to package the compiled cuVSLAM CUDA binaries, which made it SUPER easy to use across repos. This also means it works on the DGX Spark (ARM64) out of the box.
2. Zero-setup experience: git clone && pixi run track-robocap auto-downloads a 100MB dataset from HuggingFace and tracks frames.
3. Real-time 3D visualization with trajectories, landmarks, pose graphs, and video playback in Rerun.

It's still visual-only (not visual-inertial yet), and loop closure needs some debugging. Next steps are getting this into a Gradio interface, then into daggr, and extending it to work with other datasets from exoego-forge.

The last piece I'm excited about: Rerun's RRD files now support layers for incremental data. I'm planning to build pipelines that go from raw sensor data → SLAM → human pose → depth estimation → etc. Repo here:

Pablo Vela

50,244 views • 3 months ago

✨ Massive Pipeline Refactor → One Framework for Ego + Exo Datasets, Visualized with Rerun 🚀

After a deep refactoring and cleanup, my entire egocentric/exocentric pipeline is now fully modular. One codebase handles different sensor layouts and generates a unified, multimodal timeseries RRD file that you can open instantly in Rerun.

---

The first three datasets that are already supported:
1. Assembly101: 4 ego Quest-style fisheye cams + 8 exo pinhole cams
2. HO-Cap: 1 ego HoloLens pinhole cam + 8 exo pinhole cams
3. EgoDex: 1 ego Apple Vision Pro pinhole cam

Unified geometry: each frame now logs _both_ camera intrinsics/extrinsics and COCO Whole-Body 133-kp keypoints in the same stream. Everything is canonicalized at import time, so there's zero OpenCV vs OpenGL guesswork; Rerun reads it all in the correct coordinate system automatically.

---

Why this matters:
- Consistent schema ✚ live visuals: Rerun's deep link between data & rendering means every experiment comes with a built-in viewer. No more ad-hoc OpenCV/matplotlib hacks just to sanity-check a dataset.
- Multi-terabyte friendly: the next step is to bulk-ingest these giants into Rerun and wrap them in a Gradio UI for point-and-click exploration, as I've already done for EgoDex!

Pablo Vela

20,836 views • 1 year ago
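The "OpenCV vs OpenGL guesswork" mentioned in the refactor post above comes down to camera axis conventions: OpenCV cameras look along +Z with +Y down, while OpenGL cameras look along -Z with +Y up. The post does not say which convention the pipeline canonicalizes to, so the sketch below shows one direction only (OpenGL to OpenCV) on a camera-to-world matrix, with a hypothetical function name and plain nested lists instead of a matrix library.

```python
def opengl_to_opencv_cam_to_world(cam_to_world_gl):
    """Convert a 4x4 camera-to-world matrix from OpenGL to OpenCV camera axes.

    The camera's Y and Z axes are flipped, which negates the second and third
    columns of the rotation block. The translation column is unchanged because
    the camera center does not move.
    """
    converted = [
        [row[0], -row[1], -row[2], row[3]] for row in cam_to_world_gl[:3]
    ]
    converted.append(list(cam_to_world_gl[3]))  # keep the homogeneous row
    return converted


pose_gl = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
]
pose_cv = opengl_to_opencv_cam_to_world(pose_gl)
print(pose_cv)
```

Because the flip is its own inverse, applying the same function again converts back, which makes a convenient round-trip check at import time.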

Colmap 4.0 was very recently released, so it inspired me to do some work to better understand it and its new capabilities with Rerun. I want to really understand how Colmap, and in particular pycolmap, works outside of just calling it via the CLI. So my goal is to use the low-level pycolmap API to log every part of the pipeline.

The explicit goal is to have an alternative to the SQLite database that I can utilize. Instead of SQLite, I want to try logging everything directly to rerun and use RRD. This means I can have deep inspectability and still save the features/matches/2D view geometry, but be able to view it directly in rerun. I think this is one of the superpowers that rerun provides: data and visualizations are deeply integrated.

As I'm often working with sequential data (videos), I'm going to specifically focus on four things:
1. Monocular Video Simple: Calls high-level APIs such as pycolmap.extract_features, pycolmap.match_sequential, pycolmap.incremental_mapping. These are basically identical to the CLI options and provide a good baseline.
2. Monocular Video Streamed: Take the above high-level APIs and break them down into their iterator versions, logging each component in a streamed manner. This way, I can stream the intermediate features to rerun while the extraction/matching/mapping is happening.
3. Rig with unknown calibration (this is what the video shows): This is probably the most interesting version and the first one I've been working on. It allows one to set a rig between known sensors, such as in VR/AR devices, leading to much better reconstructions with multiple cameras. This is the case where we don't know the calibration a priori, so we have to run a reconstruction twice: once as a normal Colmap reconstruction with no rig constraints, use this to generate the constraints, and then do it again with the newly found rig.
4. Rig with known calibration: This is the RoboCap example, where we have a pre-calibrated set of sensors, so we don't need to run the two reconstructions and also gain better matching between cameras, both spatially and temporally. Again, this leads to a much better reconstruction!

Along with all this, GLOMAP has become a first-class global mapper, making it super easy to use directly within pycolmap! I'm excited to do more with this and compare it to things like pycuvslam, vipe, and other alternatives.

Pablo Vela

30,070 views • 3 months ago

0.32 has shipped, and it's a massive release from Rerun. There's a ton of cool new features, and I wanted to highlight two in particular:
1. OSS Server streaming from disk
2. Dataset review

I walk you through them in the video, so take a look. I'll have a much longer blog post next week about the entire pipeline. With 0.32, much of the foundation is set for a unified data layer for physical data, and I'll be getting into the details of it with all that I've built over the past year. This will cover:
1. Raw Data Collection
2. Data Ingestion
3. Catalog Registration
4. Query and Review
5. Post Process
6. Training

So, lots to share.

Pablo Vela

11,264 views • 1 month ago

The era of manually analyzing data will come to an end. AI can now do a lot of this automatically. It's a huge time saver. I do data analysis for a living, and I'm a huge fan of writing Jupyter notebooks to do it all, but it's now hard to justify manually writing code that you can generate in a few seconds. I still check everything manually, but breaking down datasets into tables and charts is now 10x easier than it's ever been. Here is a video where I'm using Retool. I load a dataset and generate a few charts as quickly as I can think. The speed at which we can go from one idea to a working solution is astonishing.

Santiago

70,786 views • 8 months ago

More progress! I now have two Dockerized Gradio | Rerun apps. The first one takes as input a "raw" rrd file that consists of the synchronized egocentric and exocentric MP4 files. This runs the pipeline and produces an "annotated" rrd file. This has the camera parameters, 3D joints, and projected 2D joints (with 6DOF mano soon). The second app takes this "annotated" rrd file and allows for manual labeling. This is a crucial step in addressing any major failures in the pipeline. Right now, it is only the ego view that can be modified. But I'll eventually extend to all. This results in a final "gt" rrd file. From here, the plan is to improve quality and start building a data loop. Excited to start really scaling this. I'm basically going all in on keeping my data stored as Rerun rrd files. As always, I want to emphasize how crucial it is to LOOK AT YOUR data! The rrd format makes it incredibly easy to do so. Getting the data out to use is a bit of a hassle right now, but for me, it's well worth the tradeoff.

Pablo Vela

19,527 views • 8 months ago

Arteta asked if the current team are better than the invincibles: “No, because Invincibles won a lot and they won consistently and they created a history and legacy and we have to do that. “Obviously there is a lot of stats, but in the last two or three years, as well, we had stats and more points and more goals and the history of that, but at the end we have to translate that into major trophies and what we want to do, and probably now what we are doing. “It would have been enough, but now it's not enough, and we have to make it the margins even bigger, and that's what we have to aim for.”

Connor Humm

83,193 views • 5 months ago

There have been a few cool updates recently. In particular, Rerun 0.33 released headless rendering. This, along with the Fable 5 release, pushed me to work towards making MAMMA realtime! I threw Fable at the problem, and it was able to take the original implementation that ran at ~12 seconds/frame and get it all the way down to 40 ms/frame, or nearly a 300x speedup 🏎️

How did I achieve this? TLDR:
- Use rerun's headless rendering as supervision when optimizing
- Save an rrd file as a test fixture to guide model optimization with /goal
- Create an html artifact with headless rendering to provide a detailed breakdown of what it did and how it actually looks in the viewer

There were a few critical bits to make sure that this ACTUALLY worked and that Fable didn't just cheat or delete something and declare victory. The first is that the original version used Rerun. This allowed us to save things to disk as an RRD file, meaning we could query the contents and use this as a sort of test fixture or golden artifact that held EXACTLY what all of the values should be. Then we can use this with /goal as a metric when doing the optimization to ensure there are no regressions.

The second bit is the headless rendering. This gave us the ability to check that not only did the test fixture pass, but it also looked visually correct. This made a huge difference, and an awesome side effect of it is that we can use the headless rendering to create an implementations.html file. This gives a visual guide as to what the agent did (I walk through it in the video below).

Along with this, we're working on an MCP server for rerun that allows full interactivity with the rerun viewer for your agent. So, for example, the agent can click, drag, move views, scroll timelines, etc. I used this to help the agent debug certain parts, such as when the 2D SAM masks didn't line up, or when the triangulated keypoints weren't correctly matching the optimized mesh. The agent could click into the view, scroll through the timeline, and see where things went wrong.

Fable + Headless Rendering + Rerun MCP == 300x speedup in less than a day's work.

With these new tools, I'm planning on going back to my gaussian splatting implementation and cleaning it up + making it fast!

Pablo Vela

10,338 views • 21 days ago
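The "golden artifact" idea from the post above can be sketched as a tolerance-based comparison between a fast implementation's outputs and saved reference values. In the post the fixture is an RRD file whose contents are queried; here a plain dictionary stands in for it, and every name, key, and tolerance is hypothetical. The same snippet also reproduces the speedup arithmetic (12 s/frame down to 40 ms/frame).

```python
import math


def speedup(baseline_seconds, optimized_seconds):
    """How many times faster the optimized version is."""
    return baseline_seconds / optimized_seconds


def matches_golden(candidate, golden, rel_tol=1e-6, abs_tol=1e-9):
    """True if candidate has exactly the golden keys and all values agree within tolerance."""
    if candidate.keys() != golden.keys():
        return False  # an output was dropped or invented
    for key, expected in golden.items():
        actual = candidate[key]
        if len(actual) != len(expected):
            return False
        if not all(
            math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, e in zip(actual, expected)
        ):
            return False
    return True


# Stand-in for values read back from a saved reference recording.
golden = {"keypoints_3d": [0.10, 0.25, 1.50], "mesh_scale": [1.0]}
fast_ok = {"keypoints_3d": [0.10, 0.25, 1.5000000001], "mesh_scale": [1.0]}
fast_cheating = {"keypoints_3d": [0.10, 0.25, 1.50]}  # silently dropped an output

print(matches_golden(fast_ok, golden), matches_golden(fast_cheating, golden))
print(round(speedup(12.0, 0.040)))
```

Checking the key set before the values is what catches an agent that "deletes something and declares victory": a missing output fails the comparison even when every remaining value is correct.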

I've been using Claude Code for non-coding tasks, like sorting emails and analyzing my spending. This was annoying to set up because Claude doesn't have the right context - it can't see my email or my credit card statements. So we built a virtual filesystem that turns all of my data sources (gmail, notion, gdrive and more) into local folders on my computer. I can just describe what I need (e.g. "invoices from the last week") and a folder appears on my computer with exactly those files. This gives Claude context, but it also gives it memory - since it has a local filesystem that syncs in the background, it continuously gets access to my data. The more things you expose as files, the more Claude can do for you.

Eli Mernit

67,006 views • 4 months ago

Consistent Hashing Explained Simply Let's break down consistent hashing in a way that's easy to get. It's a handy technique for distributing data evenly across servers. And if you're more of a visual learner, don't miss the attached video which brings this concept to life. Here's how it works: We don't just hash our data keys; we also hash the server names using the same method. This puts everything on a scale, known as the hash space, which we think of as a big circular ring. So, imagine we've got a few servers. We hash them and place them on this ring. We do the same with our data, hashing it by key. To figure out where to store a piece of data, we travel clockwise on the ring from where the data lands until we bump into a server. Like, data key 0 might end up on server 0, and key 1 on server 1. Adding a new server? Only the data near the new server needs to shift. And if a server has to go? Only the data it was holding needs a new spot. But there's a little challenge: sometimes, data doesn't spread out evenly. The fix? Virtual nodes. They're like mini-replicas of a server placed around the ring. The more virtual nodes we have, the better the balance. But there's a catch: more virtual nodes mean more memory usage for tracking them, and it can make the whole system a bit more complex to manage. – Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages):

Sahn Lam

36,787 views • 2 years ago
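
The ring described in the consistent hashing post above can be sketched in a few lines of Python. This is a minimal illustration, not the post author's implementation: the class name `HashRing`, the `vnodes` parameter, the `server#i` naming of virtual nodes, and the choice of SHA-256 truncated to 64 bits are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes      # virtual nodes placed per server
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        # Servers are hashed with the same function as data keys.
        for i in range(self.vnodes):
            point = ring_hash(f"{server}#{i}")
            if point in self._owner:
                continue          # skip the (very unlikely) collision
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # Walk clockwise: first server point at or after the key's hash,
        # wrapping around to the start of the ring if needed.
        i = bisect.bisect_left(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing(["server-0", "server-1", "server-2"])
owner = ring.lookup("user:42")    # one of the three server names
```

Raising `vnodes` should smooth out the key distribution, at the cost of a longer `_points` list to store and search, which is the memory trade-off the post mentions.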

🚨OpenAI CFO speaks on AI boom and data centers: “I think we are sooo in the early innings. I think a lot of prognosticators wanna call it ‘we’re on the sugar rush’… WE ARE NOT! More like the railroads or the build-out of electricity. The internet turns out, in hindsight, a relatively capex-light build-out. I think we are just getting started! Now, we have to do a lot to make data centers more efficient. We need to think about new ways to power them. BUT in terms of AI, it is voracious right now for GPUs and COMPUTE. The biggest thing we face is WE ARE CONSTANTLY UNDER COMPUTE. That’s why we launched Stargate. That’s why we are doing the BIGGER builds with Microsoft, with ORACLE, CoreWeave and so on. And we are just getting started!”

NIK

102,475 views • 10 months ago

"Claude Code is the first AI tool that can actually recreate how a data scientist thinks." Here's my new episode with Sumeet Marwaha (Head of Data at Brex) on exactly how to build an AI analyst with Claude Code. We covered: ✅ 3 queries to build a data analyst MCP ✅ How to give the analyst both data + context ✅ Exclusive Brex data on which AI tools are crushing it in startups and enterprise Some quotes from Sumeet: "I've set up dashboards many times in my career. They end up getting ignored. Claude will always read it. Claude will always ask the right questions." "You have to set up Claude Code to not just look at the data. It should be able to search Slack, find the incident, and understand why your metrics look wrong." "This tool is absolutely crushing it based on our data. It's both the startup AND enterprise coding tool of choice." 📌 Watch now:

Peter Yang

39,600 views • 5 months ago

Nearly 100 days in, and I’m proud of what we have been able to accomplish so far, but we are just getting started. We have a lot more to do as we continue to deliver on the mandate given to us by the American people. This Republican team will get it done.

Leader John Thune

129,058 views • 1 year ago

It is launch day. And it is my birthday 🫶 SURREAL. But I am ready to make a statement. Everything I stand for is put into this world I've created. Everything I have said, I am living up to. The strength of Breeze lies in the people that support it. While much of this space is all talk, this is all doing. Let it be known: we can build. We can deliver. We can innovate. So today we say: F IT. WE BREEZE. Link in bio. 5 hours left.

Dutchtide.eth

53,757 views • 8 months ago

IN NEWS: Attio raises $52M series B led by GV. Nicolas Sharp on AI in CRM: "LLMs work great when they have lots of context and you need to feed them that. One of the first things we worked on was this data model and being able to instantly ingest emails, calendars, product data, etc." "We have a thesis that code is becoming the new no-code. Software becomes so much more malleable because code is so much easier to generate."

TBPN

16,200 views • 10 months ago

Arteta: “We need everything. First of all, we need everybody fit and available. So the ones that are not involved, the ones that are not with us, that are really big, important players, we need them immediately with us because then we're going to be much stronger. “And then, the other ones, they need to stand up, me the first one, and embrace this challenge and go for it. So today we have to suffer. It's painful. It's a terrible feeling. But tomorrow is a different day, and if somebody would have said to me in August, 'You are in this position right now in April,' I'm sure we would all take it.”

Connor Humm

59,679 views • 2 months ago

Before Claude Fable 5 got banned, I turned all my fine-tuning research and experiments into a product: (A CLI for data generation to fine-tune AI models). Finetuner uses Codex 5.5 or Opus 4.8 as the orchestrator and Chinese models (DeepSeek v4 Pro, Kimi K2.7, MiMo v2.5, etc.) to generate dataset rows. This system is unique because these are not Python-script-paraphrased datasets. Each and every row is handcrafted by Chinese models based on the orchestrator model's instructions. Now anyone can generate datasets to fine-tune small language models (1B to 30B models). I achieved 10x lower costs and 5x better dataset quality. Releasing the product in a few days. (I'll open-source the skills.)

CJ Zafir

39,205 views • 17 days ago