Ben Dicken

@BenjDicken • 41,373 subscribers

databases → @planetscale
videos → https://t.co/J5soMyfENa
writing → https://t.co/YpfkF0JIKG

Shorts

Whoah he's right.

2,445,584 views

oops did it again

4,025,738 views

You asked for it, so here it is. Visualizing CPU cache speeds relative to RAM. Cache optimization is important too!

1,474,277 views

Been working on this for 2 months, and it's finally ready to share! This piece not only *describes* how B-trees and database indexes work, but gives you the opportunity to interact with them, reinforcing the concepts, and providing a bit of fun along the way. Link in thread.

339,965 views

1 of N reasons why sharding is a great way to scale a database? Backup speed. This is one of the fun little animations I worked on in my recent interactive sharding article. In case you missed it, link below.

13,157 views

Videos

Daniel clearly hasn't seen the language balls.

8,598,040 views • 11 months ago

In case you forgot how good we have it with M.2 NVMe SSD.

1,372,858 views • 3 months ago

FYI physics is still physics. NVMe is fast. Felt like a good day to update you all on this

686,268 views • 2 months ago

There are a million reasons why your app can be slow. The stack of software and hardware beneath it is complex. Pinning down performance issues is challenging (but also fun!). A clip from the chapter 1 stream. More to come this week.

235,943 views • 28 days ago

I am once again begging you to put your database servers and application servers in the same region.

1,421,247 views • 10 months ago
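A rough sketch of why region placement matters, assuming a request that issues its queries one after another. The round-trip times, the per-query execution time, and the function name `db_time_ms` are illustrative assumptions, not measurements from the post.

```python
# Assumed, order-of-magnitude round-trip times (not measured values).
SAME_REGION_RTT_MS = 1.0
CROSS_REGION_RTT_MS = 70.0


def db_time_ms(queries: int, round_trip_ms: float, query_ms: float = 0.5) -> float:
    """Total database time for `queries` sequential queries.

    Each query pays one network round trip plus an assumed execution time.
    """
    return queries * (round_trip_ms + query_ms)


if __name__ == "__main__":
    # A hypothetical page load that runs 20 queries in sequence.
    print("same region: ", db_time_ms(20, SAME_REGION_RTT_MS), "ms")
    print("cross region:", db_time_ms(20, CROSS_REGION_RTT_MS), "ms")
```

Under these assumptions the network round trip, not the query itself, dominates once the servers sit in different regions.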

Remember that time Brendan Gregg shouted at hard drives to make them slower? Someone please confirm nvme ssd is immune.

163,831 views • 1 month ago

tuitter - twitter for your terminal

388,935 views • 3 months ago

Glad OpenAI bought the fast Python package manager.

395,548 views • 4 months ago

Ensuring everyone really understands.

502,653 views • 5 months ago

*Finally* read through Sam Rose's blog on LLM quantization. It's incredible. For many (even in tech) the understanding of how LLMs work stops at the surface level. Sam is helping us all go deeper, digging into the interesting facets of how AI models truly work. Read it!

272,603 views • 3 months ago

Know how long every operation takes on a computer. Once you do, identifying good vs bad vs acceptable performance becomes way easier. It all boils down to knowing your napkin math.

56,686 views • 21 days ago

Over the past 8 days, I received over 100 PRs on the languages repo with additions and improvements.
- A bunch of languages were added
- Some implementations got tweaks to modify performance
- The run script now uses hyperfine for timing
Thanks to all the contributors.

1,200,259 views • 1 year ago

Database table size impacts performance in more ways than one:
a) B-tree depth. Using 8k pages and a 16-byte UUID:
1 level = ~370 rows
2 levels = ~138k rows
3 levels = ~50m rows
4 levels = ~20b rows
The lookup cost on a table with 100k rows is not the same as one with 1b rows. This can apply both to the table itself (MySQL clustered index) as well as the indexes. Sometimes a single query requires many of them.
b) Small table → fits in RAM → fast reads. The larger the table, the more likely reads are to hit disk and churn the cache.
c) # of indexes. Each adds maintenance overhead for insertions, and for Postgres, vacuum overhead as well.
Keep an eye on this! It's useful to take regular stock of your tables + indexes. Clean bloat. Remove unused indexes. Partition if needed.

211,974 views • 3 months ago
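The B-tree depth figures in the post above can be roughly reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a uniform fanout of about 370 entries per page (the post's 1-level figure); the function names are hypothetical, and because the post's numbers are rounded, the output differs slightly from them.

```python
# Assumption: every page holds ~370 entries, the post's figure for one level.
FANOUT = 370


def rows_at_depth(levels: int, fanout: int = FANOUT) -> int:
    """Approximate number of rows a B-tree with `levels` levels can address."""
    return fanout ** levels


def levels_needed(rows: int, fanout: int = FANOUT) -> int:
    """Smallest number of levels whose capacity covers `rows`."""
    levels = 1
    while rows_at_depth(levels, fanout) < rows:
        levels += 1
    return levels


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} level(s): ~{rows_at_depth(n):,} rows")
    # Compare the two table sizes mentioned in the post.
    print("100k rows ->", levels_needed(100_000), "levels")
    print("1b rows   ->", levels_needed(1_000_000_000), "levels")
```

With this fanout, a 100k-row table fits in 2 levels while a 1b-row table needs 4, which is the gap in lookup cost the post describes.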

1 billion loop iterations. 4 languages. I wrote the same code in JS, Python, Go, and C. Timed executions on a DigitalOcean dedicated CPU VM, and here are the visualized results. Not all programming languages are created equal!

1,202,741 views • 1 year ago

I implemented an ssd, hdd, and tape device. In javascript. For a blog.

698,231 views • 1 year ago

Merkle trees are everywhere:
- ZFS uses them to detect data corruption
- Git uses them to verify repo integrity
- Cursor uses them for codebase sync
- Bitcoin uses them for transaction verification
Talked through how they work on the latest database stream.

134,344 views • 3 months ago
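For readers who have not seen one, here is a minimal sketch of the general Merkle tree idea: hash each block, then hash pairs of hashes upward until a single root remains. It is not the exact scheme used by ZFS, Git, Cursor, or Bitcoin. The choice of SHA-256, the rule of duplicating the last hash on odd-sized levels, and the function names are all assumptions made for illustration.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw SHA-256 digest of `data`."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute a Merkle root over `blocks`.

    Leaves are hashes of the blocks; each parent is the hash of its two
    children concatenated. An odd-sized level duplicates its last hash
    (one common convention, assumed here).
    """
    if not blocks:
        raise ValueError("need at least one block")
    level = [sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    original = [b"block-0", b"block-1", b"block-2", b"block-3"]
    corrupted = [b"block-0", b"block-1", b"block-X", b"block-3"]
    # Changing any single block changes the root, which is how corruption is detected.
    print(merkle_root(original).hex())
    print(merkle_root(original) == merkle_root(corrupted))
```

Comparing two roots is enough to tell whether two data sets match; comparing child hashes level by level then narrows down which block differs.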

Important to know that Postgres/MySQL on local SSD is *the* best way to get great performance. The reason it could get *even faster* boils down to abstraction boundaries.
SSDs are built to be workload agnostic. They have their own controllers and garbage collectors meant to balance performance of many different I/O patterns.
DBMS are written to support arbitrary backing I/O devices: SSD, HDD, or even a remote storage system. It's good that they're written to be broadly performant!
The point of the paper was that, if you're willing to build something that specifically targets local SSDs (and even specific models of local SSDs) you can optimize the full DBMS -> OS -> hardware stack for the exact characteristics of a workload and hardware specifications.
This is a classic tradeoff. You can usually get better performance as you narrow scope for specific hardware, architectures, etc. Maintainers of OSS must walk a careful balance of optimizing for performance while also maintaining wide compatibility.
For the curious-minded, I wrote an interactive article on the history of I/O devices, and how that interplays with database performance:

79,406 views • 2 months ago

You're probably sick of me saying "B-tree" but these impact SO MUCH of database performance. They're used all over the place in Postgres, MySQL, and SQLite. This week I broke down B-tree lookups and how the page cache makes lookups faster.

187,011 views • 6 months ago

Check out the results for Levenshtein distance. Fortran out here impressing. I'd like to turn this repository into something special, with a wide variety of problems to solve with each language. If you have ideas, head on over and make issues / PRs!

529,156 views • 1 year ago

Js people: Ok your language is actually fast
Python people: Come defend your language
Go people: Needs more optimizations
C people: Don't worry. If you enable the optimizer, this thing goes BRRRRRRRRRRRR.

492,004 views • 1 year ago