
Ben Dicken
@BenjDicken • 35,240 subscribers
databases → @planetscale videos → https://t.co/J5soMyfENa writing → https://t.co/YpfkF0JIKG

FYI physics is still physics. NVMe is fast. Felt like a good day to update you all on this
Ben Dicken • 685,311 views • 27 days ago

In case you forgot how good we have it with M.2 NVMe SSD.
Ben Dicken • 1,372,387 views • 1 month ago

I am once again begging you to put your database servers and application servers in the same region.
Ben Dicken • 1,420,944 views • 9 months ago

*Finally* read through Sam Rose's blog on LLM quantization. It's incredible. For many (even in tech) the understanding of how LLMs work stops at the surface level. Sam is helping us all go deeper, digging into the interesting facets of how AI models truly work. Read it!
Ben Dicken • 271,695 views • 1 month ago

Important to know that Postgres/MySQL on local SSD is *the* best way to get great performance. The reason it could get *even faster* boils down to abstraction boundaries.
SSDs are built to be workload agnostic. They have their own controllers and garbage collectors meant to balance performance across many different I/O patterns. DBMSs are written to support arbitrary backing I/O devices: SSD, HDD, or even a remote storage system. It's good that they're written to be broadly performant!
The point of the paper was that, if you're willing to build something that specifically targets local SSDs (and even specific models of local SSDs), you can optimize the full DBMS -> OS -> hardware stack for the exact characteristics of a workload and hardware specifications.
This is a classic tradeoff. You can usually get better performance as you narrow scope to specific hardware, architectures, etc. Maintainers of OSS must strike a careful balance between optimizing for performance and maintaining wide compatibility.
For the curious-minded, I wrote an interactive article on the history of I/O devices, and how that interplays with database performance:
Ben Dicken • 79,264 views • 17 days ago

Database table size impacts performance in more ways than one:
a) B-tree depth. Using 8k pages and a 16-byte UUID key:
1 level = ~370 rows
2 levels = ~138k rows
3 levels = ~50M rows
4 levels = ~20B rows
The lookup cost on a table with 100k rows is not the same as on one with 1B rows. This can apply both to the table itself (the clustered index in MySQL) and to the indexes. Sometimes a single query requires many of them.
b) Small table → fits in RAM → fast reads. The larger the table, the more likely reads are to hit disk and churn the cache.
c) Number of indexes. Each adds maintenance overhead for insertions and, in Postgres, vacuum overhead as well.
Keep an eye on this! It's useful to take regular stock of your tables + indexes. Clean bloat. Remove unused indexes. Partition if needed.
Ben Dicken • 211,360 views • 1 month ago
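
The B-tree depth figures in the post above can be roughly reproduced with a back-of-the-envelope model. This is only a sketch: it assumes every level has the same fanout and that each entry costs the 16-byte key plus about 6 bytes of pointer/overhead. The 6-byte figure is an assumption picked to land near the post's ~370 entries per page, and real Postgres or MySQL page layouts differ. The function names are hypothetical.

```python
PAGE_SIZE = 8192     # 8k page, as in the post
KEY_SIZE = 16        # 16-byte UUID key, as in the post
POINTER_SIZE = 6     # ASSUMPTION: per-entry pointer/overhead, not from the post


def fanout(page_size: int = PAGE_SIZE,
           key_size: int = KEY_SIZE,
           pointer_size: int = POINTER_SIZE) -> int:
    """Approximate number of entries that fit in one page."""
    return page_size // (key_size + pointer_size)


def max_rows(levels: int) -> int:
    """Rows addressable by a tree of the given depth, assuming uniform fanout."""
    return fanout() ** levels


def levels_needed(rows: int) -> int:
    """Smallest depth whose capacity covers the given row count."""
    levels, capacity = 1, fanout()
    while capacity < rows:
        capacity *= fanout()
        levels += 1
    return levels


if __name__ == "__main__":
    for depth in range(1, 5):
        print(f"{depth} level(s): ~{max_rows(depth):,} rows")
    print("100k rows ->", levels_needed(100_000), "levels")
    print("1B rows   ->", levels_needed(1_000_000_000), "levels")
```

Under these assumptions a 100k-row table needs 2 levels and a 1B-row table needs 4, which is the gap in lookup cost the post is pointing at.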

Over the past 8 days, I received over 100 PRs on the languages repo with additions and improvements.
- A bunch of languages were added
- Some implementations got tweaks to modify performance
- The run script now uses hyperfine for timing
Thanks to all the contributors.
Ben Dicken • 1,200,016 views • 1 year ago

Merkle trees are everywhere:
- ZFS uses them to detect data corruption
- Git uses them to verify repo integrity
- Cursor uses them for codebase sync
- Bitcoin uses them for transaction verification
Talked through how they work on the latest database stream.
Ben Dicken • 134,117 views • 1 month ago
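
For readers who have not seen one, here is a minimal sketch of the general Merkle tree idea: hash each block, then repeatedly hash pairs of hashes until a single root remains. It is a generic illustration, not the exact scheme used by ZFS, Git, Cursor, or Bitcoin. The choice of SHA-256, the rule of duplicating the last hash on odd-sized levels, and the function names are all assumptions made for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute a Merkle root over a list of data blocks."""
    if not blocks:
        # ASSUMPTION: define the empty tree's root as the hash of no data.
        return sha256(b"")
    # Leaf level: one hash per block.
    level = [sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # ASSUMPTION: pad odd-sized levels by repeating the last hash.
            level.append(level[-1])
        # Parent hash = hash of the two concatenated child hashes.
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    original = [b"block-0", b"block-1", b"block-2", b"block-3"]
    corrupted = [b"block-0", b"block-1", b"block-X", b"block-3"]
    print(merkle_root(original).hex())
    print(merkle_root(original) == merkle_root(corrupted))
```

Changing any single block changes the root, so two parties can compare one short hash to learn whether their data matches, and then compare subtree hashes to locate the difference.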

I implemented an ssd, hdd, and tape device. In javascript. For a blog.
Ben Dicken • 681,069 views • 1 year ago