What Postgres can’t do alone, it can tackle with DuckDB. The pg_duckdb extension shows how two databases can work hand in hand by querying analytical data with DuckDB and joining it with transactional data in Postgres. The extension embeds DuckDB directly into Postgres. DuckDB literally runs inside a Postgres...

28,639 views • 4 months ago • via X (Twitter)

Related videos

Big moment for Postgres!

Search has always been Postgres' weak spot, and everyone just accepted it. If you needed real relevance-ranked keyword search, the default answer was to spin up Elasticsearch or add Algolia and deal with the data sync headaches forever.

The problem isn't that Postgres can't do text search. It can. But the built-in `ts_rank` function uses a basic term frequency algorithm that doesn't come close to what modern search engines deliver.

So teams end up:
- Running a separate Elasticsearch cluster just for search
- Building sync pipelines that inevitably drift out of consistency
- Paying for managed search services that charge per query
- Accepting mediocre search relevance because "good enough" ships faster

But this is actually a solvable problem. You can realistically bring industry-standard search ranking directly into Postgres, which eliminates the need for external infra entirely.

This exact solution is now available with the newly open-sourced pg_textsearch by Tiger Data (creators of TimescaleDB), a Postgres extension that brings true BM25 relevance ranking into the database. BM25 is the algorithm behind Elasticsearch, Lucene, and most modern search engines. Now it runs natively in Postgres.

Here's what pg_textsearch enables:
- True BM25 ranking with configurable parameters (the same algorithm powering production search systems)
- Simple SQL syntax: `ORDER BY content <@> 'search terms'`
- Works with Postgres text search configurations for multiple languages
- Pairs naturally with pgvector for hybrid keyword + semantic search

That last point matters a lot for RAG apps. The video below shows this in action, and I worked with the team to put this together. You can now do hybrid retrieval (combining keyword matching with vector similarity) in a single database, without stitching together multiple systems. The syntax is clean enough that you can add relevance-ranked search to existing queries in minutes.

pg_textsearch is fully open-source under the PostgreSQL license. You can find a link to their GitHub repo in the next tweet.

Akshay 🚀

215,043 views • 4 months ago
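
To make the contrast between BM25 and plain term-frequency ranking in the post above concrete, here is a minimal sketch of the BM25 scoring formula in Python. It is not pg_textsearch's implementation: the whitespace tokenizer, the function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the Lucene-style IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query.

    Illustrative only: k1 and b are commonly used defaults, not values
    taken from pg_textsearch.
    """
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms weigh more (IDF, Lucene-style variant, assumed here).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b normalizes by document length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "postgres full text search",
    "postgres postgres postgres connection pooling with pgbouncer and more words here",
    "duckdb analytics",
]
scores = bm25_scores("postgres search", docs)
```

In this toy corpus the short document that matches both query terms outranks the longer one that merely repeats "postgres", which is the saturation and length-normalization behavior a raw term-frequency score lacks.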

I was talking with someone the other day who estimated they had about 25,000 idle connections to Postgres. My actual response to them: "holy shit". They double checked; it was only about 12,000.

The same day I had a conversation with someone who said they didn't need pgbouncer because of ActiveRecord's connection pooling.

Let's dig into connection pooling in Postgres.

Prior to Postgres 14, every connection to the database consumed roughly 10MB of memory. It may have been slightly less, but it still wasn't free. Even beyond Postgres 14, there is still contention of various kinds once Postgres starts to use a connection.

An application pooler maintains a set of connections and hands them out when needed on the application side. These are real connections that sit idle against the database, and they do hurt performance.

In contrast, pgbouncer speaks the wire protocol, waits for the begin of a transaction, and only then uses a connection. It manages the number of idle connections centrally and more strictly, rather than per web server you're running.

Up until recently, pgbouncer really needed to be run in transaction mode, which meant disabling prepared statements in your application framework. Prepared statement support was added to pgbouncer recently, so you no longer have to disable them. Even when running an older version of pgbouncer with prepared statements disabled, you'd still see a big performance gain.

For a quick check on whether you'd benefit from pgbouncer, run this query: `SELECT count(*), state FROM pg_stat_activity GROUP BY 2;`

If your idle count is high (yes, this depends on your point of view, but to me that means above the 25-30 range, and especially if active is under half that), then you'd already start to benefit from pgbouncer. If it's at 10,000, then get pgbouncer in place post haste.

Finally, you don't have to stop using a framework pooler. They're fine, but don't think one replaces a native Postgres connection pooler.

Craig Kerstiens

12,357 views • 1 year ago
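
The rule of thumb in the pgbouncer post above can be sketched as a small Python helper that takes the `(count, state)` rows returned by the `pg_stat_activity` query and classifies them. The function name `pooling_advice`, the labels it returns, and the exact cut-offs (30 for "high", 10,000 for "urgent") are assumptions that paraphrase the post's numbers; they are not an official pgbouncer or Postgres guideline.

```python
def pooling_advice(rows, idle_threshold=30, urgent_threshold=10_000):
    """Classify (count, state) rows from pg_stat_activity.

    Thresholds paraphrase the post's rule of thumb and are assumptions.
    """
    counts = {}
    for count, state in rows:
        # state is NULL (None) for some background processes; ignore those.
        if state is not None:
            counts[state] = counts.get(state, 0) + count

    idle = counts.get("idle", 0)
    active = counts.get("active", 0)

    if idle >= urgent_threshold:
        return "urgent"
    if idle > idle_threshold:
        # The post singles out the case where active is under half of idle.
        return "strongly consider" if active < idle / 2 else "consider"
    return "ok"


# Hypothetical rows, shaped like the output of:
#   SELECT count(*), state FROM pg_stat_activity GROUP BY 2;
example_rows = [(12000, "idle"), (50, "active"), (6, None)]
advice = pooling_advice(example_rows)
```

The helper only counts the `idle` and `active` states; other states such as `idle in transaction` are left out to keep the sketch close to the wording of the post.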