
· 32 min read

PostgresFM with Sai Srirampur: Logical replication common issues

[00:00:00] Nikolay: Hello, hello, this is PostgresFM, your favorite podcast about Postgres. And my name is Nikolay. My co-host is Michael.

[00:00:08] Michael: Hello, Nikolay.

[00:00:09] Nikolay: Hi, Michael. And we have a very good guest today, Sai, founder of PeerDB. We are very glad to have you here. We just discussed that we are not inviting guests just for the sake of having guests; we choose the topic first, so it's a topic-centric discussion. But when we decided we needed to talk about logical, and I saw your blog post published recently, my first thought was that you're a great person to have for this discussion. So that's why we invited you. Glad you found time to join.

[00:00:47] Sai: Thanks, Nik and Michael, for inviting me. Yeah, logical replication and logical decoding have been my life for the past 7, 8 months. We are trying to get into as much depth as possible to understand how it works, and probably, down the line, contribute upstream as well. I'm very excited to be here.

[00:01:05] Nikolay: Cool. I know Michael has a lot of questions, so I probably will be less active this episode. But I will have questions as well and maybe comments.

[00:01:14] Michael: Yeah, I definitely have questions. We did a kind of basics intro episode on logical, where Nikolay also talked us through how to handle it at scale, at least in terms of major upgrades, which is a common use case for logical. But one thing we didn't even touch on, which was the topic of one of your recent blog posts, was protocol versions. I listened back, and I think they got a single off-the-cuff mention at one point in that 30-40 minute episode. So that's where I was thinking of starting: it'd be awesome to hear a little bit about the protocol versions and why you've been looking at them recently.

[00:01:56] Sai: Yeah, absolutely. Thanks, Michael. So logical replication has this concept of protocol versions that you can specify as part of the subscription or, if you are an external client reading the replication slot, as part of the START_REPLICATION command. There are four versions that Postgres supports now, as of PostgreSQL 16. The first version is the default: it decodes only transactions that have committed, so the decoding process doesn't start before the transaction commits. The second is a more advanced option that does logical decoding of transactions that are not yet committed, so it decodes in-flight transactions. The third lets you decode two-phase commit transactions. And the fourth lets you decode in-flight transactions and apply them in parallel on the target, the subscriber, which is PostgreSQL.

In the blog post you're talking about, we compared version one and version two because they are the more common ones; two-phase commit is not very common, which is why we didn't go into that piece. The biggest difference, the impact of version two, is that it improves logical decoding performance, because you're decoding while the transaction is happening: you're not letting the transaction finish and only then starting to decode, you're decoding as the transactions go on. The idea is that you're giving more time to the decoder, the WAL sender process, to perform the decoding, and this has a lot of benefits. Number one, it reduces slot growth. In that blog we talk about a situation with two long-running transactions being committed. With version one, the decoding process doesn't start, and the slot keeps growing and growing; only after the transactions are committed does the decoding start, and then, for a few minutes until the decoding is finished, the slot size remains the same. With version two, decoding starts as the transactions happen, so you see the decoder doing its job, and as soon as the transactions finish, the slot size immediately falls. The summary was: with version one, the slot growth of, I think, 5 or 6 gigabytes remained for 5 or 6 minutes until the entire decoding finished; with version two, it remained there only for an instant, because the decoding was already happening while the transaction was going on. This can have a tremendous impact in use cases with long-running or sizable transactions, and that is very common. We were recently working with a fintech customer who had sizable transactions, hundreds of operations in a transaction, and these transactions were interleaved; version two would have helped them a lot, because as the operations happen, the decoder just decodes them.

With version one, the problem was that there was one large long-running transaction which took about an hour, and every time the decoder needed to decode committed transactions, it was decoding this long-running transaction as well, so the time spent becomes more quadratic with version one. WAL is sequential: you keep writing and writing, and whenever there is a commit, the decoder starts working. The problem with version one is that even while the long-running transaction hasn't been decoded for its own sake yet, the decoder still has to wade through it to decode the other transactions. With version two, this problem wouldn't have happened, because once the long-running transaction is committed, it has already been consumed, already decoded, if that makes sense. So the high-level summary is that version two is very helpful when there are long-running transactions that are interleaved.
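
In native logical replication, the protocol version is driven by subscription options; a minimal sketch, with hypothetical subscription, publication, and connection names:

```sql
-- PG 14+: ask the publisher to stream large in-flight transactions (protocol v2)
-- instead of waiting for COMMIT. Names here are illustrative.
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher.example.com dbname=app user=repl'
    PUBLICATION my_pub
    WITH (streaming = on);

-- PG 16: additionally apply streamed in-flight transactions with parallel workers (protocol v4).
ALTER SUBSCRIPTION my_sub SET (streaming = parallel);
```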

[00:06:22] Nikolay: So the lesson is: upgrade, guys! Right? Just use the latest PostgreSQL version.

[00:06:29] Sai: Version two comes with PG 14; version four is in PG 16.

[00:06:32] Nikolay: Sorry, I meant a small thing. I noticed you mentioned a subscriber which is PostgreSQL; this phrase attracted my attention, because it's not always PostgreSQL. But parallelized apply of long transactions is done only for PostgreSQL subscribers, not for others, right?

[00:06:51] Sai: Correct. Exactly. So the fourth version, which does parallel apply of these in-flight transactions, is more relevant for PostgreSQL-to-PostgreSQL replication: the apply piece is PostgreSQL, and that's standard logical replication. Logical decoding is more like a subset of logical replication, where external clients, ETL tools like PeerDB for example, can read the slot and replicate to homogeneous or heterogeneous targets.
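
For the external-client path, the same choice is made when reading the slot over a replication connection; a rough sketch with made-up slot and publication names:

```sql
-- Over a replication connection (e.g. psql "dbname=app replication=database"):
CREATE_REPLICATION_SLOT my_slot LOGICAL pgoutput;

-- Stream changes, requesting protocol v2 and streaming of in-flight transactions:
START_REPLICATION SLOT my_slot LOGICAL 0/0
    (proto_version '2', publication_names 'my_pub', streaming 'on');
```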

[00:07:17] Nikolay: In your experience, does it mean that there's no sense in having this feature for other targets, like Snowflake or others?

[00:07:26] Sai: The parallel apply feature you mean,

[00:07:27] Nikolay: Right.

[00:07:28] Sai: Yeah. Good question. So that is the onus of the ETL tool. So in PeerDB we have an option to do parallel apply or not do parallel apply. So it's like a single flag that we provide. And the difference is with parallel apply, we cannot guarantee the order of transactions across tables.

[00:07:52] Nikolay: So foreign key and referential consistency, similar to multiple slots in regular Postgres to Postgres. Correct.

[00:08:00] Sai: Very, very similar. And customers do both. The advantage of parallelism is that it improves latency: replication latency reduces, lag reduces, and so on. But it doesn't guarantee consistency across tables, while sequential apply does. So we have customers of both kinds.

[00:08:19] Nikolay: Yeah, that's interesting. I know in many cases people think: okay, we're working at maybe half of our capacity with a single slot, no worries; if TPS grows and we see the bottleneck is on the subscriber, we will just use multiple slots. But then they realize that with foreign keys you basically have to accept that they are temporarily broken on the subscriber, which can cause a lot of trouble if you point the application to such nodes, right? That's interesting. Yeah, absolutely. Makes sense.

[00:08:56] Sai: And then I did want to call out: for that, there is a setting for disabling foreign keys, called session_replication_role. You can set it to replica, and that is pretty standard; that's the reason Postgres provides this setting. It disables foreign keys and triggers on the target.
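
As a sketch of what an ETL tool's apply session can do on the target (note that native logical replication apply workers already run with this role):

```sql
SET session_replication_role = replica;  -- triggers and FK constraint triggers in the default
                                         -- ENABLE TRIGGER state do not fire in this session

-- ... apply replicated changes ...

SET session_replication_role = DEFAULT;  -- back to 'origin'
```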

[00:09:16] Nikolay: But by default, if you use multiple slots, it's disabled.

[00:09:21] Sai: By default, is it like disabled?

[00:09:23] Nikolay: Yes. I know it from recent work on using it for zero-downtime upgrades.

[00:09:31] Michael: Got it.

[00:09:31] Nikolay: It's indeed disabled, and it's interesting. So you need to think: maybe you shouldn't use multiple slots. Maybe go back to a single slot and just choose a different time for your workload. Work on weekends, guys, right? Lower traffic, and you can afford working with a single slot. I mean this only for temporary use of logical, like for upgrades, where we don't need it. Exactly. It really

[00:10:04] Sai: depends on the use case.

[00:10:06] Nikolay: Yeah, yeah. But if it's for a long time, maybe forget about foreign keys. And you said latency, but it's also about throughput: we need to process more bytes per unit of time.

[00:10:18] Sai: Absolutely.

[00:10:19] Nikolay: Lags can be nasty. I don't like dealing with logical decoding in Postgres 11, 12. It's not pleasant at all. So these improvements are exciting and everyone should upgrade. That's what I see here. You just uncovered a lot of details. That's great.

[00:10:39] Sai: Yeah, absolutely. And the beauty of Postgres is that this is already built into Postgres core: logical replication does this by default. It's just a setting you specify, saying, hey, I want to use version 2, and it will be significantly more performant and reduce replication slot growth. But if it's an ETL tool using logical decoding, the transaction logic needs to be managed by the ETL tool, because as you're reading in-flight transactions, you need to keep track of whether each transaction is committed or not, and only then push it to the target. It becomes tricky for ETL tools, which is what we are working on in PeerDB as we speak. But the beauty of Postgres is that it gives you this setting out of the box; you just need to upgrade your Postgres version to 14, 15, or 16.

[00:11:26] Nikolay: And with this, in 16 we have an example. I mean, the developers of these tools, like your company, now have a reference example of Postgres-to-Postgres native logical replication that shows how it can be implemented. So the reference example exists, which is good. Yeah, that's great. Sorry, Michael, if I interrupted you.

[00:11:49] Michael: No, not at all, this was a really good diversion. But we've talked about three things already, right? We've talked about throughput, we've talked about lag, and we've talked about slot size growth. And I feel like when you said initially that the performance of the version 2 protocol can be better, I'm guessing we're talking mostly in terms of that lag metric: we can start processing things faster, therefore the lag is lower, which makes total sense. And when you said the slot size growth is reduced, I think I understand where you're coming from. I think the specific benchmark in the blog post shows that the growth is the same, but then it reduces quicker. But when you mentioned interleaved long-running transactions, I guess in a less synthetic workload, where you've got lots of potentially overlapping long-running transactions, the peak will be lower with the version 2 protocol.

[00:12:47] Sai: Exactly right. In the scenario we tested, it was just two long-running transactions, and they ended at the same time. But in real-world workloads it can be very arbitrary, and with version 2, since the slot is consumed faster and the slot size falls quicker, the cumulative effect on slot growth can be very significant.

[00:13:15] Michael: Nice.

[00:13:16] Sai: If that makes sense. And that is the next benchmark we want to do: in a real-world benchmark, I want to see how the peak slot size compares between version 1 and version 2. We are actually building that feature, and we have a few customers we are design-partnering with to implement it. We know what peak slot size we are seeing with version 1, and with this design partnership we will get to know the peak slot size with version 2. My understanding is that it will fall very quickly, because you're giving more time to the decoder; it's more efficient, and with these long-running transactions the decoding process cannot become quadratic. That's a very good question you pointed out, Mike, and that is the next benchmark we want to do.

[00:14:00] Michael: Nice. So then the question becomes: are there any workloads where you might prefer to carry on using the v1 protocol over the v2 one, as long as you're on 14 and above? I'm thinking it must only be if you had, let's say, a lot of long-running transactions that end up getting aborted or rolled back instead of committed. Or are there other cases?

[00:14:29] Sai: Even when the transactions are rolled back, the WAL is not removed, so it still needs to be decoded when decoding future transactions. That's exactly what happened in a customer scenario where there was a long-running transaction which the customer killed. It had been running for a few hours, but it still impacted other transactions, because Postgres is not yet smart enough to remove or skip that WAL.

[00:14:53] Nikolay: This is my favorite use case: you can overload the WAL sender with a very simple trick. You create a table with, say, a million or 10,000,000 rows, and then delete rows in a transaction and roll it back. Massive delete, rollback, massive delete, rollback. I like this workload in many cases, this one included. It makes the WAL sender consume 100% CPU very quickly, because this transaction spams WAL, constantly writing xmax for a lot of tuples, and then says: okay, this xmax, this transaction ID, was rolled back, so the tuples are still alive, right? The next transaction comes, does the same, and the WAL sender goes crazy. And I think there should be some way to optimize decoding to understand this, maybe. When it starts processing, it's not yet known that this transaction will be rolled back. But maybe there is some opportunity to optimize: if it's already known that it was rolled back, why can't we skip it better, or something? I don't know the details here.
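
A sketch of that pathological workload (the table name is made up); nothing ever commits, but the walsender still has to read all of this WAL while decoding:

```sql
CREATE TABLE junk AS
SELECT g AS id FROM generate_series(1, 10000000) g;

BEGIN;
DELETE FROM junk;   -- writes xmax for millions of tuples into WAL
ROLLBACK;           -- transaction aborted, but the WAL records are already there
-- repeat BEGIN / DELETE / ROLLBACK in a loop to keep the WAL sender busy
```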

[00:16:10] Sai: That's a very good point, Nik. I was recently chatting with one of the committers, and my understanding is we don't even skip those rows; every time, we try to decode them. So the immediate optimization would be: if there is an entry in the WAL from a rolled-back transaction, don't decode that entry. But you bring up a great point here, so there is a lot of scope for improvement there.

[00:16:36] Nikolay: Yeah, and at the time this led us to the conclusion that we wouldn't be able to handle logical at even a few modifying transactions per second. It was a wrong conclusion, because this was a pathological workload. In production, I think it's a very rare case when a WAL sender is hitting 100% of a single core. Or maybe it's not, I'm not sure. The problem is you cannot scale it: you see a WAL sender hitting 100% of a single core, constantly or from time to time, and you think, okay, I will try to distribute the workload among 2 slots, or 4, or 8. But this doesn't help; all of them end up hitting 100% on different cores, and that's it. So this can be a bottleneck, but it looks like it's very rare in production right now.

[00:17:34] Sai: Yeah, absolutely. And that is another optimization the community could do: currently, if there are multiple slots, the decoding process runs for every slot. If there were a way for the decoder to run for just one slot and the other slots to reuse those decoded changes, that would help, if that makes sense.

[00:17:56] Nikolay: Yeah, exactly. The same work is done multiple times; everyone is processing everything. There is definitely an opportunity for improvement here. But I think people just don't see this as a bottleneck in production often; that's why it's not optimized yet.

[00:18:14] Sai: In production, what we see is these one-off batch operations, where customers delete a bunch of data or do a COPY with millions of rows, and there is a WAL spike. And the recommendation that we give is: guys, have more disk. It's hard, because logical replication can only scale up to about 25,000 to 30,000 messages per second, so have a larger and faster disk, so that once this spike passes, replication catches up. Sure, during that time there is more latency.

[00:18:49] Nikolay: That's good CPU work, if it's 25,000-30,000 messages per second. Because for PgBouncer on modern Intel and AMD, I didn't see more than roughly 20,000 transactions per second per process. We can compare them in terms of messages per second: PgBouncer processes transactions just by passing them to Postgres backends and returning the results, and logical replication is similar, just some messages. It's a different type of work, because decoding is maybe more CPU-consuming, but 30,000 sounds very good.

[00:19:25] Sai: Yeah, 100%, but it has to be done properly. One of the tips we give, which I've seen lacking in a few ETL tools, is to always keep consuming the slot. You cannot give up on consuming the slot; you need to constantly consume it and flush it, and not lag behind. The thing is, if you give up and start reading it again, it doesn't read from the confirmed_flush_lsn, it reads from the restart_lsn, and sometimes that restart_lsn can be very old. I don't want to get into when Postgres updates the restart_lsn, because that's more complicated, but the idea is that when I give up the connection and reacquire it again, decoding starts from the restart_lsn, which can increase my decoding times a lot. This is very common: early on, when we were building PeerDB, if the slot size was 150-200 gigs and we were giving up the connection periodically, then whenever the slot was big it would just sit in the WAL reader for 10-15 minutes without getting a single change. So that is one tip I recommend: always consume the replication slot.
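
A simple way to watch this on the publisher, assuming you just want the slot positions and retained WAL (a sketch; exact columns vary a bit by major version):

```sql
SELECT slot_name,
       active,
       restart_lsn,
       confirmed_flush_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';
```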

[00:22:26] Nikolay: Good. What's next?

[00:22:30] Michael: Well, before we move on from that, I'm interested: you mentioned that rule of thumb, 25 to 30,000 messages per second. Are there any other rules of thumb or hard limits, anything that would let people think, oh, we've probably not tuned this enough, because we're not getting that much throughput or that low a lag?

[00:22:51] Nikolay: I guess this number depends on the core, on the type of CPU you have. If it's old, it can be 10,000, for example.

[00:22:59] Sai: Yeah, a hundred percent. And the number I shared is more on the higher end, for a non-local environment where the source and the target are not local: they are in the same region, but in different boxes. So network latency is a very big factor. One thing we do want to measure is logical replication performance when both are on the same local server, which doesn't happen in the real world. So whatever I'm saying is network-latency bound: because it's single-threaded and network latency kicks in, it can only scale up to, say, 20-30 Mbps, or 30-40 Mbps if it's done well. Those are also some things to keep in mind.

Now, coming to gotchas, a few things I've made a note of about logical replication. The first is that it doesn't support replication of DDL commands, which includes adding columns, dropping columns, adding new tables, creating indexes, truncating tables; these are not supported out of the box. And this is one of the common concerns we hear from customers, because in the real world you add tables, you have these Django migrations which add a bunch of indexes, and so on.
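
For native logical replication, the usual manual workaround looks roughly like this (object names are hypothetical): apply the DDL on the subscriber yourself, and refresh the subscription when tables are added to the publication.

```sql
-- New column: create it on the subscriber first, otherwise apply fails
-- once rows containing the new column arrive from the publisher.
ALTER TABLE orders ADD COLUMN discount numeric;

-- New table: after creating it on both sides and adding it to the publication...
ALTER PUBLICATION my_pub ADD TABLE invoices;      -- on the publisher
-- ...tell the subscription to pick it up:
ALTER SUBSCRIPTION my_sub REFRESH PUBLICATION;    -- on the subscriber
```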

[00:24:18] Michael: Partitions we talked about last time.

[00:24:21] Sai: Yeah, but they don't do that also.

[00:24:24] Michael: So yeah, when you say customers raise it as a concern, is that before starting or like realizing it later on? Where does that pop up for you?

[00:24:33] Sai: It happens both ways. Obviously, Postgres logical replication is the easiest way to replicate databases, so people start off with it, and then a column gets added and logical replication breaks. The good thing is you can add the column manually on the target and it will continue, but this becomes difficult in production, and that's when they reach out, saying: hey, you guys also use logical decoding and support Postgres as a target. That's how they come to us. We do support schema changes, but only adding and dropping columns; we don't yet support creating tables, creating indexes, truncate, and so on. And sometimes it happens up front: they say, we keep adding tables, we keep adding columns very often, this doesn't work for us, so we want another solution.

[00:25:24] Michael: Yeah, nice. A couple of the other things that I see people confused or struggling with are things like sequence synchronization. I guess that's just a one-off task at the beginning. Do you see anything else catching people out?

[00:25:39] Sai: That's a good question. I think DDL commands are number one. Replication slot growth issues are number two, which we talked a lot about: always keep reading the slot, avoid long-running or open transactions, monitor slot growth, and use the newer protocol versions, that is, upgrade to the latest Postgres versions and start using versions 2, 3, 4, which are faster. The third thing we see is that Postgres logical replication doesn't replicate TOAST columns out of the box; you need to set up REPLICA IDENTITY FULL to make that happen. And REPLICA IDENTITY FULL can be expensive if you are not on Postgres 16, because for updates and deletes it can lead to a sequential scan.

[00:26:26] Nikolay: And how did this work before, was CTID used, or is it used? Why is it better in 16?

[00:26:34] Sai: It's using indexes now, basically.

[00:26:37] Nikolay: Okay.

[00:26:39] Sai: And the thing is,

[00:26:41] Nikolay: If you have them.

[00:26:42] Sai: Yeah, if you have them, and if you have the right indexes. Another thing is that some customers have primary keys but also have TOAST columns; we still need to do REPLICA IDENTITY FULL there, and that helps. REPLICA IDENTITY FULL with primary keys is more efficient. There is a good blog post the Zalando folks wrote, which I really liked, and which I refer customers to: hey guys, you can add this. I think their analysis was that it increased CPU and IO by 30%. But that is about TOAST columns, where you need REPLICA IDENTITY FULL, and the impact of that can change based on the scenario.
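
For reference, the setting being discussed, with a hypothetical table name; FULL logs the entire old row (including TOASTed values) for UPDATE and DELETE, at the price of extra WAL, and before Postgres 16 the subscriber cannot use an index to find matching rows for such tables:

```sql
ALTER TABLE events REPLICA IDENTITY FULL;   -- default is to log only the old primary key

-- Check the current setting: 'd' = default (PK), 'f' = full, 'i' = index, 'n' = nothing
SELECT relname, relreplident FROM pg_class WHERE relname = 'events';
```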

[00:27:22] Michael: Yeah. Awesome. I didn't know about that one. I would love to read that blog post. If you send it to me afterward, I'll include it in the show notes as well.

[00:27:29] Sai: Absolutely. And the fourth thing that we have seen is that logical replication, at least logical decoding, did not support virtual and generated columns: generated columns don't show up in WAL decoding. The way we solve that is easy: on the target, customers set up a virtual or generated column themselves, or use dbt or some transformation tool to backfill it. That is number four. And last but not least, the fifth one, which I'm still trying to understand, is slot invalidation issues, where the restart_lsn becomes null. I have seen that happen in two scenarios. The first involves the setting max_slot_wal_keep_size, which limits how much WAL a slot can retain; it was added in Postgres 13 to safeguard against running out of storage. As soon as the slot crosses that limit, say 2 terabytes or whatever you have configured, Postgres invalidates the slot: the restart_lsn becomes null and you need to restart logical replication. The second scenario we have seen is that sometimes it happens randomly, unseen, and we are still figuring out why. We reached out to the community and heard that it could be because of cloud providers' Postgres, since cloud providers run a bunch of backup operations on their forks of Postgres. The community doesn't say that definitively; their point is: can we reproduce this on vanilla Postgres, not in the cloud? And that has not been easy for us. But over the past 7 to 8 months we have seen the slot get invalidated randomly once or twice, and we don't have an RCA for when it can happen. The first thing we check is: hey, is max_slot_wal_keep_size set? But it's still set to -1, it's disabled, and yet we run into these issues. So I'm on a quest to figure out when this happens, because it's a scenario we need to understand better.
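
When chasing slot invalidation, these are the kinds of checks implied above (the 200GB value is purely an example):

```sql
SHOW max_slot_wal_keep_size;   -- -1 (the default) means no limit on WAL retained by slots

-- wal_status = 'lost' (and a NULL restart_lsn) means the slot has been invalidated
SELECT slot_name, wal_status, safe_wal_size, restart_lsn
FROM pg_replication_slots;

-- Optional safety limit, if running out of disk is the bigger fear:
ALTER SYSTEM SET max_slot_wal_keep_size = '200GB';
SELECT pg_reload_conf();
```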

[00:29:42] Michael: And that's terrifying. And yeah, how do you even begin to go about reproducing that? Good luck.

[00:29:47] Sai: Exactly. So we were able to reproduce this on one of the cloud providers: at high throughput, over 50K transactions per second, the slot was getting invalidated every 15 to 16 hours.

[00:30:00] Michael: 50,000 per second and every 15 to 16 hours.

[00:30:04] Sai: It's a lot. That's a lot

[00:30:07] Michael: of messages.

[00:30:08] Sai: That's not real. I mean, I don't think that happens often, but yeah.

[00:30:12] Michael: Right, wow. Cool. I don't feel like we've talked about use cases much. Is that because these issues aren't use-case dependent? Other than people using two-phase commit, do you see any differences between people using logical replication for analytics databases versus other targets; any difference between use cases, basically?

[00:30:33] Sai: Good question. Logical replication is very powerful. It supports migrations, online migrations, number one. It supports online upgrades, which Nik mentioned, number two. Number three, it also supports HA and backup scenarios: I have seen some customers use logical decoding or logical replication for HA across regions, or in hybrid environments, like, I'm running on the cloud and I still want a backup or a replica in my on-premise, self-hosted environment. The only way that's possible is via logical replication, because cloud providers don't give access to the WAL. Number four is replication to non-Postgres targets, or even Postgres targets, for workload and use-case isolation: I have my OLTP database running on Postgres, and now I want to pipe it into another database, which could be Snowflake or Postgres for analytics, or ClickHouse or Elasticsearch to optimize search. So logical replication is very powerful, and I believe it's going to be the future for Postgres. Down the line it can open up active-active use cases as well, cross-region; I recently read a blog where folks were using logical replication for active-active. It opens up a plethora of use cases and makes Postgres more powerful and extensible.

[00:32:03] Michael: Yeah, I read that blog post too. I think it was a version 16 change that made it possible not to replicate changes that were themselves made by logical replication, so you can set up logical replication in both directions. It seems super dangerous to me. Would you recommend it at the moment, or do you think we need more features before that?

[00:32:24] Sai: I think it's more involved than that, because conflict resolution and all of that is tricky. If it's very workload-specific, where you know you don't touch the same rows on both sides, maybe it could work. But out of the box, implementing that is tricky and requires more effort. Maybe we go there in the future, because we are seeing a few cases where customers want active-active and there is no out-of-the-box solution.
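
The Postgres 16 piece Michael mentions is the origin filter on subscriptions; a two-node sketch with made-up names, which only prevents replication loops and does nothing about conflict resolution:

```sql
-- On node A: publish local changes, and subscribe to node B's publication,
-- asking only for changes that originated on B itself (not ones B applied via replication).
CREATE PUBLICATION pub_a FOR ALL TABLES;
CREATE SUBSCRIPTION sub_from_b
    CONNECTION 'host=node-b dbname=app user=repl'
    PUBLICATION pub_b
    WITH (origin = none, copy_data = off);

-- On node B: the mirror image (publication pub_b, subscription to pub_a with origin = none).
```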

[00:32:59] Nikolay: Why do people need Active-Active?

[00:33:02] Sai: Good question. I have a lot of thoughts here. I think it really helps with HA. For example, I remember a scenario at Microsoft where a customer, a bank, had SQL Server active-active across regions, and one of the regions went down. Every second cost thousands of dollars, and they immediately pivoted to the other region, and it worked pretty much seamlessly.

[00:33:30] Nikolay: But doesn't it mean that all clusters need to perform the same writes? I didn't buy the idea, advertised in the BDR documentation, that replaying changes through logical replication is less expensive than applying them initially. I hope to find time to test it properly and write a blog post about it; so far I haven't. It's an interesting topic, because with BDR you can have multiple regions, 4 or 8, I know such cases as well, and they suffer because every node needs to perform the same writes. And if there is a spike in one region, it's not a resilient system at all. I don't understand this.

[00:34:21] Sai: Yeah, a hundred percent, I agree. I think that's the reason there is a big opportunity there. And I recently saw one company, pgEdge, who are trying to do something like this.

[00:34:33] Nikolay: Yeah, a new wave of this. I know every CTO is dreaming about it: we work in multiple regions. Maybe we should first check that all our physical standbys are in different availability zones than the primaries, right?

[00:34:53] Sai: There are more basic things.

[00:34:55] Nikolay: Yeah, multiple regions is a great thing to have, but I'm still very skeptical. Although I see a mind shift among DBAs as well: 10 years ago all DBAs said you don't need this, but things are changing. So in Postgres 16 there is this feature to avoid infinite loops, right? It's interesting. But I don't understand the use cases, and how we can scale writes if we don't split them, like in sharding. In sharding we split the writes, and that's great; here, everyone needs everything. Well, I'm very skeptical. I don't understand this part of the landscape.

[00:35:41] Sai: Yeah, 100%. And I think it becomes very critical in tier-0 use cases. For tier 1, tier 2, tier 3 it matters less, but it's these tier-0 cases, like a Chase bank or something like that, where it becomes critical.

[00:35:55] Nikolay: But do you think it's possible to build a good system?

[00:35:57] Sai: With Postgres, I am also very skeptical, but I think there is an opportunity there, and the community will be very critical here. I don't think it can happen with logical replication alone.

[00:36:13] Nikolay: Maybe if we build a system that also categorizes data into tiers, and replicates writes between regions only for the most critical data, right?

[00:36:25] Michael: We're probably tiptoeing into a different topic here. It's

[00:36:29] Nikolay: not different. It's not different. Many people think logical leads to multi-master, to use the old term.

[00:36:38] Michael: But I think the tier-0 use case feels like it's a long way from being supported by native Postgres. There are a lot of providers out there for that kind of thing, and I think the general term for it is distributed SQL, or DistSQL, which is what I've been calling it. But I would warn people against using logical for this anytime soon. Yeah.

[00:37:04] Nikolay: Okay.

[00:37:05] Sai: I agree.

[00:37:06] Nikolay: Okay. Then what, what other questions do you have?

[00:37:10] Michael: I didn't have anything else. I wanted to thank Sai. Did you have anything else Nikolay?

[00:37:15] Nikolay: Well no. Thank you so much. It was interesting. Thank you for coming.

[00:37:19] Sai: Absolutely, guys. I really enjoyed chatting with you and thanks for inviting me.

[00:37:24] Nikolay: I hope you won't stop posting interesting technical posts.

[00:37:29] Sai: No, no, no. Content is our currency, guys. The thing is, I'm building the company because I'm very curious. Right now the thing that is haunting me is that slot invalidation; I don't understand why it can happen. So, because of that curiosity, we will be publishing a lot of content, but

[00:37:50] Nikolay: yeah, benchmarks, graphs, data and so on. Reproducible also. Yeah, that's great. Thank you so much.

[00:37:56] Michael: And also, if any of our listeners have seen that as well, have any theories or have a reproduction case for it, let us know.

[00:38:04] Nikolay: Mm-hmm. Absolutely.

[00:38:06] Michael: Good. Wonderful. Thank you. Thank you both. Take care.

[00:38:09] Sai: Thanks, guys. Bye-bye. Bye-bye.

· 17 min read

Postgres.AI Bot. Towards LLM OS for Postgres

I'm happy to present our new product, Postgres.AI Bot. It is powered by OpenAI's GPT-4 Turbo and is designed to help engineers improve their experience when working with PostgreSQL. This bot has a vast knowledge base with over 110,000 entries, including documentation, source code for different PostgreSQL versions, and related software like PgBouncer, Patroni, and pgvector. It also integrates expert articles and blogs.

In addition, the bot can conduct two types of experiments to verify ideas:

  1. Single-session on thin clones provided by Postgres.AI DBLab Engine to check SQL syntax and behavior of PostgreSQL planner and executor, and
  2. Full-fledged benchmarks (pgbench) on separate VMs in Google Cloud to study Postgres behavior under various workloads. For each iteration, 70+ artifacts are automatically collected and used by the bot to analyze and visualize experiment results.

Our ambitious goal for 2024 – conduct 1 million database experiments in both shared and dedicated environments, aiding both the bot's knowledge and the improvement of these projects, particularly in performance. To achieve this, we became a part of Google Cloud's AI startup program.

In this blog post, we discuss some details of how the bot is implemented, what it is capable of, share its first exciting achievements, and talk about the future.

· 6 min read

DBLab Engine 3.4: new name, SE installer, and lots of improvements

DBLab Engine version 3.4, an open-source tool for PostgreSQL thin cloning and database branching, has been released with numerous improvements.

Rapid, cost-effective cloning and branching are extremely valuable when you need to enhance the development process. DBLab Engine can handle numerous independent clones of your database on a single machine, so each engineer or automated process can work with their own database created within seconds without additional expenses. This enables testing of any changes and optimization concepts, whether manually or in CI/CD pipelines, as well as validating all the concepts suggested by ChatGPT or another LLM. This effectively addresses the issue of LLM hallucinations.

· 17 min read

10 Postgres tips for beginners

Getting started with PostgreSQL can be both exciting and challenging. It's more than just another databaseβ€”it's a system packed with features that can change how you handle data. Every Friday, Michael Christofides (pgMustard) and I discuss these features on our podcast, Postgres.FM (there is also a video version on YouTube). We've been at it for 55 weeks straight since July 2022, and we're not stopping anytime soon. Our latest episode was all about helping newcomers to PostgreSQL. After seeing the huge response to my tweet, which got over 200k views, 1200+ likes, and 200+ retweets, I wanted to dig deeper and share more about these essential tips.

Here are those 10 tips (+bonus) Michael and I have discussed.

· 3 min read

Blue elephants in skies

Today, developers who need to work with full scale data have three options:

  1. Develop scripts to backup & restore their production database
  2. Use expensive cloud provided database cloning features
  3. Deploy and test on production 😱

In other words, they can pay with time, money, risk, or some combination of all three.

But! There is a better, faster, cheaper, and completely safe way for developers to get the data they need to do their work properly. With the Database Lab Engine, developers can instantly provision as many full scale copies of their database as needed.

Check out this table which makes a direct comparison of the most common methods (We omitted testing on production. Don't do that.):

|  | Provisioning Time | Monthly Compute Costs | Monthly Storage Costs | Total Monthly Costs for 10 Clones |
|---|---|---|---|---|
| Traditional Thick Cloning (EC2 + EBS) | Hours | $185 | $100 per clone | $1185 |
| EC2 + EBS restored from snapshot | ~10 mins plus an hour for warmup (lazy load) | $185 | $100 per clone | $1185 |
| RDS Clones | ~10 mins plus an hour for warmup (lazy load) | $365 per clone | $100 per clone | $4650 |
| Aurora thin clones | ~5 mins | $417 per clone | $100 | $4270 (+ IO costs) |
| Database Lab Engine thin clones | ~5 seconds | $345 | $100 | $445 |

The price comparison here makes the following assumptions:

  1. Once provisioned, the clone remains continuously available for development and CI testing
  2. The clones all run on an r5.xlarge instance
  3. The database size is 100 GiB

As a result, running your development or staging environments with the Database Lab Engine (how it works) is more than 10x cheaper and 10x faster than the best available alternative.

Try it now in the AWS Marketplace. Setup docs are here and contact us if you have questions!


· 4 min read

Database Lab Engine 3.2 by Postgres.ai: config and logs in UI, Postgres 15, AWS Marketplace version is GA

The Postgres.ai team is happy to announce the release of version 3.2 of Database Lab Engine (DLE), an open-source tool that provides blazing-fast database cloning and branching for any PostgreSQL database to build powerful development, test, QA, and staging environments. DLE can run dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database provisioned in seconds without extra costs. Data size and the number of snapshots/branches are virtually unlimited.

This release focuses on improving the admin experience during DLE configuration and troubleshooting. The full list of changes you can find in Release Notes.

Improved configuration and troubleshooting​

DLE UI now has two new tabs: "Logs" and "Configuration". They help DLE admins troubleshoot issues with database refresh, adjust configuration, and make another attempt to get data without the need to use SSH.

DLE UI new tabs: Logs and Configuration

For logical mode (dump/restore), it is now possible to observe the current activity on both source and target sides:

DLE logical data provisioning activity

Postgres clone errors are now stored in a special "diagnostic" directory (see new configuration section – diagnostic) that allows analyzing them even if the corresponding Postgres clone was already deleted or lost.

More flexibility for schema and data patching​

The configuration subsection queryPreprocessing is now present in the section logicalRestore, in addition to the sections logicalSnapshot and physicalSnapshot. This can be helpful, for example, to mitigate errors related to CREATE POLICY and missing users: the queries provided in logicalRestore / queryPreprocessing are executed right before the schema and data restoration attempt, so one can create "helper" database objects before the main body of the schema is created. For convenience, it is now also possible to provide inline SQL in queryPreprocessing in any section, instead of a path to SQL queries, for example:

retrieval:
  jobs:
    - logicalRestore
    - logicalSnapshot
  spec:
    logicalRestore:
      options:
        queryPreprocessing:
          queryPath: ""
          maxParallelWorkers: 2
          inline: |
            CREATE ROLE tony;
            CREATE ROLE mary;
    logicalSnapshot:
      options:
        dataPatching:
          queryPreprocessing:
            inline: |
              TRUNCATE audit_log;

Recently released PostgreSQL 15 is now fully supported, as well as all previous versions starting with 9.6. The full list of supported PostgreSQL versions and extensions available in the default Docker images can be found in the docs.

Demo DLE​

You can see new UI working with Demo DLE: https://demo.aws.postgres.ai (token: demo-token).

DLE in AWS Marketplace is GA​

The AWS Marketplace version of Database Lab is now GA, and it's already updated to version 3.2. This offering comes with a Standard license that includes guaranteed support from Postgres.ai, simplified setup, an embedded monitoring system (Netdata), and optional SSL. Paying less than for an RDS clone, you can have dozens of full-size clones running on a single machine with DLE. This short video demonstrates the steps of installation and configuration of DLE for a 100 GiB RDS database:

You can try out installing DLE from AWS Marketplace for your database here: https://aws.amazon.com/marketplace/pp/prodview-wlmm2satykuec.

Community news:

  • 🌠 DLE repository on GitHub reached 1.4k stars; many thanks to everyone who supports the project in any way
  • πŸ“ˆ The Twitter account has reached 800 followers – please follow @Database_Lab
  • πŸŽ‰ DLE now has 19 contributors. More contributions are welcome! See "good first issues"
  • πŸ₯‡ Please consider various ways to contribute – read CONTRIBUTING.md

Further reading​

tip

To get help, reach out to the Postgres.ai team and the growing community of Database Lab users and contributors: https://postgres.ai/contact.

Request for feedback and contributions​

Feedback and contributions would be greatly appreciated:

Like Database Lab? Give us a GitHub star: https://github.com/postgres-ai/database-lab.


· 3 min read

Database Lab Engine for AWS Marketplace. Fast, fixed-cost branching for your RDS Postgres is just a step away

I'm very pleased to announce the very first preview version of Database Lab Engine (DLE) for AWS Marketplace. If you're using AWS, this is the fastest way to have powerful database branching for any database, including RDS and RDS Aurora. But not only RDS: any Postgres and Postgres-compatible database is supported as a source for DLE.

Now, for a fixed price (paying just for one EC2 instance and an EBS volume), you can have dozens of DB clones being provisioned in seconds and delivering independent databases for your Git branches, CI/CD pipelines, as well as manual optimization and testing activities.

Achieving the lowest possible entry barrier for new DLE users remains one of our primary goals. In addition to the tutorials (we have several, including one for RDS users) and the Terraform module template for DLE, we now offer a way to install DLE on AWS using only a web browser.

What's included:

  • DLE (same as Community Edition; the latest minor versions of DLE are available)
  • Automated data provisioning at the logical level from live Postgres database (can be any Postgres, version 9.6 or newer)
  • Automated refresh using two or more snapshots on one EBS volume (of a bigger size), with a configurable full refresh schedule (read about DLE's automated full refresh on schedule in the docs; note that in the AWS Marketplace version, we use a single ZFS pool but 2 (or more, if you request it) datasets to enable working with multiple full snapshots of the database; therefore, during a full refresh, users can still work with clones created from the snapshot that is not being refreshed at the moment)
  • Full refresh schedule can be defined when DLE instance is created
  • UI, CLI, and API to work with any number of thin clones – as usual
  • Optional generation of certificates and access to UI and API via HTTPS (via Envoy proxy), as well as access to the Postgres clones created by DLE users
  • Guaranteed support via one of the available channels

What's not yet included in this "preview" version:

  • Physical mode (for those who manage Postgres themselves)
  • Many advanced DLE configuration options are not available in AWS Marketplace / CloudFormation interface; however, they can still be adjusted once the instance is created (most of them can be changed without DLE restart – see the docs)

To start, please read the documentation: "How to install DLE from the AWS Marketplace". Below you can watch a 3-minute video demonstrating the setup process.

If you have any questions, use this page to contact us: https://postgres.ai/contact.


· 26 min read

Stepping on a 'rake db:migrate'

'rake db:migrate' – a command known to any Ruby developer. But how many times did we step on the same rake?

In his article "Lesser Known PostgreSQL Features", @be_haki describes 18 Postgres features many people don't know. I enjoyed that article, and it inspired me to write about "anti-features" – things that everyone should avoid when working in probably the riskiest field of application development – so-called "schema migrations".

This is one of my favorite topics in the field of relational databases. We all remember how MongoDB entered the stage with two clear messages: "web-scale" (let's have sharding out-of-the-box) and "schemaless" (let's avoid designing schemas and allow full flexibility). In my opinion, both buzzwords are an oversimplification, but if you have experience in reviewing and deploying schema changes in relational databases, you probably understand the level of difficulty, risks, and pain of scaling the process of making schema changes. My personal score: 1000+ migrations designed/reviewed/deployed during 17+ years of using Postgres in my own companies and when consulting others such as GitLab, Chewy, Miro. Here I'm going to share what I've learned, describing some mistakes I've made or observed – so probably next time you'll avoid them.

Moreover, a strong desire to help people avoid such mistakes led me to invent the Database Lab Engine – a technology for thin cloning of databases, essential for development and testing. With it, you can clone a 10 TiB database in 10 seconds, test schema changes, and understand the risks before deployment. Most cases discussed in this article can be easily detected by such testing, and it can be done automatically in CI/CD pipelines.

As usual, I'll be focusing on OLTP use cases (mobile and web apps), for which query execution that exceeds 1 second is normally considered too slow. Some cases discussed here are hard to notice in small databases with low activity. But I'm pretty confident that you'll encounter most of them when your database grows to ~10 TiB in size and its load reaches ~10⁵–10⁶ transactions per second (of course, some cases will be seen, unless deliberately prevented, much, much earlier).

I advise you to read GitLab's great documentation – their Migration Style Guide is full of wisdom written by those who have experience in deploying numerous Postgres schema changes in a fully automated fashion to a huge number of instances, including GitLab.com itself.

I also encourage everyone to watch PGCon-2022 – one of the key Postgres conferences; this time, it's happening online again. On Thursday, May 26, I'll give two talks, and one of them is called "Common DB schema change mistakes"; you can find the slide deck here. If you missed it, no worries – @DLangille, who has organized the conference since 2006 (thank you, Dan!), promises to publish talk videos in a few weeks.

· 3 min read

Database Lab Engine 3.1 by Postgres.ai: pgBackRest, timezones for CLI, DLE community, more

The Postgres.ai team is happy to announce the release of version 3.1 of Database Lab Engine (DLE), the most advanced open-source software ever released that empowers development, testing, and troubleshooting environments for fast-growing projects. The use of Database Lab Engine 3.1 provides a competitive advantage to companies via implementing the "Shift-left testing" approach in software development.

Database Lab Engine is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database provisioned in seconds without extra costs.

In this release, the development team has also focused on the Database Lab Engine community, making it easier to get help or contribute. The team greets all new contributors: @Nikolay Devxx, @asotolongo, @Tanya301, @denis-boost, @pietervincken, @ane4ka

caution

Action required to migrate from a previous version. If you are running DLE 3.0 or older, to upgrade to DLE 3.1, please read the Migration notes.

In DLE 3.1:

  • Native support for pgBackRest as a tool to restore data from archives (physical mode, including continuously updated state), in addition to the existing support of WAL-G
  • Allow configuring timezone in DLE CLI configuration to improve the experience of using DLE in CI/CD pipelines
  • Improved README.md, translated to four languages, added CONTRIBUTING.md, SECURITY.md, and CODE_OF_CONDUCT.md
  • Many improvements in the engine and UI to improve work both in logical and physical modes

Community news:

  • 🌠 DLE repository on GitHub now has 1,100+ stars; many thanks to everyone who supports the project in any way
  • πŸ’₯ Pieter Vincken has published a blog post describing their experience of using DLE: "Testing with production data made easy"
  • πŸ“ˆ The Twitter account has reached 400 followers – please follow @Database_Lab
  • πŸŽ‰ DLE now has 15 contributors. More contributions are welcome! See "good first issues"
  • πŸ₯‡ Please consider various ways to contribute – read CONTRIBUTING.md

Further reading​

tip

To get help, reach out to the Postgres.ai team and the growing community of Database Lab users and contributors: https://postgres.ai/contact.

Request for feedback and contributions​

Feedback and contributions would be greatly appreciated:

Like Database Lab? Give us a GitHub star: https://github.com/postgres-ai/database-lab.


· 12 min read

Progress bar for Postgres queries – let's dive deeper

Recently, I have read a nice post titled "Query Progress Bar", by Brian Davis. It describes an interesting approach to observing the progress of slow query execution.

At some point, the author mentions:

Don't use this in prod.

And I agree. The article discusses long-running queries such as SELECTs, UPDATEs, DELETEs, and quite "invasive" methods of progress monitoring. In an OLTP production scenario, in most cases, we should try to limit the duration of such queries, setting statement_timeout to a very low value – such as 30 or even 15 seconds.
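
For example, the timeout can be set per session or per role (the values here are only illustrative):

```sql
SET statement_timeout = '15s';                      -- current session only

ALTER ROLE app_user SET statement_timeout = '30s';  -- default for every new session of that role
```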

Let's dive deeper into the topic of query progress monitoring, and discuss various types of queries, how to monitor their progress, considering production and non-production environments separately.