Skip to main content

· 12 min read

Progress bar for Postgres queries – let's dive deeper

Recently, I have read a nice post titled "Query Progress Bar", by Brian Davis. It describes an interesting approach to observing the progress of slow query execution.

At some point, the author mentions:

Don't use this in prod.

And I agree. The article discusses long-running queries such as SELECTs, UPDATEs, DELETEs, and quite "invasive" methods of progress monitoring. In an OLTP production scenario, in most cases, we should try to limit the duration of such queries, setting statement_timeout to a very low value – such as 30 or even 15 seconds.

Let's dive deeper into the topic of query progress monitoring, and discuss various types of queries, how to monitor their progress, considering production and non-production environments separately.

· 22 min read

Jupiter's moon Io. Credit: ALMA (ESO/NAOJ/NRAO), I. de Pater et al.; NRAO/AUI NSF, S. Dagnello; NASA/JPL/Space Science InstituteJupiter's moon Io. Credit: ALMA (ESO/NAOJ/NRAO), I. de Pater et al.; NRAO/AUI NSF, S. Dagnello; NASA/JPL/Space Science Institute

SQL query optimization is challenging for those who have just started working with PostgreSQL. There are many objective reasons for this, such as:

  • the difficulty of the field of system performance in general,
  • lack of good "playground" environments where people can experience how databases work at a larger scale,
  • lack of certain capabilities in Postgres observability tools that are still developing (though, at a good pace),
  • insufficiency of good educational materials.

All these barriers are reasonable. They limit the number of engineers possessing well-developed Postgres query optimization skills. However, there is a specific artificial barrier that is rather influential and which is relatively easy to eliminate.

Here it is: the EXPLAIN command has the BUFFERS option disabled by default. I am sure it has to be enabled and used by everyone who needs to do some SQL optimization work.

· 7 min read

Database Lab Engine 3.0 by Postgres.ai: UI, persistent clones, PostgreSQL 14, more

note

Action required to migrate from a previous version. If you are running DLE 2.5 or older, please read carefully and follow the Migration notes.

What's new in DLE 3.0?

The Postgres.ai team is happy to announce the release of version 3.0 of Database Lab Engine (DLE), the most advanced open-source software ever released that empowers development, testing, and troubleshooting environments for fast-growing projects. The use of Database Lab Engine 3.0 provides a competitive advantage to companies via implementing the "Shift-left testing" approach in software development.

Database Lab Engine is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their very own database provisioned in seconds without extra costs.

Among major changes in DLE 3.0:

  • UI included to the core, it allows working with a single DLE instance,
  • persistent clones: clones now survive DLE (or VM) restart,
  • for the "logical" data provisioning mode: the ability to switch reset clone's state using a snapshot from different pool/dataset,
  • better logging and configuration simplicity,
  • improvements for the cases when multiple DLEs are running on a single machine,
  • PostgreSQL 14 support.

Starting with version 3.0.0, DLE collects non-personally identifiable telemetry data. This feature is enabled by default but can be switched off. Read more in the DLE documentation. Keeping telemetry enabled can be considered your contribution to the DLE development because it helps make decisions down the road of the open-source product development.

Further, we discuss the most requested changes that were implemented in DLE 3.0 – all of them were created based on real-life user experience and invaluable feedback from the growing community of users and contributors.

· 8 min read

EXISTS OR NOT EXISTS (meme)

What is this about?#

Many DDL statements in PostgreSQL support modifiers IF EXISTS / IF NOT EXISTS. For example:

test=# create table if not exists mytable();CREATE TABLEtest=# drop table if exists mytable;DROP TABLE

I recommend using IF EXISTS / IF NOT EXISTS in DDL only when necessary. Here are three examples that demonstrate how the overuse of these words may lead to negative consequences.

· 17 min read

How partial, covering, and multicolumn indexes may slow down UPDATEs in PostgreSQL

Based on a true story.

This article was originally published in 2018. This is a reviewed and extended version of it. The discussed findings can be applied to any actual major version of PostgreSQL.

Primum non nocere#

"Primum non nocere" – this is a fundamental principle that is well-known to anyone working in healthcare: "first, do no harm". It is a reminder: when considering any action that is supposed to improve something, we always need to look at the global picture to see if there might be something else that be damaged by the same action.

This is a great principle and it is used not only in healthcare, of course. I strongly believe that it has to be used in database optimization too, and we need better tools to make it happen.

· 2 min read

The Database Lab Platform now supports the ability to reset a clone's state using any available data snapshot on the Database Lab Engine (DLE) instance. It means that now you can get fresh data very quickly (a few seconds, as usual) to your clone, keeping all the credentials the same – including the port. Previously, it was a problem that one would need to create a new clone to get new data. This implies getting database credentials, most likely including a new port value, which was not always convenient.

It might be helpful in the cases when long-living clones need to be refreshed to get the most up-to-date data from the source (usually, production) or when troubleshooting a non-trivial issue requiring the use of a specific database version from the past (specific point in time).

Reset options demo

· 6 min read

Example output of the query for 'lock trees' analysis

For OLTP workloads (such as web and mobile applications), it is important to understand object-level and row-level locks in PostgreSQL. There are several good materials that I can recommend reading:

· 21 min read

Graceful Postgres database schema migrations

Zero-downtime database schema migrations#

This is one of those topics that hit experienced Postgres DBAs badly. So badly that hypothetical Socrates of 21th century could be one of those Postgres DBAs and the words "I know that I know nothing" would sound natural. I've seen dozens of projects with busy Postgres setups serving lots of TPS, in mission-critical systems in zillion-dollar companies, where database engineers were thinking that they are experienced and know what they are doing – and then suddenly this, quite a basic topic, made them say "ouch" and quickly fix their DDL deploy systems. Well, let's be frank: I was in this position myself, learning this after more than 10 (!) years of Postgres experience.

I'm sure some of you know this very well – if so, scroll down to see some bits of advanced material on the matter. However, I'm 100% sure that many of my readers will be really surprised right now.

We won't talk about "how to change a column's data type" or "how to add a foreign key" – those questions are all interesting too, and there are efforts to document and automate each of such steps for heavily-loaded systems with strict uptime requirements (a great example here is GitLab's "Migration Style Guide"). Instead, we will discuss something that affects any Postgres setup where schema needs to be changed from time to time, where downtime is considered as a huge problem, where DDLs are automated in one way or another (I mean DB migration tools such as Flyway, Sqitch, Liquibase, Ruby on Rails AR DB Migrations, and so on), but where DDLs are deployed without certain trick, so downtime can happen suddenly and unpredictably. Even if TPS numbers are not big. Without that trick in place, anyone using Postgres can (and will) hit that wall one day. Therefore, any engineer working with Postgres should know this trick and, perhaps, implement it in all systems to prevent downtime.

It is time to dive into technical details...

· 4 min read

DLE 2.5: Better data extraction for logical mode and configuration improvements

note

Action required to migrate from a previous version. See Migration notes.

The Database Lab Engine (DLE) is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database provisioned in seconds without extra costs.

DLE 2.5 significantly expands the capabilities of automatic preparation of snapshots directly from managed database services, as well as from logical dumps, namely:

  • restoring of multiple databases
  • various pg_dump output formats and file compression formats

Since version 2.5, it becomes possible to reset the clone's database state to a specific snapshot if multiple snapshots are available. See DLE CLI reference. There is also a new option for the reset command, --latest, that allows resetting to the latest available state not knowing the snapshot name. This can be very useful in situations when a clone lives long, occupying a specific port, and some applications (e.g., analytical tools) are configured to work with it – users can periodically switch to the freshest database state without a need to reconfigure their applications.

All new restore features are also already available in the Terraform module (currently works with AWS only).

Additionally, this release has a lot of improvements and fixes. Read the full changelog.

· 7 min read

What is a slow SQL query? (SRT - Server Response Time)

User interface performance#

For web and mobile applications (here, let's talk only about them, not touching analytical workloads, background processing, and so on), engineers usually aim to have as few SQL queries per each HTTP(S) request as possible, and keep the execution time of those queries very low.

How much low?