10 Postgres tips for beginners

July 22, 2023 · 17 min read

Nikolay Samokhvalov

10 Postgres tips for beginners

Getting started with PostgreSQL can be both exciting and challenging. It's more than just another database—it's a system packed with features that can change how you handle data. Every Friday, Michael Christofides (pgMustard) and I discuss these features on our podcast, Postgres.FM (there is also a video version on YouTube). We've been at it for 55 weeks straight since July 2022, and we're not stopping anytime soon. Our latest episode was all about helping newcomers to PostgreSQL. After seeing the huge response to my tweet, which got over 200k views, 1200+ likes, and 200+ retweets, I wanted to dig deeper and share more about these essential tips.

Here are those 10 tips (+bonus) Michael and I have discussed.

Test environments that are 10x cheaper and 10x faster than RDS clones

November 25, 2022 · 3 min read

Dante Cassanego

Blue elephants in skies

Today, developers who need to work with full scale data have three options:

Develop scripts to backup & restore their production database
Use expensive cloud provided database cloning features
Deploy and test on production 😱

In other words, they can pay with time, money, risk, or some combination of all three.

But! There is a better, faster, cheaper, and completely safe way for developers to get the data they need to do their work properly. With the Database Lab Engine, developers can instantly provision as many full scale copies of their database as needed.

Check out this table which makes a direct comparison of the most common methods (We omitted testing on production. Don't do that.):

	Provisioning Time	Monthly Compute Costs	Monthly Storage Costs	Total Monthly Costs for 10 Clones
Traditional Thick Cloning (EC2 + EBS)	Hours	$185	$100 per clone	$1185
EC2 + EBS restored from snaphot	~10 mins plus an hour for warmup (lazy load)	$185	$100 per clone	$1185
RDS Clones	~10 mins plus an hour for warmup (lazy load)	$365 per clone	$100 per clone	$4650
Aurora thin clones	~5 mins	$417 per clone	$100	$4270 (+ IO costs)
Database Lab Engine thin clones	~5 seconds	$345	$100	$445

The price comparison here makes the following assumptions:

Once provisioned, the clone remains continuosly available for development and CI testing
The clones all run on an r5.xlarge instance
The database size is 100 GiB

As a result, running your development or staging environments with the Database Lab Engine (how it works) is more than 10x cheaper and 10x faster than the best available alternative.

Try it now in the AWS Marketplace. Setup docs are here and contact us if you have questions!

Dante Cassanego

DBLab Engine 4.0

Instant database branching with O(1) economics.

DLE 3.2: config and logs in UI, Postgres 15, AWS Marketplace version is GA

October 20, 2022 · 4 min read

Nikolay Samokhvalov

Database Lab Engine 3.2 by Postgres.ai: config and logs in UI, Postgres 15, AWS Marketplace version is GA

The Postgres.ai team is happy to announce the release of version 3.2 of Database Lab Engine (DLE), an open-source tool that provides blazing-fast database cloning and branching for any PostgreSQL database to build powerful development, test, QA, and staging environments. DLE can run dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database provisioned in seconds without extra costs. Data size and the number of snapshots/branches are virtually unlimited.

This release focuses on improving the admin experience during DLE configuration and troubleshooting. The full list of changes you can find in Release Notes.

Improved configuration and troubleshooting

DLE UI now has two new tabs: "Logs" and "Configuration". They help DLE admins troubleshoot issues with database refresh, adjust configuration, and make another attempt to get data without the need to use SSH.

DLE UI new tabs: Logs and Configuration

For logical mode (dump/restore), it is now possible to observe the current activity on both source and target sides:

DLE logical data provisioning activity

Postgres clone errors are now stored in a special "diagnostic" directory (see new configuration section – diagnostic) that allows analyzing them even if the corresponding Postgres clone was already deleted or lost.

More flexibility for schema and data patching

The configuration subsection queryPreprocessing is now present in the section logicalRestore, in addition to sections logicalSnapshot and physicalSnapshot. This can be helpful, for example, to mitigate errors related to CREATE POLICY and missing users – the queries provided in logicalRestore / queryPreprocessing are executed right before schema and data restoration attempt, so one can create "helper" database objects before the main body of the schema is created. For convenience, it is now also possible to provide inline SQL in queryPreprocessing in any section, instead of a path to SQL queries, for example:

retrieval:
    jobs:
        - logicalRestore
        - logicalSnapshot
    spec:
        logicalRestore:
            options:
                queryPreprocessing:
                    queryPath: ""
                    maxParallelWorkers: 2
                    inline: |
                      CREATE ROLE tony;
                      CREATE ROLE mary;
        logicalSnapshot:
            options:
                dataPatching:
                    queryPreprocessing:
                        inline: |
                            TRUNCATE audit_log;

Recently released PostgreSQL 15 is now fully supported as well as all previous versions starting with 9.6. The full list of supported PostgreSQL versions and extensions available in the default Docker images can be found in the docs).

Demo DLE

You can see new UI working with Demo DLE: https://demo.aws.postgres.ai (token: demo-token).

DLE in AWS Marketplace is GA

The AWS Marketplace version of Database Lab is now GA, and it's already updated to version 3.2. This offering is equipped with Standard license that includes guaranteed support from Postgres.ai, simplified setup, embedded monitoring system (Netdata), optional SSL. Paying less than for an RDS clone, you can have dozens of full-size clones running on a single machine with DLE. This short video demonstrates the steps of installation and configuration of DLE for a 100 GiB RDS database:

You can try out installing DLE from AWS Marketplace for your database here: https://aws.amazon.com/marketplace/pp/prodview-wlmm2satykuec.

Community news:

🌠 DLE repository on GitHub reached 1.4k stars; many thanks to everyone who supports the project in any way
📈 The Twitter account has reached 800 followers – please follow @Database_Lab
🎉 DLE now has 19 contributors. More contributions are welcome! See "good first issues"
🥇 Please consider various ways to contribute – read CONTRIBUTING.md

Request for feedback and contributions

Feedback and contributions would be greatly appreciated:

Like Database Lab? Give us a GitHub star: https://github.com/postgres-ai/database-lab.

Nikolay Samokhvalov

CEO & Founder of PostgresAI

Working on tools to balance Dev with Ops in DevOps

DBLab Engine 4.0

Instant database branching with O(1) economics.

Database Lab Engine for AWS Marketplace. Fast, fixed-cost branching for your Postgres is just a step away

July 3, 2022 · 3 min read

Nikolay Samokhvalov

Database Lab Engine for AWS Marketplace. Fast, fixed-cost branching for your RDS Postgres is just a step awa

I'm very pleased to announce the very first preview version of Database Lab Engine (DLE) for AWS Marketplace. If you're using AWS, this is the fastest way to have powerful database branching for any database, including RDS and RDS Aurora. But not only RDS: any Postgres and Postgres-compatible database is supported as a source for DLE.

Now, for a fixed price (paying just for one EC2 instance and an EBS volume), you can have dozens of DB clones being provisioned in seconds and delivering independent databases for your Git branches, CI/CD pipelines, as well as manual optimization and testing activities.

Achieving the lowest entry barrier for the new DLE users remains to be one of our primary goals. In addition to the tutorials (we have several, including one for the RDS users) and Terraform module template for DLE, we now offer a way to install DLE on AWS only using a web browser.

What's included:

DLE (same as Community Edition; the latest minor versions of DLE are available)
Automated data provisioning at the logical level from live Postgres database (can be any Postgres, version 9.6 or newer)
Automated refresh using two or more snapshots, using one EBS volume (of a bigger size), with configurable full refresh schedule (read about DLE's automated full refresh on schedule in the docs; note in AWS Marketplace version, we use a single ZFS pool but 2 (or more, if you request it) datasets to enable work with multiple full snapshots of the database – therefore, during a full refresh, users can still work with multiple clones created for the snapshot that is not being refreshed at the moment.)
Full refresh schedule can be defined when DLE instance is created
UI, CLI, and API to work with any number of thin clones – as usual – Optional generation of certificates and access to UI and API via HTTPS (via Envoy proxy), as well as access to the Postgres clones created by DLE users
Guaranteed support via one of the available channels

What's not yet included in this "preview" version:

Physical mode (for those who manage Postgres themselves)
Many advanced DLE configuration options are not available in AWS Marketplace / CloudFormation interface; however, they can still be adjusted once the instance is created (most of them can be changed without DLE restart – see the docs)

To start, please read the documentation: "How to install DLE from the AWS Marketplace". Below you can watch a 3-minute video demonstrating the setup process.

If you have any questions, use this page to contact us: https://postgres.ai/contact.

Nikolay Samokhvalov

CEO & Founder of PostgresAI

Working on tools to balance Dev with Ops in DevOps

DBLab Engine 4.0

Instant database branching with O(1) economics.

Common DB schema change mistakes

May 25, 2022 · 26 min read

Nikolay Samokhvalov

Stepping on a 'rake db:migrate'

'rake db:migrate' – a command known to any Ruby developer. But how many times did we step on the same rake?

In his article "Lesser Known PostgreSQL Features", @be_haki describes 18 Postgres features many people don't know. I enjoyed that article, and it inspired me to write about "anti-features" – things that everyone should avoid when working in probably the riskiest field of application development – so-called "schema migrations".

This is one of my favorite topics in the field of relational databases. We all remember how MongoDB entered the stage with two clear messages: "web-scale" (let's have sharding out-of-the-box) and "schemaless" (let's avoid designing schemas and allow full flexibility). In my opinion, both buzzwords are an oversimplification, but if you have experience in reviewing and deploying schema changes in relational databases, you probably understand the level of difficulty, risks, and pain of scaling the process of making schema changes. My personal score: 1000+ migrations designed/reviewed/deployed during 17+ years of using Postgres in my own companies and when consulting others such as GitLab, Chewy, Miro. Here I'm going to share what I've learned, describing some mistakes I've made or observed – so probably next time you'll avoid them.

Moreover, a strong desire to help people avoid such mistakes led me to invent the Database Lab Engine – a technology for thin cloning of databases, essential for development and testing. With it, you can clone a 10 TiB database in 10 seconds, test schema changes, and understand the risks before deployment. Most cases discussed in this article can be easily detected by such testing, and it can be done automatically in CI/CD pipelines.

As usual, I'll be focusing on OLTP use cases (mobile and web apps), for which query execution that exceeds 1 second is normally considered too slow. Some cases discussed here are hard to notice in small databases with low activity. But I'm pretty confident that you'll encounter most of them when your database grows to ~10 TiB in size and its load reaches ~10⁵–10⁶ transactions per second (of course, some cases will be seen – unless deliberately prevented – much, much earlier).

I advise you to read GitLab's great documentation – their Migration Style Guide is full of wisdom written by those who have experience in deploying numerous Postgres schema changes in a fully automated fashion to a huge number of instances, including GitLab.com itself.

I also encourage everyone to watch PGCon-2022 – one of the key Postgres conferences; this time, it's happening online again. On Thursday, May 26, I'll give two talks, and one of them is called "Common DB schema change mistakes", you can find the slide deck here. If you missed it, no worries – @DLangille, who has organized the conference since 2006 (thank you, Dan!), promises to publish talk videos in a few weeks.

DLE 3.1: pgBackRest, timezones for CLI, DLE community

April 5, 2022 · 3 min read

Nikolay Samokhvalov

Database Lab Engine 3.1 by Postgres.ai: pgBackRest, timezones for CLI, DLE community, more

The Postgres.ai team is happy to announce the release of version 3.1 of Database Lab Engine (DLE), the most advanced open-source software ever released that empowers development, testing, and troubleshooting environments for fast-growing projects. The use of Database Lab Engine 3.1 provides a competitive advantage to companies via implementing the "Shift-left testing" approach in software development.

Database Lab Engine is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database provisioned in seconds without extra costs.

In this release, the development team has also focused on the Database Lab Engine community, making it easier to get help or contribute. The team greets all new contributors: @Nikolay Devxx, @asotolongo, @Tanya301, @denis-boost, @pietervincken, @ane4ka

caution

Action required to migrate from a previous version. If you are running DLE 3.0 or older, to upgrade to DLE 3.1, please read the Migration notes.

In DLE 3.1:

Native support for pgBackRest as a tool to restore data from archives (physical mode, including continuously updated state), in addition to the existing support of WAL-G
Allow configuring timezone in DLE CLI configuration to improve the experience of using DLE in CI/CD pipelines
Improved README.md, translated to four languages, added CONTRIBUTING.md, SECURITY.md, and CODE_OF_CONDUCT.md
Many improvements in the engine and UI to improve work both in logical and physical modes

Community news:

🌠 DLE repository on GitHub now has 1,100+ stars; many thanks to everyone who supports the project in any way
💥 Pieter Vincken has published a blog post describing their experience of using DLE: "Testing with production data made easy"
📈 The Twitter account has reached 400 followers – please follow @Database_Lab
🎉 DLE now has 15 contributors. More contributions are welcome! See "good first issues"
🥇 Please consider various ways to contribute – read CONTRIBUTING.md

Request for feedback and contributions

Feedback and contributions would be greatly appreciated:

Like Database Lab? Give us a GitHub star: https://github.com/postgres-ai/database-lab.

Nikolay Samokhvalov

CEO & Founder of PostgresAI

Working on tools to balance Dev with Ops in DevOps

DBLab Engine 4.0

Instant database branching with O(1) economics.

Progress bar for Postgres queries – let's dive deeper

January 14, 2022 · 12 min read

Nikolay Samokhvalov

Progress bar for Postgres queries – let's dive deeper

Recently, I have read a nice post titled "Query Progress Bar", by Brian Davis. It describes an interesting approach to observing the progress of slow query execution.

At some point, the author mentions:

Don't use this in prod.

And I agree. The article discusses long-running queries such as SELECTs, UPDATEs, DELETEs, and quite "invasive" methods of progress monitoring. In an OLTP production scenario, in most cases, we should try to limit the duration of such queries, setting statement_timeout to a very low value – such as 30 or even 15 seconds.

Let's dive deeper into the topic of query progress monitoring, and discuss various types of queries, how to monitor their progress, considering production and non-production environments separately.

EXPLAIN (ANALYZE) needs BUFFERS to improve the Postgres query optimization process

January 6, 2022 · 22 min read

Nikolay Samokhvalov

Jupiter's moon IO. Credit: ALMA (ESO/NAOJ/NRAO), I. de Pater et al.; NRAO/AUI NSF, S. Dagnello; NASA/JPL/Space Science Institute

SQL query optimization is challenging for those who have just started working with PostgreSQL. There are many objective reasons for this, such as:

the difficulty of the field of system performance in general,
lack of good "playground" environments where people can experience how databases work at a larger scale,
lack of certain capabilities in Postgres observability tools that are still developing (though, at a good pace),
insufficiency of good educational materials.

All these barriers are reasonable. They limit the number of engineers possessing well-developed Postgres query optimization skills. However, there is a specific artificial barrier that is rather influential and which is relatively easy to eliminate.

Here it is: the EXPLAIN command has the BUFFERS option disabled by default. I am sure it has to be enabled and used by everyone who needs to do some SQL optimization work.

DLE 3.0: UI, persistent clones, PostgreSQL 14, more

December 21, 2021 · 7 min read

Nikolay Samokhvalov

Database Lab Engine 3.0 by Postgres.ai: UI, persistent clones, PostgreSQL 14, more

note

Action required to migrate from a previous version. If you are running DLE 2.5 or older, please read carefully and follow the Migration notes.

What's new in DLE 3.0?

The Postgres.ai team is happy to announce the release of version 3.0 of Database Lab Engine (DLE), the most advanced open-source software ever released that empowers development, testing, and troubleshooting environments for fast-growing projects. The use of Database Lab Engine 3.0 provides a competitive advantage to companies via implementing the "Shift-left testing" approach in software development.

Database Lab Engine is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their very own database provisioned in seconds without extra costs.

Among major changes in DLE 3.0:

UI included to the core, it allows working with a single DLE instance,
persistent clones: clones now survive DLE (or VM) restart,
for the "logical" data provisioning mode: the ability to switch reset clone's state using a snapshot from different pool/dataset,
better logging and configuration simplicity,
improvements for the cases when multiple DLEs are running on a single machine,
PostgreSQL 14 support.

Starting with version 3.0.0, DLE collects non-personally identifiable telemetry data. This feature is enabled by default but can be switched off. Read more in the DLE documentation. Keeping telemetry enabled can be considered your contribution to the DLE development because it helps make decisions down the road of the open-source product development.

Further, we discuss the most requested changes that were implemented in DLE 3.0 – all of them were created based on real-life user experience and invaluable feedback from the growing community of users and contributors.

Three cases against IF NOT EXISTS / IF EXISTS in Postgres DDL

November 3, 2021 · 8 min read

Nikolay Samokhvalov

EXISTS OR NOT EXISTS (meme)

What is this about?

Many DDL statements in PostgreSQL support modifiers IF EXISTS / IF NOT EXISTS. For example:

test=# create table if not exists mytable();
CREATE TABLE
test=# drop table if exists mytable;
DROP TABLE

I recommend using IF EXISTS / IF NOT EXISTS in DDL only when necessary. Here are three examples that demonstrate how the overuse of these words may lead to negative consequences.

10 Postgres tips for beginners

Test environments that are 10x cheaper and 10x faster than RDS clones

Dante Cassanego

DBLab Engine 4.0

DLE 3.2: config and logs in UI, Postgres 15, AWS Marketplace version is GA

Improved configuration and troubleshooting

More flexibility for schema and data patching

Demo DLE

DLE in AWS Marketplace is GA

Further reading

Request for feedback and contributions

Nikolay Samokhvalov

DBLab Engine 4.0

Database Lab Engine for AWS Marketplace. Fast, fixed-cost branching for your Postgres is just a step away

Nikolay Samokhvalov

DBLab Engine 4.0

Common DB schema change mistakes

DLE 3.1: pgBackRest, timezones for CLI, DLE community

Further reading

Request for feedback and contributions

Nikolay Samokhvalov

DBLab Engine 4.0

Progress bar for Postgres queries – let's dive deeper

EXPLAIN (ANALYZE) needs BUFFERS to improve the Postgres query optimization process

DLE 3.0: UI, persistent clones, PostgreSQL 14, more

What's new in DLE 3.0?

Three cases against IF NOT EXISTS / IF EXISTS in Postgres DDL

What is this about?

Dante Cassanego

DBLab Engine 4.0

Improved configuration and troubleshooting​

More flexibility for schema and data patching​

Demo DLE​

DLE in AWS Marketplace is GA​

Further reading​

Request for feedback and contributions​

Nikolay Samokhvalov

DBLab Engine 4.0

Nikolay Samokhvalov

DBLab Engine 4.0

Further reading​

Request for feedback and contributions​

Nikolay Samokhvalov

DBLab Engine 4.0

What's new in DLE 3.0?

What is this about?​

Improved configuration and troubleshooting

More flexibility for schema and data patching

Demo DLE

DLE in AWS Marketplace is GA

Further reading

Request for feedback and contributions

Further reading

Request for feedback and contributions

What is this about?