Explaining and Analysing Postgres EXPLAIN ANALYSE

Develop intuition on how SQL queries are run!

Introduction

SQL is a declarative query language. That means you tell the database what you want; you do not tell it how to get it.

This blog aims to give an intuition about how a query is internally run (at a very high level). Understanding this will enable you to structure better queries, and to debug queries that take unexpectedly long to execute.

(PS: You can skip to the section Time to dig into a lot of examples! if you are already familiar with reading query plans)

In a typical database system like PostgreSQL, a query goes through four stages:

  1. Parsing: The system checks the query for syntax errors and ensures it is correctly structured.

  2. Transformation: The query is converted into an internal representation for optimization and execution. This involves tasks like alias resolution and object existence checks.

  3. Planning: An execution plan is created, determining the most efficient way to retrieve data based on various factors.

  4. Execution: The system executes the plan generated in the previous step.

This blog will go over the following aspects:

  1. What a Query Plan looks like

  2. An overview of the "Planning" step using EXPLAIN ANALYSE

  3. An understanding of the kinds of decisions the query planner makes to generate an optimised execution plan. Since this isn't a topic that can be exhaustively summed up in a blog, we will go over a bunch of examples to develop an intuition about the kinds of factors that contribute to a query plan.

This blog will NOT go over:

  1. How the query planner comes up with the optimised plan

  2. How the generated plan is actually executed

Note: All examples mentioned ahead are based on PostgreSQL 14.5, and there may be minor differences in other database systems or versions. However, the high-level concepts are consistent.

With that said, let's jump right into it!

Hey Akshat, what is a Query Plan?

A query plan is essentially a step-by-step, detailed "execution plan" that the database system generates for every query it executes.

Let's see an example. I have a table table1 with 1000 rows:

postgres@localhost:akjn> select count(*) from table1;
+-------+
| count |
|-------|
| 1000  |
+-------+

Let's try to find out "what" it did behind the scenes to give us this output.

EXPLAIN and ANALYSE

Before we begin to understand the query plan of specific queries, we need to understand EXPLAIN and EXPLAIN ANALYSE.

In simple terms, EXPLAIN gives an execution plan by estimating the "cost", whereas EXPLAIN ANALYSE gives you the execution plan by actually executing the query.

Let's use these for our simple query that counts the number of rows in table1:

EXPLAIN gives:

postgres@localhost:akjn> explain select count(*) from table1;
+----------------------------------------------------------------+
| QUERY PLAN                                                     |
|----------------------------------------------------------------|
| Aggregate  (cost=17.50..17.51 rows=1 width=8)                  |
|   ->  Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=0) |
+----------------------------------------------------------------+

Whereas EXPLAIN ANALYSE gives:

postgres@localhost:akjn> explain analyse select count(*) from table1;
+-------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                  |
|-------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=17.50..17.51 rows=1 width=8) (actual time=0.188..0.189 rows=1 loops=1)                     |
|   ->  Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=0) (actual time=0.010..0.110 rows=1000 loops=1) |
| Planning Time: 0.028 ms                                                                                     |
| Execution Time: 0.203 ms                                                                                    |
+-------------------------------------------------------------------------------------------------------------+

Let's ignore the complicated terms in the above outputs for a bit, and focus on the differences. On a quick look, the output of EXPLAIN ANALYSE has this additional info:

  1. A second set of parentheses containing some "actual" values (like (actual time=0.188..0.189 rows=1 loops=1) and (actual time=0.010..0.110 rows=1000 loops=1))

  2. Planning time

  3. Execution time

This is because EXPLAIN only gives an "estimate" of the cost. To get the actual timings, the query needs to actually be run, which is what EXPLAIN ANALYSE does.

Hey Akshat, how to read the complicated EXPLAIN ANALYSE query plan?

Let's analyse the query plan we were discussing in the previous section:

postgres@localhost:akjn> explain analyse select count(*) from table1;
+-------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                  |
|-------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=17.50..17.51 rows=1 width=8) (actual time=0.188..0.189 rows=1 loops=1)                     |
|   ->  Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=0) (actual time=0.010..0.110 rows=1000 loops=1) |
| Planning Time: 0.028 ms                                                                                     |
| Execution Time: 0.203 ms                                                                                    |
+-------------------------------------------------------------------------------------------------------------+

A query plan is a tree structure of "Plan Nodes", with one root node; every other node is indicated by a -> marker. So, for the above query plan, we have the following nodes:

  1. Aggregate (also the Root Node)

  2. Seq Scan

Question 1: How to read this? / How does Postgres execute this?

Answer: From inside out.

That is, for the above example, in very simple terms, it will try to do Seq Scan on table1 first, get the output, and give that output to the parent node (that is, Aggregate) for computation.
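To make this "inside out" execution concrete, here is a toy sketch in Python of the two-node plan above, where each node pulls rows from its child and hands results up to its parent. This is purely illustrative; it is NOT how Postgres's executor is actually implemented.

```python
# Toy model of a plan tree executed "inside out": the leaf node produces
# rows, and the root node consumes its child's output.

class SeqScan:
    def __init__(self, rows):
        self.rows = rows

    def execute(self):
        # Leaf node: yields every row of the (toy) table.
        return list(self.rows)

class Aggregate:
    def __init__(self, child):
        self.child = child

    def execute(self):
        # Root node: asks its child for rows first, then aggregates.
        return len(self.child.execute())  # count(*)

# Mirrors the plan: Aggregate -> Seq Scan on table1 (1000 rows)
plan = Aggregate(SeqScan(range(1000)))
print(plan.execute())  # 1000
```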

We will look at more complex examples later, which should give more intuition on this. Let's focus on the other parts of the query plan:

  1. Planning Time is the time taken to generate the query plan.

  2. Execution Time is the time taken to execute the query plan.

  3. Some weird-looking parentheses.

Question 2: What are the weird-looking parentheses?

We have some weird-looking parentheses of the format (cost=0.00..15.00 rows=1000 width=0). Let's dig into the 3 parts of the parentheses:

  1. cost: This has 2 components - Startup Cost and Total Cost. For cost=0.00..15.00, 0.00 is the startup cost and 15.00 is the total cost.

    1. Startup Cost: This is an estimate of the cost incurred before the node can return its first row. For a node that must consume all of its input first (like our Aggregate), this includes the full cost of its children.

    2. Total Cost: This is an estimate of the cost to run the node to completion, including the cost of all its child nodes.

  2. rows: This is the estimated average number of rows returned by this node to the parent node.

  3. width: This is the estimated average width (in bytes) of rows returned by this node.

Question 3: What is the unit of cost?

The costs are in arbitrary units, defined relative to a set of planner cost parameters. I'm highlighting some of them here for better intuition:

postgres@localhost:akjn> select name, setting, short_desc from pg_settings where name in ('seq_page_cost', 'cpu_tuple_cost', 'cpu_operator_cost');
+-------------------+---------+---------------------------------------------------------------------------------------+
| name              | setting | short_desc                                                                            |
|-------------------+---------+---------------------------------------------------------------------------------------|
| seq_page_cost     | 1       | Sets the planner's estimate of the cost of a sequentially fetched disk page.          |
| cpu_tuple_cost    | 0.01    | Sets the planner's estimate of the cost of processing each tuple (row).               |
| cpu_operator_cost | 0.0025  | Sets the planner's estimate of the cost of processing each operator or function call. |
+-------------------+---------+---------------------------------------------------------------------------------------+

So if we are fetching 100 pages sequentially, that contributes 100*1 = 100 cost units. If we are processing 100 rows, that contributes 100*0.01 = 1 cost unit.
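In fact, with one assumption, we can reproduce the exact numbers from our earlier plan by hand. The assumption (mine, not visible in the plan output) is that table1's 1000 rows occupy 5 disk pages, which is consistent with the numbers:

```python
# Reproducing the planner's estimates by hand for:
#   Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=0)
#   Aggregate           (cost=17.50..17.51 rows=1 width=8)
# Assumption: the 1000-row table occupies 5 disk pages.
seq_page_cost = 1.0
cpu_tuple_cost = 0.01
cpu_operator_cost = 0.0025

pages, rows = 5, 1000

# Seq Scan: read every page sequentially + process every tuple.
seq_scan_total = pages * seq_page_cost + rows * cpu_tuple_cost
print(seq_scan_total)  # 15.0

# Aggregate: it must consume ALL of its child's output before emitting
# its first row, so its startup cost is the child's total cost plus one
# operator call (the count) per input row.
agg_startup = seq_scan_total + rows * cpu_operator_cost
# The total cost adds the processing of the single output row.
agg_total = agg_startup + 1 * cpu_tuple_cost
print(round(agg_startup, 2), round(agg_total, 2))  # 17.5 17.51
```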

Let's see if we can understand the query plan now!

postgres@localhost:akjn> explain analyse select count(*) from table1;
+-------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                  |
|-------------------------------------------------------------------------------------------------------------|
| Aggregate  (cost=17.50..17.51 rows=1 width=8) (actual time=0.188..0.189 rows=1 loops=1)                     |
|   ->  Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=0) (actual time=0.010..0.110 rows=1000 loops=1) |
| Planning Time: 0.028 ms                                                                                     |
| Execution Time: 0.203 ms                                                                                    |
+-------------------------------------------------------------------------------------------------------------+

We wanted to find the count of rows in table1. For that, it first did a Sequential Scan on table1, which has no startup cost (0.00) and is estimated to take 15.00 cost units to return 1000 rows to its parent node. The parent Aggregate node has an estimated startup cost of 17.50, an estimated total cost of 17.51, and is estimated to return 1 row.

In terms of actual time, the Sequential Scan on table1 returned its first row at 0.010 ms and its last row at 0.110 ms. The Aggregate node returned its single output row between 0.188 ms and 0.189 ms.

Time to dig into a lot of examples!

Example 1: Selecting everything from table having 1000 rows vs 10000000 rows

-- table1 has 1000 rows
postgres@localhost:akjn> explain analyse select * from table1;
+-------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                            |
|-------------------------------------------------------------------------------------------------------|
| Seq Scan on table1  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.008..0.107 rows=1000 loops=1) |
| Planning Time: 0.042 ms                                                                               |
| Execution Time: 0.179 ms                                                                              |
+-------------------------------------------------------------------------------------------------------+

-- Insert 10000000 rows
postgres@localhost:akjn> delete from table1; insert into table1 (select * from generate_series(1, 10000000));

-- table1 has 10000000 rows
postgres@localhost:akjn> explain analyse select * from table1;
+---------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                          |
|---------------------------------------------------------------------------------------------------------------------|
| Seq Scan on table1  (cost=0.00..132759.00 rows=8850600 width=4) (actual time=0.015..1126.625 rows=10000000 loops=1) |
| Planning Time: 0.023 ms                                                                                             |
| Execution Time: 1668.256 ms                                                                                         |
+---------------------------------------------------------------------------------------------------------------------+

Since we need to get all rows, there's no option other than sequentially scanning all rows in both cases and returning them as output.

Let's make things more interesting now.

Example 2: Selecting a particular row from table having 1000 rows vs 10000000 rows

 -- table1 has 1000 rows 
postgres@localhost:akjn> explain analyse select * from table1 where col1 = 50;
+-------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                      |
|-------------------------------------------------------------------------------------------------|
| Seq Scan on table1  (cost=0.00..26.50 rows=1 width=4) (actual time=0.020..0.199 rows=1 loops=1) |
|   Filter: (col1 = 50)                                                                           |
|   Rows Removed by Filter: 999                                                                   |
| Planning Time: 0.054 ms                                                                         |
| Execution Time: 0.208 ms                                                                        |
+-------------------------------------------------------------------------------------------------+

 -- Insert 10000000 rows
postgres@localhost:akjn> delete from table1; insert into table1 (select * from generate_series(1, 10000000));

-- table1 has 10000000 rows
postgres@localhost:akjn> explain analyse select * from table1 where col1 = 50;
+-----------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                            |
|-----------------------------------------------------------------------------------------------------------------------|
| Gather  (cost=1000.00..61721.76 rows=1 width=4) (actual time=25.289..437.156 rows=1 loops=1)                          |
|   Workers Planned: 2                                                                                                  |
|   Workers Launched: 2                                                                                                 |
|   ->  Parallel Seq Scan on table1  (cost=0.00..60721.66 rows=1 width=4) (actual time=289.218..425.408 rows=0 loops=3) |
|         Filter: (col1 = 50)                                                                                           |
|         Rows Removed by Filter: 3333333                                                                               |
| Planning Time: 0.042 ms                                                                                               |
| Execution Time: 437.171 ms                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------+

See the difference? When the table had a small amount of data (1000 rows), it decided that the optimal query plan was to just do a sequential scan over all the rows. But when the table had a large amount of data (10000000 rows), it decided that the optimal query plan was to spawn 2 worker processes, and parallelise the work across the 2 workers plus the leader process.

Observations:

  1. Each of the three processes (2 workers + the leader) ran the Parallel Seq Scan node, which can be seen in loops=3 (indicating that this node was run 3 times in total).

  2. Rows Removed by Filter for the Parallel Seq Scan node is reported as 3333333, the per-loop average, which is one-third of the total number of rows. So each process filtered out roughly one-third of the rows (which is almost all of the rows each process scanned, which makes sense!)
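One subtlety worth calling out: the per-node numbers in a parallel plan are averaged over loops. We can sanity-check the reported Rows Removed by Filter with a bit of arithmetic (a back-of-the-envelope sketch, nothing Postgres-specific):

```python
# In a parallel plan, per-node numbers reported by EXPLAIN ANALYSE are
# averages over `loops`. Here loops=3 (2 workers + the leader), and only
# one row (col1 = 50) survives the filter out of 10 million.
total_rows = 10_000_000
loops = 3
matched = 1
removed_per_loop = (total_rows - matched) // loops
print(removed_per_loop)  # 3333333
```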

You can read more about parallel queries at https://www.postgresql.org/docs/current/how-parallel-query-works.html.

Example 3: Having a unique index on the column you are selecting

-- Add a UNIQUE index on col1
postgres@localhost:akjn> alter table table1 add constraint col1_unique unique(col1);

 -- table1 has 1000 rows 
postgres@localhost:akjn> explain analyse select * from table1 where col1 = 50;
+--------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                         |
|--------------------------------------------------------------------------------------------------------------------|
| Bitmap Heap Scan on table1  (cost=4.28..8.30 rows=1 width=4) (actual time=0.011..0.012 rows=1 loops=1)             |
|   Recheck Cond: (col1 = 50)                                                                                        |
|   Heap Blocks: exact=1                                                                                             |
|   ->  Bitmap Index Scan on col1_unique  (cost=0.00..4.28 rows=1 width=0) (actual time=0.007..0.008 rows=1 loops=1) |
|         Index Cond: (col1 = 50)                                                                                    |
| Planning Time: 0.051 ms                                                                                            |
| Execution Time: 0.027 ms                                                                                           |
+--------------------------------------------------------------------------------------------------------------------+

-- If we wait for a while
postgres@localhost:akjn> explain analyse select * from table1 where col1 = 50;
+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                              |
|-------------------------------------------------------------------------------------------------------------------------|
| Index Only Scan using col1_unique on table1  (cost=0.28..4.29 rows=1 width=4) (actual time=0.010..0.011 rows=1 loops=1) |
|   Index Cond: (col1 = 50)                                                                                               |
|   Heap Fetches: 0                                                                                                       |
| Planning Time: 0.047 ms                                                                                                 |
| Execution Time: 0.024 ms                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------+

This is interesting. Running the same query gives us "Bitmap Index Scan + Bitmap Heap Scan" query plan initially (immediately after index creation), but gives us "Index Only Scan" after some time.

As we discussed earlier, these are "estimates", based on the current state of the statistics that the database system has for a given table/database. The system needs some time to update these statistics (this typically happens in the background, for example via autovacuum). Once the statistics are updated, it realises that doing an Index Only Scan is more efficient, and it switches to doing that.

Example 4: Having a unique index and querying for more than one value of the column

postgres@localhost:akjn> explain analyse select * from table1 where col1 = 1 or col1 = 2;
+--------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                               |
|--------------------------------------------------------------------------------------------------------------------------|
| Bitmap Heap Scan on table1  (cost=8.87..16.86 rows=2 width=4) (actual time=0.014..0.015 rows=2 loops=1)                  |
|   Recheck Cond: ((col1 = 1) OR (col1 = 2))                                                                               |
|   Heap Blocks: exact=1                                                                                                   |
|   ->  BitmapOr  (cost=8.87..8.87 rows=2 width=0) (actual time=0.011..0.011 rows=0 loops=1)                               |
|         ->  Bitmap Index Scan on col1_unique  (cost=0.00..4.43 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1) |
|               Index Cond: (col1 = 1)                                                                                     |
|         ->  Bitmap Index Scan on col1_unique  (cost=0.00..4.43 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1) |
|               Index Cond: (col1 = 2)                                                                                     |
| Planning Time: 0.050 ms                                                                                                  |
| Execution Time: 0.030 ms                                                                                                 |
+--------------------------------------------------------------------------------------------------------------------------+

When we are querying for multiple values, it resorts to doing a "Bitmap Index Scan" for each of the individual conditions, and then combines the two results with a "BitmapOr".
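As a rough mental model (a Python sketch, not how Postgres actually stores bitmaps), the BitmapOr step unions the match locations from the two index scans, so the heap only needs to be visited once:

```python
# Toy sketch of what BitmapOr conceptually does. Each Bitmap Index Scan
# produces a "bitmap" of matching row locations; BitmapOr unions them,
# and a single Bitmap Heap Scan then visits the table once using the
# combined bitmap. The row positions below are hypothetical; real bitmaps
# track heap pages/offsets, not Python sets.
bitmap_col1_eq_1 = {0}  # hypothetical position of the row where col1 = 1
bitmap_col1_eq_2 = {1}  # hypothetical position of the row where col1 = 2
combined = bitmap_col1_eq_1 | bitmap_col1_eq_2  # the BitmapOr step
print(sorted(combined))  # [0, 1]
```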

However, if we write it as a range query, it again uses an "Index Only Scan":

postgres@localhost:akjn> explain analyse select * from table1 where col1 >= 1 or col1 <=  2;
+------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                               |
|------------------------------------------------------------------------------------------------------------------------------------------|
| Index Only Scan using col1_unique on table1  (cost=0.42..30750.28 rows=988043 width=4) (actual time=0.009..113.048 rows=1000000 loops=1) |
|   Filter: ((col1 >= 1) OR (col1 <= 2))                                                                                                   |
|   Heap Fetches: 0                                                                                                                        |
| Planning Time: 0.128 ms                                                                                                                  |
| Execution Time: 169.281 ms                                                                                                               |
+------------------------------------------------------------------------------------------------------------------------------------------+

Note that the two queries are not actually equivalent: for any (non-null) integer value of col1, (col1 >= 1 OR col1 <= 2) is true, which is why the scan returns 1000000 rows. The database system operates solely on the predicates we give it and the statistical information it has about the data. Therefore, it's not surprising to observe a difference.

Example 5: Join operations

Let's say we have 2 tables:

  • table1 having an integer column col1

  • table2 having an integer column col2

-- Create the two tables and enter 1000 records (1 to 1000) in both tables
create table table1 (col1 int); create table table2 (col2 int);
insert into table1 (select * from generate_series(1, 1000));
insert into table2 (select * from generate_series(1, 1000));

Now if we try to do a join operation:

postgres@localhost:akjn> explain analyse select * from table1 join table2 on col1=col2;
+-------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                        |
|-------------------------------------------------------------------------------------------------------------------|
| Merge Join  (cost=359.57..860.00 rows=32512 width=8) (actual time=0.535..1.059 rows=1000 loops=1)                 |
|   Merge Cond: (table1.col1 = table2.col2)                                                                         |
|   ->  Sort  (cost=179.78..186.16 rows=2550 width=4) (actual time=0.283..0.370 rows=1000 loops=1)                  |
|         Sort Key: table1.col1                                                                                     |
|         Sort Method: quicksort  Memory: 71kB                                                                      |
|         ->  Seq Scan on table1  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.052..0.176 rows=1000 loops=1) |
|   ->  Sort  (cost=179.78..186.16 rows=2550 width=4) (actual time=0.249..0.323 rows=1000 loops=1)                  |
|         Sort Key: table2.col2                                                                                     |
|         Sort Method: quicksort  Memory: 71kB                                                                      |
|         ->  Seq Scan on table2  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.016..0.136 rows=1000 loops=1) |
| Planning Time: 0.055 ms                                                                                           |
| Execution Time: 1.142 ms                                                                                          |
+-------------------------------------------------------------------------------------------------------------------+

Now let's change the data in the two tables and try doing the same join operation:

-- table1 has (1 to 1000), table2 has (901 to 1000)
delete from table1; insert into table1 (select * from generate_series(1, 1000));
delete from table2; insert into table2 (select * from generate_series(901, 1000));

postgres@localhost:akjn> explain analyse select * from table1 join table2 on col1=col2;
+------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                       |
|------------------------------------------------------------------------------------------------------------------|
| Hash Join  (cost=27.50..71.25 rows=1000 width=8) (actual time=0.412..0.445 rows=100 loops=1)                     |
|   Hash Cond: (table1.col1 = table2.col2)                                                                         |
|   ->  Seq Scan on table1  (cost=0.00..27.00 rows=1800 width=4) (actual time=0.084..0.219 rows=1000 loops=1)      |
|   ->  Hash  (cost=15.00..15.00 rows=1000 width=4) (actual time=0.096..0.096 rows=100 loops=1)                    |
|         Buckets: 1024  Batches: 1  Memory Usage: 12kB                                                            |
|         ->  Seq Scan on table2  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.070..0.080 rows=100 loops=1) |
| Planning Time: 0.092 ms                                                                                          |
| Execution Time: 0.468 ms                                                                                         |
+------------------------------------------------------------------------------------------------------------------+

The difference is quite interesting. It chose a "Merge Join" when the two tables were the same size and fully overlapped, but a "Hash Join" when table2 was much smaller (100 rows): small enough to build an in-memory hash table on in a single batch, as shown by Buckets: 1024 Batches: 1 Memory Usage: 12kB.
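To build intuition for why the planner might pick one join strategy over the other, here are toy sketches of the two algorithms. These are illustrative only (real implementations handle duplicate keys, disk batching, cost-based sizing, and much more); the inputs here contain unique values, so a simple lockstep merge works.

```python
# Toy sketches of the two join strategies seen in the plans above.

def merge_join(t1, t2):
    # Sort both inputs, then advance two cursors in lockstep.
    a, b = sorted(t1), sorted(t2)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append((a[i], b[j]))
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def hash_join(t1, t2):
    # Build a hash table on the smaller input, then probe it with the
    # larger input. Cheap when one side fits comfortably in memory.
    small = set(t2)
    return [(x, x) for x in t1 if x in small]

t1 = list(range(1, 1001))    # table1: 1..1000
t2 = list(range(901, 1001))  # table2: 901..1000
assert merge_join(t1, t2) == hash_join(t1, t2)
print(len(hash_join(t1, t2)))  # 100
```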

Example 6: Sorting

postgres@localhost:akjn> explain analyse select * from table1 order by col1;
+-------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                  |
|-------------------------------------------------------------------------------------------------------------|
| Sort  (cost=68.83..71.33 rows=1000 width=4) (actual time=0.203..0.276 rows=1000 loops=1)                    |
|   Sort Key: col1                                                                                            |
|   Sort Method: quicksort  Memory: 71kB                                                                      |
|   ->  Seq Scan on table1  (cost=0.00..19.00 rows=1000 width=4) (actual time=0.009..0.100 rows=1000 loops=1) |
| Planning Time: 0.029 ms                                                                                     |
| Execution Time: 0.351 ms                                                                                    |
+-------------------------------------------------------------------------------------------------------------+

We can see that it used an in-memory quicksort (Sort Method: quicksort) for the sorting.

Example 7: Sorting and Filtering

postgres@localhost:akjn> explain analyse select * from table1 where col1 = 50 order by col1;
+-------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                      |
|-------------------------------------------------------------------------------------------------|
| Seq Scan on table1  (cost=0.00..21.50 rows=1 width=4) (actual time=0.014..0.078 rows=1 loops=1) |
|   Filter: (col1 = 50)                                                                           |
|   Rows Removed by Filter: 999                                                                   |
| Planning Time: 0.069 ms                                                                         |
| Execution Time: 0.087 ms                                                                        |
+-------------------------------------------------------------------------------------------------+

It realises that there's no need for sorting at all when the filter restricts the sort column to a single value!

However, as soon as we start querying for more than one value of col1, we can see the difference:

postgres@localhost:akjn> explain analyse select * from table1 where (col1 = 50 or col1 = 500) order by col1;
+-------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                            |
|-------------------------------------------------------------------------------------------------------|
| Sort  (cost=24.01..24.02 rows=2 width=4) (actual time=0.088..0.088 rows=2 loops=1)                    |
|   Sort Key: col1                                                                                      |
|   Sort Method: quicksort  Memory: 25kB                                                                |
|   ->  Seq Scan on table1  (cost=0.00..24.00 rows=2 width=4) (actual time=0.014..0.084 rows=2 loops=1) |
|         Filter: ((col1 = 50) OR (col1 = 500))                                                         |
|         Rows Removed by Filter: 998                                                                   |
| Planning Time: 0.040 ms                                                                               |
| Execution Time: 0.100 ms                                                                              |
+-------------------------------------------------------------------------------------------------------+

We can also observe that the rows are scanned first, and sorted later.

Concluding the examples

We can go on and on and on and on with the examples, but I'll stop. I hope the diverse set of examples we went through gave some intuition about the kind of factors that contribute to the execution plan of a query.

Would encourage you to try analysing query plans for the following:

  1. Correlated Subqueries: See if you can validate that I didn't lie in my previous blog!

  2. Sorting by multiple columns. Also try combining it with filtering.

  3. Try different types of joins. Try joins between more than 2 tables, and see if the data in the tables makes a difference to the order of the joins.

  4. Creating an index on column1 and querying for (column1, column2). Tinker around with indexes!

  5. Update queries. Delete queries. We didn't even touch those, so try combining them with everything else!

  6. etc etc etc etc. The list is never-ending!

Hey Akshat, when do I use EXPLAIN vs EXPLAIN ANALYSE?

As you might have noticed, we completely switched to using EXPLAIN ANALYSE for all the examples. Does that mean EXPLAIN isn't useful? Absolutely not.

There might be times when you just want the query plan estimate, without actually running the query. Examples could be UPDATE or DELETE queries, when you don't want to actually modify the data. You can use EXPLAIN in such cases.

One Final Observation

In case you didn't notice, an underlying theme of the examples was to demonstrate that the query plan adapts itself depending on the state of the database. This is incredibly useful as an application developer, since you do not have to change the application logic to handle 1000 rows vs 10000000 rows (for example). You can specify your query requirements irrespective of the database state, and rest assured that the database will come up with an optimal query plan.

Resource Recommendations

I want to take a moment to share some awesome resources I stumbled upon while writing this.

  1. https://postgrespro.com/blog/pgsql/

    1. Excellent blogs on Postgres with superb examples and crystal clear explanations!

    2. They even have a book with the PDF version available for free: https://postgrespro.com/blog/pgsql/5970159. I've already ordered mine, have you?

  2. Great podcast on Postgres Query Planner: SE-Radio Episode 328: Bruce Momjian on the Postgres Query Planner

    1. My favorite excerpt from the podcast: "Up until 12 tables, Postgres is going to try every possible way of returning the data, but it's going to prune it as it goes."

  3. There are tools available to visualise the query plan. I'd encourage you to get comfortable with the tree output syntax, but in case you want to check out the tools:

    1. http://tatiyants.com/pev/#/plans/new

    2. https://explain.dalibo.com/

Conclusion

That's all folks!

Hope this was an informative article, and you got some takeaways from it. Thanks for your time!

PS:

  1. I'd highly appreciate any feedback on this article - the good, the bad. How was the writing style? Was it easy to understand? Was it too long? Anything else?

  2. I'm aiming to write more blogs now. So, if you liked this one, make sure you follow me on Hashnode to tag along in the journey :)