Workflows fetching some rows, processing and updating them can be optimized by merging those multiple SQL statements into one.
Many maintenance operations are based on finding particular rows, processing them (e.g. sending an email) and finally applying changes to the database. Typically this is implemented as one query to find all rows and many subsequent queries to modify them. This workflow can sometimes be simplified by using the RETURNING feature to manipulate data and get the resulting rows in one step.
-- Not Supported :(
UPDATE tasks
SET owner = NULL
WHERE owner = 'tpetry' AND status = 'open'
RETURNING *;
DELETE FROM tasks
WHERE status = 'finished'
RETURNING *;
INSERT INTO tasks (
status, owner, name
) VALUES (
'open', 'tpetry', 'Create Example'
)
RETURNING *;
When doing operations on some data, most implementations accomplish it inefficiently: Some rows are fetched from the database and then processed in memory by e.g. calculating a new value, sending an e-mail or other tasks. And finally, those rows are updated within the database - in some cases by efficient bulk updates and sometimes by updating one row at a time. But the database has to do twice the work with these implementations. All rows are fetched by the selection and looked up again when updating them.
With PostgreSQL, you can use the
RETURNING
feature to change the behavior of data manipulation queries.
Instead of receiving just the number of affected rows, you get all the column's values.
To be more precise, you use the same syntax as in any
SELECT
part to restrict the columns, apply transformations etc.
The values you will get are the ones after the data manipulation of the query and all triggers have been executed.
Using that exciting feature, you can reduce the formerly stated workflow to a single query that will change the database and tell you the rows it changed.
You can change the workflow when you need to process some rows and update them when finished: You update all the rows to their final value and let PostgreSQL return them after the job is done. Now you do your processing on the returned data. Database efficiency has increased as you select and update the data in one operation. But remember that you only get the values after all updates have been executed. This approach won't work if you need some data which will be updated by the query.
Most applications have use cases to delete a chunk of data on user request but also need to do some processing on the values. Like with UPDATE queries, you can combine the DELETE and UPDATE query into one step. However, the limitation with update queries does not exist as you get the deleted data back precisely as it was before the statement.
At first glance, it may seem counter-intuitive to use this function for SELECT queries.
But the feature is convenient when you have an automatically generated primary key, generated columns or triggers modifying the inserted values.
You can insert new data and get the values back in one call.
When you combine this with
INSERT ... SELECT
queries, you have a powerful tool to do many operations directly within the database without proxying all the rows through your application.
SQL for Devs Founder
Be notified on future content. Never spam.