Blog#153: Implementing Bulk Inserts with UNIQUE Constraints Using Node.js and PostgreSQL

The main goal of this article is to help you improve your English level. I will use Simple English to introduce concepts related to software development. In terms of IT knowledge, there may be better and clearer explanations elsewhere on the internet, but remember that the main target of this article is still to LEARN ENGLISH.


Hi, I'm Tuan, a Full-stack Web Developer from Tokyo 😊. Follow my blog to not miss out on useful and interesting articles in the future.

I had the opportunity to research how to do a bulk insert of multiple records in a Node.js + PostgreSQL environment, even when there is a UNIQUE constraint on the target table, so I'm going to summarize it in a blog post.

Using PostgreSQL with Node.js

First, if you want to use PostgreSQL from Node.js, I think the most commonly used npm library is pg (node-postgres).


Create a connection pool like this:

var PG = require( 'pg' );
var pg = new PG.Pool( { connectionString: 'postgres://localhost:5432/mydb' } );

This is an example of executing an SQL INSERT statement and getting the result (the code uses await, so it is assumed to run inside an async function):

if( pg ){
  var conn = await pg.connect();                                      // Get a connection from the pool
  if( conn ){
    try{
      var sql = "insert into table1( name, num ) values ( $1, $2 )";  // SQL
      var query = { text: sql, values: [ "yamaha", 100 ] };           // SQL parameters
      var result = await conn.query( query );                         // Execute the SQL query
      console.log( result );                                          // The result of executing the SQL query
    }catch( e ){
      console.log( e );                                               // SQL execution error or other exception
    }finally{
      if( conn ){
        conn.release();                                               // Release the connection back to the pool
      }
    }
  }
}

In this article, I will implement everything using this pg library.

Implementing bulk insert with Node.js and PostgreSQL

The example above inserts one record with one SQL execution. You could run it multiple times to insert multiple records, but when the number of records is large, it is more efficient to insert them all at once. This process of creating multiple records in a single operation is called a "bulk insert".

There are several ways to implement bulk insert with Node.js and PostgreSQL, but this time we will introduce an example using the npm library pg-format (node-pg-format).


When using it, prepare bulk insert data in an array and implement it as follows:

var format = require( 'pg-format' );

var records = [        
  [ "yamaha", 100 ],
  [ "suzuki", 101 ],
  [ "yamaha", 102 ]
];

if( pg ){
  var conn = await pg.connect();                                      // Get a connection from the pool
  if( conn ){
    try{
      var sql = format( "insert into table1( name, num ) values %L", records );  // Build the multi-row SQL
      var query = { text: sql, values: [] };                          // No parameters: the values are already embedded in the SQL
      var result = await conn.query( query );                         // Execute the SQL query
      console.log( result );                                          // The result of executing the SQL query
    }catch( e ){
      console.log( e );                                               // SQL execution error or other exception
    }finally{
      if( conn ){
        conn.release();                                               // Release the connection back to the pool
      }
    }
  }
}

Here we build the entire SQL statement with pg-format, embedding the array data as the multiple records to be inserted (so the values list passed to conn.query() is empty). This allows us to insert all of the records in the records array with one SQL execution.
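
For reference, the statement that pg-format builds from the records array is a multi-row VALUES insert, conceptually like the one below (the exact quoting of each value is handled by pg-format's %L placeholder):

insert into table1( name, num ) values
  ('yamaha', 100),
  ('suzuki', 101),
  ('yamaha', 102)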

Bulk inserting with Node.js and PostgreSQL when there is a UNIQUE constraint

Now we will move on to the main topic of this blog entry. We have already introduced a method of bulk insert. The problem is that it does not always work, specifically when the target table has a UNIQUE constraint and some of the records in the bulk insert violate this constraint.

Let me give a specific example. Above, I showed how to bulk insert multiple records into a table called table1, but suppose that table1 was created with the following definition:

create table table1( name varchar(50) default '', num int default 0 );
alter table table1 add unique( name, num );

The first line "create table..." is fine; the problem is the second line. This constraint specifies that the combination of name and num must be unique. It is okay for name alone to be repeated, and for num alone to be repeated, but the combination of name and num must be unique (data that violates this condition cannot be inserted).
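
For example, under this constraint the following sequence of single inserts (hypothetical data, just to illustrate the rule) would behave like this:

insert into table1( name, num ) values ( 'yamaha', 100 );   -- OK
insert into table1( name, num ) values ( 'yamaha', 200 );   -- OK: name repeats, but the ( name, num ) combination is new
insert into table1( name, num ) values ( 'honda', 100 );    -- OK: num repeats, but the ( name, num ) combination is new
insert into table1( name, num ) values ( 'yamaha', 100 );   -- Error: this ( name, num ) combination already exists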

Under the assumption that there is a UNIQUE constraint, let's look at the example again. These are the records we are trying to bulk insert in this example:

var records = [         // Prepare an array of records to insert.
  [ "yamaha", 100 ],
  [ "suzuki", 101 ],
  [ "yamaha", 102 ]
];

(The first element of each array is name and the second is num.) Looking at this example alone, every combination of name and num is different, so it seems the bulk insert can be done without any problems. But what if the data {name: "yamaha", num: 100} was already in the table before the bulk insert was executed? The first record of the bulk insert would then violate the UNIQUE constraint, which means the execution would result in an error.

In addition, there is a question that makes this problem more complicated: when a bulk insert fails because of a UNIQUE constraint violation, should the records that did not violate the constraint still be inserted? The answer is probably case by case, depending on the data and the application. If the records only have meaning when they are inserted together, then it is correct for the bulk insert to fail and for none of the records to be inserted. However, if only some of the records violate the UNIQUE constraint and the rest should still be inserted, then only that part of the bulk insert should fail and the rest should succeed.

The bulk insert implementation introduced above behaves like the former. In other words, the bulk insert only succeeds if none of the records violate the constraint, and if even one record violates it, none of the records are inserted and an error occurs (strictly speaking, this all-or-nothing behavior is what "bulk insert" means).

Bulk inserting with Node.js and PostgreSQL while ignoring UNIQUE constraint violations

After a long introduction, now that we understand the background, how can we implement a bulk insert so that records that violate the UNIQUE constraint are not inserted, but records that do not violate the constraint are inserted correctly? This is the key part of this blog entry.

If you are using PostgreSQL 9.5 or higher and Node.js + node-postgres, the following method can be used to achieve this, although it may vary depending on the RDB, library, and programming language you are actually using.

var records = [         // Prepare an array of records to insert.
  [ "yamaha", 100 ],
  [ "suzuki", 101 ],
  [ "yamaha", 102 ]
];

if( pg ){
  var conn = await pg.connect();                                      // Get a connection from the pool
  if( conn ){
    try{
      // Build an SQL statement that ignores data violating the UNIQUE constraint.
      // (Note: the values are concatenated directly into the SQL string here,
      //  so in real code they should be escaped, for example with pg-format.)
      var sql = "insert into table1( name, num ) ";
      var selects = [];
      for( var i = 0; i < records.length; i ++ ){
        selects.push( "select '" + records[i][0] + "', " + records[i][1] );
      }
      sql += selects.join( " union all " ) + " on conflict ( name, num ) do nothing";

      var query = { text: sql, values: [] };                          // No parameters
      var result = await conn.query( query );                         // Execute the SQL query
      console.log( result );                                          // The result of executing the SQL query
    }catch( e ){
      console.log( e );                                               // SQL execution error or other exception
    }finally{
      if( conn ){
        conn.release();                                               // Release the connection back to the pool
      }
    }
  }
}

In this example, we are using programming to generate and execute the following SQL statement:

insert into table1( name, num )
  select 'yamaha', 100 union all
  select 'suzuki', 101 union all
  select 'yamaha', 102
  on conflict ( name, num ) do nothing

This is not exactly a classic bulk insert, but it specifies that multiple records are inserted in one SQL execution and that if a "conflict" (a UNIQUE constraint violation) occurs, the conflicting record is simply ignored. As a result, only the records without UNIQUE constraint violations are inserted.
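
As a side note, the same "on conflict ... do nothing" clause can also be combined with the pg-format approach from the earlier example, which avoids building the "select ... union all" string by hand and lets pg-format take care of escaping the values. This is just a minimal sketch, assuming the same pg pool, conn connection, and records array as in the examples above:

var format = require( 'pg-format' );

// %L expands the records array into a multi-row values list
var sql = format( "insert into table1( name, num ) values %L on conflict ( name, num ) do nothing", records );
var result = await conn.query( { text: sql, values: [] } );   // Conflicting records are silently skipped
console.log( result );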

In my own tests, bulk insert worked as expected even when the number of records to be inserted at once became very large (in cases where the SQL statement itself becomes huge and complex). I was researching this because I was making an app that needs to do this kind of processing, so I hope this helps someone else.

And Finally

As always, I hope you enjoyed this article and learned something new. Thank you and see you in the next articles!

If you liked this article, please give me a like and subscribe to support me. Thank you. 😊



NGUYỄN ANH TUẤN

Hello, I'm Tuan, a software engineer working in Tokyo. This is my personal blog, where I share the knowledge and experience I gain as I develop myself. I hope the blog gives you some inspiration and motivation. Let's learn and grow together every day!
