31.2. Removing an Existing Datanode

Assume a two coordinator cluster, COORD_1 and COORD_2 with three datanodes DATA_NODE_1, DATA_NODE_2 and DATA_NODE_3. Suppose we want to remove DATA_NODE_3 for any reason. Further assume there is a table named rr_tab_foo distributed in round robin fashion and has rows on all the three datanodes. Following steps should be performed to remove an existing datanode from a running cluster:

  1. Transfer the data from the datanode to be removed to the rest of the datanodes for all the tables in all the databases. For example to shift data of the table rr_tab_foo to the rest of the nodes we can use command:

              ALTER TABLE rr_tab_foo DELETE NODE (DATA_NODE_3);
            
  2. Confirm that there is no data left on the datanode to be removed. For example to confirm that there is no data left on DATA_NODE_3, we can use the following command, it should return ZERO rows. In case of non-zero rows it returns the OIDs of the relations whose data still exists on DATA_NODE_3.

              SELECT c.pcrelid FROM pgxc_class c, pgxc_node n WHERE n.node_name = 'DATA_NODE_3' AND n.oid = ANY (c.nodeoids);
            
  3. Stop the datanode server to be removed. Now any SELECTs or DMLs that involve the datanode to be removed would start failing.

  4. Connect to any of the coordinators. In our example assuming COORD_1 is running on port 5432, the following command would connect to COORD_1.

              psql postgres -p 5432
            
  5. Drop the datanode to be removed. For example to drop datanode DATA_NODE_3 use command.

              DROP NODE DATA_NODE_3;
            
  6. Update the connection information cached in pooler.

              SELECT pgxc_pool_reload();
            
  7. Repeat the above three steps (4,5,6) for all the coordinators in the cluster. In our example we would need to repeat the above steps by connecting to COORD_2.

DATA_NODE_3 is now removed from the cluster.