What is Postgres-XC?

Note: XCONLY: The following description applies only to Postgres-XC.

In short

Postgres-XC is an open source project to provide a write-scalable, synchronous, symmetric, and transparent PostgreSQL cluster solution. It is a collection of tightly coupled database components that can be installed on more than one physical or virtual machine.

Write-scalable means that Postgres-XC can be configured with as many database servers as you want and can handle many more writes (updating SQL statements) than a single database server can. Symmetric means that you can have more than one database server, all of which provide a single database view. Synchronous means that any database update from any database server is immediately visible to any other transaction running on a different master. Transparent means that you do not have to worry about how your data is stored internally across more than one database server. [1]

You can configure Postgres-XC to run on more than one machine. The servers store your data in a distributed way, that is, partitioned or replicated, as you choose for each table. [2] When you issue a query, Postgres-XC determines where the target data is stored and issues corresponding queries to the servers that hold it.
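The placement and routing idea can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions, not Postgres-XC's actual implementation: the node names, the `TABLES` catalog, the CRC32-based hash, and the functions `nodes_for_write` and `nodes_for_read` are all hypothetical and exist only for illustration.

```python
import zlib

DATANODES = ["dn1", "dn2", "dn3"]  # hypothetical node names

# Hypothetical per-table placement: "hash" spreads rows across nodes by a
# key column, "replicated" keeps a full copy of the table on every node.
TABLES = {
    "orders": {"mode": "hash", "key": "order_id"},
    "countries": {"mode": "replicated"},
}


def _node_for_key(value):
    """Map a key value to exactly one node (CRC32 is an arbitrary choice)."""
    return DATANODES[zlib.crc32(str(value).encode()) % len(DATANODES)]


def nodes_for_write(table, row):
    """Return the nodes that must receive a new row."""
    spec = TABLES[table]
    if spec["mode"] == "replicated":
        return list(DATANODES)  # every copy has to be updated
    return [_node_for_key(row[spec["key"]])]


def nodes_for_read(table, key_value=None):
    """Return the nodes a read has to visit."""
    spec = TABLES[table]
    if spec["mode"] == "replicated":
        return [DATANODES[0]]  # any single copy is enough
    if key_value is None:
        return list(DATANODES)  # no key given: every node may hold rows
    return [_node_for_key(key_value)]
```

In this model, a write to a replicated table goes to every node, while a read or write of a hash-distributed row that names its key touches exactly one node. That is the property that lets writes spread across servers.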

In typical web systems, you can have as many web servers or application servers as you need to handle your transactions. In general, however, you cannot do this for a database server, because all changing data has to be visible to all transactions. Unlike other database cluster solutions, Postgres-XC provides this capability. You can install as many database servers as you like. Each database server provides a uniform data view to your applications. Any database update from any server is immediately visible to applications connecting to the database through other servers. This feature is called "synchronous multi-master" capability, and it is the most significant feature of Postgres-XC.

Postgres-XC's Goal

The ultimate goal of Postgres-XC is to provide a synchronous multi-master PostgreSQL cluster with read/write scalability. That is, Postgres-XC should provide the following features:

Postgres-XC Key Components

This section describes the main components of Postgres-XC.

Postgres-XC is composed of three major components: the GTM (Global Transaction Manager), the Coordinator, and the Datanode. Their features are described in the following sections.

GTM (Global Transaction Manager)

The GTM is a key component of Postgres-XC that provides consistent transaction management and tuple visibility control.

As described later in this manual, PostgreSQL's transaction management is based on MVCC (Multi-Version Concurrency Control). Postgres-XC extracts this mechanism into a separate component, the GTM, so that transaction management in every Postgres-XC component is based on a single global status. Details are described in Chapter 45.
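To make the idea of a single global transaction status concrete, here is a toy sketch in Python. It is not the GTM's real protocol or data structures: the class `ToyGTM`, the snapshot layout (`xmax`, `running`), and the function `is_visible` are assumptions made for illustration. The model also ignores aborted transactions and a transaction's view of its own writes.

```python
class ToyGTM:
    """Hypothetical stand-in for a GTM: one place that issues IDs and snapshots."""

    def __init__(self):
        self.next_gxid = 1
        self.active = set()  # global IDs of transactions still running

    def begin(self):
        gxid = self.next_gxid
        self.next_gxid += 1
        self.active.add(gxid)
        return gxid

    def commit(self, gxid):
        self.active.discard(gxid)

    def snapshot(self):
        # xmax: first ID not yet assigned; running: IDs in progress right now.
        return {"xmax": self.next_gxid, "running": frozenset(self.active)}


def is_visible(writer_gxid, snap):
    """A row version is visible if its writer finished before the snapshot."""
    return writer_gxid < snap["xmax"] and writer_gxid not in snap["running"]


gtm = ToyGTM()
t1 = gtm.begin()
t2 = gtm.begin()
gtm.commit(t1)
snap = gtm.snapshot()   # taken while t2 is still running
t3 = gtm.begin()        # starts after the snapshot was taken
gtm.commit(t2)          # commits after the snapshot was taken
```

With this snapshot, rows written by `t1` are visible, while rows written by `t2` and `t3` are not, no matter which server evaluates `is_visible`. Because every server consults the same source for IDs and snapshots, they all reach the same visibility decision.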

Coordinator

The Coordinator is the interface to applications. It acts like a conventional PostgreSQL backend process. However, the Coordinator does not store any actual data; actual data is stored by the Datanodes, as described below. The Coordinator receives SQL statements, obtains a Global Transaction ID (GXID) and a Global Snapshot as needed, determines which Datanodes are involved, and asks them to execute the statement or a part of it. Each statement issued to a Datanode is associated with the GXID and the Global Snapshot, so that the Datanode is not confused if it receives another statement from another transaction originating at another Coordinator.
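The tagging step can be sketched as follows. This is a toy Python model, not the actual wire protocol: `ToyCoordinator`, `ToyDatanode`, the request dictionary, and the shared counter that stands in for GXIDs issued by the GTM are all hypothetical. For brevity, each dispatched statement is treated as its own transaction, and the snapshot is passed through as an opaque value.

```python
import itertools
from collections import defaultdict


class ToyDatanode:
    """Hypothetical node that files incoming work under its global ID."""

    def __init__(self):
        self.by_gxid = defaultdict(list)

    def receive(self, request):
        # Work is keyed by GXID, not by the Coordinator that sent it.
        self.by_gxid[request["gxid"]].append(request["sql"])


class ToyCoordinator:
    """Hypothetical front end that tags each statement before sending it."""

    def __init__(self, name, gxid_source):
        self.name = name
        self.gxid_source = gxid_source  # shared iterator standing in for the GTM

    def dispatch(self, sql, datanodes, snapshot):
        gxid = next(self.gxid_source)
        for node in datanodes:
            node.receive({"gxid": gxid, "snapshot": snapshot, "sql": sql})
        return gxid


gxids = itertools.count(1)  # one global source of IDs for all Coordinators
c1 = ToyCoordinator("c1", gxids)
c2 = ToyCoordinator("c2", gxids)
dn1, dn2 = ToyDatanode(), ToyDatanode()

a = c1.dispatch("UPDATE t SET v = 1", [dn1, dn2], snapshot=None)
b = c2.dispatch("SELECT * FROM t", [dn1], snapshot=None)
```

Here `dn1` receives statements from two different Coordinators but keeps them apart by their GXIDs alone. This works because both Coordinators draw IDs from the same global source.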

Datanode

The Datanode actually stores your data. Tables may be distributed among the Datanodes or replicated to all of them. A Datanode does not have a global view of the whole database; it takes care only of locally stored data. An incoming statement is examined by the Coordinator, as described above, and rebuilt for execution at each Datanode involved. It is then transferred to each of those Datanodes together with the GXID and Global Snapshot as needed. A Datanode may receive requests from various Coordinators. However, because each transaction is uniquely identified and associated with a consistent (global) snapshot, the Datanode does not have to worry about which Coordinator a transaction or statement came from.

Postgres-XC Inherits PostgreSQL

Postgres-XC is an extension to PostgreSQL and inherits most of its features.

It is an open-source descendant of PostgreSQL and its original Berkeley code. It supports a large part of the SQL standard and offers many modern features:

Also, similar to PostgreSQL, Postgres-XC can be extended by the user in many ways, for example by adding new

And because it uses the same liberal license as PostgreSQL, Postgres-XC can be used, modified, and distributed by anyone free of charge for any purpose, be it private, commercial, or academic.

Notes

[1]

Of course, when you design the physical database, you should take into account how tables are stored internally in order to get the most from Postgres-XC.

[2]

To distinguish this from PostgreSQL's partitioning, we call it "distributed". In distributed database textbooks, this is often referred to as a "horizontal fragment".

[3]

Postgres-XC's foreign key usage has some restrictions. For details, see CREATE TABLE.