Data Consistency Concepts

Without a transaction (transaction isolation set to NONE), SQLFire ensures FIFO consistency for table updates. Writes performed by a single thread are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by other processes.

When a table is partitioned across members of the distributed system, SQLFire uniformly distributes the data set across members that host the table so that no single member becomes a bottleneck for scalability. SQLFire ensures that a single member owns a particular row (identified by a primary key) at any given time. When an owning member fails, the ownership of the row is transferred to an alternate member in a consistent manner so that all peer servers have a consistent view of the new owner.

It is the responsibility of the owning member to propagate row changes to configured replicas. All concurrent operations on the same row are serialized through the owning member before the operations are applied to replicas. All replicas see the row updates in the exact same order. Essentially, for partitioned tables SQLFire ensures that all concurrent modifications to a row are atomic and isolated from each other, and that the 'total ordering' is preserved across configured replicas.

The operations are propagated in parallel from the owning member to all configured replicas. Each replica is responsible for processing the operation, and it responds with an acknowledgment (ACK). Only after receiving all ACKs from all replicas does the owning member return control to the caller. This ensures that all operations that are sequentially carried out by a single process are applied to all replicas in the same order.

There are several other optimistic and eventually consistent replication schemes that use lazy replication techniques designed to conserve bandwidth, and increase throughput through batching and lazily forwarding messages. Conflicts are discovered after they happen and reaching agreement on the final contents incrementally. This class of systems favor availability of the system even in the presence of network partitions but compromises consistency on reads or make the reads very expensive by reading from each replica.

SQLFire instead uses an eager replication model between peers by propagating to each replica in parallel and synchronously. This approach favors data availability and low latency for propagating data changes. By eagerly propagating to each of its replicas, it is possible for clients reading data to be load balanced to any of the replicas. It is assumed that network partitions are rare in practice and when they do occur within a clustered environment, the application ecosystem is typically dealing with many distributed processes and applications, most of which are not designed to cope with partitioning problems.

By offering a very loosely coupled WAN replication scheme, SQLFire enables the entire client load to be shifted to an alternate "disaster recovery" site.