Distributed Transaction

Distributed Transactions

Overview

Local Transaction

A local transaction is also called a database transaction or traditional transaction (as opposed to a distributed transaction). Its execution model is the common one:

Local transactions have several characteristics:

A single transaction connects to only one transactional database (usually a relational database).
The transaction’s outcome guarantees ACID properties.
Database locks are used.

Initially, transactions were limited to controlling access to a single database resource. After services became modular, the concept of a transaction extended to services. If a single service operation is treated as a transaction, the whole service operation can involve only one database resource. Such a transaction that accesses a single database resource within a single service is called a local transaction.

Distributed Transaction

In a traditional monolithic application, three modules update data on the same data source to complete a business operation. Naturally, data consistency for the whole process is ensured by local transactions.

As business needs and architecture evolve, the monolith is split into micro‑services: the original three modules become three independent services, each with its own data source. The business flow is now carried out by invoking the three services.

Each service still guarantees its own internal data consistency with local transactions, but how do we ensure global data consistency across the entire business process? This is the classic distributed‑transaction requirement in a micro‑service architecture: we need a solution that guarantees global data consistency.

A distributed transaction means that the participants, the transaction‑supporting servers, the resource servers, and the transaction manager all reside on different nodes of a distributed system.

It refers to a large operation composed of many smaller operations that run on different servers; a distributed transaction must ensure that these small operations either all succeed or all fail.

Essentially, a distributed transaction exists to keep data consistent across different databases.

Typical scenarios

E‑commerce order placement with inventory deduction
The order system and the inventory system are separate; a single order requires coordination between them.
Bank card top‑up in a financial system
Recharging a platform via a bank card involves both the bank’s system and the financial platform.
Course selection in an education platform
After a user purchases a course and payment succeeds, the enrollment succeeds; this transaction spans the order system and the enrollment system.
Message sending in a social network
Sending an in‑site message together with an SMS requires coordination between the messaging system and the mobile‑communication system.

Solutions

Compared with local transactions that access a single database resource, distributed‑transaction architectures are more complex. Different distributed application designs raise different concerns—such as multi‑resource coordination and cross‑service transaction propagation—so the implementation mechanisms vary widely.

To achieve distributed transactions, various techniques and strategies are commonly employed, including:

Two‑phase commit (2PC)
Three‑phase commit (3PC)
Distributed transaction coordinators (e.g., Zookeeper, etcd)
Distributed transaction managers (e.g., X/Open XA protocol)
Distributed locks
Optimistic concurrency control

2PC (based on XA)

Background – The X/Open organization (now the Open Group) defined a model for distributed transaction processing. The XA protocol is a distributed transaction protocol.

XA consists of two parts: a transaction manager and resource managers. Resource managers are typically implemented by databases (e.g., Oracle, DB2) that provide XA interfaces. The transaction manager acts as a global scheduler, coordinating commits and rollbacks across the resource managers.

Two‑phase commit (2PC), based on the XA protocol, is an algorithm that ensures all nodes in a distributed system stay consistent when a transaction is committed.

2PC is a mechanism for implementing distributed transactions at the database level.

Implementation idea – In a distributed system, each node knows whether its own operation succeeded or failed, but it cannot know the outcome of other nodes.

When a transaction spans multiple nodes, a coordinator is introduced to collect the results from all participants and instruct them to either commit or rollback.

The two phases are:

Voting phase – Participants report their operation results to the coordinator.
The coordinator tells each participant to execute its local transaction. After the local transaction finishes, the participant reports its status back, but the transaction is not yet committed; database locks are still held.
Commit phase – After receiving all participants’ reports, the coordinator decides whether to commit or rollback and notifies the participants accordingly.
- If all participants succeeded in the first phase, the coordinator tells everyone to commit.
- If any participant failed, the coordinator tells everyone to roll back.

Normal flow

Phase 1 – Prepare: The coordinator sends a prepare request to all resource managers (RMs).
Each RM locks the necessary resources, writes logs, etc., and replies “ready”.
Phase 2 – Commit: Once all RMs are ready, the coordinator sends a commit request.
Each RM performs the commit and acknowledges.

Abnormal flow

Phase 1 – Prepare: The coordinator sends prepare requests, but one or more RMs report failure.
Phase 2 – Rollback: The coordinator detects that the transaction cannot be committed and sends rollback requests.
Each RM undoes its work, releases locks, and acknowledges the rollback.

Drawbacks

Performance – 2PC enforces strong consistency. During execution, all nodes are blocked (because the database must maintain isolation, typically using the SERIALIZABLE isolation level). When one transaction touches a table, other transactions cannot operate on that table, causing global blocking. Only after every node is prepared does the coordinator issue the commit, after which participants release resources.
Coordinator single‑point‑of‑failure – If the coordinator crashes, participants never receive commit or rollback instructions and remain in an indeterminate state. To avoid a complete deadlock, a timeout mechanism is often added, which reduces overall efficiency.

Overall, 2PC is a relatively conservative algorithm.

Example – Four people (A, B, C, D) need to schedule a meeting. Let A be the coordinator and B, C, D the participants.

Voting phase: A emails B, C, D asking if they are free at 10 am on Tuesday.
B replies “yes”.
C does not reply, leaving the whole process blocked.
When C finally replies (yes or no), the algorithm can proceed.
Commit phase: A sends the final decision to B, C, D based on the collected responses.

Because 2PC is slow and prone to blocking, implementing distributed transactions with it is difficult.

3PC

Background – Three‑phase commit (3PC) is a protocol for achieving atomic transactions in distributed systems. It improves upon 2PC, especially addressing the blocking problem that occurs during network partitions or coordinator failures.

Implementation idea – As illustrated:

CanCommit (Inquiry) phase
- The coordinator sends an inquiry (CanCommit) to all participants, asking whether they are ready to commit.
- Participants perform all necessary work but do not commit yet.
- Participants reply (Yes = ready, No = not ready).
PreCommit (Lock) phase
- If the coordinator receives all **Yes** responses, it proceeds to this phase:
  - Sends a pre‑commit message (PreCommit) to all participants, instructing them to lock resources and get ready to commit.
- If any **No** response is received, or a timeout occurs, the coordinator sends an abort message (Abort) telling participants to discard the transaction.
- Participants that receive PreCommit lock resources and wait.
DoCommit (Commit) phase
- If participants successfully locked resources in phase 2, they wait for the final commit or abort command.
- If the coordinator received no **No** responses in phase 2, it sends a commit message (DoCommit) to all participants.
- If the coordinator received any **No** response, or a timeout while waiting for PreCommit, it sends an abort message (Abort).
- Participants that receive DoCommit finalize the commit; if a failure occurs in this phase, participants act according to the last instruction (commit or abort).

Compared with 2PC, 3PC adds a timeout mechanism and introduces an extra phase, allowing participants to synchronize their states.

Effectively, 3PC splits the 2PC commit phase into a pre‑commit and a commit phase. The first phase merely asks participants about their health (e.g., “are you overloaded?”), while the pre‑commit phase works like 2PC’s prepare phase, and the final commit phase mirrors 2PC’s commit.

TCC

Background – Both 2PC and 3PC operate at the database level, whereas TCC works at the business layer. Distributed transactions often involve non‑database actions such as sending SMS messages, which is where TCC shines.

TCC stands for Try‑Confirm‑Cancel. Each branch transaction must implement three operations:

Try – Reserve and lock the required resources (note: this is a reservation, not a final commit).
Confirm – The actual business confirmation; this is the real execution.
Cancel – Rollback; it undoes the reservation made in the Try phase.

The TCC pattern is a high‑performance distributed‑transaction solution suitable for core systems that demand high throughput.

Implementation idea – TCC consists of three phases:

Try – Perform business checks (consistency) and reserve resources (isolation). This is a preliminary step; together with Confirm it forms the complete business logic.
Confirm – After all Try branches succeed, the Confirm phase is executed. In most TCC designs, Confirm is assumed never to fail: if Try succeeded, Confirm will succeed. If Confirm does fail, retries or manual intervention are required.
Cancel – Executed when the business fails and a rollback is needed; it releases the reserved resources. Cancel is also assumed to succeed; otherwise, retries or manual handling are needed.

Conceptually TCC resembles 2PC: first probe, then either commit or roll back. TCC does not depend on the underlying database, allowing cross‑database and cross‑application resource management and giving business owners finer‑grained control.

Example: a transaction needs to perform operations A, B, C. First, each operation performs a reservation. If all reservations succeed, the Confirm step runs; if any reservation fails, all participants execute Cancel. The TCC model also includes a transaction manager that records the global TCC state and decides whether to commit or roll back.

Challenges – Each business operation must define three methods (Try, Confirm, Cancel), making TCC highly intrusive and tightly coupled to business logic. Idempotency must be ensured because retries may occur. Compared with 2PC/3PC, TCC applies to a broader range of scenarios but requires more development effort, as the logic lives in the application layer. Nevertheless, because it is application‑level, TCC can span multiple databases and disparate business systems.

Example: Order‑Inventory deduction

Try – The order service marks the order as “payment pending”. The inventory service checks that available stock > 1 and reserves one unit (available = stock − 1).
If Try succeeds → Confirm – Order status becomes “paid”, inventory’s available stock is permanently reduced.
If Try fails → Cancel – Order status becomes “payment failed”, inventory restores the reserved stock.

Comparing TCC with 2PC’s two‑phase commit: 2PC works at the DB layer across databases, while TCC operates at the application layer, using business logic to achieve the same goal. TCC addresses several 2PC drawbacks:

Eliminates the coordinator single‑point‑of‑failure by letting the primary business initiator drive the activity; the activity manager can be clustered.
Reduces blocking by using timeouts and compensation, avoiding long‑lasting locks and moving resource control into business logic with finer granularity.
Improves data consistency through compensation mechanisms managed by the activity manager.

The advantage of this approach is that applications can define the granularity of data operations, reducing lock contention and increasing throughput. The downside is the high degree of intrusion: every business branch must implement Try, Confirm, and Cancel, and the implementation is complex, requiring different rollback strategies for various failure modes (network issues, system crashes, etc.).

Drawbacks – TCC is an intrusive solution; business systems must implement the three operations, making design and implementation complex.

Exception handling – TCC must address three kinds of anomalies: empty rollback, idempotency, and hanging.

Empty rollback – The Cancel method is called without a preceding Try. The Cancel implementation must detect this situation and return success immediately.
Cause: A branch service crashes or experiences a network error, so the Try never runs. After recovery, the global transaction rolls back and calls Cancel, resulting in an empty rollback.
Solution: Record whether the Try phase executed (e.g., a branch‑transaction table containing the global transaction ID and branch ID). The Try inserts a record; Cancel checks for the record—if present, perform a normal rollback; if absent, treat it as an empty rollback.
Idempotency – To ensure that retries of the second‑phase Try, Confirm, or Cancel do not cause inconsistencies, these interfaces must be idempotent.
Solution: Store an execution status in the branch‑transaction record and check it before each operation.
Hanging – The Cancel of the second phase executes before the Try of the first phase.
Cause: When invoking a branch’s Try via RPC, the branch is registered first, then the RPC call is made. If the network is congested and the RPC times out, the transaction manager may issue a rollback (Cancel) before the Try request finally arrives. The Try would then reserve resources that no longer belong to any active transaction, leaving them “hanging”.
Solution: If the second phase has already completed, the first phase must not run. When executing Try, check the branch‑transaction table for an existing second‑phase record; if found, skip Try.

Illustrative scenario – Transfer 30 CNY from account A to account B, where A and B reside in different services.

Try (A) – Verify A’s balance, then reserve (deduct) 30 CNY.
Confirm (A) – Since the amount was already deducted in Try, Confirm does nothing.
Cancel (A) – If rollback is needed, add the 30 CNY back, but only if Try actually ran.
Try (B) – Reserve (add) 30 CNY to B’s account.
Cancel (B) – If rollback is needed, subtract the 30 CNY, again only if Try ran.

Problem analysis

If A’s Try never executed but Cancel runs, A would receive an extra 30 CNY.
All three methods must be idempotent because they may be invoked multiple times.
If B’s Try succeeds and the reserved 30 CNY is consumed by another thread before Cancel, the rollback must handle this correctly.
If B’s Try never executed but Cancel runs, B would lose 30 CNY.

Resolution

B’s Cancel must first verify whether Try was executed; only then perform the rollback.
Ensure idempotency for Try, Confirm, and Cancel.

(content truncated)

Originally written by Li Wei (李唯_) and published in Chinese on 后端技术栈全书 (Full-Stack Backend Engineering). Translated and adapted for DriftSeas with permission.

Distributed Transaction