Google Globally-Distributed Database


| Operation | Timestamp Discussion | Concurrency Control | Replica Required |
|---|---|---|---|
| Read-Write Transaction | § 4.1.2 | pessimistic | leader |
| Read-Only Transaction | § 4.1.4 | lock-free | leader for timestamp; any for read, subject to § 4.1.3 |
| Snapshot Read, client-provided timestamp | — | lock-free | any, subject to § 4.1.3 |
| Snapshot Read, client-provided bound | § 4.1.3 | lock-free | any, subject to § 4.1.3 |

Table 2: Types of reads and writes in Spanner, and how they compare.

for the sawtooth bounds from 0 to 6 ms. The remaining 1 ms comes from the communication delay to the time masters. Excursions from this sawtooth are possible in the presence of failures. For example, occasional time-master unavailability can cause datacenter-wide increases in ε. Similarly, overloaded machines and network links can result in occasional localized ε spikes.

4 Concurrency Control

This section describes how TrueTime is used to guarantee the correctness properties around concurrency control, and how those properties are used to implement features such as externally consistent transactions, lock-free read-only transactions, and non-blocking reads in the past. These features enable, for example, the guarantee that a whole-database audit read at a timestamp t will see exactly the effects of every transaction that has committed as of t.

Going forward, it will be important to distinguish writes as seen by Paxos (which we will refer to as Paxos writes unless the context is clear) from Spanner client writes. For example, two-phase commit generates a Paxos write for the prepare phase that has no corresponding Spanner client write.

4.1 Timestamp Management

Table 2 lists the types of operations that Spanner supports. The Spanner implementation supports read-write transactions, read-only transactions (predeclared snapshot-isolation transactions), and snapshot reads. Standalone writes are implemented as read-write transactions; non-snapshot standalone reads are implemented as read-only transactions.
Both are internally retried (clients need not write their own retry loops).

A read-only transaction is a kind of transaction that has the performance benefits of snapshot isolation [6]. A read-only transaction must be predeclared as not having any writes; it is not simply a read-write transaction without any writes. Reads in a read-only transaction execute at a system-chosen timestamp without locking, so that incoming writes are not blocked. The execution of the reads in a read-only transaction can proceed on any replica that is sufficiently up-to-date (Section 4.1.3).

A snapshot read is a read in the past that executes without locking. A client can either specify a timestamp for a snapshot read, or provide an upper bound on the desired timestamp’s staleness and let Spanner choose a timestamp. In either case, the execution of a snapshot read proceeds at any replica that is sufficiently up-to-date.

For both read-only transactions and snapshot reads, commit is inevitable once a timestamp has been chosen, unless the data at that timestamp has been garbage-collected. As a result, clients can avoid buffering results inside a retry loop. When a server fails, clients can internally continue the query on a different server by repeating the timestamp and the current read position.

4.1.1 Paxos Leader Leases

Spanner’s Paxos implementation uses timed leases to make leadership long-lived (10 seconds by default). A potential leader sends requests for timed lease votes; upon receiving a quorum of lease votes the leader knows it has a lease. A replica extends its lease vote implicitly on a successful write, and the leader requests lease-vote extensions if they are near expiration. Define a leader’s lease interval as starting when it discovers it has a quorum of lease votes, and as ending when it no longer has a quorum of lease votes (because some have expired).
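
The end of a lease interval, as defined above, can be illustrated with a small sketch. It assumes that each granted lease vote is represented only by its expiry time on a single shared clock and that a quorum is a simple majority; the function names `quorum_size` and `lease_interval_end` are hypothetical and are not part of Spanner. The actual protocol, including how clock uncertainty is handled, is the one described in Appendix A.

```python
from typing import Optional, Sequence


def quorum_size(num_replicas: int) -> int:
    """Simple-majority quorum for a group of num_replicas voters (assumption)."""
    return num_replicas // 2 + 1


def lease_interval_end(vote_expiries: Sequence[float],
                       num_replicas: int) -> Optional[float]:
    """Return the time at which the leader no longer holds a quorum of votes.

    vote_expiries lists the expiry time of every lease vote currently
    granted to the leader. Returns None if the votes never form a quorum.
    """
    q = quorum_size(num_replicas)
    if len(vote_expiries) < q:
        return None
    # The quorum is lost when the q-th longest-lived vote expires:
    # from that moment on, fewer than q votes are still valid.
    return sorted(vote_expiries, reverse=True)[q - 1]
```

With five replicas and votes expiring at times 10, 12, 11, and 15, the function returns 11: once the vote expiring at 11 lapses, only two votes remain, which is below the quorum of three.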
Spanner depends on the following disjointness invariant: for each Paxos group, each Paxos leader’s lease interval is disjoint from every other leader’s. Appendix A describes how this invariant is enforced.

The Spanner implementation permits a Paxos leader to abdicate by releasing its slaves from their lease votes. To preserve the disjointness invariant, Spanner constrains when abdication is permissible. Define s_max to be the maximum timestamp used by a leader. Subsequent sections will describe when s_max is advanced. Before abdicating, a leader must wait until TT.after(s_max) is true.

4.1.2 Assigning Timestamps to RW Transactions

Transactional reads and writes use two-phase locking. As a result, they can be assigned timestamps at any time

Published in the Proceedings of OSDI 2012
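
The abdication rule from Section 4.1.1 (wait until TT.after(s_max) is true) can be sketched as follows. This is only an illustration under stated assumptions: `TrueTimeSketch` is a hypothetical stand-in that models TrueTime as a local clock plus a fixed uncertainty `epsilon`, and it takes TT.after(t) to mean that t has definitely passed, that is, the earliest bound of the current interval exceeds t. The 7 ms default, the polling loop, and the name `wait_until_abdication_allowed` are likewise assumptions, not Spanner's implementation.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTimeSketch:
    """Toy model of TrueTime: local clock with a fixed uncertainty bound."""

    def __init__(self, epsilon: float = 0.007,
                 clock: Callable[[], float] = time.time):
        self.epsilon = epsilon  # assumed constant; the real bound varies
        self.clock = clock

    def now(self) -> TTInterval:
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        # True only when t has definitely passed, even under uncertainty.
        return self.now().earliest > t


def wait_until_abdication_allowed(tt: TrueTimeSketch, s_max: float,
                                  sleep: Callable[[float], None] = time.sleep,
                                  poll: float = 0.001) -> None:
    """Block until every timestamp the leader has used is definitely past."""
    while not tt.after(s_max):
        sleep(poll)
```

Under this model, the wait lasts until the local clock exceeds s_max by more than `epsilon`, so a leader that has recently used a timestamp close to the present waits for roughly the current uncertainty before releasing its lease votes.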
