Misplaced Pages

X/Open XA

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

For transaction processing in computing, the X/Open XA standard (short for "eXtended Architecture") is a specification released in 1991 by X/Open (which later merged with The Open Group) for distributed transaction processing (DTP).

The goal of XA is to guarantee atomicity in "global transactions" that are executed across heterogeneous components. A transaction is a unit of work such as transferring money from one person to another. Distributed transactions update multiple data stores (such as databases, application servers, message queues, transactional caches, etc.). To guarantee integrity, XA uses a two-phase commit (2PC) to ensure that all of

A database transaction are visible to all nodes simultaneously. That is, once the transaction has been committed all parties attempting to access the database can see the results of that transaction simultaneously. A good example of the importance of transaction consistency is a database that handles the transfer of money. Suppose a money transfer requires two operations: writing a debit in one place, and

A consequence, the transaction cannot be observed to be in progress by another database client. At one moment in time, it has not yet happened, and at the next it has already occurred in whole (or nothing happened if the transaction was cancelled in progress). An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations, withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that
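The account A to account B transfer can be sketched with Python's standard `sqlite3` module. This is only an illustration under assumptions: the `accounts` table, the account names, and the helper functions `transfer`, `make_demo_db`, and `balances` are hypothetical and do not come from the article.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Withdraw from src and deposit to dst as one atomic transaction."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

When the withdrawal would overdraw the source account, the already-executed debit is rolled back together with the rest of the transaction, so no other reader ever sees money leave A without arriving in B.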

A corresponding index entry is added (at the 20% mark). Because the backup is already halfway done and the index already copied, the backup will be written with the article data present, but with the index reference missing. As a result of the inconsistency, this file is considered corrupted. In real life, a real database such as Misplaced Pages's may be edited thousands of times per hour, and references are virtually always spread throughout

A credit in another. If the system crashes or shuts down when one operation has completed but the other has not, and there is nothing in place to correct this, the system can be said to lack transaction consistency. With a money transfer, it is desirable that either the entire transaction completes, or none of it completes. Both of these scenarios keep the balance in check. Transaction consistency ensures just that - that

A relevant backup example, consider a website with a database such as the online encyclopedia Misplaced Pages, which needs to be operational around the clock, but also must be backed up with regularity to protect against disaster. Portions of Misplaced Pages are constantly being updated every minute of every day. Meanwhile, Misplaced Pages's database is stored on servers in the form of one or several very large files which require minutes or hours to back up. These large files, as with any database, contain numerous data structures which reference each other by location. For example, some structures are indexes which permit

A result. If the power gets shut off after element 4 has been written, the battery-backed memory contains the record of commitment for the other three items and ensures that they are written ("flushed") to the disk at the next available opportunity. Consistency (database systems) in the realm of distributed database systems refers to the property of many ACID databases to ensure that the results of

A single database. The main disadvantage is that 2PC is a blocking protocol: the other servers need to wait for the transaction manager to issue a decision about whether to commit or abort each transaction. If the transaction manager goes offline while transactions are waiting for its final decision, they will be stuck and hold their database locks until the transaction manager comes online again and issues its decision. This extended holding of locks may be disruptive to other applications that are using

A small battery back-up unit on their cache memory so that they may offer the performance gains of write caching while mitigating the risk of unintended shutdowns. The battery back-up unit keeps the memory powered even during a shutdown so that when the computer is powered back up, it can quickly complete any writes it has previously committed. With such a controller, the operating system may request four writes (1-2-3-4) in that order, but

A system is programmed to be able to detect incomplete transactions when powered on, and undo (or "roll back") the portion of any incomplete transactions that are found. Application consistency, similar to transaction consistency, is applied on a grander scale. Instead of having the scope of a single transaction, data must be consistent within the confines of many different transaction streams from one or more applications. An application may be made up of many different types of data, various types of files and data feeds from other applications. Application consistency

A transaction's changes either take effect (commit) or do not (roll back), i.e., atomically. Specifically, XA describes the interface between a global transaction manager and a specific application. An application that wants to use XA engages an XA transaction manager using a library or separate service. The transaction manager tracks the participants in the transaction (i.e. the various data stores to which

A variety of relational databases and message brokers. Since XA uses two-phase commit, the advantages and disadvantages of that protocol generally apply to XA. The main advantage is that XA (using 2PC) allows an atomic transaction across multiple heterogeneous technologies (e.g. a single transaction could encompass multiple databases from different vendors as well as an email server and a message broker), whereas traditional database transactions are limited to

Is missing (confirming success), that the save operation was unsuccessful and so it should undo any incomplete steps already taken to save it (e.g. marking sector 123 free since it never was properly filled, and removing any record of XYZ from the file directory). It relies on these items being committed to disk in sequential order. Suppose a caching algorithm determines it would be fastest to write these items to disk in

Is now extremely rare.

Data consistency

Data consistency refers to whether the same data kept at different places do or do not match. Point-in-time consistency is an important property of backup files and a critical objective of software that creates backups. It is also relevant to the design of disk memory systems, specifically relating to what happens when they are unexpectedly shut down. As

Is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or none occur. A guarantee of atomicity prevents partial database updates from occurring, because they can cause greater problems than rejecting the whole series outright. As

Is what will show if the file is opened). Further, the file system's free space map will not contain any entry showing that sector 123 is occupied, so later, it will likely assign that sector to the next file to be saved, believing it is available. The file system will then have two files both unexpectedly claiming the same sector (known as a cross-linked file). As a result, a write to one of

The 75% mark. Consider a scenario where an editor comes and creates a new article at the same time a backup is being performed, which is being made as a simple "file copy" which copies from the beginning to the end of the large file(s) and doesn't consider data consistency - and at the time of the article edit, it is 50% complete. The new article is added to the article space (at the 75% mark) and

The application writes), and works with them to carry out the two-phase commit. In other words, the XA transaction manager is separate from an application's interactions with servers. XA maintains a log of its decisions to commit or roll back, which it can use to recover in case of a system outage. Many software vendors support XA (meaning the software can participate in XA transactions), including
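The coordinator role described here can be sketched as a toy two-phase commit in Python. This is a minimal in-memory illustration, not the XA interface itself: the `Participant` class, its `will_vote_yes` flag, and the `two_phase_commit` function are hypothetical names, and a real transaction manager would write its decision log to durable storage.

```python
class Participant:
    """Hypothetical stand-in for a data store taking part in a transaction."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "active"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants, decision_log):
    """Run both phases and return the decision ("commit" or "rollback")."""
    # Phase 1: ask for votes; all() stops at the first "no" vote.
    all_yes = all(p.prepare() for p in participants)
    decision = "commit" if all_yes else "rollback"

    # Record the decision before telling anyone, so it can be replayed
    # after a coordinator outage (an in-memory list stands in for a real log).
    decision_log.append(decision)

    # Phase 2: deliver the same decision to every participant.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision
```

A single "no" vote is enough to roll back every participant, which is how the protocol keeps the outcome all-or-nothing across different stores.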

The atomicity guarantee and is therefore reserved for emergencies. The XA specification describes what a resource manager must do to support transactional access. Resource managers that follow this specification are said to be XA-compliant. The XA specification was based on an interface used in the Tuxedo system developed in the 1980s, and has since been adopted by several other systems.

Atomicity (database systems)

In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek: ἄτομος, romanized: átomos, lit. 'undividable')

The controller may decide the quickest way to write them is 4-3-1-2. The controller essentially lies to the operating system and reports that the writes have been completed in order (a lie that improves performance at the expense of data corruption if power is lost), and the battery backup hedges against the risk of data corruption by giving the controller a way to silently fix any and all damage that could occur as

The database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails. The same term is also used in the definition of First normal form in database systems, where it instead refers to the concept that the values for fields may not consist of multiple smaller values to be decomposed, such as a string into which multiple names, numbers, dates, or other types may be packed. Atomicity does not behave completely orthogonally with regard to

The database subsystem to quickly find search results. If the data structures cease to reference each other properly, then the database can be said to be corrupted. The importance of point-in-time consistency can be illustrated with what would happen if a backup were made without it. Assume Misplaced Pages's database is a huge file, which has an important index located 20% of the way through, and saves article data at

The enclosing transaction may cause an isolation or consistency failure. Typically, systems implement atomicity by providing some mechanism to indicate which transactions have started and which finished; or by keeping a copy of the data before any changes occurred (read-copy-update). Several filesystems have developed methods for avoiding the need to keep multiple copies of data, using journaling (see journaling file system). Databases usually implement this using some form of logging/journaling to track changes. The system synchronizes

The entire database looked at a single moment. In the given Misplaced Pages example, it would ensure that the backup was written without the added article at the 75% mark, so that the article data would be consistent with the index data previously written. Point-in-time consistency is also relevant to computer disk subsystems. Specifically, operating systems and file systems are designed with

The expectation that the computer system they are running on could lose power, crash, fail, or otherwise cease operating at any time. When properly designed, they ensure that data will not be unrecoverably corrupted if the power is lost. Operating systems and file systems do this by ensuring that data is written to a hard disk in a certain order, and rely on that in order to detect and recover from unexpected shutdowns. On

The file and can number into the millions, billions, or more. A sequential "copy" backup would literally contain so many small corruptions that the backup would be completely unusable without a lengthy repair process which could provide no guarantee as to the completeness of what has been recovered. A backup process which properly accounts for data consistency ensures that the backup is a snapshot of how

The file-system level, POSIX-compliant systems provide system calls such as open(2) and flock(2) that allow applications to atomically open or lock a file. At the process level, POSIX Threads provide adequate synchronization primitives. The hardware level requires atomic operations such as Test-and-set, Fetch-and-add, Compare-and-swap, or Load-Link/Store-Conditional, together with memory barriers. Portable operating systems cannot simply block interrupts to implement synchronization, since hardware that lacks concurrent execution such as hyper-threading or multi-processing
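As a small illustration of atomically opening a file, the sketch below uses Python's `os.open` with the `O_CREAT | O_EXCL` flags, which ask the operating system to create the file only if it does not already exist, as a single step. The lock-file convention and the helper names `try_acquire` and `release` are assumptions made for this example.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Try to create the lock file; return True only if this call created it."""
    try:
        # O_CREAT | O_EXCL makes "check that it is absent" and "create it"
        # one operation, so two callers cannot both succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release(lock_path):
    """Remove the lock file so another caller can acquire it."""
    os.remove(lock_path)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    print(try_acquire(path))  # first caller creates the file
    print(try_acquire(path))  # second attempt finds it already present
    release(path)
```

Checking for the file first and creating it in a separate call would leave a window in which two processes could both believe they hold the lock.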

The files will overwrite part of the other file, invisibly damaging it. A disk caching subsystem that ensures point-in-time consistency guarantees that in the event of an unexpected shutdown, the four elements would be written one of only five possible ways: completely (1-2-3-4), partially (1, 1-2, 1-2-3), or not at all. High-end hardware disk controllers of the type found in servers include
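The five permitted outcomes are exactly the prefixes of the requested write order, which a short Python sketch can make concrete. The function names `consistent_states` and `is_point_in_time_consistent` are hypothetical, and the model assumes each write is distinct and either fully on disk or absent.

```python
def consistent_states(writes):
    """List every prefix of the requested write order, including the empty one."""
    return [tuple(writes[:k]) for k in range(len(writes) + 1)]


def is_point_in_time_consistent(on_disk, writes):
    """True if the set of writes found on disk forms a prefix of the order."""
    found = set(on_disk)
    return found == set(writes[: len(found)])


if __name__ == "__main__":
    order = [1, 2, 3, 4]
    for state in consistent_states(order):
        print(state)
    # A cache that reordered to 4-3-1-2 and lost power after writing 4:
    print(is_point_in_time_consistent({4}, order))
```

Under this model, a disk holding only element 4 is not a prefix of 1-2-3-4, which is the kind of state that reordered writes can leave behind after a power loss.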

The logs (often the metadata) as necessary after changes have successfully taken place. Afterwards, crash recovery ignores incomplete entries. Although implementations vary depending on factors such as concurrency issues, the principle of atomicity, i.e. complete success or complete failure, remains. Ultimately, any application-level implementation relies on operating-system functionality. At

The order 4-3-1-2, and starts doing so, but the power gets shut down after 4 gets written, before 3, 1 and 2, and so those writes never occur. When the computer is turned back on, the file system would then show it contains a file named XYZ which is located in sector 123, but this sector really does not contain the file. (Instead, the sector will contain garbage, or zeroes, or a random portion of some old file - and that

The other ACID properties of transactions. For example, isolation relies on atomicity to roll back the enclosing transaction in the event of an isolation violation such as a deadlock; consistency also relies on atomicity to roll back the enclosing transaction in the event of a consistency violation by an illegal transaction. As a result of this, a failure to detect a violation and roll back

The other hand, rigorously writing data to disk in the order that maximizes data integrity also impacts performance. A process of write caching is used to consolidate and re-sequence write operations such that they can be done faster by minimizing the time spent moving disk heads. Data consistency concerns arise when write caching changes the sequence in which writes are carried out, because there exists

The possibility of an unexpected shutdown that violates the operating system's expectation that all writes will be committed sequentially. For example, in order to save a typical document or picture file, an operating system might write the following records to a disk in the following order: The operating system relies on the assumption that if it sees item #1 is present (saying the file is about to be saved), but that item #4

The same databases. Moreover, if the transaction manager crashes and its record of decisions cannot be recovered (e.g. due to a bug in how the decisions were logged, or due to data corruption on the server), manual intervention may be necessary. Many XA implementations provide an "escape hatch" for transactions to independently decide whether to commit or abort (without waiting to hear from the transaction manager), but this risks violating
