This article presents a practical, rather than theoretical, overview of concurrency control. Concurrency control deals with the issues involved in allowing multiple people simultaneous access to shared entities, be they objects, data records, or some other representation.
Assume that you and I both read the same row from the Customer table, we both change the data, and then we both try to write our new versions back to the database. Whose changes should be saved? Yours? Mine? Neither? A combination? Similarly, if we both work with the same Customer object stored in a shared object cache and try to make changes to it, what should happen?
To understand how to implement concurrency control within your system you must start by understanding the basics of collisions – you can either avoid them or detect and then resolve them. The next step is to understand transactions, which are collections of actions that potentially modify two or more entities.
An important message of this article is that, on modern software development initiatives, concurrency control and transactions are not simply the domain of databases; they are issues that are potentially pertinent to all of your architectural tiers. In this article, I discuss:
Implementing Referential Integrity and Shared Business Logic discusses the referential integrity challenges that result from mapping an object schema to a data schema, known as cross-schema referential integrity problems. With respect to collisions things are a little simpler: we only need to worry about ensuring the consistency of entities within the system of record. The system of record is the location where the official version of an entity is located.
A collision occurs when two activities, which may or may not be full-fledged transactions, attempt to change entities within a system of record. There are three fundamental ways (Celko 1999) that two activities can interfere with one another:
Collisions occur more often when:
So what can you do? First, you can take a pessimistic locking approach that avoids collisions but reduces system performance. Second, you can use an optimistic locking strategy that enables you to detect collisions so you can resolve them. Third, you can take an overly optimistic locking strategy that ignores the issue completely.
Pessimistic locking is an approach where an entity is locked in the database for the entire time that it is in application memory (often in the form of an object). With pessimistic locking:
Pessimistic locking is easy to implement and guarantees that database changes are made consistently and safely. The primary disadvantage is that this approach isn’t scalable. When a system has many users, when transactions involve a greater number of entities, or when transactions are long lived, the chance of having to wait for a lock to be released increases. This limits the practical number of simultaneous users that your system can support.
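To make the trade-off concrete, here is a minimal sketch of pessimistic locking. It assumes a hypothetical in-memory lock table (`EntityLockManager`, with `acquire` and `release`) rather than a real database; in practice the database itself would hold the lock, for example through a `SELECT ... FOR UPDATE` statement where the database supports it. The entity id and the owner names are illustrative only.

```python
import threading


class EntityLockManager:
    """Hypothetical in-memory lock table: entity id -> current owner."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._owners = {}

    def acquire(self, entity_id, owner):
        """Return True if `owner` holds the lock afterwards, False otherwise."""
        with self._guard:
            holder = self._owners.setdefault(entity_id, owner)
            return holder == owner

    def release(self, entity_id, owner):
        """Release the lock, but only if `owner` actually holds it."""
        with self._guard:
            if self._owners.get(entity_id) == owner:
                del self._owners[entity_id]


locks = EntityLockManager()
got_you = locks.acquire("customer:42", "you")      # you load the entity: lock granted
got_me = locks.acquire("customer:42", "me")        # I am refused while you hold it
locks.release("customer:42", "you")                # you save your changes and release
got_me_later = locks.acquire("customer:42", "me")  # now I can obtain the lock
```

The refused request in the middle is the scalability cost described above: the longer you keep the entity in memory, the longer I have to wait or retry.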
With multi-user systems it is quite common to be in a situation where collisions are infrequent. Although the two of us are working with Customer objects, you’re working with the Wayne Miller object while I work with the John Berg object and therefore we won’t collide. When this is the case optimistic locking becomes a viable concurrency control strategy. The idea is that you accept the fact that collisions occur infrequently, and instead of trying to prevent them you simply choose to detect them and then resolve the collision when it does occur.
Figure 1 depicts the logic for updating an object when optimistic locking is used:
Figure 1. Updating an object following an optimistic locking approach.
[Image: Concurrency control locking]
There are two basic strategies for determining if a collision has occurred:
Figure 1 depicts a naive approach, and in fact there are ways to reduce the number of database interactions. The first three requests to the database – the initial lock, marking (if appropriate) the source data, and unlocking – can be performed as a single transaction. The next two interactions, to lock and obtain a copy of the source data, can easily be combined as a single trip to the database. Furthermore, the update and unlock can similarly be combined. Another way to improve this is to combine the last four interactions into a single transaction and simply perform collision detection on the database server instead of the application server.
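One way to push collision detection onto the database server is to mark each row with a version number and make the update conditional on that mark. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database, and the `customer` table, its `version` column, and the helper functions `read_customer` and `update_name` are hypothetical names, not part of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Wayne Miller', 1)")
conn.commit()


def read_customer(conn, customer_id):
    """Read the row together with the version mark that is checked on update."""
    return conn.execute(
        "SELECT id, name, version FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()


def update_name(conn, customer_id, new_name, version_read):
    """Update only if the row still carries the version that was read.

    Returns True on success, False if a collision was detected.
    """
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, version_read),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows changed means someone else got there first


_, _, your_version = read_customer(conn, 1)  # we both read version 1
_, _, my_version = read_customer(conn, 1)

your_ok = update_name(conn, 1, "Wayne A. Miller", your_version)  # succeeds
my_ok = update_name(conn, 1, "W. Miller", my_version)            # stale version: collision
final_name = read_customer(conn, 1)[1]
```

Because the check and the write happen in one statement, no separate lock, read, and compare round trips are needed. What to do when `update_name` returns False is a collision resolution decision, which this sketch leaves to the caller.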
With an overly optimistic locking strategy you neither try to avoid nor detect collisions; you simply assume that they will never occur. This strategy is appropriate for single-user systems, systems where the system of record is guaranteed to be accessed by only one user or system process at a time, or read-only tables. These situations do occur. It is important to recognize that this strategy is completely inappropriate for multi-user systems.
Collision resolution is critical to effective concurrency control. You have five basic strategies that you can apply to resolve collisions:
It is important to recognize that the granularity of a collision counts. Assume that both of us are working with a copy of the same Customer entity. If you update a customer’s name and I update their shopping preferences, then we can still recover from this collision. In effect the collision occurred at the entity level, because we updated the same customer, but not at the attribute level, because we changed different attributes. It is very common to detect potential collisions at the entity level and then get smart about resolving them at the attribute level.
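The following sketch shows one way attribute-level resolution might work once an entity-level collision has been detected. It assumes that each entity can be represented as a flat dictionary of attributes, that all three copies share the same keys, and that the original values are still available for comparison. The function name `merge_attributes` and the sample attribute names are hypothetical.

```python
def merge_attributes(original, mine, theirs):
    """Three-way merge of flat attribute dictionaries.

    Returns (merged, conflicts), where conflicts lists the attributes that
    both sides changed to different values.
    """
    merged = dict(original)
    conflicts = []
    for key in original:
        my_changed = mine[key] != original[key]
        their_changed = theirs[key] != original[key]
        if my_changed and their_changed and mine[key] != theirs[key]:
            conflicts.append(key)        # a true attribute-level collision
        elif my_changed:
            merged[key] = mine[key]      # only I changed it (or we agree)
        elif their_changed:
            merged[key] = theirs[key]    # only you changed it
    return merged, conflicts


original = {"name": "John Berg", "shopping_preference": "books"}
yours = {"name": "John R. Berg", "shopping_preference": "books"}  # you changed the name
mine = {"name": "John Berg", "shopping_preference": "music"}      # I changed the preference

merged, conflicts = merge_attributes(original, mine, yours)
```

Here `conflicts` is empty and `merged` carries both changes, so the collision can be resolved automatically. If we had both changed the name to different values, `name` would appear in `conflicts` and one of the other resolution strategies would have to decide the outcome.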
For simplicity’s sake, many teams will choose a single locking strategy and apply it to all tables. This works well when all, or at least most, tables in your application have the same access characteristics. However, for more complex applications you will likely need to implement several locking strategies based on the access characteristics of individual tables. One approach, suggested by Willem Bogaerts, is to categorize each table by type to provide guidance as to a locking strategy for it. Strategies for doing so are described in Table 1.
Table 1. Locking strategies by table type.
I’d like to thank Willem Bogaerts and Rich Stone for their feedback regarding this article.