
Atomic and non-atomic operations (java)


How to understand which operations are atomic and which are non-atomic?

Here's what I found on Habr:

An operation on a shared memory area is called atomic if it completes in a single step relative to other threads that have access to that memory. While such an operation is being performed on a variable, no thread can observe it half-complete. An atomic load guarantees that the variable is read in its entirety at a single point in time. Non-atomic operations provide no such guarantee.

I.e., as I understand it, atomic operations are quite small and are performed "in a single step relative to other threads." But what does this "step" mean?

One step == one machine operation? Or something else? How to determine exactly which operations are atomic and which are non-atomic?

P.S. I found a similar question, but it's about C#…


Answer 1, authority 100%

How can atomicity be determined?

The atomicity of an operation is most often characterized by its indivisibility: the operation is either applied in full or not applied at all. Writing values to an array is a good example:

public class Curiousity {
    public volatile int[] array;

    public void nonAtomic() {
        array = new int[1];
        array[0] = 1;
    }

    public void probablyAtomic() {
        array = new int[] {1};
    }
}

When the nonAtomic method is used, there is a chance that some thread accesses array[0] at the moment when it has not yet been initialized and reads an unexpected value. With probablyAtomic (assuming the array is first filled and only then assigned; I cannot guarantee right now that this is exactly how Java behaves, but imagine that this rule holds within the example) this cannot happen: array always contains either null or a fully initialized array, and array[0] can contain nothing but 1. This operation is indivisible and cannot be applied halfway, as happened with nonAtomic: it is applied either completely or not at all, and the rest of the code can safely expect array to contain either null or the expected values without resorting to additional checks.
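The race in nonAtomic can be sketched with a reader thread; note this is a timing-dependent demonstration, so the "bad" outcome is not guaranteed on any given run. The class and thread names here are illustrative, not from the original answer:

```java
// Illustrates the window in nonAtomic(): between `array = new int[1]` and
// `array[0] = 1`, another thread may observe array[0] == 0 (the default
// element value) instead of 1.
public class RaceDemo {
    static volatile int[] array;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (true) {
                int[] a = array;               // either null or a published array
                if (a != null) {
                    // May print 0 (half-applied state) or 1, depending on timing.
                    System.out.println("observed a[0] = " + a[0]);
                    break;
                }
            }
        });
        reader.start();
        array = new int[1];  // published before initialization...
        array[0] = 1;        // ...so the reader can legally see the default 0
        reader.join();
    }
}
```

With the probablyAtomic-style assignment `array = new int[] {1};` instead, the reader could only ever observe null or a value of 1 (under the assumption stated above).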

In addition, the atomicity of an operation is often understood as the visibility of its result to all participants in the system to which it refers (in this case, threads); this is logical, but, in my opinion, is not a mandatory sign of atomicity.

Why is this important?

Atomicity often stems from the business requirements of applications: banking transactions must be applied in full, concert tickets must be reserved immediately in the requested quantity, and so on. In the specific context being discussed here (multithreading in Java), the tasks are more primitive, but they grow out of the same requirements: for example, if you are writing a web application, the server that parses HTTP requests must have an incoming-request queue with atomic insertion, otherwise there is a risk of losing incoming requests and, consequently, degrading the quality of service. Atomic operations provide guarantees (indivisibility) and should be used when those guarantees are needed.

In addition, atomic operations are linearizable: roughly speaking, their execution can be laid out into a single linear history, whereas plain operations can produce a graph of possible histories, which is unacceptable in some cases.

Why are primitive operations not atomic in and of themselves? It would also be easier for everyone.

Modern runtime environments are very complex and ship with a pile of optimizations they can apply to your code, but in most cases these optimizations violate guarantees. Since most code does not actually need those guarantees, it turned out to be easier to separate operations with strict guarantees into their own class of operations than to do the reverse. The most common example is reordering: the processor and the JVM are allowed to execute expressions in an order different from the one written in the code, as long as the programmer does not force a particular order using operations with stronger guarantees. Another example (I am not sure, though, that it is formally correct) concerns reading a value from memory:

thread #1:    set x = 2
processor #1: save_cache(x, 2)
processor #1: save_memory(x, 2)
thread #2:    set x = 1
processor #2: save_cache(x, 1)
processor #2: save_memory(x, 1)
thread #1:    read x
processor #1: read_cache(x) = 2  // even though thread #2 has already updated x to 1

Here there is no so-called single source of truth controlling the value of x, so such anomalies are possible. As far as I understand, forcing reads and writes to go directly to memory (or to memory and the shared processor cache) is exactly what the volatile modifier does (I could be wrong here).

Of course, optimized code runs faster, but necessary guarantees should never be sacrificed for performance.

Does this only apply to operations related to setting variables and other processor activities?

No. Any operation can be atomic or non-atomic; for example, classic relational databases guarantee that a transaction, which may consist of megabytes of data changes, is either applied in full or not applied at all. Processor instructions are irrelevant here: an operation can be atomic either because it is atomic in itself, or because its result manifests itself as another atomic operation (for example, the result of a database transaction manifests itself as a write to a file).

In addition, as far as I understand, the claim "if the instruction did not complete in one cycle, the operation is not atomic" is also incorrect, because there are specialized instructions for this, and nothing prevents you from atomically setting some value in memory on entering a protected block and clearing it on exit.
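The "set a flag on entry, clear it on exit" idea can be sketched as a minimal spinlock built on a single atomic flag (a toy illustration, not production-grade locking; real code would normally use synchronized or java.util.concurrent locks):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A minimal spinlock: compareAndSet atomically "takes" the flag on entry
// to the protected block, set(false) releases it on exit.
public class SpinLockDemo {
    private static final AtomicBoolean locked = new AtomicBoolean(false);
    private static int counter = 0; // protected by the flag above

    static void increment() {
        while (!locked.compareAndSet(false, true)) { /* spin until acquired */ }
        try {
            counter++; // safe: only one thread is inside at a time
        } finally {
            locked.set(false); // always release, even on exception
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter); // prints 200000: no increments are lost
    }
}
```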

Can any operation be atomic?

No. I really lack the qualifications for precise formulations, but, as far as I understand, any operation involving two or more external effects (side effects) cannot be atomic by definition. A side effect primarily means interaction with some external system (be it a file system or an external API), but even two variable assignments inside a synchronized block cannot be considered an atomic operation as long as one of them can throw an exception; and, given OutOfMemoryError and other possible outcomes, ruling that out may not be possible at all.

I have an operation with two or more side effects. Is there anything I can do about it anyway?

Yes, you can build a system with a guarantee that all operations are eventually applied, on the condition that any side effect may be invoked an unlimited number of times. You can create a journaling system that atomically records planned operations, regularly checks the journal, and performs whatever has not yet been applied. It can be pictured like this:

client:  journal.push{withdrawMoney{card=41111111, cvc=123}, reserveTicket{concert=123}, sendEmail{address=nobody@localhost}}
client:  <journal confirmed receipt and persistence of the job>
journal: process withdrawMoney
journal: markCompleted withdrawMoney
journal: process reserveTicket
journal: <dies before markCompleted reserveTicket>
journal: <recovers>
journal: process reserveTicket  // the side effect is invoked again, but only after a failure
journal: markCompleted reserveTicket
journal: process sendEmail
journal: markCompleted sendEmail

This guarantees the progress of the algorithm but removes all obligations about timing (which, formally speaking, were shaky anyway). If the operations are idempotent, such a system will sooner or later arrive at the required state with no observable difference from the expected one (apart from execution time).
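A toy sketch of that journal replay loop, with the crash simulated in-process (all names here are illustrative, not a real API; a real journal would be persisted atomically, e.g. to a database or an append-only file):

```java
import java.util.ArrayList;
import java.util.List;

// Operations are recorded first, then a replay loop performs anything not
// yet marked completed. After a crash the loop simply runs again; a side
// effect may repeat, which is why operations should be idempotent.
public class JournalDemo {
    static class Entry {
        final String op;
        boolean done;
        Entry(String op) { this.op = op; }
    }

    static final List<Entry> journal = new ArrayList<>();
    static final List<String> sideEffects = new ArrayList<>(); // what actually ran

    static void replay() {
        for (Entry e : journal) {
            if (!e.done) {
                sideEffects.add(e.op); // process: the side effect itself
                e.done = true;         // markCompleted
            }
        }
    }

    public static void main(String[] args) {
        journal.add(new Entry("withdrawMoney"));
        journal.add(new Entry("reserveTicket"));
        journal.add(new Entry("sendEmail"));

        // First run: the process "dies" after performing reserveTicket's
        // side effect but before recording its completion.
        sideEffects.add("withdrawMoney");
        journal.get(0).done = true;
        sideEffects.add("reserveTicket"); // performed, completion lost in the crash

        replay(); // recovery: reserveTicket runs a second time, then sendEmail
        System.out.println(sideEffects);
    }
}
```

The output shows reserveTicket invoked twice, which is exactly why its side effect needs to be idempotent.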

How do you determine the atomicity of operations in java?

The primary source of truth here is the Java Memory Model, which defines what assumptions and guarantees apply to code running in the JVM. The Java Memory Model, however, is quite hard to digest and covers a much wider range of operations than just the atomic ones, so in the context of this question it is enough to know that the volatile modifier provides atomic reads and writes, and that the Atomic* classes provide compare-and-swap operations that change values atomically without fear that another write will sneak in between the read and the write; and by the time you read this, the comments below have probably added something else that I forgot.
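The compare-and-swap pattern the Atomic* classes enable can be sketched as a retry loop; the "track a maximum" use case here is just an illustrative example:

```java
import java.util.concurrent.atomic.AtomicInteger;

// The classic CAS retry loop: read the current value, compute a new one,
// and write it back only if nobody changed it in between; otherwise retry.
public class CasDemo {
    static final AtomicInteger max = new AtomicInteger(0);

    static void updateMax(int candidate) {
        int current;
        do {
            current = max.get();
            if (candidate <= current) return;  // nothing to update
        } while (!max.compareAndSet(current, candidate)); // lost a race: retry
    }

    public static void main(String[] args) {
        updateMax(5);
        updateMax(3); // ignored: 3 <= 5
        updateMax(7);
        System.out.println(max.get()); // prints 7
    }
}
```

This is safe to call from any number of threads: a concurrent writer can only make compareAndSet fail, which just sends the loop around again with a fresh value.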


Answer 2, authority 68%

How to understand which operations are atomic and which are non-atomic?

At the risk of being accused of sexism, I can't resist giving an example of an atomic operation: pregnancy is a strictly atomic operation; there is always one and only one father (let's leave all sorts of genetic tricks out of the picture).

And conversely, an example of a non-atomic operation: raising a child is, alas, non-atomic; a child is, unfortunately, the subject of many different unsynchronized operations on its fragile soul: mom, dad, grandma, grandpa, the TV, kindergarten, school, friends, and so on down the list.


Answer 3, authority 40%

I’ll try to explain. I could be wrong.

There is Java: the sources are compiled into bytecode, and the bytecode is translated into machine code while the program runs. A single bytecode instruction can be translated into several machine-code instructions. This is where the atomicity problem comes from. The processor cannot execute a high-level statement as a single unit: it executes machine code consisting of a sequence of instructions. Therefore, if different processors manipulate the same data, their instructions can interleave.

Let me give you an example:

There is a global variable:

public volatile int value = 0;

first-thread {
    value++;
}

second-thread {
    value++;
}

Incrementing a variable is not an atomic operation: it requires at least three instructions:

  • read data
  • increase by one
  • write data

Accordingly, both threads must execute this sequence, but the interleaving of their steps is undefined. Because of this, a situation like the following can arise:

  1. The first thread read the data
  2. The second thread read the data
  3. The first thread has increased the value by 1
  4. The second thread has increased the value by 1
  5. Second thread wrote the value
  6. The first thread wrote the value

The result is 1, not 2 as expected.

To prevent this, use either synchronization or the atomic primitives from the java.util.concurrent.atomic package.
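The lost-update scenario above can be fixed by replacing value++ with AtomicInteger.incrementAndGet, which performs the read-modify-write as a single indivisible operation (a minimal sketch; thread counts and iteration counts are arbitrary):

```java
import java.util.concurrent.atomic.AtomicInteger;

// The same two-thread increment, but atomic: incrementAndGet cannot be
// interleaved mid-operation, so no update is ever lost.
public class AtomicCounter {
    static final AtomicInteger value = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) value.incrementAndGet();
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(value.get()); // always 200000
    }
}
```

With a plain volatile int and value++, runs of the same program would typically print something less than 200000.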
