3

I'm currently implementing a multi-threaded version of Barnes-Hut algorithm for the N-body problem. While the algorithm works, it's not very optimized and I'm trying to reduce the run time of my program.

I've made sure that there are several threads that accurately find borders of the space I'm working with and realized that the code in the highest level object that I set the borders in is fairly unoptimized. It looks like this:

public synchronized void setBorders(float maxX, float minX, float maxY, float minY, int thread){
 if(maxX > this.maxX){ 
 this.maxX = maxX; 
 }
 if(maxY > this.maxY){ 
 this.maxY = maxY; 
 }
 if(this.minX > minX){ 
 this.minX = minX; 
 }
 if(this.minY > minY){ 
 this.minY = minY; 
 }
}

I have several threads attempting to access this method once they've figured out their respective values. Since a synchronized object can only be accessed by a single thread at a given time, this can be significantly improved.

On possible solution that I thought of was to remove the "public synchronized void" and rewrite the code into this:

public synchronized void setBorders(float maxX, float minX, float maxY, float minY, int thread){
 Synchronize(this){ 
 if(maxX > this.maxX){ 
 this.maxX = maxX; 
 }
 }
 Synchronize(this){
 if(maxY > this.maxY){ 
 this.maxY = maxY; 
 }
 }
 Synchronize(this){
 if(this.minX > minX){ 
 this.minX = minX; 
 }
 }
 Synchronize(this){
 if(this.minY > minY){ 
 this.minY = minY; 
 }
 }
}
}

If my understanding of the Synchronized block is correct, that only a single thread can access the code inside a Synchronize(this) block at any given time, this should speed up my code.

Will this work, or is there a reason I should avoid this that I've missed?

Edit: Wow, I'm amazed at the speed and accuracy of the help you guys give. I'm really thankful for all this!

asked Sep 8, 2014 at 16:23
7
  • 2
    I see several options here. The compiler may convert the latter code to the former code, since there is no code between the synchronization blocks, or the performance will be worse due to more synchronization contention. I don't really see how that would improve performance. Unless your idea was to use 4 different locks, instead of using this. Why don't you test it out and post results. Commented Sep 8, 2014 at 16:27
  • 1
    Almost certainly slower although the JVM is free to combine the different synchronized blocks in which case it'd be the same speed. The same as for almost all performance questions... measure it. Commented Sep 8, 2014 at 16:27
  • 2
    the synchronized modifier on the method declaration in the second example looks like a mistake. and it's hard to see the point of having multiple threads all contending to set the values of a single object, are you ok with an outcome where minX is set by one thread and minY is set by another thread? Commented Sep 8, 2014 at 17:03
  • 1
    "Shouldn't acquiring and releasing four locks be faster than acquiring and releasing one?" Umm, no. It should take about four times as long. Commented Sep 8, 2014 at 17:43
  • 1
    The synchronization can be avoided or it's effect minimized, but do you really think, it's the most costly part? I doubt it, unless you're running on hundreds of cores. Commented Sep 8, 2014 at 20:12

4 Answers 4

1

That code should execute extremely quickly, so I doubt if splitting it up into more synchronized blocks will lessen contention. I'd go with the single, function level synchronized.

However, if the code were doing more, e.g.

...
if(maxX > this.maxX){ 
 this.maxX = maxX; 
 doSomeSlowerCalculation();
 updateSomeComplexSharedDataStructure();
}
...

Then splitting it up into individual synchronized blocks could help

answered Sep 8, 2014 at 17:06

Comments

1

There are three options to avoid or reduce synchronization:

Use atomics

Using a class from Guava you can do

private final AtomicDouble maxX = new AtomicDouble(Double.MIN_VALUE);

and simply

while (true) {
 double currentMaxX = this.maxX.get();
 if (currentMaxX >= maxX) break;
 boolean ok = compareAndSet(currentMaxX, maxX);
 if (ok) break;
}

If you really have to use float, write your own class, a few line like these would do.

no synchronization, just a CAS.

Double Checked Locking

With

private volatile float maxX;

and Java 1.5 or higher, the following will do

if (maxX > this.maxX) {
 synchronized (this) {
 if (maxX > this.maxX) {
 this.maxX = maxX; 
 }
 }
}

Minimize sharing

Compute your local max/min and only after a few iterations update the shared state. This is the easiest, but may not apply to your use case.

answered Sep 8, 2014 at 17:36

Comments

0

To optimize this more effectively, you could break your setBorders routine into 4 methods (1 for each param) and use 4 locking Objects to lock each setter method individually... But this would only be more effective if your setters were being called individually (not always as a block) and/or weren't always called in the same order, or if you could do a quick escape from the setter when you determine the value isn't actually changing

answered Sep 8, 2014 at 16:33

3 Comments

making each thread synchronize more is not (generally) going to make things faster.
@jtahlborn I understand what you're getting at... its like unnecessary/useless code blocks, right? But in this case, even if every thread was working the same way, locking variables individually would allow up to 4 threads to be processing at a time, rather than just 1... In any case, it depends on how your code is run.
with synchronized block, only one thread is "working" at a time. and the work is trivial (simple assignment). so your threads are just going to be fighting with each other 4 times as often for locks.
0

First, adding synchronized to a method definition like

synchronized void foo() {
 ...
}

is the same thing as

void foo() {
 synchronized(this) {
 ...
 }
}

If you nest synchronized blocks like your second example:

synchronized void foo() {
 synchronized(this) {
 ...
 }
 synchronized(this) {
 ...
 }
}

then the inner blocks don't have any effect on synchronization, the thread calling the method still has to acquire the object's monitor on entering the method, and releases it upon exiting.

If you remove the synchronized from the method so all you have is:

void foo() {
 synchronized(this) {
 ...
 }
 synchronized(this) {
 ...
 }
}

then as threads execute this code they will have to acquire each block separately. So if multiple threads are setting different variables then you may end up with the object having some fields set by one thread and other fields set by another.

With smaller synchronized blocks, the threads may spend more time contending for locks. For each of these blocks the scheduler has to decide which thread gets the lock next. Lock acquisition is not fair, there's no certainty about which thread will get the lock next.

It's not obvious the lower-granularity approach will be faster, it may be better to let a thread get in, set all the fields, and get out. The second example may just make the scheduler work harder for no good reason. The only certain difference is that the second approach will allow intermixing data from different threads.

answered Sep 8, 2014 at 17:43

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.