I am working on a project in which I construct a URL with a valid hostname (but not a blocked hostname) and then execute that URL using RestTemplate
from my main application thread. I also have a single background thread in my application which parses the data from the url and extracts the block list of hostnames from it.
So if any block list of hostnames is present, then I won't make a call to that hostname from the main thread and I will try making a call to another hostname.
By block list, I mean whenever any server is down, its hostname is on the block list.
Below is my background thread code. It will get the data from my service URL and keep on running every 10 minutes once my application has started up. It will then parse the data coming from the URL and store it in a ClientData
class variable -
public class TempScheduler {
// .. scheduledexecutors service code to start the background thread
// call the service and get the data and then parse
// the response.
private void callServiceURL() {
String url = "url";
RestTemplate restTemplate = new RestTemplate();
String response = restTemplate.getForObject(url, String.class);
parseResponse(response);
}
// parse the response and store it in a variable
private void parseResponse(String response) {
//...
// get the block list of hostnames
Map<String, List<String>> coloExceptionList = gson.fromJson(response.split("blocklist=")[1], Map.class);
List<String> blockList = new ArrayList<String>();
for(Map.Entry<String, List<String>> entry : coloExceptionList.entrySet()) {
for(String hosts : entry.getValue()) {
blockList.add(hosts);
}
}
// store the block list of hostnames which I am not supposed to make a call
// from my main application
ClientData.replaceBlockedHosts(blockList);
}
}
Below is my ClientData
class. replaceBlockedHosts
method will only be called by a background thread meaning only one writer. But isHostBlocked
will be called by main application threads multiple times to check whether the hostname is blocked or not. And also blockHost
method might be called from catch block
multiple times to add the down host in the blockedHosts
list so I need to make sure all the read threads can see the consistent data and are not making calls to that down host, instead they are making calls to next host in the hostnames linked list.
public class ClientData {
// .. some other variables here which in turn used to decide the list of hostnames
private static final AtomicReference<ConcurrentHashMap<String, String>> blockedHosts =
new AtomicReference<ConcurrentHashMap<String, String>>(new ConcurrentHashMap<String, String>());
public static boolean isHostBlocked(String hostName) {
return blockedHosts.get().containsKey(hostName);
}
public static void blockHost(String hostName) {
blockedHosts.get().put(hostName, hostName);
}
public static void replaceBlockedHosts(List<String> hostNames) {
ConcurrentHashMap<String, String> newBlockedHosts = new ConcurrentHashMap<>();
for (String hostName : hostNames) {
newBlockedHosts.put(hostName, hostName);
}
blockedHosts.set(newBlockedHosts);
}
}
And below is my main application thread code in which I am finding all the hostnames on which I can make a call and then iterate the hostnames
list to make a call.
If that hostname
is null or in the block list category, then I won't make a call to that particular hostname and will try next hostname in the list.
@Override
public DataResponse call() throws Exception {
List<String> hostnames = new LinkedList<String>();
// .. some separate code here to populate the hostnames list
// from ClientData class
for (String hostname : hostnames) {
// If host name is null or host name is in block list category, skip sending request to this host
if (hostname == null || ClientData.isHostBlocked(hostname)) {
continue;
}
try {
String url = generateURL(hostname);
response = restTemplate.getForObject(url, String.class);
break;
} catch (RestClientException ex) {
// does this call will be thread safe for blockHost method in ClientData class?
ClientData.blockHost(hostname);
}
}
}
I don't need to make a call to the hostname whenever it is down from the main thread. And my background thread gets these detail from one of my service, whenever any server is down, it will have the list of hostnames which are block hosts and whenever they are up, that list will get updated.
And also, whenever any RestClientException
is being thrown, I will add that hostname in the blockedHosts
concurrentmap since my background thread is running every 10 minutes so that map won't have this hostname until 10 minutes is done. And whenever this server came back up, my background will update this list automatically.
Should I have fully atomic operations so that all the main threads should see consistent data from isHostBlocked
method?
I'm opting for code review to see whether full atomicity is maintained or not in my above code.
3 Answers 3
To answer the issue you raised in a comment:
I have this code running for a while in production and the problem I am seeing is as soon as hostA goes down, I start seeing lot of java.util.concurrent.TimeoutException on my client as you can see I am using Callable so which means I am doing future.get somewhere and it throws TimeoutException as soon as any hosts goes down but if all the hosts are up then everything works like a charm without any java.util.concurrent.TimeoutException which makes me think that something is wrong when any host goes down and gets added to block list so this is puzzling me a little bit
(Also, holy run-on sentences, Batman!)
A possible scenario for the time-outs is as follows:
- You have 4 threads, A through D, connecting to server
api-2.acme.com
. - Thread B throws a RestException for some reason, and marks
api-2.acme.com
as unreachable. - However, threads A, C, and D are not notified of this, and continue to attempt connecting to
api-2.acme.com
, leading either to a time-out or an exception of their own.
You could try cancelling the attempts of A, C, and D through Future.cancel(true)
, which usually issues a thread interrupt that, in turn, would lead to an InterruptedIOException
from the socket. Or you could leave them be: maybe the error was transient—a hiccup—and the other threads make it through just fine.
Thread-safe?
Your code is thread-safe in the sense that it is maintains consistency under concurrency, but it runs the risk of lost updates.
replaceBlockedHosts
builds a new map and then replaces the current map with it. But any intermediate changes made to the old map are lost when this happens, so if a thread just reported failure, it's gone. I see two quick ways to deal with that issue:
Block until the replacement is complete.
Add timestamps to your blocklists. Ignore commands with an earlier timestamp than your list/map currently holds; after all, it is already holding more recent data.
Timestamping failed attempts allows you more fine-grained control with retry policies. It also obviates the need for a timed check-and-replace, so that saves you from needing a replace function at all.
I've added a rudimentary implementation that simply marks the host as unavailable for 10 minutes. Note that setHostUnavailable
uses atomic retrieve-and-replace calls to handle concurrency.
static final long OUTAGE_RETRY_MILLIS = TimeUnit.MILLISECONDS.convert(10, TimeUnit.MINUTES);
final ConcurrentMap<String, Long> failedAttempts;
/* For now, return a constant value. Change this in case we need to have variable retry delays,
* for instance, depending on recent failures. */
public long getRetryDelayInMillis(String host) {
return OUTAGE_RETRY_MILLIS;
}
public boolean isHostBlocked(String host, long currentTickMillis) {
final Long lastOutageMillis = failedAttempts.get(host);
return lastOutageMillis != null && (currentTickMillis - lastOutageMillis) < getRetryDelayInMillis(host);
}
public void setHostUnavailable(String host, long currentTickMillis) {
/* Loop until result written away without interference. */
while (true) {
final Long lastOutageMillis = failedAttempts.get(host);
if ( lastOutageMillis == null ) {
if ( failedAttempts.putIfAbsent(host, currentTickMillis) == null ) {
break;
}
} else if ( lastOutageMillis < currentTickMillis ) {
if ( failedAttempts.replace(host, lastOutageMillis, currentTickMillis) ) {
break;
}
} else { // lastOutageMillis >= currentTickMillis
break; // another thread seems to have written more recent results
}
}
}
Replacing the blocklist
I will be calling
setHostUnavailable
from my background thread right? [My] background thread will have a list of block hostnames so I need to pass the list of block hostnames tosetHostUnavailable
which accepts String as of now.
It depends on the semantics of the check your background thread (the every-ten-minutes one) does. If the idea is that these hosts should be added to the blocklist, then it is a simple matter of looping over setHostUnavailable
. If additionally hosts not in the line-up need to be unblocked, then you need a way to remove them from the blacklist:
public void replaceBlockedHosts(Set<String> hosts, long currentTickMillis) {
for ( final String hostToBlock : hosts ) {
setHostUnavailable(blockedHost, currentTickMillis);
}
// Now unblock other hosts
final Set<String> hostsToUnblock = new Hashset<>(failedAttempts.keySet());
hostsToUnblock.removeAll(hosts);
for ( final String blockedHost : hostsToUnblock ) {
setHostAvailable(blockedHost, currentTickMillis);
}
}
public void setHostAvailable(String host, long currentTickMillis) {
/* Loop until result written away without interference. */
while (true) {
final Long lastOutageMillis = failedAttempts.get(host);
if ( lastOutageMillis == null || lastOutageMillis > currentTickMillis ) {
break; // no outage or outdated results, break out
} else { // lastOutageMillis <= currentTickMillis
if ( failedAttempts.remove(host, lastOutageMillis) ) {
break;
}
}
}
}
-
\$\begingroup\$ @JvR Thanks for suggestion. Quick question, I will be calling
setHostUnavailable
from my background thread right? And also, do you think I should have Http Request timeout as well on myRestTemplate
along withfuture timeout
? \$\endgroup\$arsenal– arsenal2014年09月06日 15:51:41 +00:00Commented Sep 6, 2014 at 15:51 -
\$\begingroup\$ @Webby Call
setHostUnavailable
from any thread that has just discovered the host is unreachable. \$\endgroup\$JvR– JvR2014年09月06日 16:35:25 +00:00Commented Sep 6, 2014 at 16:35 -
\$\begingroup\$ @JvR sure but in my case background thread will have a list of block hostnames so I need to pass the list of block hostnames to
setHostUnavailable
which accepts String as of now. \$\endgroup\$arsenal– arsenal2014年09月07日 03:08:18 +00:00Commented Sep 7, 2014 at 3:08 -
\$\begingroup\$ @Webby I've (hopefully) addressed your question in the answer above. \$\endgroup\$JvR– JvR2014年09月07日 14:17:53 +00:00Commented Sep 7, 2014 at 14:17
Your code is thread-safe as it is. [But not really... see the comments below.]
Be careful with method and variable names.
TempScheduler
does not really have anything temporary.callServiceURL
does a lot more than calling a URL.blockHost(hostname)
would be better asaddToBlockedHosts(hostname)
.call
is non-descriptive.ClientData
only contains information related to host names.Try to have each method doing only one thing. For example,
callServiceURL
should just return some json. ThenparseResponse
should take that json and return a list of host names. Finally, some other functions should add the host names to the set of blocked hosts.volatile
has the same effect asAtomicReference
and makes the code simpler. See this post.Instead of either
AtomicReference
orvolatile
you could just have afinal ConcurrentHashMap
which you wouldclear
when you receive a new list.Your non-class
ClientData
with only static methods makes me cringe. Please make it a normal class. Create an instance ofClientData
in the main thread and pass it to the secondary thread when you create it.
-
\$\begingroup\$ Thanks toto for the great suggestion. I will work on making the method names better and class name as well. One thing which is bothering me, do you think the
blockHost
operation is atomic? Meaning if let's sayhostA
got added to block list by ThreadA, I want to make sure no other thread makes a call to hostA then so somehow I have a feeling it is not fully atomic? \$\endgroup\$arsenal– arsenal2014年09月03日 03:04:45 +00:00Commented Sep 3, 2014 at 3:04 -
1\$\begingroup\$ You are using a
ConcurrentHashMap
with anAtomicReference
so it is absolutely thread-safe. \$\endgroup\$toto2– toto22014年09月03日 04:14:06 +00:00Commented Sep 3, 2014 at 4:14 -
\$\begingroup\$ hmmm, I have this code running for a while in production and the problem I am seeing is as soon as hostA goes down, I start seeing lot of
java.util.concurrent.TimeoutException
on my client as you can see I am usingCallable
so which means I am doingfuture.get
somewhere and it throwsTimeoutException
as soon as any hosts goes down but if all the hosts are up then everything works like a charm without any java.util.concurrent.TimeoutException which makes me think that something is wrong when any host goes down and gets added to block list so this is puzzling me a little bit.. \$\endgroup\$arsenal– arsenal2014年09月03日 04:59:48 +00:00Commented Sep 3, 2014 at 4:59 -
\$\begingroup\$ ..may be I thought blockHost operation is not fully atomic so that's why I am seeing this issue? \$\endgroup\$arsenal– arsenal2014年09月03日 05:00:09 +00:00Commented Sep 3, 2014 at 5:00
-
\$\begingroup\$ I think I understand now.
blockHost
has toget
the set and then it adds the value to it. It is possible that the set in the underlying reference is swapped during a call toblockHost
. For your application, it is not an issue as you have "sloppy" requirements: The blocked hosts which you know about are never quite exactly the true set of blocked hosts. If it does happen that the set is swapped during a call toblockHost
, it is effectively equivalent to having a new blocked host added to the set a fraction of a second before the set is swapped. It's not something you would care abou \$\endgroup\$toto2– toto22014年09月03日 18:17:42 +00:00Commented Sep 3, 2014 at 18:17
Let's get something straight:
- One thread iterates over a list of hosts to run a request for each, and if the request fails it adds to a list of blocked hosts
- Another thread runs in the background, and every 10 minutes it gets the list of blocked hosts from a service and replaces the current list
In the current code these actions are not synchronized: while the background thread gets the new list of blocked hosts, the first thread may add hosts. If those hosts are not on the new list, then the replace operation will effectively remove them.
This might not be a real problem. It depends on if you really want to get every update or not. If you are ok with missing a few updates, that an unreachable host was not correctly added (because the replace operation wiped it out), then it's fine. If you don't want to miss updates, then you need to synchronize these two actions.
Essentially, this is the same thing that @JvR said already. And I agree with him that adding timestamps to the block lists would be a better solution than blocking during the replacement. I suggest to work with his code snippet.
Another good thing about that you were not using the values of your ConcurrentHashMap
anyway.
You didn't include the code that populates the LinkedList<String> hostnames
in the call
method. Keep in mind that if you iterate over the block-list, if during the iteration the block-list gets replaced, there are no guarantees about the iterator in progress. The iterator's state corresponds to the elements that existed at the time of its creation. It may or may not get the updates, so you may be iterating over a list that's out of date. See this related answer for more details.
It's not related to your problem, but I find this null check a bit strange:
if (hostname == null || ClientData.isHostBlocked(hostname)) {
How would a null hostname end up in your list? I suggest to improve your code in a way that avoids adding null hostnames to the list, which doesn't make sense. And then you can drop this null check.
You could replace the AtomicReference
with the volatile
keyword:
- a
ConcurrentHashMap
already ensures thread safety of the hashmap operations - the replacement of the hashmap can be safe with the
volatile
keyword - none of these can help with the logical disconnect between your 2 threads
An AtomicReference
would help if you wanted to do compare-and-set operations,
which is not the case here.
-
\$\begingroup\$ Thanks janos for the suggestion. One quick thing, do you think I should have Http Request timeout as well on my
RestTemplate
along with future timeout? \$\endgroup\$arsenal– arsenal2014年09月06日 15:45:47 +00:00Commented Sep 6, 2014 at 15:45
Explore related questions
See similar questions with these tags.