I'd like to cache some heavy resources locally. The goal of this implementation is to have be able to load resources for an unknown amount of time and then keep them in memory and finally evict them upon needs. So there is no "time-based" evictions needed: only "memory-based" ones.
As written in comment, I'm writing this code in the scope of a small library using only Guava as dependency, and currently, Guava doesn't support the single-element cache.
I've developped the following code:
Resource.java
import com.google.common.io.ByteSource;
import java.io.UncheckedIOException;
import java.lang.ref.SoftReference;
import java.util.function.Supplier;
public class Resource<T> implements Supplier<T>, AutoCloseable {
private final ByteSource source;
private final ResourceLoader<T> loader;
private SoftReference<T> reference;
Resource(ByteSource source, ResourceLoader<T> loader) {
this.source = source;
this.loader = loader;
reference = new SoftReference<>(null);
}
@Override
public T get() throws UncheckedIOException {
// Double-checked locking
T object = reference.get();
if (object == null) {
synchronized(this) {
object = reference.get();
if (object == null) {
object = this.loader.uncheckedLoad(this.source);
this.reference = new SoftReference(object);
}
}
}
return object;
}
@Override
public void close() {
synchronized(this) {
this.reference = new SoftReference<>(null);
}
}
}
ResourceLoader.java
import com.google.common.io.ByteSource;
import java.io.IOException;
import java.io.UncheckedIOException;
public interface ResourceLoader<T> {
public default <T> T uncheckedLoad(ByteSource source) throws UncheckedIOException {
try {
return this.load(source);
} catch (IOException ex) {
throw new UncheckedIOException(ex);
}
}
public <T> T load(ByteSource source) throws IOException;
}
I've also written the test:
import static org.hamcrest.Matchers.is;
import static org.hamcrest.Matchers.sameInstance;
import static org.junit.Assert.assertThat;
import static org.mockito.Matchers.any;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
public class ResourceTest {
@org.junit.Test
public void shouldCacheInstanceAndReturnIt() throws java.io.IOException {
ResourceLoader<String> loader = mock(ResourceLoader.class);
String abc = new String("abc"); // Force a specific address to avoid javac optimization
when(loader.uncheckedLoad(any())).thenReturn(abc);
when(loader.load(any())).thenReturn(abc);
Resource<String> resource = new Resource(null, loader);
assertThat(resource.get(), is(sameInstance(abc)));
assertThat(resource.get(), is(sameInstance(abc)));
verify(loader, times(1)).uncheckedLoad(any());
}
}
My various questions regarding this code are the following:
- Is this implementation appropriate for a cache?
- Is the double-checked locking the right way to use this? A priori, various calls to
Resource::get
should return identical object (not especially in regard toObject::equals
or==
, but mostly in regards to what the final user sees). - How can I properly test this implementation? I see it's working: my test passes, but how can I simulate a soft reference eviction? Can I simply call my
close()
method?
2 Answers 2
Something between a comment on rolfl's review and an own CR.
DCL
DCL is broken, but can be easily helped using volatile
. Without it, it may happen that an object gets load
ed, but other threads see its non-initialized state, which is too bad.
This wouldn't happen for objects with final variables fully initialized in the constructor, but imposing such a constraint on the user is cruel and the punishment in form of inexplicable random bugs even more so.
AutoCloseable
I'd prefer Closeable
to AutoCloseable
as it guarantees idempotency and that's how it behaves. But I agree that leaving it out is better.
Caching / ReferenceQueue
I guess that the part concerning caching is fine. A ReferenceQueue
would load things which may not be needed anymore, that's more a threaded memory hog than a cache. I actually agree with the design in this respect.
No, I don't need a soft reference.
You most probably do, unless it's OK to lose entries on each GC, and then you could use a WeakReference
. But the standard for caching is SoftReference
.
Note that there were bugs when the JVM threw OOM before cleaning all soft reference. That's why they got rather unpopular and some whatever-stupid-criterion eviction is often preferred.
Alternative
Your single element cache could be implemented via a normal (Guava) cache.
@RequiredArgsConstructor
@Getter
class ResourceKey<T> {
public T get() {
return CACHE.getUnchecked(this);
}
private final ByteSource source;
private final ResourceLoader<T> loader;
}
private static final LoadingCache<ResourceKey, T> CACHE =
CacheBuilder.newBuilder().softValues().build(
new LoadingCache<ResourceKey, T>() {
public T load(ResourceKey key) {...}
}
);
-
\$\begingroup\$ Thanks. You've convinced me to use a
LoadingCache
(notCacheLoader
;) ) also, thanks for confirming that I needSoftReference
. I'm using Java 8, so I believe that the bugs you mention are past in my case. \$\endgroup\$Olivier Grégoire– Olivier Grégoire2015年05月20日 08:54:55 +00:00Commented May 20, 2015 at 8:54 -
\$\begingroup\$ @OlivierGrégoire I haven't been trying to convince you. Actually, you solution has an advantage or two: 1. If you lose the last reference to your
Resource
, then it gets collected together with its content. This can be achieved by usingweakKeys()
. 2. It probably needs less memory per entry. 3. It's probably a bit faster as there's no hash map lookup. \$\endgroup\$maaartinus– maaartinus2015年05月20日 09:03:17 +00:00Commented May 20, 2015 at 9:03 -
\$\begingroup\$ Yes, but my solution has so many issues in regards to synchronization with that DCL since I can't really use volatile... \$\endgroup\$Olivier Grégoire– Olivier Grégoire2015年05月20日 09:49:03 +00:00Commented May 20, 2015 at 9:49
-
\$\begingroup\$ @OlivierGrégoire What's wrong with
volatile
(AFAICT all what's missing are nine chars). +++ If anything, then useAtomicReference
or "normal" synchronization. With a lock per element instead of lock per segment, you get quite some concurrency (assuming not all threads concentrate on a single instance). \$\endgroup\$maaartinus– maaartinus2015年05月20日 10:16:05 +00:00Commented May 20, 2015 at 10:16 -
1\$\begingroup\$ @OlivierGrégoire Exactly. Neither the JVM (>=1.5) nor the hardware can re-order writes preceding a volatile write after it and reads following a volatile read before it, so you won't ever see an non-initialized object. And sure, you can't make a local variable volatile. \$\endgroup\$maaartinus– maaartinus2015年05月20日 10:48:29 +00:00Commented May 20, 2015 at 10:48
Double-checked
Double-checked locking does not work. There are some complicated reasons for this, but, it's enough to say that the Java specification allows the Java runtime to legitimately reorder some of the checking code in a way that makes the sync lock out-of-alignment with the checks you think you are checking. (See: Double checked locking pattern: Broken or not? )
AutoClosable
Now, about the AutoClosable
.
Why?
What does that get for you? You forcefully "forget" a reference that was allowed to be garbage collected anyway. Additionally, it does not close the instance at all, it just makes it refresh later. The AutoClose is a shim, that's not a very useful one at that.
SoftReference
Here's a paradox with caching. What are you caching here? The instance of the resource that you read from the ResourceLoader, or the Resource
class itself? It would be preferable to cache the Resource
instance itself, but you cache the underlying data. Additionally, you re-load it if the cache is cleared by the GC.
My problem is that anything that has a strong reference to the Resource instance probably wants to find the content again at some point.... otherwise it would not need to hold the reference. Thus, if you have memory problems, it is possibly just better to reload the resource each time, than to possibly hang on to it.
Are you sure that you want a SoftReference?
If yes, why don't you have a ReferenceQueue? A reference queue allows you to back-reference a SoftReference to the Resource that owns it. It works as follows:
- create a sub-class of SoftReference (not a wrapper, but an actual "extends" instance).
- put a strong-reference back to the Resource that the SoftReference comes from
- register the SoftReference on a queue
Then, in a separate thread, pull GC'd items off the queue, and process them. This allows you to get an ahead-of-time peek at data that has been de-cached.
It also allows you to have a colleciton of Resources that are held in, for example, a Map, and you can clear the Resources out of the map as they are cleared out by the GC.
-
\$\begingroup\$ Double-checked: what are the alternatives? Using volatile? How in this scope? -- AutoCloseable: fair point. It'll get sacked. -- SoftReference: No, I don't need a soft reference. I read a bit on the references and I understood it was the best for this use case. Any other suggestion? Regarding the resources, they're "getted" a lot in a small amount of time then not in a large one. I want to make sure that the user doesn't reload every time because then he'll do caching probably worse than expected.
Resource
isn't a type of resource, it's really the accessor to the resource itself. \$\endgroup\$Olivier Grégoire– Olivier Grégoire2015年05月19日 16:02:38 +00:00Commented May 19, 2015 at 16:02
Explore related questions
See similar questions with these tags.
ResourceLoader
is overkill and becomes unmanageable in my code and this stays closer to the actualResource
. Finally, the reasons why I won't consider other cache implementations is that this is already in the scope of a small library extending the missing functionalities of Guava. I won't add heavy blocks as extra dependencies. \$\endgroup\$synchronized(this)
. Put in aprivate Object lock = new Object();
and synchronize on that instead.Otherwise a client couldsynchronized(resourceInstance)
and ruin your day. Well, their own day, really, but still. \$\endgroup\$throws
clause. \$\endgroup\$