I want to implement a multithreaded LRU cache in C++. Here is the code. Can you tell me what I should do to make it safer for multithreaded use?
    #include <iostream>
    #include <unordered_map>
    #include <thread>
    #include <chrono>
    #include <mutex>
    using namespace std;

    class LRUCache {
    public:
        class Node {
        public:
            int key, val;
            Node *left, *right;
            Node(int _key, int _val) : key(_key), val(_val), left(NULL), right(NULL) {}
        } *L, *R;
        unordered_map<int, Node*> m;
        int n;
        mutex mtx;
        condition_variable cv;
        void insert(Node* node) {
            Node* next = L->right;
            L->right = node;
            node->right = next;
            node->left = L;
            next->left = node;
        }
        void erase(Node* node) {
            node->left->right = node->right;
            node->right->left = node->left;
        }
        LRUCache(int capacity) {
            n = capacity;
            L = new Node(-1, -1), R = new Node(-1, -1);
            L->right = R;
            R->left = L;
        }
        int get(int key) {
            mtx.lock();
            if (m.count(key) == 0) { mtx.unlock(); return -1; }
            Node* node = m[key];
            erase(node);
            insert(node);
            cout << "get:" << node->val << endl;
            mtx.unlock();
            return node->val;
        }
        void put(int key, int value) {
            mtx.lock();
            if (m.count(key)) {
                m[key]->val = value;
                erase(m[key]);
                insert(m[key]);
            } else {
                Node* node = new Node(key, value);
                if (m.size() == n) {
                    Node* tmp = R->left;
                    erase(tmp);
                    m.erase(tmp->key);
                }
                insert(node);
                m[key] = node;
            }
            cout << "put:" << value << endl;
            mtx.unlock();
        }
        void putcache() {
            for (int i = 1; i <= 10; i++) {
                put(i, i + 10);
                this_thread::sleep_for(std::chrono::seconds(1));
            }
        }
        void getcache() {
            for (int i = 1; i <= 10; i++) {
                get(i);
                this_thread::sleep_for(std::chrono::seconds(1));
            }
        }
    };

    int main() {
        LRUCache c1(100);
        std::thread t1(&LRUCache::putcache, &c1);
        std::thread t2(&LRUCache::getcache, &c1);
        t1.join();
        t2.join();
    }
2 Answers
Don't `using namespace std;`, especially in headers - that brings every identifier from `std` into the global namespace, from every header that's included (and that set of identifiers is subject to change with future standards). We have namespaces for a reason - let's use them!
This program doesn't compile, because `condition_variable` is not defined. If that's intended to be `std::condition_variable`, then we need to include `<condition_variable>` before using it.
It doesn't make sense to have all the members declared `public`. That means we have no control over the class's invariants, and any of its state can be modified unsafely.
Why are we using `std::cout` in member functions `get()` and `put()`? I wouldn't want my program's standard output stream corrupted like that (to be honest, I wouldn't want such tracing even on `std::clog` unless specifically requested). And don't flush the stream (with `std::endl`) when that's not necessary - just use a plain newline.
There's a risk of a memory leak here:

    L = new Node(-1, -1), R = new Node(-1, -1);

If the creation of the second `Node` fails, then we lose the one and only reference to the first (i.e. `L`), and are unable to `delete` it.
We have an even bigger memory leak, since no `Node` ever gets deleted at all.
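One way to plug that leak, while keeping the original raw-pointer design, is a destructor that walks the sentinel list and deletes every node (sentinels included). This is only a minimal sketch, assuming the same `Node`, `L` and `R` layout as the original; the `live` counter is instrumentation added here purely to make the (non-)leak observable, and is not part of the suggested design:

    #include <unordered_map>

    class LRUCache {
    public:
        struct Node {
            inline static int live = 0;   // instrumentation only: count of allocated nodes
            int key, val;
            Node *left, *right;
            Node(int k, int v) : key(k), val(v), left(nullptr), right(nullptr) { ++live; }
            ~Node() { --live; }
        };
        Node *L, *R;
        std::unordered_map<int, Node*> m;

        LRUCache(int) : L(new Node(-1, -1)), R(new Node(-1, -1)) {
            L->right = R;
            R->left = L;
        }
        ~LRUCache() {
            // Walk from the left sentinel to the end, deleting as we go;
            // this frees every list node, including both sentinels.
            Node* cur = L;
            while (cur != nullptr) {
                Node* next = cur->right;
                delete cur;
                cur = next;
            }
        }
    };

Note this only fixes the "never deleted" leak; the construction-time leak mentioned earlier is better addressed with smart pointers such as `std::unique_ptr`, or by avoiding manual allocation altogether as shown further down.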
Why do we have arbitrary sleeps in the `putcache()` and `getcache()` member functions? I don't see any benefit to delaying these operations.
Instead of manually locking and unlocking the mutex `mtx`, prefer `std::lock_guard`, which will unlock the mutex on all return paths, including when an exception is thrown.
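For illustration, here is the pattern on a deliberately simplified map-only cache (the LRU list handling is omitted so the sketch stays short); note that the early-return path no longer needs its own unlock:

    #include <mutex>
    #include <unordered_map>

    class Cache {
        std::unordered_map<int, int> m;
        std::mutex mtx;
    public:
        int get(int key) {
            std::lock_guard<std::mutex> lock(mtx);  // unlocked on every return path
            auto it = m.find(key);
            if (it == m.end()) return -1;           // no manual unlock needed here
            return it->second;
        }
        void put(int key, int value) {
            std::lock_guard<std::mutex> lock(mtx);
            m[key] = value;
        }
    };

If an operation can throw between lock and unlock (as `m[key]` can, on allocation failure), the guard still releases the mutex, which the manual `lock()`/`unlock()` pairs in the original do not.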
`Node` requires (possibly deleted) copy and assignment operators, as copies shouldn't be sharing the pointers `left` and `right`. Similarly, `LRUCache` needs those members, due to its pointers `L` and `R`. Initialise its members, rather than letting them default-initialise and then overwriting them in the constructor.
Make more use of the standard library

You are leveraging `std::unordered_map` to provide \$O(1)\$ lookups, but you have manually implemented a linked list to keep track of which nodes were most recently used. You can use `std::list` to do this for you. In fact, the list will store both the keys and values, while the `std::unordered_map` will map keys to entries in the list. You can make use of the guarantee that list iterators will not be invalidated, even if elements in a list are added or (re)moved. The `splice()` function can be used to move an element to the front.
    #include <cstddef>
    #include <list>
    #include <mutex>
    #include <unordered_map>
    #include <utility>

    class LRUCache {
        std::list<std::pair<int, int>> nodes;
        using node_iterator = decltype(nodes)::iterator;
        std::unordered_map<int, node_iterator> index;
        std::mutex mtx;
        std::size_t n;

        void update(node_iterator node_it) {
            nodes.splice(nodes.begin(), nodes, node_it);
        }

    public:
        LRUCache(std::size_t n): n(n) {}

        void put(int key, int value) {
            std::lock_guard lock(mtx);
            if (auto it = index.find(key); it != index.end()) {
                auto node_it = it->second;
                node_it->second = value;
                update(node_it);
            } else {
                nodes.emplace_front(key, value);
                index[key] = nodes.begin();
                if (nodes.size() > n) {
                    index.erase(nodes.back().first);
                    nodes.pop_back();
                }
            }
        }

        int get(int key) {
            std::lock_guard lock(mtx);
            if (auto it = index.find(key); it != index.end()) {
                update(it->second);
                return it->second->second;
            } else {
                return -1;
            }
        }
    };
There are other variations possible, see for example this StackOverflow question.
Ambiguous return value

There is a problem with `get()`: if it returns -1, it is not possible to tell whether this was because the key was not found, or if the key was found but the value was set to -1. Consider adding another way to report whether or not the key was found. A good way in C++17 or later is to use `std::optional` for the return value.
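The idea, shown here on a deliberately simplified map-only cache (the LRU list handling is omitted so the return-type change stays in focus): `std::nullopt` means "not found", and any `int` value, including -1, is an unambiguous hit.

    #include <mutex>
    #include <optional>
    #include <unordered_map>

    class Cache {
        std::unordered_map<int, int> m;
        std::mutex mtx;
    public:
        void put(int key, int value) {
            std::lock_guard<std::mutex> lock(mtx);
            m[key] = value;
        }
        // std::nullopt means "not found"; any stored int, including -1, is a real hit.
        std::optional<int> get(int key) {
            std::lock_guard<std::mutex> lock(mtx);
            auto it = m.find(key);
            if (it == m.end()) return std::nullopt;
            return it->second;
        }
    };

Callers then test `if (auto v = cache.get(k)) use(*v);` rather than comparing against a sentinel value.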
Make it a template

Your cache is hardcoded to use `int` for both keys and values. What if you want to have `std::string` keys and `float` values? You can easily make your cache work with any type of key and value by making it a template:
    template<typename Key, typename Value>
    class LRUCache {
        std::list<std::pair<Key, Value>> nodes;
        ...
    public:
        void put(const Key& key, const Value& value) {
            ...
        }
        std::optional<Value> get(const Key& key) {
            ...
        }
    };
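Filling in the elided bodies with the `std::list` implementation shown earlier (and adopting the `std::optional` return), a complete version might look like this; it is a sketch derived from the listings above rather than a tested production implementation:

    #include <cstddef>
    #include <list>
    #include <mutex>
    #include <optional>
    #include <string>
    #include <unordered_map>
    #include <utility>

    template<typename Key, typename Value>
    class LRUCache {
        std::list<std::pair<Key, Value>> nodes;            // front = most recently used
        using node_iterator = typename decltype(nodes)::iterator;
        std::unordered_map<Key, node_iterator> index;
        std::mutex mtx;
        std::size_t n;

        void update(node_iterator node_it) {
            nodes.splice(nodes.begin(), nodes, node_it);   // move to front, O(1)
        }

    public:
        explicit LRUCache(std::size_t n) : n(n) {}

        void put(const Key& key, const Value& value) {
            std::lock_guard<std::mutex> lock(mtx);
            if (auto it = index.find(key); it != index.end()) {
                it->second->second = value;
                update(it->second);
            } else {
                nodes.emplace_front(key, value);
                index[key] = nodes.begin();
                if (nodes.size() > n) {                    // evict least recently used
                    index.erase(nodes.back().first);
                    nodes.pop_back();
                }
            }
        }

        std::optional<Value> get(const Key& key) {
            std::lock_guard<std::mutex> lock(mtx);
            if (auto it = index.find(key); it != index.end()) {
                update(it->second);
                return it->second->second;
            }
            return std::nullopt;
        }
    };

With this in place, an `LRUCache<std::string, int>` or `LRUCache<std::string, float>` works with no further changes.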