I need to read many different files in succession as fast as possible. It's not one big file, but many small ones: the stat files at /proc/<pid>/stat.
I am using std::ifstream and std::getline() to read the files.
Here is my current code:
std::ifstream statFile("/proc/" + pid + "/stat");
if (!statFile.is_open())
{
    std::cerr << "Error: Could not open file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}
std::string line;
if (!std::getline(statFile, line))
{
    std::cerr << "Error: Could not read from file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}
I tried using mmap(), but that doesn't seem to work in the /proc/ directory.
I also tried using a buffer, but that was slower.
- Is there a question you want to ask? – Ulrich Eckhardt, Jan 24, 2025 at 8:42
- You could try either the C API (e.g. fopen, fread) or the Unix API (e.g. open, read). Not sure whether they would be faster, but worth a try. Really, your situation is unique, so the only way to know which method is fastest is to try them all. If nothing is fast enough, then maybe redesign your approach to whatever problem you are trying to solve. – john, Jan 24, 2025 at 8:49
- Could you get whatever information you need from a system API rather than by reading the files? – Alan Birtles, Jan 24, 2025 at 8:52
- Bleeding-edge Linux kernels have a system call that reads a file in one system call. If you need even more speed, you could overengineer something using io_uring. I would suggest you stick with the simple approach, though. – Botje, Jan 24, 2025 at 9:07
- What exactly are you doing with these files? Why are you using getline? – anon, Jan 24, 2025 at 10:26
3 Answers
Use open() and read() instead of std::ifstream.
std::ifstream has overhead due to its high-level abstraction and internal buffering. Using lower-level system calls like open() and read() can reduce this overhead.
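As a minimal sketch of that approach (the function name and buffer size here are my own, not from the question), reading one stat file with raw syscalls looks like this:

```cpp
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // read, close
#include <string>

// Read /proc/<pid>/stat in one open()/read()/close() round trip.
// These files are small, so a single fixed buffer is enough.
std::string read_stat(const std::string& pid) {
    const std::string path = "/proc/" + pid + "/stat";
    int fd = open(path.c_str(), O_RDONLY);
    if (fd == -1)
        return {};  // process may have exited; not fatal
    char buf[1024];
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    return n > 0 ? std::string(buf, buf + n) : std::string{};
}
```

For example, read_stat("self") reads the calling process's own stat file.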
If you keep a file descriptor open (or cache the paths), you can use pread() to read from a fixed offset without seeking or reopening the file:
ssize_t bytesRead = pread(fd, buffer, sizeof(buffer) - 1, 0);
However, in the /proc case, you generally process files one by one, so the benefit of pread() may be limited unless combined with other strategies.
If you can determine the target PIDs in advance, you could batch the file reads by keeping file descriptors open (to avoid repeated open() calls):
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // close
#include <cstdio>    // snprintf
#include <vector>

std::vector<int> openProcFiles(const std::vector<int>& pids) {
std::vector<int> fds;
for (int pid : pids) {
char path[64];
snprintf(path, sizeof(path), "/proc/%d/stat", pid);
int fd = open(path, O_RDONLY);
if (fd != -1) {
fds.push_back(fd);
}
}
return fds;
}
void closeProcFiles(const std::vector<int>& fds) {
for (int fd : fds) {
close(fd);
}
}
Then read from these descriptors as needed.
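To sketch what "read as needed" could look like (function name is mine, not from the answer): procfs regenerates a file's contents on each read from the start, so pread() at offset 0 gives a fresh snapshot without reopening the file:

```cpp
#include <fcntl.h>   // open, O_RDONLY (used in the example below)
#include <unistd.h>  // pread, close
#include <string>
#include <vector>

// Re-read a set of already-open /proc fds. pread() at offset 0 returns the
// current contents each time and leaves the descriptor's file position alone.
std::vector<std::string> readProcFiles(const std::vector<int>& fds) {
    std::vector<std::string> contents;
    contents.reserve(fds.size());
    for (int fd : fds) {
        char buffer[1024];
        ssize_t n = pread(fd, buffer, sizeof buffer, 0);
        contents.push_back(n > 0 ? std::string(buffer, buffer + n)
                                 : std::string{});
    }
    return contents;
}
```

The same descriptor can be passed in repeatedly to sample a process over time.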
Use profiling tools like perf, strace, or gprof to understand where bottlenecks occur. This ensures you're optimizing the right part of the code (e.g., syscalls vs. string parsing).
7 Comments
- fstream's overhead is mostly in the many templates involved (bloating binary size), but as long as you're interacting with manually opened files, not cout/cerr/cin (or, if you're using the latter, you disable the sync with stdio), and reading lines rather than doing complicated parsing, the overhead is a rounding error next to the cost of the system calls.

I recommend opening the files as normal with std::ifstream and then using std::ifstream::read to read the whole file into a fixed-size char array. On my system an array of 957 bytes is enough, given the max (or min) values of all the fields in proc_pid_stat(5) plus a max comm string length of 16. I'd round it up to 1024 for good measure. If your system has sizeof(int) greater than 4, double the size of the buffer, or double it anyway; I doubt you'll notice a difference.
For extracting the numerical values, I recommend using std::from_chars which is supposed to provide the fastest way to convert char arrays into numerical types.
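As a tiny illustration (a hypothetical helper, not part of the answer's final code): std::from_chars parses straight out of a char range, locale-free and allocation-free, and reports where it stopped, which is exactly what walking the space-separated stat fields needs:

```cpp
#include <charconv>     // std::from_chars
#include <string_view>

// Parse the first numeric field (the pid) from a stat-style line.
// from_chars writes the parsed value and returns a pointer past the
// last digit consumed, so subsequent fields can be parsed from there.
int parse_first_field(std::string_view line) {
    int value = 0;
    std::from_chars(line.data(), line.data() + line.size(), value);
    return value;
}
```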
I'd start by defining a class that can hold the values:
struct proc_pid_stat {
/*
(1) pid %d
The process ID.
*/
int pid;
/*
(2) comm %s
The filename of the executable, in parentheses.
Strings longer than TASK_COMM_LEN (16) characters
(including the terminating null byte) are silently
truncated. This is visible whether or not the
executable is swapped out.
*/
std::string comm;
//... add all the fields with the correct types ...
/*
(52) exit_code %d (since Linux 3.5) [PT]
The thread's exit status in the form reported by
waitpid(2).
*/
int exit_code;
};
To this class I'd add a "magic" value that can be used to indicate whether extracting the information from the file failed. It's assigned to the last field in the class when extraction starts, but it'll be overwritten if extraction succeeds.
struct proc_pid_stat {
// same as above goes here
static constexpr int fail = std::numeric_limits<int>::min();
};
Then to the actual extraction. The only messy parts are the comm (2) and state (3) fields, which come early. The rest can be handled by one big fold expression in which std::from_chars is used:
struct proc_pid_stat {
// same as above goes here
friend std::istream& operator>>(std::istream& is, proc_pid_stat& pps) {
pps.exit_code = fail; // set the last field to a "fail" value
char buf[1024]; // max length with all the fields incl. comm is 957
// read the whole line:
is.read(buf, static_cast<std::streamsize>(sizeof buf));
const char* const end = buf + is.gcount();
// extract fields:
auto rptr = std::from_chars(buf, end, pps.pid).ptr; // (1)
if(rptr == end) return is;
++rptr;
if(std::distance(rptr, end) < kernel_thread_comm_len) return is;
std::string_view comm(rptr, kernel_thread_comm_len);
const auto cpos = comm.rfind(')');
if(cpos == std::string_view::npos) return is;
auto sp = rptr + cpos + 1;
if(std::distance(sp, end) < 96) return is; // a reasonable amount left
pps.comm.assign(rptr, sp); // (2)
pps.state = *++sp; // (3)
++sp;
// if extracting all the rest succeeds, the last field, exit_code,
// will get a value other than "fail":
[&](auto&&... rest) {
(..., (sp = std::from_chars(sp + (sp != end), end, rest).ptr));
}(pps.ppid /* (4) */, pps.pgrp /* (5) */, pps.session /* (6) */,
pps.tty_nr /* (7) */, pps.tpgid /* (8) */, pps.flags /* (9) */,
pps.minflt /* (10) */, pps.cminflt /* (11) */, pps.majflt /* (12) */,
pps.cmajflt /* (13) */, pps.utime /* (14) */, pps.stime /* (15) */,
pps.cutime /* (16) */, pps.cstime /* (17) */, pps.priority /* (18) */,
pps.nice /* (19) */, pps.num_threads /* (20) */,
pps.itrealvalue /* (21) */, pps.starttime /* (22) */,
pps.vsize /* (23) */, pps.rss /* (24) */, pps.rsslim /* (25) */,
pps.startcode /* (26) */, pps.endcode /* (27) */,
pps.startstack /* (28) */, pps.kstkesp /* (29) */,
pps.kstkeip /* (30) */, pps.signal /* (31) */, pps.blocked /* (32) */,
pps.sigignore /* (33) */, pps.sigcatch /* (34) */,
pps.wchan /* (35) */, pps.nswap /* (36) */, pps.cnswap /* (37) */,
pps.exit_signal /* (38) */, pps.processor /* (39) */,
pps.rt_priority /* (40) */, pps.policy /* (41) */,
pps.delayacct_blkio_ticks /* (42) */, pps.guest_time /* (43) */,
pps.cguest_time /* (44) */, pps.start_data /* (45) */,
pps.end_data /* (46) */, pps.start_brk /* (47) */,
pps.arg_start /* (48) */, pps.arg_end /* (49) */,
pps.env_start /* (50) */, pps.env_end /* (51) */,
pps.exit_code /* (52) */
);
return is;
}
};
Note: kernel_thread_comm_len is a constant to deal with comm fields longer than the 16 characters TASK_COMM_LEN allows for user processes. Kernel task names may be up to 64 characters, so that's what I set the constant to.
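A possible definition matching that note (the value 64 is the answer's stated assumption about kernel task names, not something guaranteed by every kernel version):

```cpp
#include <cstddef>

// Scan window for the comm field: wider than the 16-character TASK_COMM_LEN
// of user processes, to also cover kernel task names (assumed up to 64 chars).
inline constexpr std::ptrdiff_t kernel_thread_comm_len = 64;
```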
Then comes the question of which processes to collect the information for. If you have a std::vector of process IDs, you could add a function that populates a std::vector<proc_pid_stat>:
auto get_proc_pid_stats(std::ranges::random_access_range auto&& pids) {
static const std::filesystem::path proc("/proc");
std::vector<proc_pid_stat> ppss(std::ranges::size(pids));
auto zw = std::views::zip(pids, ppss);
auto fillfunc = [](auto&& pid_pps) {
auto& [pid, pps] = pid_pps;
auto path = proc / std::to_string(pid) / "stat";
std::ifstream is(path);
is >> pps;
};
std::for_each(std::execution::par, std::ranges::begin(zw),
std::ranges::end(zw), fillfunc);
return ppss;
}
Note: The above uses the built-in thread pool (if your implementation supports it). You may need to link with the library implementing it for it to be useful; -ltbb is common. Should you for some reason not want to use the thread pool, change std::execution::par to std::execution::seq and measure the difference in time.
If you want all the processes, you can make it more efficient by not building the filename for every process, as get_proc_pid_stats above does. Just collect the paths and use those instead of pids in the loop above:
std::vector<std::filesystem::path> pids;
for(auto& de : std::filesystem::directory_iterator("/proc")) {
if(std::isdigit(
static_cast<unsigned char>(de.path().filename().string().front())))
{
pids.emplace_back(de.path());
}
}
2 Comments
- `auto sp = std::find(++rptr, end, ' '); // find end of (comm) field` is dangerous, because the comm field itself can contain spaces within the parentheses. Depending on the number of spaces, this can either change only the state field or completely shift every value, so we can't search forward for that either. I updated it to search backwards for ')' instead. One could search backwards through the whole string, but I added the magic constant kernel_thread_comm_len, which lets it deal with tasks with longer comm strings than TASK_COMM_LEN (which is only 16) while searching through the minimal number of characters needed.

For small files, your overhead is in opening the file. The drive needs to read a directory, extract your file name and attributes, then start reading. There's also the overhead of drive preparation (such as spin-up for hard drives).
Block-read the file into memory (array, vector). The idea is to keep the drive spinning and data flowing through the stream. Perform searching (like getline) in the vector; searching memory is always faster than searching the drive.
So, using std::istream::read may be as fast as fread for the small files.
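A minimal sketch of that block-read (the helper name is mine; 1024 mirrors the buffer size used in the previous answer):

```cpp
#include <fstream>
#include <string>

// Block-read a small file into memory with one std::istream::read call,
// then parse/search the returned string instead of the stream.
std::string slurp(const std::string& path) {
    std::ifstream is(path, std::ios::binary);
    char buf[1024];
    is.read(buf, sizeof buf);               // sets eofbit on short files;
    return std::string(buf, buf + is.gcount()); // gcount() is still valid
}
```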
If you can, use DMA (Direct Memory Access) to read the data in the background. Note: on many platforms, the DMA controller and CPU share the same data bus, so there may be some contention when reading from the file.