I need to read many different files in succession as fast as possible. It's not one big file, but many small ones: the stat files at /proc/<pid>/stat.
I am using std::ifstream and std::getline() to read the files.
Here is my current code:
std::ifstream statFile("/proc/" + pid + "/stat");
if (!statFile.is_open())
{
    std::cerr << "Error: Could not open file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}
std::string line;
if (!std::getline(statFile, line))
{
    std::cerr << "Error: Could not read from file for PID " << pid << std::endl;
    return 0; // No fatal error because file may be deleted during read
}
I tried using mmap(), but that doesn't seem to work in the /proc/ directory.
I also tried using a buffer, but that was slower.
- Is there a question you want to ask? – Ulrich Eckhardt, Jan 24, 2025 at 8:42
- You could try either the C API (e.g. fopen, fread) or the Unix API (e.g. open, read). Not sure whether they would be faster, but worth a try. Really, your situation is unique, so the only way to know which method is fastest is to try them all. If nothing is fast enough, then maybe redesign your approach to whatever problem you are trying to solve. – john, Jan 24, 2025 at 8:49
- Could you get whatever information you need from a system API rather than by reading the files? – Alan Birtles, Jan 24, 2025 at 8:52
- Bleeding-edge Linux kernels have a system call that reads a file in one system call. If you need even more speed, you could overengineer something using io_uring. I would suggest you stick with the simple approach, though. – Botje, Jan 24, 2025 at 9:07
- What exactly are you doing with these files? Why are you using getline? – anon, Jan 24, 2025 at 10:26
3 Answers
Use open() and read() instead of std::ifstream.
std::ifstream has overhead due to its high-level abstraction and internal buffering. Using lower-level system calls like open() and read() can reduce this overhead.
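As a minimal sketch of that approach (the function name and buffer size here are my own, not from the question), reading one stat file with raw syscalls looks like this:

```cpp
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // read, close
#include <string>

// Read /proc/<pid>/stat in one open()/read()/close() round trip.
// These files are small, so a single fixed buffer is enough.
std::string read_stat(const std::string& pid) {
    const std::string path = "/proc/" + pid + "/stat";
    int fd = open(path.c_str(), O_RDONLY);
    if (fd == -1)
        return {};  // process may have exited; not fatal
    char buf[1024];
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    return n > 0 ? std::string(buf, buf + n) : std::string{};
}
```

For example, read_stat("self") reads the calling process's own stat file.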
If you keep a file descriptor open (or cache the paths), you can use pread() to read from a fixed offset without seeking or reopening the file:
ssize_t bytesRead = pread(fd, buffer, sizeof(buffer) - 1, 0);
However, in the /proc case, you generally process files one by one, so the benefit of pread() may be limited unless combined with other strategies.
If you can determine the target PIDs in advance, you could batch the file reads by keeping file descriptors open (to avoid repeated open() calls):
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // close
#include <cstdio>    // snprintf
#include <vector>

std::vector<int> openProcFiles(const std::vector<int>& pids) {
std::vector<int> fds;
for (int pid : pids) {
char path[64];
snprintf(path, sizeof(path), "/proc/%d/stat", pid);
int fd = open(path, O_RDONLY);
if (fd != -1) {
fds.push_back(fd);
}
}
return fds;
}
void closeProcFiles(const std::vector<int>& fds) {
for (int fd : fds) {
close(fd);
}
}
Then read from these descriptors as needed.
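To sketch what "read as needed" could look like (function name is mine, not from the answer): procfs regenerates a file's contents on each read from the start, so pread() at offset 0 gives a fresh snapshot without reopening the file:

```cpp
#include <fcntl.h>   // open, O_RDONLY (used in the example below)
#include <unistd.h>  // pread, close
#include <string>
#include <vector>

// Re-read a set of already-open /proc fds. pread() at offset 0 returns the
// current contents each time and leaves the descriptor's file position alone.
std::vector<std::string> readProcFiles(const std::vector<int>& fds) {
    std::vector<std::string> contents;
    contents.reserve(fds.size());
    for (int fd : fds) {
        char buffer[1024];
        ssize_t n = pread(fd, buffer, sizeof buffer, 0);
        contents.push_back(n > 0 ? std::string(buffer, buffer + n)
                                 : std::string{});
    }
    return contents;
}
```

The same descriptor can be passed in repeatedly to sample a process over time.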
Use profiling tools like perf, strace, or gprof to understand where bottlenecks occur. This ensures you're optimizing the right part of the code (e.g., syscalls vs. string parsing).
7 Comments
- fstream's overhead is mostly in the many templates involved (bloating binary size), but as long as you're interacting with manually opened files, not cout/cerr/cin (or, if you're using the latter, you disable the sync with stdio), and reading lines rather than doing complicated parsing, the overhead is a rounding error next to the cost of the system calls.

I recommend opening the files as normal with std::ifstream and then using std::ifstream::read to read the whole file into a fixed-size char array. On my system an array of 957 bytes is enough, given the max (or min) values of all the fields in proc_pid_stat(5) plus a max comm string length of 16. I'd round it up to 1024 for good measure. If your system has sizeof(int) greater than 4, double the size of the buffer, or double it anyway; I doubt you'll notice a difference.
For extracting the numerical values, I recommend using std::from_chars which is supposed to provide the fastest way to convert char arrays into numerical types.
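As a tiny illustration (a hypothetical helper, not part of the answer's final code): std::from_chars parses straight out of a char range, locale-free and allocation-free, and reports where it stopped, which is exactly what walking the space-separated stat fields needs:

```cpp
#include <charconv>     // std::from_chars
#include <string_view>

// Parse the first numeric field (the pid) from a stat-style line.
// from_chars writes the parsed value and returns a pointer past the
// last digit consumed, so subsequent fields can be parsed from there.
int parse_first_field(std::string_view line) {
    int value = 0;
    std::from_chars(line.data(), line.data() + line.size(), value);
    return value;
}
```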
I'd start by defining a class that can hold the values:
struct proc_pid_stat {
/*
(1) pid %d
The process ID.
*/
int pid;
/*
(2) comm %s
The filename of the executable, in parentheses.
Strings longer than TASK_COMM_LEN (16) characters
(including the terminating null byte) are silently
truncated. This is visible whether or not the
executable is swapped out.
*/
std::string comm;
//... add all the fields with the correct types ...
/*
(52) exit_code %d (since Linux 3.5) [PT]
The thread's exit status in the form reported by
waitpid(2).
*/
int exit_code;
};
To this class I'd add a "magic" value that can be used to indicate whether extracting the information from the file failed. It's assigned to the last field in the class when extraction starts, but it'll be overwritten if extraction succeeds.
struct proc_pid_stat {
// same as above goes here
static constexpr int fail = std::numeric_limits<int>::min();
};
Then to the actual extraction. The only messy parts are the comm (2) and state (3) fields, which come early. The rest can be handled by one big fold expression in which std::from_chars is used:
struct proc_pid_stat {
// same as above goes here
friend std::istream& operator>>(std::istream& is, proc_pid_stat& pps) {
pps.exit_code = fail; // set the last field to a "fail" value
char buf[1024]; // max length with all the fields incl. comm is 957
// read the whole line:
is.read(buf, static_cast<std::streamsize>(sizeof buf));
const char* const end = buf + is.gcount();
// extract fields:
auto rptr = std::from_chars(buf, end, pps.pid).ptr; // (1)
if(rptr == end) return is;
++rptr;
if(std::distance(rptr, end) < kernel_thread_comm_len) return is;
std::string_view comm(rptr, kernel_thread_comm_len);
const auto cpos = comm.rfind(')');
if(cpos == std::string_view::npos) return is;
auto sp = rptr + cpos + 1;
if(std::distance(sp, end) < 96) return is; // a reasonable amount left
pps.comm.assign(rptr, sp); // (2)
pps.state = *++sp; // (3)
++sp;
// if extracting all the rest succeeds, the last field, exit_code,
// will get a value other than "fail":
[&](auto&&... rest) {
(..., (sp = std::from_chars(sp + (sp != end), end, rest).ptr));
}(pps.ppid /* (4) */, pps.pgrp /* (5) */, pps.session /* (6) */,
pps.tty_nr /* (7) */, pps.tpgid /* (8) */, pps.flags /* (9) */,
pps.minflt /* (10) */, pps.cminflt /* (11) */, pps.majflt /* (12) */,
pps.cmajflt /* (13) */, pps.utime /* (14) */, pps.stime /* (15) */,
pps.cutime /* (16) */, pps.cstime /* (17) */, pps.priority /* (18) */,
pps.nice /* (19) */, pps.num_threads /* (20) */,
pps.itrealvalue /* (21) */, pps.starttime /* (22) */,
pps.vsize /* (23) */, pps.rss /* (24) */, pps.rsslim /* (25) */,
pps.startcode /* (26) */, pps.endcode /* (27) */,
pps.startstack /* (28) */, pps.kstkesp /* (29) */,
pps.kstkeip /* (30) */, pps.signal /* (31) */, pps.blocked /* (32) */,
pps.sigignore /* (33) */, pps.sigcatch /* (34) */,
pps.wchan /* (35) */, pps.nswap /* (36) */, pps.cnswap /* (37) */,
pps.exit_signal /* (38) */, pps.processor /* (39) */,
pps.rt_priority /* (40) */, pps.policy /* (41) */,
pps.delayacct_blkio_ticks /* (42) */, pps.guest_time /* (43) */,
pps.cguest_time /* (44) */, pps.start_data /* (45) */,
pps.end_data /* (46) */, pps.start_brk /* (47) */,
pps.arg_start /* (48) */, pps.arg_end /* (49) */,
pps.env_start /* (50) */, pps.env_end /* (51) */,
pps.exit_code /* (52) */
);
return is;
}
};
Note: kernel_thread_comm_len is a constant to deal with comm fields longer than the 16 characters TASK_COMM_LEN allows for user processes. Kernel task names may be up to 64 characters, so that's what I set the constant to.
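A possible definition matching that note (the value 64 is the answer's stated assumption about kernel task names, not something guaranteed by every kernel version):

```cpp
#include <cstddef>

// Scan window for the comm field: wider than the 16-character TASK_COMM_LEN
// of user processes, to also cover kernel task names (assumed up to 64 chars).
inline constexpr std::ptrdiff_t kernel_thread_comm_len = 64;
```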
Then comes the question of which processes to collect the information for. If you have a std::vector of process IDs, you could add a function that populates a std::vector<proc_pid_stat>:
auto get_proc_pid_stats(std::ranges::random_access_range auto&& pids) {
static const std::filesystem::path proc("/proc");
std::vector<proc_pid_stat> ppss(std::ranges::size(pids));
auto zw = std::views::zip(pids, ppss);
auto fillfunc = [](auto&& pid_pps) {
auto& [pid, pps] = pid_pps;
auto path = proc / std::to_string(pid) / "stat";
std::ifstream is(path);
is >> pps;
};
std::for_each(std::execution::par, std::ranges::begin(zw),
std::ranges::end(zw), fillfunc);
return ppss;
}
Note: The above uses the built-in thread pool (if your implementation supports it). You may need to link with the library implementing it for it to be useful; -ltbb is common. Should you for some reason not want to use the thread pool, change std::execution::par to std::execution::seq and measure the difference in time.
If you want all the processes, you can make it more efficient by not building the filename for every process, as get_proc_pid_stats above does. Just collect the paths and use those instead of pids in the loop above:
std::vector<std::filesystem::path> pids;
for(auto& de : std::filesystem::directory_iterator("/proc")) {
if(std::isdigit(
static_cast<unsigned char>(de.path().filename().string().front())))
{
pids.emplace_back(de.path());
}
}
2 Comments
- `auto sp = std::find(++rptr, end, ' '); // find end of (comm) field` is dangerous, because the comm field itself can contain spaces within the parentheses. Depending on the number of spaces, this can either change only the state field or completely shift every value, so we can't search forward for that either. I updated it to search backwards for ')' instead. One could search backwards through the whole string, but I added the magic constant kernel_thread_comm_len, which lets it deal with tasks with longer comm strings than TASK_COMM_LEN (which is only 16) while searching through the minimal number of characters needed.

For small files, your overhead is in opening the file. The drive needs to read a directory, extract your file name and attributes, then start reading. There's also the overhead of drive preparation (such as spin-up for hard drives).
Block-read the file into memory (array, vector). The idea is to keep the drive spinning and data flowing through the stream. Perform searching (like getline) in the vector; searching memory is always faster than searching the drive.
So, using std::istream::read may be as fast as fread for the small files.
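A minimal sketch of that block-read (the helper name is mine; 1024 mirrors the buffer size used in the previous answer):

```cpp
#include <fstream>
#include <string>

// Block-read a small file into memory with one std::istream::read call,
// then parse/search the returned string instead of the stream.
std::string slurp(const std::string& path) {
    std::ifstream is(path, std::ios::binary);
    char buf[1024];
    is.read(buf, sizeof buf);               // sets eofbit on short files;
    return std::string(buf, buf + is.gcount()); // gcount() is still valid
}
```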
If you can, use DMA (Direct Memory Access) to read the data in the background. Note: on many platforms, the DMA controller and CPU share the same data bus, so there may be some contention when reading from the file.