Let me begin by admitting that I know very little about the inner workings of hard disks. So when I read the manual entry for the variable innodb_flush_method, it confused me. Can I get an explanation in layman's terms of the difference between O_DSYNC and O_DIRECT, and how to tell whether it's a performance issue on a database server?
Some stats on my setup: Mac OS X 10.6 (32-bit kernel, since the architecture is out of date) running MySQL 5.1.49 64-bit (hoping it would allow me to use the memory). 8GB RAM, ~6GB of InnoDB data/indexes.
I don't know if Mac OS X supports a proper direct IO option - I did not think it did. You're the second person I've seen today get confused by that manual page. I have an open bug on it here: bugs.mysql.com/bug.php?id=54306 – Morgan Tocker, Mar 21, 2011 at 20:55
1 Answer
Here is an explanation of how fdatasync() works versus how fsync() works:
fdatasync() flushes all data buffers of a file to disk (before the system call returns). It resembles fsync() but is not required to update the metadata, such as access time. Applications that access databases or log files often write a tiny data fragment (e.g., one line in a log file) and then call fsync() immediately in order to ensure that the written data is physically stored on the hard disk. Unfortunately, fsync() will always initiate two write operations:
- one write operation for the newly written data
- one write operation in order to update the modification time stored in the inode
If the modification time is not a part of the transaction concept, then fdatasync() can be used to avoid unnecessary inode disk write operations.
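To make the quoted description concrete, here is a minimal Python sketch of the "write a tiny fragment, then sync" pattern. The function names `append_durably` and `demo` and the file name `app.log` are made up for illustration. The sketch assumes a Unix-like system; because `os.fdatasync` is not available everywhere (macOS, for example), it falls back to `os.fsync` there.

```python
import os
import tempfile

# fdatasync() is missing on some platforms (macOS, for example); fall back to fsync().
_fdatasync = getattr(os, "fdatasync", os.fsync)


def append_durably(path, payload, sync_metadata=True):
    """Append payload (bytes) to path and block until the flush call returns.

    sync_metadata=True  -> fsync(): file data plus inode metadata such as mtime.
    sync_metadata=False -> fdatasync(): file data, and only the metadata needed
                           to read that data back (a changed file size, say).
    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, payload[written:])
        if sync_metadata:
            os.fsync(fd)
        else:
            _fdatasync(fd)
        return written
    finally:
        os.close(fd)


def demo():
    """Write one line with each flush style and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")  # hypothetical log file
        append_durably(log, b"first line\n", sync_metadata=True)
        append_durably(log, b"second line\n", sync_metadata=False)
        with open(log, "rb") as fh:
            return fh.read()
```

Both calls block until the kernel reports the flush as done; the only difference is how much metadata has to reach the disk before they return. Whether that saves a physical write depends on the kernel and filesystem.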
In English, O_DSYNC is faster than O_DIRECT, since O_DIRECT calls fsync() twice (once for logs and once for data) and fsync() verifies data writes via two write operations. Using O_DSYNC calls fdatasync() and fsync(). You can think of fdatasync() as an fsync() that flushes the data but does not wait for the metadata update (it skips the second, inode write).
Looking at the numbers, O_DSYNC does four write operations, two of which are verified, while O_DIRECT does four write operations, all of which are verified afterwards.
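For readers who want to see what the two flags mean at the operating-system level, here is a rough Python sketch. It is not InnoDB's code and says nothing about how many physical writes MySQL issues; it only shows the open() flags themselves. The names `write_dsync`, `write_direct`, and `BLOCK` are made up for illustration, and the 4096-byte alignment is an assumption (the real requirement depends on the device and filesystem). Where a flag is unavailable or rejected (O_DIRECT does not exist on Mac OS X, and some filesystems refuse it), the sketch quietly falls back to an ordinary buffered write.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; the real value depends on device and filesystem


def write_dsync(path, payload):
    """Open with O_DSYNC: write() returns only after the data has been flushed."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o644)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


def write_direct(path, payload):
    """Write one padded block with O_DIRECT if possible, then fsync().

    Returns True if O_DIRECT was actually used, False if the sketch fell
    back to a normal buffered write followed by fsync().
    """
    if len(payload) > BLOCK:
        raise ValueError("this sketch handles a single block only")
    buf = mmap.mmap(-1, BLOCK)       # anonymous mapping: page-aligned, zero-filled
    buf[:len(payload)] = payload
    try:
        for extra in (getattr(os, "O_DIRECT", 0), 0):
            try:
                fd = os.open(path, os.O_WRONLY | os.O_CREAT | extra, 0o644)
            except OSError:
                continue             # flag rejected at open time
            try:
                os.write(fd, buf)    # bypasses the page cache when O_DIRECT is set
                os.fsync(fd)         # still needed to flush metadata and device caches
                return bool(extra)
            except OSError:
                continue             # flag rejected at write time; try buffered
            finally:
                os.close(fd)
        raise OSError("could not write " + path)
    finally:
        buf.close()
```

The design point is that O_DSYNC changes when write() returns, whereas O_DIRECT changes which cache the data passes through; O_DIRECT on its own does not make a write durable, which is why the sketch still calls fsync() afterwards.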
CONCLUSION
O_DSYNC
- faster than O_DIRECT
- data may or may not be consistent due to latency or an outright crash
O_DIRECT
- more stable
- data consistent
- naturally slower
I hope this answer helps, and I hope I didn't make things worse for you.
Worth pointing out: O_DIRECT is only used on the tablespace files, not on the logs. Also, whether O_DIRECT is going to be useful or not depends on the hardware. I linked to an open documentation bug as a comment to the author's question. – Morgan Tocker, Mar 21, 2011 at 21:00
Thank you for clarifying that, Morgan. I'll correct this. – RolandoMySQLDBA, Mar 21, 2011 at 21:26
O_DSYNC is a synchronous write; how can you conclude that it is faster than asynchronous + fsync? – noonex, Sep 8, 2014 at 7:43
@noonex fdatasync() is synchronous for its data, not its metadata. According to informit.com/articles/article.aspx?p=23618&seqNum=5,
"This means that in principle, fdatasync can execute faster than fsync because it needs to force only one disk write instead of two. However, in current versions of Linux, these two system calls actually do the same thing, both updating the file's modification time."
At the time I wrote my post 3.5 years ago, it was true, especially with older versions of Linux. – RolandoMySQLDBA, Sep 8, 2014 at 11:36
@noonex According to en.wikipedia.org/wiki/Sync_(Unix),
"The related system call fsync() commits just the buffered data relating to a specified file descriptor. fdatasync() is also available to write out just the changes made to the data in the file, and not necessarily the file's related metadata."
(That Wiki was last updated July 28, 2014.) – RolandoMySQLDBA, Sep 8, 2014 at 11:41