replaced http://stackoverflow.com/ with https://stackoverflow.com/

edited May 23, 2017 at 12:40

The last time I saw the source for a C run-time library implementation of memcpy (Microsoft's compiler in the 1990s), it used the algorithm you describe, but it was written in assembly. It might (my memory is uncertain) have used rep movsd in the inner loop.
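For readers without the question at hand, the general technique (copy single bytes until a pointer is aligned, then move whole words, then finish the tail) can be sketched in C roughly as below. This is only an illustration, not Microsoft's implementation: the name sketch_copy, the 8-byte word size, and the choice to align the destination rather than the source are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of an "align, then copy words" routine.
 * Not any library's real memcpy; buffers must not overlap. */
static void *sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy single bytes until the destination is 8-byte aligned
     * (assumption: we align the destination, not the source). */
    while (n > 0 && ((uintptr_t)d % 8) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: move 8 bytes per iteration. The fixed-size memcpy calls
     * avoid aliasing and unaligned-access problems in portable C;
     * optimizing compilers typically reduce them to one load and one store. */
    while (n >= 8) {
        uint64_t word;
        memcpy(&word, s, sizeof word);
        memcpy(d, &word, sizeof word);
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Tail: copy the remaining 0..7 bytes one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A hand-written assembly version would replace the middle loop with something like the rep movsd instruction mentioned above.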

Your code says, //Start copying 8 bytes as soon as one of the pointers is aligned. When you're performance-testing, you should know whether both buffers are aligned, because that is the case in which you might expect the best performance.
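One way to record that during a benchmark is to test both pointers before each run. The following is a minimal sketch; the helper names is_aligned and both_aligned are hypothetical, and it assumes that converting a pointer to uintptr_t yields its address, which holds on mainstream platforms but is implementation-defined in C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if p is a multiple of `alignment` bytes.
 * `alignment` must be a power of two (for example 4, 8 or 16). */
static bool is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: true only if both buffers share that alignment. */
static bool both_aligned(const void *dst, const void *src, size_t alignment)
{
    return is_aligned(dst, alignment) && is_aligned(src, alignment);
}
```

Logging both_aligned(dst, src, 8) next to each timing makes it possible to separate the aligned measurements from the unaligned ones.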

On the subject of alignment, there is an interesting question here on Stack Overflow, although it is unrelated to yours: Why speed of memcpy() drops dramatically every 4KB?

I vaguely understand what kind of effect you're looking for in your code, but I don't know what assembly your compiler is actually producing.

The accepted answer to this Stack Overflow question demonstrates the kind of assembly that is used nowadays: Very fast memcpy for image processing?

Post Undeleted by ChrisW

occurred Feb 6, 2014 at 23:55

added 339 characters in body

edited Feb 6, 2014 at 23:55

ChrisW

Post Deleted by ChrisW

occurred Feb 6, 2014 at 19:21

answered Feb 6, 2014 at 19:13

ChrisW
