Strange data alignment

Question 1

As far as I know, data alignment means placing data in 64-bit or 32-bit chunks in memory for CPU performance. I am using a 64-bit Linux machine, and I ran some tests and got results I can't explain.

Here are the structures I used :

 class A {
 long l0,l1,l2;
 };
 class B {
 long l0,l1,l2,l3;
 };
 class C {
 long l0,l1,l2,l3,l4;
 };

The test:

#include <iostream>

int main() {
 C* newC = new C();
 B* newB = new B();
 A* newA = new A();
 int* i = new int();
 std::cout << sizeof(A) << std::endl;
 std::cout << sizeof(B) << std::endl;
 std::cout << sizeof(C) << std::endl;
 std::cout << "C : " << newC << std::endl;
 std::cout << "B : " << newB << std::endl;
 std::cout << "A : " << newA << std::endl;
 std::cout << "i : " << i << std::endl;
 delete (i);
 delete (newC);
 delete (newA);
 delete (newB);
 return 0;
}

I am just putting one object of each class on the heap. I added an int pointer at the end to see how much memory newA takes.

The result is this:

24
32
40
C : 0x603010
B : 0x603040
A : 0x603070
i : 0x603090

There are 3*16 bytes between the addresses of newC and newB. C is 40 bytes, which is already a multiple of 64 bits, so why the 8 extra bytes?

There are also 3*16 bytes between newB and newA, yet B is only 32 bytes. I expected A : 0x603060.

And why are there 2*16 bytes between the addresses of newA and i?

Question 2

Why are you expecting contiguous chunks from successive allocations? Nothing guarantees that, and most general-purpose allocators won't give you that at all (they often keep some metadata before or after each allocated chunk).

Question 3

@Mat It takes that much memory? That is around 30% of the memory used.

Question 4

Yes, small allocations with general purpose allocators are expensive. new char; might very well cost you 16 bytes or more.

Question 5

You can't make any definitive statements about addresses when you're allocating on the heap.

There's a good chance that the memory allocation functions maintain inline information about allocated blocks, which will affect the address of the following block, something like:

+--------+-------------+--------+-------------+
| header | alloced mem | header | alloced mem | ...
+--------+-------------+--------+-------------+

In addition, for efficiency, those functions may round up your memory to a multiple of (for example) eight or sixteen (you're still not allowed to use it, since you don't know about it). This may further affect the addresses you see for your allocated memory.

The classic case of both those effects conspiring against you can be seen with:

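#include <cstdlib>
#include <iostream>
int main (int argc, char *argv[]) {
 if (argc < 2) return 1; // expects the allocation size as the first argument
 const int size = std::atoi(argv[1]);
 char *one = new char[size];
 char *two = new char[size];
 // Cast to void* so the address is printed instead of the bytes as a C string.
 std::cout << static_cast<void*>(one) << '\n';
 std::cout << static_cast<void*>(two) << '\n';
 delete[] two;
 delete[] one;
 return 0;
}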

and script:

#!/usr/bin/bash
for i in {01..37}; do
 echo $i $(./qq $i)
done

On my system, this outputs:

01 0x800102e8 0x800102f8
02 0x800102e8 0x800102f8
:: (all the same address pairs in here and in gaps below)
12 0x800102e8 0x800102f8
13 0x800102e8 0x80010300
::
20 0x800102e8 0x80010300
21 0x800102e8 0x80010308
::
28 0x800102e8 0x80010308
29 0x800102e8 0x80010310
::
36 0x800102e8 0x80010310
37 0x800102e8 0x80010318

giving a whopping sixteen bytes between the two when you're only allocating a single character.

The fact that it remains sixteen bytes all the way up to new char[12] and increases by eight every time you add eight chars after that seems to indicate a four byte header, minimum sixteen bytes for the header+data, and an eight byte resolution on the header+data area.

Just keep in mind that's based on my knowledge of the way these things tend to be written so, while an educated guess, is still a guess, and you shouldn't rely on it. It may use a totally different strategy than what I think, or it may change its strategy for larger blocks.

If you want to know how much space the types really take, make an array of two and work out the difference in memory addresses between x[0] and x[1]. You'll find it should be the same as you get from sizeof.

Question 6

Is the size of the header specified? I would like to know the maximum size that can be stored in a 32-byte chunk.

Question 7

@Othman: the size of the header isn't specified, and neither is its existence. That's all up to your implementation (and thus non-portable).

Question 8

@Othman: no, not without digging into the internals, and then you lose portability. You generally program at the level of the C++ abstract machine, where that information is not available.

Question 9

You can't assume that subsequent calls to new will return contiguous chunks in memory.

Should you want to try your test case, I suggest doing the following:

 struct D {
 C c;
 B b;
 A a;
 };

Now you can start printing the addresses of the members. However, since the classes contain only long members, I expect no alignment padding between them.

Accepted answer by paxdiablo · 2015-05-13 08:42:13Z

