3

I have a question about the answer provided by @dan04. What is aligned memory allocation?

In particular, if I have something like this:

int main(){
 int num; // 4byte
 char s; // 1byte
 int *ptr;
}

On a 32-bit machine, would the compiler still pad this data by default?

The previous question asked about structs; I am asking about variables declared in main.

update:

a = 2 bytes 
b = 4 bytes
c = 1 byte
d = 1 byte
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d| bytes
|       |       | words
asked Nov 19, 2016 at 7:46
2
  • Pointers also need memory to be stored in. Commented Nov 19, 2016 at 9:10
  • Why do you think these variables are stored anywhere at all? These are automatic variables and they may be optimised away, or only live in registers. Moreover, if ever stored in memory, the compiler can re-order them if it deems this useful etc. Commented Nov 19, 2016 at 10:00

2 Answers

5

There are no rules for this. It depends on the implementation you are using, and it may also change with compiler options. The best you can do is print the address of each variable; then you can see what the memory layout looks like.

Something like this:

#include <stdio.h>

int main(void)
{
 int num; 
 char s; 
 int *ptr;
 printf("num: %p - size %zu\n", (void*)&num, sizeof num);
 printf("s : %p - size %zu\n", (void*)&s, sizeof s);
 printf("ptr: %p - size %zu\n", (void*)&ptr, sizeof ptr);
 return 0;
}

Possible output:

num: 0x7ffee97fce84 - size 4
s : 0x7ffee97fce83 - size 1
ptr: 0x7ffee97fce88 - size 8

Also note that if you never take the address (&) of a variable, the compiler may optimize your code so that the variable is never put into memory at all.

In general, alignment is chosen to get the best performance out of the hardware platform in use. That typically implies that variables are aligned to their size, or at least 4-byte aligned for variables with a size greater than 4.

Update:

The OP gives a specific layout example in the update and asks whether that layout can or will ever happen.

Again, the answer is: it is implementation dependent.

So in principle it could happen on some specific system. That said, I doubt that it will happen on any mainstream system.

Here is another code example, compiled with gcc -O3:

#include <stdio.h>

int main(void)
{
 short s1;
 int i1;
 char c1;
 int i2;
 char c2;
 printf("s1: %p - size %zu\n", (void*)&s1, sizeof s1);
 printf("i1: %p - size %zu\n", (void*)&i1, sizeof i1);
 printf("c1: %p - size %zu\n", (void*)&c1, sizeof c1);
 printf("i2: %p - size %zu\n", (void*)&i2, sizeof i2);
 printf("c2: %p - size %zu\n", (void*)&c2, sizeof c2);
 return 0;
}

Output from my system:

s1: 0x7ffd222fc146 - size 2 <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4 <-- 4 byte aligned
c1: 0x7ffd222fc144 - size 1
i2: 0x7ffd222fc14c - size 4 <-- 4 byte aligned
c2: 0x7ffd222fc145 - size 1

Notice how the locations in memory differ from the order in which the variables were defined in the code. The reordering gives every variable a good alignment without wasting space.

Sorting by address:

c1: 0x7ffd222fc144 - size 1
c2: 0x7ffd222fc145 - size 1
s1: 0x7ffd222fc146 - size 2 <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4 <-- 4 byte aligned
i2: 0x7ffd222fc14c - size 4 <-- 4 byte aligned

So again to answer the update-question:

On most systems I doubt you'll see a 4-byte variable placed at an address ending in 2, 6, a or e. Still, systems may exist where that could happen.

answered Nov 19, 2016 at 8:30

7 Comments

I mean, why in the link, people say boundary would be 4 bytes?
@Anni_housie Well, it is largely due to hardware architecture. For instance, the cache is typically organized in lines of 2^N bytes. For performance it would be bad if a 4-byte variable had two bytes in one cache line and the next two bytes in the next cache line. Therefore we typically want 4-byte variables to be 4-byte aligned so that the variable can be held in a single cache line. This is just one example - there are more. But again - it is implementation dependent.
In many 32-bit architectures processors like to fetch 32 bits at a time. If the data item crosses a 4 byte (32-bit) boundary, the processor will have to make two fetches, which slows down a program. By keeping data aligned to 4 bytes, the processor only has to make one fetch. Doesn't really have anything to do with cache lines, since most cache lines are greater than 4 bytes.
@ThomasMatthews - Your example is another good example of using the HW in the optimal way. But you are wrong when you say that cache isn't a concern. If a 4-byte variable were placed in memory so that 1 byte mapped into one cache line and the next 3 bytes mapped into the next cache line, the processor would have 2 cache misses when reading the variable (if it isn't in cache already, of course).
@4386427, I have 1 more question about what you mean by "That typically imply that variables are aligned to their size or at least 4 byte ". So if variables have different size, do we still use a constant length word-boundary (say 4 byte)? And what do you mean by "variables are aligned to their size" in this case? Do you mean that we could have a length of word-boundary that is not fixed?
1

It's quite hard to predict exactly, but there's certainly some padding going on. Take these two programs, for example (I ran them on Coliru, a 64-bit machine):

#include <iostream>
#include <vector>
using namespace std;
//#pragma pack(push,1)
int main(){
 int num1(5); // 4 bytes
 int num2(3); // 4 bytes
 char c1[2];
 c1[0]='a';
 c1[1]='a';
 cout << &num1 << " " << &num2 << " " << endl;
 cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)

#include <iostream>
#include <vector>
using namespace std;
//#pragma pack(push,1)
int main(){
 int num1(5); // 4 bytes
 int num2(3); // 4 bytes
 char c1[1];
 c1[0]='a';
 cout << &num1 << " " << &num2 << " " << endl;
 cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)

The first program outputs:

0x7fff3e1f9de8 0x7fff3e1f9dec 
2 0x7fff3e1f9de0

While the second program outputs:

0x7fffdca72538 0x7fffdca7253c 
1 0x7fffdca72537

You can clearly see that padding is added in the first program. Looking at the addresses:

First program: CHAR | CHAR | 6-BYTE PADDING | INT | INT
Second program: CHAR | INT | INT

So for the basic question: yes, there is probably padding by default. I also tried to use #pragma pack to avoid the padding but, in contrast to the struct case, I could not make it go away; the outputs were exactly the same.

answered Nov 19, 2016 at 8:57
