Data structure alignment

Reference:

  1. Wikipedia: Data structure alignment

  2. MarkS Note: Linux下Data Alignment工作方式 (How Data Alignment Works on Linux)

  3. Stack Overflow: #pragma pack effect


Wikipedia:

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks (e.g. 4 byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.

For example, when the computer's word size is 4 bytes (a byte means 8 bits on most machines, but could be different on some systems), the data to be read should be at a memory address which is some multiple of 4. When this is not the case, e.g. the data starts at address 14 instead of 16, then the computer has to read two or more 4 byte chunks and do some calculation before the requested data has been read, or it may generate an alignment fault. Even though the previous data structure end is at address 13, the next data structure should start at address 16. Two padding bytes are inserted between the two data structures at addresses 14 and 15 to align the next data structure at address 16.


#pragma pack

#pragma pack instructs the compiler to pack structure members with particular alignment. Most compilers, when you declare a struct, will insert padding between members to ensure that they are aligned to appropriate addresses in memory (usually a multiple of the type's size). This avoids the performance penalty (or outright error) on some architectures associated with accessing variables that are not aligned properly.

For example, given 4-byte integers and the following struct:

struct Test
{
    char    a;
    int     b;
    char    c;
};

The compiler could choose to lay the struct out in memory like this:

|   1  |   2  |   3  |   4  |  

| a(1) | padding............|
| b(1) | b(2) | b(3) | b(4) | 
| c(1) | padding............|

// sizeof(Test) = 3 * 4 bytes = 12 bytes

The most common use case for #pragma pack (according to the Stack Overflow answer referenced above) is working with hardware devices, where you need to ensure that the compiler does not insert padding into the data and that each member immediately follows the previous one.

With #pragma pack(1), the struct above would be laid out like this:

#pragma pack(push)  // push current alignment to stack 
#pragma pack(1)     // set alignment to 1 byte boundary
struct Test
{
    char    a;
    int     b;
    char    c;
};
#pragma pack(pop)   // restore original alignment from stack


|   1  |

| a(1) |
| b(1) |
| b(2) |
| b(3) |
| b(4) |
| c(1) |

// sizeof(Test) = 6 * 1 byte = 6 bytes

MarkS Note: Linux下Data Alignment工作方式 (How Data Alignment Works on Linux)

struct MixedData { 
     char   data1; 
     short  data2;     
     int    data3;      
     char   data4;   
}; 

|    1    |    2    |    3    |    4    |    5    |    6    |    7    |    8    |    9    |    a    |    b    |    c    |

|data1(1) | padding |data2(1) |data2(2) |data3(1) |data3(2) |data3(3) |data3(4) |data4(1) |padding......................|
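  • data1: the type is char, which is 1-byte aligned, meaning its starting address must be a multiple of 1, so it is placed at offset 0x00 (offsets here count from 0x00, so they are one less than the column numbers in the diagram above).
  • data2: the type is short, which is 2-byte aligned, so its starting address must be a multiple of 2. Offset 0x01 is therefore filled with 1 byte of padding and data2 is placed at 0x02. Because a short is larger than a char, 2 bytes are allocated for it.
  • data3: by the same reasoning, it is 4-byte aligned and 4 bytes are allocated (offsets 0x04 to 0x07).
  • data4: although its size is only 1 byte, the largest alignment in the struct is 4 bytes, so the struct's size is rounded up to a multiple of 4: data4 occupies offset 0x08 and the 3 bytes after it are padding.
  • The compiler's general rule for alignment: pad before each member until its offset is a multiple of that member's alignment, then pad after the last member until the total size is a multiple of the largest alignment in the struct.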

P.S:

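  1. When declaring a structure, the order of its members may look irrelevant, but it does matter. A well-designed structure should minimize padding, which reduces the size of the structure.
  2. pack(n): if n is larger than the size of the largest member in the structure, alignment still follows the largest member's size.
  3. When transmitting data over a network, pack(1) is used to keep the compiler from inserting padding.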
