packing structures and numbers

I've been reading btrfs's on-disk format, and two things caught my eye:

- __attribute__((packed)) structures everywhere, often with misaligned fields. This conserves space, but can be harmful to in-memory performance on some archs.
- le64's everywhere. This scales nicely, but wastes space. My home directory is unlikely to have more than 4G objects or 4GB extents (let alone >2 devices).

I think the two issues can be improved by separating the on-disk format and the in-memory structure, and by using uleb128 as the on-disk format for numbers. uleb128 is a variable-length format that encodes 7 bits of a number in each byte, least significant bits first, using the eighth bit to mark whether more bytes follow (it is clear on the last byte).
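
To make the encoding concrete, here is a minimal sketch of an encoder and decoder, assuming the usual DWARF-style convention (high bit set on every byte except the last). The function names are made up for illustration and are not from the btrfs tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode v as uleb128 into out, which must have
 * room for 10 bytes (the worst case for a 64-bit value).
 * Returns the number of bytes written. */
size_t uleb128_encode(uint64_t v, uint8_t *out)
{
    size_t n = 0;

    do {
        uint8_t byte = v & 0x7f;   /* low 7 bits go into this byte */

        v >>= 7;
        if (v != 0)
            byte |= 0x80;          /* high bit set: more bytes follow */
        out[n++] = byte;
    } while (v != 0);

    return n;
}

/* Hypothetical helper: decode one uleb128 number from in[0..len).
 * Returns the number of bytes consumed, or 0 if the input ends before
 * the final byte or runs past 10 bytes. Non-canonical encodings are
 * not rejected. */
size_t uleb128_decode(const uint8_t *in, size_t len, uint64_t *v)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i;

    for (i = 0; i < len && shift < 64; i++, shift += 7) {
        result |= (uint64_t)(in[i] & 0x7f) << shift;
        if ((in[i] & 0x80) == 0) {   /* high bit clear: last byte */
            *v = result;
            return i + 1;
        }
    }
    return 0;
}
```

For example, 300 encodes as the two bytes 0xac 0x02, and anything below 128 takes a single byte.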

So, for example

struct btrfs_disk_key {
   __le64 objectid;
   u8 type;
   __le64 offset;
} __attribute__ ((__packed__));

With 1M objectids and 1T offsets, this reduces in size from 17 bytes to 10 bytes (3 for objectid, 1 for type, 6 for offset). Most other structures show similar gains. We can also have more than 256 types if the need arises.
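
The arithmetic can be checked with a small sketch. It assumes all three fields, including type, are stored as uleb128; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes uleb128 needs for v: one byte per 7 bits, at least one. */
size_t uleb128_size(uint64_t v)
{
    size_t n = 1;

    while (v >>= 7)
        n++;
    return n;
}

/* Hypothetical: encoded size of a key whose three fields are each
 * written as uleb128, in the same order as struct btrfs_disk_key. */
size_t disk_key_encoded_size(uint64_t objectid, uint8_t type, uint64_t offset)
{
    return uleb128_size(objectid) + uleb128_size(type) + uleb128_size(offset);
}
```

An objectid just under 2^20 needs 3 bytes, a type below 128 needs 1, and an offset just under 2^40 needs 6, for 10 in total. The worst case is 22 bytes (10 + 2 + 10), so very large values cost more than the packed 17.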

There are, of course, disadvantages to switching to uleb128:

- need to write encode and decode functions, which is tedious. This can be automated a la xdr.
- increased cpu utilization for decoding and encoding
- can no longer know the size of the on-disk structures in advance
- it's just wonderful to rewrite the entire disk format so close to freezing it
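
As a rough idea of what the per-structure encode and decode functions from the first item could look like, here is a sketch for the key alone. Everything in it is an assumption for illustration: struct mem_key, the widened type field, and all function names are invented and do not come from btrfs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory key: naturally aligned, no packed attribute. */
struct mem_key {
    uint64_t objectid;
    uint64_t offset;
    uint32_t type;   /* wider than u8, since uleb128 lifts the 256 limit */
};

static size_t put_uleb128(uint8_t *out, uint64_t v)
{
    size_t n = 0;

    do {
        uint8_t byte = v & 0x7f;

        v >>= 7;
        if (v != 0)
            byte |= 0x80;          /* more bytes follow */
        out[n++] = byte;
    } while (v != 0);
    return n;
}

static size_t get_uleb128(const uint8_t *in, size_t len, uint64_t *v)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i;

    for (i = 0; i < len && shift < 64; i++, shift += 7) {
        result |= (uint64_t)(in[i] & 0x7f) << shift;
        if ((in[i] & 0x80) == 0) {
            *v = result;
            return i + 1;
        }
    }
    return 0;                      /* truncated or too long */
}

/* Writes objectid, type, offset in disk order. out must have room for
 * 25 bytes (10 + 5 + 10). Returns the encoded length. */
size_t mem_key_to_disk(const struct mem_key *k, uint8_t *out)
{
    size_t n = 0;

    n += put_uleb128(out + n, k->objectid);
    n += put_uleb128(out + n, k->type);
    n += put_uleb128(out + n, k->offset);
    return n;
}

/* Returns the number of bytes consumed, or 0 on malformed input. */
size_t disk_to_mem_key(const uint8_t *in, size_t len, struct mem_key *k)
{
    uint64_t objectid, type, offset;
    size_t n = 0, used;

    if ((used = get_uleb128(in + n, len - n, &objectid)) == 0)
        return 0;
    n += used;
    if ((used = get_uleb128(in + n, len - n, &type)) == 0 || type > UINT32_MAX)
        return 0;
    n += used;
    if ((used = get_uleb128(in + n, len - n, &offset)) == 0)
        return 0;
    n += used;

    k->objectid = objectid;
    k->type = (uint32_t)type;
    k->offset = offset;
    return n;
}
```

The decoder has to return the consumed length because the caller can no longer compute where the next item starts from a fixed structure size.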

The advantages, IMO, outweigh the disadvantages:

- better packing reduces tree depth and therefore seekage, the most important cost on rotating media
- the disk format is infinitely growable
- in-memory format is more efficient for archs which prefer aligned accesses

I'm not volunteering to do this, but please consider this proposal.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
