🗃

Storage

Overview

Typically, a background thread handles I/O operations.
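As a rough illustration of moving I/O off the calling thread, the sketch below runs a blocking read on a separate thread and hands the result back through a future. The names `ReadWholeFile` and `LoadFileInBackground` are hypothetical, and `std::async` stands in for whatever job system or dedicated I/O thread an engine would actually use.

```cpp
#include <fstream>
#include <future>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Reads the whole file with blocking calls.
// Returns an empty vector if the file cannot be opened.
std::vector<char> ReadWholeFile(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
}

// Starts the blocking read on another thread and returns immediately.
// The caller collects the data later with future::get().
std::future<std::vector<char>> LoadFileInBackground(std::string path) {
    return std::async(std::launch::async, ReadWholeFile, std::move(path));
}
```

The caller keeps running after `LoadFileInBackground` returns and only waits when it calls `get()` on the future.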

Loading Data

There are two ways to load data from storage. The most common is a blocking load. The alternative is asynchronous I/O, which allows multiple requests to be in flight at a time.

Blocking loads

A blocking load is a call to the operating system to load a file where the caller is blocked until the load finishes.

💡
The C standard library functions fread and fwrite, available in C++ through <cstdio>, are blocking.
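A minimal sketch of a blocking load with fread follows. The function name `LoadBlocking` is hypothetical; the point is only that control does not return to the caller until the data is in memory (or the read has failed).

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Blocking load: returns only after the whole file has been read.
// Returns false if the file cannot be opened or read completely.
bool LoadBlocking(const char* path, std::vector<unsigned char>& out) {
    std::FILE* file = std::fopen(path, "rb");
    if (!file) return false;

    // Find the file size by seeking to the end, then go back to the start.
    long size = -1;
    if (std::fseek(file, 0, SEEK_END) == 0) size = std::ftell(file);
    if (size < 0) {
        std::fclose(file);
        return false;
    }
    std::rewind(file);

    out.resize(static_cast<std::size_t>(size));
    // The calling thread waits inside fread until the data has arrived.
    std::size_t got = out.empty() ? 0 : std::fread(out.data(), 1, out.size(), file);
    std::fclose(file);
    return got == out.size();
}
```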

Asynchronous loads

An asynchronous load returns to the caller immediately and completes in the background. On Windows and Xbox, this is done with overlapped I/O.

Storage Devices

Hard disks

Lookup

On most platforms, the hard disk's file system does not keep directory entries sorted, so a file lookup is an O(n) operation.

Storing game data in big archives can increase the file read performance.
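One way to picture why an archive helps: the game opens a single file and resolves asset names through its own table of contents instead of asking the file system for each file. The sketch below is an assumption about how such an index could look, not a description of any particular archive format. The `Archive` class and its methods are hypothetical, and the archive contents are kept in memory to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Where one asset lives inside the archive blob.
struct ArchiveEntry {
    std::size_t offset;
    std::size_t size;
};

class Archive {
public:
    // Appends an asset to the blob and records its offset and size.
    void Add(const std::string& name, const std::vector<char>& bytes) {
        toc_[name] = ArchiveEntry{blob_.size(), bytes.size()};
        blob_.insert(blob_.end(), bytes.begin(), bytes.end());
    }

    // Looks the asset up in the table of contents (a hash lookup, O(1) on
    // average) and copies its bytes out. Returns false for unknown names.
    bool Read(const std::string& name, std::vector<char>& out) const {
        auto it = toc_.find(name);
        if (it == toc_.end()) return false;
        auto first = blob_.begin() + static_cast<std::ptrdiff_t>(it->second.offset);
        out.assign(first, first + static_cast<std::ptrdiff_t>(it->second.size));
        return true;
    }

private:
    std::unordered_map<std::string, ArchiveEntry> toc_;
    std::vector<char> blob_;
};
```

In a real archive the blob would stay on disk and `Read` would seek to `offset` and read `size` bytes from the single open file.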

Fragmentation

Hard disks suffer from fragmentation.

Storing game data across fewer files can increase file access performance.

External storage devices

External storage devices are supported on console platforms such as the Xbox and PlayStation.

Because external storage devices may be connected or disconnected at any time, they must be considered transient.

On console platforms, once an external storage device is acquired, its content can be enumerated.

On console platforms, the content is signed automatically when using the I/O API.

On Xbox, a hashing method is used to generate a content signature.

Optical discs (DVD / Blu-ray)

Reads from optical discs are always asynchronous.

Reads from the outer tracks will be faster than reads from the inner tracks.

Security

Storing a hash (similar to a checksum) with the data can help prevent modding of the game data and save data.

When the data is read, the existing hash can be compared to a new hash of the data. If there's a match, the data is valid; otherwise, the data is corrupt.
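The store-then-compare flow can be sketched as follows. FNV-1a is used here only because it fits in a few lines; it is not a cryptographic hash, so it catches corruption and casual edits but would not stop a determined attacker, who can simply recompute it. `StoredBlock`, `Seal`, and `IsValid` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// 32-bit FNV-1a hash (illustrative stand-in, not cryptographically secure).
std::uint32_t Fnv1a(const std::uint8_t* data, std::size_t size) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 16777619u;  // FNV prime
    }
    return hash;
}

// Data plus the hash that was computed when the data was written.
struct StoredBlock {
    std::vector<std::uint8_t> data;
    std::uint32_t hash;
};

// Computes the hash at write time and stores it next to the data.
StoredBlock Seal(std::vector<std::uint8_t> data) {
    StoredBlock block;
    block.hash = Fnv1a(data.data(), data.size());
    block.data = std::move(data);
    return block;
}

// At read time, hashes the data again and compares with the stored hash.
bool IsValid(const StoredBlock& block) {
    return Fnv1a(block.data.data(), block.data.size()) == block.hash;
}
```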

Compression

Standard lossless algorithms often provide 2:1 or 3:1 compression ratios depending on the type of data, producing smaller files that require less time to read and write.

LZX

LZX is an efficient, lossless algorithm. Depending on the data to compress, the LZX algorithm provides a level of compression that is slightly better than or on a par with other compression libraries.

💡
The Xbox 360 uses LZX, a variant of the LZ family of compression algorithms.

MCT

On the Xbox 360, the MCT Codec compresses 2D, cube, array, or volume textures that are in the DXTn, DXN, or CTX1 format.

Platforms

Windows and Xbox

CreateFile — Creates or opens a file and returns its handle.

CloseHandle — Closes an open file handle.

ReadFile — Reads data from the specified file.

WriteFile — Writes data to the specified file.

Best Practices

Native Functions

Prefer the native I/O functions instead of the standard library functions.

The standard library functions are wrappers on top of the native functions and have extra overhead.

The native functions provide optimization flags and asynchronous file operations.

Asynchronous File Operations

By default, the read and write functions block until the entire read or write operation has completed.

I/O operations should be asynchronous.

The benefit from using asynchronous I/O is that you can have more than one request in flight at a time.

💡
Typically, SSDs have a command queue of 32 requests. As the command queue is shared with the operating system, a good number of in-flight requests is around 16.

Windows and Xbox

There are some requirements to support overlapped I/O:

On Windows the sector sizes can be retrieved using the IOCTL API:
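Once a sector size is known, it is typically used for alignment: unbuffered reads on Windows (FILE_FLAG_NO_BUFFERING, often combined with overlapped I/O) require transfer sizes and file offsets to be multiples of the sector size. The helpers below are a sketch of that rounding only. They do not query the device, and the names `AlignUp` and `IsAligned` are hypothetical.

```cpp
#include <cstdint>

// Rounds value up to the next multiple of alignment.
// Assumes alignment is a power of two, which sector sizes such as
// 512 or 4096 bytes are.
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// True when value is already a multiple of the (power-of-two) alignment.
constexpr bool IsAligned(std::uint64_t value, std::uint64_t alignment) {
    return (value & (alignment - 1)) == 0;
}
```

For example, a 5000-byte request on a device with 4096-byte sectors would be issued as an 8192-byte read.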

Compression

I/O performance can be improved by using lossless compression techniques for storing data.

Read calls

Most file loading problems on rotational and optical media come down to seek time.

Read and write data in large blocks for better performance.

Reading in larger amounts of data per read call will directly lower the number of read calls, and therefore reduce the number of seeks.
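To make the relationship between block size and call count concrete, the sketch below reads a file in fixed-size blocks and counts the calls that returned data. `CountReadCalls` is a hypothetical helper. Note that fread goes through the C library's own buffer, so the count reflects calls made by the program rather than requests sent to the device; with native, unbuffered calls the two would match more closely.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Reads the file in blocks of blockSize bytes and returns how many read
// calls returned data. Returns 0 if the file cannot be opened or if
// blockSize is 0.
std::size_t CountReadCalls(const char* path, std::size_t blockSize) {
    if (blockSize == 0) return 0;
    std::FILE* file = std::fopen(path, "rb");
    if (!file) return 0;

    std::vector<char> buffer(blockSize);
    std::size_t calls = 0;
    while (std::fread(buffer.data(), 1, buffer.size(), file) > 0) {
        ++calls;
    }
    std::fclose(file);
    return calls;
}
```

For a 100,000-byte file, 1 KB blocks need 98 calls while 64 KB blocks need 2.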

💡
The gains are less significant on an SSD, which has no mechanical seek.

Windows and Xbox

When calling ReadFile and WriteFile, request large blocks through the nNumberOfBytesToRead and nNumberOfBytesToWrite parameters.

Minimum Buffer Sizes

Xbox

Device        Read     Write
Hard disk     32 KB    16 KB
Memory unit   64 KB    16 KB

Sequential access

Sequential access is faster than random access.

💡
Rotational disks gain more from sequential access than SSDs, because sequential reads avoid moving the head between tracks.

Preset the File Size

File write times can be improved if the size of a new file is known in advance.

On Windows and Xbox, use SetFilePointer and SetEndOfFile to preset the file size.
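For readers without the native API at hand, the idea can be approximated in portable C++ by seeking to the final byte and writing it, which extends the file to its full size before any real data is written. This is a stand-in for the SetFilePointer and SetEndOfFile pair, not an equivalent: some file systems will create a sparse file rather than reserving the space. `PresetFileSize` is a hypothetical name.

```cpp
#include <cstdio>

// Creates (or truncates) the file and extends it to `size` bytes by
// writing a single zero byte at the last position.
// Returns false if size is not positive or any step fails.
bool PresetFileSize(const char* path, long size) {
    if (size <= 0) return false;
    std::FILE* file = std::fopen(path, "wb");
    if (!file) return false;

    bool ok = std::fseek(file, size - 1, SEEK_SET) == 0 &&
              std::fputc(0, file) != EOF;
    // Always close the file, and report failure if the flush on close fails.
    return std::fclose(file) == 0 && ok;
}
```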

Avoid Small Files

Most file systems have a minimum allocation size for files called a cluster.

Small files have a high overhead because each file requires a minimum of one cluster of physical space.

💡
On Windows, the default cluster size is 4 KB for an NTFS hard disk and 16 KB for a FAT32 hard disk. On Xbox, the default cluster size is 16 KB for the hard disk and memory unit.
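The overhead is easy to estimate: a file occupies a whole number of clusters, so its footprint is its size rounded up to the cluster size. The helpers below are a small sketch of that arithmetic under the simplifying assumption stated above, that every file takes at least one cluster; the names are hypothetical.

```cpp
#include <cstdint>

// Disk space occupied by a file: its size rounded up to whole clusters,
// with a minimum of one cluster. clusterSize must be greater than 0.
constexpr std::uint64_t AllocatedSize(std::uint64_t fileSize,
                                      std::uint64_t clusterSize) {
    return fileSize == 0
               ? clusterSize
               : ((fileSize + clusterSize - 1) / clusterSize) * clusterSize;
}

// Space allocated to the file but not used by its data.
constexpr std::uint64_t SlackBytes(std::uint64_t fileSize,
                                   std::uint64_t clusterSize) {
    return AllocatedSize(fileSize, clusterSize) - fileSize;
}
```

With 16 KB clusters, 1000 files of 100 bytes each hold 100,000 bytes of data but occupy 16,384,000 bytes on disk.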

Number of Files in a Directory

Because of the technique used by the operating system to store directory listings, a large number of files in a directory will affect file access performance.

The number of files and subdirectories in a directory should be limited.

💡
On Windows, it is recommended to store fewer than 1000 files. On Xbox, it is recommended to store fewer than 768 files.
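One possible way to stay under such a limit is to spread files over numbered subdirectories. The sketch below assumes files can be identified by a running index; `BucketedPath` and the directory layout are hypothetical, and the bucket directories themselves also count toward the limit of the root directory.

```cpp
#include <cstddef>
#include <string>

// Maps the file with the given index to a numbered subdirectory so that
// no subdirectory holds more than maxFilesPerDir files.
// A limit of 0 is treated as 1 to avoid dividing by zero.
std::string BucketedPath(const std::string& root, std::size_t fileIndex,
                         const std::string& fileName,
                         std::size_t maxFilesPerDir) {
    if (maxFilesPerDir == 0) maxFilesPerDir = 1;
    std::size_t bucket = fileIndex / maxFilesPerDir;
    return root + "/" + std::to_string(bucket) + "/" + fileName;
}
```

With a limit of 768, files 0 to 767 land in `root/0`, files 768 to 1535 in `root/1`, and so on.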

Checksums

Checksums can be useful for any data that you want to protect from hackers or simply want to verify at run-time.