Optimizations
Some parts of this section will compare against the original unreleased C# implementation
This is to give an idea of where things can be improved while translating some existing code.
Strings
All strings internally are stored in the platform's native string format.
Encoding
Unix-based Systems
On Linux and other unix-based systems, a file or folder is usually represented by a sequence of bytes.
We'll assume UTF-8, as that's the case on pretty much all modern distros.
Windows
On Windows, that is a sequence of 16-bit characters (UTF-16).
Although Windows has ANSI (A
) and Unicode (W
) APIs side by side, under the hood, Windows
converts everything to UTF-16.
In order to speed things up, it is planned to use UTF-16 strings internally for non-ASCII input to avoid re-encoding them like in the original C# implementation.
This may change, in Rust version, we may investigate the tradeoff.
Case Conversion
In Windows, expect case insensitive paths.
We have to convert paths to uppercase inside various hooks.
We choose uppercase because some languages e.g. greek have special rules for lowercase
This saves us from having to deal with those rules, improving performance.
Previous Implementation
In the C# implementation, we backported a ToUpper
function from one of the .NET 8 Previews.
Implementation Here. (This function is faster than the one which
shipped in .NET 8 proper)
In said implementation, the code is optimized for 99% of the cases where the paths are pure ASCII; with rare other cases falling to a slower implementation.
Rust Implementation
In Rust we can use the make_ascii_uppercase function if we can trust
the paths to be ASCII for comparisons. Alternatively there's also functions like to_uppercase
.
We may want to copy out the implementation of to_uppercase
to return result if string is ASCII
Rather than explicitly checking in another step, which is expensive.
These are not vectorised in the Rust source, but thanks to LLVM magic, they do get vectorised,
so there is generally no need to write our own. The to_uppercase
function is generally limited
to unrolled SSE2, while make_ascii_uppercase
will use rolled AVX2 if possible on CPU target.
Hashing
A hardware accelerated string hash function is used for file paths.
In the original C# implementation, this was a vectorized function which now lives here Reloaded.Memory String Hash.
For the Rust implementation, it's we will use AHash. See the Common/Hashing page for more details.
- All strings stored as Wide Strings.
- Windows APIs use Wide Strings under the hood, even for ANSI APIs.
- Therefore we save time by not having to widen them again.
HashMaps
HashMaps need to be able to query string slices.
In original C# implementation, it was necessary to use a custom dictionary that could query string slices as keys, due to limitation of the built in one.
In Rust, this won't be needed; as thanks to the Equivalent trait. Or using the
HashTable
API raw.
Hashbrown (Swisstable) is also faster.
Memory Storage
In the Rust implementation, we will pack strings together for efficiency.
In C#, some code was limited to the standard String
type due to interoperability with
existing APIs. With some additional foresight and experience, we can do a bit better in Rust.
The idea is to use string pooling
and custom string headers. Rather than having an allocation for
each string, we can perform a single memory allocation and then store all of the strings inline in
a single buffer.
// Pseudocode
pub struct StringEntry
{
/// Truth Table for char_count_and_flag:
///
/// | Bit Position (from LSB) | 0 (IsAscii flag) | 1-15 (char_count) |
/// |--------------------------------|------------------|----------------------|
/// | Value when ASCII | 1 | ASCII char count |
/// | Value when non-ASCII (Windows) | 0 | UTF-16 char count |
/// | Value when non-ASCII (Unix) | 0 | UTF-8 byte count |
///
/// - The least significant bit (LSB, position 0) is used as the `IsAscii` flag.
/// If it's 1, the string is ASCII. If it's 0, the string is non-ASCII.
///
/// - The remaining 15 bits (positions 1-15) are used to store the character count.
/// If the string is ASCII, this is the count of ASCII characters.
/// If the string is non-ASCII, on Windows this is the count of UTF-16 characters,
/// and on Unix this is the count of UTF-8 bytes.
char_count_and_flag: u16,
// Inline in memory, right after header
// Or u8 if using ANSI strings.
data: u16[length]
}
In 99% of cases, file paths will map to ASCII, meaning we only need 1 byte per character.
On Windows, where paths are internally UTF-16, we can then dynamically widen these paths to UTF-16 when needed, as that operation is very cheap for ASCII strings.
#[no_mangle]
pub fn ascii_to_utf16(ascii_str: &str, buffer: &mut [u16]) {
let bytes = ascii_str.as_bytes();
let len = bytes.len();
for i in 0..len {
buffer[i] = bytes[i] as u16;
}
}
Auto-vectorizes nicely, thanks to LLVM.
Pooled Strings in HashTables
And for Hashbrown
's low level HashTable primitive.
// 16 bytes, nicely aligns with cache lines
pub struct TableEntry
{
// 64-bit key
pub key: u64,
// Offset for directory string in pool.
// We deduplicate strings by directory, reducing memory usage further.
// On average there's typically 7 files per 1 folder in Sewer's Steam folder.
pub dir: u32,
// Offset for file string in the pool.
pub file: u32,
}
// If additional metadata is required, trim `dir` and `file` to 30 bits,
// and use remaining space for metadata.
And to search, we should compare key at minimum
// To search (hash at minimum, as shown here)
table.find(KEY_HASH, |&val| val.key == KEY_HASH);
// TODO: Add string comparison here.
Lookup Tree
After creating the 'Redirection Tree' we create a 'Lookup Tree' for translating file paths
More details on both structures can be found in 'Implementation Details' section.
The LookupTree
allows us to resolve file paths in O(3) time, at expense of some RAM.