Microsoft has announced the Resilient File System (ReFS), a replacement for the NTFS file system, which has been in use since the first release of Windows NT in 1993.
The new file system raises several of the limits in NTFS, as follows:
Limit | NTFS | ReFS |
Max file size | 2^64-1 bytes | 2^64-1 bytes |
Max volume size | 2^40 bytes | 2^78 bytes |
Max files in a directory | 2^32-1 (per volume) | 2^64 |
Max file name length | 32K Unicode (255 Unicode) | 32K Unicode |
Max path length | 32K | 32K |
I have done my best to set out the NTFS limits but it is complicated, and there are limitations in the Windows API as well as in NTFS. See this article for more on NTFS limits; and this article for an explanation of file name and path length limits in the Windows API.
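To put the powers of two in the table into more familiar units, here is a small Python sketch. The helper name `human_size` is my own, and the figures are simply those from the table above, not independently verified limits. The maximum file size is rounded up by one byte to 2^64 so that the result is a whole number.

```python
# Convert power-of-two byte counts into binary units (KiB, MiB, ...).
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def human_size(num_bytes: int) -> str:
    """Format a byte count using the largest binary unit that keeps the value >= 1.

    Uses integer division, so it is exact only for whole multiples of the unit.
    """
    value = num_bytes
    unit = 0
    while value >= 1024 and unit < len(UNITS) - 1:
        value //= 1024  # integers avoid float rounding on very large values
        unit += 1
    return f"{value} {UNITS[unit]}"


# Figures taken from the table in this post.
limits = {
    "NTFS max volume size": 2**40,
    "ReFS max volume size": 2**78,
    "Max file size (rounded up by one byte)": 2**64,
}

for name, size in limits.items():
    print(f"{name}: {human_size(size)}")
# NTFS max volume size: 1 TiB
# ReFS max volume size: 256 ZiB
# Max file size (rounded up by one byte): 16 EiB

# How many times larger the ReFS volume limit is than the NTFS figure above.
ratio = 2**78 // 2**40
print(ratio == 2**38)  # True
```

On these figures the ReFS volume limit is 2^38 times the NTFS one, which is a little under 275 billion times.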
Microsoft’s announcement focuses on two things. One is resilience, with claims that ReFS is better at preserving data in the event of power failure or other calamity. Another is how ReFS is designed to work alongside Storage Spaces, about which I posted earlier this month.
Of the two, Storage Spaces will be more visible to users. In addition, it sounds as if ReFS will not be the default in Windows 8 client:
…we will implement ReFS in a staged evolution of the feature: first as a storage system for Windows Server, then as storage for clients, and then ultimately as a boot volume. This is the same approach we have used with new file systems in the past.
Note that there are losses as well as gains in ReFS. Short file names are gone, so are quotas, so is compression:
The NTFS features we have chosen to not support in ReFS are: named streams, object IDs, short names, compression, file level encryption (EFS), user data transactions, sparse, hard-links, extended attributes, and quotas.
Overall ReFS strikes me as a conservative rather than radical upgrade. This is not the return of WinFS, an abandoned project which was to bring relational file storage to Windows. It will not help, in itself, with the biggest problem client users have with their file system: finding their stuff. Nor does it have built-in deduplication, which can make storage substantially more efficient. Microsoft says the file system is pluggable (as is NTFS) so that features like deduplication can be added by other providers or by Microsoft with other products.
Obviously, though, because Windows and other system components rely on them on the system volume, most of the missing features will need to make a comeback in a future OS release before the file system can be bootable (or indeed used generally).
At a guess, those that may be lost forever for ReFS are compression, 8.3 names, EFS and volume (as opposed to folder-) quotas.
Pretty much everything they removed sounds reasonable – except sparse files. Sparse file support is critical for virtual machines, and virtualization would seem to be a core server capability they’d want to support with their next-gen filesystem.
I’m sure they’ve examined this and determined it’s not necessary – but I’d really love to know what their rationale for that one is.
davis: Interestingly, EFS, compression, and extended attributes are all implemented on top of named streams. I’m curious if Microsoft was deprecating these features to clean up the filesystem, or if they simply decided to leave named streams until ReFS 2.0.
Joshua: Actually, sparse files are not used by virtualization software. The VMDK and VHD formats both use standard (non-sparse) files on disk, and then implement sparseness themselves at the application layer.
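As a rough illustration of what "sparseness at the application layer" can mean, here is a toy Python sketch. It is not the VHD or VMDK on-disk layout. The class name `ToySparseImage`, the 4096-byte block size, and the in-memory block map are all assumptions made for the example. The idea is that the backing store is an ordinary, densely written file, and a lookup table decides where each virtual block lives.

```python
import io

BLOCK_SIZE = 4096  # hypothetical block size chosen for this sketch


class ToySparseImage:
    """A toy 'dynamically expanding' disk image kept in an ordinary file.

    Virtual blocks that were never written occupy no space in the backing
    store and read back as zeros. The block map lives in memory only.
    """

    def __init__(self, virtual_blocks: int, backing=None):
        self.virtual_blocks = virtual_blocks
        # Any seekable binary file object works; BytesIO keeps the demo in memory.
        self.backing = backing if backing is not None else io.BytesIO()
        self.block_map = {}  # virtual block index -> physical block index

    def _check(self, index: int) -> None:
        if not 0 <= index < self.virtual_blocks:
            raise IndexError("block outside the virtual disk")

    def write_block(self, index: int, data: bytes) -> None:
        self._check(index)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        physical = self.block_map.get(index)
        if physical is None:
            # First write to this block: append it to the backing file.
            physical = len(self.block_map)
            self.block_map[index] = physical
        self.backing.seek(physical * BLOCK_SIZE)
        self.backing.write(data)

    def read_block(self, index: int) -> bytes:
        self._check(index)
        physical = self.block_map.get(index)
        if physical is None:
            return bytes(BLOCK_SIZE)  # unallocated blocks read as zeros
        self.backing.seek(physical * BLOCK_SIZE)
        return self.backing.read(BLOCK_SIZE)

    def virtual_size(self) -> int:
        return self.virtual_blocks * BLOCK_SIZE

    def physical_size(self) -> int:
        return len(self.block_map) * BLOCK_SIZE


image = ToySparseImage(virtual_blocks=1_000_000)
image.write_block(999_999, b"\x01" * BLOCK_SIZE)  # write only the last block
print(image.virtual_size())   # 4096000000
print(image.physical_size())  # 4096
```

The guest sees a disk of about 4 GB, while the backing file holds a single 4 KB block and needs no sparse-file support from the host file system.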
Sparse file support is not critical, but it is important for SQL Server.
Yes, SQL Server is an interesting case.
Many DB engines re-implement a lot of filesystem features themselves, so as to ensure that they can run on as many platforms as possible.
But Microsoft SQL Server only runs on top of NT. In SQL Server 7.0 — the first version that Microsoft released without any Sybase involvement — Microsoft was able to remove around 1/3 of the core DB engine code, simply because they were able to call down into NTFS.