fuse: fix fuseblk i_blkbits for iomap partial writes

On regular fuse filesystems, i_blkbits is set to PAGE_SHIFT which means
any iomap partial writes will mark the entire folio as uptodate. However
fuseblk filesystems work differently and allow the blocksize to be less
than the page size. As such, this may lead to data corruption if fuseblk
sets its blocksize to less than the page size, uses the writeback cache,
and does a partial write, then a read and the read happens before the
write has undergone writeback, since the folio will not be marked
uptodate from the partial write so the read will read in the entire
folio from disk, which will overwrite the partial write.

The long-term solution for this, which will also be needed for fuse to
enable large folios with the writeback cache on, is to have fuse also
use iomap for folio reads, but until that is done, the cleanest
workaround is to use the page size for fuseblk's internal kernel inode
blksize/blkbits values while maintaining current behavior for stat().

This was verified using ntfs-3g:
$ sudo mkfs.ntfs -f -c 512 /dev/vdd1
$ sudo ntfs-3g /dev/vdd1 ~/fuseblk
$ stat ~/fuseblk/hi.txt
IO Block: 512

Fixes: a4c9ab1d49 ("fuse: use iomap for buffered writes")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
This commit is contained in:
Joanne Koong 2025-08-15 11:25:39 -07:00 committed by Miklos Szeredi
parent 7956994650
commit bd24d2108e
3 changed files with 21 additions and 2 deletions

View File

@ -1199,7 +1199,7 @@ static void fuse_fillattr(struct mnt_idmap *idmap, struct inode *inode,
if (attr->blksize != 0)
blkbits = ilog2(attr->blksize);
else
blkbits = inode->i_sb->s_blocksize_bits;
blkbits = fc->blkbits;
stat->blksize = 1 << blkbits;
}

View File

@ -975,6 +975,14 @@ struct fuse_conn {
/* Request timeout (in jiffies). 0 = no timeout */
unsigned int req_timeout;
} timeout;
/*
* This is a workaround until fuse uses iomap for reads.
* For fuseblk servers, this represents the blocksize passed in at
* mount time and for regular fuse servers, this is equivalent to
* inode->i_blkbits.
*/
u8 blkbits;
};
/*

View File

@ -292,7 +292,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
if (attr->blksize)
fi->cached_i_blkbits = ilog2(attr->blksize);
else
fi->cached_i_blkbits = inode->i_sb->s_blocksize_bits;
fi->cached_i_blkbits = fc->blkbits;
/*
* Don't set the sticky bit in i_mode, unless we want the VFS
@ -1810,10 +1810,21 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
err = -EINVAL;
if (!sb_set_blocksize(sb, ctx->blksize))
goto err;
/*
* This is a workaround until fuse hooks into iomap for reads.
* Use PAGE_SIZE for the blocksize else if the writeback cache
* is enabled, buffered writes go through iomap and a read may
* overwrite partially written data if blocksize < PAGE_SIZE
*/
fc->blkbits = sb->s_blocksize_bits;
if (ctx->blksize != PAGE_SIZE &&
!sb_set_blocksize(sb, PAGE_SIZE))
goto err;
#endif
} else {
sb->s_blocksize = PAGE_SIZE;
sb->s_blocksize_bits = PAGE_SHIFT;
fc->blkbits = sb->s_blocksize_bits;
}
sb->s_subtype = ctx->subtype;