Buffer Heads¶
Linux uses buffer heads to maintain state about individual filesystem blocks. Buffer heads are deprecated and new filesystems should use iomap instead.
Functions¶
-
void brelse(struct buffer_head *bh)¶
Release a buffer.
Parameters
struct buffer_head *bh
The buffer to release.
Description
Decrement a buffer_head’s reference count. If bh is NULL, this function is a no-op.
If all buffers on a folio have zero reference count, are clean and unlocked, and if the folio is unlocked and not under writeback, then try_to_free_buffers() may strip the buffers from the folio in preparation for freeing it (sometimes, rarely, buffers are removed from a folio but it ends up not being freed, and buffers may later be reattached).
Context
Any context.
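As an illustration (a minimal sketch; the helper name is hypothetical), a caller that takes its own reference with get_bh() balances it with brelse() when done:

    #include <linux/buffer_head.h>

    /* Hypothetical: hold an extra reference on a buffer while using it. */
    static void example_hold_and_release(struct buffer_head *bh)
    {
            get_bh(bh);             /* take an extra reference */
            /* ... inspect bh->b_data, queue the buffer, etc. ... */
            brelse(bh);             /* drop our reference */
    }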
-
void bforget(struct buffer_head *bh)¶
Discard any dirty data in a buffer.
Parameters
struct buffer_head *bh
The buffer to forget.
Description
Call this function instead of brelse() if the data written to a buffer no longer needs to be written back. It will clear the buffer’s dirty flag so writeback of this buffer will be skipped.
Context
Any context.
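A minimal sketch of the intended use (the helper is hypothetical): a dirtied metadata block is being deallocated, so its contents must never reach the disk.

    #include <linux/buffer_head.h>
    #include <linux/string.h>

    /* Hypothetical: the block this buffer maps is being freed. */
    static void example_discard_block(struct buffer_head *bh)
    {
            memset(bh->b_data, 0, bh->b_size);
            mark_buffer_dirty(bh);
            /* ... the filesystem decides to free the block instead ... */
            bforget(bh);    /* clear the dirty bit and drop our reference */
    }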
-
struct buffer_head *__bread(struct block_device *bdev, sector_t block, unsigned size)¶
Read a block.
Parameters
struct block_device *bdev
The block device to read from.
sector_t block
Block number in units of block size.
unsigned size
The block size of this device in bytes.
Description
Read a specified block, and return the buffer head that refers to it. The memory is allocated from the movable area so that it can be migrated. The returned buffer head has its refcount increased. The caller should call brelse() when it has finished with the buffer.
Context
May sleep waiting for I/O.
Return
The buffer head on success, or NULL if the block was unreadable.
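A minimal sketch, assuming a 4096-byte block size (the helper is hypothetical):

    #include <linux/buffer_head.h>

    /* Hypothetical: read one block and inspect it. */
    static int example_read_block(struct block_device *bdev, sector_t block)
    {
            struct buffer_head *bh = __bread(bdev, block, 4096);

            if (!bh)
                    return -EIO;    /* block was unreadable */
            /* ... examine bh->b_data ... */
            brelse(bh);
            return 0;
    }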
-
struct buffer_head *get_nth_bh(struct buffer_head *bh, unsigned int count)¶
Get a reference on the n’th buffer after this one.
Parameters
struct buffer_head *bh
The buffer to start counting from.
unsigned int count
How many buffers to skip.
Description
This is primarily useful for finding the nth buffer in a folio; in that case you pass the head buffer and the byte offset in the folio divided by the block size. It can be used for other purposes, but it will wrap at the end of the folio rather than returning NULL or proceeding to the next folio for you.
Return
The requested buffer with an elevated refcount.
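A minimal sketch of the folio case described above (the helper is hypothetical); the caller must brelse() the returned buffer:

    #include <linux/buffer_head.h>

    /* Hypothetical: buffer covering byte 'offset' of the folio whose
     * head buffer is 'head', with 'blocksize'-byte blocks. */
    static struct buffer_head *example_bh_at_offset(struct buffer_head *head,
                                                    size_t offset,
                                                    unsigned int blocksize)
    {
            return get_nth_bh(head, offset / blocksize);
    }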
-
int sync_mapping_buffers(struct address_space *mapping)¶
write out & wait upon a mapping’s “associated” buffers
Parameters
struct address_space *mapping
the mapping which wants those buffers written
Description
Starts I/O against the buffers at mapping->i_private_list, and waits upon that I/O.
Basically, this is a convenience function for fsync(). mapping is a file or directory which needs those buffers to be written for a successful fsync().
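A minimal sketch of a hand-rolled fsync built on it (the helper is hypothetical): flush the data pages first, then the associated metadata buffers.

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    /* Hypothetical ->fsync implementation. */
    static int example_fsync(struct file *file, loff_t start, loff_t end,
                             int datasync)
    {
            struct inode *inode = file_inode(file);
            int err = file_write_and_wait_range(file, start, end);

            if (err)
                    return err;
            return sync_mapping_buffers(inode->i_mapping);
    }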
-
int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end, bool datasync)¶
generic buffer fsync implementation for simple filesystems with no inode lock
Parameters
struct file *file
file to synchronize
loff_t start
start offset in bytes
loff_t end
end offset in bytes (inclusive)
bool datasync
only synchronize essential metadata if true
Description
This is a generic implementation of the fsync method for simple filesystems which track all non-inode metadata in the buffers list hanging off the address_space structure.
-
int generic_buffers_fsync(struct file *file, loff_t start, loff_t end, bool datasync)¶
generic buffer fsync implementation for simple filesystems with no inode lock
Parameters
struct file *file
file to synchronize
loff_t start
start offset in bytes
loff_t end
end offset in bytes (inclusive)
bool datasync
only synchronize essential metadata if true
Description
This is a generic implementation of the fsync method for simple filesystems which track all non-inode metadata in the buffers list hanging off the address_space structure. This also makes sure that a device cache flush operation is called at the end.
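A minimal sketch of wiring it up (names are hypothetical); a thin wrapper adapts the int datasync of the ->fsync prototype to the bool parameter:

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    static int example_generic_fsync(struct file *file, loff_t start,
                                     loff_t end, int datasync)
    {
            return generic_buffers_fsync(file, start, end, datasync);
    }

    /* Hypothetical file_operations for a simple filesystem. */
    static const struct file_operations example_file_ops = {
            .llseek     = generic_file_llseek,
            .read_iter  = generic_file_read_iter,
            .write_iter = generic_file_write_iter,
            .fsync      = example_generic_fsync,
    };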
-
bool block_dirty_folio(struct address_space *mapping, struct folio *folio)¶
Mark a folio as dirty.
Parameters
struct address_space *mapping
The address space containing this folio.
struct folio *folio
The folio to mark dirty.
Description
Filesystems which use buffer_heads can use this function as their ->dirty_folio implementation. Some filesystems need to do a little work before calling this function. Filesystems which do not use buffer_heads should call filemap_dirty_folio() instead.
If the folio has buffers, the uptodate buffers are set dirty, to preserve dirty-state coherency between the folio and the buffers. Buffers added to a dirty folio are created dirty.
The buffers are dirtied before the folio is dirtied. There’s a small race window in which writeback may see the folio cleanness but not the buffer dirtiness. That’s fine. If this code were to set the folio dirty before the buffers, writeback could clear the folio dirty flag, see a bunch of clean buffers and we’d end up with dirty buffers/clean folio on the dirty folio list.
We use i_private_lock to lock against try_to_free_buffers() while using the folio’s buffer list. This also prevents clean buffers being added to the folio after it was set dirty.
Context
May only be called from process context. Does not sleep. Caller must ensure that folio cannot be truncated during this call, typically by holding the folio lock or having a page in the folio mapped and holding the page table lock.
Return
True if the folio was dirtied; false if it was already dirty.
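For instance (a hypothetical address_space_operations table), a buffer_head-based filesystem typically plugs the function in directly:

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    /* Hypothetical aops for a buffer_head-based filesystem. */
    static const struct address_space_operations example_aops = {
            .dirty_folio      = block_dirty_folio,
            .invalidate_folio = block_invalidate_folio,
    };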
-
void mark_buffer_dirty(struct buffer_head *bh)¶
mark a buffer_head as needing writeout
Parameters
struct buffer_head *bh
the buffer_head to mark dirty
Description
mark_buffer_dirty() will set the dirty bit against the buffer, then set its backing page dirty, then tag the page as dirty in the page cache and then attach the address_space’s inode to its superblock’s dirty inode list.
mark_buffer_dirty() is atomic. It takes bh->b_folio->mapping->i_private_lock, the i_pages lock and mapping->host->i_lock.
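A minimal sketch (hypothetical helper, assuming the block holds a little-endian counter): modify the buffer contents, then mark it dirty so it reaches writeback.

    #include <linux/buffer_head.h>

    /* Hypothetical: update an on-disk counter stored in this block. */
    static void example_update_counter(struct buffer_head *bh, u32 value)
    {
            __le32 *counter = (__le32 *)bh->b_data;

            lock_buffer(bh);
            *counter = cpu_to_le32(value);
            unlock_buffer(bh);
            mark_buffer_dirty(bh);
    }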
-
void __brelse(struct buffer_head *bh)¶
Release a buffer.
Parameters
struct buffer_head *bh
The buffer to release.
Description
This variant of brelse() can be called if bh is guaranteed to not be NULL.
-
void __bforget(struct buffer_head *bh)¶
Discard any dirty data in a buffer.
Parameters
struct buffer_head *bh
The buffer to forget.
Description
This variant of bforget() can be called if bh is guaranteed to not be NULL.
-
struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp)¶
Get a buffer_head in a block device’s buffer cache.
Parameters
struct block_device *bdev
The block device.
sector_t block
The block number.
unsigned size
The size of buffer_heads for this bdev.
gfp_t gfp
The memory allocation flags to use.
Description
The returned buffer head has its reference count incremented, but is not locked. The caller should call brelse() when it has finished with the buffer. The buffer may not be uptodate. If needed, the caller can bring it uptodate either by reading it or overwriting it.
Return
The buffer head, or NULL if memory could not be allocated.
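A minimal sketch of the overwrite case (hypothetical helper; the block size and GFP_NOFS are assumptions): since the buffer will be written in full, no read is needed.

    #include <linux/buffer_head.h>
    #include <linux/string.h>

    /* Hypothetical: obtain a block that will be completely overwritten. */
    static struct buffer_head *example_prepare_block(struct block_device *bdev,
                                                     sector_t block)
    {
            struct buffer_head *bh = bdev_getblk(bdev, block, 4096, GFP_NOFS);

            if (!bh)
                    return NULL;
            lock_buffer(bh);
            memset(bh->b_data, 0, bh->b_size);
            set_buffer_uptodate(bh);
            unlock_buffer(bh);
            mark_buffer_dirty(bh);
            return bh;      /* caller must brelse() */
    }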
-
struct buffer_head *__bread_gfp(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp)¶
Read a block.
Parameters
struct block_device *bdev
The block device to read from.
sector_t block
Block number in units of block size.
unsigned size
The block size of this device in bytes.
gfp_t gfp
Not page allocation flags; see below.
Description
You are not expected to call this function. You should use one of sb_bread(), sb_bread_unmovable() or __bread().
Read a specified block, and return the buffer head that refers to it. If gfp is 0, the memory will be allocated using the block device’s default GFP flags. If gfp is __GFP_MOVABLE, the memory may be allocated from a movable area. Do not pass in a complete set of GFP flags.
The returned buffer head has its refcount increased. The caller should call brelse() when it has finished with the buffer.
Context
May sleep waiting for I/O.
Return
The buffer head on success, or NULL if the block was unreadable.
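A minimal sketch of the recommended route (hypothetical helper): filesystems normally go through sb_bread(), which supplies the superblock’s block size and the movable-allocation policy.

    #include <linux/buffer_head.h>

    /* Hypothetical: read a metadata block through the sb_bread() wrapper. */
    static struct buffer_head *example_read_meta(struct super_block *sb,
                                                 sector_t block)
    {
            return sb_bread(sb, block);     /* NULL if unreadable */
    }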
-
void block_invalidate_folio(struct folio *folio, size_t offset, size_t length)¶
Invalidate part or all of a buffer-backed folio.
Parameters
struct folio *folio
The folio which is affected.
size_t offset
start of the range to invalidate
size_t length
length of the range to invalidate
Description
block_invalidate_folio() is called when all or part of the folio has been invalidated by a truncate operation.
block_invalidate_folio() does not have to release all buffers, but it must ensure that no dirty buffer is left outside offset and that no I/O is underway against any of the blocks which are outside the truncation point, because the caller is about to free (and possibly reuse) those blocks on-disk.
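A minimal sketch (hypothetical): a filesystem’s own ->invalidate_folio can do its bookkeeping first and then defer the buffer-level work to block_invalidate_folio().

    #include <linux/buffer_head.h>

    /* Hypothetical ->invalidate_folio implementation. */
    static void example_invalidate_folio(struct folio *folio, size_t offset,
                                         size_t length)
    {
            /* ... filesystem-specific bookkeeping, e.g. journalling ... */
            block_invalidate_folio(folio, offset, length);
    }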
-
void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)¶
clean a range of buffers in block device
Parameters
struct block_device *bdev
Block device to clean buffers in
sector_t block
Start of a range of blocks to clean
sector_t len
Number of blocks to clean
Description
We are taking a range of blocks for data and we don’t want writeback of any buffer-cache aliases from the moment this function returns until something explicitly marks the buffer dirty (hopefully that will not happen until we free that block ;-) We don’t even need to mark it not-uptodate - nobody can expect anything from a newly allocated buffer anyway.
We used to use unmap_buffer() for such invalidation, but that was wrong. We definitely don’t want to mark the alias unmapped, for example - it would confuse anyone who might pick it up with bread() afterwards...
Also, note that bforget() doesn’t lock the buffer. So there can be writeout I/O going on against recently-freed buffers. We don’t wait on that I/O in bforget() - it’s more efficient to wait on the I/O only if we really need to. That happens here.
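A minimal sketch of when a filesystem calls it (hypothetical helper): immediately after allocating previously-free blocks for data, so no stale alias can be written back over them.

    #include <linux/buffer_head.h>

    /* Hypothetical: 'len' blocks starting at 'block' were just allocated. */
    static void example_blocks_allocated(struct block_device *bdev,
                                         sector_t block, sector_t len)
    {
            clean_bdev_aliases(bdev, block, len);
            /* ... the new blocks can now be written via the page cache ... */
    }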
-
bool try_to_free_buffers(struct folio *folio)¶
Release buffers attached to this folio.
Parameters
struct folio *folio
The folio.
Description
If any buffers are in use (dirty, under writeback, elevated refcount), no buffers will be freed.
If the folio is dirty but all the buffers are clean then we need to be sure to mark the folio clean as well. This is because the folio may be against a block device, and a later reattachment of buffers to a dirty folio will set all buffers dirty. Which would corrupt filesystem data on the same device.
The same applies to regular filesystem folios: if all the buffers are clean then we set the folio clean and proceed. To do that, we require total exclusion from block_dirty_folio(). That is obtained with i_private_lock.
Exclusion against try_to_free_buffers may be obtained by either locking the folio or by holding its mapping’s i_private_lock.
Context
Process context. folio must be locked. Will not sleep.
Return
true if all buffers attached to this folio were freed.
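A minimal sketch (hypothetical): a ->release_folio implementation can simply defer to this function.

    #include <linux/buffer_head.h>

    /* Hypothetical ->release_folio implementation. */
    static bool example_release_folio(struct folio *folio, gfp_t gfp)
    {
            return try_to_free_buffers(folio);
    }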
-
int bh_uptodate_or_lock(struct buffer_head *bh)¶
Test whether the buffer is uptodate
Parameters
struct buffer_head *bh
The buffer to test.
Description
Returns true if the buffer is up-to-date; otherwise returns false with the buffer locked.
-
int __bh_read(struct buffer_head *bh, blk_opf_t op_flags, bool wait)¶
Submit read for a locked buffer
Parameters
struct buffer_head *bh
The buffer to read.
blk_opf_t op_flags
additional REQ_* flags to combine with REQ_OP_READ
bool wait
wait until the read finishes
Description
Returns zero on success, or when wait is false and the read has only been submitted; returns -EIO if the read fails.
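A minimal sketch (hypothetical helper) of the usual pairing with bh_uptodate_or_lock(): read the buffer only when it is not already uptodate.

    #include <linux/buffer_head.h>

    /* Hypothetical: bring a buffer uptodate, reading only if needed. */
    static int example_read_if_needed(struct buffer_head *bh)
    {
            if (bh_uptodate_or_lock(bh))
                    return 0;       /* already uptodate */
            /* bh is now locked, as __bh_read() requires */
            return __bh_read(bh, 0, true);
    }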
-
void __bh_read_batch(int nr, struct buffer_head *bhs[], blk_opf_t op_flags, bool force_lock)¶
Submit read for a batch of unlocked buffers
Parameters
int nr
number of entries in the buffer batch
struct buffer_head *bhs[]
a batch of struct buffer_head
blk_opf_t op_flags
additional REQ_* flags to combine with REQ_OP_READ
bool force_lock
if set, wait to lock each buffer; otherwise skip any buffer that cannot be locked immediately.
Description
This function returns no value; reads are submitted without waiting, and the result of each is reflected in the buffer’s uptodate flag when its I/O completes.
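A minimal sketch (hypothetical helper): start readahead on a batch without waiting, skipping any buffer that is already locked.

    #include <linux/buffer_head.h>

    /* Hypothetical: opportunistic readahead over a buffer batch. */
    static void example_readahead(struct buffer_head *bhs[], int nr)
    {
            __bh_read_batch(nr, bhs, REQ_RAHEAD, false);
    }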