Commit Graph

9 Commits

Author SHA1 Message Date
Caleb Sander Mateos 1e93de9205 io_uring/query: drop unused io_handle_query_entry() ctx arg
io_handle_query_entry() doesn't use its struct io_ring_ctx *ctx
argument. So remove it from the function and its callers.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-26 09:37:10 -07:00
Pavel Begunkov 4aaa9bc4d5 io_uring/query: introduce rings info query
Same problem as with zcrx in the previous patch, the user needs to know
SQ/CQ header sizes to allocated memory before setup to use it for user
provided rings, i.e. IORING_SETUP_NO_MMAP, however that information is
only returned after registration, hence the user is guessing kernel
implementation details.

Return the header size and alignment, which is split with the same
motivation, to allow the user to know the real structure size without
alignment in case there will be more flexible placement schemes in the
future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13 11:17:36 -07:00
Pavel Begunkov 2647e2ecc0 io_uring/query: introduce zcrx query
Add a new query type IO_URING_QUERY_ZCRX returning the user some basic
information about the interface, which includes allowed flags for areas
and registration and supported IORING_REGISTER_ZCRX_CTRL subcodes.

There is also a chicken-egg problem with user provided refill queue
memory, where offsets and size information is returned after
registration, but to properly allocate memory you need to know it
beforehand, which is why the userspace currently has to guess the RQ
headers size and severely overestimates it. Return the size information.
It's split into "size" and "alignment" fields because for default
placement modes the user is interested in the aligned size, however if
it gets support for more flexible placement, it'll need to only know the
actual header size.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13 11:17:36 -07:00
Jens Axboe ecb8490b2f Merge branch 'io_uring-6.18' into for-6.19/io_uring
Merge 6.18-rc io_uring fixes, as certain coming changes depend on some
of these.

* io_uring-6.18:
  io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs
  io_uring/query: return number of available queries
  io_uring/rw: ensure allocated iovec gets cleared for early failure
  io_uring: fix regbuf vector size truncation
  io_uring: fix types for region size calulation
  io_uring/zcrx: remove sync refill uapi
  io_uring: fix buffer auto-commit for multishot uring_cmd
  io_uring: correct __must_hold annotation in io_install_fixed_file
  io_uring zcrx: add MAINTAINERS entry
  io_uring: Fix code indentation error
  io_uring/sqpoll: be smarter on when to update the stime usage
  io_uring/sqpoll: switch away from getrusage() for CPU accounting
  io_uring: fix incorrect unlikely() usage in io_waitid_prep()

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13 07:26:37 -07:00
Pavel Begunkov 21bd7b14a3 io_uring/query: buffer size calculations with a union
Instead of having an array of a calculated size as a buffer, put all
query uapi structures into a union and pass that around. That way
everything is well typed, and the compiler will prevent opcode handling
using a structure not accounted into the buffer size.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11 07:53:33 -07:00
Pavel Begunkov 6a77267d97 io_uring/query: return number of available queries
It's useful to know which query opcodes are available. Extend the
structure and return that. It's a trivial change, and even though it can
be painlessly extended later, it'd still require adding a v2 of the
structure.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-10 14:59:35 -07:00
Pavel Begunkov 7ea24326e7 io_uring/query: cap number of queries
If a query chain forms a cycle, it'll be looping in the kernel until the
process is killed. It might be fine as any such mistake can be easily
uncovered during testing, but it's still nicer to let it break out of
the syscall if it executed too many queries.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-09-19 07:06:43 -06:00
Pavel Begunkov 2408d17832 io_uring/query: prevent infinite loops
If the query chain forms a cycle, the interface will loop indefinitely.
Make sure it handles fatal signals, so the user can kill the process and
hence break out of the infinite loop.

Fixes: c265ae75f9 ("io_uring: introduce io_uring querying")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-09-19 07:06:43 -06:00
Pavel Begunkov c265ae75f9 io_uring: introduce io_uring querying
There are many parameters users might want to query about io_uring like
available request types or the ring sizes. This patch introduces an
interface for such slow path queries.

It was written with several requirements in mind:
- Can be used with or without an io_uring instance. Asking for supported
  setup flags before creating an instance as well as qeurying info about
  an already created ring are valid use cases.
- Should be moderately fast. For example, users might use it to
  periodically retrieve ring attributes at runtime. As a consequence,
  it should be able to query multiple attributes in a single syscall.
- Backward and forward compatible.
- Should be reasobably easy to use.
- Reduce the kernel code size for introducing new query types.

It's implemented as a new registration opcode IORING_REGISTER_QUERY.
The user passes one or more query strutctures linked together, each
represented by struct io_uring_query_hdr. The header stores common
control fields needed for processing and points to query type specific
information.

The header contains
- The query type
- The result field, which on return contains the error code for the query
- Pointer to the query type specific information
- The size of the query structure. The kernel will only populate up to
  the size, which helps with backward compatibility. The kernel can also
  reduce the size, so if the current kernel is older than the inteface
  the user tries to use, it'll get only the supported bits.
- next_entry field is used to chain multiple queries.

Apart from common registeration syscall failures, it can only immediately
return an error code in case when the headers are incorrect or any
other addresses and invalid. That usually mean that the userspace
doesn't use the API right and should be corrected. All query type
specific errors are returned in the header's result field.

As an example, the patch adds a single query type for now, i.e.
IO_URING_QUERY_OPCODES, which tells what register / request / etc.
opcodes are supported, but there are particular plans to extend it.

Note: there is a request probing interface via IORING_REGISTER_PROBE,
but it's a mess. It requires the user to create a ring first, it only
works for requests, and requires dynamic allocations.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-09-08 08:06:37 -06:00