grain.experimental.multithread_prefetch
- grain.experimental.multithread_prefetch(ds, num_threads, buffer_size, sequential_slice=False)
Uses a pool of threads to prefetch elements ahead of time.
This is a thread-based alternative to multiprocess_prefetch intended to be used with free-threaded Python.
It works by sharding the input dataset into num_threads shards, and interleaving them. Each shard is read by a separate thread inside InterleaveIterDataset.
- Parameters:
ds (IterDataset[T]) – The parent dataset to prefetch from.
num_threads (int) – The number of threads to use for prefetching. If 0, prefetching is disabled and this is a no-op.
buffer_size (int) – The size of the prefetch buffer for each thread.
sequential_slice (bool) – Whether to use sequential slicing when sharding the input dataset. Defaults to False.
- Returns:
An IterDataset that prefetches elements from ds using multiple threads.
- Return type:
IterDataset[T]
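
The shard-and-interleave idea described above can be illustrated with a small standard-library sketch. This is not Grain's implementation and does not call Grain. The function name `threaded_interleave`, the strided sharding, and the round-robin read order are assumptions made for illustration only. Each shard is filled by its own thread into a bounded buffer of `buffer_size` elements, and the consumer reads one element from each buffer in turn.

```python
import queue
import threading
from typing import Iterator, List, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of a shard


def threaded_interleave(
    elements: List[T], num_threads: int, buffer_size: int
) -> Iterator[T]:
    """Hypothetical helper: round-robin over per-thread prefetch buffers."""
    if num_threads == 0:
        # Mirrors the documented no-op case: no threads, no buffering.
        yield from elements
        return

    # Assumed strided sharding: shard i holds elements i, i + num_threads, ...
    shards = [elements[i::num_threads] for i in range(num_threads)]
    buffers = [queue.Queue(maxsize=buffer_size) for _ in shards]

    def fill(shard, buf):
        for item in shard:
            buf.put(item)  # blocks while this shard's buffer is full
        buf.put(_DONE)

    for shard, buf in zip(shards, buffers):
        threading.Thread(target=fill, args=(shard, buf), daemon=True).start()

    # Interleave: take one element from each still-active shard in turn.
    active = list(buffers)
    while active:
        for buf in list(active):
            item = buf.get()
            if item is _DONE:
                active.remove(buf)
            else:
                yield item


result = list(threaded_interleave(list(range(10)), num_threads=3, buffer_size=2))
```

In this sketch, strided sharding combined with round-robin reads happens to reproduce the original element order. Whether Grain guarantees the same ordering is not stated in this reference.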