Typically, to let user-mode code perform DMA, a device driver calls dma_alloc_coherent() to pre-allocate a chunk of memory when the kernel module/driver is loaded, which in practice usually means at boot time. Then, in its mmap() implementation, the driver uses the kernel address obtained from dma_alloc_coherent() to get the page frame number and passes it to remap_pfn_range().
The above can be considered allocating the DMA buffer at boot time.
What if I would like to allocate the DMA buffer at runtime?
In other words, the user passes the size of the region to mmap(), and the driver's mmap() handler calls __get_free_pages() or alloc_pages() to obtain contiguous pages, then creates the mapping and adds it to the page table.
To create the mapping, I found one API that handles more than one page: vm_map_pages() (from the answer to this question).
My question is:
At runtime, a request for a large number of contiguous pages may fail, so instead of one contiguous chunk of memory we may end up with a scattered list of memory regions. In that case, the user could access those regions using readv() or writev().
However, to use readv() and writev(), the user has to know the starting virtual address of each of those regions. How can we obtain those virtual addresses in kernel space?
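To make the intended readv() usage concrete, here is a minimal user-space sketch of what I have in mind. It is only an illustration under assumptions: anonymous mmap() regions stand in for the driver's mappings, a pipe stands in for the data source, and map_regions(), scatter_demo(), NREGIONS and REGION_LEN are names I made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define NREGIONS   3      /* hypothetical number of scattered regions */
#define REGION_LEN 4096   /* hypothetical size of each region, in bytes */

/* Map n separate regions and record each start address and length in iov[].
 * Anonymous mappings stand in for the driver's mappings here.
 * Returns 0 on success, -1 if any mmap() fails (no cleanup in this sketch). */
int map_regions(struct iovec *iov, int n)
{
    assert(iov != NULL && n > 0);
    for (int i = 0; i < n; i++) {
        void *p = mmap(NULL, REGION_LEN, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        iov[i].iov_base = p;          /* user-space start address of region i */
        iov[i].iov_len = REGION_LEN;
    }
    return 0;
}

/* Write one REGION_LEN chunk per region into a pipe ('A', 'B', 'C', ...),
 * then scatter the data into the regions with a single readv() call.
 * Returns the number of bytes read, or -1 on error. */
ssize_t scatter_demo(struct iovec *iov, int n)
{
    char chunk[REGION_LEN];
    int fds[2];
    ssize_t got;

    if (pipe(fds) != 0)
        return -1;
    for (int i = 0; i < n; i++) {
        memset(chunk, 'A' + i, sizeof chunk);
        if (write(fds[1], chunk, sizeof chunk) != (ssize_t)sizeof chunk) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    got = readv(fds[0], iov, n);      /* fills the regions in iov[] order */
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

A caller would declare struct iovec iov[NREGIONS], call map_regions(iov, NREGIONS) and then scatter_demo(iov, NREGIONS). The point is that each iov_base must hold the user-space start address of one region, which is exactly the information I do not know how to obtain when the regions come from the driver.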
The vm_area_struct structure has a field, vm_next, that points to another vm_area_struct. In my current implementation, for each list of pages obtained from alloc_pages(), I create a mapping using vm_map_pages(). However, tracing the code of vm_map_pages(), I did not find it creating a new vm_area_struct and appending it to the current one. That is why I am confused. If the linked list contains only one vm_area_struct, how can we obtain the virtual addresses of those memory regions?
How can I obtain the user-space virtual address of each scattered mapping?