
How to map the physical contiguous pages into a block of logical contiguous address space?

Posted by luludede 
How to map the physical contiguous pages into a block of logical contiguous address space?
December 17, 2009 09:45PM
Hello,

In the Linux kernel, we have alloc_pages(unsigned int gfp_mask, unsigned int order) to allocate 2^order physically contiguous pages, and page_address(struct page *page) to return the logical address of a single page.

However, it seems that page_address() doesn't guarantee that physically contiguous pages are mapped to a logically contiguous address range. For example, if we call page_address() on each of four physically contiguous pages, the returned logical addresses are not necessarily contiguous.

My question is: how can I map physically contiguous pages into a block of logically contiguous address space?

Thank you!
Re: How to map the physical contiguous pages into a block of logical contiguous address space?
December 18, 2009 12:58AM
alloc_pages() with a non-zero order argument does return physically contiguous pages.

If these pages are from lowmem, then their page_address() values are simply the physical addresses plus a constant offset, and are thus logically contiguous as well.
If these pages are from highmem, then they are not directly accessible from the kernel at all; they have to be mapped one by one. The kernel does not provide an API to map several highmem pages next to each other (or at least I'm not aware of such an API).

However, alloc_pages() is a low-level interface. Do you really need it? If the allocation you need is small enough (say, less than several megabytes), then a simple kmalloc() will give you space that is both physically and virtually contiguous. If the buffer is larger, then it may be a better idea to allocate the space at system boot, using the alloc_bootmem() interface, although that forces you to have (part of) your code built into the kernel image rather than a module.
Re: How to map the physical contiguous pages into a block of logical contiguous address space?
December 18, 2009 01:41AM
Hi Yoush,
Thanks for your explanation. I also checked the code; here is a summary of your comment and what I found:

1. For pages in the normal and DMA memory zones, physically contiguous pages are also logically contiguous,
because the kernel maps physical addresses to logical addresses linearly (simplified, with PAGE_SHIFT being 12 for 4 KiB pages):
page_address(struct page *page) {
    return __va((unsigned long)(page - mem_map) << PAGE_SHIFT);
}
2. For pages in the high memory zone, physically contiguous does not necessarily mean logically contiguous, because the kernel assigns a logical address to each page individually, in a more or less arbitrary way (it uses map_new_virtual() to map the pages one by one via the global array pkmap_count[LAST_PKMAP]).
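
To make point 1 concrete, here is a small user-space model of the linear lowmem mapping. It is only a sketch, not kernel code: the model_* names are hypothetical, and the constants (4 KiB pages, a 0xC0000000 offset as on classic 32-bit x86) are assumptions that vary with architecture and configuration. Highmem is deliberately not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only: 4 KiB pages and the classic
 * 32-bit x86 3G/1G split. Real kernels differ by arch and config. */
#define MODEL_PAGE_SHIFT  12u
#define MODEL_PAGE_SIZE   (1u << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_OFFSET 0xC0000000u

/* Model of __va(): physical address -> kernel logical address. */
static uint32_t model_va(uint32_t phys)
{
    return phys + MODEL_PAGE_OFFSET;
}

/* Model of page_address() for a lowmem page, identified here by its
 * page frame number (the index of the page in mem_map). */
static uint32_t model_page_address(uint32_t pfn)
{
    return model_va(pfn << MODEL_PAGE_SHIFT);
}

/* Returns 1 if the 2^order frames starting at first_pfn map to
 * consecutive logical addresses, 0 otherwise. */
static int model_block_is_contiguous(uint32_t first_pfn, unsigned order)
{
    uint32_t n, i;

    assert(order < 11); /* keep the model block small */
    n = 1u << order;
    for (i = 1; i < n; i++) {
        if (model_page_address(first_pfn + i) !=
            model_page_address(first_pfn) + i * MODEL_PAGE_SIZE)
            return 0;
    }
    return 1;
}
```

Because the mapping is just "add a constant", model_block_is_contiguous() holds for any lowmem block in this model, which is the same reason an order-2 alloc_pages() result from lowmem can be treated as one 16 KiB buffer.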
--------------------------------

I do need to allocate a series of contiguous pages (not a plain kernel memory buffer).
I am trying to improve the performance of read().
In the current design of the Linux kernel, for a 64 KiB read() request (in the scenario where the data is not in the page cache), the kernel will:
1) divide the file read request into 16 page requests
2) fill the 16 pages by reading the data from the hard drive
3) invoke copy_to_user() 16 times to copy the data to the user buffer
Note that a single 64 KiB read() request from the user incurs 16 calls to copy_to_user(), which is kind of expensive. So I plan to arrange the pages in a logically contiguous layout; then a single copy_to_user() call could finish a 64 KiB read().

I tested the throughput of copying 512 MiB of data from kernel to user at both 4 KiB and 64 KiB granularity. The larger granularity is roughly 3 times as fast as the smaller one:
__copy_to_user_ll (64 KiB unit) for 8192 times, avg rate: 4510760 KiB/s
__copy_to_user_ll (4 KiB unit) for 131072 times, avg rate: 1515668 KiB/s
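
The call counts above follow directly from the chunk size: 512 MiB in 8192 calls is 64 KiB per call, and 512 MiB in 131072 calls is 4 KiB per call. A minimal user-space sketch of that arithmetic, using memcpy() as a stand-in for copy_to_user() (so it says nothing about the kernel's actual copy cost), with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of copy calls needed to move `total` bytes in pieces of at
 * most `chunk` bytes. */
static size_t calls_needed(size_t total, size_t chunk)
{
    assert(chunk > 0);
    return (total + chunk - 1) / chunk;
}

/* Copy `total` bytes with at most `chunk` bytes per memcpy() call.
 * Returns the number of memcpy() calls made. */
static size_t copy_in_chunks(void *dst, const void *src,
                             size_t total, size_t chunk)
{
    size_t done = 0, calls = 0;

    assert(chunk > 0);
    while (done < total) {
        size_t n = (total - done < chunk) ? total - done : chunk;

        memcpy((char *)dst + done, (const char *)src + done, n);
        done += n;
        calls++;
    }
    return calls;
}

/* Allocate two buffers, copy one into the other in chunks, verify the
 * result, and return the number of calls (0 on failure or mismatch). */
static size_t demo_copy_calls(size_t total, size_t chunk)
{
    unsigned char *src = malloc(total);
    unsigned char *dst = malloc(total);
    size_t calls = 0, i;

    if (src && dst) {
        for (i = 0; i < total; i++)
            src[i] = (unsigned char)(i * 31u);
        calls = copy_in_chunks(dst, src, total, chunk);
        if (memcmp(dst, src, total) != 0)
            calls = 0;
    }
    free(src);
    free(dst);
    return calls;
}
```

For a 64 KiB request, demo_copy_calls(64 * 1024, 4 * 1024) makes 16 calls and demo_copy_calls(64 * 1024, 64 * 1024) makes one, matching the 16-versus-1 copy_to_user() comparison described earlier.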

-------------------------------------------------------------------------------
Changing the granularity of copy_to_user() is just one step. I have a crazy idea:

Traditional file systems use a fixed block size (4k, 8k, or 16k). What if we create a file system with a flexible block size, one that varies according to the I/O requests from the user application? In this way, we could minimize the number of data copies between user space and the kernel buffers, as well as the number of I/Os to disk.

I am a newbie to the Linux kernel, and I welcome any comments on this!

Thank you.

-luludede
Re: How to map the physical contiguous pages into a block of logical contiguous address space?
December 19, 2009 03:10PM
Several comments on that.

- A filesystem is actually a data structure on top of a block device; all I/O is actually done at the block device level. Are you going to implement a block device with variable-sized blocks? Why do you think it will perform better?

- For performance-sensitive applications, read() is not the right interface to use; mmap() should be used instead. In that case there is a zero-copy scenario: the disk driver DMAs data into pages, and exactly those pages get mapped into process memory. Within that scenario, many readpage() requests are active at the same time, and the I/O scheduler tries to get maximum performance out of them. There is almost certainly some room for improvement, but I doubt it is in variable-sized blocks.
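
For reference, a minimal user-space sketch of the mmap() approach, assuming a POSIX system. The function names are hypothetical; sum_file_mmap() reads the file contents straight out of the mapping, without a read() call copying into a user buffer, and write_file() is only a helper for trying it out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Sum all bytes of a file through a read-only mapping.
 * Returns -1 on error (including an empty file, which cannot be mapped). */
static long sum_file_mmap(const char *path)
{
    struct stat st;
    const unsigned char *p;
    size_t len, i;
    long sum = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    len = (size_t)st.st_size;
    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    for (i = 0; i < len; i++)
        sum += p[i]; /* page faults pull the data in on first touch */

    munmap((void *)p, len);
    return sum;
}

/* Helper: create or truncate `path` and write `len` bytes to it.
 * Returns 0 on success, -1 on failure. */
static int write_file(const char *path, const void *buf, size_t len)
{
    ssize_t n;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    n = write(fd, buf, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Whether this actually beats read() depends on the access pattern and the cost of page faults, so it is worth measuring for the workload in question.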