2mib_fs_dax

Differences

This shows you the differences between two versions of the page.

2mib_fs_dax [2017/12/20 20:37]
Ross Zwisler created
2mib_fs_dax [2020/09/24 00:43] (current)
Darrick Wong mount xfs with lazytime to avoid timestamp update overhead on page faults
Line 16: Line 16:
 Here are the steps that I've used to successfully get filesystem DAX PMDs:
  
-1. First, make sure that our persistent memory block device starts at a 2 MiB aligned physical address.
+1. First, make sure that your namespace is in 'fsdax' mode.
+
+ # ndctl list --human
+ {
+   "dev":"namespace0.0",
+   "mode":"fsdax",
+   "size":"16.73 GiB (17.96 GB)",
+   "uuid":"179e5b98-96ee-4988-ba9f-ed9383d11598",
+   "sector_size":512,
+   "blockdev":"pmem0",
+   "numa_node":0
+ }
+
+2. Next, make sure that our persistent memory block device starts at a 2 MiB aligned physical address.
  
 This is important because when we ask the filesystem for 2 MiB aligned and sized block allocations it will provide those block allocations relative to the beginning of its block device.  If the filesystem is built on top of a namespace whose data starts at a 1 MiB aligned offset, for example, a block allocation that is 2 MiB aligned from the point of view of the filesystem will still be only 1 MiB aligned from DAX's point of view.  This will cause DAX to fall back to 4 KiB page faults.
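
The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not part of the original steps: the helper names are hypothetical, and the misaligned start address is an invented value (the aligned namespace start plus 1 MiB) chosen to mirror the example in the paragraph above.

```python
MIB = 1024 * 1024
PMD_SIZE = 2 * MIB  # size of the 2 MiB mapping that DAX wants to use


def is_pmd_aligned(addr):
    """Return True if addr sits on a 2 MiB boundary."""
    return addr % PMD_SIZE == 0


def dax_view(device_start, fs_offset):
    """Physical address of a block, given where the block device starts
    and the block's offset relative to the start of that device."""
    return device_start + fs_offset


# An allocation that the filesystem considers 2 MiB aligned.
fs_offset = 4 * MIB

# Hypothetical device start addresses: one 2 MiB aligned,
# one that is only 1 MiB aligned.
aligned_start = 0x140000000
misaligned_start = aligned_start + 1 * MIB

print(is_pmd_aligned(dax_view(aligned_start, fs_offset)))
print(is_pmd_aligned(dax_view(misaligned_start, fs_offset)))
```

With the misaligned start, every allocation that is 2 MiB aligned relative to the device lands on an odd 1 MiB boundary in physical memory, which is the case where DAX falls back to 4 KiB faults.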
Line 24: Line 37:
  # cat /proc/iomem
  ...
- 140000000-57fffffff : Persistent Memory
-   140000000-57fffffff : namespace0.0
+ 140000000-57fdfffff : Persistent Memory
+   140000000-57fdfffff : namespace0.0
  
 Our namespace in this case begins at 5 GiB (0x1 4000 0000), which is 2 MiB (0x20 0000) aligned.
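
As a rough way to do this check without reading hex by eye, the sketch below parses one /proc/iomem line and tests the start address. The function name `parse_iomem_range` is hypothetical, and the input line is copied from the listing above rather than read from a live system.

```python
PMD_SIZE = 2 * 1024 * 1024  # 2 MiB (0x200000)
GIB = 1024 ** 3


def parse_iomem_range(line):
    """Split a '<start>-<end> : <name>' line from /proc/iomem
    into integer start and end addresses."""
    span, _, _name = line.partition(":")
    start_hex, end_hex = span.strip().split("-")
    return int(start_hex, 16), int(end_hex, 16)


# Line taken from the /proc/iomem output shown above.
line = "  140000000-57fdfffff : namespace0.0"
start, end = parse_iomem_range(line)

print(hex(start), start // GIB, "GiB")
print("2 MiB aligned:", start % PMD_SIZE == 0)
```

Only the start address matters for this check; the end address is parsed but not used.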
  
-If we create any partitions on top of our PMEM namespace, we must ensure that those partitions are likewise 2 MiB aligned.  By default fdisk will create partitions that are 1 MiB (2048 sector) aligned from the start of the parent block device:
+It is recommended to use raw devices and create multiple namespaces if the system configuration calls for persistent memory to be provisioned into smaller volumes. This is because namespace alignment is enforced at namespace creation time, whereas partitions need to be created by tooling that is careful to align both the start of the namespace and the start of partitions. Long term, pmem device partitioning is scheduled for deprecation in favor of requiring namespaces for all provisioning.
+
+If we do instead create partitions on top of our PMEM namespace, we must ensure that those partitions are likewise 2 MiB aligned.  By default fdisk will create partitions that are 1 MiB (2048 sector) aligned from the start of the parent block device:
  
  # fdisk -l /dev/pmem0
- Disk /dev/pmem0: 16.7 GiB, 17966301184 bytes, 35090432 sectors
+ Disk /dev/pmem0: 16.7 GiB, 17964204032 bytes, 35086336 sectors
  Units: sectors of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 4096 bytes
  I/O size (minimum/optimal): 4096 bytes / 4096 bytes
  Disklabel type: dos
- Disk identifier: 0x5af75158
+ Disk identifier: 0xfd17c8f9
  
  Device       Boot Start      End  Sectors  Size Id Type
- /dev/pmem0p1       2048 35090431 35088384 16.7G 83 Linux
+ /dev/pmem0p1       2048 35086335 35084288 16.7G 83 Linux
  
 A filesystem built on top of this partition won't be able to provide DAX with 2 MiB aligned block allocations.  We instead need to have our partition begin at a 2 MiB aligned boundary:
  
  # fdisk -l /dev/pmem0
- Disk /dev/pmem0: 16.7 GiB, 17966301184 bytes, 35090432 sectors
+ Disk /dev/pmem0: 16.7 GiB, 17964204032 bytes, 35086336 sectors
  Units: sectors of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 4096 bytes
  I/O size (minimum/optimal): 4096 bytes / 4096 bytes
  Disklabel type: dos
- Disk identifier: 0x276da416
+ Disk identifier: 0xfd17c8f9
  
  Device       Boot Start      End  Sectors  Size Id Type
- /dev/pmem0p1       4096 35090431 35086336 16.7G 83 Linux
+ /dev/pmem0p1       4096 35086335 35082240 16.7G 83 Linux
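
The difference between the two partition layouts comes down to the start sector multiplied by the 512-byte sector size. The sketch below shows that calculation; the function names are hypothetical, and it assumes the namespace itself already starts at a 2 MiB aligned physical address, as checked earlier.

```python
SECTOR_SIZE = 512           # logical sector size reported by fdisk above
PMD_SIZE = 2 * 1024 * 1024  # 2 MiB


def partition_start_is_aligned(start_sector, sector_size=SECTOR_SIZE):
    """True if the partition's byte offset within the parent block
    device is a multiple of 2 MiB."""
    return (start_sector * sector_size) % PMD_SIZE == 0


def round_up_start_sector(start_sector, sector_size=SECTOR_SIZE):
    """Smallest start sector >= start_sector that is 2 MiB aligned."""
    sectors_per_pmd = PMD_SIZE // sector_size  # 4096 sectors of 512 bytes
    return -(-start_sector // sectors_per_pmd) * sectors_per_pmd


# Start sectors from the two fdisk listings above.
for start in (2048, 4096):
    print(start, partition_start_is_aligned(start), round_up_start_sector(start))
```

Sector 2048 is 1 MiB into the device and fails the check; sector 4096 is exactly 2 MiB in and passes.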
  
-2. Once we have a block device that starts at a 2 MiB aligned persistent memory address, we then need to create a filesystem on top of it that will give us 2 MiB aligned and sized block allocations.  Here are the commands to do that with either ext4 or XFS:
+3. Once we have a block device that starts at a 2 MiB aligned persistent memory address, we then need to create a filesystem on top of it that will give us 2 MiB aligned and sized block allocations.  Here are the commands to do that with either ext4 or XFS:
  
 ext4:
Line 61: Line 76:
  
 xfs:
- # mkfs.xfs -f -d su=2m,sw=1 /dev/pmem0
- # mount /dev/pmem0 /mnt/dax
+ # mkfs.xfs -f -d su=2m,sw=1 -m reflink=0 /dev/pmem0
+ # mount -o dax,lazytime /dev/pmem0 /mnt/dax
  # xfs_io -c "extsize 2m" /mnt/dax
  
Line 70: Line 85:
 [[https://linux.die.net/man/8/xfs_io|xfs_io(8)]] for more details.
  
-3. Now that we have a filesystem that can give us 2 MiB sized and aligned
+4. Now that we have a filesystem that can give us 2 MiB sized and aligned
 block allocations we just need to create a file that will receive those
 allocations.  To do this we need to begin with a file that is at least 2 MiB
2mib_fs_dax.1513802250.txt.gz · Last modified: 2017/12/20 20:37 by Ross Zwisler