Skip to the content.

minimal_memory_lock

Description

Page faults is a non-deterministic event and should be avoided during real-time execution.

This example shows how to lock the process virtual memory into physical RAM in order to prevent page faults happening during the applications real-time runtime phase.

Requirements

<your username>    -   memlock   <limit in kB>

Adding some memory stress

While running the examples, you can use stress-ng to generate stress the memory and generate page faults. Run the following command in another terminal before running the example:

$ stress-ng --fault 0 --perf -t 1m
# for newer kernels
$ stress-ng --userfaultfd 0 --perf -t 1m

How to run

No memory lock

By default, the memory is not locked, and we can see some page faults while the node is spinning.

$ ros2 run minimal_memory_lock minimal_memory_lock
Process memory before node creation:  13 MB
Process memory is not being locked. Memory page faults may occur.
Process memory before spin:  13 MB
Total page faults before spin [Minor: 789, Major: 0]
[WARN] [1651228546.831772017] [minimal_publisher]: New page faults during spin: [minor: 3, major: 0]
[WARN] [1651228547.331783493] [minimal_publisher]: New page faults during spin: [minor: 3, major: 0]
[INFO] [1651228547.831905095] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228548.331781121] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228548.831736137] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228549.331774937] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228549.831664055] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228550.331663058] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228550.831664222] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228551.331658742] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228551.831660235] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228552.331768024] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[WARN] [1651228552.831653605] [minimal_publisher]: New page faults during spin: [minor: 1, major: 0]
[INFO] [1651228553.331652699] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651228553.831650582] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[WARN] [1651228554.331644228] [minimal_publisher]: New page faults during spin: [minor: 1, major: 0]

Using the --allocate-in-node we can simulate the node allocating some memory on the heap when spinning.

If we allocate 50 MBytes each time we can see page faults generated more frequently:

$ ros2 run minimal_memory_lock minimal_memory_lock --allocate-in-node 50
Process memory before node creation:  13 MB
Process memory is not being locked. Memory page faults may occur.
Process memory before spin:  13 MB
Total page faults before spin [Minor: 790, Major: 0]
[WARN] [1651228794.768297975] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]
[WARN] [1651228795.268509771] [minimal_publisher]: New page faults during spin: [minor: 12803, major: 0]
[WARN] [1651228795.767921477] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]
[WARN] [1651228796.267221048] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]
[WARN] [1651228796.769758758] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]
[WARN] [1651228797.271399242] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]
[WARN] [1651228797.773109372] [minimal_publisher]: New page faults during spin: [minor: 12800, major: 0]

Locking the memory

Using the --lock-memory option we enforce all the process virtual memory to be locked into the physical RAM memory. Now we expect to see no page faults in the spin phase.

$ ros2 run minimal_memory_lock minimal_memory_lock --lock-memory
Process memory before node creation:  13 MB
Process memory before locking:  13 MB
Process memory locked
Process memory after locking:  91 MB
Process memory before spin:  91 MB
Total page faults before spin [Minor: 20439, Major: 0]
[INFO] [1651229005.207436740] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229005.707370395] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229006.207337949] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229006.707390404] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229007.207372475] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229007.707527666] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]

In the applications allocate memory dynamically during runtime we may see some page faults the first time the memory is allocated. After that, because we are using MCL_FUTURE that memory will be locked too.

$ ros2 run minimal_memory_lock minimal_memory_lock --lock-memory --allocate-in-node 50
Process memory before node creation:  13 MB
Sleeping here so all the middleware threads are created
Process memory before locking:  13 MB
Process memory locked
Process memory after locking:  91 MB
Process memory before spin:  91 MB
Total page faults before spin [Minor: 20444, Major: 0]
[WARN] [1651229441.463864983] [minimal_publisher]: New page faults during spin: [minor: 12805, major: 0]
[INFO] [1651229441.895453677] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229442.395400214] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229442.896404487] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229443.395399625] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]

One option is to pre-allocate some memory, so we don’t get page faults the first time the memory is allocated in runtime. Using the option --memory-preallocation we can specify the total amount we want to pre-allocate in for our process.

In this case, we preallocate 150 MB, this is enough to lock all the memory locked on the node construction, and the additional memory that will be allocated in runtime.

$ ros2 run minimal_memory_lock minimal_memory_lock --lock-memory --allocate-in-node 50 --memory-preallocation 150
Process memory before node creation:  13 MB
Sleeping here so all the middleware threads are created
Process memory before locking:  13 MB
Process memory locked
Process memory after locking:  91 MB
Total memory pool to preallocate: 150 MB
Additional memory to reserve: 58
Process memory after memory pool reservation:  151 MB
Process memory before spin:  151 MB
Total page faults before spin [Minor: 35294, Major: 0]
[INFO] [1651229844.746268910] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229845.218119805] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229845.717945891] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229846.218008774] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651229846.717899510] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]

Tuning the thread stack size

In the previous examples we saw how process memory after locking is considerable. This happens because mlockall will lock the memory stack of all the threads created by the process. By default, this is 8 MB per thread in Linux. This could be overkill if we have a high number of threads. One way to overcome this issue is to set stack size of each thread.

For demonstration purposes we change the default stack size of the threads using the --stacksize option.

$ ros2 run minimal_memory_lock minimal_memory_lock --lock-memory --stacksize 200
Process memory before node creation:  13 MB
Sleeping here so all the middleware threads are created
Process memory before locking:  13 MB
Process memory locked
Process memory after locking:  18 MB
Process memory before spin:  18 MB
Total page faults before spin [Minor: 2350, Major: 0]
[INFO] [1651427649.910823256] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427650.396723160] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427650.896730277] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427651.396724911] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427651.896710860] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427652.396718902] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427652.896704369] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427653.396709113] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
[INFO] [1651427653.896919652] [minimal_publisher]: New page faults during spin: [minor: 0, major: 0]
^C[INFO] [1651427654.367910536] [rclcpp]: signal_handler(signal_value=2)
Process memory after spin:  18 MB
Total page faults after spin [Minor: 2383, Major: 0]

We reduced the total memory used by the process. However, this configuration should be done for each thread knowing which stack size is needed. Otherwise, we might see a stack overflow. Some middleware implementations allow to configure the stack size of each thread.

Wait for middleware initialization

When the node is created, the rmw middleware will create some threads. If we start spinning the node too soon, the middleware thread might not be created yet, and we will see some page faults happening during the spin.

You can observe this by using the --no-sleep option. You might have to use some memory stress and run the example several times to observe this issue.

Resources