Linux memory monitoring (allocations vs usage)

How to use some of Linux’s standard tools, and how different types of memory usage show up in them.

Examples of using malloc and writing to memory, with three use cases for a simple process.

In each case we run the example with a 64 MB allocation so that we can see the usage from standard Linux tools.

We do something like this:

gary@linux:~/git/unixfun$ ./malloc_and_write 65536
Allocating 65536 KB
Allocating 67108864 bytes
The address of your memory is 0x7fa2829ff010
Hit <return> to exit
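
The source of malloc_and_write is not shown in this excerpt. As a rough stand-in, the sketch below uses Python with an anonymous private mmap in place of malloc (an assumption, as are all the function names) and reads VmSize and VmRSS from /proc/self/status before the allocation, after it, and after writing to every page. It should only be read as an illustration of the allocation-versus-usage distinction, not as the original program.

```python
import mmap
import os


def parse_status_kb(status_text, field):
    """Return the value in kB of a /proc/<pid>/status field such as VmSize or VmRSS."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    raise KeyError(field)


def read_vm_counters():
    """Read VmSize (address space) and VmRSS (resident memory) for this process."""
    with open("/proc/self/status") as f:
        text = f.read()
    return parse_status_kb(text, "VmSize"), parse_status_kb(text, "VmRSS")


def allocate_and_touch(size_kb):
    """Map size_kb of anonymous memory, then write one byte to every page."""
    size = size_kb * 1024
    report = {"start": read_vm_counters()}
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)  # allocated, not yet touched
    report["allocated"] = read_vm_counters()
    for offset in range(0, size, mmap.PAGESIZE):
        buf[offset] = 1  # first write to each page makes it resident
    report["written"] = read_vm_counters()
    buf.close()
    return report


# Linux only: /proc/self/status does not exist on other systems.
if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    for stage, (vm_size, vm_rss) in allocate_and_touch(65536).items():
        print(f"{stage:>9}: VmSize={vm_size} kB  VmRSS={vm_rss} kB")
```

On a typical Linux system one would expect VmSize to grow at the "allocated" stage and VmRSS to grow only at the "written" stage, though the exact numbers depend on the interpreter and kernel settings.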

Using iperf multi-stream may not work as expected

Running iperf with parallel threads

TL;DR – When running iperf with parallel threads/workers, the -P option must be specified after the -c <target-IP> option. The manpage mentions this, but it is confusing because some options (-t, for instance) work in any order while others (specifically -P for parallel threads) do not.

For example – these two invocations of iperf give very different results:

  • iperf -P 5 -c <target-IP> (-P before -c): yields 20.4 Gbits/sec
  • iperf -c <target-IP> -P 5 (-P after -c): yields 78.3 Gbits/sec
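
One way to avoid getting the order wrong in scripts is to build the command line in one place. The helper below is a minimal sketch in Python; the function name and the target address 192.0.2.10 (a documentation-only address) are assumptions for illustration. It always emits -c <target> before -P.

```python
import shlex


def iperf_client_argv(target, parallel=1, seconds=None):
    """Build an iperf client command line with -c <target> ahead of -P."""
    argv = ["iperf", "-c", target]      # -c comes first
    if seconds is not None:
        argv += ["-t", str(seconds)]    # -t is reported to work in any position
    if parallel > 1:
        argv += ["-P", str(parallel)]   # -P is placed after -c, as the post advises
    return argv


# 192.0.2.10 is a placeholder documentation address, not a real iperf server.
argv = iperf_client_argv("192.0.2.10", parallel=5)
print(shlex.join(argv))
```

With a reachable iperf server, the resulting list could be passed to subprocess.run(argv, check=True).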

mpstat has an option to show utilization per NUMA node

Not sure how long this has been a thing, but I recently discovered that mpstat takes a -N option for “NUMA Node” that works in the same way as -P for “Processor”. For example, $ mpstat -N 0,1 1 will show stats for NUMA nodes 0 and 1 every second. Just as mpstat -P ALL shows all processors, mpstat -N ALL shows all NUMA nodes (and is easier to type).

The output looks like this:

05:09:17 PM NODE    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
05:09:18 PM    0    1.13    0.00    9.30    0.00    0.28    0.15    0.00   31.78    0.00   57.21
05:09:18 PM    1    0.40    0.00    8.03    0.00    0.28    1.05    0.00   31.34    0.00   58.78

Average:    NODE    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:       0    0.80    0.00    8.56    0.00    0.27    0.11    0.00   36.49    0.00   53.78
Average:       1    0.49    0.00   10.02    0.00    0.32    1.01    0.00   25.13    0.00   63.03

With mpstat -N it is quite easy to check how the workload is distributed across the NUMA nodes of a multi-socket machine.
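
To compare nodes in a script rather than by eye, the per-node lines can be parsed into records. The following is a minimal Python sketch; the function name is hypothetical, and it assumes the column layout shown in the output above (a header line containing NODE, followed by one line per node).

```python
SAMPLE = """\
05:09:17 PM NODE %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
05:09:18 PM 0 1.13 0.00 9.30 0.00 0.28 0.15 0.00 31.78 0.00 57.21
05:09:18 PM 1 0.40 0.00 8.03 0.00 0.28 1.05 0.00 31.34 0.00 58.78

Average: NODE %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: 0 0.80 0.00 8.56 0.00 0.27 0.11 0.00 36.49 0.00 53.78
Average: 1 0.49 0.00 10.02 0.00 0.32 1.01 0.00 25.13 0.00 63.03
"""


def parse_mpstat_nodes(text):
    """Turn 'mpstat -N' output into a list of per-node dicts."""
    rows = []
    node_col = None
    columns = []
    for line in text.splitlines():
        tokens = line.split()
        if "NODE" in tokens:
            # Header line: remember where the NODE column sits and the metric names.
            node_col = tokens.index("NODE")
            columns = tokens[node_col + 1:]
            continue
        if node_col is None or len(tokens) != node_col + 1 + len(columns):
            continue  # blank line, banner, or anything that does not match the header
        row = {"node": tokens[node_col], "average": tokens[0] == "Average:"}
        for name, value in zip(columns, tokens[node_col + 1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


for row in parse_mpstat_nodes(SAMPLE):
    label = "average" if row["average"] else "sample"
    print(f"node {row['node']} ({label}): {100.0 - row['%idle']:.2f}% busy")
```

In practice the text would come from running mpstat itself, for example via subprocess.run(["mpstat", "-N", "ALL", "1", "1"], capture_output=True, text=True).stdout.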