In Linux, the fork system call is said to use a copy-on-write (COW) mechanism, which avoids actually duplicating memory unless it is really needed.
However, I've never seen clear evidence of COW in action, so I thought I'd build my own.
For this purpose I run a program that computes the square of a matrix with 40,000 rows and columns, each entry being a double (8 bytes). The program forks into 8 processes to do the squaring. The entries are populated at random. Both the initial matrix and the result are allocated with the mmap system call as rw and shared.
I was hoping to see COW in action by using top to track memory usage. What I get is the view below. At best this is an example "by contradiction": fork cannot possibly be cloning and duplicating the memory of the initial matrix at the time of the call, as the system does not have 95GiB of RAM, only 16GiB.
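To make the numbers in this argument explicit, here is a small sketch of the arithmetic. It assumes only the figures quoted in this question (a 40,000 x 40,000 matrix of 8-byte doubles and 8 processes); the variable names are just for illustration.

```shell
#!/bin/sh
# Size of one n x n matrix of 8-byte doubles (needs 64-bit shell arithmetic).
n=40000
procs=8
bytes=$((n * n * 8))
kib=$((bytes / 1024))

# GiB for one copy, and for the hypothetical case where every process
# really held its own private copy of the matrix.
gib=$(awk -v b="$bytes" 'BEGIN { printf "%.2f", b / (1024 * 1024 * 1024) }')
total_gib=$(awk -v b="$bytes" -v p="$procs" 'BEGIN { printf "%.2f", p * b / (1024 * 1024 * 1024) }')

echo "one matrix: $bytes bytes = $kib KiB = $gib GiB"
echo "$procs private copies would need: $total_gib GiB"
```

One matrix comes out at 12500000 KiB (about 11.92 GiB), which matches the per-process figure top reports, and 8 private copies would need about 95.37 GiB, far more than the 16GiB installed.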
(Incidentally, this shows that it doesn't make much sense to add up the values for different processes in any of the memory columns of top. I wasn't aware of this, which I'm afraid partly explains my failure to see clearly how the copy-on-write is taking place.)
This doesn't seem very compelling evidence of COW in action. From an educational point of view it is a weak example, as all processes "show" 11.9GiB in use. Instead of a clear demonstration, you need to argue along the lines of "ignore this, ignore that other value, then imagine... and you get where we wanted". Better would be some tool showing the actual portion of independent physical memory each process is using.
Here independent means: if you add up the memory usage shown for each process, you get the actual total physical memory the whole program is using.
Since top is the wrong tool for that, what could one use instead?
EDIT: For example, would any of these tools help? If so, how would one use them to get that info? valgrind, smem, /proc/pid/smaps...
EDIT2: Digging further, as hinted here, the best option I've found so far is smaps (or maybe pmap), as in
( echo "Parent:" ; pid=266694 ; grep -A9 "/dev/zero" /proc/$pid/smaps ; echo " " ; for ch in $(cat /proc/$pid/task/$pid/children) ; do echo -e "\n###\nChild: $ch" ; grep -A9 "/dev/zero" /proc/$ch/smaps ; done ) > out
where 266694 is the parent process's ID. The potentially useful part of that output seems to be the following:
Parent:
7fa5d9963000-7fa8d486b000 rw-s 00000000 00:01 18271562 /dev/zero (deleted)
Size:           12500000 kB  
KernelPageSize:        4 kB
MMUPageSize:           4 kB  
Rss:               35512 kB  
Pss:               35512 kB  
Shared_Clean:          0 kB  
Shared_Dirty:          0 kB 
Private_Clean:         0 kB  
Private_Dirty:     35512 kB
--
7fa8d486b000-7fabcf773000 rw-s 00000000 00:01 18271561 /dev/zero (deleted)
Size:           12500000 kB  
KernelPageSize:        4 kB  
MMUPageSize:           4 kB  
Rss:            12500000 kB  
Pss:             1562500 kB  
Shared_Clean:          0 kB  
Shared_Dirty:   12500000 kB 
Private_Clean:         0 kB  
Private_Dirty:         0 kB
    
###
Child: 266706
7fa5d9963000-7fa8d486b000 rw-s 00000000 00:01 18271562 /dev/zero (deleted)
Size:           12500000 kB  
KernelPageSize:        4 kB  
MMUPageSize:           4 kB 
Rss:               35872 kB  
Pss:               35872 kB  
Shared_Clean:          0 kB  
Shared_Dirty:          0 kB  
Private_Clean:         0 kB  
Private_Dirty:     35872 kB  
--
7fa8d486b000-7fabcf773000 rw-s 00000000 00:01 18271561 /dev/zero (deleted)
Size:           12500000 kB  
KernelPageSize:        4 kB  
MMUPageSize:           4 kB  
Rss:            12500000 kB  
Pss:             1562500 kB  
Shared_Clean:          0 kB  
Shared_Dirty:   12500000 kB 
Private_Clean:         0 kB  
Private_Dirty:         0 kB
with the rest of the children showing roughly the same values. I'm not sure yet how to interpret this correctly.
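One possible reading of the listing above, offered as a sketch rather than a definitive answer: Pss (proportional set size) charges each process only its share of every resident page, so a page mapped by 8 processes counts 1/8 toward each of them. That is consistent with the second mapping, where 12500000 kB / 8 = 1562500 kB. Under that reading, summing Pss over the parent and all children should approximate the physical memory the mappings really occupy, which is the "independent" figure asked for. The helper below is hypothetical (the name sum_field is mine, not a standard tool) and only adds up one smaps field over the mappings whose header line contains a given string.

```shell
#!/bin/sh
# sum_field FIELD PATTERN FILE...
# Print the sum, in kB, of the smaps FIELD (e.g. Pss, Private_Dirty) over
# every mapping whose header line contains the literal string PATTERN,
# across all smaps-format FILEs given.
sum_field() {
    field=$1
    pattern=$2
    shift 2
    awk -v field="$field:" -v pat="$pattern" '
        # A mapping header starts with an address range,
        # e.g. 7fa5d9963000-7fa8d486b000 rw-s ...
        /^[0-9a-f]+-[0-9a-f]+ / { wanted = (index($0, pat) > 0); next }
        # Inside a wanted mapping, accumulate the requested field.
        wanted && $1 == field   { total += $2 }
        END { print total + 0 }
    ' "$@"
}

# Example use (assumes the parent PID from this question and a kernel that
# exposes /proc/PID/task/PID/children):
#   pid=266694
#   files="/proc/$pid/smaps"
#   for ch in $(cat /proc/$pid/task/$pid/children); do
#       files="$files /proc/$ch/smaps"
#   done
#   sum_field Pss /dev/zero $files
```

Running the same call with Private_Dirty instead of Pss would show how much of those mappings each process holds exclusively.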
However, I'm afraid my initial question has more to do with (*nix) memory management and the tools available for inspecting it; top and ps are clearly useless here (see (ext)). That is a completely different topic, about which I know only an eps>0. Maybe I should change the title or remove the question altogether...

