Monday, October 22, 2012

Remove Hundreds of Thousands of Files, take 3

After blogging about using find to remove files, I realised that the trace contained thousands of failed execve system calls. Reproducing the scenario with just 2 files showed that the PATH search is the culprit: 'rm' is tried in each PATH directory in turn until one of the calls succeeds.
[pid 29481] execve("/usr/lib/lightdm/lightdm/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/local/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/local/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = 0
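
The lookup seen in the trace can be imitated in a few lines of shell. This is only a sketch: find_in_path is a hypothetical helper name, and the real search is presumably done by the C library's execvp on behalf of 'find', not by a script. It walks PATH from left to right and prints the first directory that holds an executable with the given name.

```shell
#!/bin/sh
# find_in_path NAME: print the first PATH entry that contains an executable NAME.
# Hypothetical helper for illustration; it only mimics the order of the lookup.
# The body runs in a subshell, so the IFS and globbing changes stay local.
find_in_path() (
    name=$1
    IFS=:
    set -f                      # do not expand wildcards while splitting PATH
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.  # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1                      # not found in any PATH directory
)

# Show where 'rm' would be picked up on this machine.
find_in_path rm
```

Every directory listed before the one that is printed corresponds to one failed execve in the trace, repeated for every file that 'find' hands to 'rm'.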

The number of system calls (failed ones especially) drops if I give 'find' the full path to 'rm'.

$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf find . -name "somefiles-*" -exec /bin/rm -f {} \;
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.59   36.418146        2072     17576           waitpid
  2.50    0.940861          54     17576           clone
  0.72    0.271665          15     17576           unlinkat
  0.07    0.025220           0    105467           close
  0.03    0.012440         541        23           getdents64
  0.03    0.011857           1     17577           fstatat64
  0.02    0.009425           1     17576     17576 _llseek
  0.01    0.005058           0     52737           open
  0.01    0.004589           0    140626           mmap2
  0.01    0.003785           0     17577           ioctl
  0.00    0.001053           0     52757           brk
  0.00    0.001028           0     52735           fstat64
  0.00    0.000388           0     17577           munmap
  0.00    0.000245           0     70311           mprotect
  0.00    0.000000           0     17580           read
  0.00    0.000000           0     17577           execve
  0.00    0.000000           0     52734     52734 access
  0.00    0.000000           0         1           gettimeofday
  0.00    0.000000           0         2           uname
  0.00    0.000000           0     17581           fchdir
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0     17577           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0     17577           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   37.705760                738330     70311 total

Another way to further reduce both the number of system calls and the run time is to take advantage of the fact that 'rm' accepts more than one file as argument. By piping the output of 'find' to 'xargs -L 10 /bin/rm -f', we ask 'rm' to remove 10 files at a time. The totals reported by strace drop from 738330 system calls in 37.7 seconds to 106026 system calls in 10.0 seconds.
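
As a rough check on why batching helps, the number of 'rm' processes can be estimated with shell arithmetic. The variable names below are made up for this sketch; the file count comes from the three {a..z} expansions in the touch command, and the batch size from 'xargs -L 10'.

```shell
#!/bin/sh
# Estimate how many times 'rm' is started with and without batching.
files=$((26 * 26 * 26))                     # somefiles-{a..z}{a..z}{a..z}
batch=10                                    # lines per command with xargs -L 10
rm_runs=$(( (files + batch - 1) / batch ))  # round up: the last batch may be short

echo "files to remove:        $files"
echo "rm runs, one per file:  $files"
echo "rm runs, $batch per batch:  $rm_runs"
```

The estimate is close to the clone and execve counts in the strace summaries. The few extra calls would be consistent with the script's shell, 'find' and 'xargs' themselves, although the summary does not break that down.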

$ cat rm.sh
#! /bin/bash

find . -name "somefiles-*" | xargs -L 10 /bin/rm -f


$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf ./rm.sh
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.63    9.880623        5611      1761         1 waitpid
  0.98    0.097799           6     17576           unlinkat
  0.18    0.018443          10      1760           clone
  0.12    0.011728         510        23           getdents64
  0.08    0.008373           0     17577           fstatat64
  0.00    0.000331           0      7058         4 open
  0.00    0.000312           0      1842           read
  0.00    0.000289           0     14108           mmap2
  0.00    0.000041           0      1761      1760 ioctl
  0.00    0.000039           0     12342         2 close
  0.00    0.000000           0        69           write
  0.00    0.000000           0      1761           execve
  0.00    0.000000           0         1           time
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0      5296      5288 access
  0.00    0.000000           0         1           pipe
  0.00    0.000000           0      5323           brk
  0.00    0.000000           0         3           dup2
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0      1765           munmap
  0.00    0.000000           0         1           sigreturn
  0.00    0.000000           0         3           uname
  0.00    0.000000           0      7049           mprotect
  0.00    0.000000           0         5           fchdir
  0.00    0.000000           0      1761           _llseek
  0.00    0.000000           0        25           rt_sigaction
  0.00    0.000000           0        21           rt_sigprocmask
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        24         8 stat64
  0.00    0.000000           0      5295           fstat64
  0.00    0.000000           0         9           getuid32
  0.00    0.000000           0         9           getgid32
  0.00    0.000000           0         9           geteuid32
  0.00    0.000000           0         9           getegid32
  0.00    0.000000           0         3         1 fcntl64
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0      1761           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   10.017978                106026      7065 total
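
The same batching idea can be written in two other ways that also cope with file names containing spaces, which a plain 'find | xargs' pipeline does not. This is a sketch rather than something I have traced: it was not part of the measurements above, it assumes 'rm' lives in /bin as on the machine used here, and it runs in a scratch directory created with mktemp. '-exec ... {} +' is standard 'find' syntax, while '-print0' and 'xargs -0' assume GNU or BSD tools.

```shell
#!/bin/sh
# Work in a scratch directory so nothing outside it is touched.
origdir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Variant 1: let find collect many names into each rm invocation.
touch somefiles-a somefiles-b "somefiles-with space"
find . -name "somefiles-*" -exec /bin/rm -f {} +
left_after_plus=$(find . -name "somefiles-*" | wc -l | tr -d ' ')

# Variant 2: NUL-separated names, at most 10 per rm invocation.
touch somefiles-a somefiles-b "somefiles-with space"
find . -name "somefiles-*" -print0 | xargs -0 -n 10 /bin/rm -f
left_after_xargs=$(find . -name "somefiles-*" | wc -l | tr -d ' ')

echo "files left after -exec +:   $left_after_plus"
echo "files left after xargs -0:  $left_after_xargs"

# Clean up the scratch directory.
cd "$origdir" && rm -rf "$workdir"
```

With '-exec ... {} +' there is no pipe and no separate 'xargs' process, and 'find' decides how many names fit into each command line. The 'xargs -0 -n 10' form keeps the batch size of 10 explicit.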
