Saturday, July 31, 2010

Bryan Cantrill is with Joyent

Joyent has announced Bryan Cantrill as VP of engineering.

Bryan talked about why he joined Joyent.

Thursday, July 29, 2010

Filenames With Spaces

Filenames with spaces may be okay in a Windows environment, but not in UNIX. Recently I encountered a lot of partial backups due to filenames ending with space(s). You will be surprised at the kind of filenames users create or transfer from Windows to UNIX; I blogged about it a year ago.

To split the backup into streams of roughly equal size, my script uses find /some/dir -mount -type f -ls to generate the file list. The output is then piped to a while loop to sum up the file sizes. However, the shell will collapse runs of white space into a single space if you do not quote the variable. And although quoting the variable preserves embedded white space, it still cannot handle a filename that ends with spaces, because read strips trailing IFS white space from the last field.

By appending padding (I used "==" in my test case below) to the end of each line, I am able to preserve the original filename. Simply truncating the padding before writing to the file list resolves all these white space issues.

BTW, my environment is running Solaris.

$ touch 'a' 'b' 'c' ' start1space' '  start2space' '   start3space' \
        'mid 1space' 'mid  2space' 'mid   3space' \
        'end1space ' 'end2space  ' 'end3space   '

$ find . -type f -ls | while read x x x x x x x x x x filename
do
  echo --$filename--
done                               
--./a--
--./b--
--./c--
--./ start1space--
--./ start2space--
--./ start3space--
--./mid 1space--
--./mid 2space--
--./mid 3space--
--./end1space--
--./end2space--
--./end3space--

$ find . -type f -ls | while read x x x x x x x x x x filename
do
  echo "--$filename--"
done                             
--./a--
--./b--
--./c--
--./ start1space--
--./  start2space--
--./   start3space--
--./mid 1space--
--./mid  2space--
--./mid   3space--
--./end1space--
--./end2space--
--./end3space--

$ find . -type f -ls | sed 's/$/==/' | while read x x x x x x x x x x filename
do
  echo "--${filename%==}--"
done
--./a--
--./b--
--./c--
--./ start1space--
--./  start2space--
--./   start3space--
--./mid 1space--
--./mid  2space--
--./mid   3space--
--./end1space --
--./end2space  --
--./end3space   --
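
Putting the pieces together, the splitting idea described above can be sketched roughly like this. This is a sketch only, not the actual backup script: the scratch directory, the file sizes, NSTREAMS, and the list.N file names are all my own illustration. Size is field 7 of the "find -ls" output, and the "==" padding preserves trailing spaces exactly as in the listings above.

```shell
#!/bin/sh
# Demo data (illustration only): files of known sizes, one with a
# name that ends in two spaces.
dir=$(mktemp -d)
printf '%10000s' ' ' > "$dir/big"
printf '%100s'   ' ' > "$dir/small"
printf '%5000s'  ' ' > "$dir/trailing  "   # name ends with 2 spaces

NSTREAMS=2

# Pass 1: total bytes of all plain files (size is field 7 of -ls).
total=$(find "$dir" -type f -ls | awk '{t += $7} END {print t + 0}')
target=$(( total / NSTREAMS + 1 ))

# Pass 2: pad each line, strip the padding after the read, and start
# the next list once the running total reaches the per-stream target.
find "$dir" -type f -ls | sed 's/$/==/' |
{
    stream=0 running=0
    : > list.0
    while read x x x x x x sz x x x filename; do
        printf '%s\n' "${filename%==}" >> "list.$stream"
        running=$(( running + sz ))
        if [ "$running" -ge "$target" ] && [ "$stream" -lt $(( NSTREAMS - 1 )) ]
        then
            stream=$(( stream + 1 )); running=0
            : > "list.$stream"
        fi
    done
}

wc -l list.0 list.1    # how the 3 files were distributed
```

Each list.N file can then be fed to its own backup stream; the trailing-space names survive intact because the padding is only stripped after read has done its splitting.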


Monday, July 26, 2010

Sun Set. That's Sad

Following the departure of James Gosling (father of Java) in April this year, Bryan Cantrill (father of DTrace) has announced his departure in his blog. It is sad to see so many good people leaving Sun. It keeps me wondering what Solaris will become in the future. Sad!


Saturday, July 24, 2010

Too Much Money is Worse than Too Little

I just read Guy Kawasaki's recent article - Why Too Much Money is Worse than Too Little - and I totally agree with all his points.

So next time when you say "Money is not a problem", think again.

Friday, July 23, 2010

The surprising truth about what motivates us

Found this video from this site.

Want to find more videos? Click here and Cognitive Media. (I just found out that they did a visual presentation for Arup, my ex-company, Ove Arup, where I started my first IT job as an Engineering Analyst developing civil/structural engineering software. That was 1989, 21 years ago.)

As you can see, visual presentation is one damn powerful tool. See my practice on the back of a napkin. FYI, I just finished reading the "2nd version" of the napkin book - Unfolding the Napkin. Both books are available in our public library.


Friday, July 16, 2010

One Million Files in A Directory Under Veritas File System

Recently my backup ran really, really slowly, with a throughput of 50KBps. With such low throughput, I was not able to finish it within the backup window. After some investigation, I realised that there is a directory containing close to 1 million files. It literally takes minutes to do an "ls -l".

Although the file system allows you to have that many files, it is extremely inefficient for other downstream activities.

It would be a challenge to locate the problematic directory using a shell script; other scripting languages such as Perl are more appropriate. With the find2perl utility that comes with a standard Perl installation, you can generate the equivalent of a find command in Perl:

$ find2perl /usr/include -type d
#! /usr/bin/perl -w
    eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
        if 0; #$running_under_some_shell

use strict;
use File::Find ();

# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.

# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name   = *File::Find::name;
*dir    = *File::Find::dir;
*prune  = *File::Find::prune;

sub wanted;



# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/usr/include');
exit;


sub wanted {
    my ($dev,$ino,$mode,$nlink,$uid,$gid);

    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
    -d _
    && print("$name\n");
}
With this skeleton code, I was able to modify it to locate the sub-directory with the most entries:
#! /usr/bin/perl


use strict;
use File::Find ();
use Cwd 'abs_path';


# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.

# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name   = *File::Find::name;
*dir    = *File::Find::dir;
*prune  = *File::Find::prune;


my $max = 0;
my $maxpath;


sub wanted;


my $searchdir;
if ( $#ARGV == -1 ) {
        $searchdir=".";
} else {
        $searchdir=$ARGV[0];
}



# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, $searchdir);
print $maxpath, ' ', $max, "\n";
exit;


sub wanted {
        my ($dev,$ino,$mode,$nlink,$uid,$gid);

        my $file;
        my $count=0;
        if ( (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) && -d _ )
{
                opendir(DIR, $_);
                while (defined($file=readdir(DIR))) {
                        ++$count;
                }
                closedir(DIR);
                if ( $count > $max ) {
                        $max = $count;
                        $maxpath = abs_path($_);
                }
        }
}
If I run it on the /usr/share directory of my Ubuntu 10.04 Netbook Edition, it tells me /usr/share/foomatic/db/source/printer has 3258 entries (including . and ..):
$ ./files-in-directory-max.pl /usr/share
/usr/share/foomatic/db/source/printer 3258

So, what is the maximum number of files you should keep in a directory under VxFS? I found this using Google: the recommended maximum number of files in a single Veritas File System (VxFS) directory is 100,000.
