Chi Hung Chan: June 2008

Monday, June 30, 2008

Range Pattern in AWK

I came across this post regarding removing #ifdef code from C program. Although the thread is closed for no obvious reason, it is pretty interesting from the shell script view point. The original post is like this:

Removing lines of code defined under flag
I want to write a shell script that will remove lines of C code that is defined under a certain flag, for eg, "#ifdef PRODUCT" in all the C files in a directory. Please help me somebody..I'm clueless!!

In C compiler, you should be able to run through the pre-processor (-E flag) to ignore the "#ifdef PROD" code by defining all the other conditional compilation directives with the "-D" flag. However, the preprocessing also include the content of the header files. This is definitely not the solution we are looking for.

In AWK, you can specify a range pattern and apply an action statement. Suppose we have this simple.c C program and our job is to remove the #ifdef PROD block of code.

#include <stdio.h>

int main(void) {

        #ifdef PROD
        printf("Prod is here\n");
        #endif

        #ifdef DEV
        printf("Dev is here\n");
        #endif

        #ifdef PROD
        printf("Prod is there\n");
        #endif
}

Below AWK code skips to the next input record when it is within the range pattern. The start range pattern is any whitespace followed by #ifdef PROD and the end range pattern is any whitespace followed by #endif. Otherwise, it just simply print it out.

$ awk '
/^[ \t]*#ifdef PROD/,/^[ \t]*#endif/ {
next
}
{
print
}' sample.c
#include <stdio.h>

int main(void) {


        #ifdef DEV
        printf("Dev is here\n");
        #endif

}

Range pattern in AWK is particularly useful and very often people deliberately plant their own start and end patterns in their data stream for future processing. May be you can start to think along this line.

Labels: awk

Thursday, June 26, 2008

I Knew What You Did In Your Sessions

This blog title sounds like the 1997 movie I Knew What You Did Last Summer. Yes indeed, but in Solaris operating system, not in cinema. By turning on the auditing feature in Solaris, you can really tell what the user did in all his/her sessions. However the default setting in audit_control does not capture the execve system call. In order to find out the command executed together with the argument supplied to it, you need to include the "ex" in the flags: in /etc/security/audit_control and include "/usr/sbin/auditconfig -setpolicy +argv" in /etc/security/audit_startup This is what I have in my demo server:

# egrep -v '^#' /etc/security/audit_control
dir:/var/audit
flags:lo,fc,fm,fd,ex,xp,xc,xs,xx,fa
minfree:20
naflags:lo

# egrep -v '^#' /etc/security/audit_startup
/usr/bin/echo "Starting BSM services."
/usr/sbin/auditconfig -setpolicy +cnt
/usr/sbin/auditconfig -setpolicy +arg
/usr/sbin/auditconfig -conf
/usr/sbin/auditconfig -aconf

Once that has been configured properly, you can start the auditing by rebooting the server.

# cd /etc/security

# ./bsmconv
This script is used to enable the Basic Security Module (BSM).
Shall we continue with the conversion now? [y/n] y
bsmconv : INFO : checking startup file .
bsmconv : INFO : turning on audit module .
bsmconv : INFO : initializing device allocation .
The Basic Security Module is ready .
If there were any errors , please fix them now .
Configure BSM by editing files located in /etc/ security .
Reboot this system now to come up with BSM enabled .

# reboot


----- wait for the system to boot up -----



# svcs | grep "auditd"
online 23:30:03 svc:/system/auditd:default

The default location for the audit trail is in /var/audit. It is best practice to ensure you have sufficient space in this directory and /var (or /var/audit) is a separate partition because the audit trail file grows very fast in a production system. The audit trail files are stored in binary format and the data structure is described in audit.log. Two built-in utilities are provided in Solaris to streamline the audit reporting, they are auditreduce and praudit. The praudit default output is pretty hard to understand and comma separated variable output is not in fix-field format

# auditreduce | praudit | head
file,2008-06-23 10:56:27.000 +08:00,
header,44,2,system booted,na,2008-06-23 10:56:27.330 +08:00
text,booting kernel
header,145,2,open(2) - read,,solaris11,2008-06-23 11:02:51.437 +08:00
path,/devices/pseudo/pool@0:pool
attribute,20666,root,sys,336,118489094,970662608897
subject,chihung,root,root,root,root,674,2971128068,787 65558 10.1.2.84
return,success,7
header,120,2,open(2) - read,write,sp,solaris11,2008-06-23 11:02:51.467 +08:00
path,/var/adm/lastlog

However, if you read the man page of praudit, you will realise that it can dump the audit trail in XML. Here is a sample dump:

# auditreduce | praudit -x | head -50
<?xml version='1.0' encoding='UTF-8' ?>
<?xml-stylesheet type='text/xsl' href='file:///usr/share/lib/xml/style/adt_record.xsl.1' ?>

<!DOCTYPE audit PUBLIC '-//Sun Microsystems, Inc.//DTD Audit V1//EN' 'file:///usr/share/lib/xml/dtd/adt_record.dtd.1'>

<audit>
<file iso8601="2008-06-24 13:14:44.000 +08:00"></file>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.536 +08:00">
<argument arg-num="2" value="0x4" desc="cmd"/>
<argument arg-num="1" value="0x6" desc="no path: fd"/>
<attribute mode="10000" uid="root" gid="root" fsid="344" nodeid="323" device="0"/>
<subject audit-uid="chihung" uid="root" gid="root" ruid="root" rgid="root" pid="704" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.536 +08:00">
<argument arg-num="2" value="0x4" desc="cmd"/>
<argument arg-num="1" value="0x7" desc="no path: fd"/>
<attribute mode="10000" uid="root" gid="root" fsid="344" nodeid="323" device="0"/>
<subject audit-uid="chihung" uid="root" gid="root" ruid="root" rgid="root" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.542 +08:00">
<argument arg-num="2" value="0x3" desc="cmd"/>
<argument arg-num="1" value="0x5" desc="no path: fd"/>
<attribute mode="10000" uid="chihung" gid="other" fsid="344" nodeid="324" device="0"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="2"/>
</record>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.542 +08:00">
<argument arg-num="2" value="0x4" desc="cmd"/>
<argument arg-num="1" value="0x5" desc="no path: fd"/>
<attribute mode="10000" uid="chihung" gid="other" fsid="344" nodeid="324" device="0"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.542 +08:00">
<argument arg-num="2" value="0x3" desc="cmd"/>
<argument arg-num="1" value="0x6" desc="no path: fd"/>
<attribute mode="10000" uid="chihung" gid="other" fsid="344" nodeid="324" device="0"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="2"/>
</record>
<record version="2" event="fcntl(2)" host="solaris11" iso8601="2008-06-24 13:14:44.542 +08:00">
<argument arg-num="2" value="0x4" desc="cmd"/>
<argument arg-num="1" value="0x6" desc="no path: fd"/>
<attribute mode="10000" uid="chihung" gid="other" fsid="344" nodeid="324" device="0"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>
<record version="2" event="access(2)" host="solaris11" iso8601="2008-06-24 13:14:44.547 +08:00">
<path>/dev/pts/1</path>
<attribute mode="20620" uid="chihung" gid="tty" fsid="337" nodeid="5942560" device="103079215105"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>
<record version="2" event="stat(2)" host="solaris11" iso8601="2008-06-24 13:14:44.548 +08:00">
<path>/dev/pts/1</path>
<attribute mode="20620" uid="chihung" gid="tty" fsid="337" nodeid="5942560" device="103079215105"/>
<subject audit-uid="chihung" uid="chihung" gid="other" ruid="chihung" rgid="other" pid="707" sid="1932086420" tid="639 65558 10.1.2.84"/>
<return errval="success" retval="0"/>
</record>

"sid" is the session ID and it is not difficult to find out how many records for all the sessions using Tcl (with tDOM extension). Once we locate the session ID, we can dump out all the execve commands and there corresponding arguments. auditreduce is able merge and select audit records from audit trail files, -a and -b flags to dump audit records between two timestamps. See the man page for all other options to reduce the audit trail output before it is piped to praudit.

$ ./sid.tcl
Usage: ./sid.tcl <xmlfile> <user>

$ ./sid.tcl sample.xml chihung
3331057518=19
2156193989=19
2283295222=19
740162063=177
3244834577=22
1932086420=387

$ ./sid-exec.tcl
Usage: ./sid-exec.tcl <xmlfile> <sid>

$ ./sid-exec.tcl sample.xml 740162063
/usr/bin/bash  ; -bash  ; success
/usr/lib/fs/ufs/quota  ; /usr/sbin/quota  ; success
/usr/bin/cat  ; /bin/cat -s /etc/motd  ; success
/usr/bin/mail  ; /bin/mail -E  ; success
/usr/bin/hostname  ; hostname  ; success
/usr/bin/more  ; more audit_control  ; success
/usr/bin/cat  ; cat audit_startup  ; success
/usr/bin/man  ; man auditconfig  ; success
/sbin/sh  ; sh -c cd /usr/man; tbl /usr/man/man1m/auditconfig.1m |neqn /usr/share/lib/pub/eqnchar - |nroff -u0 -Tlp -man - | col -x > /tmp/mpTkaODb  ; success
/usr/bin/col  ; col -x  ; success
/usr/bin/tbl  ; tbl /usr/man/man1m/auditconfig.1m  ; success
/usr/bin/neqn  ; neqn /usr/share/lib/pub/eqnchar -  ; success
/usr/bin/nroff  ; nroff -u0 -Tlp -man -  ; success
/sbin/sh  ; sh -c trap '' 1 15; /usr/bin/mv -f /tmp/mpTkaODb /usr/man/cat1m/auditconfig.1m 2> /dev/null  ; success
/usr/bin/mv  ; /usr/bin/mv -f /tmp/mpTkaODb /usr/man/cat1m/auditconfig.1m  ; success
/sbin/sh  ; sh -c more -s /tmp/mpTkaODb  ; success
/usr/bin/more  ; more -s /tmp/mpTkaODb  ; success
/usr/bin/ls  ; ls  ; success
/usr/sbin/auditreduce  ; auditreduce  ; success
/usr/sbin/praudit  ; praudit  ; success
/usr/bin/su  ; su -  ; success
/sbin/sh  ; -sh  ; success
/usr/lib/fs/ufs/quota  ; /usr/sbin/quota  ; success
/usr/bin/cat  ; /bin/cat -s /etc/motd  ; success
/usr/bin/mail  ; /bin/mail -E  ; success
/usr/bin/hostname  ; hostname  ; success
/usr/sbin/auditreduce  ; auditreduce  ; success
/usr/bin/prstat  ; prstat -x  ; success
/usr/bin/sparcv9/prstat  ; prstat -x  ; success
/usr/sbin/auditreduce  ; auditreduce  ; success
/usr/sbin/praudit  ; praudit -x  ; success
/usr/sbin/auditreduce  ; auditreduce -a 20080624130000  ; success
/usr/sbin/praudit  ; praudit -x  ; success
/usr/bin/vi  ; vi /var/tmp/x  ; success
/usr/bin/grep  ; grep 2283295222 /var/tmp/x  ; success
/usr/sbin/auditreduce  ; auditreduce -a 20080624130000  ; success
/usr/sbin/praudit  ; praudit -x  ; success

Here is the sid.tcl and sid-exec.tcl

$ cat sid.tcl
#! /usr/local/bin/tclsh


if { $argc != 2 } {
        puts stderr "Usage: $argv0 <xmlfile> <user>"
        exit 1
}
set xmlfile [lindex $argv 0]
set uid [lindex $argv 1]


package require tdom


set nrec 0
set doc [dom parse [tDOM::xmlReadFile [lindex $argv 0]]]
set root [$doc documentElement]
foreach r [$root selectNodes "//subject\[@uid='$uid']"] {
        set sid [$r getAttribute sid]
        if { [info exist arraySid($sid)] } {
                incr arraySid($sid)
        } else {
                set arraySid($sid) 1
        }
}
foreach { n v } [array get arraySid] {
        puts "$n=$v"
}

$ cat sid-exec.tcl
#! /usr/local/bin/tclsh


if { $argc != 2 } {
        puts stderr "Usage: $argv0 <xmlfile> <sid>"
        exit 1
}
set xmlfile [lindex $argv 0]
set sid [lindex $argv 1]


package require tdom


set nrec 0
set doc [dom parse [tDOM::xmlReadFile [lindex $argv 0]]]
set root [$doc documentElement]
foreach r [$root selectNodes "//subject\[@sid='$sid']/.."] {
        set event [$r getAttribute event]
        if { ![string equal $event {execve(2)}] } { continue }
        foreach pNode [$r selectNodes {path/text()}] {
                puts -nonewline "[string trim [$pNode nodeValue]] "
        }
        puts -nonewline " ; "
        foreach argNode [$r selectNodes {exec_args/arg/text()}] {
                puts -nonewline "[string trim [$argNode nodeValue]] "
        }
        puts -nonewline " ; "
        puts "[[$r selectNodes {return}] getAttribute errval]"
}

Labels: security, Solaris, Tcl, XML, xpath

Wednesday, June 25, 2008

Optimising My First Python Program

After My First Python Program, I realised that Python provides iterable object built-in function that is optimised in execution speed, memory utilisation and coding effort.

With iterable object, you can get the object to work directly with the "for" loop that has been optimised in C code. As for the "while" loop, it is actually done in Python byte code through the Python virtual machine and generally is slower than the "for" loop. In general, one should optimise code as this:

change "while" loop to "for" loop
iterate every item in the Object using the iterable nature of the object
- file object: for line in open("access.log","r"):
- Dictionary object: for key in dictObj:
  instead of
  for key in dictObj.keys():
- String object: for char in strObj:
- List object: for item in listObj:

My previous not optimised code:

file=open(sys.argv[1],'r')
line=file.readline()
while line:
        fields=line.split()
        times=fields[1].split(':')
        hour=times[0]
        sc[hour] += int(fields[18])
        cs[hour] += int(fields[19])
        cnt[hour] += 1
        line=file.readline()
file.close()

Optimised code:

for line in open(sys.argv[1],'r'):
        fields=line.split()
        times=fields[1].split(':')
        hour=times[0]
        sc[hour] += int(fields[18])
        cs[hour] += int(fields[19])
        cnt[hour] += 1

As mentioned in Learning Python, 3rd Edition ,

The concept of “iterable objects” is relatively new in Python. It’s essentially a generalization of the notion of sequences—an object is considered iterable if it is either a physically stored sequence, or an object that produces one result at a time in the context an iteration tool like a for loop

Labels: performance, python

The usefulness of a cup is in its emptiness

A learned man once went to a Zen master to inquire
about Zen. As the Zen master talked, the learned man
would frequently interrupted him with remarks like,
"Oh yes, we have that too....", etc.
Finally the Zen master stopped talking and began to
serve tea to the learned man; however, he kept on pour-
ing until the tea cup over flowed.
"Enough, no more can go into the cup!" the learned man
interrupted.
"Indeed I see," answered the Zen master. "If you do not
first empty your cup, how can you taste my cup of tea?"

The above was written by Bruce Lee in his Jee Kune Do manuscript. Use Google to locate the scanned pdf copy.

I can tell you personally that it is extremely difficult to "empty" one's cup in order to "taste" other's tea. Some years ago, I tried but failed to "empty" anything. Now I am trying it again and hope that I can empty 10-20% of "my cup of tea". BTW, I am relating this to IT skills that I know and don't know.

The moral of the story is "The usefulness of a cup is in its emptiness"

Trimming Large Log File

If you have a very large web access log and you cannot afford to shutdown the web server in order to cleanup or rotate the log file, you may want to use this trick. Copy the content to a separate file, if the content is worth saving. Once that is done, copy the /dev/null to overwrite the existing web access log. Don't worry, your web server is still having the same file handler to the existing web access log.

I tested it out by writing a simple Tcl program to append line to a file every 0.1 seconds to simulate the log generation. In a separate terminal, I overwrite the file with /dev/null. The file got trimmed and the Tcl program still keep on appending to the file. The Tcl program looks like this:

$ tclsh
% set fp [open access.log a]
% fconfigure $fp -buffering line
% set n 0
% while 1 {
 puts $fp "[incr n] appending line to log file ..."
 after 100
}

You may loose some records but at least you do not have to worry about the large log file filling up your file system.

Labels: Tcl, unix

Friday, June 20, 2008

New Trick to Parse Field Separated File

After so many years of using UNIX, I am still learning new tricks in programming shell script. Today, I am going to show you a trick which will save you a lot of repetitive coding in extracting fields from a field-separated file such as /etc/passwd.

I used to do it this way:

$ while read line
do
    user=`echo $line | cut -d: -f1`
    uid=`echo $line | cut -d: -f3`
    gid=`echo $line | cut -d: -f4`
    echo "User=$user ($uid:$gid)"
done < /etc/passwd

User=root (0:0)
User=daemon (1:1)
User=bin (2:2)
User=sys (3:3)
User=adm (4:4)
User=lp (71:8)
User=uucp (5:5)
User=nuucp (9:9)
User=dladm (15:3)
User=smmsp (25:25)
User=listen (37:4)
User=gdm (50:50)
User=webservd (80:80)
User=postgres (90:90)
User=nobody (60001:60001)
User=noaccess (60002:60002)
User=nobody4 (65534:65534)
User=chihung (100:1)

New way to do thing:

$ while IFS=: read user x uid gid dummy
do
    echo "User=$user ($uid:$gid)"
done < /etc/passwd
 
User=root (0:0)
User=daemon (1:1)
User=bin (2:2)
User=sys (3:3)
User=adm (4:4)
User=lp (71:8)
User=uucp (5:5)
User=nuucp (9:9)
User=dladm (15:3)
User=smmsp (25:25)
User=listen (37:4)
User=gdm (50:50)
User=webservd (80:80)
User=postgres (90:90)
User=nobody (60001:60001)
User=noaccess (60002:60002)
User=nobody4 (65534:65534)
User=chihung (100:1)

The above trick is to set the IFS environment variable as colon (IFS=:) for "read" command. Any name=value pair set before a command will be used as environment variable within the command and there will not have any side effect to your existing shell. Below shows how you can set the environment variables to command "env". Also the man page of sh (man sh in Solaris) describing the usage of IFS

$ /bin/sh
$ MY_a=a MY_b=b MY_c=c env | grep MY
MY_a=a
MY_b=b
MY_c=c

$ env | grep MY

$ man sh
 ....
     read name ...

         One line is read from the standard input and, using  the
         internal  field  separator, IFS (normally space or tab),
         to delimit word boundaries, the first word  is  assigned
         to  the  first name, the second word to the second name,
         and so forth, with leftover words assigned to  the  last
         name.  Lines can be continued using \newline. Characters
         other than newline can be quoted by preceding them  with
         a  backslash. These backslashes are removed before words
         are assigned to names, and no interpretation is done  on
         the  character  that  follows  the backslash. The return
         code is 0, unless an EOF is encountered.

Now I can write cleaner and efficient script to parse any field-separated type of file.

Labels: shell script

Wednesday, June 18, 2008

Cool Stuff from OpenSolaris 2008.05

You have to read this OpenSolaris 2008.05 review from Ashwin Bhat, a Sun Campus Ambassador. He has more or less summarised most of the cool stuff in OpenSolaris

Labels: opensolaris

Tuesday, June 17, 2008

How to Parse Multiline in AWK

I was given a Java thread dump output from my colleague and the output is kind of a multiline records with record separator (RS) being a blank line. The question is, can AWK parse this type of file just like other space/colon/comma separated files. The answer is yes. It is well documented in The AWK Programming Language written by the three authors. If you have the book, it is described on page 60-61.

Here is a sample script to demonstrate how we can summarise the "Thread-" name that is in waiting and/or in lock state. The sample Java thread dump is download from here.

$ cat threaddump.sh
#! /bin/sh

if [ $# -ne 1 ]; then
 echo "Usage: $0 <thread-dump>"
 exit 1
fi
if [ ! -f $1 ]; then
 echo "Error: $1 does not exist"
 exit 2
fi


awk '
BEGIN {
 RS=""
 FS="\n"
}
/^"Thread-/ {
 split($1,a," ")
 if ( /waiting on/ ) {
  wait[a[1]]++
 }
 if ( /locked/ ) {
  lock[a[1]]++
 }
}
END {
 printf("Thread\t\tWait\tLock\n")
 printf("------\t\t----\t----\n")
 for ( i in wait ) {
  printf("%s\t%d\t%d\n", i, wait[i], lock[i])
 }
}' $1


$ cat threaddump.txt
Full thread dump Java HotSpot(TM) Client VM (1.5.0_10-b03 mixed mode):

"Thread-7" prio=4 tid=0x0b482220 nid=0x1570 in Object.wait() [0x0bbcf000..0x0bbcfae8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x03017960> (a concurrency.diners.Fork)
 at java.lang.Object.wait(Object.java:474)
 at concurrency.diners.Fork.get(Fork.java:22)
 - locked <0x03017960> (a concurrency.diners.Fork)
 at concurrency.diners.Philosopher.run(Philosopher.java:29)

"Thread-6" prio=4 tid=0x0b481808 nid=0xa84 in Object.wait() [0x0bb8f000..0x0bb8fb68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x030707e0> (a concurrency.diners.Fork)
 at java.lang.Object.wait(Object.java:474)
 at concurrency.diners.Fork.get(Fork.java:22)
 - locked <0x030707e0> (a concurrency.diners.Fork)
 at concurrency.diners.Philosopher.run(Philosopher.java:29)

"Thread-5" prio=4 tid=0x0b47e310 nid=0x167c in Object.wait() [0x0bb4f000..0x0bb4fbe8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x03070850> (a concurrency.diners.Fork)
 at java.lang.Object.wait(Object.java:474)
 at concurrency.diners.Fork.get(Fork.java:22)
 - locked <0x03070850> (a concurrency.diners.Fork)
 at concurrency.diners.Philosopher.run(Philosopher.java:29)

"Thread-4" prio=4 tid=0x0b47d808 nid=0x1730 in Object.wait() [0x0bb0f000..0x0bb0fc68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x030708c0> (a concurrency.diners.Fork)
 at java.lang.Object.wait(Object.java:474)
 at concurrency.diners.Fork.get(Fork.java:22)
 - locked <0x030708c0> (a concurrency.diners.Fork)
 at concurrency.diners.Philosopher.run(Philosopher.java:29)

"Thread-3" prio=4 tid=0x0b480cd8 nid=0x11c4 in Object.wait() [0x0bacf000..0x0bacfce8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x03017b38> (a concurrency.diners.Fork)
 at java.lang.Object.wait(Object.java:474)
 at concurrency.diners.Fork.get(Fork.java:22)
 - locked <0x03017b38> (a concurrency.diners.Fork)
 at concurrency.diners.Philosopher.run(Philosopher.java:29)

"AWT-EventQueue-1" prio=4 tid=0x0b46e1d0 nid=0x16c8 in Object.wait() [0x0ba4f000..0x0ba4fa68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x02ffa368> (a java.awt.EventQueue)
 at java.lang.Object.wait(Object.java:474)
 at java.awt.EventQueue.getNextEvent(EventQueue.java:345)
 - locked <0x02ffa368> (a java.awt.EventQueue)
 at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:189)
 at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:163)
 at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:157)
 at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:149)
 at java.awt.EventDispatchThread.run(EventDispatchThread.java:110)

"DestroyJavaVM" prio=6 tid=0x00266dc0 nid=0x1118 waiting on condition [0x00000000..0x0006fae8]

"AWT-EventQueue-0" prio=6 tid=0x0b451f60 nid=0x124c in Object.wait() [0x0b82f000..0x0b82fbe8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x0300a858> (a java.awt.EventQueue)
 at java.lang.Object.wait(Object.java:474)
 at java.awt.EventQueue.getNextEvent(EventQueue.java:345)
 - locked <0x0300a858> (a java.awt.EventQueue)
 at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:189)
 at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:163)
 at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:157)
 at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:149)
 at java.awt.EventDispatchThread.run(EventDispatchThread.java:110)

"thread applet-concurrency/diners/Diners.class" prio=4 tid=0x0b3cab40 nid=0x10a0 in Object.wait() [0x0b7ef000..0x0b7efb68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x0300a900> (a sun.applet.AppletViewerPanel)
 at java.lang.Object.wait(Object.java:474)
 at sun.applet.AppletPanel.getNextEvent(AppletPanel.java:282)
 - locked <0x0300a900> (a sun.applet.AppletViewerPanel)
 at sun.applet.AppletPanel.run(AppletPanel.java:332)
 at java.lang.Thread.run(Thread.java:595)

"AWT-Windows" daemon prio=6 tid=0x0ac90b38 nid=0x1124 runnable [0x0af0f000..0x0af0fce8]
 at sun.awt.windows.WToolkit.eventLoop(Native Method)
 at sun.awt.windows.WToolkit.run(WToolkit.java:269)
 at java.lang.Thread.run(Thread.java:595)

"AWT-Shutdown" prio=6 tid=0x0ac90780 nid=0x7dc in Object.wait() [0x0aecf000..0x0aecfd68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x02fb5830> (a java.lang.Object)
 at java.lang.Object.wait(Object.java:474)
 at sun.awt.AWTAutoShutdown.run(AWTAutoShutdown.java:259)
 - locked <0x02fb5830> (a java.lang.Object)
 at java.lang.Thread.run(Thread.java:595)

"Java2D Disposer" daemon prio=10 tid=0x0ac82aa8 nid=0x1014 in Object.wait() [0x0ae8f000..0x0ae8f9e8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x02fdd7a8> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
 - locked <0x02fdd7a8> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
 at sun.java2d.Disposer.run(Disposer.java:107)
 at java.lang.Thread.run(Thread.java:595)

"Low Memory Detector" daemon prio=6 tid=0x00a94e70 nid=0x1038 runnable [0x00000000..0x00000000]

"CompilerThread0" daemon prio=10 tid=0x00a93a70 nid=0x12c8 waiting on condition [0x00000000..0x0abcf8c8]

"Signal Dispatcher" daemon prio=10 tid=0x00a92e28 nid=0x16bc waiting on condition [0x00000000..0x00000000]

"Finalizer" daemon prio=8 tid=0x00a89cd0 nid=0x1044 in Object.wait() [0x0ab4f000..0x0ab4fc68]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x02fdd950> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
 - locked <0x02fdd950> (a java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
 at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00a88860 nid=0x106c in Object.wait() [0x0ab0f000..0x0ab0fce8]
 at java.lang.Object.wait(Native Method)
 - waiting on <0x02fdd700> (a java.lang.ref.Reference$Lock)
 at java.lang.Object.wait(Object.java:474)
 at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
 - locked <0x02fdd700> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x00a85d98 nid=0x1030 runnable 

"VM Periodic Task Thread" prio=10 tid=0x00a960c8 nid=0x14b4 waiting on condition 


$ ./threaddump.sh threaddump.txt
Thread          Wait    Lock
------          ----    ----
"Thread-3"      1       1
"Thread-4"      1       1
"Thread-5"      1       1
"Thread-6"      1       1
"Thread-7"      1       1

I can tell you that AWK is damn powerful.

Labels: awk, shell script

Monday, June 16, 2008

My First Python Program

This is my second attempt in trying to learn Python since 2000. You may be wondering what's the motivation behind it and whether I will "dump" my favourite scripting, Tcl, to go full steam with Python. Tcl is still my "mother tongue" and definitely no harm to learn another "foreign language".

The motivation comes from "The Zen of Python" and the way they do multi-precision integer calculation. Below shows Python in action and compare with Perl (with Bignum module) & UNIX bc:

$ /cygdrive/c/Python25/python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361
22493198378815695858127594672917553146825187145285692314043598457757469857480393
45677748242309854210746050623711418779541821530464749835819412673987675591655439
46077062914571196477686542167660429831652624386837205668069376L
>>>exit()

$ perl -v

This is perl, v5.8.8 built for cygwin-thread-multi-64int
(with 8 registered patches, see perl -V for more detail)

Copyright 1987-2006, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

$ echo "use Bignum; print 2**1000" | perl
1.07150860718627e+301

$ echo "2^1000" | bc
10715086071862673209484250490600018105614048117055336074437503883703\
51051124936122493198378815695858127594672917553146825187145285692314\
04359845775746985748039345677748242309854210746050623711418779541821\
53046474983581941267398767559165543946077062914571196477686542167660\
429831652624386837205668069376

Recently my colleague passed me a pretty big (170MB in size, 227K lines) IIS log file and I thought this is a good time to practice my Python skill. BTW, this is my first not so trivial Python program. The objective of the program is to work out the hourly byte sent, byte received and hits. Also, I wanted to compare Python with AWK and Tcl (8.4.12).

Here is the "battle field" for Python vs Tcl vs AWK in my Cygwin. To be fair, each program will be executed 3 times.

$ time ./sum.py iis.log > a

real    0m14.672s
user    0m0.015s
sys     0m0.015s

$ time ./sum.py iis.log > a

real    0m15.391s
user    0m0.031s
sys     0m0.031s

$ time ./sum.py iis.log > a

real    0m15.094s
user    0m0.015s
sys     0m0.031s

$ time ./sum.sh iis.log > b

real    0m18.704s
user    0m15.170s
sys     0m0.373s

$ time ./sum.sh iis.log > b

real    0m18.219s
user    0m14.951s
sys     0m0.233s

$ time ./sum.sh iis.log > b

real    0m18.390s
user    0m14.873s
sys     0m0.483s

$ time ./sum.tcl iis.log > c

real    0m15.781s
user    0m0.015s
sys     0m0.015s

$ time ./sum.tcl iis.log > c

real    0m14.641s
user    0m0.015s
sys     0m0.000s

$ time ./sum.tcl iis.log > c

real    0m15.031s
user    0m0.015s
sys     0m0.000s


# verify the output are the same
# btw, python and tcl treated the default end of line to be CRLF (native platform is Windows)
$ for i in a b c
do
dos2unix < a | md5sum
done
83211bf4faa32495ca9eb52c6b520974 *-
83211bf4faa32495ca9eb52c6b520974 *-
83211bf4faa32495ca9eb52c6b520974 *-

It is clear the both Python and Tcl come in neck to neck. A comprehesive scripting language like Python and Tcl is definitely more versatile than a specific tool such as AWK. Below are the source codes for the various programs in case you are interested in the details:

$ cat sum.py
#! /cygdrive/c/Python25/python

import sys

if len(sys.argv) != 2:
        print "Usage:", sys.argv[0], ""
        exit(1)


sc={}
cs={}
cnt={}
for i in range(24):
        index='%02d' % i
        sc[index]=0
        cs[index]=0
        cnt[index]=0


file=open(sys.argv[1],'r')
line=file.readline()
while line:
        fields=line.split()
        times=fields[1].split(':')
        hour=times[0]
        sc[hour] += int(fields[18])
        cs[hour] += int(fields[19])
        cnt[hour] += 1
        line=file.readline()
file.close()


k=sc.keys()
k.sort()
for i in k:
        print i,sc[i],cs[i],cnt[i]




$ cat sum.sh
#! /bin/sh

if [ $# -ne 1 ]; then
        echo "Usage: $0 <input-log>"
        exit 1
fi

awk '
{
        split($2,t,":")
        hr=t[1]
        sc[hr]+=$19
        cs[hr]+=$20
        hit[hr]++
}
END {
        for ( h=0 ; h<24 ; ++h ) {
                hh=sprintf("%02d",h)
                print hh, sc[hh], cs[hh], hit[hh]
        }
}' $1




$ cat sum.tcl
#! /cygdrive/c/ActiveTcl/8.4.12.0/bin/tclsh

if { $argc != 1 } {
        puts stderr "Usage: $argv0 "
        exit 1
}
set logfile [lindex $argv 0]
if { ![file exists $logfile] } {
        puts stderr "Error. $logfile does not exist"
        exit 2
}


# initialise to 0
set hours {}
for { set h 0 } { $h < 24 } { incr h } {
        lappend hours [format {%02d} $h]
}
foreach hr $hours {
        set sc($hr) 0
        set cs($hr) 0
        set hit($hr) 0
}


set fp [open $logfile r]
while { [gets $fp line] >= 0 } {
        set time [lindex $line 1]
        set hr [lindex [split $time :] 0]
        incr sc($hr) [lindex $line 18]
        incr cs($hr) [lindex $line 19]
        incr hit($hr)
}
close $fp


foreach hr $hours {
        puts "$hr $sc($hr) $cs($hr) $hit($hr)"
}

I just covered 200 pages (out of 746 pages) of the Learning Python, 3rd Edition and hope to explore more features as I go into the details. So far, I particularly like the feature-rich OO methods available in their core objects. However, I still have not figure out how to differentiate between attribute and method of an object.

Labels: awk, Cygwin, python, Tcl

Tuesday, June 10, 2008

Start Everything "At Now"

In Solaris, you can put a process to run in background by ending the command with a &. However, your background job will be killed or terminated if you quit from your existing shell session.

In order to allow the background job to continue to run even after you exit from your shell, you need to run with nohup(1) command. All standard output and error will be redirected to nohup.out file.

If you do not like this arrangement, you may want to try at(1) command to run at a later time or even "now". By default, you can have a maximum of 4 parallel at jobs. See queuedefs(4) for the configuration details. BTW, the file is located at /etc/cron.d/queuedefs.

To submit a job to run as a background (in actual fact, it is a kind of cron job), type in your command(s) at the "at" prompt. Type control-D to finish the "at" command:

$ at now
at> /my/long/running/job.sh
at> ctrl-D
commands will be executed using /bin/bash
job 1213107087.a at Tue Jun 10 22:11:27 2008

You may want to start everything AT NOW

Labels: Solaris

Friday, June 06, 2008

sed trick

If your sed regular expression pattern contains "/", it will break sed because it will think that you are trying to terminate the command premature. You can try to use other character to replace "/", if not you have to escape the "/" with "\/". See below for illustration:

$ echo "/home/chihung/bin" | sed -e "s/\/home\/chihung/\/user\/chchan/"
/user/chchan/bin

$ echo "/home/chihung/bin" | sed -e "s#/home/chihung#/user/chchan#"
/user/chchan/bin

One caveat, if you use "!" to replace "/" within the double quote expression pattern, sed in an interactive bash shell will complain because the shell will interpret "!" as if you are trying to access the history event. Shell will not parse single-quoted string and therefore sed expression pattern using "!" as separator is fine. See below:

$ echo "/home/chihung/bin" | sed -e "s!/home/chihung!/user/chchan!"
bash: !/home/chihung!/user/chchan!": event not found

$ echo "/home/chihung/bin" | sed -e 's!/home/chihung!/user/chchan!'
/user/chchan/bin

$ cat a.sh
#! /bin/sh
echo "/home/chihung/bin" | sed -e "s!/home/chihung!/user/chchan!"

$ ./a.sh
/user/chchan/bin

Labels: shell script

Storage Solution on OpenSolaris

I blogged about Sun Open Storage few weeks ago and I am sure you will agree with me that the future for storage will be based on open standards.

If we are in the same frequency, you should read this article:
Building a Storage Server on top of OpenSolaris by Thomas Nau from University of Ulm, Germany

With ZFS, iSCSI (target and initiator), CIFS, NFS (NFSv4.1, a.k.a parallel NFS (pNFS) in the pipeline) all being "first-class citizen" in OpenSolaris, I am sure OpenSolaris will rock.

Labels: opensolaris, storage

Thursday, June 05, 2008

Google Shell

With the power of AJAX, goosh.org brings Google search in command line inside a web browser similar to UNIX shell.

Type help for more info

By default, it is web search. Here I am trying to search for myself
Hit the "Return" key for more search results

images for image search. Here I am looking for OpenSolaris images

You can "cd blogs" to blogs namespaces. Once you are there, whatever keywords you type will be used for blog search. Here I am looking for blog that talk about Solaris and cputrack

You can also restrict your search just to Wikipedia. Also, you can do Google Translate. Here I am trying to translate "Very Cool" from English to Chinese

Very Cool, 非常酷, sehr cool, ...

Labels: google

Wednesday, June 04, 2008

Virtualisation of Singapore Skyline

Above: Physical - Singapore skyline from 70th floor
Below: Virtual - Singapore skyline on the beach (in sand)

Labels: singapore

Tuesday, June 03, 2008

Learn About The Grid

CERN - European Organisation for Nuclear Research, where the World Wide Web started, has put up a web site: Grid Cafe (The place for everyone to learn about the grid). It consists of demos, games, history, projects, blogs, podcasts, videos.

See this video for an introduction

Labels: grid

Rendering, a scalability case study

Recently I assisted an overseas partner to resolve a performance/scalability problem regarding Maya software rendering on Windows 2003. Their customer was telling them that the Sun Fire X4600 M2 server's performance is worst than Sun Ultra 40 M2 workstation. It is hard to believe that the server's performance (8, 16 cores) is worst than the workstation (only 4 cores).

I was given the Maya scene file and managed to render on my older Sun Fire X4600 server with Windows 2003 64-bit and Maya 8.5.

By default, Maya renderer will try to make use of all the CPU (default command line "-n 0") in the computer. However, not all the CPU will be used throughout the entire rendering process. Below Task Manager showed that only a few of the cores are actively participated in a 16-core rendering.

A typical frame is rendered using various number of cores and it is plotted against the ideal case (1/n). As you can see, the 'sweet spot' is to render with 1-3 cores. Also, performance is getting worst if more than 3/4 cores are deployed in the rendering. Likely Maya has to spend more time to do housekeeping internally. If the server has enough memory for the job, the best choice is to render one frame per core. Having more processes rendering 1 frame per core will result in more jobs trying to fetch the Maya file and it's corresponding dependencies (they are big, in this case it is 1+GB) from the network.

The CPU utilisation for "Render -n 16" command is captured using perfmon and is visualised using Gnuplot as shown below. Obviously the average CPU utilisation is less than 20% even through Maya claimed to use all 16 cores to render.

Performance issue is not a static problem, often time you have to strike a balance between cpu, io and network. So the moral of the story will be:

Understand your application
Don't simply take the default setting
Understand your system
Monitor all aspect of the infrastructure, CPU, memory, network, storage, io ...

Labels: grid, rendering

Chi Hung Chan