Monday, December 17, 2007

Load testing stand alone classes with Apache JMeter

We want to collect timing information for some of our class methods. The kind of question we are interested in is: how long does method A or method B take? We would also like to subject our methods to multi-threaded load and analyze their performance under load. We also want pretty graphs of the performance tests, plus we wish to save every individual sample to a file.

Instead of doing this by writing our own instrumentation around System.currentTimeMillis(), we decided to take a different path. Since we are already using Apache JMeter for our web application load testing, we started looking for ways to load test our class methods with JMeter too. The task turned out to be much simpler than anticipated.

First, you create a custom sampler that JMeter can use. Below we present one such custom sampler.



import java.text.DateFormat;
import java.util.Date;

import org.apache.jmeter.config.Arguments;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerClient;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
import org.apache.jmeter.samplers.SampleResult;

public class JPACreateTest implements JavaSamplerClient {

    public Arguments getDefaultParameters() {
        return new Arguments();
    }

    public SampleResult runTest(JavaSamplerContext context) {
        SampleResult results = new SampleResult();
        String userName = "jpa console";

        try {
            String t = DateFormat.getDateTimeInstance().format(new Date());
            String subject = "JPA create test @ " + t;
            String message = "JPA create message @ " + t;

            results.sampleStart();
            // call the methods you want to time
            MessageService messageService = new MessageServiceImpl();
            messageService.createNewMessage(userName, subject, message);
            results.sampleEnd();

            results.setSuccessful(true);
        } catch (Exception ex) {
            results.setSuccessful(false);
            ex.printStackTrace();
        }
        return results;
    }

    public void setupTest(JavaSamplerContext arg0) {
        // no initialization required
    }

    public void teardownTest(JavaSamplerContext arg0) {
        // no cleanup required
    }
}



Implement the JavaSamplerClient interface and return a SampleResult. To use this sampler in JMeter, follow these steps:

  • Put your sampler implementation on the JMeter classpath. The cleanest way is to package the application class files in a jar and drop that jar into JMeter's lib/ext folder.
  • Adjust the user.classpath property in jmeter.properties to include all the jars that your application jar depends on.
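For the second step, the entry in jmeter.properties might look like this (the jar names here are only illustrative; use your platform's path separator):

```properties
# extra jars needed by our Java Request sampler
user.classpath=../lib/myapp-deps.jar;../lib/mysql-connector.jar
```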
Once the sampler jar is in lib/ext and its dependencies are listed in user.classpath, restart the JMeter UI. To add your sampler:


  • Right-click to open the context menu
  • Add → Sampler → Java Request
  • Select your sampler from the drop-down
Two links have been especially useful.

Saturday, October 13, 2007

using sphinx freetext search engine on windows xp

This entry deals with using (and testing) the open source full-text search engine Sphinx on a win32 box. The official Sphinx documentation, as well as most other links and tutorials, is Linux oriented, so I believe a step-by-step newbie guide would be helpful. First, go to the Sphinx website and download the zip file containing the binaries. (If you feel you are up to compiling Sphinx on win32 boxes, you should not be reading newbie guides anyway!) The assumption is that you already have a working MySQL installation on your Windows box.

http://www.sphinxsearch.com/downloads.html



Download the zip file and extract it into a folder of your choice. The first step in our process is to create a sphinx.conf file, which will be used by all the programs. For simplicity's sake, let's say we have a table with only 3 columns: id, title and description. We want to index the text in the title and description fields. Our test table is named "test" and it sits in a database called testdb.

Below is how our config file looks:


source test
{
    type           = mysql

    sql_host       = localhost
    sql_user       = gloo
    sql_pass       = gloo
    sql_db         = testdb
    sql_port       = 3306

    # indexer query
    sql_query      = SELECT id, title, description FROM test

    # document info query
    sql_query_info = SELECT * FROM test WHERE id=$id
}

index test
{
    source         = test
    path           = ./test-index

    morphology     = stem_en
    min_word_len   = 3
    min_prefix_len = 0
    min_infix_len  = 3
}

searchd
{
    port = 3312
}



You have to know a bit about the Sphinx config file; see this link on the Sphinx web site for all the options. Now we are ready to index our test table. To do so, just go to the folder where you extracted the Sphinx binaries. You should see three programs, indexer, searchd and search, as shown in the picture below.



Just go to the command line and type the command "indexer". Make sure that your MySQL service is running before you try to use the indexer. Run with no arguments, indexer lists all possible options; you run it with the path to your configuration file. Again, read the Sphinx documentation for full information. When you run the indexer command, the indexer builds an index of your data. You can verify this by looking inside your folder (or wherever you set the path to the index files): some new files should have been created.



So now we have used the indexer and created the indexes from our data. But how do we use these indexes? What we do here is start the Sphinx search daemon that other programs and APIs can connect to. (It is also possible to build Sphinx as a MySQL plug-in, but let's go with the searchd daemon for now.) Start the searchd daemon.


The Sphinx searchd daemon is now listening for connections. To test it, use the search client program. Later we will see how to use the PHP API to do searches.
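Putting the whole cycle together, the command-line session looks roughly like this (a sketch; check the Sphinx docs for the exact flags your version supports):

```
C:\sphinx> indexer --config sphinx.conf --all
C:\sphinx> searchd --config sphinx.conf
C:\sphinx> search --config sphinx.conf some query words
```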
You will find the following links useful




Tuesday, September 25, 2007

Usability of Indian web sites




We have changed addresses six times in the last seven years, and that is one of the reasons I keep trying to move more and more of my life online: to make ordering and tracking things easier. The things include my tata-sky subscription, LPG cylinders, credit card bills, share trading, shopping and banking, to name just a few. While most organizations in India do have an online presence and websites, usability, or a good user experience, rates pretty low on most of the sites (even circa 2007).

Now, I will try to collect and post examples of bad website designs from time to time. The first one is from iocl.com (Indian Oil Corporation). I am a domestic LPG cylinder user and want to locate the distributor in my area. I know my address, so this should be fairly simple, what with Google Maps and all? The answer is a big *No*. When I click the distributor list, I just get to select a state (imagine that, a state). Then I am shown the names and addresses of all the distributors in Karnataka. Now, how helpful is that?

Thursday, September 20, 2007

T_PAAMAYIM_NEKUDOTAYIM

Today I was working with the YouTube API and PHP. During development, I opened the error log and completely freaked out when I saw this message:

[Thu Sep 20 14:58:59 2007] [error] [client 127.0.0.1] PHP Parse error: syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM in D:\\video\\web\\search.php on line 114

By chance I had forgotten my specs at home, so I really rubbed my eyes hard before looking at the screen again. No, no mistake, this was the error. Quick googling revealed that this message is Hebrew: paamayim nekudotayim means "double colon", the :: scope resolution operator. Both Suraski and Gutmans appear to be from the Israel Institute of Technology, according to the Zend management profiles page.

Thursday, August 02, 2007

dbmonster as pure data generator

According to their website, "DB Monster is a tool which helps database application developers with tuning the structure of the database, tuning the usage of indexes, and testing the application performance under heavy database load." DBMonster can use a supplied schema to populate your database tables. This is a real help if you are planning to tune your database and want to analyze index usage etc. You always want to tune against a volume database, one which has lots of rows and data distributions that match your expected usage.

However, I stumbled on DBMonster when I was only looking for data generator tools. I had already written some code to populate various tables in my database, and I was looking for a data generator that I could just wire into my code. DBMonster has different data generators, like a string generator, a number generator etc. Below, I show you how to use the DBMonster string generator.

import java.util.Random;

import pl.kernelpanic.dbmonster.DBMonster;
import pl.kernelpanic.dbmonster.DBMonsterContext;
import pl.kernelpanic.dbmonster.DictionaryManager;
import pl.kernelpanic.dbmonster.generator.StringGenerator;

public class StringGeneratorProxy {

    // the real generator
    private StringGenerator generator;

    public StringGeneratorProxy() throws Exception {
        generator = new StringGenerator();
        DBMonsterContext ctx = new DBMonsterContext();
        DictionaryManager dictManager = new DictionaryManager();
        Random random = new Random();
        dictManager.setRandom(random);
        ctx.setProperty(DBMonster.DICTIONARY_MANAGER_KEY, dictManager);
        ctx.setProperty(DBMonster.RANDOM_KEY, random);
        generator.initialize(ctx);
    }

    public void setMinLength(int length) {
        generator.setMinLength(length);
    }

    public void setMaxLength(int length) {
        generator.setMaxLength(length);
    }

    public String generate() {
        return (String) generator.generate();
    }

    public static void main(String[] args) throws Exception {
        StringGeneratorProxy generator = new StringGeneratorProxy();
        generator.setMinLength(100);
        generator.setMaxLength(500);
        System.out.println(generator.generate());
    }
}




Tuesday, July 10, 2007

DB2 on a vmware guest linux OS

We are working with DB2 UDB v9. Since we are still on the bleeding edge, the changes to the database schema are very frequent and huge; we do not even attempt to publish patches. For QA and production we maintain a set of magic unix shell scripts that create the database and all the related stuff in one shot. So far so good: run a script and everything is fine. The only issue is, these scripts are meant to be used on a unix box.

Unfortunately, my development box is Windows XP. I needed a local copy of DB2 so that I could work even when not connected to the office databases. I tried running DB2 on my XP laptop, but maintenance was a big nightmare. (Because of the way we are doing things, nothing else!)

So finally, one fine day, I decided to ditch the Windows copy of DB2 in favor of DB2 running on a Linux guest OS under vmware on my XP box. The advantages are
  • I can decide when I want to run DB2, unlike my Windows copy of DB2 that kept so many services running no matter what. (I could not figure out a way to turn them off when not in use.)
  • RAM is adjustable. So far I have given my Linux guest 512 MB out of my 2 GB.
  • It looks like a clean solution; I can keep running both OSes.
  • Most important of all, maintenance is easy for me: I can just run the unix scripts used by the QA and production guys.
Now for the installation and how to do it on your machine.

Install vmware server on your Windows XP box. You will need a key for your copy of vmware server; the keys can be obtained from the vmware website. Once you have punched in the keys and started vmware server, add a new virtual machine. The steps are very straightforward. The only things you have to think about are how to do the networking and the on-disk size of your virtual machine. I went with bridged networking: I wanted to assign one IP to my XP box and one IP to my guest OS.

After creating the virtual machine, pop the Linux distro CD into your CD drive. (I used openSUSE, but I guess other distros work as well.) After that it was the usual SUSE install process. I wanted a small system, so I assigned 8 GB to my guest OS and did not install the 1000+ utilities, just the basic stuff. Once the guest OS install is over, verify that you can connect to the outside world from your guest OS and vice versa. Like I said, I have two IPs, one for my vmware interface and one for my XP interface.

After that , just start the guest OS and it is like any other linux machine to the outside world.

Sunday, May 06, 2007

converting youtube.com flv videos to mobile 3gp format

This took one full Sunday afternoon as well as the evening. I would like to blog the required information before it fades from my memory ;o)

I was watching some Simpsons videos on youtube.com and thought of playing them on my mobile handset. Flash seems to rule the roost when it comes to online videos; almost all the online video websites I know use the .flv format. So we can break our problem into two parts:

  • A) get the video file from provider site
  • B) convert the flv video file to the 3GP format (H.263) that can play on a mobile handset

For (A), there are many sites that let you download youtube videos: you just copy+paste the youtube.com URL into a box and then download the flv file to your hard disk. A quick google reveals many links; to start your research you can look at the first link at the end of this post. The issue is that most of the ready-made tools are geared towards individual files, forcing you to work in dumb user mode and asking for $30 every now and then.

I wanted a script or program that I could use in batch mode or call from within my other programs. Right now I am using this perl script, which is a rip-off of the python youtube-dl script. I will write a java version of this script when I get some time. Downloading youtube.com videos is a non-issue, but converting flv videos to the 3GP format proved quite a task.

First up, there is a Nokia multimedia converter on Windows that may work for you; just search forums.nokia.com for more info. There are many programs that let you convert flv to other media formats. Just check the first link I have provided. You can look for Total Video Converter or some other tool.

Like I said, my requirements are a bit different: I want a program or script that I can call from the command line to do batch processing of flv files. This proved a bit difficult because the information is scattered all over the place. There are many web pages that deal with this topic; I will post them all at the end of this post. To cut a long story short, I zeroed in on ffmpeg. ffmpeg can do the required conversion if you build it with the right switches. That looked like a lot of work, but here I got lucky ;o)

The Riva FLV encoder is built on top of ffmpeg, and on my Windows XP box I can see ffmpeg.exe inside the Riva FLV encoder folders. Superb! A quick check revealed that it has built-in support for 3gp and flv. (Check with: ffmpeg -formats)

So this is what I use now. (Beware, I do not know much about which options are allowed; I have to dig into that a bit.)


C:\sw\Riva\Riva FLV Encoder 2.0>ffmpeg -i Z3Kvn28HvJw.flv -ab 8 -ar 8000 -b 200
-s 176x144 Z3Kvn28HvJw.3gp
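Since my goal was batch processing, the same command can be wrapped in a Windows cmd for loop (a sketch; the switches are just the ones used above):

```
C:\sw\Riva\Riva FLV Encoder 2.0>for %f in (*.flv) do ffmpeg -i "%f" -ab 8 -ar 8000 -b 200 -s 176x144 "%~nf.3gp"
```

Here %~nf expands to the file name without its extension, so every foo.flv in the folder becomes foo.3gp.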

Next up, I have to see if there are any nice java wrappers around ffmpeg. And here are the links I found useful:

  1. start here with a post about youtube.com grabbers , viewers and converters
  2. Perl script to download youtube flv video
  3. And that is a rip-off of this python youtube-dl script (original).
  4. Nokia Multimedia converter
  5. see this post about ffmpeg 3gp display size options
  6. niemueller.de post about converting to 3gp
  7. Ubuntu + ffmpeg
  8. How to enable AMR support in ffmpeg
  9. post praising vixy.net
  10. Linuxondesktop blog detailing flv23gp conversion with mencoder and ffmpeg
  11. Home of great ffmpeg library
  12. Mplayer Home - http://www.mplayerhq.hu/
  13. Riva FLV encoder
  14. Digg.com article on how to download youtube videos
  15. How to compile ffmpeg under windows - sonke rohde (creator of Riva )

Friday, April 27, 2007

IIMB pgsem SOP

I have worked in the Indian software industry for the past seven years. Most of the Indian software industry is consultancy oriented and geared towards learning ready-made tools, but I was more interested in doing product design and development. In the year 2000 I joined the start-up team of Indegene life-systems, working on handheld device applications for the pharmaceutical and biotechnology industries.

During my stay at Indegene I designed and implemented the health internetwork india pilot project (http://www.nhicindia.ernet.in) for the World Health Organization. This project was a pilot for the "bridge the digital divide" initiative and has been deployed in many South Asian and African countries. I was also instrumental in creating the platform for multimedia content delivery that forms the core of Indegene's revenue till date. From Indegene, I moved on to Oracle to work on their CRM suite, which had 400+ installations worldwide. At that time, ours was the only team in the Oracle India center with a full product. As a team, we stabilized the product and took it through two releases. My role involved resolving customer issues, working with product managers and enhancing the product.

I joined Everypath, a provider of mobile business solutions, in September 2004. I was the first employee of the Everypath India office and took an active part in setting up the office and recruiting the core team. At that time Everypath was the only company offering a web-development-like stack for mobile devices; even today, a quick google reveals the kind of market expectations we had generated. Our team in India created applications for Pocket PC and BlackBerry that were deployed by enterprises across the US, UK and Japan. At this point, my primary responsibilities were twofold: first, working with customers and gathering their requirements; second, designing the applications to those requirements in tandem with the development team.

I was responsible for creating the mobile Field Service Application (FSA) product. Everypath was a privately held company, and it folded in December 2005 when the board decided to pull the plug after running into legal issues. For a while I worked to support the mobile FSA product in Japan. From there I switched to America Online (AOL), taking charge of development of the embedded chat project. Embedded chat provides the backbone for real-time multi-party text and voice applications like AIM, ICQ and AOL client chats. AOL was trying to open its services to the world and contemplating a move to an open source stack and commodity software.

Now, moving the embedded chat project to commodity software, with web browsers as the user application, was a technological challenge because it is a real-time streaming application. Previous attempts by AOL US to move to an open source stack had failed. After joining the project, I designed the new architecture and led a team of 5 people who moved the product successfully into production using open source components.

When I look at my career so far, I have always worked in an engineering setup with small teams; teams creating products and solutions are typically small, and that has been true of all my teams. There were interactions with people from different walks of life, but they were limited to product requirements gathering. Most important of all, I was always the executor and never the planner: requirements were always handed to me, I never planned them. Now I am at a crossroads where I need to interact with people in marketing, sales and other departments.

The scope of my interaction is increasing, and so far I am well equipped to interact with only one type of people. This is one need I feel. The other is to increase the scope of my work. So far all my work has been "direct involvement" with small teams: I know how to do what they are doing, and that is how I am able to come up with estimates and guide my team. But I cannot hope to know everything, and I have to work in situations where direct involvement is not possible. I need to develop the skill of guiding a team whose skills I may not share. That will be required to lead big teams and get involved in planning.

Somewhere down the line I want to start my own company, and I believe these skills will be required. I also hope to network with interesting people during my classes. Finally, education holds a value of its own for me, and this course provides a nice opportunity to continue my education.

Wednesday, April 18, 2007

Usage of encryption/decryption and one way hashes

Everyone knows terms like one-way hashes, digests, MD5, encryption and decryption, but not everyone understands the nuances. What people understand is that to hide a message you change the original message into a form that is more difficult to read. But as application developers, when do we need encryption/decryption and when do we use one-way hashes?

You use one-way hashes when you never need to unscramble the original message again. Think of authenticating users: you only want to compare the password the user supplies against what is stored in your database. So you take the password the user supplied when the login was created, apply a one-way hash to it (like SHA-1), and store the hashed result in your database. When the user logs in again, he supplies his password in plain text; you apply the same hash function to it and compare the result against what is stored in your database.

So schematically, at the time of login creation:

  • plain text password = P
  • F(P) => apply one way hashing function F on P
  • store F(P) in database

When the user logs in again, he supplies plain text password Q;
then F(Q) should match F(P).

So here we never compare P against Q; we always compare F(P) against F(Q). Now you can see that this scheme is open to dictionary attacks: I can guess your password, run it through a well-known hash function and compare the result against what is stored in the password database.

To foil such attacks we introduce something called salt. A salt is a sequence of random bytes that is mixed into the hash and stored alongside the hashed password. So, given password P, we also generate a salt S and then store S and F(S,P) together in the database. Anyone guessing our passwords now has to guess the salts as well.

Remember, we never talked about retrieving the original messages, i.e. the plain text passwords. So where do we want to use encryption/decryption? Typically in cases where we need to hide some data from prying eyes but still need the plain text form inside our application. Say, given a message M, we can generate the encrypted message E(M) and display E(M) instead of M. When our application needs to read M, we apply a decrypt function D to E(M).

  • D(E(M)) = M
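The store-S-and-F(S,P) scheme above can be sketched in Java (a minimal illustration using SHA-256; real password storage should use a slow, purpose-built scheme such as bcrypt rather than a bare digest):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;

public class SaltedHash {

    // F(S, P): hash the salt followed by the plain text password
    static byte[] hash(byte[] salt, String password) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(salt);
        md.update(password.getBytes(StandardCharsets.UTF_8));
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        // at login creation: generate a random salt S, store S and F(S, P)
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] stored = hash(salt, "secret");

        // at login time: recompute F(S, Q) with the stored salt and compare
        System.out.println(Arrays.equals(stored, hash(salt, "secret"))); // true
        System.out.println(Arrays.equals(stored, hash(salt, "guess")));  // false
    }
}
```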

Thursday, March 08, 2007

List of job consultants (head hunters) in bangalore

I was going through my hard disk and found this interesting piece of info. All of us are pissed off by the 100s of head hunter mails landing in our mailboxes. Confess it, we all curse those mails and most of the time delete them without even reading. Last year, on a whim, I just started collecting them in my yahoo mailbox. Then one day I wrote a small script to extract the email addresses of all those head hunters. Most of them are from bangalore (not surprising, considering my location) and a small number are from the chennai, hyderabad, pune and delhi regions. This list may be helpful for mass mailing to consultants (yah, I like the idea, give them back proper ;o)

But first, here is the procedure to extract consultant emails, in case you decide to create your own list.

########################################################
# save consultant mails in one separate folder of your
# yahoo mailbox. do this for 2-3 months.
# Go to yahoo mail options | General preferences | and
# select 100 mails per page view
# copy+paste all the headers into a text file,
# mailbox-dump-file
########################################################
# now, before running this program, remove spaces
# before and after the @ sign inside mailbox-dump-file
# using the vim editor:
# :1,$ s/[ ]@/@/g
# and :1,$ s/@[ ]/@/g
# run this script with
# perl consultant.pl mailbox-dump-file | grep "@" > clist.txt
# To concat 2 or more final lists
# cat clist* | sort | uniq | tee megalist.txt
#
open(LIST, $ARGV[0]) || die "file not found\n";
my @bucket;
while (my $line = <LIST>) {
    my @words = split(/ /, $line);
    push(@bucket, $words[0]);
}
my %saw;
@saw{@bucket} = ();
my @emails = sort keys %saw;
foreach (@emails) { print "$_\n"; }

And Here is the list that you can download from esnips

Saturday, March 03, 2007

latest php 5.2.x on old iBook running panther

When I bought that 12" iBook in 2005, I never thought it was going to last a full 2 years. I had subjected it to pretty rigorous tests, installing and compiling a lot of stuff. The age is showing now, nothing hardware-wise ;o) but it is becoming increasingly difficult to maintain it as a development machine. All the documents for all the tools nowadays mention Tiger only. Most of my libraries are way too old, and attempting to install anything usually starts an "I-cant-find-this-also" chain reaction. The last thing I tried to install was ImageMagick with ruby, and I gave up after some time. It has been my internet terminal since then.

So today when my brother asked me to install the latest PHP on it, I was really apprehensive; I did not want to start a library chase on a Saturday afternoon. PHP is supposed to be a big package with a lot of dependencies, and I was sure I would give up this time too. But whoa, what a surprise! I did the install in 2 hours flat on Mac OS X 10.3.9, and I can not believe it. (Those who find it hard to believe that installing something can take 2 hours have, well, obviously, not installed a lot of things in life!)

I started with this PHP mac article. Follow the article to the letter. The only differences were that I downloaded Apache 2.0.59 instead of the 2.2.x series and chose /opt/apache2 as my prefix. Apache installed nice and smooth. Then I downloaded the PHP 5.2.1 tarball and did a configure and make. The configuration options I chose were to build with GD and mysql (we want image manipulation etc.). I did not use all the switches mentioned in the article, but YMMV. Here I ran into my first issue: a header file, xmlsave.h, was not found.

It looks like PHP needs recent libxml headers and my Panther install was missing those. The best help I got from the net was this blog, and also the same PHP mac article mentioned above. So I decided to install libxml as well. The install is pretty simple: just run configure and make and you will get libxml installed in /usr/local. (There is no need to install the new libxml in /usr, as that may break existing things.) Now you have to go back and configure PHP to build against your libxml, using the --with-libxml-dir=/usr/local option. Doing so and running make again, I ran into my second issue.

The second issue turned out to be a documented bug; you can read more at the supplied link. I do not know what the root cause is, but downloading the latest snapshot and compiling again built PHP 5.2.x for me. I had installed the JPEG and PNG libraries using fink some time back, and GD built fine against those. So yipeee, now we have the latest PHP and the iBook gets the elevated status of a development machine.






importance of UPSERT database operation

This week I ran into upserts. I have to create a script/tool for uploading data into databases. The source data format is pretty "loose" and allows users a lot of flexibility in how they generate it. The upside of this approach is that you are not imposing a lot of rules on users generating source data. (BTW, lots of rules piss people off: you can not have comments, you can not skip lines, you can not have empty lines, you must follow a certain format, etc.) So we tend to make it easy for people to generate data. However, there is a downside too.

The downside is that you need an upsert kind of operation: if the data the user supplied is already there, do not try to insert it again, just update it with whatever the user has supplied. Doing this kind of operation in the application layer would be tough.

To avoid duplicate data insertion from your program, you first need to find out whether the user-supplied data already exists. So you either fire a query to look up the supplied fields, or you load all the keys and fields at the beginning. Both are expensive operations, at least for large tables: in the first case you waste a lot of time, and in the second case you read and carry around a lot of data. What we need is some built-in mechanism from the database.

Surprisingly, mysql provides quite a number of options for what to do when a duplicate row of data (some unique key violation) shows up.
  • You can skip the insert using INSERT IGNORE
  • You can replace the row with the new data using REPLACE INTO
  • You can update existing columns (a counter, for example) using INSERT ... ON DUPLICATE KEY UPDATE
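Concretely, for a hypothetical table test(id INT PRIMARY KEY, title VARCHAR(100), descr VARCHAR(255)), the three options look like this in MySQL:

```sql
-- 1) silently skip rows whose key already exists
INSERT IGNORE INTO test (id, title, descr)
VALUES (1, 'first', 'a row');

-- 2) delete the old row (if any) and insert the new one
REPLACE INTO test (id, title, descr)
VALUES (1, 'first', 'replacement row');

-- 3) insert, or update the chosen columns when the key already exists
INSERT INTO test (id, title, descr)
VALUES (1, 'first', 'a row')
ON DUPLICATE KEY UPDATE title = VALUES(title), descr = VALUES(descr);
```

Note that REPLACE INTO is a delete-plus-insert, so any columns you do not supply are reset; ON DUPLICATE KEY UPDATE is the closer match for a true upsert.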

All sorts of things are possible with mysql; this article gives more information. Oracle and DB2 provide a MERGE operation: you decide the matching on existing columns and then either insert or update. All this is very database specific, and purists will shout "no portability", but what the heck, do you change your database every night? These are powerful features and they are meant to be used. I am pretty comfortable using them. Let the database take care of the data; I do not want to do all the housekeeping in my application code.

If you are using hibernate, then you have to fire custom SQL queries over the JDBC connection. You can define a <sql-insert> element for your entity, but I am not sure which versions of hibernate support it, plus I do not want to override the entity insert in all cases; only the data upload case is special. I also need to find out a way to provide named parameters as part of custom SQL queries in hibernate.
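For reference, a custom insert in an hbm.xml mapping looks roughly like this (a sketch with made-up entity and column names; as I understand the hibernate docs, only positional ? parameters are bound here, in property order with the id last):

```xml
<class name="Message" table="MESSAGE">
    <id name="id" column="ID">
        <generator class="assigned"/>
    </id>
    <property name="title" column="TITLE"/>
    <property name="body" column="BODY"/>

    <!-- custom SQL used in place of the generated INSERT -->
    <sql-insert>
        INSERT INTO MESSAGE (TITLE, BODY, ID) VALUES (?, ?, ?)
    </sql-insert>
</class>
```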

Friday, February 16, 2007

The DB2 install experience

All these years I have been working with Oracle and mysql. For an application developer, my oracle knowledge should be quite good. Until fairly recently it was not easy to get a "personal" copy of oracle, so most of my side projects used mysql. When oracle came out with the express edition (XE), I pretty much said goodbye to mysql too. So when we were "forced" to work with DB2, my initial reaction was pretty vile.

First, I was not sure whether I could install it on one of the free linux distros like fedora or opensuse, since I had a copy of the enterprise edition of DB2. My first task was to install DB2 on an intel box running open suse 10.2. So after some big download sessions involving DVD images of open suse and DB2 enterprise edition, I rolled up my sleeves and dived in. Installing open suse was, what should I say, a piece of cake? Since this box was exclusively for running DB2, I decided not to install all the 5000 nifty utilities ;o)

To install DB2 I took the easiest path, the X windows installer. The installer looks like something from another era, but it works. I just took all the defaults and ended up with a minor error: the installer was not able to create an instance. Before you begin your install, I recommend reading the quick beginnings guide for DB2 servers available from the IBM site. If you are using the X windows installer you will not strictly need it, but the information is pretty useful.

Now, this error was nothing minor. During install, DB2 dumps its log files in /tmp, and reading them was more or less no help: the errors were too generic and could have implied a range of things. This was my first "stuck-up" point. I must have spent close to 4 days trying to create an instance by hand. I am not going to waste more space telling that story, because installations are very individual things and a number of things can go wrong. In my case, the NX server I had installed on my suse box had somehow screwed up the host name, and the DB2 install was not able to resolve it. I solved this issue by contacting our resident DB2 expert.

After resolving the host name issue I was able to start and stop the database instance, but not the listener. The db2start and db2stop commands would run fine, but the listener would not start. This was my second stuck point. The DB2 service entry goes in the /etc/services file, and the SVCENAME configuration parameter should have the same value. The issue was resolved after issuing an update dbm cfg using SVCENAME 60000 command at the db2 => prompt. BTW, whenever you see db2 => or CLP mentioned in the documents, it means the DB2 command line processor. You can start the CLP by typing $ /bin/db2.
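Once SVCENAME is set and db2start has been reissued, a quick sanity check from any other machine is a plain TCP probe against the listener port. A minimal sketch (the host name and the port 60000 are assumptions matching the config command above; adjust them for your instance):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ListenerProbe {
    // Returns true if something is accepting TCP connections on host:port.
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "dbserver" is a placeholder; 60000 is the SVCENAME port set above.
        System.out.println(isListening("dbserver", 60000, 2000)
                ? "listener is up" : "listener is not reachable");
    }
}
```

This only tells you the port is open, not that DB2 is healthy, but it separates network/hostname problems from database problems, which is exactly where I lost my time.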

The next steps were verifying the install by creating the sample database and doing some selects here and there. At this point I could start and stop the database, start the listener, and connect to it. All of this was done on the DB2 server machine. One more thing to remember: switch to the DB2 instance user before you issue any commands. Next, I needed a client to use this DB2 server from my Windows XP laptop. Here I did something that I should not have done: I wasted quite a few days trying to get by with the runtime client.

Now, YMMV, but I would recommend grabbing the full DB2 client. If you intend to do any serious work with DB2, you will be happy with the set of tools that come with the full client. And at this point I am not very sure of thin JDBC driver support in DB2. Also, if your work involves porting from an existing database like Oracle and you plan to run the MTK toolkit, then you need a "fat" client. That helps things, at least while you are at the novice stage. With the DB2 client installed, it is pretty easy to set up database aliases on the client machine using the Configuration Assistant. It also includes a command editor that can be used to fire off queries.

My next stops were:
  • Using the MTK tool to port the existing Oracle schema and data to DB2
  • Porting Oracle stored procedures and triggers by hand
  • Using DB2 with Hibernate

Thursday, February 01, 2007

DB2 suffers from too much documentation

January is already gone, and that means about 8% of this year too. The last few days have been hectic because of the DB2 install. Right now we are running on Oracle, but the instruction from management is to port our application to DB2. I never had a chance to play with DB2 before. At this point we are not interested in performance and tuning; we just want to port our Oracle tables and procedures over to DB2 as quickly as possible.

The install is supposed to be easy, and I guess it should have been. But we had some issues with hostname resolution on our openSUSE box, and because of that we were not able to create DB2 instances. That issue is now resolved. However, I noticed two things about the DB2 documentation. First, there is documentation and then there is a lot of documentation, so much so in fact that your search term becomes that proverbial needle in a haystack. Second, all the errors reported by DB2 are way too generic. They won't be of much help in troubleshooting.

Maybe I need to read it cover to cover. But you know how it is when you are starting with a new piece of software: you want to get on your feet quickly and start your work straight away. It is hard to believe that I spent 4 days getting the DB up and running. Time to RTFM.

Monday, January 29, 2007

Avoid database insert on page refresh using synchronizer(Deja vu) token pattern

I am writing the DAOs (data access objects) for our web application, and they are not done yet. I have not plugged in any duplicate-prevention logic, which means you can end up with duplicate rows in the database. Now, duplicate row prevention is a topic in itself, and a lot of the time you do not want to plug it in, because preventing duplicates is a time-consuming task: a column-by-column comparison on a large table may not perform very well. Anyway, that is not the main story. The main story is that my data API allows duplicate rows, so if a user re-submits a form page by clicking the refresh button, we insert a duplicate row in the database quite unintentionally.

We are using the Struts framework for page flows. A page refresh calls the data-flushing action again, and we want to prevent that. I did a bit of googling around, and it looks like everyone is using the "synchronizer token pattern" to avoid this problem. The scheme works as follows:

When a page is requested, generate a token and render it as part of the page (maybe as a hidden field). Store the same token in the session, using the page as the key. When the page is submitted, compare the token from the page to the one stored in the session; if the two match, allow the submit. If the submit is successful, clear the token from the session (again using the page as the key).

Now, let's say the user hits refresh. The page still has the old token, so the token from the page will not match the token in the session, and we do not allow the submit. If the user requests the page again through the normal workflow, we generate a new token and everything is okay dokay ;o) If I am not feeling very lazy, I will try this pattern with a simple form and servlet.
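The token bookkeeping described above is easy to try outside of Struts. A minimal sketch in plain Java (the class and method names are mine, and a HashMap stands in for the HTTP session):

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

public class TokenStore {
    private final Map<String, String> session = new HashMap<>(); // stands in for HttpSession
    private final SecureRandom random = new SecureRandom();

    // Called when the form page is rendered: generate a token, remember it
    // under the page name, and return it so the page can embed it as a
    // hidden field.
    public String issueToken(String page) {
        String token = Long.toHexString(random.nextLong());
        session.put(page, token);
        return token;
    }

    // Called on submit: the action is allowed only if the token from the
    // form matches the one in session. A successful match also consumes
    // the token, so a refresh (which resubmits the same token) is rejected.
    public boolean allowSubmit(String page, String tokenFromForm) {
        String expected = session.get(page);
        if (expected != null && expected.equals(tokenFromForm)) {
            session.remove(page);
            return true;
        }
        return false;
    }
}
```

The first submit with a freshly issued token passes; replaying the same token (the refresh case) is rejected until the page is requested again and a new token is issued.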

Struts has built-in support for synchronizer tokens.

Sunday, January 21, 2007

create myspace.com in 4 hours? Part II

Okay, forget myspace.com. Let's see what it takes to create and run a very simple online application. Our requirements are simple (?).

Requirements

  • Let the user push some data via a web form to a database table.
  • Later the user can query back the same data.
  • One DB Table
  • The application is accessible online on some domain
  • Application works 24x7 with small maintenance windows
  • Anyone can access it
Now let's try to see what steps are involved in development, starting with a virgin machine. I am going to use Perl CGI.

First up, we need to set up a development stack: a database, a web server, and a programming toolkit of your choice. So download MySQL, download Apache, and download Camel Pack Perl. If you are on a Linux box, most of the stack comes ready made (I do most of my development on a Windows XP box). But creating the right stack is definitely an activity that takes up some time. You may have Perl but not the modules you want; you may have Java but not all the libraries you want; you may have version x when you want version y; and so on.

After the downloads comes the configuration part. Who has ever run Apache or MySQL just out of the box? You do have to make configuration changes. Even creating a single database, 2 users, and a single table, and configuring Apache to handle CGI from your directory, takes some time.

Let's come to the form development part now. Say you are a very smart designer who knows CSS, HTML, and JavaScript like the back of your hand. You can create both HTML pages very quickly in a three-column layout. Still, you need to make sure that the elements on your form are aligned, that you have put in the right JavaScript for form validation, and that navigation works from search to input and from input to search.

Let's say you are a very efficient server-side coder too. You can create the 2 Perl CGI scripts in a jiffy: one using DBI to store data in the table, and the other to read the data back using search tokens. You are a smart developer, so you get everything right on the first go: checking for invalid inputs, handling case conversion when searching, taking care of XSS issues so people cannot paste JavaScript URLs into your form, and so on. But still, typing everything into an editor takes time, doesn't it?

Finally, our two forms are ready to be deployed. We input some data, check the database, and see that our table is populated. Now remember that the application is out there in the wild and anyone can do anything to it, so we need to test the forms a bit with all kinds of edge cases: numbers-only names, string-only dates, etc. All goes fine, with an occasional bug here and there, and we are now ready to upload.

To deploy, you need a domain name and some server space. Let's say I order both from Yahoo. You get your stuff up on the hosting server. Getting the domain registered and propagated will also take some time. But once I can access my forms from my very own URL, is everything over? Of course not. Let 10,000 people put their data in your tables and your searches will literally crawl!!! What happened here? You forgot to analyze your tables, didn't you? And how do you take your site offline while maintenance is going on; do you know that?

Anyone can access your application. What if some guy uses a bare-bones browser on a NetBSD machine? What happens to all the groovy Ajax validation stuff then? Now you see, we need to put browser-match rules in the Apache config as well.

Moral of the story: slapping together 2 web forms with some server-side script is no big deal. But designing and deploying an application that can be accessed by anyone and used in any which way is definitely a big deal. Online applications look simple, but they are not!

create myspace.com in 4 hours? Part I

Today I came across a posting on digg, which I will quote verbatim: "A site like myspace.com can be put together in 3-4 hours too. Ditto for del.icio.us and a lot of the other high priced ones." Having done a few web applications myself, I really know how much of an understatement that is. Why is it that people learn some scripting and rudimentary SQL and think that putting together a decent web application is no big deal? Have they ever tried doing a web app end to end, or are we talking superheroes here?? Let me whip out my requirements and some estimates and see how long it would take to do a simple page end to end.

Wednesday, January 17, 2007

Fault tolerant email queues

I had written an application that could send emails, using the JavaMail API. However, at that time my requirement was "just" to deliver an email. The usage was not high, and our application was on the same network as our SMTP server; sending a few emails, few and far between, did not turn out to be an issue! Today, however, I want to write a component that can send higher volumes of email (something like 20,000 in one shot). My earlier program will surely run out of steam here.
The program should try to resend failed messages. If one message fails to send, it should just move on to the next one. Some smartness may also be required, like checking domain names. We should have some way to mark delivered messages, and of course we need some kind of queue implementation.
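The queue behaviour described above, failed sends re-queued and retried without blocking the rest, can be sketched in plain Java. All the names here are mine, and the Sender interface is a stand-in for a real transport (e.g. a JavaMail send call):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class RetryingMailQueue {
    // Stand-in for the real transport (e.g. JavaMail). Hypothetical interface.
    public interface Sender {
        void send(String recipient, String body) throws Exception;
    }

    private static class Item {
        final String recipient, body;
        int attempts;
        Item(String recipient, String body) { this.recipient = recipient; this.body = body; }
    }

    private final Queue<Item> queue = new ArrayDeque<>();
    private final Sender sender;
    private final int maxAttempts;

    public RetryingMailQueue(Sender sender, int maxAttempts) {
        this.sender = sender;
        this.maxAttempts = maxAttempts;
    }

    public void enqueue(String recipient, String body) {
        queue.add(new Item(recipient, body));
    }

    // Drains the queue once. A failed send does not block the rest: the
    // message is re-queued until maxAttempts is reached, then dropped.
    // Returns the recipients delivered on this pass (the "mark delivered" part).
    public List<String> drain() {
        List<String> delivered = new ArrayList<>();
        int toProcess = queue.size();
        for (int i = 0; i < toProcess; i++) {
            Item item = queue.poll();
            try {
                sender.send(item.recipient, item.body);
                delivered.add(item.recipient);
            } catch (Exception e) {
                if (++item.attempts < maxAttempts) {
                    queue.add(item); // retry on a later pass, move on now
                }
            }
        }
        return delivered;
    }

    public int pending() { return queue.size(); }
}
```

A real component would persist the queue and back off between passes, but the skeleton shows the essential point: one bad address out of 20,000 should cost one retry slot, not stall the run.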
My first reaction nowadays is to look for a ready-made component. But surprise, surprise!!! So far I have not been able to locate any open source project that fulfils my requirements. I am approaching this problem from three angles:
  • See if some message queue or enterprise bus can be used. I do not want to do a lot of configuration etc., because my only mode of transport is email, and one-way at that! There are a number of open source message queue implementations; I am looking at Mule in particular.
  • See if some MTA like qmail fits the bill. If there are APIs for the MTA, then I may be through. After all, people have been using /bin/sendmail to send emails since time immemorial.
  • Turn to search.cpan.org as usual (when I am really desperate).
The final option is to write the component by hand. I am ready for that too, but the question to ask is whether I can trust JavaMail for a scalable job. Still a way to go .....
© Life of a third world developer
Maira Gall