Optimizing MPIs for multi-cores : Thoughts

>> Sunday, June 28, 2009

MPI is still the programming model for most scientific applications that run on supercomputers. Yes, one could argue that MapReduce- and Dryad-style models are coming up, but they are good for embarrassingly parallel apps and they don't exploit the fast interconnects among nodes. Anyway, that's a different debate.

When we look at current scientific apps running on supercomputers, they are already parallelized and exploit most of the MPI constructs. The MPI programming model encourages programmers to send messages and pass data around the network. But the problem with MPI applications is that when we increase the number of CPUs, we sometimes get diminishing returns, and at some point adding CPUs has a negative effect and reduces performance (anti-scaling). The reason is that the amount of data transmitted over the network grows with the number of CPUs, until the communication overhead exceeds the actual computation and saturates the network.

Recently I got to talk with scientists from three different domains and all of them were whining about the network saturation. The naive solutions to this problem are

  1. change their implementation algorithms to reduce the communications (oops, I thought we wanted to exploit the fast interconnects)
  2. optimize the MPI usage by using the proper MPI constructs. For example, use MPI_Reduce rather than having every process MPI_Send its data to a single node that reduces it and then does an MPI_Bcast. Remember that MPI_Reduce can exploit the network topology and perform a hierarchical reduction.
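
To see why a hierarchical reduction helps, here is a minimal sketch in plain Python (no MPI involved) that compares the number of sequential communication steps when every process sends to one root against a pairwise tree reduction. The function names are made up for illustration, and the pairwise tree is an assumption: real MPI libraries pick their own reduction algorithm, and the input is assumed to be non-empty.

```python
def flat_reduce_steps(num_procs):
    """Root receives one message from every other process, one after another."""
    return num_procs - 1


def tree_reduce(values):
    """Sum values pairwise in rounds; returns (total, number_of_rounds)."""
    current = list(values)
    rounds = 0
    while len(current) > 1:
        # Neighbouring pairs combine in the same round; an odd leftover passes through.
        nxt = [current[i] + current[i + 1] for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            nxt.append(current[-1])
        current = nxt
        rounds += 1
    return current[0], rounds


if __name__ == "__main__":
    total, rounds = tree_reduce(range(800))
    print("flat steps:", flat_reduce_steps(800), "tree rounds:", rounds, "sum:", total)
```

The flat version grows linearly with the number of processes, while the number of tree rounds grows only logarithmically, which is the intuition behind preferring the collective call.
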
But with the emergence of multi-cores, this problem gets a better solution, in addition to the naive ones. The idea is to mix the SMP model with the MPI model, by using multi-threading within a node and using MPI for inter-node communication.
(I know this idea is not new [1], [2], but it was still very exciting to me, especially given some of my current experiences with it.)

Let's say we have a 100-node cluster with 8 cores per node (when I first got to use a 2000-core cluster, yeah, I was nervous and excited too :) ). This lets us run a job on 800 cores, with all 800 cores banging on the network. But rather than scheduling at the core level, how about scheduling at the node level, with one MPI process on each of the 100 nodes and 8 SMP threads within each node (still 800 threads of execution)? This can cut down the network traffic by up to 8 times (the number of cores per node), depending on the application. Not only that, we might see an increase in performance, since within each node the 8 threads share the same address space and communication among them will be much, much faster.
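
As a rough check on that "up to 8 times" figure, here is a small sketch that counts messages crossing the network for one simple pattern: every worker contributes a value to a single root. The pattern and the function names are assumptions chosen for illustration; other communication patterns will give different ratios.

```python
def offnode_messages_flat(nodes, cores_per_node):
    """One MPI rank per core: every rank outside the root's node sends one message."""
    return (nodes - 1) * cores_per_node


def offnode_messages_hybrid(nodes):
    """One MPI rank per node: threads combine locally, each non-root node sends once."""
    return nodes - 1


if __name__ == "__main__":
    flat = offnode_messages_flat(100, 8)
    hybrid = offnode_messages_hybrid(100)
    print(flat, hybrid, flat / hybrid)
```

For the 100-node, 8-core example the ratio comes out to exactly the number of cores per node, which is the best case mentioned above.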

Exploiting this requires a change in the application too. One has to use the main thread, in each node, for MPI communications and carefully manage multiple threads within it. The challenges with this hybrid approach include, but are not limited to:

  1. Handling multiple threads. Yeah, we all know how cumbersome that is.
  2. With some implementations, only the main thread is allowed to make MPI calls. So there can be a slight bottleneck if the threads try to talk to processes that are not in the same address space.
  3. The main thread has to manage all the threads, divide the work/data among them, aggregate their results before sending to other nodes, control intra-node communication, etc.
  4. Debugging. Believe me, it's very hard and painful.

I think MPICH2 and OpenMP already enable the hybrid SMP and MPI model, but I'm not sure how good or convenient they are.
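
The shape of the hybrid approach can be sketched in plain Python. This is only a structural illustration under stated assumptions: `node_local_sum`, `run_node` and the `outbox` list are hypothetical names, the list stands in for the single MPI call a node's main thread would make, and Python threads will not give a real speedup for CPU-bound work. A real implementation would more likely use OpenMP or pthreads alongside MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def node_local_sum(data, num_threads=8):
    """Split the node's data across worker threads and combine their partial sums."""
    chunks = [data[i::num_threads] for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


def run_node(data, outbox, num_threads=8):
    """Main thread: farm out the work, aggregate, then do the single 'send'."""
    node_total = node_local_sum(data, num_threads)
    # Stand-in for the one MPI call per node, made only by the main thread.
    outbox.append(node_total)
    return node_total


if __name__ == "__main__":
    outbox = []
    run_node(list(range(100)), outbox)
    print(outbox)
```

Keeping all communication in `run_node` mirrors the restriction in item 2 above: the workers never touch the network, so only one thread per node needs to be allowed to make MPI calls.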

What I would like to have is something like DryadLINQ, where users write at a somewhat higher level of abstraction and the framework optimizes based on the network topology and machine configuration. But I guess we will have to wait for the next generation of programming models/frameworks to see things like that.


Why do they talk about a genocide?

>> Friday, May 22, 2009

It was very frustrating to listen and see the lies some of the people are spreading around the world about a Tamil genocide. There is no problem to any Tamil in Sri Lanka.

(If I have money) I can personally bring anyone (of course non-terrorist) to Sri Lanka. There is no problem. Politically, Tamils sometimes have more power than Sinhalese.

So it is worthwhile to look at why these people are lying like this. If we look at the background of these people, it is not that hard to find the reason. All the people who talk about a genocide and protest in Washington, DC, Canada or the UK are not now living in SL. None of the Tamils now in Sri Lanka talk about this genocide, because they know it's not happening.

Then why are these people outside of SL talking about this? It's easy. Take a look at the conditions to be satisfied to get refugee status.

The United Nations High Commissioner for Refugees (UNHCR) defines a "refugee" as "a person who has fled his/her country of nationality (or habitual residence) and who is unable or unwilling to return to that country because of a "well-founded" fear of persecution based on race, religion, nationality, political opinion or membership in a particular social group."
But it excludes "those who have left their homes only to seek a more prosperous life."

Clearly this is the reason. They came out of Sri Lanka, obviously to seek a more prosperous life. Most South Asians will try to leave their country for Europe or N. America for this reason. But it's hard to get into these countries if you don't have proper qualifications. The easiest way is to claim refugee status or political asylum.

Complaining about genocide is the easiest way to claim "fear of persecution based on race, nationality". Here we go. When LTTE was there, they were telling this because anyway LTTE was killing tamils and making the lives of those people miserable. Now when that is gone, they need a different reason to stay in their countries. So the new "mantra" is Tamil genocide in SL by SL government.

A very small percentage of these people can easily use "fear of persecution based on membership in a particular social group", because this small percentage is LTTE henchmen. But unfortunately for them, most of the countries banned LTTE as a terrorist organization. So they can't use it. Their savior is the "Genocide Mantra".

Come on, even now, how much are you paying monthly as bribes to all the LTTE henchmen in the country you are living in? How many have refused, and what happened to them?
Why are these[1, 2] henchmen, trying to prove LTTE leader is alive? Look at what all the tamils have contributed so far and how much do you think they have in the bank? Do you think those henchmen like to give up their free and luxurious life, that they enjoy using your money? Forget it. 

Please don't blame Sri Lanka for your own goodness. Let us build our small country. We've been suffering for more than 30 years. SL natives, born outside SL, and has never been to SL, please just come once and see how good SL is.

If Sinhalese are killing tamils, think about these.
1. How many tamil ministers are in SL cabinet and how many tamils MPs are in parliament right now?
2. Is there any place in Sri Lanka that a Tamil can not go? But during LTTE days, there were lots of places a Sinhalese or a Muslim couldn't go.
3. Colombo is the commercial capital of Sri Lanka. How many Tamils are living in suburbs of Colombo (Wellawatta, Kotahena, Mattakkuliya, Kirulapone, Fort area, etc.,) ?

These are only few questions.

Its ok if you can not come back to Sri Lanka. Everyone likes a good life. But please don't blame the country for that.


Sri Lanka - A Land Like No Other

>> Monday, May 18, 2009

Now the war is over and Sri Lanka is a completely safe country for anyone and everyone in the world.

When I was in school, all the books mentioned that Sri Lanka is a beautiful country. But not having seen or been to any other country, I didn't know how beautiful it is. Now, having gone to different places outside Sri Lanka, I think I know how beautiful it is.

Everywhere it is green. Lots of waterfalls, trees, mountains, fantastic beaches, friendly people (completely opposite of New Yorkers ;) ), wildlife, flora and fauna. Yes, now I know how beautiful it is. When people asked about Sri Lanka, I used to explain all of this. But when they asked about the war, I was shaky. Now things are different; the war is over. No terrorists to be afraid of. Just one nation united under one flag, with Sinhalese, Tamils and Muslims. Sri Lanka is one of the best and safest places to visit on this earth.

Look at this video I posted some time back and see for yourself. So when you plan your next vacation, include Sri Lanka.
And oh, my Sri Lankan native Tamil friends who have never visited Sri Lanka: stop protesting at all these junctions in the US, Canada and Europe. Now is the time to go and visit your beautiful country with your friends.

(Oh yes, I will be there as soon as I get money to buy three tickets :D )


Clouds vs (Scientific Computing, HPC and MPI)

>> Monday, May 11, 2009

Cloud computing seems to have a bad reputation among the HPC community. Maybe that's because most of the computer scientists who comment on it come from grid computing and MPI backgrounds and are still skeptical about clouds.

It is important to note that HPC applications, which require parallel processing algorithms and software, are only a subset of the problems within scientific applications. Even within HPC, MPI* (see note below) is just one pattern of doing HPC.

Some researchers have argued that clouds might not be good enough for running MPI applications. That can be true to a certain extent, because of the virtualization overhead and possible network latencies. But that doesn’t in any way mean clouds are bad for every scientific application, simply because MPI apps represent only a small percentage of scientific applications.

If you know about workflow engines, Taverna and Kepler are well-known systems. These systems are pioneers in running workflows on your desktop or on a single server. Most of the workflows hosted on myexperiment.org are SCUFL-based Taverna workflows. All or most of these workflows don’t require MPI or HPC resources to run. Yet they are scientific applications.

One could argue that most of the workflows hosted on myexperiment.org are bio workflows. Yes, but there are lots of other scientists using Taverna and Kepler who have monolithic workflows, which can easily be run on clouds.

Also, there are lots of scientists who don’t have access to any of the national or research grids, and who don’t have very large clusters to run their experiments.

Even if you own a cluster, having more on-demand resources might be useful when you need the data from an experiment for a paper whose deadline is a few days away.

The important thing I wanted to point out is this.

It is wrong to say in general that clouds are bad for scientific applications. It might be true for a certain class of applications (like MPI), but there are tons of other scientific applications that can benefit from cloud resources.

 

*Let me clarify this a little bit. For most people, when they say MPI, it is implicit that message passing among workers is a must. But at the same time, gurus who oppose MapReduce, which has little or no message passing, consider MapReduce to be a special case of MPI. So when I say MPI here, I mean algorithms that require high bandwidth for message passing.


Road Trip to Redmond, WA from Bloomington, IN – Day Notes

>> Sunday, May 10, 2009

May 3rd

Approximate Distance : 514 miles
Destination : West Des Moines, IA
Map

Started early in the morning (around 11.30am :) ) from Bloomington. The GPS wanted me to go through Indianapolis, but I wanted to follow what GMaps suggested, which was to go through IN-46 and IN-43.

Passed through the other Bloomington in Illinois. Passed through Illinois and entered Iowa. Iowa had well maintained rest areas with wireless internet.

May 4th

Approximate Distance : 614 miles
Destination : Rapid City, SD
Map

This was the most hectic day for everyone. The whole drive through South Dakota was scary and secluded. No fast food for 150 miles. Rest areas and gas stations were far apart. We could see for miles down the road without a single vehicle in sight.

But the good thing was that, for the first time, the speed limit was 75mph.

May 5th

Approximate Distance : 539 miles
Destination : Livingston, MT
Map

Early in the morning we went to see Mount Rushmore. Since it was noon by then, we had to give up the idea of visiting the Crazy Horse monument.

This time the GPS suggested taking US-212 from Spearfish, SD, as it is a more direct route than I-90 towards Hardin, MT. The total distance on US-212 was 248 miles, but I saw only one gas station over the whole distance. It was very secluded, and if something had happened to us, we could have been waiting hours for someone to come. Even though this route saved about 1.5 hrs, I don’t recommend anyone take it if you are not sure about your vehicle.

May 6th

Approximate Distance : 146 miles
Destination : West Yellowstone, MT
Map

Drove towards Yellowstone park through Gardiner, MT. It was raining and cold. Visited the Norris and Canyon Village areas and Madison. Unfortunately, most of the time I was the only one driving in that direction to most of these places. Because of my past experience with deer crossings, I was also scared to drive fast.

Even though we never encountered bears or wolves on the road, park rangers warned us about them, as they were coming out of hibernation and were really hungry and dangerous.

May 7th

Approximate Distance : 152 miles
Destination : Bozeman, MT
Map

Drove to the Old Faithful area and watched the beauty of it. Also had the chance to see many geysers and springs around that area. On our way to Grand Geyser, we saw a wolf chasing a herd of bison and a calf. But it gave up the chase and came towards us, scaring all the people near us. Fortunately nothing happened.

When we were coming back, a snow storm came in and the rangers had to temporarily close the road out of Old Faithful.

May 8th

Approximate Distance : 477 miles
Destination : Spokane, WA
Map

This was a very interesting drive through MT, ID and WA. I have been a fan of Need for Speed for a long time, but never had a real chance to experience it. Between St Regis and Henderson, MT, I really got the chance to do that sort of driving. US roads are excellently engineered around turns, so that one can take the bends at the posted speed + 10mph without any problem. But this area had many bends within a very short distance, making it a very, very interesting NFS-like drive.

Idaho was beautiful too when we went through the Rocky Mountains. I-90 through Idaho was more beautiful than any other place we had driven so far. But the speed limit dropped to 60mph in lots of places.

Since we reached Spokane early, we had the chance to visit Riverfront Park and take the skyride to see Spokane Falls.

May 9th

Approximate Distance : 281 miles
Destination : Redmond, WA
Map

Being the last leg of the journey, I was sort of happy.

Most of I-90 through Washington, up to Snoqualmie Pass, was just like Montana or South Dakota: no gas stations or fast food places, but there were lots of vehicles on the road.

Snoqualmie Pass was very beautiful with all the snowy mountains and lakes.

Photos of the Yellowstone park visit and the road trip can be found on my Facebook profile.

The next challenge is the drive back to Bloomington, IN after my internship.


Road Trip to Redmond, WA from Bloomington, IN – Summary

This sounded a little bit crazy to me (initially) and to most people, but I really wanted to do this because:

  1. Last time I was at MSR, one of my fellow interns had done this with his family, including 3 children.
  2. I always wanted to go across the USA to see different states, outside of suburban areas.

When I got the chance to go to MSR again this time, I thought of driving there. A big thank-you should go to Fazni for encouraging me, and to Thushari and Dihini for staying with me at all times.

Summary

Total Distance : 2723 miles (including drive within Yellowstone)

Time: 7 days (with total driving time of about 45hrs)

Fuel Consumption :

Total Gallons : 92.7

Average Gas Price : $2.2225 (lowest in Indiana $1.999, highest in Washington $2.4)

Maximum Speed : 75mph + allowance (which I’m not gonna mention here ;) )

Driving Experience :

Best area to drive : Between St. Regis and Henderson in Montana. This area gave me the real feeling of Need for Speed, taking very sharp turns at 80mph (without drifting) :)

Worst area to drive : South Dakota and US-212. Damn it, no gas station, food or even a human for hundreds of miles. Sometimes I was scared to death.

Places visited : Yellowstone National Park for two days, Mount Rushmore and Spokane Falls.
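
For anyone who wants to play with the numbers, here is a small sketch that re-derives a few figures from the summary and the day notes. The variable names are mine, and the 45 hours of driving is only the approximate figure given above, so treat the derived values as rough.

```python
# Daily distances from the day notes, May 3rd to May 9th (miles).
daily_miles = [514, 614, 539, 146, 152, 477, 281]
total_miles = sum(daily_miles)  # should match the total in the summary

gallons = 92.7
avg_price = 2.2225   # dollars per gallon
driving_hours = 45   # approximate

miles_per_gallon = total_miles / gallons
fuel_cost = gallons * avg_price
avg_speed = total_miles / driving_hours

print(total_miles, round(miles_per_gallon, 1), round(fuel_cost, 2), round(avg_speed, 1))
```

The daily distances add up to the 2723 miles reported, which works out to roughly 29 miles per gallon and a bit over $200 in fuel for the whole trip.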



We need impartial SOAP stack comparison

>> Sunday, April 26, 2009

I've been asked by a couple of friends who are evaluating different open source Web services stacks to be used within their projects. Since I've been an Axis2 developer for a long time, it is fair enough to label me as biased.
But ... when I searched online for these comparisons, I didn't see a single comparison comprehensive enough to answer my friends' questions. Most of the comparisons out there

1. are biased towards one particular implementation that the author is affiliated with or "paid" by
2. distort certain facts about other engines
3. don't contain a full and comprehensive comparison.

These comparisons are similar to the benchmarks various hardware vendors provide to highlight the features of their hardware. What I'm asking for is a project (maybe a good idea for a Google Summer of Code project) where someone talks to most of the communities, at least the major ones like Axis2, XFire, CXF and Metro, and then comes up with a good document.

Having said that, let me answer some of the comments about Axis2 from my point of view, assuming these will be useful for a future impartial comparison.

  • CXF has Spring support built in, which I don't think Axis2 has. Axis2 has Spring support, but I don't think it is first-class support.
  • This thread mentions that Axis2 always has proprietary APIs and promotes JAX-WS, JAX-RPC etc. I agree with most of the points mentioned in that thread, but I don't think we need to have first-class support for the Java APIs. Practically, only a few people care about it. All that matters for most developers is to get their Web service deployed easily or to invoke a Web service easily. I don't think they will want to learn one more set of Java interfaces to do that. The stubs and skeletons that Axis2 generates will be more than enough for their requirements, at least IMHO.
  • This blog post highlights XFire for its Spring 2.0 XML support, RESTful service support, JSON support, and support for some of the WS specs as well.

Axis2 REST support

In Axis2, we are the pioneers of supporting REST with Web services. Axis2 was the first Web services stack to support a standardised way of exposing Web services, using WSDL 2.0 HTTP bindings. During the last WSDL 2.0 interop, held in France in 2006, only we had a full implementation of the WSDL 2.0 HTTP binding. One could argue that this is not REST (which I also agree with to some extent). But for most people REST is mostly XML over HTTP.
Again, one might say no one uses WSDL 2.0. Maybe true, but we can use the same methods to expose some of the Web services with REST, even without WSDL 2.0.

Axis2 JSON support

Yes, Axis2 has JSON support. Not only that, it has the flexibility of integrating with any type of transport (like HTTP, SMTP, XMPP, TCP, UDP, etc.) or any type of message formatting (JSON, XML, binary XML). Currently we have implementations for all of these. To give proper credit, we initially evaluated the XFire implementation for JSON support, but I'm not sure whether we ended up using it.

WS-* spec support

It is very disappointing to hear the argument that Axis doesn't have WS-* spec support. Any Web services engine must have support for WS-* specs. This blog mentions that Axis2 has support for WS-* specs using add-on modules. That's the whole point of Axis2. We didn't want to put all the spec implementations within Axis2, and I don't think anyone can do that. We wanted to provide a solid framework so that any spec can be implemented, and that was one of the main design decisions we have had since the first Axis2 F2F.
One can write modules implementing various specs and just drop them into Axis2, which I think is very cool. The same blog complains that one needs knowledge of Ant to use Axis2. Yeah, I know people who know Axis2 will laugh now. We auto-generate Ant build files to make it easier for the developer to use the generated code. It doesn't mean one has to use them. Also, using Ant is trivial; what is hard about it? If you don't like Ant, I'm not sure how good you will be at C development with makefiles :)

Finally, the only post I found interesting was from Bjorn. But I think it should be more comprehensive. All in all, I think most of the stacks provide almost the same thing. It all depends on the requirements, maybe in terms of performance, features, extensibility, APIs, etc.
I even agree that Axis2 is cumbersome to use for smaller tasks. I think that may be the reason why Amazon used XFire for their SOAP API implementation on EC2. Being open-minded rather than being tied to a certain framework will always help you get what you want.

(I hope no one will bile-blog after this; I'm hoping to get constructive ideas on this matter.)


Can We Embed a GPS Tracking Device in to Credit Card/Wallet

>> Saturday, March 28, 2009

When I lost my purse last week, this was the first thing that came to my mind.

I think most of us forget where we kept our wallet, and that mistake is causing me tons of trouble these days.
If a small GPS chip (which can push its location or be polled for it) could be embedded in our wallet or credit card, I think it would be a great idea. If we could log in to our credit card company's website and ask it for the GPS location of the credit card, it would be really cool for finding a lost wallet and/or credit card.

But this also poses a challenge, as it would let someone hack into the system and track anyone they wish. Given enough security around the application, this would be a nice feature for me to find my lost wallet ;)
