Sunday, July 20, 2008

What is eScience ?

(This will help me explain to my friends what I am currently working on ;) )

Disclaimer : this will be a basic introduction and might not be sophisticated enough for experts in the field.

In simple terms, eScience is where computer scientists work together with scientists from other fields to solve their problems efficiently. In my view, there are two things that are being referred to as eScience these days.

1. Computer scientists apply their algorithms and knowledge to other science fields. For example, one could apply methods like neural nets, machine learning, etc., to the medical field to devise solutions efficiently in that area.
Even though most people don't see this as part of eScience, having been to a talk by David Heckerman, I agree with him that it is.

2. There are algorithms that require a large amount of computational power and time, or that act on large amounts of data. For these algorithms to work, or for these terabytes of data to be mined, one might need the help of supercomputers.

- handling these large amounts of data
- executing those algorithms on the data
- enabling scientists to work with the data through GUIs, workflow engines, etc.

All of this is also regarded as eScience.

This is what is emphasized by most people, and by Wikipedia as well.

I think I am also more into the second area, so I will explain a bit more about that.

Think about the following scenario, related to meteorology, to understand the use case.

A country might have a large number of weather stations reporting various weather conditions to a central location. In the case of the US, IIRC, there are about 144 weather stations. Each weather station sends data, say, once an hour. If the size of the file sent by each weather station is about 1GB (this value will depend on the resolution of the measurements), then we will get about 150GB per hour. There are algorithms that go through this data and mine it to find interesting weather conditions. For example, one algorithm will find, say, a set of storms using the data. Since this first phase acts on each station's data separately, there has to be another algorithm to aggregate the results. If the first algorithm reports 5 storms, a few of them may actually be the same storm. Likewise, there are different algorithms that can be run on top of this data.
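As a rough back-of-the-envelope check of the volumes in this scenario, here is a small Python sketch. The station count and the 1GB file size are the guesses from the paragraph above, not measured values, and the per-day figure simply assumes one report per station per hour.

```python
# Rough estimates taken from the scenario above; none of these are measured values.
STATIONS = 144          # approximate number of stations (recalled from memory)
GB_PER_REPORT = 1       # assumed file size per station per report
REPORTS_PER_DAY = 24    # one report per station per hour

gb_per_hour = STATIONS * GB_PER_REPORT
gb_per_day = gb_per_hour * REPORTS_PER_DAY
tb_per_day = gb_per_day / 1000  # decimal terabytes

print(f"{gb_per_hour} GB per hour, about {tb_per_day:.1f} TB per day")
```

So the "about 150GB per hour" figure adds up to a few terabytes every day, which is why a single desktop machine is not enough here.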
Scientists can either run their algorithms on this data one at a time, or they can define workflows to run on it. For example, they can design a workflow which will

  1. first mine these data, find interesting conditions
  2. cluster them to identify unique conditions
  3. talk to individual weather stations to get more data, if needed
  4. come up with a scenario explaining the current conditions
  5. predict on the path of the storm or behaviour
Since all of this has to be carried out in a timely manner (you don't want to get today's weather forecast tomorrow, right ;) ), and the data sets involved are large, high performance computers are required.
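The first two steps of such a workflow can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the report fields, the 90 km/h wind threshold, and the 100 km merge radius are all hypothetical, and real storm detection and clustering algorithms are far more involved.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    station: str
    x_km: float
    y_km: float


def mine(reports, wind_threshold_kmh=90):
    """Step 1: look at each station's report on its own and flag possible storms.

    The wind threshold is a hypothetical stand-in for a real detection algorithm.
    """
    return [
        Detection(r["station"], r["x_km"], r["y_km"])
        for r in reports
        if r["wind_kmh"] >= wind_threshold_kmh
    ]


def cluster(detections, radius_km=100.0):
    """Step 2: merge detections that lie close together into one storm.

    A detection joins the first storm whose first member is within radius_km;
    otherwise it starts a new storm.
    """
    storms = []
    for d in detections:
        for storm in storms:
            first = storm[0]
            if hypot(d.x_km - first.x_km, d.y_km - first.y_km) <= radius_km:
                storm.append(d)
                break
        else:
            storms.append([d])
    return storms


# Made-up reports from four stations.
reports = [
    {"station": "A", "x_km": 0, "y_km": 0, "wind_kmh": 120},
    {"station": "B", "x_km": 40, "y_km": 30, "wind_kmh": 110},
    {"station": "C", "x_km": 800, "y_km": 600, "wind_kmh": 95},
    {"station": "D", "x_km": 300, "y_km": 300, "wind_kmh": 20},
]

detections = mine(reports)
storms = cluster(detections)
print(len(detections), "detections,", len(storms), "unique storms")
```

With this made-up data, stations A and B see the same storm, so three detections collapse into two unique storms. In a real eScience setup each step would run as a job on a cluster or grid, and the workflow engine would chain them together.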

To perform the above-mentioned tasks, there has to be some infrastructure which enables the users to
  1. design, execute and monitor workflows
  2. perform data movements to and from computing resources. These movements are not easy, as they involve not only large amounts of data but also supercomputers, data centers, etc.
  3. schedule and monitor jobs in high performance computing environments like clusters, grids, etc.
This whole environment can be regarded as an eScience environment. This is just one example, and there are lots of problems like this in bioscience, neuroscience, aerospace, etc.

Friday, July 18, 2008

Places to visit in Washington State - Mt St Helens

Location : Johnston Ridge Observatory, At the end of Spirit Lake Memorial Highway, WA

Directions : Google Maps, About 3 hrs from Bellevue, WA

For GPS : 46.276258,-122.216721

Link :
en.wikipedia.org/wiki/Mount_St._Helens

This was one of the most interesting trips I have taken with my family. The road to Mt St Helens was full of fascinating scenery.
After we exited I-5, the road went through a small town, and after that it was full of sharp turns. At one point there was a sign saying it was the last place to get gas. It was 37 miles from that point, but I didn't realize we would be gaining elevation and my car would have to do extra work. (Thanks to the Corolla's fuel efficiency, I didn't run out of gas :) )
There are a couple of viewpoints on the way, and most of them were gorgeous. There was one place where you can see the path of the mud and lava flow.
When we got to the Johnston Ridge Visitor Center, the view was great. Since it was a sunny day, we could see the whole mountain without any trouble. There are a couple of trails led by rangers, and one of them goes towards Spirit Lake. The visitor center also had some movies playing in a theater.

This is a combination of three photos, showing the mighty Mt St Helens and the living crater.


Mt St Helens and the lava and mud flow path



360° view around the Mt St Helens area. If you look at the surrounding mountains, you can still see some burnt trees.


Monday, July 14, 2008

Google thinks I am a "virus"

I was searching for a grocery store in my area on Google, and this is what I got as the result.

"We're sorry .. but your query looks similar to automated requests from a computer virus or spyware application. .... "

It seems someone has messed up the automatic spyware detection. Could this be due to an error in IE?

Update : I just tried a couple more queries, this time using Firefox, and I got the same error. So it is something happening beyond my machine. It could be the internal network, or Google is messing up.



Tuesday, July 08, 2008

Open-source vs source-open

I asked one of my friends what open source means to him. He said that if he has the source, he is fine with it. Is this the meaning of open source software? Where is the community component?

If someone builds a piece of software in-house and puts it out with the source, is this open source? I personally think there is something missing.

There seems to be a trend in larger projects for customers to demand the source. Large clients in Europe (like governments) especially tend to lean towards open source software. So most companies are trying to exploit this by putting something out as their source.

What is the meaning of this? In my personal opinion, people should like open source because it is/was a community effort and not the work of a single company. In these sorts of projects, if one contributing company drops out, the clients still have other options. There will also be competition, and better code and a better product through synergy and open discussions. Since users are also involved in this process, the ultimate product will be what users need.
If company A cannot afford to build a piece of software alone, they can create a community around it and build it together. This will benefit the company and other people as well.

But if you write your code internally, make all the decisions yourself and then put it out, it is just like the automotive industry in the 1970s. Customers get what the company wants and not what they want. Even if the clients get the source, it will be crappy most of the time :)

Apache has this nice rule where a project needs at least 3 different players to be recognized as a project within Apache. One of the reasons for this is to make sure companies won't just dump some code and then claim it is open source.

Monday, July 07, 2008

What is Open source software (to me)

(Warning: I might be biased by my experience in different Apache projects)

I have been doing some background work on this for a while, asking different people and searching the web to understand what people really expect. I was somewhat pissed off by this definition here.

Why do we contribute to open source projects? Do we need something in return other than the satisfaction? (Sometimes the visibility is what really matters when you apply for higher studies or jobs, but those are secondary.)

I was really happy to see thousands of users posting questions on axis-dev, commons-dev and other mailing lists of my projects in Apache. We have done something for the betterment of their work and of the world. Do we need to restrict them? Why do we wanna say that if you use this, you need to make your stuff open source too, or in other words, "dance to my rhythm"? Bullshit !!

Are we trying to make the whole world open source, and by doing so create a different and completely secluded camp? What is the point there? We have to be practical and give something to the people out there.

Why do we wanna enforce viral licenses? My main motivation is the satisfaction I get out of it. I am so happy to see the code I've written being used by so many people around the world, without any geographical or language barriers. When I introduce myself as a developer from XX project, people really like to talk with me and my colleagues. Do we need anything else from our contributions?

It is true that I was supported by some organizations when I was contributing to those projects, but those organizations had better and far more efficient models for earning money from our contributions than restricting others.

There are some organizations which are fully closed source but use lots of open source software. For their business, GPL-like licenses are not healthy. Do we wanna restrict them too? Why? Yes, they earn money from our efforts, so what? Those companies are just another set of users from my point of view. Sometimes they give credit to the open source projects that they have used. Isn't that enough?
Think about the university research groups using our open source software. They are doing research for the betterment of the world. They also try to make the most of their funding to do something for the world. Do we wanna add barriers for them?

Apache style licenses add no barriers for the end users of the software. You can do whatever you want with it. Isn't it cool? Isn't it the success behind open source?

There is also another category, which I refer to as "source-open" rather than open-source, which I need to research a bit.

Monday, June 30, 2008

Our boat over-turned ... It was close


This should have been a continuation of my "Place to visit in Seattle" series, but this was much more than that.

A couple of my friends who work together in the same block within MSR decided to go canoeing on Lake Washington. We hired canoes from the UW recreation center. David Koop and I were in the same canoe. I was a bit reluctant to take my camera, but I really wanted to take a panorama shot of the 520 bridge, (one of??) the longest floating bridges.

We went towards the bridge and crossed the shipping lanes to go underneath it. We had to face waves about 3-4 feet high, but all of us managed to get through. Then we went to the Arboretum and had some rest.

Since it seemed much safer there, I also took off my life jacket.

Then we headed back to the UW rental center. David and I both leaned towards the same side of the boat at the same time, and that overturned the boat. I had no option but to jump into the deep water. I thought I had the life jacket on, but I did not. Somehow I came up, and then my head hit the boat as David was trying to turn it the right way up. (I think I went down once again.) Somehow we figured out that we could not turn the boat over, so we swam about 10-15 meters to the shore, pulling the boat along with us. Only then did I take out my camera.

My camera is still not working, but we were saved (I think it was too close). I dried it, but some water seems to have leaked into the lenses.

These are some of the last photos I took with my camera.

Panoramic view of 520 bridge

View of Mt. Rainier with the 520 bridge

View of Mt. Rainier from Lake Washington


David and me in the unlucky canoe


The canoeing gang

Saturday, June 21, 2008

Places to visit in Washington State - Snoqualmie Falls

Location : Snoqualmie Falls, WA

Directions : Google Maps. About 30 mins from Bellevue, WA

For GPS : 6501 Railroad Ave SE, 98065 (closest address)

Link : http://www.snoqualmiefalls.com/

This is one of the most spectacular waterfalls I have ever seen. Due to the ice melting on the mountains these days, the waterfall is full of water. I think this is one of the easiest waterfalls to see, with only about a 100 feet walk.
But by going down the trail for about 3/4 mile, you can get to the base of the waterfall.
One of the amazing things about this waterfall is that there is a hydro-power plant at the top of the falls. The water coming out of that plant goes through the rock, through a man-made hole, and comes out at the base of the waterfall. According to a veteran I met, this was done prior to World War II.
The beauty of the waterfall and the view of the surrounding area are great. If I were to recommend the best places to visit, Snoqualmie Falls would be one of them for sure.

If you have time, don't forget to visit the factory outlets in North Bend, which are about 15 mins away.

Panoramic View of the waterfall

View of the waterfall from the top

View of the waterfall from the base



Places to visit in Washington State - Little Si Hike

Location : Little Si, North Bend, WA

Directions : Take I-90 and get off at exit 31. You will get on to Bendigo Blvd. Go straight till you meet SE North Bend Way and take a right there. Go till you find Mount Si Road and turn left. Once you pass the bridge over the Snoqualmie river, turn right and follow the directions for the Little Si parking lot.

For GPS : 47.49867, -121.756228 or 434th Avenue SE and SE Mt Si Road, North Bend, WA.

Links : http://www.mountsi.com/

I went on a hike today to Little Si (North Bend, WA) with my friends from MSR. It was about a 30 min drive from Bellevue on I-90, and getting there was pretty easy.

The hike was about 5 miles (round-trip), climbing to about 1600 feet. I took my 22-month-old kid with me, and it was hard, but doable. It was just that my body resisted a hard workout after such a long time.
We passed through a nice forest on the way, and it was nice to see all those rock climbers working hard to get to the summit. When we reached the peak, we could see the fascinating view of the Snoqualmie river and the North Bend area.

Panoramic view of the Cascades, from the middle of Little Si

Panoramic view from the peak of Little Si

View of Mount Si and Little Si from I-90

Tough Hike, Ha ...!!

The Gang