Thursday, October 10, 2013

Hadoop shows us a possible future of computing

Computing has traditionally been processor-centric. The classic model of computing has a "central processing unit" which performs computations. The data is provided by "peripheral devices", processed by the central unit, and then routed back to peripheral devices (the same as the original devices or possibly others). Mainframes, minicomputers, and PCs all use this model. Even web applications use this model.

Hadoop changes this model. It is designed for Big Data, and data at that scale requires a new model. Hadoop stores your data in segments of 64MB to 2GB, spread across a number of servers with redundancy to prevent loss. If your data is smaller than 64MB, moving to Hadoop will gain you little. But that's not important here.
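As a rough back-of-the-envelope sketch of that layout (assuming the common 64MB segment size and three copies of each segment; both numbers are configurable, so treat them as illustrative):

import math

BLOCK_SIZE_MB = 64      # assumed segment size
REPLICATION = 3         # assumed number of copies kept for redundancy

def storage_layout(file_size_mb):
    # How many segments a file becomes, and how much space it occupies
    # once every segment is replicated across the servers.
    segments = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    total_stored_mb = file_size_mb * REPLICATION
    return segments, total_stored_mb

segments, stored = storage_layout(10 * 1024)   # a 10GB file
print(f"{segments} segments, ~{stored / 1024:.0f}GB stored including copies")
# -> 160 segments, ~30GB stored including copies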

What is important is Hadoop's model. Hadoop moves away from the traditional computing model. Instead of a central processor that performs all calculations, Hadoop leverages servers that can hold data and also perform calculations.

Hadoop makes several assumptions:

  • The code is smaller than the data (or a segment of data)
  • Code is transported more easily than data (because of size)
  • Code can run on the servers that hold the data

With these assumptions, Hadoop builds a new model of computing. (To be fair, Hadoop may not be the only package that builds this new model of distributed processing -- or even the first. But it has a lot of interest, so I will use it as the example.)
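To make those assumptions concrete, here is a minimal word-count sketch of the kind of code Hadoop ships to the data, in the style of Hadoop Streaming, which runs any program that reads lines on stdin and writes key/value pairs on stdout. (The file name and invocation details are illustrative.) The script is a few hundred bytes; the data segments it runs against are 64MB or more, which is exactly why moving the code is the cheaper direction.

#!/usr/bin/env python
# wordcount.py -- a tiny mapper/reducer pair in the Hadoop Streaming style.
# Run as "wordcount.py map" or "wordcount.py reduce"; each mode just reads
# stdin and writes tab-separated key/value pairs to stdout.
import sys

def mapper():
    # Runs on the server holding a data segment: emit (word, 1) per word.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Input arrives sorted by key, so all counts for a word are adjacent.
    current, count = None, 0
    for line in sys.stdin:
        word, value = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()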

All very interesting. But here is what I find more interesting: the distributed processing model of Hadoop can be applied to other systems. Hadoop's model makes sense for Big Data, and systems with Little (that is, not Big) data should not use Hadoop.

But perhaps smaller systems can use the model of distributed processing. Instead of moving data to the processor, we can store data with processors and move code to the data. A system could be constructed from servers holding data, connected by a network, with mobile code that can execute anywhere. The chief tasks then become identifying which code is needed and moving that code to the right location.
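Here is a very rough sketch of what that might look like, with everything hypothetical and squeezed into one process: a list of nodes stands in for the network of servers, and a plain function stands in for the mobile code.

class DataNode:
    # A server that holds data segments and can run code next to them.
    def __init__(self, name):
        self.name = name
        self.segments = {}                  # segment id -> data

    def store(self, segment_id, data):
        self.segments[segment_id] = data

    def run(self, segment_id, code):
        # The mobile code arrives here and executes where the data lives.
        return code(self.segments[segment_id])


class Coordinator:
    # Knows which node holds which segment; routes code, never data.
    def __init__(self, nodes):
        self.nodes = nodes
        self.location = {}                  # segment id -> node

    def store(self, segment_id, data):
        node = self.nodes[hash(segment_id) % len(self.nodes)]
        node.store(segment_id, data)
        self.location[segment_id] = node

    def run_everywhere(self, code):
        # Ship the same small function to every segment and collect results.
        return {sid: node.run(sid, code) for sid, node in self.location.items()}


# The data stays put; only the (tiny) function travels.
cluster = Coordinator([DataNode("a"), DataNode("b"), DataNode("c")])
cluster.store("logs-1", "error warn error")
cluster.store("logs-2", "warn warn error")
print(cluster.run_everywhere(lambda text: text.count("error")))
# -> {'logs-1': 2, 'logs-2': 1}

A real version would serialize the function and send it over the network, but the chief tasks are already visible: knowing where each segment lives and moving the right code there.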

That would give us a very different approach to system design.
