Open Data Platform

My personal views on the recent Open Data Platform (ODP) announcement.


I present these views as an industry user of the Hadoop platform and its related ecosystem tools. I don't have any association with any of the core Hadoop distribution companies. Hadoop is my bread and butter, so this excites me :)

So what's good for us with ODP?

When we operate in organizations with multi-flavor Hadoop deployments, we often face the question: will this tool work with this vendor's Hadoop? Do we have to test performance or compatibility for anything? With ODP, the promise is that if a product is compatible with version X of ODP, then all the flavors built on that version will support it.
The same goes for BI tool vendors: I think it will reduce their headache of continuously trying to prove that their product works on flavor X of Hadoop.
So it's a win-win situation for BI tool vendors and implementors.

Now just some thoughts on the vendors

I read the post [1] by the Cloudera guys, written in response to the original announcement, on why they are not joining ODP. Cloudera at this moment is far ahead of the traditional Hadoop flavours (leaving MapR aside for now). The Cloudera team has invested a lot to produce awesome products, which has put them far ahead of the competition. (And their best product is not a code artifact; read ahead.) The news of $100m in revenue proves the same. Through the sheer amount of hard work they did, Cloudera knows the position it has in the market, and we can see that as the pride behind the post on ODP; you can read it as: we don't need anyone :) They are already doing pretty well in sponsoring the Apache Software Foundation.

The Hortonworks guys are like the most hardworking people in class, who do nothing but code all the time and will happily join whoever says the words "open source" :) Collaboration with the community and open source is in the DNA of Hortonworks, and that is what sets it apart from all the other vendors. They missed the original wave of the Hadoop market (money) because of their lax attitude toward documentation in the Hadoop space when it was starting out. I think this is what put Cloudera ahead in the race. When everyone was starting, no one knew how to use Hadoop, and Cloudera documented it properly. Slowly they gained the trust of customers, and customers told them what to build and sell :)
What's in it for Hortonworks in ODP? I cannot see much value for them; they are already awesome in the HDP space. If you just remove the H in HDP and replace it with an O, that's what the million dollar investment in this initiative is for. Hortonworks, I am a big fan of yours :)

Hmm, I need to think a few times before saying anything about what Pivotal is trying to do. At this moment ODP seems to be a desperate attempt by Pivotal to create a footprint in some way. Pivotal's strategy of a closed source distribution did not work, maybe because Pivotal is like a mixed breed of open source and closed source founders. They were unable to sell, or to decide what they really wanted to do. I am a bit surprised by Cloudera's reply about Pivotal's open source contributions; they did pretty good work on Cloud Foundry, Redis, and RabbitMQ. (That comment seems to be driven by some pride.)
Now, what will ODP do for Pivotal? I feel they should have dropped their own distribution and gone for the HDP flavor. The selling point of Pivotal should be support and expertise in implementing HAWQ, Greenplum, and GemFire solutions for customers, although they have yet to prove that these tools can hold up against the competition we have in the market. I am very surprised to see that they dropped Kafka and Storm from the distribution they want to bring. It's definitely the wrong decision. Another aspect which is awesome for Pivotal is the Cloud Foundry stack. By combining it with big data apps and a presentation layer, they can bring their expertise to the table. A special hint for them: IoT is the key :) Do your homework properly. ODP also won't work for you if you plan wrongly, and with the current stack you announced it won't work for you.
My suggestion for them is to just stick to the things they are awesome at and bring those to the table for customers. Don't care who says what about you, and let the code do the talking :) After Apache incubation, start talking on the Bigtop mailing lists about the code you open sourced. Ask the community what would be good to move and package into Bigtop, and learn from Hortonworks :)

Closing thoughts

I wonder why the ODP move was not proposed and discussed on the Bigtop mailing list, since Bigtop seems to be the natural umbrella for all integrations in the Hadoop space. It could even have been a sub-project focused on checking the compatibility of third-party tools against a given Bigtop release. No individual BI or other Hadoop ecosystem vendor can manage the whole integration stack, and that is the reason why ODP has come up. There are license issues which guide Apache to integrate only selected components into the products produced by Bigtop; this is one of the factors behind the spawning of ODP. Could Bigtop have been a better place to discuss everything the ODP team members have in mind?

At this moment very little has been released about the plans for ODP, except for some things in [2] and the official announcement. So we have to wait some more time to see what's coming up.

Every day brings a new challenge for people like me who work in this space, so I can easily imagine the challenges faced by companies whose whole bread and butter is making those systems. Hadoop has changed the lives of many (including me), and I am sure it will be an exciting space in the future too.
ODP aims to bring some kind of uniform platform in the future, so let's be open and welcoming to the efforts people are making. Even if it fails, it doesn't matter; they are just trying to create a level playing field for everyone :)