Wednesday, February 22, 2012

Oracle Big Data approach

In a previous post, Map reduce into relation of Big Data and Oracle, I already zoomed in on the way Oracle is thinking about big data. That post gave an outline of how Oracle defines big data and how it intends to use MapReduce and Hadoop in its approach to handling big data. As you might know, Oracle has launched a Big Data Appliance which integrates a couple of the important components currently used in the big data field. The Oracle Big Data Appliance provides you an out-of-the-box working solution where, as in all the other solutions in the Oracle Exa- stack, the supplier has engineered all the components. Or as Oracle likes to state: "hardware and software engineered to work together".


As you can see in the above diagram, the Oracle Big Data Appliance makes use of some well-known and important components. The decision was made to run the entire system on Oracle Linux. Solaris would have been an option, but given the wide adoption of Oracle Linux and the fact that the majority of Hadoop solutions focus primarily on Linux rather than Solaris, it runs on Linux (an assumption on my side).

Furthermore, we see the Oracle NoSQL Database as an integrated part of the appliance, which is also not a big surprise as Oracle is pushing its NoSQL solution into the market to gain share in the NoSQL space. Looking at the Oracle NoSQL solution, they have done quite a good job and have launched a solid NoSQL product with a lot of potential.
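To give a feel for what working with the Oracle NoSQL Database looks like, below is a minimal sketch using its Java key/value API. The store name "kvstore" and the host/port are just illustrative values for a default single-node setup, not something tied to the appliance itself.

import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;
import oracle.kv.Key;
import oracle.kv.Value;
import oracle.kv.ValueVersion;

public class HelloNoSql {
    public static void main(String[] args) {
        // Connect to the store; "kvstore" and "localhost:5000" are illustrative defaults.
        KVStore store = KVStoreFactory.getStore(
                new KVStoreConfig("kvstore", "localhost:5000"));

        // Write a simple key/value pair.
        Key key = Key.createKey("helloKey");
        Value value = Value.createValue("Hello Big Data".getBytes());
        store.put(key, value);

        // Read it back and print the stored value.
        ValueVersion vv = store.get(key);
        System.out.println(new String(vv.getValue().getValue()));

        store.close();
    }
}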

As we are talking about big data, Hadoop is part of this appliance, which comes as no surprise. What also comes as no surprise, but is very good to see, is the integration in this appliance with the Oracle Loader for Hadoop and the Oracle Data Integrator.

Oracle Loader for Hadoop:
"Oracle Loader for Hadoop is a MapReduce utility to optimize data loading from Hadoop into Oracle Database. Oracle Loader for Hadoop sorts, partitions, and converts data into Oracle Database formats in Hadoop, then loads the converted data into the database.  By preprocessing the data to be loaded as a Hadoop job on a Hadoop cluster, Oracle Loader for Hadoop dramatically reduces the CPU and IO utilization on the database commonly seen when ingesting data from Hadoop. An added benefit of presorting data is faster index creation on the data once in the database."

Oracle Data Integrator:
"Oracle Data Integration provides a fully unified solution for building, deploying, and managing real-time data-centric architectures in an SOA, BI, and data warehouse environment. In addition, it combines all the elements of data integration—real-time data movement, transformation, synchronization, data quality, data management, and data services—to ensure that information is timely, accurate, and consistent across complex systems."

The Big Data Appliance fits into Oracle's overall Exa strategy of delivering engineered appliances, and it also fits into their overall big data strategy.


As you can see, a lot of the steps in the acquire and organize stages of Oracle's big data approach are covered by the Big Data Appliance.
