Monday, July 07, 2014

Using R and Oracle Exadata

R is currently a leading choice for statistical computing and is widely adopted in both the commercial and scientific communities. R is a free statistical language first developed in 1993 and released under the GNU General Public License. Traditionally R has been used for statistical computations on large data sets, and because of this it is seeing adoption in the Big Data ecosystem, even though it is not as widely adopted as, for example, the MapReduce programming paradigm, which owes its high adoption rate to Apache Hadoop.

Even so, R is claiming its place in the Big Data ecosystem and is seeing enterprise-grade adoption. As a result, a number of enterprise-ready R implementations are available. Oracle is one of the companies that has developed an enterprise-ready R distribution, named “Oracle R Enterprise”. What is interesting about Oracle R Enterprise is that it becomes part of the database server itself.

The general idea is that multiple R engines are spawned on the database server and work in parallel to execute the required computations. Depending on how you program it, the results can be stored in the database, returned to a workstation, or sent to a Hadoop cluster for additional computations. In addition, thanks to the Hadoop connector, R inside the database server can also make use of data stored in the Hadoop cluster if needed. From a high-level perspective this looks like the diagram below.
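
The idea of several engines working in parallel on parts of the data can also be sketched in a few lines of code. The following is only a rough illustration in Python, not Oracle R Enterprise code: the sample rows, the function names and the use of a thread pool as a stand-in for database-side R engines are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical sample rows: (region, sale amount). In the scenario described
# in this post, this data would live in a database table.
ROWS = [
    ("north", 10.0), ("south", 20.0), ("north", 30.0),
    ("east", 5.0), ("south", 40.0), ("east", 15.0),
]


def partition_by_key(rows):
    """Split rows into one list of values per key (the 'split' step)."""
    groups = {}
    for key, value in rows:
        groups.setdefault(key, []).append(value)
    return groups


def summarize(item):
    """Work done by one 'engine': compute the mean of a single partition."""
    key, values = item
    return key, mean(values)


def parallel_group_means(rows, engines=3):
    """Run summarize() on every partition using a pool of workers.

    Threads are used here only as a simple stand-in for the separate
    R engines that a database server would spawn.
    """
    groups = partition_by_key(rows)
    with ThreadPoolExecutor(max_workers=engines) as pool:
        # map() hands each partition to a worker and collects the results.
        return dict(pool.map(summarize, groups.items()))


result = parallel_group_means(ROWS)
print(result)
```

In this sketch the combined result is simply printed. In the scenario described above it could instead be written back to a table, returned to a workstation, or passed on to a Hadoop cluster.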


Oracle provides engineered systems for Big Data, analytics and the Oracle database. This means the scenario outlined above can also be deployed on engineered systems. The diagram below uses a pure Oracle Engineered Systems solution; however, this is not required. You can use Oracle engineered systems where you deem them necessary and leave them out where you do not need them. That said, there are large benefits to deploying a full engineered systems solution.

In the example diagram above, the deployment uses Oracle Exadata, Oracle Big Data Appliance and Oracle Exalytics machines. By combining these you benefit both from R and from the capabilities of the Oracle Engineered Systems. If you need to deploy R for analytical computing and you also use Oracle databases and applications on a wider scale in your IT landscape, it is well worth considering Oracle R Enterprise and, depending on the size of your data, combining it with Oracle Engineered Systems.
