Friday, 5 June 2015

Spark Reaches for the Holy Grail: Federated Queries

Hallowed Ground: Data Federation

Data federation technology is software that lets end users aggregate data from disparate sources and formats through virtual database objects. The benefits of this technology include increased availability and reliability as well as improved access times for BI and data analysis.

The major data warehouse players (IBM, Oracle, SAS, Teradata) set the bar for federation, and their solutions allow seamless access to data from multiple external sources via each vendor’s SQL interfaces and APIs. In other words, if you have JSON files, DB2 data, XML, Sybase data and so on, these federation technologies let you query across all of those sources in a single SQL statement. Of course, these solutions come at a cost, both in software licenses and in specialized engineers. Apache hopes to alter the landscape with its open source Spark project.
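To make the "single SQL statement" idea concrete, here is a toy sketch using only Python's standard library. It is not Spark and not any vendor's federation product: two separate SQLite files stand in for independent source systems, and SQLite's ATTACH DATABASE plus a temporary view play the role of the virtual database object. The file names, table names, aliases (crm, sales) and sample rows are all made up for illustration.

```python
import os
import sqlite3
import tempfile


def build_sources(directory):
    """Create two independent SQLite files that stand in for separate source systems."""
    customers_path = os.path.join(directory, "customers.db")
    orders_path = os.path.join(directory, "orders.db")

    # Hypothetical "CRM" source: customer master data.
    conn = sqlite3.connect(customers_path)
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    conn.commit()
    conn.close()

    # Hypothetical "sales" source: order transactions.
    conn = sqlite3.connect(orders_path)
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 20.0), (2, 1, 5.5), (3, 2, 10.0)],
    )
    conn.commit()
    conn.close()

    return customers_path, orders_path


def federated_totals(customers_path, orders_path):
    """Join both sources through one connection and return (name, total) rows."""
    hub = sqlite3.connect(":memory:")
    try:
        # Expose each source under its own schema name without copying any data.
        hub.execute("ATTACH DATABASE ? AS crm", (customers_path,))
        hub.execute("ATTACH DATABASE ? AS sales", (orders_path,))

        # The temporary view is the "virtual database object": it stores no rows,
        # only a definition that spans both attached sources.
        hub.execute(
            """
            CREATE TEMP VIEW customer_totals AS
            SELECT c.name AS name, SUM(o.amount) AS total
            FROM crm.customers AS c
            JOIN sales.orders AS o ON o.customer_id = c.id
            GROUP BY c.name
            """
        )
        return hub.execute(
            "SELECT name, total FROM customer_totals ORDER BY name"
        ).fetchall()
    finally:
        hub.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        customers_db, orders_db = build_sources(workdir)
        for name, total in federated_totals(customers_db, orders_db):
            print(name, total)
```

A real federation layer does the same thing across different engines and file formats, and adds concerns this sketch ignores, such as pushing filters down to each source and reconciling data types.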

Oracle Keeps Pace

Oracle has aggressively kept up with supporting federated queries and providing a unified experience for working with data. Its latest suite of products, including GoldenGate for Big Data and Oracle Data Integrator (ODI), makes it possible to tap into data across a wide range of sources, including Hadoop. This quote is from the Oracle Data Service Integrator website:

“Oracle Data Service Integrator provides ...


Read More on Datafloq