Author: abishek


SQL over anything with Optiq

Lately, SQL query planning engines have attracted a lot of attention, especially with the rise of “SQL on Hadoop”. The proliferation of storage solutions such as Hadoop, HBase and NoSQL databases has brought with it the problem of accessing data in a uniform way. Each storage solution has come up with its own set of APIs and its own “SQL-like” query language. This poses a serious challenge and steepens the learning curve for scientists and researchers accessing data at large scale. Many players in the Big Data space have realized this and are moving towards ANSI SQL standards.

Interactively analyzing a large JSON in memory

I have been doing a comparative study of different ways to analyse a large JSON file in memory. Our basic requirement is interactive analysis of nested data; for test purposes I am refraining from using a distributed/big data setup. What’s interesting for this use case is how much the test results vary between analysing the data in a simple row-oriented fashion and in a columnar fashion.
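
To make the row vs columnar distinction concrete, here is a minimal Python sketch. The sample records, the field names (`user`, `country`, `amount`) and the helper functions are all hypothetical and chosen only for illustration; they are not the data or tooling used in the study. The same aggregation runs once over nested row objects and once over flattened column lists.

```python
import json

# Hypothetical sample data; the real JSON file is not shown in the post.
raw = """
[
  {"user": {"id": 1, "country": "IN"}, "amount": 10.0},
  {"user": {"id": 2, "country": "US"}, "amount": 20.0},
  {"user": {"id": 3, "country": "IN"}, "amount": 5.5}
]
"""
records = json.loads(raw)


def total_by_country_rows(rows):
    """Row-oriented: walk each nested record and read the fields it needs."""
    totals = {}
    for row in rows:
        country = row["user"]["country"]
        totals[country] = totals.get(country, 0.0) + row["amount"]
    return totals


def to_columns(rows):
    """Flatten the nested records into one list per field (columnar layout)."""
    return {
        "user.country": [r["user"]["country"] for r in rows],
        "amount": [r["amount"] for r in rows],
    }


def total_by_country_columns(cols):
    """Columnar: touch only the two columns the query actually uses."""
    totals = {}
    for country, amount in zip(cols["user.country"], cols["amount"]):
        totals[country] = totals.get(country, 0.0) + amount
    return totals


columns = to_columns(records)
print(total_by_country_rows(records))
print(total_by_country_columns(columns))
```

Both functions return the same totals. The difference lies in the access pattern: the row version dereferences every nested object, while the columnar version scans two flat lists, which is one plausible reason the two approaches can perform differently on large inputs.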