The waving trends in Big Data: specialisation or generalisation?

February 2014

Storing, processing and extracting valuable knowledge from data has been the most general use case in business applications over the last 40 years. Over that time, data storage, manipulation and access have followed a wave-like trend that keeps repeating: technologies move from general approaches to specialised techniques, and then back to general approaches built on revisited techniques.
Looking back 15 years, we were at the peak of specialisation: OLAP cubes were the tools for manipulating application-specific data, rooted in data marts, fed by dedicated ETL tools and accessed through XML/A (XML for Analysis) interfaces. Knowledge management and analytics were able to scale thanks to very specific knowledge bases and de-normalisation.
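As a rough illustration of what an OLAP cube does with de-normalised data, the sketch below rolls a measure up along chosen dimensions. The fact rows, the dimension names and the `roll_up` helper are all made up for this example; a real OLAP engine precomputes and indexes these aggregates rather than scanning rows on demand.

```python
from collections import defaultdict

# Hypothetical de-normalised fact rows: three dimensions plus one measure (units sold).
DIMENSIONS = ("region", "product", "quarter")
facts = [
    ("EU", "sensor", "Q1", 120),
    ("EU", "sensor", "Q2", 150),
    ("EU", "gateway", "Q1", 40),
    ("US", "sensor", "Q1", 90),
    ("US", "gateway", "Q2", 60),
]


def roll_up(rows, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the measure is the last field of each row
    return dict(totals)


by_region = roll_up(facts, ["region"])
by_region_product = roll_up(facts, ["region", "product"])
grand_total = roll_up(facts, [])
```

Every question that can be asked of this structure is a choice of which dimensions to keep. That is why the approach is fast, and also why it is tied to one application's data.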
Five years later, thanks to the shift to cloud computing and increased resource availability, generalisation came back with distributed, normalised data stores and ORM frameworks that abstracted the persistence layer and split datasets intelligently, while still using our old friend the Structured Query Language (SQL) underneath (remember that Facebook was using MySQL databases).
SQL databases, however, imposed ACID constraints that many use cases did not need: not all applications require referential integrity or transactional capabilities. So, 5 to 6 years ago, specialisation returned with the emergence of the NoSQL paradigm. Key-value stores implementing the Bigtable structure such as HBase, column-oriented stores such as Cassandra and document stores such as MongoDB emerged quickly, demonstrating performance gains of several orders of magnitude for certain types of applications. However, these models required a lot of specialisation: for example, an application tailored to HBase could not easily be converted into a MongoDB application.
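To make the portability problem concrete, here is a minimal sketch of the same user record laid out in a Bigtable-like way and as a nested document. It uses plain Python dictionaries only, not the real HBase or MongoDB client APIs, and the record contents, column naming scheme and helper functions are all assumptions made for illustration.

```python
# Bigtable-like layout: row key -> flat map of "family:qualifier" -> value.
# Each order becomes its own column, with the order date as the qualifier.
wide_column_table = {
    "user#42": {
        "info:name": "Ada",
        "info:city": "Barcelona",
        "orders:2014-01-10": "sensor-kit",
        "orders:2014-02-03": "gateway",
    }
}

# Document layout: one nested document per entity, with orders embedded as a list.
user_document = {
    "_id": 42,
    "name": "Ada",
    "city": "Barcelona",
    "orders": [
        {"date": "2014-01-10", "item": "sensor-kit"},
        {"date": "2014-02-03", "item": "gateway"},
    ],
}


def orders_from_wide_column(table, row_key):
    """Recover (date, item) pairs by scanning the 'orders' column family."""
    row = table[row_key]
    return sorted(
        (column.split(":", 1)[1], value)
        for column, value in row.items()
        if column.startswith("orders:")
    )


def orders_from_document(doc):
    """Recover (date, item) pairs by walking the embedded list."""
    return sorted((order["date"], order["item"]) for order in doc["orders"])


wide_orders = orders_from_wide_column(wide_column_table, "user#42")
doc_orders = orders_from_document(user_document)
```

Both functions return the same answer, but the access code shares nothing: one relies on how column names are encoded, the other on how documents are nested. An application written around one layout has to be reworked for the other.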
A key point in that renewed shift to specialisation was the concept of MapReduce, which enables batch processing of enormous amounts of data to extract knowledge, as an answer to the BI approaches of 10 years earlier.
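The MapReduce idea can be sketched in a few lines of single-process Python: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step combines each group. The function names and the sample log lines are invented for this example; a real framework such as Hadoop distributes these phases across many machines, which this sketch does not attempt.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit one (word, 1) pair for every word in an input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: three log lines.
logs = ["error disk full", "warning disk slow", "error network down"]
counts = map_reduce(logs)
```

Because each call to `map_phase` and `reduce_phase` only sees its own record or its own key, the calls can run in parallel on different nodes, which is what makes the model suitable for very large batches.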
Over the last 4 years, most large-scale Internet applications have come to store their data in NoSQL data stores, but they now realise that specialisation severely restricts the flexibility with which data can be queried. Impala, Hive, Kiji, Pig and others are shifting the trend again: our old friend SQL (or a close relative, in the case of Pig and its Pig Latin language) is returning, this time on top of NoSQL data stores.
To summarise, NoSQL data stores are very important for enabling Internet applications to scale. However, do not underestimate the potential of the BI technologies of 15 years ago. OLAP cubes still rock!

About Worldsensing

Worldsensing is a global IoT pioneer. Founded in 2008, the infrastructure monitoring expert serves customers in more than 70 countries, with a network of global partners to jointly drive safety in mining, construction, rail and structural health.

Worldsensing is headquartered in Barcelona and has a local presence in the UK, North and South America, Singapore, Australia and Poland. Investors include Cisco Systems, Mitsui & Co, McRock Capital, ETF, Kibo Ventures and JME Ventures.
