Category Archives: ibmdeveloperworks

Process home monitoring data using the Time Series Database in Bluemix

I keep a lot of information about my house. I have had sensors and recording units in various parts of the house for years, logging data through a variety of different devices.

Over the years I’ve built a number of different solutions for storing and displaying that information, so when the opportunity came up to write about a database built specifically for this kind of data, I jumped at the chance. This is what I came up with:

As home automation increases, so does the number of sensors recording the statistics and information needed to feed it. Using the Time Series Database in Bluemix makes it easy to record this time-logged data and to query and report on it. In this tutorial, we’ll examine how to create, store, and, ultimately, report on information by using the Time Series Database. We’ll also use the database to correlate data points across multiple sensors to track the effectiveness of the heating systems in a multi-zone house.

You can read the full article here.
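To give a flavour of the kind of data the article deals with, here’s a minimal Python sketch of pushing a timestamped sensor reading to a time series store over REST. The endpoint URL, credentials, and payload layout are hypothetical placeholders for illustration, not the actual Bluemix Time Series Database API.

    # Minimal sketch of recording timestamped sensor readings over REST.
    # The URL, credentials, and JSON layout are hypothetical placeholders,
    # not the actual Bluemix Time Series Database API.
    import time
    import requests

    TSDB_URL = "https://example-tsdb.mybluemix.net/readings"  # hypothetical endpoint
    AUTH = ("username", "password")                           # hypothetical credentials

    def record_reading(sensor_id, value):
        """Send one timestamped reading for a single sensor."""
        payload = {
            "sensor": sensor_id,
            "timestamp": int(time.time()),  # seconds since the epoch
            "value": value,
        }
        response = requests.post(TSDB_URL, json=payload, auth=AUTH, timeout=10)
        response.raise_for_status()

    # Example: log the current temperature for one heating zone.
    record_reading("hallway-temp", 19.5)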

Harvest machine data using Hadoop and Hive

A new article has been published on IBM developerWorks, looking at the basics of processing machine data using Hadoop: extracting the core data, storing it, and then determining the baselines and trigger points required to identify worrying trends and data points. From the intro:

Machine data can come in many different formats and quantities. Weather sensors, fitness trackers, and even air-conditioning units produce massive amounts of data, which begs for a big data solution. But how do you decide what data is important, and how do you determine what proportion of that information is valid, worth including in reports, or valuable in detecting alert situations? This article covers some of the challenges and solutions for supporting the consumption of massive machine data sets that use big data technology and Hadoop.

Harvest machine data using Hadoop and Hive.
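As a rough illustration of the baseline-and-trigger idea mentioned above (not the article’s actual code, which works at Hadoop scale), here is a short Python sketch that computes a rolling baseline over a series of readings and flags any value that strays too far from it.

    # Sketch of the baseline/trigger idea: keep a rolling average of recent
    # readings and flag any value that deviates from it by more than a
    # threshold. Illustrative only; the article does this at scale with
    # Hadoop and Hive.
    from collections import deque

    def find_alerts(readings, window=12, threshold=5.0):
        """Return (index, value, baseline) for readings that breach the threshold."""
        recent = deque(maxlen=window)
        alerts = []
        for i, value in enumerate(readings):
            if len(recent) == window:
                baseline = sum(recent) / window
                if abs(value - baseline) > threshold:
                    alerts.append((i, value, baseline))
            recent.append(value)
        return alerts

    # Example: a temperature series with one suspicious spike at the end.
    temps = [18.2, 18.4, 18.3, 18.5, 18.4, 18.6, 18.5,
             18.7, 18.6, 18.8, 18.7, 18.9, 27.5]
    print(find_alerts(temps))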


Process complex text for information mining

My latest article, on mining information from text data, is now available:

Text — an everyday component of nearly all social interaction, social networks, and social sites — is difficult to process. Even the basic task of picking out specific words, phrases, or ideas is challenging. String searches and regex tools don’t suffice. But the Annotation Query Language (AQL) within IBM InfoSphere® BigInsights™ enables you to make simple and straightforward declarative statements about text and convert that into easily manageable data chunks. Learn how AQL and InfoSphere BigInsights can process text into meaningful data and find out how to convert that information into something usable within the BigSheets environment to get statistical and visualized data from the raw material.

Read Process complex text for information mining.
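The article works with AQL’s declarative syntax; as a deliberately simplified stand-in, the Python sketch below shows the underlying idea of turning raw text into query-able rows. It uses plain regular expressions, which is exactly the sort of brittle approach the article explains AQL improves upon.

    # Deliberately simplified stand-in for what AQL does declaratively:
    # extract spans of interest from raw text and emit them as structured
    # rows. Plain regular expressions are used here purely for illustration;
    # AQL handles far more complex patterns and context than this.
    import re

    MENTION = re.compile(r"@(\w+)")
    HASHTAG = re.compile(r"#(\w+)")

    def extract_rows(documents):
        """Yield (doc_id, kind, term) rows from a list of text documents."""
        for doc_id, text in enumerate(documents):
            for match in MENTION.finditer(text):
                yield (doc_id, "mention", match.group(1))
            for match in HASHTAG.finditer(text):
                yield (doc_id, "hashtag", match.group(1))

    docs = [
        "Great session on #hadoop and #bigdata with @someone",
        "Reading up on #AQL in InfoSphere BigInsights",
    ]
    for row in extract_rows(docs):
        print(row)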


Building flexible apps from big data sources

My article on how to build flexible apps on top of the BigInsights platform has been published. This demonstrates a cool way to combine some client-side JavaScript and existing technologies to build a Big Data query interface without developing a specialised application for the purpose.

It’s no secret that a significant proportion of the needs for big data have come from the explosion in Internet technologies. Up until 10-20 years ago, the idea of a public-facing application having more than a few million users was unheard of. Today, even a modest website can have millions of users, and if it’s active, can generate millions of data items every day. The irony is that the very infrastructure and systems that create big data can also work in reverse, and provide some of the better ways to integrate and work with that data. Usefully, InfoSphere® BigInsights™ comes with support for managing and executing data jobs through a simple REST API. And through the Jaql interface, we can run queries and get information directly from a Hadoop cluster. This article looks at how these systems work together to give you a rich basis for capturing data and provide an interface to get the information back out again.

Building flexible apps from big data sources.
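As a hedged sketch of the pattern described above, here’s a small Python example that submits a query string to a cluster over REST and reads back the JSON results. The endpoint paths, parameter names, and credentials are hypothetical placeholders rather than the actual BigInsights REST API, and the article itself drives the real interface from client-side JavaScript rather than Python.

    # Sketch of driving a Hadoop/BigInsights cluster through a REST
    # interface: submit a query, then fetch the results as JSON. The
    # endpoint paths and parameter names are hypothetical placeholders,
    # not the real BigInsights REST API.
    import requests

    BASE_URL = "https://biginsights.example.com/api"  # hypothetical cluster endpoint
    AUTH = ("biadmin", "password")                    # hypothetical credentials

    def run_query(query):
        """Submit a query string and return the decoded JSON results."""
        submit = requests.post(BASE_URL + "/query", json={"statement": query},
                               auth=AUTH, timeout=30)
        submit.raise_for_status()
        job_id = submit.json()["id"]

        result = requests.get("{0}/query/{1}/results".format(BASE_URL, job_id),
                              auth=AUTH, timeout=30)
        result.raise_for_status()
        return result.json()

    # Example: fetch the ten most recent records from a collection.
    print(run_query("select * from records order by created desc limit 10"))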


Process big data with Big SQL in InfoSphere BigInsights

The ability to write an SQL statement against your Big Data stored in Hadoop provides some much-needed flexibility. Sure, using Hive or HBase you can perform some of those operations, but there are other alternatives that may suit your needs better, such as the Big SQL utility. My latest article looks at this tool:

SQL is a practical querying language, but it has limitations. Big SQL enables you to run complex queries on non-tabular data and query it with an SQL-like language. The difference with Big SQL is that you are accessing data that may be non-tabular, and may in fact not be based upon a typical SQL database structure. Using Big SQL, you can import and process large-volume data sets, including the processed output of other jobs within InfoSphere BigInsights™, turning that information into easily query-able data. In this article, we look at how you can replace your existing infrastructure and queries with Big SQL, and how to take more complex queries and convert them to make use of your Big SQL environment.

Process big data with Big SQL in InfoSphere BigInsights.
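To show what this looks like from a client program, here is a minimal Python sketch that runs a Big SQL-style query over a database connection. The ibm_db driver is just one possible way to connect, and the hostname, port, table, and credentials are placeholders that will differ for a real BigInsights installation.

    # Minimal sketch of querying Big SQL from a client program. The ibm_db
    # driver is one possible way to connect; the hostname, port, table, and
    # credentials below are placeholders for a real BigInsights installation.
    import ibm_db

    conn_string = (
        "DATABASE=bigsql;"
        "HOSTNAME=biginsights.example.com;"  # placeholder host
        "PORT=51000;"                        # placeholder port
        "PROTOCOL=TCPIP;"
        "UID=biadmin;"
        "PWD=password;"
    )

    conn = ibm_db.connect(conn_string, "", "")

    # An SQL-like query over data that need not live in a traditional table.
    stmt = ibm_db.exec_immediate(
        conn,
        "SELECT sensor, COUNT(*) AS readings FROM sensor_log GROUP BY sensor"
    )

    row = ibm_db.fetch_assoc(stmt)
    while row:
        print(row["SENSOR"], row["READINGS"])
        row = ibm_db.fetch_assoc(stmt)

    ibm_db.close(conn)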


SQL to Hadoop and back again, Part 3: Direct transfer and live data exchange

The third and final article in my series on migrating data between Hadoop and SQL databases is now available:

Big data is a term that has been used regularly now for almost a decade, and it — along with technologies like NoSQL — is seen as a replacement for the long-successful RDBMS solutions that use SQL. Today, DB2®, Oracle, Microsoft® SQL Server, MySQL, and PostgreSQL dominate the SQL space and still make up a considerable proportion of the overall market. In this final article of the series, we will look at more automated solutions for migrating data to and from Hadoop. In the previous articles, we concentrated on methods that take exports or otherwise formatted and extracted data from your SQL source, load that into Hadoop in some way, then process or parse it. But if you want to analyze big data, you probably don’t want to wait while exporting the data. Here, we’re going to look at some methods and tools that enable a live transfer of data between your SQL and Hadoop environments.

SQL to Hadoop and back again, Part 3: Direct transfer and live data exchange.
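One widely used tool for this sort of direct transfer is Apache Sqoop, whether or not it matches the exact tools the article settles on; the Python sketch below simply wraps a standard sqoop import command to pull a MySQL table straight into HDFS. The connection string, credentials, and table names are placeholders.

    # Sketch of a direct SQL-to-Hadoop transfer using Apache Sqoop, driven
    # from Python via subprocess. Sqoop is one common tool for this job; the
    # host, database, credentials, and table names are placeholders.
    import subprocess

    def sqoop_import(table, target_dir):
        """Import one MySQL table directly into HDFS as delimited files."""
        command = [
            "sqoop", "import",
            "--connect", "jdbc:mysql://dbhost.example.com/appdb",  # placeholder database
            "--username", "sqoop",
            "--password", "sqoop",
            "--table", table,
            "--target-dir", target_dir,
            "--num-mappers", "4",  # parallel extract tasks
        ]
        subprocess.run(command, check=True)

    # Example: copy the 'messages' table into HDFS for later Hive processing.
    sqoop_import("messages", "/data/import/messages")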


SQL to Hadoop and back again, Part 2: Leveraging HBase and Hive

The second article in a series covering Big Data and SQL interaction is available now:

“Big data” is a term that has been used regularly now for almost a decade, and it — along with technologies like NoSQL — is seen as a replacement for the long-successful RDBMS solutions that use SQL. Today, DB2®, Oracle, Microsoft® SQL Server, MySQL, and PostgreSQL dominate the SQL space and still make up a considerable proportion of the overall market. Here in Part 2, we will concentrate on how to use HBase and Hive for exchanging data with your SQL data stores. From the outside, the two systems seem to be largely similar, but the systems have very different goals and aims. Let’s start by looking at how the two systems differ and how we can take advantage of that in our big data requirements.

SQL to Hadoop and back again, Part 2: Leveraging HBase and Hive.
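For the Hive side of that exchange, a common pattern is to land the SQL export as delimited files and define a Hive table over them. The Python sketch below just shells out to the hive command-line client with an example of that DDL and load; the table, column, and path names are placeholders, and the article itself covers the HBase route as well.

    # Sketch of the Hive side of an SQL-to-Hadoop exchange: define a table
    # over tab-delimited data and load an exported file (already in HDFS)
    # into it. Table, column, and path names are placeholders; the hive CLI
    # is assumed to be on the PATH.
    import subprocess

    HIVEQL = """
    CREATE TABLE IF NOT EXISTS messages (
        id BIGINT,
        user_id BIGINT,
        created STRING,
        body STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t';

    LOAD DATA INPATH '/data/import/messages' INTO TABLE messages;
    """

    # Run the statements through the hive command-line client.
    subprocess.run(["hive", "-e", HIVEQL], check=True)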


SQL to Hadoop and back again, Part 1: Basic data interchange techniques

I’ve got a new article, the first in a three-part series on moving data between SQL and Hadoop, covering both exporting data to Hadoop and importing processed content back into an SQL store.

In this first one, we look at the basic mechanics and considerations before you start the migration of data, such as the data format, content, and export techniques.

Read: SQL to Hadoop and back again, Part 1: Basic data interchange techniques
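As a small illustration of the export side of that first step, the Python sketch below dumps a table to tab-delimited text, a format that copies cleanly into HDFS and maps directly onto a Hive table. It uses sqlite3 purely as a stand-in for whichever SQL database you are actually exporting from, and the table and file names are placeholders.

    # Sketch of the export step: dump an SQL table to tab-delimited text
    # suitable for copying into HDFS and querying from Hive. sqlite3 stands
    # in for the real SQL source; table and file names are placeholders.
    import csv
    import sqlite3

    def export_table(db_path, table, out_path):
        """Write every row of a table as tab-separated values."""
        conn = sqlite3.connect(db_path)
        try:
            cursor = conn.execute("SELECT * FROM " + table)
            with open(out_path, "w", newline="") as out:
                writer = csv.writer(out, delimiter="\t")
                for row in cursor:
                    writer.writerow(row)
        finally:
            conn.close()

    export_table("app.db", "messages", "messages.tsv")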


Data Mining in a Document World

As databases evolve, learning how to get the best out of the different solutions available is key to understanding and extracting data in the form you need from your chosen data store. Document databases, such as MongoDB, CouchDB, and Couchbase Server, provide a completely different model, and a different set of problems, for interfacing with and extracting data.

You need to understand the structure of your documents, how you can query the information, and how to apply different data mining techniques to what is obviously a completely different way of organising information.

In this article, I try to take you through the basics of data mining when using a document database.

Read: Data mining in a document world
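By way of a small, hedged example of the sort of query involved, here is a Python sketch that runs a simple aggregation against MongoDB using pymongo, counting documents per tag. The database, collection, and field names are placeholders; the article covers the broader data mining techniques across the different document databases in much more depth.

    # Sketch of a simple data-mining style query against a document
    # database, using pymongo: count documents per tag. Database,
    # collection, and field names are placeholders, and a MongoDB server
    # is assumed to be running locally.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017/")
    posts = client["blog"]["posts"]

    # Unwind the tags array, then group and count by tag value.
    pipeline = [
        {"$unwind": "$tags"},
        {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": 10},
    ]

    for doc in posts.aggregate(pipeline):
        print(doc["_id"], doc["count"])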