Keep Calm and Ask Any Question

Published: Sunday, 26 April 2015 08:53 by Ant Phillips, Senior Developer
Digital Intelligence

One of the most striking trends we have seen recently is how our customers are using our product for real-time personalisation, not just on websites but also across other digital channels (for example, SMS and email) and non-digital ones (for example, the call centre). Historically, the offers and promotions presented to website visitors have focused on an individual and their behaviour on the site.

For example, a visitor could be identified as interested in car insurance if they browse to pages about car insurance, or search for the words "car insurance" and "quote" in the site search. Based on this segmentation, an appropriate offer could be presented. The key point in this simplified example is that the segmentation is based on characteristics of that individual, in isolation.

Now consider a case where you want to offer your existing customers a discount or coupon, but only if they show an interest and haven’t yet purchased a car insurance product from you via any channel. Perhaps this offer is for just a few select customers based on their value to the business. So how best to calculate this value? Well, a first pass might segment the customers by their location (for example, country), add up their purchases over the last few months, and then rank them, with only the top n in each country receiving the offer. You can see straight away that this is a very different approach to the earlier example. Calculating this list requires a view across all the visitors and all their activity, not just on the website but also across other channels such as call centre conversations and in-branch purchases.
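The first-pass ranking described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Celebrus computes it: the record layout, customer identifiers, dates and amounts are all made up for the example, and in practice the purchases would come from several channels rather than an in-memory list.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchase records: (customer_id, country, purchase_date, amount).
# The layout and values are illustrative assumptions, not a Celebrus format.
PURCHASES = [
    ("c1", "UK", date(2015, 3, 2), 120.0),
    ("c2", "UK", date(2015, 2, 14), 300.0),
    ("c1", "UK", date(2015, 4, 1), 250.0),
    ("c3", "UK", date(2014, 6, 30), 900.0),  # outside the time window
    ("c4", "FR", date(2015, 1, 20), 80.0),
    ("c5", "FR", date(2015, 3, 9), 60.0),
]


def top_customers_by_country(purchases, since, n):
    """Return {country: [(customer_id, total), ...]} for the top n spenders."""
    # Step 1: add up each customer's purchases made on or after `since`.
    totals = defaultdict(float)
    for customer_id, country, purchased_on, amount in purchases:
        if purchased_on >= since:
            totals[(country, customer_id)] += amount

    # Step 2: group the per-customer totals by country.
    by_country = defaultdict(list)
    for (country, customer_id), total in totals.items():
        by_country[country].append((customer_id, total))

    # Step 3: rank within each country and keep only the top n.
    return {
        country: sorted(rows, key=lambda row: row[1], reverse=True)[:n]
        for country, rows in by_country.items()
    }


offer_list = top_customers_by_country(PURCHASES, since=date(2015, 1, 1), n=1)
```

Note that no single visitor's session contains enough information to produce offer_list. Every customer's purchases have to be seen together before anyone can be ranked.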

Total purchases is one measure, but it misses some important facts. For example, how profitable are the products they purchased? To calculate profitability requires a view on the supply chain. Equally, you might also include in the calculation how much support they have required, either in calls to a contact centre or call-outs (for example, home visits), to understand the profitability of that particular customer. To calculate this kind of most profitable customer requires many data sources to be integrated from across an enterprise.
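As a rough sketch of why total purchases alone can mislead, the Python below subtracts product and support costs from revenue. The field names and the unit costs per call and per call-out are hypothetical figures chosen for the example. A real calculation would draw them from supply chain and contact centre systems.

```python
from dataclasses import dataclass


@dataclass
class CustomerActivity:
    revenue: float       # total value of the customer's purchases
    product_cost: float  # what those products cost to supply
    support_calls: int   # calls made to the contact centre
    call_outs: int       # home visits or similar call-outs


# Hypothetical unit costs, for illustration only.
COST_PER_CALL = 5.0
COST_PER_CALL_OUT = 60.0


def customer_profit(activity):
    """Revenue minus the cost of the products and the cost of supporting them."""
    support_cost = (
        activity.support_calls * COST_PER_CALL
        + activity.call_outs * COST_PER_CALL_OUT
    )
    return activity.revenue - activity.product_cost - support_cost


big_spender = CustomerActivity(revenue=1000.0, product_cost=700.0,
                               support_calls=10, call_outs=4)
steady_buyer = CustomerActivity(revenue=600.0, product_cost=350.0,
                                support_calls=1, call_outs=0)
```

With these made-up numbers the customer who spends the most is far less profitable than the one who spends less but rarely needs support, so ranking by purchases and ranking by profit would select different people for the offer.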

The requirement to integrate many data sources and then act on the results has driven the latest integration in Celebrus, which we call Ask Any Question. With our latest release we can feed customer data into Hadoop and Teradata Aster, run wide-ranging queries across huge data sets, and then act on the results.

Furthermore, we provide a playbook which shows you step by step how to achieve this. For instance, one example feeds customer data into HDFS using Apache Avro. The data is loaded into Apache Hive, where analytical queries determine the most valuable customers. The results are formatted using Apache Pig and pushed into Celebrus, ready for presentation when the customers next visit.

Apache Avro is one of the latest generation of file formats adopted by the Hadoop ecosystem. It is designed to solve many of the problems inherent in binary sequence files (versioning, language independence and schema awareness). The Hadoop Data Loader creates Avro files and pushes them directly into HDFS using WebHDFS. Once the files are in HDFS they can be added into Hive tables (and by extension, Impala) with the LOAD DATA INPATH command. This command is very efficient because it simply moves the files in HDFS into the Hive warehouse directory.
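To make the two steps concrete, here is a small Python sketch that builds the WebHDFS request URL for creating a file and the Hive statement that then loads it. It is not the Hadoop Data Loader's own code. The host name, port, file path and table name are assumptions made up for the example, and the helpers only build strings without contacting a cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_create_url(namenode, hdfs_path, overwrite=False):
    """Build the WebHDFS URL for creating a file (the request is an HTTP PUT)."""
    query = urlencode({"op": "CREATE", "overwrite": str(overwrite).lower()})
    return f"http://{namenode}/webhdfs/v1{quote(hdfs_path)}?{query}"


def hive_load_statement(hdfs_path, table):
    """Build the HiveQL statement that moves a file already in HDFS into a table."""
    # Plain string formatting is fine for a sketch with trusted values;
    # do not build statements this way from untrusted input.
    return f"LOAD DATA INPATH '{hdfs_path}' INTO TABLE {table}"


# Hypothetical names, for illustration only.
staged_file = "/staging/celebrus/visitors.avro"
create_url = webhdfs_create_url("namenode.example.com:50070", staged_file)
load_sql = hive_load_statement(staged_file, "visitor_events")
```

Here create_url is the address a client would send its PUT request to, and load_sql is the statement to run in Hive once the file has arrived. Because the load is a move within HDFS rather than a copy, the staged file no longer exists at its original path afterwards.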

If you would like to learn more about the Celebrus Data Loader, take a look at these slides.