Travel Analytics Blog
Feeding your ELK Stack
Posted by Jonathan Boffey on Thursday, July 12, 2018
There is massive interest in open-source technologies for data analytics, including ELK/Kibana. Large travel suppliers and distributors are quick to embrace these "Big Data" technologies to underpin corporate data analytics. This is a trend typically led by IT, who are keen to deliver exciting data to the business.
The 'ELK' stack comprises Elasticsearch, Logstash and Kibana. In this configuration, Logstash collects data from log files, Elasticsearch provides a fast data store and Kibana provides the user reporting interface. The popularity of ELK appears well founded, since Kibana is a very enticing UI and the rest of the stack provides the ability to collect, process and store system log and web log data. So what's the challenge?
Well, we don't live in a one-channel world. Typical travel suppliers have direct brand.com websites, but probably more than half their bookings come via a set of indirect channels that rely heavily on XML-based APIs. Further up the supply chain, the wholesalers, distributors, aggregators and switches/gateways have technology platforms that are nearly 100% based on APIs. I say nearly because some run portals for agent logins etc., which are more akin to a website. In any event, ELK is ideal for tracking the website for the direct channel, but its log file scanning provides little more than IT-level information when it comes to the indirect XML channel. There are several reasons.
The first is that APIs have separate request and response events that appear at different times in the flow of log data, with hundreds if not thousands of other requests and responses in between. The second is that the content of each isn't a one-liner at the end of a log file - it's many lines forming a full-blown XML document which is tree-structured and full of lists of items, offers and so on. Even with the best will in the world, simple scanning doesn't match the request with the response, and it doesn't have sufficient functionality to interpret the complex data and fish out the answers to our key business questions: when and where is the traveller going, what did we offer, and at what price?
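To make the matching problem concrete, here is a minimal Python sketch of pairing a request with its response and pulling out the business answers. Everything in it is an assumption for illustration: the correlation id, the element names (SearchRQ, SearchRS, Destination, CheckIn, Offer, Hotel, Price) and the sample messages are hypothetical, and real travel XML is far larger and messier than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical log events in arrival order: (correlation_id, kind, xml_body).
# The response to "a1" arrives only after an unrelated request, "b2".
EVENTS = [
    ("a1", "request",
     "<SearchRQ><Destination>LIS</Destination>"
     "<CheckIn>2018-08-01</CheckIn></SearchRQ>"),
    ("b2", "request",
     "<SearchRQ><Destination>ROM</Destination>"
     "<CheckIn>2018-09-10</CheckIn></SearchRQ>"),
    ("a1", "response",
     "<SearchRS>"
     "<Offer><Hotel>H1</Hotel><Price currency='EUR'>120.00</Price></Offer>"
     "<Offer><Hotel>H2</Hotel><Price currency='EUR'>95.50</Price></Offer>"
     "</SearchRS>"),
]


def correlate(events):
    """Pair each response with its earlier request and extract key fields."""
    pending = {}   # correlation_id -> parsed request awaiting its response
    records = []
    for cid, kind, body in events:
        if kind == "request":
            pending[cid] = ET.fromstring(body)
        elif cid in pending:
            request = pending.pop(cid)
            response = ET.fromstring(body)
            offers = []
            for offer in response.iter("Offer"):
                price = offer.find("Price")
                offers.append({
                    "hotel": offer.findtext("Hotel"),
                    "price": float(price.text),
                    "currency": price.get("currency"),
                })
            records.append({
                "correlation_id": cid,
                "destination": request.findtext("Destination"),  # where
                "check_in": request.findtext("CheckIn"),         # when
                "offers": offers,                                # what, at what price
            })
        # A response with no pending request is ignored in this sketch.
    return records


RECORDS = correlate(EVENTS)
```

Note that the request for "b2" stays in the pending table because its response never arrives. At production volumes that table would need a timeout and a size limit, which is part of why line-by-line log scanning struggles here.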
Kibana can only report the data that it is supplied with - it's our old 'garbage in' problem being revisited, but this time it's more of an 'unbalanced diet' that needs supplementing. Processing potentially high volumes of XML traffic in real time for reporting, without killing your booking platform with the logging effort, is a challenge in itself. Getting the right 'nutrients' from this huge transactional data volume is the next obstacle. Feeding them into the ELK stack to provide the actionable business information is the final stage. Only actions lead to ROI.
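As a rough illustration of the 'nutrients' idea, the Python sketch below reduces one matched request/response record to a small flat summary and writes it as newline-delimited JSON, a format that log shippers commonly ingest. The record layout, the field names and the choice of summary fields are all assumptions made for this example, not a description of any particular product.

```python
import json

# Hypothetical matched record, as might come out of an XML correlation step.
SAMPLE = [{
    "correlation_id": "a1",
    "destination": "LIS",
    "check_in": "2018-08-01",
    "offers": [
        {"hotel": "H1", "price": 120.0, "currency": "EUR"},
        {"hotel": "H2", "price": 95.5, "currency": "EUR"},
    ],
}]


def to_summary(record):
    """Keep only the business fields, flattened for easy reporting."""
    prices = [offer["price"] for offer in record["offers"]]
    return {
        "correlation_id": record["correlation_id"],
        "destination": record["destination"],
        "check_in": record["check_in"],
        "offer_count": len(prices),
        "min_price": min(prices) if prices else None,
    }


def to_ndjson(records):
    """Serialise one compact JSON document per line."""
    return "".join(
        json.dumps(to_summary(record), sort_keys=True) + "\n"
        for record in records
    )


NDJSON = to_ndjson(SAMPLE)
```

Sending a few summary fields per transaction rather than the full XML document keeps the volume reaching the data store small, which is the trade-off this sketch is meant to show.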
So if you are one of those distribution managers who is living with low-quality information about your indirect channels, this might very well be the reason why. ELK isn't the only animal in this technology jungle, and there are similar problems with data being fed through Kafka and other big data technologies.
Triometric has recently completed a couple of very significant projects with major travel organisations to address these issues - using a mature XML processing technology to plug into the corporate big data strategy and make a real difference to the ROI. This approach is now more widely available as the Trio Data Engine.
(This article first appeared on my profile on LinkedIn)