COLUMN

Why Do BI/DW-Based Analytics Projects Fail? Hint: It's All in the Data

posted by Terence Craig

In my last column, I talked about how advances in the computing environment and increases in data volume have made legacy analytic systems (from the Data Warehouse and Business Intelligence worlds) inadequate. Now I would like to take a look at one of their most problematic processes.

One of the major components of a traditional data warehouse application is the Extract, Transform, and Load (ETL) process. The goal of this process is to take data from one or more existing applications, which are usually implemented using modern user-centric design techniques like Domain-Driven Design, and do the following:

  • Extract – pull the data from a structure that the user understands, then
  • Transform – reshape the data into a structure that is convenient for the data warehouse to access, and then
  • Load – load the data into this transformed structure in the data warehouse.

Essentially, this process takes data from a model that often makes intuitive sense to an organization’s end users (for example, product sales by store and by customer) and transforms it into something that is more convenient for the data warehouse. Quite often the result of this “transformation” is completely incomprehensible to the end user. Let me show you what I mean. In this example, the data about “product sales by store and by customer” is spread across three different tables.

Tables diagram

This of course means that for the end user to use the tool, one of two things has to happen. Either someone (usually a group of highly paid someones) has to write code to translate the data back into something the user understands, or you have to train the end users to see the world the way the tool does, and good luck with that one. Bottom line: this represents quite a bit of work, and it is the primary reason that large BI/DW/Analytics projects are notorious black holes, even by the standards of software development, where the premise is that 90% of projects will fail.
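To make that translation work concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is an assumption for illustration: the table names (store, customer, sale), the columns, and the sample rows are hypothetical and are not taken from the diagram above or from any real warehouse. It loads user-shaped sales records into three tables, then shows the kind of join someone has to write just to get back to “product sales by store and by customer.”

```python
import sqlite3

# Hypothetical source records, shaped the way an end user thinks about them.
sales = [
    {"product": "Widget", "store": "Downtown", "customer": "Acme Corp", "amount": 120.0},
    {"product": "Widget", "store": "Airport", "customer": "Acme Corp", "amount": 80.0},
    {"product": "Gadget", "store": "Downtown", "customer": "Bolt Ltd", "amount": 50.0},
]


def load_warehouse(rows):
    """Transform and load: split user-shaped rows across three tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE sale (product TEXT, store_id INTEGER,
                           customer_id INTEGER, amount REAL);
    """)
    for r in rows:
        # Each store and customer is stored once; repeats are ignored.
        db.execute("INSERT OR IGNORE INTO store (name) VALUES (?)", (r["store"],))
        db.execute("INSERT OR IGNORE INTO customer (name) VALUES (?)", (r["customer"],))
        # The sale row keeps only numeric keys pointing at the other two tables.
        db.execute(
            "INSERT INTO sale (product, store_id, customer_id, amount) "
            "SELECT ?, s.store_id, c.customer_id, ? FROM store s, customer c "
            "WHERE s.name = ? AND c.name = ?",
            (r["product"], r["amount"], r["store"], r["customer"]),
        )
    return db


def sales_by_store_and_customer(db):
    """The translation back: join the three tables into the user's view."""
    query = """
        SELECT sale.product, store.name, customer.name, SUM(sale.amount)
        FROM sale
        JOIN store ON store.store_id = sale.store_id
        JOIN customer ON customer.customer_id = sale.customer_id
        GROUP BY sale.product, store.name, customer.name
        ORDER BY sale.product, store.name, customer.name
    """
    return db.execute(query).fetchall()


report = sales_by_store_and_customer(load_warehouse(sales))
for row in report:
    print(row)
```

Even in this toy version, the question the user actually asks takes two joins and a grouping to answer. A real warehouse has many more tables than three, which is where the highly paid someones come in.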

But even when they don’t fail before they “get out of the gate,” deployed DW projects are often hated by nearly everyone who has to use them (although the person who authorized the gazillion-dollar software/hardware purchase will never admit it; if you had spent that much money, would you?).

And don’t even get me started on SQL. 

Here’s what I think: every BI vendor on the planet should send money to Microsoft because despite the huge outlays on DW/BI technology, almost all useful analysis is done in Excel by frustrated end users who download data from the BI tool and then transform it back into something that makes sense to them. And all of this is accomplished while they copiously curse their IT department for wasting their time.

This is frustrating for everyone in the organization and just doesn’t make sense.  And that’s why I founded PatternBuilders; we believe there is a better way to do this. And I will start to lay it out in my next post.


Hello World - AKA How Can PatternBuilders Help You Cross the Analytics Gap?

posted by Terence Craig

One of the joys (and dangers) of being the founder and CEO/CTO of a startup is that it is very hard for the highly qualified members of your team who actually understand corporate communications to shut you up when you decide you want to launch a blog on the corporate website. But since at least two members of my executive staff are expert shots, I will try to restrain myself appropriately and focus on what is most important to everyone here at PatternBuilders: how our analytics platform and expertise can help your organization use analytics to make your jobs easier and more efficient. Traditionally, I would open this first post with a bit of self-aggrandizement about our team and advisors, but since we have already done that here and here, I am going to dive directly into the obvious question: what do we have to offer customers like you who are looking for an improved analytics solution? Or, put another way: what can we offer you over and above what large company X offers (you know, the one that claims to be the alpha and omega of all your analytic needs)?

Here’s the short answer. We fill a very large void in software solutions focused on Analytics, as well as on its runty but famous cousins, Business Intelligence and Data Warehouses.

Here’s the big issue that you are probably dealing with today. You have all these automated transactional systems that, together, run your enterprise, whether it’s your manufacturing operations, POS Systems, ERP, Clinical Operations systems, etc. These systems produce a tremendous amount of data, but your attempts to produce analytics that improve your operations based on this data are expensive, cumbersome, unusable by most of your staff, and not ready for the real-time Web 2.0 world that defines your operating environment. You’ve read “Super Crunchers” by Ian Ayres and the Freakonomics column of the New York Times, and you know the sort of stuff that can be achieved if you can turn this data into information, but your BI and Data Warehouse vendors just don’t seem to be able to get you there.

Why is this such a big problem, when Business Intelligence and Data Warehouses would seem to be natural fits? Here’s the thing: they were designed in, and for, a different era. Back then:

  • Jimmy Carter was President
  • Sun was an independent company
  • A gigabyte of data was a lot
  • A gigabyte of memory was impossible
  • A low-end personal computer cost $6,000
  • Multi-threading was a research project

At that time, technologies and methodologies we take for granted today did not even exist, such as:

  • The Web
  • XML
  • Java
  • .NET
  • Domain-Driven Design (a concept where software development projects are driven by the domain and domain logic; for example, interfaces should be designed using the language and processes of the business user)

To put it more succinctly, the original Data Warehouse designers had far tighter constraints on computing power, and far less data and far fewer users to deal with. In fact, the idea of designing a system that would provide real-time analysis of data arriving at rates measured in terabytes per day would have seemed like science fiction back then, as would the idea of a front end that could be distributed securely to potentially thousands of simultaneous users across the globe. But these are the requirements of a modern analytics system, and they lead into my much longer answer as to why we need a new analytics technology platform. In my next post, I will begin to talk about the differences in analytics approaches and the advantages our technology and services offer.