COLUMN
Why do BI/DW Based Analytics Projects Fail? Hint: It's All in the Data
posted by Terence Craig
In my last column, I talked about how advances in the computing environment and
increases in data volume have made legacy analytic systems (from the Data Warehouse
and Business Intelligence worlds) inadequate. Now I would like to take a look at
one of their most problematic processes.
One of the major components of traditional data warehouse applications is the Extract,
Transform, and Load (ETL) process. The goal of this process is to take data from
one or more existing applications, which are usually implemented using modern user-centric
design techniques like Domain-Driven Design, and do the following:
- Extract – pull the data from a structure that the user understands, then
- Transform – change the structure of the data into one that makes
it convenient for the data warehouse to access, and then
- Load – load the data into this transformed structure in the data warehouse.
Essentially, this process takes data from a model that often makes intuitive sense
to an organization’s end users (for example, product sales by store and by customer) and
then transforms it into something that is more convenient for the data warehouse.
Quite often, the transformed structure is completely incomprehensible to the end user.
Let me show you what I mean. In this example, the data about “product sales by store
and by customer” is spread across three different tables.
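As a rough sketch of how a single, intuitive sales record can end up scattered across three tables, consider the Python below. Everything in it is an assumption made for illustration: the `Sale` record, the sample data, and the table names `dim_store`, `dim_customer`, and `fact_sales` are hypothetical and do not come from any particular warehouse product.

```python
from collections import namedtuple

# Hypothetical source record, shaped the way an end user thinks about a sale.
Sale = namedtuple("Sale", ["store", "customer", "product", "amount"])


def extract():
    """Extract: pull records from the (hypothetical) source application."""
    return [
        Sale("Downtown", "Alice", "Widget", 20.0),
        Sale("Downtown", "Bob", "Gadget", 35.0),
        Sale("Airport", "Alice", "Widget", 20.0),
    ]


def transform(sales):
    """Transform: split each record across three warehouse-style tables."""
    store_ids = {}
    customer_ids = {}
    fact_rows = []
    for sale in sales:
        # Assign a numeric surrogate key the first time a name is seen.
        store_id = store_ids.setdefault(sale.store, len(store_ids) + 1)
        customer_id = customer_ids.setdefault(sale.customer, len(customer_ids) + 1)
        fact_rows.append((store_id, customer_id, sale.product, sale.amount))
    return {
        "dim_store": [(key, name) for name, key in store_ids.items()],
        "dim_customer": [(key, name) for name, key in customer_ids.items()],
        "fact_sales": fact_rows,
    }


def load(tables):
    """Load: copy the transformed tables into the (in-memory) warehouse."""
    warehouse = {}
    for table_name, rows in tables.items():
        warehouse[table_name] = list(rows)
    return warehouse


warehouse = load(transform(extract()))
for table_name, rows in warehouse.items():
    print(table_name, rows)
```

After the load, a row in `fact_sales` contains only numeric keys for the store and the customer, so nobody can read it as "product sales by store and by customer" without consulting the other two tables.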
This of course means that for the end user to use the tool, one of two things has
to happen. Either someone (usually a group of highly paid someones) has to write code
to translate the data back into something the user understands, or you have to train
the end users to see the world the way the tool does. Good luck with that one.
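To give a feel for the first option, here is a minimal sketch of the kind of translation code someone ends up writing. It uses Python's built-in `sqlite3` module as a stand-in for a warehouse; the three tables, their columns, and the sample rows are hypothetical and chosen only to illustrate the idea.

```python
import sqlite3

# An in-memory database standing in for the warehouse (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, store_name TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE fact_sales (
        store_id INTEGER, customer_id INTEGER, product TEXT, amount REAL
    );
    INSERT INTO dim_store VALUES (1, 'Downtown'), (2, 'Airport');
    INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO fact_sales VALUES
        (1, 1, 'Widget', 20.0),
        (1, 2, 'Gadget', 35.0),
        (2, 1, 'Widget', 20.0);
""")


def sales_by_store_and_customer(connection):
    """Join the three tables back into the view the end user asked for."""
    query = """
        SELECT s.store_name, c.customer_name, f.product, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        GROUP BY s.store_name, c.customer_name, f.product
        ORDER BY s.store_name, c.customer_name, f.product
    """
    return connection.execute(query).fetchall()


report = sales_by_store_and_customer(conn)
for row in report:
    print(row)
```

Even in this toy case, answering one plain business question takes two joins and a grouping clause. A real warehouse schema typically involves many more tables, which is where the cost of this translation layer comes from.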
Bottom line: this represents quite a bit of work and is the primary reason that
large BI/DW/Analytics projects are notorious black holes, even by the standards of software
development, where the premise is that 90% of projects will fail.
And even the DW projects that make it “out of the gate” and get deployed are
often universally hated (although the person who authorized the gazillion
dollar software/hardware purchase will never admit it; if you had spent that much
money, would you?).
And don’t even get me started on SQL.
Here’s what I think: every BI vendor on the planet should send money to Microsoft
because despite the huge outlays on DW/BI technology, almost all useful analysis
is done in Excel by frustrated end users who download data from the BI tool and
then transform it back into something that makes sense to them. And all of this
is accomplished while they copiously curse their IT department for wasting their
time.
This is frustrating for everyone in the organization and just doesn’t make sense.
And that’s why I founded PatternBuilders; we believe there is a better way to do
this. And I will start to lay it out in my next post.
Hello World - AKA How Can PatternBuilders Help You Cross the Analytics Gap?
posted by Terence Craig
One of the joys (and dangers) of being the founder and CEO/CTO of a startup is that
it is very hard for the highly qualified members of your team who actually understand
corporate communications to shut you up when you decide you want to launch a blog
on the corporate website. But since at least two members of my executive
staff are expert shots, I will try to restrain myself appropriately and focus on
what is most important to everyone here at PatternBuilders:
how our analytics platform and expertise can help your organization use
analytics to make your jobs easier and more efficient. Traditionally, I would
open this first post with a bit of self-aggrandizement about our team and advisors
but since we have already done that here and here, I am going to dive directly into the obvious
question: what do we have to offer customers like you who are looking for an improved
analytics solution? Or, put another way: what can we offer you over and above
what large company X offers (you know, the one that claims to be the alpha and omega
of all your analytic needs)?
Here’s the short answer. We fill a very large void in software solutions that are
focused on Analytics, as well as on its runty but famous cousins, Business Intelligence
and Data Warehouses.
Here’s the big issue that you are probably dealing with today. You have all these
automated transactional systems that, together, run your enterprise, whether it’s
your manufacturing operations, POS systems, ERP, clinical operations systems, etc.
These systems produce a tremendous amount of data, but your attempts to produce analytics
to improve your operations based on this data are expensive, cumbersome, unusable
by most of your staff, and not ready for the real-time Web 2.0 world that defines
your operating environment. You’ve read “Super Crunchers” by Ian Ayres and
the Freakonomics column of the New York Times and know the sort of stuff
that can be achieved if you can turn this data into information, but your BI and
Data Warehouse vendors just don’t seem to be able to get you there.
Why is this such a big problem since Business Intelligence and Data Warehouses would
seem to be natural fits? Here’s the thing: they were designed in, and for, a different
era. Back then:
- Jimmy Carter was President
- Sun was an independent company
- A gigabyte of data was a lot
- A gigabyte of memory was impossible
- A low-end personal computer cost $6,000
- Multi-threading was a research project
At that time, many technologies and methodologies we take for granted today did not even
exist, such as:
- The Web
- XML
- Java
- .NET
- Domain-Driven Design (a concept where software development projects are driven by the domain
and domain logic; for example, interfaces should be designed using the language and
processes of the business user)
To put it more succinctly, the original Data Warehouse designers had far tighter
constraints on computing power, and far less data and far fewer users
to deal with. In fact, the idea of designing a system that would provide real-time
analysis of data arriving at rates measured in terabytes per day would have seemed
like science fiction back then, as would the idea of creating
a front-end that could be distributed securely to potentially thousands
of users across the globe simultaneously. But these are the requirements of
a modern analytics system, and they lead into my much longer answer as to why we need
a new analytics technology platform. In my next post, I will begin to talk about
the differences in analytics approaches and the advantages our technology and services
offer.