Two Major Reasons Behind Failure in Data Science Projects


Experts are calling 2017 the year of digital transformation and data literacy, and without data there can be no digital transformation worth talking about. For this reason, more business executives are allocating a large share of company resources to data and data analytics. More than ever before, businesses are reimagining how they approach data and analytics projects in order to drive growth.


This comes against the backdrop of statistics showing that, all too often, business organizations approach data science the wrong way. Only by identifying where they fail can these companies figure out what to change in their strategy to achieve better results in data science.


According to recent research on data and analytics, only 13% of data science projects reach completion, and even among those completed projects, a mere 8% of company executives say they are fully satisfied with the final outcome. The question, then, is: what is it about data science projects that gives businesses such dismal results? Consider the following reasons with us:

The projects often start with the wrong set of questions

One major mistake that repeats itself across data science projects is starting with the wrong set of questions. Instead of initiating the project with a clear, established goal that ultimately creates value for the business, many project leaders begin by analyzing data, hoping that an interesting insight will somehow appear from which the rest of the project can grow.


As a result, they find themselves with far too many potential projects within the initial one, each of which could yield compelling results but few of which make a strong enough business case to warrant further pursuit. This broad approach cannot, by design, produce useful analysis; it simply wastes the enterprise's IT resources.


Data science projects should always begin with a set of specific questions and hypotheses to be answered and then confirmed or dismissed. That way, the team is clear on which data needs to be analyzed, and the conclusions stay within the boundaries of implementable results.
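As a rough illustration of what "confirm or dismiss a hypothesis" can mean in practice, here is a minimal Python sketch of a two-proportion z-test. The scenario and all numbers (a hypothetical checkout-flow change and its conversion counts) are invented for the example:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Z-statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under "no difference"
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical question: did the new checkout flow lift conversion?
z = two_proportion_z(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(abs(z) > 1.96)  # True -> reject "no difference" at the 5% level
```

Framing the work this way forces the team to name the metric, the comparison, and the decision rule before any data is pulled, which is exactly the discipline the paragraph above calls for.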


The projects often use faulty data

Whether it’s a project to understand how the existing asset tracking systems (https://corp.trackabout.com/) within the business are influencing growth, or a deeper look into customer data and the ways the business can use it to increase sales, faulty data will result in the failure of the entire project. Data science experts from TrackAbout say that many people underestimate how big an impact faulty data can have on a project.

Faulty data is often the result of inadequately cleaning existing data under time constraints. Data experts recommend allocating as much as 80% of the total project time to cleaning data; doing so saves a great deal of time later in the project. After all, no one wants to end up with wrong insights that lead to wrong decisions and, in turn, to losses or missed business opportunities.
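To make the cleaning step concrete, here is a minimal pandas sketch. The dataset, column names, and thresholds are invented for illustration; the point is that duplicates, malformed dates, and impossible values are caught before analysis rather than discovered in the results:

```python
import pandas as pd

# Hypothetical raw customer data -- the kind that derails a project if used as-is.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "signup_date": ["2017-01-05", "2017-02-30", "2017-02-30", "bad", "2017-03-01"],
    "monthly_spend": ["100", "250", "250", None, "-40"],
})

cleaned = (
    raw.drop_duplicates(subset="customer_id")  # drop duplicate customer records
       .assign(
           # Coerce malformed dates/numbers to NaT/NaN instead of failing silently.
           signup_date=lambda d: pd.to_datetime(d["signup_date"], errors="coerce"),
           monthly_spend=lambda d: pd.to_numeric(d["monthly_spend"], errors="coerce"),
       )
)
# Exclude impossible values (negative spend) rather than analyzing them.
cleaned = cleaned[cleaned["monthly_spend"].fillna(0) >= 0]
print(len(cleaned))  # 3 of the 5 raw rows survive
```

Even a small script like this makes the quality problems visible and countable, which is what turns "clean the data" from a vague chore into a checkable step.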
