When we started this blog some three years ago, data science was widely seen as a mere buzzword. Since then, the interest in the concept has proven to be here to stay and keeps growing steadily. Google Trends shows how the interest in the search term Data Science has developed in the last three years:
Data Science is here to stay
We think that Data Science is more than just a fancy term for statistics and agree with the popular blog KDnuggets: Data science is about creating value through data and supporting digital transformation of other processes in a company such as marketing, customer service, production etc. We believe that the positive impact of advanced analytical methods is something that can be generated across industries and is not limited to the corporate sector. Earlier this year, we discussed the adoption of advanced analytics by nonprofits. Given the already existing relevance of data science, we asked ourselves what 2020 might bring - and found some interesting hypotheses on the web.
Take everyone onto the Data Science Journey
According to towardsdatascience, some 100 papers on Machine Learning were published in 2019 on a single day. This reflects that Data Science as a whole is here to stay. Hand in hand with an increasing presence comes a certain differentiation. The mentioned blog sees a trend towards specialization among different roles in data science. On the one hand, there are experts in bringing models into production and providing the necessary infrastructure. On the other hand, there are people involved in investigative work and decision support.
The footprint of Data Science is getting larger as models are becoming an indispensable part of business operations. This implies the ongoing challenge to further increase model performance, the possible need for model retraining or rebuilding as well as continuous levels of support for model stakeholders.
We mentioned before that Data Science is essentially about turning data into value for the respective organization. This value creation is, according to towardsdatascience, not only dependent on the “physical technology” consisting of algorithms and data flows. The “social technology”, i.e. effective lines of related communication and decision-making as well as executive awareness (or, even better, a basic understanding provided by interesting in-house training in Data Science), is at least as important.
People and Tools are needed
Data Science is done by Data Scientists. According to a study by IBM, the demand for Data Scientists will grow by some 28% by 2020 (compared to 2017). Some might go as far as to call Data Scientist the “sexiest job of the 21st century” – like Harvard Business Review did back in 2012. Regardless of any labels, it can be expected that the perceived shortage of expert staff will remain in 2020 both across industries and on a global level. The good news is that further developed self-service tools will gradually improve the ease of data preparation, exploration, visualization and modelling.
Natural Language Processing
Most people think of structured information in rows and columns when they hear the term “data”. In fact, an unbelievably large amount of unstructured data, i.e. texts, speech, sounds and videos, is produced every single day. This also applies to different forms of personalized data and general customer communication. A powerful approach to make the most of unstructured text data is so-called Natural Language Processing (NLP). It is essentially about classifying texts by categories, sentiments, similarities etc. What happens under the hood is that characters are translated into numbers and further processed by models such as Neural Networks. Breakthroughs in Machine Learning and emerging libraries like TensorFlow have drastically increased the possibilities to apply NLP models to unstructured data.
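To make the step of "translating characters into numbers" a bit more tangible, here is a minimal sketch in plain Python. It is not how TensorFlow or any production NLP model works internally. It simply assumes a bag-of-words representation (word counts per text) and compares texts via cosine similarity. The example messages and all function names are made up for illustration.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts):
    """Assign every distinct word in the texts an integer index."""
    words = sorted({word for text in texts for word in tokenize(text)})
    return {word: index for index, word in enumerate(words)}


def vectorize(text, vocabulary):
    """Turn a text into a vector of word counts (bag of words)."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        if word in vocabulary:
            vector[vocabulary[word]] = count
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical customer messages, invented for this example
messages = [
    "Thank you for the quick and friendly support",
    "Thanks for the friendly and quick help",
    "Please cancel my monthly donation",
]

vocabulary = build_vocabulary(messages)
vectors = [vectorize(message, vocabulary) for message in messages]

sim_praise = cosine_similarity(vectors[0], vectors[1])  # two similar messages
sim_cancel = cosine_similarity(vectors[0], vectors[2])  # unrelated messages
print(sim_praise > sim_cancel)
```

The two friendly messages share several words and therefore end up closer to each other than to the cancellation request. Real NLP models replace the simple word counts with learned representations, but the basic idea of turning text into vectors stays the same.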
Data Privacy and Security as a relevant constraint
There is no data science without data. The “raw material” for analysis and models is often personal data, be it from customers or donors. Particularly in a European context, the public has become more aware and careful regarding the ownership of personal data. The ongoing challenge for any kind of organization involved in data science is to maintain the highest data security and protection standards, to align with best practices and to be transparent upon customer request. If organizations stick to that, there is no need to become paranoid about data protection.
What is it that fundraising nonprofits can do or learn about Data Science in 2020?
We think that a classic quote by Mark Twain gives valuable hints in this direction:
The secret of getting ahead is getting started. The secret of getting started is breaking your complex overwhelming tasks into small, manageable tasks and starting on the first one.
No matter how far you see yourself from applied and sophisticated Data Science, it will definitely pay off to be even more data-driven in 2020 and beyond. As we outlined earlier this year, there still seems to be a competitive edge in the industry for "analytical NPOs" (see our blog post for facts and figures in this regard if you are interested). Do not hesitate to ask experts or organizations you trust for guidance - joint systems will also be happy to help throughout 2020. :-)
We wish you a merry Christmas and a good start to a happy, healthy and successful 2020.
David Weber and Johannes Spiess