Non-profit organisations (NPOs) use offline marketing strategies both to attract the attention of potential donors and to collect donations. Because mostly donors from “aging” generations react to direct campaigns, and because those campaigns are slowly losing their impact, NPOs are investing in new ways of collecting donations, including online platforms. Online media not only have the potential to reach a wider range of users; they also offer other advantages, including traceability, the option to gather user-level data and the fact that customers can potentially be better addressed over online channels. In addition, online media have proven to be more effective than traditional advertising in terms of customer conversion.
With rising inflation and living costs and shrinking marketing budgets, it is crucial for NPOs to understand their customers' interests and behaviour in order to address them more effectively. For online marketing platforms, this translates into studying which online sources (like email links, YouTube ads or paid search) bring the most potential donors to their websites.
A person who accesses a website to buy a product or make a donation may have visited the website before to research the (donation) product, before finally deciding to convert or donate. It is therefore interesting to know the influence of each online source on a final donation made through a website; this is known as the attribution problem and can be addressed with attribution-modelling techniques. Different attribution-modelling techniques exist, which try to estimate the importance of the different marketing channels for the total conversion of customers (or donors). While advertisers tend to use very simplistic, heuristic ones, academia has focused on more complex, data-driven methods, which have been shown to obtain better results.
Using data from the visits and donations made during 2021 on the website of an SOS organisation, we studied the effect that different marketing channel sources (i.e. "from where do users access a website") had on donations. The analyses were done in Jupyter Notebooks using Python and two different modelling approaches: last-touch attribution (LTA), a heuristic method, and Markov chains (MC), a data-driven method.
After an initial data exploration, and after filtering out data that represented “abnormal” visiting behaviour (visits to the career site and visits made on #GivingTuesday), we applied both attribution-modelling methods.
From our analysis, it could be easily concluded that different online marketing channels do have different effects on donations and donation revenues.
However, and to our surprise, we found minimal differences between the LTA and MC results for both the estimated number of donations and the revenue per channel. This is probably caused by a “special” behaviour of donors: it seems that most of them visit the website just once during the year and decide, on the fly, whether they would like to donate. This overrepresentation of paths with just one touchpoint causes the heuristic and the data-driven method to assign the same (donation) value to the channels.
While the results are similar, we find that working with data-driven methods such as Markov chains still has advantages, including the fact that:
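To illustrate why single-touchpoint paths make the two approaches agree, here is a minimal sketch in plain Python. It is not the code used in our study: the function names, the channel names and the toy paths are all made up for illustration, and the Markov part is a simple first-order chain with the usual "removal effect" normalisation.

```python
from collections import Counter, defaultdict


def last_touch(paths):
    """Heuristic: credit every conversion to the last channel of its path."""
    credit = Counter()
    for channels, converted in paths:
        if converted:
            credit[channels[-1]] += 1
    return dict(credit)


def transition_probs(paths):
    """First-order transition probabilities between start, channels, conv, null."""
    counts = defaultdict(Counter)
    for channels, converted in paths:
        states = ["start", *channels, "conv" if converted else "null"]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}


def conversion_prob(probs, removed=None, sweeps=200):
    """Probability of reaching 'conv' from 'start'.

    A removed channel behaves like 'null': visits to it never convert.
    """
    value = defaultdict(float)  # unseen states (incl. 'null') default to 0.0
    value["conv"] = 1.0
    for _ in range(sweeps):  # fixed-point iteration instead of a matrix solve
        for state, nxt in probs.items():
            if state != removed:
                value[state] = sum(p * value[t] for t, p in nxt.items()
                                   if t != removed)
    return value["start"]


def markov_attribution(paths):
    """Share total conversions by normalised removal effect.

    Assumes at least one converting path, otherwise the ratios are undefined.
    """
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = {c for chans, _ in paths for c in chans}
    removal = {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}
    total_conversions = sum(1 for _, converted in paths if converted)
    norm = sum(removal.values())
    return {c: total_conversions * r / norm for c, r in removal.items()}


# Hypothetical toy data: every visitor touches exactly one channel.
paths = (
    [(("email",), True)] * 3 + [(("email",), False)] * 7
    + [(("search",), True)] * 1 + [(("search",), False)] * 9
)
lta = last_touch(paths)
mc = markov_attribution(paths)
print(lta, mc)
```

With only one touchpoint per path, removing a channel removes exactly the conversions that last-touch would credit to it, so both methods hand out the same values. Adding multi-touch paths such as `(("search", "email"), True)` to the toy data is a quick way to see the two results drift apart.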
Do you want to know more about our results and insights? Are you also interested in a similar study? Do you have online data but are not making use of it?
Get in touch with us! 💻📱📧
We at joint systems can help you get the most out of your data 🙂😉
Some 3.5 years ago we discussed the state of data science in the nonprofit sector in this blog post. The world has changed significantly since then; however, being an insight-driven (nonprofit) organization is more imperative than ever. So, what is actually the status quo of data science and analytics in the nonprofit sector?
The most comprehensive survey on the state of data science and machine learning is the annual Machine Learning and Data Science Survey conducted by the platform Kaggle.com. In 2021, almost 26,000 people from all across the globe took part. The participants were also asked about the industry they currently work in. Luckily, the survey designers had added "Nonprofit & Services" as an option for this industry-related question. This enabled us to download the full survey response dataset from the Kaggle website. Using a global filter to focus on the responses from the nonprofit sector, we managed to put together this dashboard:
Back in 2019, when we last blogged about the status of data science in the nonprofit sector, we had already started our joint journey with our customers and partners. Still, we are continuous learners. However, together with our clients, we managed to write numerous success stories on how data science and analytics can make fundraising more efficient and successful. If you want to learn more, please go ahead and browse through the free resources we offer on our platform analytical-fundraising4sos.com or watch the video below for some inspiration.
We wish you all the best in these turbulent and challenging times. Let's keep in touch and jointly make the most of fundraising data!
The world has become (and maybe always was, to an extent) a volatile, uncertain, complex and ambiguous place (VUCA). This has turned the communication of complex topics and interdependencies into an urgent need. Data is ubiquitous, but on its own it is useless unless converted into information and ultimately knowledge. This is where the concept of Data Visualization or, more holistically, ideas from the field of Data Storytelling can make crucial contributions. According to Dr. Jennifer Aaker, an American behavioural scientist at Stanford University, stories are remembered up to 22 times more than facts alone.
Often, data storytelling is simply equated with effective data visualization. In fact, the practice of creating data stories is a structured approach with the goal of communicating data insights as an interplay of three elements: data, visuals and narrative. Creating convincing narrative visualizations requires not only the skills of data analysis experts but also the knowledge of designers, artists and psychologists, who employ certain techniques, follow specific structures and frameworks, and use dedicated tools.
The creator of a visual may not be able to put his or her intention into explicit knowledge, since successful data visuals are often “a matter of taste” and creating visual representations of data is hence a subjective process. This suggests that communicating data-driven insights to decision makers requires data storytellers who are skilled in the “art of data storytelling”, which can best be learned through best practices. Data Storytelling is essentially about an "art and science mindset".
Lisa Oberascher, data analyst and alumna of Management, Communication & IT at MCI Innsbruck, recently finished her bachelor thesis about Data Storytelling. Lisa conducted interviews with data practitioners and put together a great Data Storytelling cheat sheet.
The respective PDF can be downloaded here:
We wish you a great summer and happy data-storytelling :-)
Data is becoming more and more pervasive across industries. Analytics has come quite some way in recent years. A growing number of organizations have implemented analytics solutions and started exploring the potential of data science. With continuing technological advances and accelerating digitization, it is not always easy to keep an overview of the current developments in advanced analytics and data science. This end-of-year post tries to provide readers with information in a nutshell on contemporary issues in analytics and data science from a nonprofit and fundraising standpoint.
The infographic below is a "one-pager" for decision makers, analysts, data scientists and anyone interested. We differentiate between the topics that seem to be here to stay and relevant trends that should definitely be considered. In addition, we drop some hyped buzzwords that might be topics for the future and are worth observing. Please feel free to download, share, comment etc.
It has been a challenging but also inspirational year for many of us. We wish you and your dear ones a happy and peaceful Christmas 2021 and a good start in a successful, healthy and happy 2022.
These are our Christmas wishes in dozens of languages 🎄!
All the best and see you in 2022!
The effects of the COVID-19 pandemic acted as an accelerator for digitalization in terms of processes, services, or whole business models. Digital technologies are transforming the economy and are becoming ubiquitous. An increasingly widespread application of algorithms is decision-making in businesses, governments, or society as a whole. Algorithms might, for instance, determine who is recruited and promoted, who is provided a loan or housing, who is offered insurance, or even which patients are seen by doctors. Algorithms have become important actors in organizational decision making, a field that has traditionally been exclusive to humans. As these decisions often have an ethical dimension, the delegation of roles and responsibilities within these decisions deserves scrutiny. This is where Corporate Responsibility comes into play ...
Luckily, as social nonprofit organizations work in the interest of the common good in one way or another, Corporate Responsibility tends to be rooted in the "DNA" of nonprofits. At the same time, algorithms have also made their way into the sector of fundraising nonprofit organizations, as we had already highlighted in a dedicated blog post from 2019. Compared to other contexts such as human resource management or the labour market (see for example this critical discussion of the algorithm used at the Austrian Labour Market agency "AMS"), the consequences of algorithmic decision making in the context of fundraising nonprofits will tend to be rather harmless. However, in the light of technological advances and the need for nonprofits to act as active members of modern society that have a voice, NPO decision makers should be aware of the big picture in terms of "Ethical AI".
Implications and Challenges
When scrutinizing the ethics of algorithms, not only the algorithms themselves but also their actual implementation in software and platforms should be considered. Two groups of concerns can be identified in terms of the ethical challenges implied by algorithms. On the one hand, there are epistemic concerns, which arise when evidence provided by algorithms is inconclusive, inscrutable, or misguided. On the other hand, there are normative concerns related to unfair outcomes, transformative effects and traceability. These normative concerns have in common that they are related to the actions derived from algorithmic results. In a nutshell, the mentioned concerns can be summarized as follows:
So what? Three things nonprofit decision makers can do (at least)
Any questions or input? Let's keep in touch!
We wish you a smooth start in a hopefully pleasant and successful fall of 2021.
All the best!
Over the last decades, a lot of effort has been put into developing machine learning algorithms and methods. Those methods are now widely used among companies and let us extract meaningful insights from our raw data to solve complex problems that could hardly be solved otherwise. They make our life (and our job) easier, but at what cost?
There is a good reason why machine learning methods are known as “black boxes”: they have become so complex that it is hard to know exactly what is going on inside them. However, understanding how models work and making sure that our predictions make sense is an important issue in any business environment. We need to trust our model and our predictions in order to use them for business decisions. Understanding the model also helps us debug it and potentially detect bias, data leakage and wrong behaviour.
Towards interpretability: The importance of knowing our data
We should take into account that, whenever we talk about modelling, a lot of work on data preparation and understanding needs to happen beforehand. Starting with the clients’ needs or interests, those need to be translated into a proper business question, upon which we will then design an experiment. That design should specify not just the desired output and the proper model to use for it, but also (and more importantly) the data needed for it. That data needs to exist, be queried and be of sufficient quality to be used. Of course, the data also needs to be explored, and useful variables (i.e. variables related to the output of interest) need to be selected.
In other words: Modelling is not an isolated process and its results cannot be understood without first understanding the data that has been used to get those results, as well as its relationship with the predicted outcome.
Interpretable vs. non-interpretable models
Until now, we have only talked about black-box models. But are all models actually hard to interpret? The answer is no. Some models are simpler and intrinsically interpretable, including linear models and decision trees. But since that decrease in complexity comes at a cost in performance, we usually tend to use more complex models, which are hardly interpretable. Or are they?
Actually, intensive research has been put into developing model interpretability methods, and two main types of methods exist:
Those methods can also be grouped, depending on the scope of their predictions, into:
Some global interpretability examples
As previously mentioned, probably the most widely used method is the calculation of the feature importance, and many packages have their own functions to calculate it. For instance, the package caret has the function varImp(), which we have used to plot the following example. There, we can see how the features “gender-male” and “age” seem to be the most important ones for predicting the survival probability on the Titanic (yes! we have used the famous Kaggle Titanic dataset to build our models).
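For readers who want to see the idea behind a feature importance score, here is a minimal sketch of permutation importance in plain Python. This is a model-agnostic variant and not what caret's varImp() computes internally; the toy model, the synthetic passengers and all names below are assumptions made purely for illustration.

```python
import random


def toy_model(row):
    """Hypothetical stand-in for a fitted classifier (1 = survived)."""
    return 1 if (not row["is_male"]) or row["age"] < 10 else 0


def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats


# Synthetic passengers (not the Kaggle data); labels follow the toy rule.
rng = random.Random(42)
rows = [
    {"is_male": rng.random() < 0.6,
     "age": rng.randint(1, 70),
     "fare": rng.uniform(5, 100)}
    for _ in range(200)
]
labels = [toy_model(r) for r in rows]

importance = {
    f: permutation_importance(toy_model, rows, labels, f)
    for f in ("is_male", "age", "fare")
}
print(importance)
```

A feature the model never looks at (here "fare") gets an importance of exactly zero, while shuffling a feature the model relies on breaks its predictions and yields a positive score.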
Partial dependence plots are also widely used. These plots show how the predicted output changes when we change the values of a given predictor variable. In other words, they show the effect of a single feature on the predicted outcome, averaged over the values of all other features.
In order to build them, the function partial() from the package pdp can be used. For instance, in the following partial dependence plot we can see how paying a low fare seems to have a positive effect on survival, which makes sense knowing, for instance, that children had preference on the boats!
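The computation behind such a plot can be sketched in a few lines of plain Python. This is only an illustration of the principle and not the pdp package: the toy scoring function, its hard-coded coefficients and the four example passengers are made up, so the direction of the curve says nothing about the real Titanic data.

```python
def toy_model(row):
    """Hypothetical survival score with hard-coded coefficients (not fitted)."""
    return 0.8 - 0.4 * row["is_male"] - 0.004 * row["age"] - 0.001 * row["fare"]


def partial_dependence(model, rows, feature, grid):
    """For each grid value, set the feature to it in every row and average."""
    curve = []
    for value in grid:
        preds = [model({**r, feature: value}) for r in rows]
        curve.append(sum(preds) / len(preds))
    return curve


# Made-up passengers used as background data.
rows = [
    {"is_male": 1, "age": 30, "fare": 24},
    {"is_male": 0, "age": 25, "fare": 80},
    {"is_male": 1, "age": 8, "fare": 15},
    {"is_male": 0, "age": 60, "fare": 40},
]
grid = [10, 50, 100]
curve = partial_dependence(toy_model, rows, "fare", grid)
for fare, avg_pred in zip(grid, curve):
    print(fare, round(avg_pred, 3))
```

Each point of the curve is an average over all passengers, which is why a partial dependence plot describes the model's behaviour globally rather than for one observation.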
Some local interpretability examples
Local interpretability techniques can be studied with the packages DALEX and modelStudio, which provide a very nice and interactive dashboard where we can choose which methods and which observations we are most interested in.
One of the best methods they contain is the so-called break-down plot, which shows how the contributions attributed to individual explanatory variables change the mean model prediction to yield the actual prediction for a particular single observation. In the following example of a 30-year-old male travelling in 2nd class, who paid 24 pounds and boarded in Cherbourg, we can see how the boarding port and the age had a positive contribution to the survival prediction, whereas his gender and the class had a negative one. In this way, we can study each of the observations that we want or have to focus on, for instance if we think that the model is not working properly on them.
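The logic of a break-down decomposition can be sketched as follows. This is a simplified illustration and not DALEX's implementation: the order of the variables is fixed by hand here, and the toy model, the background passengers and the feature names are hypothetical.

```python
def toy_model(row):
    """Hypothetical survival score with hard-coded coefficients (not fitted)."""
    return 0.8 - 0.4 * row["is_male"] - 0.004 * row["age"] - 0.1 * (row["pclass"] - 1)


def mean_prediction(model, rows):
    return sum(model(r) for r in rows) / len(rows)


def break_down(model, rows, observation, order):
    """Fix the observation's values one feature at a time, in the given order,
    and record how the mean prediction over the data moves at each step."""
    current = [dict(r) for r in rows]  # copy so the input stays untouched
    mean_pred = mean_prediction(model, current)
    contributions = {"intercept": mean_pred}
    for feature in order:
        for r in current:
            r[feature] = observation[feature]
        new_mean = mean_prediction(model, current)
        contributions[feature] = new_mean - mean_pred
        mean_pred = new_mean
    return contributions


# Made-up background passengers and one observation to explain.
rows = [
    {"is_male": 1, "age": 40, "pclass": 3},
    {"is_male": 0, "age": 25, "pclass": 1},
    {"is_male": 0, "age": 8, "pclass": 3},
    {"is_male": 1, "age": 60, "pclass": 1},
]
observation = {"is_male": 1, "age": 30, "pclass": 2}
contributions = break_down(toy_model, rows, observation,
                           order=["is_male", "age", "pclass"])
print(contributions)
```

Once every feature has been fixed, all rows equal the observation, so the intercept plus the contributions add up to the model's prediction for that passenger. For models with interactions the individual contributions depend on the chosen order.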
SHAP values are a similar method, which consists of taking each feature and every combination of the remaining features, and checking how adding that feature to each combination changes the prediction. A feature's SHAP value is its average contribution across those combinations.
In the following example, for the same observation we have just analysed, we can see that the results are very similar: gender shows the biggest and most negative contribution, while the boarding port has the biggest and most positive effect on the survival prediction for that specific passenger.
Last, if we are interested in how the predictions for individual observations change when feature values change, we can study individual conditional expectation plots. Even though they can only display one feature at a time, they give us a feeling for how predictions change when feature values change. For instance, in the following example we can see how increasing the age has a negative effect on the predicted survival of the Titanic passengers.
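For a handful of features, Shapley values can be computed exactly by enumerating all feature combinations. The sketch below does just that in plain Python; it is not the approximation used by SHAP libraries, and the toy model (with a made-up interaction term), the background passengers and all names are assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def toy_model(row):
    """Hypothetical survival score with an interaction term (not fitted)."""
    bonus = 0.1 if (row["is_male"] == 0 and row["pclass"] == 1) else 0.0
    return (0.8 - 0.4 * row["is_male"] - 0.004 * row["age"]
            - 0.1 * (row["pclass"] - 1) + bonus)


def coalition_value(model, rows, observation, coalition):
    """Mean prediction when the coalition's features take the observation's values."""
    fixed = {f: observation[f] for f in coalition}
    preds = [model({**r, **fixed}) for r in rows]
    return sum(preds) / len(preds)


def shapley_values(model, rows, observation):
    """Exact Shapley values: weighted average gain over all feature subsets."""
    features = list(observation)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(model, rows, observation, subset + (f,))
                without_f = coalition_value(model, rows, observation, subset)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Made-up background passengers and one observation to explain.
rows = [
    {"is_male": 1, "age": 40, "pclass": 3},
    {"is_male": 0, "age": 25, "pclass": 1},
    {"is_male": 0, "age": 8, "pclass": 3},
    {"is_male": 1, "age": 60, "pclass": 1},
]
observation = {"is_male": 1, "age": 30, "pclass": 2}
phi = shapley_values(toy_model, rows, observation)
baseline = coalition_value(toy_model, rows, observation, ())
print(baseline, phi)
```

The values always add up to the difference between the prediction for the observation and the mean prediction over the background data. Because the number of subsets doubles with every feature, real implementations rely on sampling or model-specific shortcuts.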
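Individual conditional expectation (ICE) curves are straightforward to compute by hand, as the following plain-Python sketch suggests. The toy model, in which age matters more for men than for women, is invented for illustration only, as are the three passengers and the age grid.

```python
def toy_model(row):
    """Hypothetical survival score; the age effect is stronger for males."""
    slope = 0.004 + 0.002 * row["is_male"]
    return 0.8 - 0.4 * row["is_male"] - slope * row["age"]


def ice_curves(model, rows, feature, grid):
    """One curve per observation: vary the feature, keep everything else fixed."""
    return [[model({**r, feature: value}) for value in grid] for r in rows]


# Made-up passengers.
rows = [
    {"is_male": 1, "age": 30},
    {"is_male": 0, "age": 25},
    {"is_male": 1, "age": 8},
]
grid = [5, 20, 40, 60]
curves = ice_curves(toy_model, rows, "age", grid)

# Averaging the ICE curves point by point gives the partial dependence curve.
pdp = [sum(column) / len(column) for column in zip(*curves)]

for passenger, curve in zip(rows, curves):
    print(passenger, [round(p, 3) for p in curve])
print("average:", [round(p, 3) for p in pdp])
```

Because every observation keeps its own curve, ICE plots can reveal differences between groups (here, steeper curves for male passengers) that the averaged partial dependence curve would hide.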
Some last words
In this post, we have given a brief introduction to the interpretability of machine learning models, explained why it is important to actually be able to interpret our results, and shown some of the most commonly used methods. But just as a reminder: for a similar performance, we should always prefer simpler models that are interpretable per se over super complex machine learning ones!