The future of ... data for good

How to close the gap between the data haves and the data have-nots

January 18, 2024 | By Vicki Hyman

In India, smallholder farmers are reducing their harvest loss and increasing their selling prices through the use of cold storage, which is connected to a mobile app that tracks the shelf life of produce in real time for digital inventory management.

In Mozambique, many laborers advertise for work by painting their phone numbers on planks nailed to trees. Now they’re getting labor market insights delivered directly to their phones and access to tools to improve their marketing and business management skills.

And when auditing lending records from banks in Colombia, Mexico and India, researchers found a high risk of future bias against women applicants. The low representation of women in the existing data skewed the algorithms used to spot creditworthy applicants. So they built a new one that’s gender-fair.

None of this would have been possible a decade ago. But the rise in connected devices and exponential growth in the data produced, combined with rapid advances in artificial intelligence and machine learning, has unleashed the potential of data science. Until recently, however, governments, nonprofits and civic organizations have lacked the budgets, staff and capacity to take full advantage of data science to help more people.

Founded by the Mastercard Center for Inclusive Growth and the Rockefeller Foundation in 2020, data.org democratizes data by forging partnerships around the world to harness data science to tackle society’s most pressing issues. Among its initiatives: the Inclusive Growth and Recovery Challenge, which funded breakthrough data for social good concepts, including the three above, and the Capacity Accelerator Network, which aims to train a million data professionals by 2032 through hubs in more than 20 countries and counting. Mastercard continues to support data.org as it accelerates impact through data science at scale.

“Data has the potential to widen the gap between the haves and have-nots,” says Shamina Singh, founder and president of the Center for Inclusive Growth. “We need to continue to do the work that's necessary to ensure that data science for social impact creates inclusive growth.”

The Mastercard Newsroom spoke to Singh and Danil Mikhailov, the executive director of data.org, about tackling the challenges facing the nascent field of data for social impact, why diversity is crucial in developing the talent pipeline and where they’re seeing success.

Shamina, your work at the Center aims to reduce economic inequality and narrow the digital divide that perpetuates that inequality. There’s also information inequality. What is the “data divide” and how does that manifest itself in the communities in which the Center works?

Singh: Data is power, and the enormous race we’re seeing around AI right now is proof of that. What we need to keep top of mind is that the people who are losing the race are also those people who could benefit the most from data access. Information inequality is the idea that information is incredibly powerful, incredibly useful, and those who have access to it could really accelerate their growth, while those who don't have access or capacity could be left behind. This tracks with what we’ve seen in the financial inclusion space, where as more people entered the digital economy, we saw more people being left behind.

We’re trying to address these gaps that are forming through data-driven initiatives and the intentional implementation of new technologies with inclusion at the forefront. What we’re doing at the Center is creating capacity for the social sector to realize the power of their own data to put it to use for social good. That’s the basis of this partnership and the creation of data.org — to actually create a new institution, a new way of doing this work using new technology and new data resources.

Danil, a year ago data.org released a report on the key trends and tensions as the field of data for social impact matures. What are some of the challenges that organizations still face? Has there been any progress? Has anything new popped up since generative AI has taken hold in the last year?

Mikhailov: Shamina, I love your framing of information inequality: the data haves and the data have-nots. That is still a reality. The gap isn’t shrinking. The gap is increasing. We are making progress to try and reduce the rate at which the gap is increasing. We’ve been using data in different fields for decades, centuries. The current generation of data science technologies, including gen AI, is so new and is changing so fast that it leads to a field that is very fragmented. Everyone is launching their startups or their new tools and approaches, which means it’s hard to bring the field together around some of the common challenges. And take the definition of data for social impact, or impact AI. They’re new terms, and the field hasn’t yet coalesced around what we mean by them. What data.org is doing is bringing together many of the actors in the space (social impact organizations, tech companies and startups, academic institutions, etc.) around what we mean by those terms.

Can you give me an example?

Mikhailov: We’ve done a lot of work in defining a role we think is missing in the field: a data ecosystem designer. We think, fundamentally, what we need to make much healthier data ecosystems is to find people who will focus on bringing organizations together around certain principles. Just as in urban planning (building a healthy city that has good public spaces and private spaces, a good balance of architecture), you need somebody who is in charge. So we’ve been doing work with other organizations to unite around this concept of ethical data ecosystem design. Then we need to fix funding. There is a lot of work that needs to go into this, and Mastercard has been building sustainable funding streams for this kind of work. Technology is expensive. There’s no way to avoid it. When you build data science techniques, you’re competing with the big tech companies of the world. So what we’ve been doing is actually flipping this on its head by partnering with big tech giants, to connect them with the social impact sector, providing a channel to help the sector and for them to use their muscles to help lift others.

With support from the Mastercard Center for Inclusive Growth, data.org's U.S. Financial Inclusion Accelerator advances curricula on data for social impact through a consortium that includes historically Black colleges and universities, Hispanic-serving institutions and community colleges.

The report also called out the need for more data scientists, particularly in developing countries where the work can have great impact but also where there's perhaps not the educational infrastructure to train data scientists. What are you doing to address this?

Mikhailov: We need more data scientists. We need them to be different. We need them in different places. First of all, numbers. We’re not training enough data scientists for the new economy, not in the private sector let alone the social impact sector. So we need more investment in universities, more informal training, more professional training for those who already have jobs to retrain and add the desired skills to their existing skillsets. But we also need those people to be different. So we need to invest more in making sure that there are more women data scientists, and that there are more data scientists from different, currently disenfranchised, backgrounds, particularly in the Global South. We also need the skills to be different. So at the moment, we're teaching data science just as a technical discipline — mathematics, coding, which are hugely important. But often, to solve problems and create social impact, you need to understand the subject matter. So you need to understand, for example, when you talk about climate or health or financial inequality, what are the causes behind those problems? If you only have technical skills, you often can do more harm than good when you try to create solutions. So teaching data scientists to have the interdisciplinary skills of both the tech knowledge and understanding of the subject matter is something that we focus on in our Capacity Accelerator Network.

Singh: At Mastercard, we’ve always said the customer is at the center of our innovation, but the truth is that if you're coding or creating technologies or solutions, you’re going to do so based on what you know, who you are and your life experiences. What we've intentionally tried to do with the Capacity Accelerator Network is expand the pool of tech talent in the development phase to ensure that the inputs represent the diversity of the communities, countries and regions we’re operating in.

To accomplish this, we are working to ensure that we are providing training and building diversity in data science through engagements with things like HBCU networks and Hispanic-serving institutions to make sure we are building with — not just for — those who these solutions are meant to support. This is a divergence from the norm where the data haves or technology haves typically gain first access and then the technology makes its way to the rest of the population and they have to engage with something that’s not necessarily built with them in mind. We want to ensure emerging technology, things like generative AI, are a part of the development from the outset. That inclusion is prioritized from the beginning.

When you think about the initiatives you’ve supported through data.org, what is one that you feel epitomizes the power of data for good?

Mikhailov: I'll cheat and go for two. So one example is really kind of high-end, global, and one example is much more local. So on the high-end scale, it’s Epiverse, a global collaborative developing a data analysis ecosystem that can help everyone get ahead of the next public health crisis. It’s about creating a set of open-source tools for public health analysts and data scientists, epidemiologists, that kind of community. Obviously it’s inspired by the pandemic; we need to act globally, and we need to be able to share insights quickly. It’s now in a half dozen countries, and over the next few years, it’s projected to scale to another 10 to 20 countries. Then I’d go for our Inclusive Growth and Recovery Challenge, where we supported nine great projects around the world, each of those in a very specific local setting, driven by local communities and having a lasting impact. For example, one project helps communities in the U.S. use data on brownfield sites in their cities to regenerate that land.

Singh: Let me build on that. The importance of the Inclusive Growth and Recovery Challenge was that we wanted to create the demand and the supply for data science for social impact, a relatively new concept. Our intention is to utilize financial backing in selected winners to help them scale and yield further investment, growing their mission and our mission at the same time. The way you know it's successful is if that prize yields even more investment. And we’ve been thrilled to see this come to fruition, where notably the challenge winners have actually generated something like $30 million in additional investment. When others become invested in the capacity building, that’s when you know change is happening.

Banner photo: In Mozambique, informal workers can register with the Biscate business platform, which connects them with clients. A data initiative partly funded through the data.org Inclusive Growth and Recovery Challenge uses this kind of market data to deliver insights to help build their businesses.

Vicki Hyman, director, communications, Mastercard

vicki.hyman@mastercard.com
