Employ ethical behavior to respect the rights of individuals who provide data, promote greater equity and well-being, and minimize the risk of harm.
Ethical behavior requires data users to evaluate data practices to determine whether they have the potential to contribute to greater equity, as opposed to reinforcing the status quo or even causing harm to communities already most marginalized, such as Black and Indigenous people. It requires data users to consistently challenge ideas, practices, or policies that fuel systemic racism. To combat systemic racism means to challenge the notion that differences between racial groups are simply inherent, rather than understanding that racial disparities are a product of longstanding oppressive systems and policies. Data users must question whether they are addressing the underlying structural factors that perpetuate inequity, respecting the dignity and autonomy of all individuals, and maximizing benefits while minimizing the risk of harm.
Although Institutional Review Boards (IRBs) determine whether ethics are upheld in research,xxxi in practice IRBs are not well equipped to perform deep reviews that center the concerns of marginalized groups to advance racial equity. For example, IRBs have allowed people of color to be systematically underrepresented in clinical trials, even when they are most affected by the health conditions being studied. In addition, many data projects occur in settings with little or no ethical oversight. Data users must carefully assess data projects’ potential risks and benefits to the well-being of individuals and society at large to avoid being extractive and exploitative. They must weigh risks and benefits holistically, with an eye toward the groups that might be differentially affected, to ensure that both risks and benefits are distributed fairly and that racial equity is promoted.
Data users should be attentive to uses of data that carry a high risk of causing harm, such as algorithms, or data-based decision tools, that may lead to discriminatory practices. Algorithms reflect the biases of the people who develop them and of the underlying data. If considering using an algorithm to inform decision making, data users must ensure transparency, assess algorithmic bias, and determine the potential positive and negative consequences of applying the algorithm in practice. Decisions based on a data algorithm should always be reviewed by humans, and affected individuals should have the ability to contest the decision. Data users should also be attentive to minimizing the amount of data collected on sensitive topics (for example, mental health) and rigorously protecting personally identifiable information.
At the outset of any data project, decision makers should identify and communicate who is funding the project and what their priorities are, the types of decisions the data will inform, the data project’s stated public benefit and equity goals, whether the data project meets the needs and addresses the concerns of the intended beneficiaries, and whether the data project could lead to unintended consequences or have racial equity implications (good or bad). They must engage the groups of people whom the data project might affect to make these determinations, be responsive to their feedback, and ensure transparency.
Community engagement is especially critical if the project could have serious or disproportionate impact on marginalized groups or those facing multiple barriers. Involving multiple partners, including proximate leaders from affected communities, in data governance, institutional review, and advisory structures, can help data users ensure the project is successful in promoting equity and well-being. Ideally, community members can co-create project goals and plans with proximate leaders to ensure the data are meaningful and actionable to them and counter existing power structures. These bodies should be convened early and offer continued input and oversight throughout the data life cycle.
xxxi Ethical principles of research are described in the Belmont Report, which guides the protection of human subjects in research (but does not have a racial equity lens).
The importance of transparency in ethical data use
Mount Saint Mary’s University, a small, private college in Maryland, made the news in 2016 after a plan to use student data to boost retention rates became public. New students would have to take a survey that the school would use to predict their likelihood of dropping out; students with a high probability would then be encouraged to unenroll before they were counted in the retention data that colleges report to the federal government. Mount Saint Mary’s did not disclose to students that their survey responses could be used to encourage them to leave (Ekowo & Palmer, 2016), a major ethical breach. In contrast, other colleges, such as Georgia State University and Temple University, have successfully used predictive analytics to improve graduation rates by involving students and staff in the process. Transparency is at the heart of using data ethically and equitably, allowing for greater oversight and accountability.
Applying this Principle
- Hold listening sessions with community members to learn what types of data projects the community thinks are relevant to improve their lives. Consider the impacts of structural racism on the priority community, and listen to the stories of community members to identify ways the work could be beneficial to them. Examine the results of past data projects, including past approaches to centering equity, to identify strengths and areas for improvement.
- Establish a governance or review body with representation from multiple contributing groups, including proximate leaders from affected communities. Convene this body to agree on the goals of the project, identify risks and benefits, develop mitigation strategies, and inform decisions at each phase of the data cycle. Consider formalizing a commitment to ethical data use by drafting a social impact statement that outlines how to put principles into practice.
- Minimize the collection of sensitive and personally identifiable information unless it is critical to achieving the project’s intended benefits. Eliminate the collection of any nonessential data to minimize burden on individuals. Individuals, especially those in marginalized communities, may perceive the collection of unnecessary personal information as over-surveillance and question whether the data collection has hidden purposes.
- As appropriate, securely share data with partners to reduce the burden of duplicate data collection (see Principle 2 for additional considerations on data privacy and access). Communicate policies on data storage, access, and use in lay terms.
- Clearly describe the methods and algorithms used to analyze the data, their potential for inaccuracy and bias, and how they will be used to inform decision making. Seek out and incorporate communities’ interpretation of the data.
- Return data and research results to community members in a form they can use. Create channels to report grievances. Publicly disseminate the results of the analysis and invite others to build on the research in an ethical manner that will produce continuous benefits to the community. Accurately identify the strengths and weaknesses of the data.
- Who would benefit from or be burdened by the data project? Are both benefits and burdens shared equitably?
- What are the potential risks of the project versus the risks of not proceeding with it?
- Could you modify the project to enhance positive impacts or reduce negative impacts?
- Are governance and oversight mechanisms in place? Do they include community representation?
- How will you know whether the intended benefits to the community were achieved?
Be On The Lookout
“Early warning” and other predictive indicators can be powerful tools to help E-W systems support students earlier and more effectively. However, they should not be used for increased monitoring or punitive action. Data users must be aware that biases in the inputs used to form predictions can perpetuate stereotypes and even lead to discriminatory treatment. For example, although past suspensions are predictive of high school graduation, they also reflect racial bias in school-based disciplinary actions. Thus, algorithms should never override the judgment of individuals. Balancing information from the algorithm with the judgment of practitioners, students and parents, and other qualitative or contextual data can help ensure equitable outcomes are achieved.
- Principles for Advancing Equitable Data Practice. This brief by the Urban Institute describes the Belmont Report’s ethical principles and offers examples of practices and resources to integrate the principles throughout the data life cycle with an equity lens.
- The Data Equity Framework. This framework from We All Count identifies key equity-impacting decision points in data projects and offers practical tools for developing and implementing ethical data projects that center equity.
- A Toolkit for Centering Racial Equity Throughout Data Integration. This toolkit by Actionable Intelligence for Social Policy includes chapters on “Racial Equity in Planning” and “Racial Equity in Algorithms/Statistical Tools,” which describe positive and problematic practices with ethical implications, along with brief case studies.
- Forum Guide to Data Ethics. This report by the National Forum on Education Statistics offers nine “canons” of data ethics in education, along with real-life examples and resources to implement these canons.
- Racial Equity Considerations and the Institutional Review Board. This Child Trends blog post describes why racial equity matters in IRB submissions and offers suggestions for applying an anti-racist lens when submitting to an IRB.
The framework's recommendations are based on syntheses of existing research. Please see the framework report for a list of works cited.