Imagine accessing a single, easy-to-search hub of vast amounts of data on life-affecting issues – such as water quality, disease prevention, land-use practices or the opioid crisis – complete with analytical tools for easy visualization of the data.
See yourself analyzing the social and economic impact of regulatory and legislative actions from around the nation, without having to search hundreds of sources to find the data. Envision yourself as a better-educated citizen, with the data you need to engage in democracy.
This type of data hub is the goal of a multidisciplinary, multi-university project housed at UNC Charlotte and led by political scientist Jason Windett. The project, called “Building the Federal Data and Advanced Statistics Hub,” or F-DASH for short, is non-partisan and geared toward a variety of people.
“We realize that there are so many different users that are going to access this tool,” Windett said. “We want to make this a tool so that people can become more informed, not just on what is going on in their state, but with what’s going on in other states as well.”
People want their government to demonstrate transparency, openness and availability. Yet, often, citizens do not have easy access to the big picture.
“If you are a citizen and you are interested in learning about government, there are not many websites that you can go to and get a full breadth of what’s going on,” Windett said. “If you go to a state legislative webpage, you would have to have a law degree to understand what’s going on, or you would have to be really engrossed in policy.”
The research team received $1 million in funding from the National Science Foundation’s Convergence Accelerator program to develop its prototype.
The team includes Samira Shaikh, Stephanie Moller, Gordon Hull, Matt Parker and Rick Hudson from UNC Charlotte. Partner organizations include Open States, the Society for Public Health Education, and the State Politics and Policy Section of the American Political Science Association, as well as North Carolina A&T State University, Kansas State University, the University of Notre Dame, the University of Rochester, and the University of Virginia.
The project team was intentionally constructed to be broad in skills, knowledge and resources, drawing from fields including applied statistics, computer science, geography, philosophy, political science, public policy, and sociology.
Hull, an expert in data ethics, is essential to developing standards for ethical uses of the data and for storing and accessing it. He and others on the team also interviewed potential users, such as students, policy directors for national organizations, legislators, and faculty from a range of disciplines, including fields that study constituencies such as women and people of color, to bring diverse voices and ideas into the project.
Cognitive scientist Shaikh brings two distinct areas of expertise to the project: natural language processing and machine learning.
“This project necessarily involves a trove of data in the form of unstructured, natural language data, be it court proceedings or legislative bills,” Shaikh said. “My role in the project is to help apply the state-of-the-art language processing techniques to piece this complex, heterogeneous data together into a coherent and comprehensive whole. Additionally, my work involves applying machine learning techniques to create models that can predict outcomes from the data.”
Words and Image: Lynn Roberson, CLAS Communications Director
Image: Rick Hudson (from left), Samira Shaikh, Jason Windett, Gordon Hull and Stephanie Moller are members of the interdisciplinary team that is developing a data hub.