The U.S. Census Bureau Research Data Center: A Valuable New Resource for Penn State Demographers
On April 7–8, 2014, the Department of Economics and the Population Research Institute (PRI) held an opening conference, Social Science Research Using the RDC Network, to introduce Penn State’s new U.S. Census Bureau Research Data Center (RDC) and raise awareness of the RDC among Penn State researchers. With 40 out-of-town presenters and roughly 80 Penn State faculty and graduate students from across the university, the event highlighted the potential of the RDC to enhance research in economics, demography, statistics, sociology, and health services, and it introduced Penn State researchers to key U.S. Census Bureau stakeholders and other researchers in the RDC network.
Bringing the RDC to Penn State
The idea to open an RDC at Penn State came in the fall of 2011, when Jennifer Van Hook, Director of PRI, was visiting Cornell’s RDC and data facilities with other PRI staff and representatives from University Libraries. The initial plan was to set up a consortium with Cornell to use its facilities. However, on the 3.5-hour van ride back to State College, Dr. Van Hook and the other Penn State representatives got to thinking: Why not set up an RDC at Penn State?
Shortly thereafter, in the spring of 2012, Dr. Van Hook and Mark Roberts, Professor of Economics, surveyed Penn State faculty to determine whether there would be interest in an RDC. The response was an overwhelming yes. In fact, quite a few faculty members had already used RDC data successfully in past research and were excited about the prospect of having such a facility at Penn State.
In the fall of 2012, Drs. Van Hook and Roberts, in collaboration with Penn State faculty from Sociology (John Iceland), Health Policy and Administration (Marianne Hillemeier), and Rural Sociology (Leif Jensen), wrote a proposal to the National Science Foundation (NSF) to open the Penn State RDC. They also secured funding from the university (i.e., the President of the University and Vice President for Research, the College of the Liberal Arts, the College of Agricultural Sciences, the College of Health and Human Development, the Eberly College of Science, the Dean of the Library, the Social Science Research Institute, and PRI). In July of 2013, the NSF decided to fund the proposal. By the winter of 2014, construction on the Penn State RDC was finished, and Dr. Roberts was appointed the first director.
A Central Location and New Opportunities
The Penn State RDC is located in two rooms on the second floor of the Paterno Library—a central location that provides researchers from all over campus easy access to this important resource. The library has recently taken a second look at the social sciences, creating a research hub, providing statistical services, and ramping up data and geographic information system services to support Penn State social science research. According to Social Sciences Librarian Stephen Woods, the library decided to house the RDC to enhance interdisciplinary research at Penn State and to create “synergy that [the library] can tap into as [it] pursue[s] service for other restricted data collections as well as high-end data services within the research hub.”
The purpose of the RDC is to provide researchers with a secure connection to restricted data collected by the U.S. Census Bureau and the National Center for Health Statistics. Unlike public-use data, the RDC datasets have not been anonymized and thus retain valuable information and variables for analysis (e.g., county FIPS codes, geospatial data, linked data from administrative sources, and business data). These unique data are essential to moving research forward in demography, a very data-driven discipline in which innovation comes from new datasets. Not only will these data accelerate and expand research opportunities in demography, but they will also enable researchers to move beyond merely working with micro-data to linking macro-data to contextual factors.
This notion of the RDC opening new research doors was a common theme at the conference, with many of the guest speakers and even some Penn Staters recounting their use of restricted datasets for novel research, including demography research. For example, John Iceland, Head of the Department of Sociology, used restricted data from the 1990 and 2000 decennial censuses to explore how residentially segregated very specific groups are from each other at the neighborhood level.
Shannon Monnat, Assistant Professor of Rural Sociology and Demography, is hoping to use restricted data to explore access to healthcare among Hispanics in new versus established destinations and whether the characteristics of the Hispanics themselves or of the places they live matter in influencing healthcare access and use. The RDC data Dr. Monnat wants to utilize have both important variables related to this issue and geographic identifiers, thus enabling her to dig deeper and find connections that would be impossible to uncover with public data.
According to Pamela Short, Director of the Center for Health Care and Policy Research, restricted data available through the RDC will also be essential in understanding heterogeneity and variation in the implementation of the Affordable Care Act (ACA). Because the Supreme Court gave states the choice of whether to dramatically expand Medicaid or not and because states were able to create their own exchanges, the national ACA reform is actually more like a state-by-state reform. Thus, researchers who want to measure the impact of the ACA will have to know the geographic location of the individuals in their samples, which is often suppressed in publicly available datasets.
These are just a few of the many research areas discussed at the opening conference, but as Dr. Short put it, having an RDC at Penn State is “going to open up all kinds of new possibilities for new lines of research that other people could probably think of and pursue but would find it just too hard” to complete without having an RDC nearby.
In addition to creating opportunities for interesting new lines of study, the RDC and the research stemming from restricted data will also provide Penn State professors and students a competitive edge. In terms of funding, the limited availability of RDC data and the unique research it enables may make proposals more attractive to potential funders. Dr. Monnat, for instance, applied for a highly competitive grant from the Stanford Center on the Study of Poverty and Inequality while at her previous university and was turned down. She reapplied this year after coming to Penn State, this time proposing to use restricted data from the Survey of Income and Program Participation. According to Dr. Monnat, “I really feel that [including restricted data] had a big impact on me getting this grant. . . . Having the RDC here allowed me to not only demonstrate that I’m in a really good research environment that has a good research infrastructure but also allowed me to ask questions that the Stanford Center on the Study of Poverty and Inequality really cares about.” In addition, this type of research and the publications that result can set graduate students apart from their counterparts who do not have access to restricted data as they compete for jobs post-graduation.
Getting Started at the RDC
Researchers and students need to take several steps to obtain access to the RDC’s highly sensitive data. The first and most involved step in the process is writing a detailed proposal explaining which dataset and variables will be used in the research, why public-use data are insufficient, and what benefit the research will have for the Census Bureau. This last detail, the benefit to the Census Bureau, is particularly important because, as part of its commitment to privacy and data confidentiality, the Census Bureau will provide data access only to research projects that benefit its mission.
After the proposal has gone through the review process and been accepted, the researchers on the project will undergo a background check and be given special sworn status as Census researchers before they can access the RDC or use the data.
While this process is intensive, Penn Staters do not have to go it alone. “Writing and getting an application approved for using these data is an extensive, elaborate process,” explained Dr. Short. However, “as time goes on, there are going to be more and more people at PRI and Penn State in general who have successfully negotiated that process as well as staff who can help faculty, which will make the process less time consuming and increase the probability of success.”
There’s one thing all the scholars at the conference agreed on: Penn State is incredibly fortunate to have access to the RDC. In return for a small time investment up front, Penn State scholars can reap the benefits of having access to restricted data over their entire careers.