
Perfectly Paired: Scientific Investigation and Data Analysis

Published on February 17, 2022 in Cornerstone Blog



Alexander Gonzalez, MS, MBA

Cornerstone is celebrating International Love Data Week, hosted by the Inter-University Consortium for Political and Social Research and observed this year from Feb. 14-18, by highlighting an overarching data resource available to Children's Hospital of Philadelphia Research Institute investigators: Arcus.

Arcus is a project-centered program driven by a team of digital archivists, educators, analysts, and programmers whose expertise supports researchers with the computing environments and research data management needed to bring their projects to life.

Cornerstone spoke with Alexander Gonzalez, MS, MBA, supervisor with the Department of Biomedical and Health Informatics Translational Research Informatics Group, a team of data integration analysts and data scientists specializing in data integration solutions designed to manage complex rare disease data in biomedical research. Gonzalez and his team have worked with the Inflammatory Bowel Disease Research Group, the Center for Injury Research and Prevention (CIRP), and the Division of Neurology to implement data integration solutions at scale.

For this Q & A, we asked him to provide a glimpse into a collaboration with CIRP colleagues to provide data support for an ongoing project.

Q: What request did CIRP bring to the Arcus team?

The Winston - Ohio Crash Predictors from Driving Simulation project was one of the first users of Arcus infrastructure and came about from a collaborative chat between Flaura Winston, MD, PhD, founder and scientific director of CIRP, and Jeff Pennington, MSCS, associate vice president and research informatics officer. Elizabeth Walshe, PhD, a research scientist, is studying driving habits of pediatric populations as part of the CIRP Neuroscience of Driving group. My team in the Translational Research Informatics Group (TRiG) is assisting Dr. Walshe with electronic honest brokering of the data, cohort definition, and deidentification.

Q: What is memorable about this collaboration?

What is most memorable about this collaboration is the novelty of the project. Studying driving habits and outcomes in this way, at this scale, is such a unique and rare thing. Through a collaboration with the State of Ohio, and Diagnostic Driving (a startup company spun out from the Research Institute), we received driver data and crash outcome data over a multiyear period.

These data not only comprise millions of records but also include data types and modalities our team was not used to working with. When we started, CIRP, TRiG, and Arcus came together to set up a pathway to securely store and move these data to facilitate analysis. A project like this shows that we can take on just about any new data thrown at us, from crash outcome data to licensing data, no matter how novel it is to us.


Arcus collaborates with the Center for Injury Research and Prevention to study the driving habits of pediatric populations.

Q: How did you accomplish your objectives?

Two major components of Arcus helped to further this project. First, Arcus has a very intuitive way of cataloguing datasets and providing metadata around these datasets. We used the project structure given to us by the Arcus Library Science team to expedite data review.

Second, we have our data stored within Arcus. Knowing that the data are stored and backed up securely is a big weight off my team's shoulders and frees up our time for other tasks. For example, we annotated a licensing database from the State of Ohio, unified crash outcomes over multiple years, and wrote a web application to provide a robust data dictionary for the data.

Q: What were the project's outcomes?

As part of this ongoing project, we continually receive new licensing data and crash outcomes data from outside collaborators like the State of Ohio. Through these data, CIRP can perform analyses on large amounts of data and publish the results academically. In addition to Dr. Walshe, CIRP team members Shukai Cheng, MS, information analyst; Natalie Oppenheimer, clinical research program manager; and Sarah O'Brien, clinical research assistant, work in conjunction with the TRiG team and Jeff Pennington from Arcus.

Do you have a research project that could benefit from a data-driven collaboration? Make the first move and visit Arcus to find out how your research and their resources can match up.