Big data is helping researchers unlock the secrets of how Earth works. DCO's Data Science Team is providing the keys to these secrets by working in collaboration with scientists to uncover the interactions, synergies, and dependencies of the total planetary carbon cycle.
DCO data science combines informatics, data management, library science, network science, computer science, and domain science. This all-encompassing approach enables the analysis of all aspects of “data” generated or acquired by DCO researchers. By combining state-of-the-art analytics software, cyberinfrastructure, and information technologies, the Data Science Team is helping to amplify the work of DCO researchers and, consequently, advancing understanding of the quantities, movements, forms, and origins of carbon inside Earth.
Featured Projects

Carbon Mineral Evolution is a revolutionary data mining project bringing novel network analysis to the field of mineralogy.

This ambitious project will result in the first integrated thermodynamic model of the magma-fluid system, making it possible to predict how carbon moves between solid, liquid, and fluid phases in response to temperature and pressure inside Earth.



Other Projects & Initiatives
ENKI
ENabling Knowledge Integration (ENKI) is a collaborative, web-based model-configuration and testing portal that provides tools in computational thermodynamics and fluid dynamics. Data Science Team Leader Peter Fox, in collaboration with Mark Ghiorso (OFM Research), is advancing the work of this project, which launched in fall 2016 with support from the National Science Foundation. To learn more about ENKI, watch this webinar.
Publishing and Mining Data
Since 2013, the Deep Energy Community has been working on the global characterization of noble gas isotopes. Much progress has been made thanks to Igor Tolstikhin (Kola Science Center of the Russian Academy of Sciences, Russia) and colleagues, who have been compiling datasets suitable for global analyses. The datasets were first published in 2013, with a second version published in 2015.
Data Rescue
Ever wonder where that heat capacity input data for your thermodynamic modeling calculation came from? Mark Ghiorso (OFM Research, USA) of the Extreme Physics and Chemistry Community and the DCO Data Science Team did. That curiosity prompted Ghiorso to work with the team to launch a systematic effort to rescue a significant amount of thermodynamic data from tables and figures in the published literature. These data were published via the DCO Data Portal and are available for community access. They will also soon be available in Jupyter Notebooks.
DCO Knowledge Graph
The Data Science Team at Rensselaer Polytechnic Institute has laid the groundwork for a research-focused discovery tool that enables users to visualize the connections between objects across the DCO Science Network. Information on people, departments, institutions, datasets, grants, research, and publications can be browsed, searched, and visualized via the DCO Data Portal.
Computer Cluster
High-end computational services are readily available to DCO collaborators. DCO has its own computation center with a dedicated cluster, which enables it to organize and prioritize computational runs for DCO needs without the inconveniences of using existing services. From chemical and physical modeling to genomic analyses, the DCO Computer Cluster can run numerous software packages and scientific programs for theoretical calculations of C-bearing phase structures and properties, geodynamics calculations, thermochemical modeling, and other computations. To request time on the cluster, visit here.
Jupyter Notebooks
Jupyter Notebook is powerful, open-source software that lets users do data science in a single location. Within a typical notebook, a user can import a data set, do statistical modeling, enter code, enter text, and perform any number of other numerical functions, in a variety of languages. The Data Science Team has created a Jupyter notebooks hub specifically for DCO network members. Watch this webinar to learn how you can use Jupyter notebooks to manipulate and visualize your data.
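As a rough illustration of the "import a data set, do statistical modeling" workflow described above, the following is a minimal sketch of what a single Python notebook cell might contain. It uses only the Python standard library. The column names, the sample values, and the helper functions `load_rows` and `fit_line` are hypothetical, invented for this example; they are not DCO data or part of any DCO tool.

```python
import csv
import io
import statistics

# Hypothetical sample data, invented for illustration only (not DCO data).
SAMPLE_CSV = """sample_id,temperature_c,carbon_wt_pct
A1,100,1.0
A2,200,2.0
A3,300,3.0
A4,400,4.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "sample_id": row["sample_id"],
            "temperature_c": float(row["temperature_c"]),
            "carbon_wt_pct": float(row["carbon_wt_pct"]),
        }
        for row in reader
    ]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Step 1: import the data set.
rows = load_rows(SAMPLE_CSV)
temps = [r["temperature_c"] for r in rows]
carbon = [r["carbon_wt_pct"] for r in rows]

# Step 2: simple statistical modeling (summary statistic and a linear fit).
slope, intercept = fit_line(temps, carbon)
print(f"mean carbon: {statistics.mean(carbon):.2f} wt%")
print(f"fit: carbon = {slope:.4f} * T + {intercept:.4f}")
```

In a real notebook, the inline CSV string would typically be replaced by a file or a portal download, and the explanatory text would sit in separate text cells next to the code.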
VIVO
The Data Science Team has contributed to the progress of Cornell University's VIVO project, which serves as the skeleton for the Deep Carbon Observatory Data Portal. Several customizations were developed in conjunction with the work, including a custom SPARQL module for Drupal, Shibboleth integration, and significant work on the VIVO application itself. Visit Tetherless World's GitHub page.
Team
Peter Fox, Rensselaer Polytechnic Institute, USA
Dr. Peter Fox is a professor and Tetherless World Research Constellation chair at Rensselaer Polytechnic Institute in Troy, New York. Fox's research and education agenda covers the fields of data science and analytics, ocean and environmental informatics, materials informatics, computational logic, semantic Web, cognitive bias, semantic data frameworks, and solar and solar-terrestrial physics. Fox works to ensure that his research is applied to large-scale distributed networks and data science collaborations. Fox leads DCO’s Data Science Team and is a member of Synthesis Group 2019.
Karyn L. Rogers, Rensselaer Polytechnic Institute, USA
Dr. Karyn Rogers is an assistant professor in the Departments of Earth and Environmental Sciences and Biological Sciences at Rensselaer Polytechnic Institute. She is Co-Director of DCO’s PRIME (Piezophile Research Instrumentation for Microbial Exploration) Facility and Associate Director of the New York Center for Astrobiology. Rogers’ research focuses on the relationships between microbial communities and environmental conditions in extreme ecosystems, and is broadly applied to understanding the nature of the origin of life on Earth, the potential for life throughout the solar system, and the extent of life in modern extreme environments. She is a member of the Deep Life Community.
Kathleen Fontaine, Rensselaer Polytechnic Institute, USA
Dr. Kathleen Fontaine is an adjunct professor of data policy at Rensselaer Polytechnic Institute (RPI) and co-lead of the Data Science Team there. Her research focuses on the integration of data into public policy. Before coming to DCO, she was Managing Director of Research Data Alliance/US. She remains an active participant in RDA. Prior to joining RPI, Fontaine worked at the National Aeronautics and Space Administration (NASA) Global Change Data Center, at Goddard Space Flight Center, where she managed the Earth Science Data Systems Working Groups. She also worked as a policy analyst for NASA for a decade, participating in two international, interagency organizations: the Committee on Earth Observation Satellites Working Group on Information Systems and Services and the Group on Earth Observations (GEO). Fontaine still works with GEO and continues to be actively involved in the international data policy arena.
Ahmed Eleish, Rensselaer Polytechnic Institute, USA
Ahmed Eleish is a visiting researcher at the Carnegie Institution for Science and a graduate student in the School of Science at Rensselaer Polytechnic Institute, working with the Tetherless World Constellation. Ahmed oversees day-to-day operations, troubleshooting, service administration, and website maintenance for the DCO website. He received his Bachelor's degree in Computer Science from Helwan University in Egypt, then went on to work at Oracle Corporation as a consultant. He is presently pursuing his PhD in Multidisciplinary Science, exploring a data-driven approach to the traditional scientific method. His research interests include knowledge representation and discovery, computational linguistics, and artificial intelligence.
Further Reading
Publications
Pan D and Galli G (2016) The fate of carbon dioxide in water-rich fluids under extreme conditions. Science Advances 2:e1601278
Patankar S, Gautam S, Rother G, Podlesnyak A, Ehlers G, Liu T, Cole DR, Tomasko DL (2016) Role of Confinement on Adsorption and Dynamics of Ethane and an Ethane–CO2 Mixture in Mesoporous CPG Silica. The Journal of Physical Chemistry C 120(9):4843-4853
Gautam S, Tingting L, Patankar S, Tomasko D, Cole D (2016) Location dependent orientational structure and dynamics of ethane in ZSM5. Chemical Physics Letters 648:130-136
Boulard E, Pan D, Galli G, Liu Z, Mao W (2015) Tetrahedrally coordinated carbonates in Earth’s lower mantle. Nature Communications 6(6311)
Gautam S, Cole DR (2015) Molecular dynamics simulation study of meso-confined propane in TiO2. Chemical Physics 458:68-76
Pan D, Wan Q, Galli G (2014) The refractive index and electronic gap of water and ice increase with increasing pressure. Nature Communications 5(3919)
Presentations
Gautam S, Liu T, Rother G, Jalarvo N, Mamontov E, Welch S, Cole D (2014) Effect of temperature and pressure on the dynamics of nanoconfined propane. AIP Conference Proceedings 1591:1353-1355
News Articles
27 February 2018 DCO Data Science Team Takes Steps to Improve Data Curation and Research Reproducibility
20 December 2017 To Jupyter and Beyond: A Conversation with Peter Fox
1 August 2017 Big Data Points Humanity to New Minerals, New Deposits
30 May 2017 Building A Lasting Virtual Network for Deep Carbon Science
25 January 2017 DCO Scientific Collections Browser Now Available in the DCO Data Portal
26 October 2016 New Study of Carbon and Water at the Bottom of Earth’s Upper Mantle Challenges Previous Models
24 February 2016 Scientific Data Types for DCO Data
18 February 2015 Novel Carbon Bonding at High Pressure
27 May 2014 Water Surprises Again: Light Refraction and Absorption Under Pressure
26 November 2013 DCO Computer Cluster Comes Online