Hosted by Fang Huang, Rensselaer Polytechnic Institute, USA
It is often said that 80% of data analysis time is spent cleaning and preparing the data. Moreover, data cleaning is not a one-time job – it is an ever-present need while performing data analysis.
In this webinar, Fang focuses on data processing. He starts by introducing the rules that define a tidy dataset. With these rules in mind, he shows how to use relatively simple Python code to process and visualize geoscience data. The last part of the webinar highlights an ongoing project on methane experiments. The webinar should be of interest to any researcher working on data science-related projects.
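As a rough illustration of what "tidy" can mean in practice, the sketch below assumes the commonly cited tidy-data principles (each variable is a column, each observation is a row) and uses pandas to reshape a small wide table into that form. The sample names, oxide columns, and values are invented for illustration and are not taken from the webinar.

```python
import pandas as pd

# Hypothetical wide-format table: one row per sample, one column per oxide.
# All values are made up for illustration; None marks a missing measurement.
wide = pd.DataFrame(
    {
        "sample": ["S1", "S2"],
        "SiO2": [49.5, 72.1],
        "MgO": [8.2, None],
    }
)

# Reshape to long ("tidy") form: one row per (sample, oxide) measurement.
tidy = wide.melt(id_vars="sample", var_name="oxide", value_name="wt_percent")

# A simple cleaning step: drop missing measurements and sort the rows.
tidy = (
    tidy.dropna(subset=["wt_percent"])
    .sort_values(["sample", "oxide"])
    .reset_index(drop=True)
)

print(tidy)
```

In the long form, each measurement sits in its own row, which tends to make later filtering, grouping, and plotting steps simpler.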
- Watch previous DCO webinars for background:
  - Introduction to Jupyter Notebook (Feifei Pan)
  - Visual Tools for Big Data Network Analysis (Ahmed Eleish and Shaunna Morrison)
  - Data Science for Geosciences: Data Acquisition (Hao Zhong)
- DCO Jupyter Notebook login page (registration required)
- All slides in this talk were generated directly from a Jupyter notebook file using the RISE notebook extension
- The Beautiful Soup package supports scraping information from web pages
- Introduction to the Pandas plot function