This webinar is part of the Power of Population Data Science Series.
Visualising the results of statistical modelling is a key component of the data science workflow. Statistical graphs are often the best means to explain and promote research findings. However, to find the one graph that tells a story worth sharing, we sometimes have to try out and sift through many data visualisations. How should we approach such a task? What can we do to make it easier from both production and evaluation perspectives?
This presentation will demonstrate a reproducible graphing system designed for the IPDLN-2018 hackathon. The system evaluates synthetic socioeconomic and mortality data with logistic regression. The data was prepared for the hackathon by Statistics Canada and represents the Canadian population.
Topics covered will include:
- Introduction to a visualisation technique that uses color to create meaningful expectations from the results of a logistic regression.
- Details related to the workflow of the project that implements this graphing system (github.com/andkov/ipdln-2018-hackathon)
- Building the case for preferring reproducible workflows with version control over computational notebooks (e.g. Jupyter, R Notebook).
Watch the recorded presentation below.
Please take a few minutes to complete our online survey. Your feedback will help shape future webinar series!
Speakers
Andriy Koval, Ph.D. is a data scientist with a background in quantitative methods and interests in data-driven models of human aging. He is a Health System Impact Fellow with the BC Observatory for Population and Public Health and an incoming tenure-track assistant professor at the University of Central Florida. Andriy’s work centers on developing tools for reproducible research, with R and GitHub as key components. Presently, his work focuses on developing statistical methods for analysing transactional data extracted from the electronic health records (EHR) of the Vancouver Island Health Authority. His current interests include the design of information displays with R, literate programming, statistical modelling in general, and probabilistic computing in particular. See more at https://github.com/andkov/bio