SoDA Workshop Series - Introduction to Data Science with R (Workshop 10: Reshaping Data in R -- Long and Wide Data)

Workshop 10: Reshaping Data in R (Long and Wide Data)
When Nov 16, 2017
from 12:00 PM to 01:00 PM
Where B001 Sparks -- The DataBasement
Contact Phone 814-267-2720
Attendees All interested members of the PSU community are welcome to attend.

In this workshop, we will get a first introduction to data wrangling with the R "tidyverse." We will discuss the "gather" and "spread" operations for transforming data between long and wide formats. (You might have seen similar operations in other contexts referred to as "unpivoting & pivoting," "folding & unfolding," or "melting & casting" of data.) The need to reshape is particularly common when working with panel data. Data in long format typically have one row for each observation of a particular individual at a particular time point. Data in wide format have one row for each individual, with separate columns recording each time-varying variable at each time point. Different analysis tasks often require that we be able to switch between these two representations, and this workshop shows you how to make the conversion in both directions.
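As a rough illustration of the two operations named above, here is a minimal sketch using the tidyr package. The data frame, its column names (id, income_2015, income_2016), and its values are made up for this example and are not taken from the workshop materials. Note also that in newer versions of tidyr, gather() and spread() still work but have been superseded by pivot_longer() and pivot_wider().

```r
library(tidyr)

# Hypothetical panel data in wide format: one row per individual,
# one income column per year (all values invented for illustration).
wide <- data.frame(
  id = c(1, 2),
  income_2015 = c(100, 200),
  income_2016 = c(110, 210),
  stringsAsFactors = FALSE
)

# Wide -> long: stack the two income columns into a "year" column
# (holding the old column names) and an "income" column (holding the values).
long <- gather(wide, key = "year", value = "income", income_2015, income_2016)

# Long -> wide: spread the values in "income" back out into one column
# per distinct value of "year".
wide_again <- spread(long, key = year, value = income)

print(long)
print(wide_again)
```

After the gather() step, each row of long is one individual observed in one year; after the spread() step, each row of wide_again is one individual again, with one column per year.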
 
General Information about the Workshop Series

Do you want to develop the skills to program and manage data using R? If so, this workshop series is for you! We will be meeting (almost) weekly for an hour throughout the semester to cover everything from basic R programming up through big data analytics and high performance computing. This workshop series will start with several weeks introducing R and basic R programming, so no prior experience is required (only a laptop). We will then move on to a series of workshops on reading in, cleaning, transforming, and combining multiple, complex datasets (including text and social network data) -- using our newfound R programming skills. Once we have the basics of data management down, we will cover web-based data collection, both from traditional web pages and from the Twitter API. Finally, we will get into performance and scalability issues, and go over the steps for accessing the ICS cluster resources at Penn State.

These workshops will be offered (most) Thursdays during the Fall 2017 semester from 12:00 PM to 1:00 PM in Sparks B001 (The DataBasement). Directions to the DataBasement are available here: http://bdss.psu.edu/pdf-folder/finding-the-sparks-databasement . Bring a laptop!

The instructor for the workshop is Matt Denny, who can be contacted at mdenny@psu.edu.

Materials (including slides, video tutorial, and pictorial tutorial) are available on the workshop website: https://github.com/matthewjdenny/SoDA-Workshop-Series-Introduction-to-Data-Science. 
