HRUMC 2016 Titles and Abstracts

HRUMC 2016 was held April 2nd at Saint Michael’s College, Colchester, Vermont.

Ms. Kelsey A. West, Mr. Christopher J. Romagna, Ms. Bailey J. O'Keeffe, Ms. Shauna Bulger, Mr. Matthew Monhart, Mr. Michael A. Theobald, Ms. Sydney A. Bell, Josiah Bartlett, Mr. Michael L. Lengieza, Ms. Alexandria J. Haehl, Mr. John H. Tank, III, Mr. Curtis J. Hurlbut, Ms. Julia H. Simoes, Mr. Colton F. Ransom

Title: A Sports Analytics Consulting Course: Part II
Abstract: In this talk, we describe student projects done as part of a course taught at St. Lawrence University during the Fall 2015 semester. As part of that course, thirteen students were statistical consultants for five St. Lawrence University sports teams: men's soccer, women's soccer, track, volleyball and women's hockey. Throughout the semester, the students collected and analyzed data for each sport. A summary of the work done for each team will be presented. Further, we present some of the student final projects that were part of this course and discuss the challenges and rewards of the course.

Scarlet Qi
Title: Shiny Bayes: Developing an App to Illustrate Bayesian Inference
Abstract: Bayesian inference is a statistical method for using data to update beliefs about the location of a parameter. We use the R shiny package to create an interactive web app that allows users to specify a prior distribution, input data, and observe the resulting posterior distribution. We discuss how this app can be used to demonstrate Bayesian inference for parameters such as a binomial proportion, Poisson mean, or normal mean.
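
As an illustration of the prior-to-posterior update described in this abstract, here is a minimal sketch for the binomial proportion case. It is not the presenter's app and it uses Python rather than R shiny; it assumes a conjugate Beta prior, and the function names are hypothetical.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Assumed example: uniform Beta(1, 1) prior, then 7 successes and 3 failures.
post_a, post_b = beta_binomial_update(1.0, 1.0, successes=7, failures=3)
posterior_mean = beta_mean(post_a, post_b)

print(f"Posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
# Posterior: Beta(8.0, 4.0), mean = 0.667
```

The posterior mean (about 0.667) falls between the prior mean (0.5) and the observed proportion (0.7), which is the kind of behavior an interactive demonstration can make visible.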

Michael Schuckers
Title: Sports Analytics Consulting Course: Part I
Abstract: In this talk, I describe a course taught at St. Lawrence University during the Fall 2015 semester. As part of that course thirteen students were statistical consultants for five St. Lawrence University sports teams: men's soccer, women's soccer, track, volleyball and women's hockey over the course of the semester. This talk will discuss the motivation for the course, the structure of the course, the outcomes of the course and lessons learned from running the course from a faculty perspective.

Elizabeth Escobar
Title: Methods and Applications of Quantile Regression Models
Abstract: Quantile regression is a type of regression that models the relationship between predictor variables and chosen quantiles of the response variable. In contrast to standard linear regression, which models changes in the average response, quantile regression examines changes in the quantiles of the response variable based on changes in the predictors, giving us more solid estimates based on these variations. Ecologists often use quantile regression to measure causal relationships, as it is a very effective method of predicting how different quantiles of organism responses, such as population size, are affected by various environmental factors. It is also an effective tool for detecting abnormal growth patterns in growth charts. Quantile regression allows us to distinguish the complex manner in which predictors affect the response variable. This presentation will introduce the concept of quantile regression and discuss how some of the common standard regression ideas, such as model selection and resampling-based inference, can be incorporated into quantile regression.
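
To make the idea of modeling a quantile rather than the mean concrete, the sketch below uses the standard check ("pinball") loss that quantile regression minimizes. It covers only the simplest case, a model with no predictors, where the minimizer is a sample quantile. The data and function names are assumptions for illustration and are not taken from the talk.

```python
def pinball_loss(residual, tau):
    """Check loss: positive residuals cost tau per unit, negative ones cost (1 - tau)."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(data, q, tau):
    """Total check loss when every observation is predicted by the constant q."""
    return sum(pinball_loss(y - q, tau) for y in data)


def best_constant(data, tau):
    """The loss is piecewise linear, so some observed value always minimizes it."""
    return min(sorted(data), key=lambda q: total_loss(data, q, tau))


# Assumed toy data with one extreme value.
data = [1, 2, 3, 4, 100]
median_fit = best_constant(data, tau=0.5)  # the sample median
upper_fit = best_constant(data, tau=0.9)   # an upper quantile

print(median_fit, upper_fit)
```

With predictors, the constant q is replaced by a linear function of the predictors and the same loss is minimized, typically by linear programming.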

Brooke McGraw
Title: Using Linear Discriminant Analysis to Predict Beer
Abstract: Linear discriminant analysis (LDA) is a classification technique commonly used for dimensionality reduction. LDA uses existing information to compute latent explanatory variables that maximize separation between multiple classes. Like other classification techniques (such as linear regression), LDA can be used for predictions and to determine important variables in the model. This talk will introduce LDA and apply it to the classic Fisher’s Iris dataset as well as to classifying beer styles based on home-brew recipes.
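
The following sketch shows the two-class, two-variable special case of the idea in this abstract: finding a direction that separates class means relative to within-class scatter (Fisher's linear discriminant). The data points, function names, and midpoint threshold are assumptions for illustration; the talk itself concerns the multi-class setting and different data.

```python
def mean_vec(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 within-class scatter matrix around the class mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Return w = Sw^{-1} (mean_b - mean_a) and a midpoint threshold on w . x."""
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    proj_a = w[0] * ma[0] + w[1] * ma[1]
    proj_b = w[0] * mb[0] + w[1] * mb[1]
    return w, 0.5 * (proj_a + proj_b)


def classify(point, w, threshold):
    """Assign a point to class 'b' if its projection exceeds the threshold."""
    return "b" if w[0] * point[0] + w[1] * point[1] > threshold else "a"


# Assumed toy data: two well-separated groups.
class_a = [(1, 2), (2, 3), (3, 3), (2, 1)]
class_b = [(6, 5), (7, 8), (8, 6), (7, 7)]
w, threshold = fisher_direction(class_a, class_b)

print(classify((2, 2), w, threshold), classify((7, 6), w, threshold))
```

Projecting each observation onto w reduces two variables to one score, which is the dimensionality-reduction use the abstract mentions.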

Curtis Hurlbut
Title: Exploring Robust Alternatives to Least Squares Regression
Abstract: Robust regression is a form of regression analysis designed not to be overly affected by violations of assumptions. Robust regression models are not as sensitive to outliers as ordinary least squares estimates are. When outliers are present, least squares estimation is inefficient and can be biased, in which case robust regression is a viable alternative. This presentation will introduce the method of robust regression and examine how model selection and resampling-based inference are done in the robust case.
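
The sensitivity of least squares to outliers can be seen in a small example. The sketch below compares the ordinary least squares slope with the Theil-Sen slope (the median of all pairwise slopes), which is one well-known robust estimator. The abstract does not say which robust methods the talk covers, so the choice of Theil-Sen, the data, and the function names are assumptions.

```python
from itertools import combinations
from statistics import median


def ols_slope(xs, ys):
    """Ordinary least squares slope for simple linear regression."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def theil_sen_slope(xs, ys):
    """Median of the slopes through every pair of points with distinct x values."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    return median(slopes)


# Assumed toy data: y = 2x exactly, except the last point is a gross outlier.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 2, 4, 6, 8, 50]

ols = ols_slope(xs, ys)
robust = theil_sen_slope(xs, ys)
print(f"OLS slope: {ols:.2f}, Theil-Sen slope: {robust:.2f}")
# OLS slope: 7.71, Theil-Sen slope: 2.00
```

A single outlier pulls the least squares slope from 2 to about 7.7, while the median-based estimate stays at 2.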

Sirius Amerman
Title: Using Factor Analysis to Predict Bull and Bear Oil Markets
Abstract: Factor analysis is the process of collapsing a number of correlated variables into a single factor that represents a latent or indirectly measured variable. This statistical method is especially useful when analyzing data that is affected by a large number of variables. Applying this method to the oil industry, we can explore the joint relationships between different supply and demand factors to determine whether or not the price of crude oil (WTI) is in a bull or bear market.

Janelle Fredericks
Title: An Introduction to Gaussian Mixture Modeling for Model-Based Clustering
Abstract: There are many ways to understand how different datasets can be clustered together. First, we will take an in-depth look at Gaussian Mixture Model Clustering via the EM algorithm to see how points are determined to be in each individual cluster. Second, we will use simulated data with a predetermined number of clusters to examine how different classic clustering algorithms perform compared to Gaussian Mixture Modeling. Specifically, I will be comparing Gaussian Mixture Modeling (as implemented through the R package “mclust”), the k-means algorithm, and hierarchical clustering techniques to determine which methods are optimal in different clustering situations.
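
As a rough illustration of how the EM algorithm assigns points to clusters, here is a minimal one-dimensional, two-component sketch. It does not use or reproduce the mclust package named in the abstract; the data, starting values, and function names are assumptions chosen for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, iterations=50):
    """Alternate E-steps (responsibilities) and M-steps (parameter updates)."""
    means, variances, weights = list(means), list(variances), list(weights)
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing variance
    return means, variances, weights, resp


# Assumed toy data: one group near 0 and one near 5.
data = [-0.3, 0.0, 0.2, 0.4, -0.1, 4.8, 5.1, 5.3, 4.9, 5.2]
means, variances, weights, resp = em_gaussian_mixture(
    data, means=[1.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)

# Hard cluster labels: the component with the larger responsibility.
labels = [0 if r[0] > r[1] else 1 for r in resp]
print(labels)
```

Unlike k-means, each point receives a probability of belonging to every cluster; the hard labels are obtained only at the end by taking the most probable component.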

Bailey O’Keeffe
Title: Statistical Examination of Female Representation in Oscar Best Picture Nominated Films
Abstract: Due to criticism about the gender discrepancy in Oscar-nominated films, we analyzed the screen time of male and female leads in films nominated for Best Picture from 2006-2014 using R and Minitab. We discuss the differences found between lead screen times according to year and director gender, the relationship between how long a female lead is on screen and how likely it is that a film will win the Oscar, a logistic regression model that we built to predict Oscar winners, and the lack of racial diversity in these films.

Xuehang Pan
Title: Giving Real People Access to Big Data – Analyzing Bike Rentals in NYC
Abstract: We are now in an era of “Big Data” but are challenged to find ways for people to extract meaning from the data effectively. We discuss the process of data scraping, building a database, and giving a user tools to investigate the data. We build these tools using R and provide a user interface with Shiny apps. We illustrate these ideas using data from the Citi Bike service in New York City that covers 330 stations and millions of rentals in 2015.
