INFO-I 415 Introduction to Statistical Learning

3 credits

Prerequisites: PBHL-B 302 (or other approved statistics)

This course applies statistical learning methods for data mining and for inferential and predictive analytics to informatics-related fields. The course also covers techniques for exploring and visualizing data, assessing model accuracy, and weighing the merits of different methods for a given real-world application. It provides an essential toolset for transforming large, complex informatics datasets into actionable knowledge.

Learning Outcomes

  • Analyze datasets with the following supervised learning methods: for function approximation, multiple linear regression, splines, and local regression; for classification, logistic regression, linear discriminant analysis, decision trees, bagging, random forests, boosting, and support vector machines.
  • Analyze datasets with the following unsupervised learning methods: for dimensionality reduction, principal components analysis; for grouping, k-means clustering and hierarchical clustering.
  • Explore, transform, and visualize large, complex datasets with graphs in R.
  • Solve real-world problems by adapting and applying statistical learning methods to large, complex datasets.
  • Identify and select appropriately among statistical learning methods for a particular real-world problem: analyze each method with respect to a given dataset or research question in terms of modeling accuracy and the bias-variance trade-off; perform model assessment (i.e., estimate test error rates) and selection by resampling (cross-validation and bootstrapping); identify overfitting and underfitting; perform model selection and regularization by subset selection and shrinkage methods (ridge regression and the lasso); and explain the relative advantages and disadvantages of each method for the problem.
  • Write programs to perform data analytics on large, complex datasets in R.
  • Analyze data from case studies in informatics-related fields (e.g., digital media, human-computer interaction, health informatics, bioinformatics, and business intelligence).

Course Delivery

  • On-Campus
