School of Informatics and Computing

Data Science Ph.D. Learning Outcomes

Students will demonstrate competency in research:

  • Critically evaluate the published scholarly record.
  • Critically apply the theories and methodologies of data science to new research in their primary area of study.
  • Apply appropriate principles, frameworks, and models to evaluate and interpret the frontiers of knowledge in their primary area of study.
  • Demonstrate expository and oral communication skills appropriate to a Ph.D., publishing and presenting work in their field.
  • Critique data practices for ethical issues, including discriminatory practices, power imbalances, and invasions of privacy.
  • Demonstrate advanced competency in data science tools and techniques, applied statistical analysis, and a domain relevant to their area of specialization.
  • Develop a record of relevant scholarship.
  • Demonstrate an ability to conduct independent, original research with a depth of knowledge in the chosen area of specialization.

Students will demonstrate competency in data analytics:

  • Design and execute ethical research using quantitative and experimental methods.
  • Organize, visualize, and analyze large, complex datasets using descriptive statistics and graphs to make decisions.
  • Apply inferential statistics, predictive analytics, and data mining to informatics-related fields.
  • Analyze datasets with supervised learning methods for function approximation, classification, and forecasting, and with unsupervised learning methods for dimensionality reduction and clustering.
  • Identify, assess, and select appropriately among data analytics methods and models for solving a particular real-world problem, weighing their advantages and disadvantages.
  • Write programs to perform data analytics on large, complex datasets.

Students will demonstrate competency in data management and infrastructure:

  • Design and implement relational databases using commercial database management systems according to database concepts and theory.
  • Diagram a relational database design based on an identified scenario.
  • Produce database queries using SQL.
  • Perform database administration tasks.
  • Describe the data management activities associated with the data lifecycle.
  • Overcome difficulties in managing very large datasets, both structured and unstructured, using nonrelational data storage and retrieval (NoSQL), parallel algorithms, and cloud computing.
  • Apply the MapReduce programming model to data-driven discovery and scalable data processing for scientific applications.
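
As an illustration of the MapReduce programming model named in the last outcome, the following is a minimal single-process sketch in Python. It is not part of the program's curriculum materials. The word-count task and the function names (map_phase, shuffle, reduce_phase, map_reduce) are assumptions chosen for the example, and a real framework would distribute each phase across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def map_reduce(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    counts = map_reduce(["big data big compute", "data pipelines"])
    print(counts)
```

Because the map and reduce functions each operate on one document or one key at a time and share no state, a framework can run many copies of them in parallel, which is what allows the model to scale to very large datasets.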