Master of Science in Applied Data Science

Gain new insights into big data decision-making

When organizations are awash with data, they need a lifeline. You can be the one who channels the data and converts it into actionable knowledge that adds value.

Medical statistics, institutional knowledge, consumer buying habits: where there’s data, there’s the potential for knowledge. Data science can optimize the delivery of health care or improve a company’s marketing strategy.

Learn to manage massive cloud-based data stores and the data life cycle when you earn our Master of Science in Applied Data Science.

A leader in the field of informatics and data science

The first of its kind in the nation, the IU School of Informatics and Computing at IUPUI is a pioneer in the field of informatics and data science. Key ties to industry, employment, and research distinguish our school and its programs.

Students learn methods of data mining, ways to transform large datasets into usable knowledge, and how to represent information visually. The master’s in Applied Data Science provides students with core competencies in the latest methods of data management, analysis, infrastructure, and high-throughput data storage.

Our curriculum includes instruction in client–server application development and in the ethical and professional management of informatics projects.

Internet of Things and more careers in Data Science

Emerging technologies such as GPU-based deep learning and the Internet of Things have fueled the demand for data science skills. Students who earn their master’s degree can find jobs in many sectors.

Plan of Study

The plan of study comprises 30 credit hours. It includes eight required courses on the following topics: informatics, data visualization, relational databases, statistics, web and database development, project management or research design, statistical learning, and cloud computing. In addition, there are six credit hours of approved electives.

Substitute Courses

Substitute courses accommodate special interests and scheduling needs, such as the need for an online course. A substitute course satisfies the requirement of the core course it replaces.

A student wishing to substitute a course or take an elective course from our Data Science program in Bloomington must apply for graduate non-degree status one month before the start of the semester.

International students on an F-1 visa may take only one online course per semester.

| Course | Replaces |
| --- | --- |
| CSCI 54100 Database Systems (3 cr., prerequisite: CSCI 44300 Database Systems or equivalent) | LIS S511, INFO B512, or INFO B556 |
| CSCI 55200 Data Visualization (3 cr.) | INFO H517 |
| CSCI 57300 Data Mining (3 cr., prerequisites: CSCI 24000 Computing II and STAT 30100 Elementary Statistical Methods I or STAT 35000 and MATH 35100 Elementary Linear Algebra and MATH 51100 Linear Algebra with Applications) | INFO H515 |
| CSCI 59000 Cloud Computing (3 cr.) | INFO H516 |
| LIS S517 Web Programming (3 cr.) | NEWM N510 |
| ECON E570 Fundamentals of Statistics and Econometrics (3 cr.), HPER T591 Introduction to Statistics in Public Health (3 cr.), PBHL B561 Introduction to Biostatistics I (3 cr.), or STAT 51100 Statistical Methods I (3 cr.) | PSY 60000 |
| INFO I575 Informatics Research Design (3 cr.), LIS S506 Introduction to Research (3 cr.), or STAT 514 Design of Experiments (3 cr.) | INFO B505 |


The following approved electives are available in the fall and spring semesters.

Thesis or Project in Applied Data Science

The Thesis/Project is available to highly motivated students ready to carry out publishable research. Students must prepare a prospectus and gain a commitment from a primary faculty advisor with research interests in data science by the end of the first semester. By the end of the second semester, students must complete a course on research design and methods (e.g., INFO-I 575, LIS-S 506, or STAT 514).

The thesis or project must be completed within two semesters or within a semester and summer. Students register for a total of six credits of Thesis/Project. They are required to prepare and defend a research proposal with a timeline of deliverables in addition to the thesis or project.

Learning Outcomes

Master of Science in Applied Data Science Core

Students will demonstrate competency in data analytics.

  1. Differentiate between research fields, theoretical concepts, epistemologies, and qualitative and quantitative methods.
  2. Analyze critically and speak publicly about field-specific scholarly research, projects executed in class, and data management issues.
  3. Design, implement, test, and debug extensible and modular programs involving control structures, variables, expressions, assignments, I/O, functions, parameter passing, data structures, regular expressions, and file handling.
  4. Apply software development methodologies to create efficient, well-structured applications that other programmers can easily understand.
  5. Analyze computational complexity in algorithm development.
  6. Investigate research questions and designs by loading, extracting, transforming, and analyzing data from various sources.
  7. Test hypotheses and evaluate reliability and validity.
  8. Implement histograms, classifiers, decision trees, sampling, linear regression, and projectiles in a scripting language.
  9. Decompose and simulate systems to process data using randomness.
  10. Employ supervised and unsupervised machine learning for functional approximation and categorization.
  11. Display, interpret, and explore data using descriptive statistics and graphs.
  12. Explore assumptions about the data, including normality, skew, and kurtosis.
  13. Use random variables and probability distributions.
  14. Determine whether and how to perform statistical inference.
  15. Perform parametric (e.g., t-test, ANOVA, ANCOVA, MANOVA) and nonparametric (e.g., chi-square) hypothesis testing and correlation.
  16. Fit linear regression models and interpret their parameters.
  17. Design and execute ethical research using quantitative and experimental methods.
  18. Organize, visualize, and analyze large, complex datasets using descriptive statistics and graphs to make decisions.
  19. Apply inferential statistics, predictive analytics, and data mining to informatics-related fields.
  20. Analyze datasets with supervised learning methods for functional approximation, classification, and forecasting and unsupervised learning methods for dimensionality reduction and clustering.
  21. Identify, assess, and select appropriately among statistical learning methods and models for solving a particular real-world problem, weighing their advantages and disadvantages.
  22. Write programs to perform data analytics on large, complex datasets.

Students will demonstrate competency in data management, infrastructure, and the data science lifecycle.

  1. Design and implement relational databases using tables, keys, relationships, and SQL commands to meet user and operational needs.
  2. Diagram a relational database design with entity–relationship diagrams (ERDs) using crow’s foot notation to enforce referential integrity.
  3. Evaluate tables for compliance with third normal form and perform normalization procedures on noncompliant tables.
  4. Write triggers to handle events and create views to enforce business rules within a relational database.
  5. Demonstrate an understanding of the data lifecycle, including data curation, stewardship, preservation, and security.
  6. Evaluate the social and ethical implications of data management.

Students will demonstrate competency in client–server application development.

  1. Design and implement client–server applications that solve real-world problems.
  2. Create well-formed static and dynamic webpages using current versions of PHP, HTML, CSS, and JavaScript or their equivalents.
  3. Implement the model-view-controller software pattern in web and mobile user interfaces.
  4. Apply client-side and server-side programming skills including design, coding, implementation, and integration with relational databases.
  5. Extract data from JavaScript Object Notation (JSON) and Extensible Markup Language (XML) documents.
  6. Transmit objects between the browser and server by converting them into JSON.
  7. Evaluate a given web application based on different criteria such as structure, dynamics, security, embedded systems, and interactivity.
  8. Diagram the phases of the secure software development lifecycle.
  9. Demonstrate the techniques of defensive programming and secure coding.
  10. Design user-friendly web and mobile interfaces.

Students will demonstrate competency in cloud computing and the management of massive, high-throughput data stores.

  1. Research the main concepts, models, technologies, and services of cloud computing, the reasons for the shift to this model, and its advantages and disadvantages.
  2. Examine the technical capabilities and commercial benefits of hardware virtualization.
  3. Analyze tradeoffs for data centers in performance, efficiency, cost, scalability, and flexibility.
  4. Evaluate the core challenges of cloud computing deployments, including public, private, and community clouds, with respect to privacy, security, and interoperability.
  5. Create cloud computing infrastructure models.
  6. Demonstrate and compare the use of cloud storage vendor offerings.
  7. Develop, install, and configure cloud-computing applications under software-as-a-service principles, employing cloud-computing frameworks and libraries.
  8. Apply the MapReduce programming model to data analytics in informatics-related domains.
  9. Enhance MapReduce performance by redesigning the system architecture (e.g., provisioning and cluster configurations).
  10. Overcome difficulties in managing very large datasets, both structured and unstructured, using nonrelational data storage and retrieval (NoSQL), parallel algorithms, and cloud computing.
  11. Apply the MapReduce programming model to data-driven discovery and scalable data processing for scientific applications.

Students will demonstrate competency in data visualization.

  1. Assess the purpose, benefits, and limitations of visualization as a human-centered data analysis methodology.
  2. Conceptualize and design effective visualizations for a variety of data types and analytical tasks.
  3. Implement interactive visualizations using modern web-based frameworks.
  4. Critically evaluate visualizations using perceptual principles and established design guidelines.
  5. Conduct independent research on a range of theoretical and applied topics in visualization and visual analytics.

Students will demonstrate competency in the ethical and professional management of informatics projects.

  1. Apply project management methods to overcome the complexities of informatics projects.
  2. Plan informatics projects, setting their scope and assigning team members appropriately to roles.
  3. Apply time management concepts, such as network diagrams, the critical path method (CPM), and the program evaluation and review technique (PERT), to informatics projects.
  4. Apply cost management and budgeting principles.
  5. Manage unanticipated changes in informatics projects.
  6. Perform risk analysis by means of quantitative and qualitative methods.
  7. Employ both “hard” and “soft” skills in leading a project team.
  8. Use project management software effectively.
  9. Apply communication, negotiation, and group decision-making abilities in team projects.
  10. Demonstrate ethical and professional behavior in response to ethically challenging situations.