Master of Science in Applied Data Science

Gain new insights into big data decision-making

When organizations are awash with data, you can be their lifeline by channeling the data and converting it into actionable knowledge that adds value.

Medical results, institutional knowledge, consumer buying habits—where there’s data, there’s the potential for knowledge. Data science can optimize the delivery of healthcare or improve a company’s marketing strategy.

Learn to manage the data lifecycle and massive stores of data in the cloud when you earn our Master of Science in Applied Data Science.

A leader in data science

The first of its kind in the nation, the Indiana University School of Informatics and Computing at IUPUI is a pioneer in the fields of informatics and data science. Key ties to industry, research centers, and employers in Indianapolis and across the country distinguish our school and its programs. Career services can arrange an internship in Central Indiana and beyond.

Students learn data mining methods for transforming large datasets into usable knowledge, and they learn how to represent information visually. The master’s in data science provides students with core competencies in the latest methods of analysis, data management, infrastructure, and high-throughput data processing and storage.

Our curriculum includes instruction in client–server application development and the professional and ethical management of data science projects.

Internet of things and more careers in data science

Emerging technologies such as GPU-based deep learning and the Internet of things have fueled the demand for data science skills. Students who earn their master’s degree can find jobs in many sectors.

Plan of Study

The plan of study is 30 credit hours. It includes nine required courses on the following topics: informatics and mathematical foundations, data visualization, relational databases, statistics, statistical learning, cloud computing, web and database development, and project management or research design.

Substitute Courses

Substitute courses accommodate special interests and scheduling needs, such as the need for an online course. A substitute course satisfies the requirement for the core course it replaces.

A student wishing to substitute a course or take an elective course from our Data Science program in Bloomington must apply for graduate non-degree status one month before the start of the semester.

Students may test out of INFO I501 or INFO H611.

International students on an F-1 visa may take only one online course per semester.

Substitute course | Replaces
CSCI 54100 Database Systems (3 cr., prerequisite: CSCI 44300 Database Systems or equivalent) | LIS S511, INFO B512, or INFO B556
CSCI 55200 Data Visualization (3 cr.) | INFO H517
CSCI 57300 Data Mining (3 cr.), CSCI 57800 Statistical Machine Learning (3 cr.), INFO-B 518 Applied Statistical Methods for Biomedical Informatics (3 cr.), or INFO-I 526 Applied Machine Learning (3 cr., IUB) | INFO H515
CSCI 59000 Cloud Computing (3 cr.) | INFO H516
LIS S517 Web Programming (3 cr.) | NEWM N510
ECON E570 Fundamentals of Statistics and Econometrics (3 cr.), HPER T591 Introduction to Statistics in Public Health (3 cr.), PBHL B561 Introduction to Biostatistics I (3 cr.), or STAT 51100 Statistical Methods I (3 cr.) | PSY 60000
INFO I575 Informatics Research Design (3 cr.), LIS S506 Introduction to Research (3 cr.), STAT 514 Design of Experiments (3 cr.), or TECH 50700 Measurement and Evaluation in Industry and Technology (3 cr.) | INFO B505

Approved Electives

The following approved electives are available in the fall and spring semesters.

Thesis or Project in Applied Data Science

The thesis or project option is available to highly motivated students who are ready to carry out publishable research. By the end of the first semester, students must prepare a prospectus and gain a commitment from a primary faculty advisor with research interests in data science. By the end of the second semester, students must complete a course on research design and methods (e.g., INFO-I 575, LIS-S 506, or STAT 514).

The thesis or project must be completed in two semesters or in a semester and a summer. Thesis students register for a total of 6 credits, and project students register for a total of 3–6 credits, of INFO-H 695 Thesis/Project in Applied Data Science. In addition to the thesis or project itself, students are required to prepare and defend a research proposal with a timeline of deliverables. The length of the program is up to 33 credit hours for students who do not test out of the informatics and mathematical foundations requirements.

 

Learning Outcomes

Master of Science in Applied Data Science Core

 

Students will demonstrate competency in data analytics.
  1. Differentiate between research fields, theoretical concepts, epistemologies, and qualitative and quantitative methods.
  2. Analyze critically and speak publicly about field-specific scholarly research, projects executed in class, and data management issues.
  3. Design, implement, test, and debug extensible and modular programs involving control structures, variables, expressions, assignments, I/O, functions, parameter passing, data structures, regular expressions, and file handling.
  4. Apply software development methodologies to create efficient, well-structured applications that other programmers can easily understand.
  5. Analyze computational complexity in algorithm development.
  6. Investigate research questions and designs by loading, extracting, transforming, and analyzing data from various sources.
  7. Test hypotheses and evaluate reliability and validity.
  8. Implement histograms, classifiers, decision trees, sampling, linear regression, and projectiles in a scripting language.
  9. Decompose and simulate systems to process data using randomness.
  10. Employ supervised and unsupervised machine learning for function approximation and categorization.
  11. Display, interpret, and explore data using descriptive statistics and graphs.
  12. Explore assumptions about the data, including normality, skew, and kurtosis.
  13. Use random variables and probability distributions.
  14. Determine whether and how to perform statistical inference.
  15. Perform parametric (e.g., t-test, ANOVA, ANCOVA, MANOVA) and nonparametric (e.g., chi-square) hypothesis testing and correlation.
  16. Fit linear regression models and interpret their parameters.
  17. Design and execute ethical research using quantitative and experimental methods.
  18. Organize, visualize, and analyze large, complex datasets using descriptive statistics and graphs to make decisions.
  19. Apply inferential statistics, predictive analytics, and data mining to informatics-related fields.
  20. Analyze datasets with supervised learning methods for function approximation, classification, and forecasting and with unsupervised learning methods for dimensionality reduction and clustering.
  21. Identify, assess, and select appropriately among statistical learning methods and models for solving a particular real-world problem, weighing their advantages and disadvantages.
  22. Write programs to perform data analytics on large, complex datasets.
Students will demonstrate competency in data management, infrastructure, and the data science lifecycle.
  1. Design and implement relational databases using tables, keys, relationships, and SQL commands to meet user and operational needs.
  2. Diagram a relational database design with entity–relationship diagrams (ERDs) using crow’s foot notation to enforce referential integrity.
  3. Evaluate tables for compliance with third normal form and perform normalization procedures on noncompliant tables.
  4. Write triggers to handle events and create views to enforce business rules within a relational database.
  5. Formulate queries in relational algebra using selection, projection, restriction, Cartesian product, join, and set operators.
  6. Demonstrate an understanding of the data lifecycle, including data curation, stewardship, preservation, and security.
  7. Evaluate the social and ethical implications of data management.
Students will demonstrate competency in client–server application development.
  1. Design and implement client–server applications that solve real-world problems.
  2. Create well-formed static and dynamic webpages using current versions of PHP, HTML, CSS, and JavaScript or their equivalents.
  3. Implement the model-view-controller software pattern in web and mobile user interfaces.
  4. Apply client-side and server-side programming skills including design, coding, implementation, and integration with relational databases.
  5. Extract data from JavaScript Object Notation (JSON) and Extensible Markup Language (XML) documents.
  6. Transmit objects between the browser and server by converting them into JSON.
  7. Evaluate a given web application based on different criteria such as structure, dynamics, security, embedded systems, and interactivity.
  8. Diagram the phases of the secure software development lifecycle.
  9. Demonstrate the techniques of defensive programming and secure coding.
  10. Design user-friendly web and mobile interfaces.
Students will demonstrate competency in cloud computing and the management of massive, high-throughput data stores.
  1. Research the main concepts, models, technologies, and services of cloud computing, the reasons for the shift to this model, and its advantages and disadvantages.
  2. Examine the technical capabilities and commercial benefits of hardware virtualization.
  3. Analyze tradeoffs for data centers in performance, efficiency, cost, scalability, and flexibility.
  4. Evaluate the core challenges of cloud computing deployments, including public, private, and community clouds, with respect to privacy, security, and interoperability.
  5. Create cloud computing infrastructure models.
  6. Demonstrate and compare the use of cloud storage vendor offerings.
  7. Develop, install, and configure cloud-computing applications under software-as-a-service principles, employing cloud-computing frameworks and libraries.
  8. Apply the MapReduce programming model to data analytics in informatics-related domains.
  9. Enhance MapReduce performance by redesigning the system architecture (e.g., provisioning and cluster configurations).
  10. Overcome difficulties in managing very large datasets, both structured and unstructured, using nonrelational data storage and retrieval (NoSQL), parallel algorithms, and cloud computing.
  11. Apply the MapReduce programming model to data-driven discovery and scalable data processing for scientific applications.
Students will demonstrate competency in data visualization.
  1. Assess the purpose, benefits, and limitations of visualization as a human-centered data analysis methodology.
  2. Conceptualize and design effective visualizations for a variety of data types and analytical tasks.
  3. Implement interactive visualizations using modern web-based frameworks.
  4. Critically evaluate visualizations using perceptual principles and established design guidelines.
  5. Conduct independent research on a range of theoretical and applied topics in visualization and visual analytics.
Students will demonstrate competency in the ethical and professional management of informatics projects.
  1. Apply project management methods to overcome the complexities of informatics projects.
  2. Plan informatics projects, setting their scope and assigning team members appropriately to roles.
  3. Apply time management concepts, such as network diagrams, the critical path method (CPM), and the program evaluation and review technique (PERT), to informatics projects.
  4. Apply cost management and budgeting principles.
  5. Manage unanticipated changes in informatics projects.
  6. Perform risk analysis by means of quantitative and qualitative methods.
  7. Employ both “hard” and “soft” skills in leading a project team.
  8. Use project management software effectively.
  9. Apply communication, negotiation, and group decision-making abilities in team projects.
  10. Demonstrate ethical and professional behavior in response to ethically challenging situations.