Curriculum

Please see below for information about the Master of Biostatistics Program curriculum and courses.

Curriculum Overview

The Master of Biostatistics degree, a professional degree awarded by the Duke University School of Medicine, requires 36 credits of graded course work, a practicum experience, a qualifying examination, and a master’s project for which 6 units of credit are given. Completed in the second year, the master’s project serves to demonstrate the student’s mastery of biostatistics. Eleven courses (BIOSTAT 701, 702, 703, 704, 705, 706, 707, 721, 722, 801, 802) constitute 26 credits that are required for all degree candidates. The Master of Biostatistics Program curriculum is structured as follows:

Core Courses

Foundational courses required of all degree-seeking students. 

BIOSTAT 701. Introduction to Statistical Theory and Methods I
BIOSTAT 702. Applied Biostatistical Methods I
BIOSTAT 703. Introduction to the Practice of Biostatistics I
BIOSTAT 704. Introduction to Statistical Theory and Methods II
BIOSTAT 705. Applied Biostatistical Methods II
BIOSTAT 706. Introduction to the Practice of Biostatistics II
BIOSTAT 707. Statistical Methods for Learning and Discovery
BIOSTAT 721. Introduction to Statistical Programming I 
BIOSTAT 722. Introduction to Statistical Programming II 
BIOSTAT 801. Biostatistics Career Preparation and Development I
BIOSTAT 802. Biostatistics Career Preparation and Development II
BIOSTAT 821. Software Tools for Data Science (for students in the BDS Track)
BIOSTAT 822. R for Data Science (for students in the BDS Track)

Practicum

All candidates for the Master of Biostatistics degree are required to complete a practicum. The practicum is an experiential learning opportunity whose main goal is to allow students to develop their analytic ability, biological knowledge, and communication skills. The practicum is typically completed during the summer after the first year, but can be completed during the second year.

Qualifying Examination

All candidates for the Master of Biostatistics degree are required to pass a written Qualifying Examination demonstrating their mastery of fundamental concepts acquired through completion of the first-year core courses (BIOSTAT 701 – 706 inclusive). Students are expected to take the Qualifying Examination after completing the first year of study in the program and prior to beginning their elective coursework.

Master's Project (two semesters, totalling six credits)

All candidates for the Master of Biostatistics degree are required to complete a Master’s Project. Completed in the second year, the two-semester Master’s Project serves to demonstrate the student's mastery of core statistical concepts and the practice of biostatistics.
BIOSTAT 720. Master's Project (3 credits per semester for a total of six credits)

Elective Courses (two credits each)

Full-time Master of Biostatistics students will select five two-credit elective courses during the second year of study:
BIOSTAT 708. Clinical Trial Design and Analysis 
BIOSTAT 709. Observational Studies 
BIOSTAT 710. Statistical Genetics and Genetic Epidemiology 
BIOSTAT 713. Survival Analysis 
BIOSTAT 714. Categorical Data Analysis 
BIOSTAT 718. Analysis of Correlated and Longitudinal Data 
BIOSTAT 719. Generalized Linear Models 
BIOSTAT 823. Statistical Programming for Big Data (for students in the BDS Track)
BIOSTAT 824. Case Studies in Biomedical Data Science (for students in the BDS Track)

Professional Development Courses (0.5 credits each)

BIOSTAT 801. Biostatistics Career Preparation and Development I
BIOSTAT 802. Biostatistics Career Preparation and Development II

Course Planning

During the first year of study, full-time Master of Biostatistics students will typically take ten core courses:

Fall Semester (11.5 credits)

BIOSTAT 701. Introduction to Statistical Theory and Methods I 
BIOSTAT 702. Applied Biostatistical Methods I 
BIOSTAT 703. Introduction to the Practice of Biostatistics I 
BIOSTAT 721. Introduction to Statistical Programming I
BIOSTAT 801. Biostatistics Career Preparation and Development I

Spring Semester (11.5 credits)

BIOSTAT 704. Introduction to Statistical Theory and Methods II 
BIOSTAT 705. Applied Biostatistical Methods II 
BIOSTAT 706. Introduction to the Practice of Biostatistics II
BIOSTAT 722. Introduction to Statistical Programming II
BIOSTAT 802. Biostatistics Career Preparation and Development II 

During the second year of study, full-time Master of Biostatistics students will typically take one core course and a set of elective courses, and will receive credit toward the completion of the master's project. A typical sequence is as follows:

Fall Semester (10 credits)

BIOSTAT 707. Statistical Methods for Learning and Discovery 
BIOSTAT 720. Master's Project
Two Elective Courses (2 credits each)

Spring Semester (9 credits)

BIOSTAT 720. Master's Project
Three Elective Courses (2 credits each)

Course Descriptions

BIOSTAT 701. Introduction to Statistical Theory and Methods I. This course provides a formal introduction to the basic theory and methods of probability and statistics. It covers topics in probability theory with an emphasis on those needed in statistics, including probability and sample spaces, independence, conditional probability, random variables, parametric families of distributions, sampling distributions, and the central limit theorem. Core concepts are mastered through mathematical exploration, simulations, and linkage with the applied concepts studied in BIOSTAT 702. Offered in the fall.
Prerequisite(s): 2 semesters of calculus or its equivalent (multivariate calculus preferred). Familiarity with linear algebra is helpful.
Corequisite(s): BIOSTAT 702, BIOSTAT 703
Credits: 3

BIOSTAT 702.  Applied Biostatistical Methods I. This course provides an introduction to study design, descriptive statistics, and analysis of statistical models with one or two predictor variables. Topics include principles of study design, basic study designs, descriptive statistics, sampling, contingency tables, one- and two-way analysis of variance, simple linear regression, and analysis of covariance. Both parametric and non-parametric techniques are explored. Core concepts are mastered through team-based case studies and analysis of authentic research problems encountered by program faculty and demonstrated in practicum experiences in concert with BIOSTAT 703. Computational exercises will use the R and SAS packages. Offered in the fall.
Prerequisite(s): 2 semesters of calculus or its equivalent (multivariate calculus preferred). Familiarity with linear algebra is helpful.
Corequisite(s): BIOSTAT 701, BIOSTAT 703, BIOSTAT 721
Credits: 3

BIOSTAT 703. Introduction to the Practice of Biostatistics I. This course provides an introduction to biology at a level suitable for practicing biostatisticians and directed practice in techniques of statistical collaboration and communication. With an emphasis on the connection between biomedical content and statistical approach, this course helps unify the statistical concepts and applications learned in BIOSTAT 701 and BIOSTAT 702. In addition to didactic sessions on biomedical issues, students are introduced to different areas of biostatistical practice at Duke University Medical Center. Biomedical topics are organized around the fundamental mechanisms of disease from both evolutionary and mechanistic perspectives, illustrated using examples from infectious disease, cancer, and chronic/degenerative disease. In addition, students learn how to read and interpret research and clinical trial papers. Core concepts and skills are mastered through individual reading and class discussion of selected biomedical papers, team-based case studies, and practical sessions introducing the art of collaborative statistics. Offered in the fall.
Corequisite(s): BIOSTAT 701, BIOSTAT 702
Credits: 3

BIOSTAT 704. Introduction to Statistical Theory and Methods II. This course provides a formal introduction to the basic theory and methods of probability and statistics. It covers topics in statistical inference, including classical and Bayesian methods, and statistical models for discrete, continuous, and categorical outcomes. Core concepts are mastered through mathematical exploration, simulations, and linkage with the applied concepts studied in BIOSTAT 705. Offered in the spring.
Prerequisite(s): BIOSTAT 701 or its equivalent
Corequisite(s): BIOSTAT 705, BIOSTAT 706
Credits: 3

BIOSTAT 705. Applied Biostatistical Methods II. This course provides an introduction to study design, descriptive statistics, and analysis of statistical models with continuous, dichotomous, and survival outcomes, with one or more predictor variables. Topics include mixed effects models, likelihood and Bayesian estimation, generalized linear models (GLM) including binary, multinomial, and log-linear models, basic models for survival analysis and regression models for censored survival data, clustered data, and model assessment, validation, and prediction. Both parametric and non-parametric techniques are explored. Core concepts are mastered through team-based case studies and analysis of authentic research problems encountered by program faculty and demonstrated in practicum experiences in concert with BIOSTAT 706. Computational exercises use the SAS and R packages. Offered in the spring.
Prerequisite(s): BIOSTAT 702 or its equivalent; linear and matrix algebra
Corequisite(s): BIOSTAT 704, BIOSTAT 706, BIOSTAT 722
Credits: 3

BIOSTAT 706. Introduction to the Practice of Biostatistics II. Successful working biostatisticians draw on a wide range of skills, including knowledge of biostatistical theory and methods, understanding of general biology and medicine, and communication with collaborators at all levels. This course will build on fundamentals learned in BIOSTAT 701, 702, and 703 with an emphasis on integrating that knowledge in practice. The course will be primarily participatory, with students interacting with Duke researchers to design biostatistical analyses. Researcher and student presentations will be a large part of the course, supplemented with readings from the literature. As with BIOSTAT 703, there will be a strong emphasis on the development of communication skills via written and oral presentations. Offered in the spring.
Prerequisite(s): BIOSTAT 703
Corequisite(s): BIOSTAT 704, BIOSTAT 705
Credits: 3

BIOSTAT 707. Statistical Methods for Learning and Discovery. This course surveys a number of techniques for high dimensional data analysis useful for data mining, machine learning and genomic applications, among others. Topics include principal and independent component analysis, multidimensional scaling, tree based classifiers, clustering techniques, support vector machines and networks, and techniques for model validation. Core concepts are mastered through the analysis and interpretation of several actual high dimensional genomics datasets. Offered in the fall.
Prerequisite(s): BIOSTAT 701 through BIOSTAT 706, or their equivalents
Credits: 3

BIOSTAT 708. Clinical Trial Design and Analysis. Topics include: history/background and process for clinical trials, key concepts for good statistics practice (GSP)/good clinical practice (GCP), regulatory requirements for pharmaceutical/clinical development, basic considerations for clinical trials, designs for clinical trials, classification of clinical trials, power analysis for sample size calculation, statistical analysis for efficacy evaluation, statistical analysis for safety assessment, implementation of a clinical protocol, statistical analysis plan, data safety monitoring, adaptive design methods in clinical trials (general concepts, group sequential design, dose finding design, and phase I/II or phase II/III seamless design), and controversial issues in clinical trials. Offered in the spring.
Prerequisite(s): BIOSTAT 701 and BIOSTAT 704, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 709. Observational Studies. Methods for causal inference, including confounding and selection bias in observational or quasi-experimental research designs, propensity score methodology, instrumental variables, and methods for non-compliance in randomized clinical trials. Offered in the spring.
Prerequisite(s): BIOSTAT 701 and BIOSTAT 702, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 710. Statistical Genetics and Genetic Epidemiology. Topics from current and classical methods for assessing familiality and heritability, linkage analysis of Mendelian and complex traits, family-based and population-based association studies, genetic heterogeneity, epistasis, and gene-environment interactions. Computational methods and applications in current research areas. The course will include a simple overview of genetic data, terminology, and essential population genetic results. Topics will include sampling designs in human genetics, gene frequency estimation, segregation analysis, linkage analysis, tests of association, and detection of errors in genetic data. Offered in the spring.
Prerequisite(s): BIOSTAT 701 and BIOSTAT 704, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 713. Survival Analysis. Introduction to concepts and techniques used in the analysis of time to event data, including censoring, hazard rates, estimation of survival curves, regression techniques, applications to clinical trials. Interval censoring, informative censoring, competing risks, multiple events and multiple endpoints, time dependent covariates; nonparametric and semi-parametric methods. Offered in the fall.
Prerequisite(s): BIOSTAT 701 and BIOSTAT 704, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 714. Categorical Data Analysis. Topics in categorical modeling and data analysis/contingency tables; measures of association and testing; logistic regression; log-linear models; computational methods including iterative proportional fitting; models for sparse data; Poisson regression; models for ordinal categorical data, and longitudinal analysis. Offered in the fall.
Prerequisite(s): BIOSTAT 701, BIOSTAT 702, BIOSTAT 704, and BIOSTAT 705, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 718. Analysis of Correlated and Longitudinal Data. Topics include linear and nonlinear mixed models; generalized estimating equations; subject-specific versus population-average interpretation; and hierarchical models. Offered in the spring.
Prerequisite(s): BIOSTAT 701, BIOSTAT 702, BIOSTAT 704, and BIOSTAT 705, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 719. Generalized Linear Models. The class introduces the concepts of the exponential family of distributions and the link function, and their use in generalizing standard linear regression to accommodate various outcome types. The theoretical framework will be presented, but detailed practical analyses will be performed as well, including logistic regression and Poisson regression with extensions. The majority of the course will deal with the independent observations framework. However, there will be substantial discussion of longitudinal/clustered data where correlations within clusters are expected. To deal with such data, Generalized Estimating Equations and Generalized Linear Mixed Models will be introduced. An introduction to a Bayesian analysis approach will be presented, time permitting. Offered in the fall.
Prerequisite(s): BIOSTAT 701, BIOSTAT 702, BIOSTAT 704, and BIOSTAT 705, or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 720. Master’s Project. Completed during a student’s final year of study, the two-semester master’s project is performed under the direction of a faculty mentor and is intended to demonstrate general mastery of biostatistical practice. Offered in the fall and spring (three credits per semester for a total of six credits).
Prerequisite(s): BIOSTAT 701 through BIOSTAT 706
Corequisite(s): BIOSTAT 707
Credits: 3

BIOSTAT 721. Introduction to Statistical Programming I (R). This class is an introduction to programming in R, targeted at statistics majors with minimal programming knowledge, which will give them the skills to grasp how statistical software works, tweak it to suit their needs, recombine existing pieces of code, and when needed create their own programs. Students will learn the core ideas of programming (functions, objects, data structures, input and output, debugging, and logical design) through writing code to assist in numerical and graphical statistical analyses. Students will learn how to write maintainable code and to test code for correctness. They will then learn how to set up stochastic simulations and how to work with and filter large data sets. Since code is also an important form of communication among scientists, students will learn how to comment and organize code to achieve reproducibility. Programming techniques and their application will be closely connected with the methods and examples presented in the co-requisite course. The primary programming package used in this course will be R. Offered in the fall.
Prerequisite(s): None; familiarity with linear algebra is helpful
Corequisite(s): BIOSTAT 702
Credits: 2

BIOSTAT 722. Introduction to Statistical Programming II (SAS). This class is an introduction to programming in SAS, targeted at statistics majors with minimal programming knowledge, which will give them the skills to grasp how statistical software works, tweak it to suit their needs, recombine existing pieces of code, and when needed create their own programs. Students will learn the core ideas of programming (data step, procedures, macros, ODS, input and output, debugging, and logical design) through writing code to assist in numerical and graphical statistical analyses. Students will learn how to write maintainable code and to test code for correctness. They will then learn how to set up stochastic simulations and how to work with and filter large data sets. Since code is also an important form of communication among scientists, students will learn how to comment and organize code to achieve reproducibility. Programming techniques and their application will be closely connected with the methods and examples presented in the co-requisite course. The primary programming package used in this course will be SAS. Offered in the spring.
Prerequisite(s): None; familiarity with linear algebra is helpful
Corequisite(s): BIOSTAT 705
Credits: 2

BIOSTAT 801. Biostatistics Career Preparation and Development I. The purpose of this course is to give students a holistic view of career choices and development and the tools they will need to succeed as professionals in the world of work. The fall semester will focus on resume development, creating a professional presence, networking techniques, what American employers expect in the workplace, creating and maintaining a professional digital presence, and learning how to conduct and succeed at informational interviews. Practicums in this semester include an informational interviewing and networking practicum with invited guests. Students participate in a professional “etiquette dinner” and a “dress for success” module as well as an employer panel. Offered in the fall. For more details, also see the description of our Career Development and Job Placement program.
Corequisite(s): BIOSTAT 701 through BIOSTAT 703
Credits: 0.5

BIOSTAT 802. Biostatistics Career Preparation and Development II. The purpose of this course is to further develop the student’s job-seeking ability and the practical aspects of a job/internship search or interviewing for a PhD program. The goal is to learn these skills once and use them for a lifetime. Modules that will be covered include: communication skills, both written and oral; interviewing with videotaped practice and review; negotiating techniques; potential career choices in the biostatistics marketplace; and working on a team. This semester includes a writing and interviewing practicum and a panel of relevant industry speakers. Students will leave this course with the knowledge to manage their careers now and in the future. Offered in the spring. For more details, also see the description of our Career Development and Job Placement program.
Prerequisite(s): BIOSTAT 801
Corequisite(s): BIOSTAT 707
Credits: 0.5

BIOSTAT 821. Software Tools for Data Science. A data scientist needs to master several different tools to obtain, process, analyze, visualize, and interpret large biomedical data sets such as electronic health records, medical images, and genomic sequences. It is also critical that the data scientist masters the best practices associated with using these tools, so that the results are robust and reproducible. The course covers foundational tools that will allow students to assemble a data science toolkit, including the Unix shell, text editors, regular expressions, relational and NoSQL databases, and the Python programming language for data munging, visualization, and machine learning. Best practices that students will learn include the Findable, Accessible, Interoperable and Reusable (FAIR) practices for data stewardship, as well as reproducible analysis with literate programming, version control, and containerization.
Prerequisite(s): Permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 822. R for Data Science. This course will build on the foundation laid in Software Tools for Data Science. The course will explore the flow of a typical data science project, from importing, cleaning, transforming, and visualizing datasets to modeling and communicating results, within the context of R programming. While the course will include best practices, syntax, and idioms specific to R, the focus will be on the process of conducting analysis in a reproducible fashion, writing readable, well-documented code, and creating a coherent presentation of results.
Prerequisite(s): Permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 823. Statistical Programming for Big Data. This course will extend the foundation laid in Software Tools for Data Science to allow for efficient computing involving very large data sets. This course will explore the use of appropriate algorithms and data structures for intensive computations and for massive data sets, improving computational performance by use of native code compilation, use of parallel computing to accelerate intensive computations, and use of distributed computing to process massive data sets.
Prerequisite(s): BIOSTAT 821 or permission of the Director of Graduate Studies
Credits: 2

BIOSTAT 824. Case Studies in Biomedical Data Science. This course will highlight how biomedical data science blends the field of biostatistics with the field of computer science through the introduction of 3 to 5 case studies. Students will be introduced to analytic programs typically encountered in biomedical data science and will implement the data science and statistical skills introduced in their previous course work.
Prerequisite(s): BIOSTAT 707, 821, 822, and 823 or permission of the Director of Graduate Studies
Credits: 2
