Curriculum Overview

Traditional features of the curriculum include parallel development of theory and applications as well as coverage of specific biostatistical topic areas and ethical issues in the conduct of statistical and medical research. The core curriculum covers the principles of epidemiologic studies in detail.  Embedded throughout the curriculum are examples of conflict of interest situations faced by biostatisticians, along with principles of reproducible research and strategies for implementation.

Required Knowledge in the Following Core Courses

For students with a Master's degree in Biostatistics, some of the required 700-level courses listed below may be waived if they have previously taken those courses or their equivalents.

BIOSTAT 701. Introduction to Statistical Theory and Methods I
BIOSTAT 702. Applied Biostatistical Methods I
BIOSTAT 703. Introduction to the Practice of Biostatistics I
BIOSTAT 704. Introduction to Statistical Theory and Methods II
BIOSTAT 705. Applied Biostatistical Methods II
BIOSTAT 706. Introduction to the Practice of Biostatistics II
BIOSTAT 713. Survival Analysis
BIOSTAT 714. Categorical Data Analysis
BIOSTAT 718. Analysis of Correlated and Longitudinal Data
BIOSTAT 719. Generalized Linear Models
BIOSTAT 900. Current Problems in Biostatistics
BIOSTAT 901. Advanced Topics in Modern Inferential Techniques and Theory
BIOSTAT 905. Linear Models
BIOSTAT 906. Statistical Inference
BIOSTAT 910. Career Development and Prep
STA 711. Probability and Measure Theory

Approved Elective Courses

BIOSTAT 707. Statistical Methods for Learning and Discovery
BIOSTAT 708. Clinical Trial Design and Analysis
BIOSTAT 709. Observational Studies
BIOSTAT 710. Statistical Genetics and Genetic Epidemiology
BIOSTAT 902. Statistical Methods for Analysis with Missing Data
BIOSTAT 903. Advanced Survival Analysis
BIOSTAT 907. Early Phase Clinical Trials
BIOSTAT 908. Independent Study
BIOSTAT 909. Internship Course
STA 561D. Probabilistic Machine Learning
STA 601. Bayesian and Modern Statistical Data Analysis
STA 640. Causal Inference
STA 663L. Statistical Computation


The PhD program follows the Duke Graduate School Academic Calendar.

Timeline and Curriculum for Students with an Applicable Quantitative Master's Degree 

Timeline and Curriculum for Students without an Applicable Quantitative Master's Degree

PhD-Level Course Descriptions

BIOSTAT 900. Current Problems in Biostatistics. Advanced seminar on topics at the research frontiers in biostatistics. Readings of current biostatistical research and presentations by faculty and advanced students of current research in their area of specialization.  (1 unit)

BIOSTAT 901. Advanced Inferential Techniques and Theory. Stochastic processes, martingales, counting processes, weak convergence, and basic empirical process theory and applications. Hilbert spaces for random vectors, semiparametric models, geometry of efficient score functions and efficient influence functions, construction of semiparametric efficient estimators. Applications include the restricted moment model and the proportional hazards model, among others. Theory of M- and Z-estimators with various applications. Bootstrap methods. Prerequisites: STA 711, BIOSTAT 906.

BIOSTAT 905. Linear Models. Introduction to linear models and linear inference from the coordinate-free viewpoint. Topics: identifiability and estimability, key properties of and results for finite-dimensional vector spaces, linear transformations, self-adjoint transformations, spectral theorem, properties and geometry of orthogonal projectors, Cochran's theorem, estimation and inference for normal models, distributional properties of quadratic forms, minimum variance linear unbiased estimation, Gauss-Markov theorem and estimation, calculus of differentials, analysis of variance and covariance. Prerequisites: BIOSTAT 704, BIOSTAT 702, BIOSTAT 705, real analysis and linear algebra, or consent of the instructor. (3 units)

BIOSTAT 906. Statistical Inference. Introduction to decision theory and optimality criteria, sufficiency, methods for point estimation, confidence interval and hypothesis testing methods and theory. Prerequisite: BIOSTAT 704 or equivalent.

BIOSTAT 910. Career Development and Prep. The purpose of this course is to give the student a holistic view of career choices and individual development plans including the tools they will need to succeed as professionals in the world of work. The curriculum will focus on the unique challenges of PhD candidates and tools needed for successful careers in Academia or in Industry. Instructor: Baker. 1 unit.

STA 711. Probability and Measure Theory. Introduction to probability spaces, the theory of measure and integration, random variables, and limit theorems. Distribution functions, densities, and characteristic functions; convergence of random variables and of their distributions; uniform integrability and the Lebesgue convergence theorems. Weak and strong laws of large numbers, central limit theorem.  Prerequisite: elementary real analysis and elementary probability theory. (3 units)

Elective Courses

BIOSTAT 902. Statistical Methods for Analysis with Missing Data. Theory and application of missing data methodology, ad hoc methods, missing data mechanism, selection models, pattern mixture models, likelihood-based methods, multiple imputation, inverse probability weighting, sensitivity analysis. Prerequisites: Statistical Science 711, 721, and 732, or consent of instructor. (3 units)

BIOSTAT 903. Advanced Survival Analysis. Designed for PhD students in Biostatistics or DSS departments who may be interested in conducting methodological research in the area of Survival Data Analysis. Applications of counting process and martingale theory to right-censored survival data. Applications of empirical process theory to more general and possibly more complex statistical models, using nonparametric analysis of interval-censored data as illustrating examples. After completion, students are expected to be able to understand statistical methods papers on survival analysis appearing in top-tier statistical journals. Prerequisites: BIOSTAT 701, 704, and 713, or equivalent, or consent of instructor. (3 units)

BIOSTAT 907. Phase II Clinical Trials. Introduction to diverse statistical design and analysis methods for randomized phase II clinical trials. Topics: minimax, optimal, and admissible clinical trials; inference methods for phase II clinical trials; clinical trials with survival endpoints; clinical trials with heterogeneous patient populations; and randomized phase II clinical trials. Instructor consent required. Instructor: Jung. 3 units.

BIOSTAT 908. Independent Study. Faculty-directed statistical methodology research. (1 unit)

BIOSTAT 909. Internship Course. Students gain practical experience by taking an internship in industry or government and write a report about this experience. Requires prior consent from the student's advisor and from the Director of Graduate Studies. May be repeated with consent of the advisor and the Director of Graduate Studies. Credit/no credit grading only.

STA 561D. Probabilistic Machine Learning. Introduction to concepts in probabilistic machine learning with a focus on discriminative and hierarchical generative models. Topics include directed and undirected graphical models, kernel methods, exact and approximate parameter estimation methods, and structure learning. Prerequisites: Linear algebra, Statistical Science 250 or Statistical Science 611. Instructor: Heller, Mukherjee, or Reeves. 3 units.

STA 601. Bayesian and Modern Statistical Data Analysis. Principles of data analysis and modern statistical modeling. Exploratory data analysis. Introduction to Bayesian inference, prior and posterior distributions, predictive distributions, hierarchical models, model checking and selection, missing data, introduction to stochastic simulation by Markov Chain Monte Carlo using a higher level statistical language such as R or Matlab. Applications drawn from various disciplines. Not open to students with credit for Statistical Science 360. Prerequisite: Statistical Science 210, 230 and 250, or close equivalents. Instructor: Clyde, Dunson, Reiter, or Volfovsky. 3 units.

STA 640. Causal Inference. Statistical issues in causality and methods for estimating causal effects. Randomized designs and alternative designs and methods for when randomization is infeasible: matching methods, propensity scores, longitudinal treatments, regression discontinuity, instrumental variables, and principal stratification. Methods are motivated by examples from social sciences, policy and health sciences. Instructor: Li or Volfovsky.

STA 663L. Statistical Computation. Statistical modeling and machine learning involving large data sets and challenging computation. Data pipelines and databases, big data tools, sequential algorithms and subsampling methods for massive data sets, efficient programming for multi-core and cluster machines, including topics drawn from GPU programming, cloud computing, Map/Reduce and general tools of distributed computing environments. Intense use of statistical and data manipulation software will be required. Data from areas such as astronomy, genomics, finance, social media, networks, neuroscience. Instructor consent required. Prerequisites: Statistics 521L, 523L; Statistics 531, 532 (or co-registration). (3 units)