DMS Statistics and Data Science Seminar

Time: Oct 04, 2023 (02:00 PM)
Location: ZOOM/354 Parker Hall



Speaker: Takumi Saegusa, University of Maryland

Title: Data Integration in Public Health Research
Abstract: Various data sets collected from numerous sources have great potential to enhance the quality of inference and accelerate scientific discovery. Inference for merged data is, however, quite challenging because such data may contain unidentified duplication from overlapping inhomogeneous sources, and each data set, often opportunistically collected, induces complex dependence. In public health research, for example, epidemiological studies have different inclusion and exclusion criteria, in contrast to hospital records without a well-defined target population, and when these are combined with a disease registry, patients appear in multiple data sets. In this talk, we present several examples in public health research that can potentially benefit from data integration. We review existing approaches, such as random effects models and multiple frame surveys, and discuss their limitations in view of inferential goals, privacy protection, and large sample theory. We then propose our estimation and testing method in the context of survival analysis and two-sample tests. We illustrate our theory with simulations and real data examples. If time permits, we will discuss extensions of our proposed method in several directions.