Statistics and Data Science


DMS Statistics and Data Science Seminar
Sep 23, 2021 02:00 PM
ZOOM


Speaker: Yanzhao Cao, Auburn University

 


DMS Statistics and Data Science Seminar
Sep 30, 2021 02:00 PM
ZOOM


Speaker: Gaetan Bakalli, Auburn University


DMS Statistics and Data Science Seminar
Sep 16, 2021 02:00 PM
ZOOM


Speaker: Fushing Hsieh, UC Davis

Title: The geometry of colors in van Gogh's Sunflowers

 

Abstract: "Paintings fade like flowers": van Gogh's prediction on the impact of age on paintings came true for most of his paintings. We have studied the consequences of this aging on the Sunflowers in a vase with a yellow background series, namely its original, F454, currently in London, and two replicates, F457, in Tokyo, and F458, in Amsterdam, which van Gogh painted using the original as a model. The background and flower renditions in those paintings have faded and turned brown, making them less vibrant than van Gogh had most likely intended. We have attempted to restore van Gogh's intent using a computational approach based on data science. After identifying regions of interest (ROIs) within the three paintings F454, F457, and F458 that capture the flowers, stems of the flowers, and background, respectively, we studied the geometry of the color space (in RGB representation) occupied by those ROIs. By comparing those color spaces with those occupied by similar ROIs in photographs of real sunflowers, we identified shifts in all three color coordinates, R, G, and B, with the positive shift in the blue coordinate being the most salient. We have proposed two algorithms, PCR-1 and PCR-2, for correcting that shift in blue and generating representations of the paintings that aim to restore their original conditions. The reduction of the blue component in the yellow hues has led to more vibrant and less brownish digital renditions of the three Sunflowers in a vase with a yellow background.

 

Location: ZOOM


DMS Statistics and Data Science Seminar
Sep 09, 2021 02:00 PM
ZOOM


Speaker: Xiaowei Yue, Virginia Tech

Title: Stochastic Surrogate Models: Method, Algorithm, and Engineering Applications

 

Abstract: Surrogate models have been widely used in advanced design and manufacturing to tackle the high computational cost of high-fidelity simulations and digital twins. Due to sensing errors, actuating errors, and computational errors, uncertainties inevitably exist in engineering systems. Stochastic surrogate modeling, which incorporates the influence of these uncertainties, has become an emerging field. We developed two stochastic surrogates: (1) Neural Process Aided Ordinary Differential Equation (NP-ODE); (2) Neural network Gaussian process considering input uncertainty. We show the relationships between deep neural networks, Gaussian processes, and differential equations, and use these relationships to develop new physics-informed data analytics methods. We also demonstrate their applications in engineering simulations such as Finite Element simulation, Digital Twin, and Materials Science simulation.

 

Location: ZOOM


DMS Statistics and Data Science Seminar
Sep 02, 2021 02:00 PM
ZOOM


Speaker: Xiongtao Dai, Iowa State University

Title: Exploratory Data Analysis for Data Objects on a Metric Space via Tukey's Depth

Abstract: Exploratory data analysis involves looking at the data and understanding what can be done with them. Non-standard data objects such as directions, covariance matrices, trees, functions, and images have become increasingly common in modern practice. Such complex data objects are hard to examine due to the lack of a natural ordering and of efficient visualization tools. We develop a novel exploratory tool for data objects lying on a metric space based on data depth, extending the celebrated Tukey's depth for Euclidean data. The proposed metric halfspace depth assigns depth values to data points, characterizing the centrality and outlyingness of these points. This also leads to an interpretable center-outward ranking, which can be used to construct rank tests. I will demonstrate two applications, one to reveal differential brain connectivity patterns in an Alzheimer's disease study, and another to infer the phylogenetic history and outlying phylogenies in 7 pathogenic parasites.


DMS Statistics and Data Science Seminar
Aug 26, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/82501343299


Speaker: Dr. Chenang Liu, Oklahoma State University

Title: Data-Driven Anomaly Detection and Blockchain-Enabled Security Protection for Smart Manufacturing

Abstract: With the incorporation of the Internet of Things (IoT) and information technologies, the manufacturing environment has become data-rich and cyber-enabled, which enables the rapid development of smart manufacturing. However, because data imbalance and cyber vulnerability are common, achieving accurate anomaly detection and effective cyber-security protection remains challenging. To address these two critical issues, this research aims to develop methodologies that are capable of detecting process anomalies accurately and preventing potential cyber-physical attacks effectively. Toward this goal, an adversarial learning-based approach termed augmented time regularized generative adversarial network (ATR-GAN) is proposed to handle the highly imbalanced sensor data for anomaly detection. A blockchain-enabled method is then developed to prevent common cyber-physical attacks in manufacturing. Furthermore, real-world case studies based on additive manufacturing were conducted to demonstrate the effectiveness and potential of the proposed methods.

  

Location: https://auburn.zoom.us/j/82501343299

 


DMS Statistics and Data Science Seminar
Aug 19, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/82501343299


Speaker: Dr. Yao Xie, Georgia Institute of Technology

Title: Statistical Inference for Spatio-Temporal Point Processes

Abstract: Discrete events are a sequence of observations consisting of event time, location, and possibly "marks" with additional event information. Such event data is ubiquitous in modern applications, such as social networks, seismic activities, police report data, neuronal spike trains, and disease spread counts. We are particularly interested in capturing the complex dependence in discrete event data, especially estimating how nodes interact with each other, such as the triggering or inhibiting effects of historical events on future events. This helps us recover network topology, perform causal inference, understand spatio-temporal dynamics, and make predictions. Motivated by popular Hawkes processes, we introduce a new general modeling approach for capturing spatio-temporal interaction, which enjoys computationally efficient model estimation procedures. We establish statistical guarantees by connecting to a modern convex optimization theory of solving variational inequality. The good performance of the proposed method is illustrated using several real-world data sets.

 

Bio: Yao Xie is an Associate Professor and Harold R. and Mary Anne Nash Early Career Professor at Georgia Institute of Technology in the H. Milton Stewart School of Industrial and Systems Engineering, and an Associate Director of the Machine Learning Center. She received her Ph.D. in Electrical Engineering (minor in Mathematics) from Stanford University, M.Sc. in Electrical and Computer Engineering from the University of Florida, and B.Sc. in Electrical Engineering and Computer Science from the University of Science and Technology of China (USTC). She was a Research Scientist at Duke University. Her research areas are statistics (in particular sequential analysis and sequential change-point detection), machine learning, and signal processing, providing the theoretical foundation and developing computationally efficient and statistically powerful algorithms. She has worked on such problems in sensor networks, social networks, power systems, crime data analysis, and wireless communications. She received the National Science Foundation (NSF) CAREER Award in 2017. She is currently an Associate Editor for IEEE Transactions on Signal Processing, Sequential Analysis: Design Methods and Applications, and INFORMS Journal on Data Science, and serves on the Editorial Board of the Journal of Machine Learning Research and as an Area Chair of NeurIPS 2021.

 

 


DMS Statistics and Data Science Seminar
Apr 15, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


Speaker: Mikhail Zhelonkin (Erasmus University Rotterdam)

Title: Robust Estimation of Probit Models with Endogeneity

 

Abstract: Probit models with endogenous regressors are commonly used models in economics and other social sciences. Yet, the robustness properties of parametric estimators in these models have not been formally studied. In this paper, we derive the influence functions of the endogenous probit model’s classical estimators (the maximum likelihood and the two-step estimator) and prove their non-robustness to small but harmful deviations from distributional assumptions. We propose a procedure to obtain a robust alternative estimator, prove its asymptotic normality and provide its asymptotic variance. A simple robust test for endogeneity is also constructed. We compare the performance of the robust and classical estimators in Monte Carlo simulations with different types of contamination scenarios. The use of our estimator is illustrated in several empirical applications. 

 

 

 

ZOOM: https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


DMS Statistics and Data Science Seminar
Apr 08, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


Speaker: Debashis Mondal (Oregon State University)

Title: H-likelihood Methods in Spatial Statistics

 

Abstract: Youngjo Lee and John Nelder introduced an important body of literature on h-likelihood methods and hierarchical generalized linear models, which expanded the scope of generalized linear regressions with correlated errors and revived an interest in Charles Roy Henderson's pioneering ideas on mixed linear equations and best linear unbiased predictions. In this talk, I shall present my work on how the h-likelihood methods pave the way for a deeper understanding of kriging and residual maximum likelihood estimation in spatial statistics, particularly for models based on conditional and intrinsic auto-regressions, the de Wijs process, and fractional Gaussian fields. In addition, I shall discuss how h-likelihood methods allow for scalable matrix-free computations. The importance of these developments will be emphasized with applications from environmental science. At the end, I will mention some of my ongoing work.

 

 

ZOOM: https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


DMS Statistics and Data Science Seminar
Apr 01, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


Speaker: Dave Zhao (Associate Professor in Statistics, University of Illinois at Urbana-Champaign) 

Title: Perfect is the enemy of good: New shrinkage estimators for genomics

 

Abstract: Simultaneous estimation problems have a long history in statistics and have become especially common and important in genomics research: modern technologies can simultaneously assay tens of thousands to even millions of genomic features that can each introduce an unknown parameter of interest. These applications reveal some conceptual and methodological gaps in the standard empirical Bayes approach to simultaneous estimation. This talk summarizes standard approaches, illustrates some difficulties, and introduces an alternative approach based on regression modeling, along with some new estimators that can be applied to gene expression denoising, coexpression network reconstruction, and large-scale gene expression imputation.

 

Location: https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


DMS Statistics and Data Science Seminar
Mar 25, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


Speaker: Marco Avella-Medina (Columbia University)

Title: Differentially Private Inference via Noisy Optimization

 

Abstract: We propose a general optimization-based framework for computing differentially private M-estimators and a new method for the construction of differentially private confidence regions. Firstly, we show that robust statistics can be used in conjunction with noisy gradient descent and noisy Newton methods in order to obtain optimal private estimators with global linear or quadratic convergence, respectively. We establish global convergence guarantees, under both local strong convexity and self-concordance, showing that our private estimators converge with high probability to a neighborhood of the non-private M-estimators. The radius of this neighborhood is nearly optimal in the sense that it corresponds to the statistical minimax cost of differential privacy up to a logarithmic term. Secondly, we tackle the problem of parametric inference by constructing differentially private estimators of the asymptotic variance of our private M-estimators. This naturally leads to the use of approximate pivotal statistics for the construction of confidence regions and hypothesis testing. We demonstrate the effectiveness of a bias correction that leads to enhanced small-sample empirical performance in simulations. We illustrate the benefits of our methods with synthetic numerical examples and real data.

 

Location: https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


DMS Statistics and Data Science Seminar
Mar 18, 2021 02:00 PM
ZOOM https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09


Speaker: Andrea Apolloni (CIRAD, Montpellier, Languedoc-Roussillon, France)

Title: Modelling and Predicting National and Regional Animal Mobility in North/West Africa

 

Abstract: The trade of live animals is one of the main economic activities in most West and North African countries. Due to the absence of infrastructure, animals are sold alive at local markets to traders, and then moved to capital or coastal cities where they are slaughtered and butchered. In general, the consumption and production areas are several hundred km apart. The possibility of providing a reliable picture of livestock mobility in the area is hindered by the fact that few quantitative data are collected. In this talk, I present the results of the analysis of data on ruminants’ mobility provided by veterinary services in West and North African countries. Using gravity models, we found that possible mobility drivers include environmental factors (conditioning the availability of natural resources), commercial factors (demand and market price), economic factors (GDP difference between producer and consumer areas), and social factors such as religious festivities like the Tabaski celebration. To conclude, I will present the application of this approach to two case studies: the diffusion of genetic strains in the area and the risk of bluetongue occurrence in Senegal.

 

Location: https://auburn.zoom.us/j/83299681626?pwd=SjlGNi9MWWhEMExGM0c0QzBPK0hMZz09



Last Updated: 09/08/2020