Dynamic logistic state space prediction model for clinical decision making
Semiparametric efficient G-estimation with invalid instrumental variables
Sparse Factor Model for High Dimensional Time Series
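Lecturer and master's supervisor at the Research Center for Statistics and Data Science, Advanced Institute of Natural Sciences, Beijing Normal University. Research interests include nonparametric and semiparametric modeling and functional data analysis. Received a Ph.D. in economics from Southwestern University of Finance and Economics in 2018, then carried out postdoctoral research at Tsinghua University (2018-2019) and at the University of Pennsylvania (2020). Has published a number of papers in journals including Biometrics, Statistica Sinica, Scandinavian Journal of Statistics, Journal of Multivariate Analysis, Statistical Methods in Medical Research, and Science China Mathematics. Principal investigator of one National Natural Science Foundation of China Young Scientists Fund project and one National Statistical Science Research Project, and a participant in several other research projects.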
Prediction models for clinical decision making are of great importance and need to be updated frequently as the patient population and clinical practice change. Existing methods are either applied in an ad hoc fashion, such as model recalibration, or focus on studying the relationship between predictors and outcome rather than on prediction. In this article, we propose a dynamic logistic state space model that continuously updates the parameters whenever new information becomes available. The proposed model allows for both time-varying and time-invariant coefficients. The varying coefficients are modeled using smoothing splines to account for their smooth trends over time. The smoothing parameters are chosen objectively by maximum likelihood. The model is updated using batch data accumulated at prespecified time intervals, which allows for a better approximation of the underlying binomial density function. In simulations, we show that the new model has significantly higher prediction accuracy than existing methods. We apply the method to predict 1-year survival after lung transplantation using the United Network for Organ Sharing data.
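As a rough illustration of batch-wise updating in a logistic state space model, the sketch below filters a two-coefficient logistic regression one batch at a time. It is not the method of the talk: the time-varying slope follows a plain random walk rather than a smoothing spline, the state noise variance is fixed by hand rather than chosen by maximum likelihood, and each update uses a Gaussian (Laplace) approximation found by Newton iterations. All function names, batch sizes, and parameter values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update_with_batch(m_prev, P_prev, X, y, Q, n_newton=50, tol=1e-10):
    """One filtering step: random-walk prediction, then a Laplace update.

    m_prev, P_prev: Gaussian approximation to the coefficients before the batch.
    X, y: design matrix and 0/1 outcomes of the new batch.
    Q: state noise covariance (a zero diagonal entry means time-invariant).
    """
    P_pred = P_prev + Q                       # prediction step of the random walk
    P_pred_inv = np.linalg.inv(P_pred)
    beta = m_prev.copy()
    for _ in range(n_newton):                 # Newton iterations for the posterior mode
        mu = sigmoid(X @ beta)
        grad = X.T @ (y - mu) - P_pred_inv @ (beta - m_prev)
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None]) + P_pred_inv
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = sigmoid(X @ beta)
    hess = X.T @ (X * (mu * (1.0 - mu))[:, None]) + P_pred_inv
    return beta, np.linalg.inv(hess)          # mode and approximate covariance

def run_demo(n_batches=30, batch_size=500, seed=0):
    """Simulate a slowly drifting slope and track it batch by batch."""
    rng = np.random.default_rng(seed)
    true_slopes = np.linspace(0.5, 1.5, n_batches)   # assumed smooth drift
    m, P = np.zeros(2), np.eye(2)                    # vague starting values
    Q = np.diag([0.0, 0.002])   # intercept time-invariant, slope time-varying
    path = []
    for slope in true_slopes:
        x = rng.standard_normal(batch_size)
        X = np.column_stack([np.ones(batch_size), x])
        prob = sigmoid(-0.5 + slope * x)
        y = (rng.random(batch_size) < prob).astype(float)
        m, P = update_with_batch(m, P, X, y, Q)
        path.append(m.copy())
    return np.array(path), true_slopes

if __name__ == "__main__":
    path, true_slopes = run_demo()
    print("final estimate (intercept, slope):", path[-1])
    print("true values    (intercept, slope):", (-0.5, true_slopes[-1]))
```

Setting a diagonal entry of `Q` to zero is one simple way to keep a coefficient time-invariant while others drift; larger entries let a coefficient move faster at the cost of noisier estimates.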
Dr. Zhonghua Liu has been an assistant professor of biostatistics at Columbia University since August 2022. His current research interests include causal inference, machine learning, and their applications. Dr. Liu obtained his doctorate in biostatistics from Harvard University in 2015 and then spent about two years on Wall Street as a quantitative strategist. He was an assistant professor of statistics at The University of Hong Kong from 2018 to 2022.
The instrumental variable method is widely used in the health and social sciences for identification and estimation of causal effects in the presence of potential unmeasured confounding. To improve efficiency, multiple instruments are routinely used, raising concerns about bias due to possible violation of the instrumental variable assumptions. To address such concerns, we introduce a new class of G-estimators that are guaranteed to remain consistent and asymptotically normal for the causal effect of interest provided that a set of at least γ out of K candidate instruments are valid, for γ⩽K set by the analyst ex ante without necessarily knowing the identities of the valid and invalid instruments. We provide formal semiparametric efficiency theory supporting our results. Simulation studies and applications to UK Biobank data demonstrate the superior empirical performance of the proposed estimators compared with competing methods.
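For readers less familiar with the setting, the sketch below illustrates why instruments help under unmeasured confounding. It uses standard two-stage least squares with K = 3 instruments that are all valid by construction. It does not implement the proposed G-estimators and does not handle invalid instruments. The data-generating model, function names, and parameter values are assumptions made only for this illustration.

```python
import numpy as np

def simulate_iv_data(n=5000, k=3, effect=1.0, seed=0):
    """Simulate an exposure A and outcome Y sharing an unmeasured confounder U."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, k))            # candidate instruments (all valid here)
    U = rng.standard_normal(n)                 # unmeasured confounder
    A = Z @ np.full(k, 0.5) + U + rng.standard_normal(n)
    Y = effect * A + U + rng.standard_normal(n)
    return Z, A, Y

def ols_slope(A, Y):
    """Naive regression of Y on A, biased because U is omitted."""
    X = np.column_stack([np.ones(len(A)), A])
    return np.linalg.lstsq(X, Y, rcond=None)[0][1]

def two_stage_least_squares(Z, A, Y):
    """Stage 1: project A on the instruments. Stage 2: regress Y on the projection."""
    Z1 = np.column_stack([np.ones(len(A)), Z])
    A_hat = Z1 @ np.linalg.lstsq(Z1, A, rcond=None)[0]
    X = np.column_stack([np.ones(len(A)), A_hat])
    return np.linalg.lstsq(X, Y, rcond=None)[0][1]

if __name__ == "__main__":
    Z, A, Y = simulate_iv_data()
    print("OLS estimate: ", ols_slope(A, Y))
    print("2SLS estimate:", two_stage_least_squares(Z, A, Y))
```

In this simulated setup the naive regression overstates the effect because the confounder raises both exposure and outcome, while the instrument-based estimate stays close to the assumed true effect of 1.0. If some instruments also affected the outcome directly, this plain two-stage estimate would generally be biased, which is the concern the abstract addresses.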
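Professor in the School of Mathematical Sciences at Zhejiang University, director of the Institute of Statistics at Zhejiang University, and vice chairman of the council of the Zhejiang Provincial Institute of Field Statistics. Received a Ph.D. from Zhejiang University in 2004, carried out postdoctoral research at Peking University from July 2004 to June 2006, and has worked at Zhejiang University since 2006, with multiple visits to the Hong Kong University of Science and Technology, The Chinese University of Hong Kong, and the London School of Economics and Political Science. Main research interests are the theory and applications of nonstationary time series and high-dimensional spatial data. Has published more than 50 SSCI/SCI papers in journals including Ann. Statist., J. Amer. Statist. Assoc., and J. Econometrics. Received the Zhejiang Provincial Outstanding Youth Fund in 2015 and has led several projects funded by the National Natural Science Foundation of China, provincial key programs, and other provincial and ministerial-level funds. In 2021 received the second prize of the Zhejiang Provincial Natural Science Award and the third prize of the first Statistical Science and Technology Progress Award. Currently an Associate Editor of J. Korean Statist. Soc. (an SCI journal) and Intern. J. Math. Statist.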
Factor models have been extensively employed in high-dimensional time series. However, little is known about the case of a sparse loading matrix. This paper introduces a sparse factor model with an easy-to-implement estimation method, aiming to enhance interpretability and relax the constraints on the dimension p of the time series. Unlike the classical factor model, where p cannot exceed the square root of the sample size n for consistency, we allow p to grow with the sparseness and even to exceed n. Under regularity conditions, the loading space can be consistently estimated. In addition, a randomized sequential test is introduced to determine the number of sparse factors. Simulations and real data analyses of sea surface air pressure and stock portfolios are also provided to illustrate the performance of the proposed method.
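To make the notion of a loading space concrete, the sketch below simulates a factor model y_t = A x_t + e_t with a sparse loading matrix A and estimates the column space of A from the leading eigenvectors of accumulated lagged autocovariance matrices. This is a common eigenanalysis approach for time series factor models, used here only as a stand-in: the abstract does not describe the proposed estimator, the sketch does not exploit sparsity, and the number of factors is treated as known rather than tested. All names and settings are hypothetical.

```python
import numpy as np

def simulate_sparse_factor_series(n=400, p=50, r=2, nonzero_per_factor=10, seed=0):
    """Generate y_t = A x_t + e_t with a sparse p-by-r loading matrix A."""
    rng = np.random.default_rng(seed)
    A = np.zeros((p, r))
    for j in range(r):
        # Each factor loads on its own small block of series; other entries stay zero.
        rows = slice(j * nonzero_per_factor, (j + 1) * nonzero_per_factor)
        A[rows, j] = rng.uniform(1.0, 2.0, size=nonzero_per_factor)
    x = np.zeros((n, r))
    for t in range(1, n):                      # AR(1) factors give serial dependence
        x[t] = 0.8 * x[t - 1] + rng.standard_normal(r)
    y = x @ A.T + rng.standard_normal((n, p))  # white-noise idiosyncratic errors
    return y, A

def estimate_loading_space(y, r, max_lag=2):
    """Orthonormal basis for the estimated loading space (p-by-r)."""
    n, p = y.shape
    yc = y - y.mean(axis=0)
    M = np.zeros((p, p))
    for k in range(1, max_lag + 1):
        S = yc[k:].T @ yc[:-k] / n             # lag-k sample autocovariance
        M += S @ S.T
    _, eigvecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    return eigvecs[:, -r:]                     # top-r eigenvectors

def subspace_distance(A_true, A_hat):
    """Spectral norm of the difference between the two projection matrices."""
    Q, _ = np.linalg.qr(A_true)
    return np.linalg.norm(Q @ Q.T - A_hat @ A_hat.T, ord=2)

if __name__ == "__main__":
    y, A = simulate_sparse_factor_series()
    A_hat = estimate_loading_space(y, r=2)
    print("distance between true and estimated loading spaces:",
          subspace_distance(A, A_hat))
```

Only the space spanned by the columns of A is identifiable without further constraints, which is why the comparison is made between projection matrices rather than between A and its estimate directly.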