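Topic: Non-Euclidean object filtering for large-scale discriminant analysis
Speaker: Chen Xin (陈欣), Southern University of Science and Technology
Host: Jiang Yunlu (姜云卢), Jinan University
Time: Tuesday, June 10, 2025, 14:30-15:30
Venue: Room 102, College of Economics (Zhonghui Building), Shipai Campus, Jinan University
Abstract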
Classifying random objects in metric spaces without a vector structure has garnered increasing attention. However, the inherent complexity of such non-Euclidean data often restricts existing models to handling only a few features, leaving a gap in real-world applications. To address this, we propose a data-adaptive filtering procedure to identify informative features from a large-scale set of random objects, leveraging a novel Kolmogorov-Smirnov type statistic defined on the metric space. Our method, applicable to data in general metric spaces with binary labels, exhibits remarkable flexibility. It is model-free, meaning its implementation does not depend on a specific classifier class. Theoretically, it guarantees strong selective consistency while controlling the false discovery rate. Empirically, equipped with a Wasserstein metric, it demonstrates superior performance compared to Euclidean competitors across various settings. We conduct a complete study on autism data, which identifies significant brain regions associated with the disease. Notably, it reveals distinct interaction patterns among brain regions in individuals with and without autism by filtering hundreds of thousands of covariance matrices representing diverse brain connectivities.
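As a rough illustration of the kind of screening the abstract describes, the sketch below scores each feature with a Kolmogorov-Smirnov type statistic computed from pairwise distances and keeps the top-ranked features. It is a minimal sketch under stated assumptions, not the speaker's procedure: the statistic (the largest gap between the two classes' fractions of objects inside a metric ball), the use of 1-D empirical samples as random objects, and the names `wasserstein_1d`, `ks_type_statistic`, and `screen_features` are all hypothetical choices made here for illustration. False discovery rate control is not implemented.

```python
import numpy as np

def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1-D empirical samples.

    For equal sample sizes the optimal coupling matches sorted values.
    """
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def ks_type_statistic(dist, labels):
    """Illustrative KS-type statistic on a metric space (an assumption,
    not the statistic from the talk).

    For every anchor object i and every radius r = dist[i, j], compare the
    fraction of class-0 and class-1 objects inside the ball of radius r
    around i, and return the largest absolute gap.
    """
    labels = np.asarray(labels)
    in0, in1 = labels == 0, labels == 1
    stat = 0.0
    for i in range(dist.shape[0]):
        for r in dist[i]:
            within = dist[i] <= r
            gap = abs(within[in0].mean() - within[in1].mean())
            stat = max(stat, gap)
    return stat

def screen_features(objects, labels, top_k):
    """Rank features by the statistic and return the top_k indices.

    objects: array of shape (n_subjects, n_features, n_points); each
    objects[s, j] is a 1-D sample treated as one random object.
    """
    n, p, _ = objects.shape
    scores = np.empty(p)
    for j in range(p):
        # Pairwise Wasserstein distances between subjects for feature j.
        dist = np.zeros((n, n))
        for a in range(n):
            for b in range(a + 1, n):
                d = wasserstein_1d(objects[a, j], objects[b, j])
                dist[a, b] = dist[b, a] = d
        scores[j] = ks_type_statistic(dist, labels)
    order = np.argsort(scores)[::-1]  # largest statistic first
    return order[:top_k], scores
```

The statistic uses only pairwise distances and the binary labels, so in principle any metric could replace `wasserstein_1d` (for example, a metric on covariance matrices) without touching the ranking step.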
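About the Speaker
Chen Xin is an associate professor, research fellow, and doctoral supervisor in the Department of Statistics and Data Science at Southern University of Science and Technology. Chen received a bachelor's degree from the Department of Mathematics at Nankai University in 1999, a master's degree from the National University of Singapore in 2003, and a Ph.D. from the University of Minnesota, Twin Cities in 2010, and previously taught at Syracuse University and the National University of Singapore. Chen's main research area is methods for dimension reduction and variable selection in high-dimensional data; other research interests include complex data analysis and the use of statistical methods to study climate change. Chen has published several papers in the leading statistics journals Annals of Statistics and Biometrika, and currently serves as an associate editor of the JCR Q1 journals Biometrics and Statistics & Computing.
All interested faculty and students are welcome to attend!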
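Proofreading | Jiang Yunlu (姜云卢)
Editor | Peng Yi (彭毅)
Initial review | Jiang Yunlu (姜云卢)
Final review and release | He Lingyun (何凌云)
(Source: WeChat official account of the College of Economics, Jinan University)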