Big Data, Machine Learning and Statistics Professor Yongmiao Hong Cornell University July 8, 2020
Introduction to Statistics and Econometrics

CONTENTS
10.1 Introduction
10.2 Empirical Studies and Statistical Inference
10.3 Important Features of Big Data
10.4 Big Data Analysis and Statistics
10.5 Machine Learning and Statistics
10.6 Conclusion
Introduction

With the rapid development of internet and mobile internet technologies and their applications, the rise of Big data, together with machine learning as a main computer-based automatic analytic tool for Big data, has profound implications for the statistical sciences.

Compared with traditional historical data, Big data often has an extraordinarily large volume and comes in structured, semi-structured, and unstructured formats, often produced in real time or near real time.
What is Big data? Has Big data altered the foundations of the statistical sciences, such as sampling inference for a population, causal analysis, the sufficiency principle, data reduction, and prediction?

What challenges and opportunities does Big data bring to the theory and practice of statistical modelling and inference?

What is machine learning? What are the key differences between machine learning and statistical modelling? What is the relationship between machine learning and statistical inference?

As is well known, machine learning often delivers accurate out-of-sample predictions, but it looks like a black box. Can statistics provide meaningful interpretations for machine learning methods?

Can machine learning and statistics be synthesized, and if so, how will this affect the future development of the statistical sciences?
Our analysis delivers the following main conclusions:

Big data does not change the foundation of statistical sampling inference for a population, and many statistical methods, such as the sufficiency principle, data reduction, and causal inference, remain very useful for Big data analysis.

Big data shakes the conventional practice of using statistical significance to decide which variables are important in a model.

Big data poses new challenges to statistical modelling and inference, including to the basic assumptions of model uniqueness, correct model specification, and stationarity.
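The point about statistical significance can be illustrated with a small simulation. This is a minimal sketch and not taken from the lecture: the effect size of 0.01 standard deviations, the two sample sizes, and the helper names `t_statistic` and `simulate` are all assumptions chosen for illustration. It computes the usual t-statistic for the hypothesis that a mean is zero, whose magnitude grows roughly like the square root of the sample size when the true mean is nonzero.

```python
import math
import random


def t_statistic(sample):
    """t-statistic for H0: population mean = 0."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean / math.sqrt(var / n)


def simulate(n, effect=0.01, seed=0):
    """Draw n observations from N(effect, 1) and return the t-statistic.

    `effect` is a hypothetical, practically negligible true mean.
    """
    rng = random.Random(seed)
    return t_statistic([rng.gauss(effect, 1.0) for _ in range(n)])


# Same tiny effect, two very different sample sizes.
t_small = simulate(100)
t_large = simulate(1_000_000)

print(f"n = 100:       t = {t_small:.2f}")
print(f"n = 1,000,000: t = {t_large:.2f}")
```

Under these assumptions the expected t-statistic is about 0.1 for n = 100 and about 10 for n = 1,000,000, so with a large enough sample even an economically negligible effect is likely to pass a conventional 5% significance test. This is one way to read the claim that significance alone becomes a weak guide to variable importance.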