Lecture 2: Raw Data Analysis and Pre-processing
Dr. 李晓瑜 (Xiaoyu Li)
Email: xiaoyuuestc@uestc.edu.cn
http://blog.sciencenet.cn/u/uestc2014xiaoyu
2019-Spring
SunData Group, http://www.sundatagroup.org/
School of Information and Software Engineering, UESTC
Copyright © 2019 by Xiaoyu Li.
Today's Topics
● Data Integration
● Data Reduction
● Data Transformation
Goals of Data Pre-processing
● Data cleaning: Handle missing values and noisy data, remove outliers (isolated points), and resolve inconsistencies.
● Data integration: Integrate multiple databases, data cubes, or even data files.
● Data reduction: Obtain a compressed data set that yields the same or similar analytical results.
● Data selection: Select the most effective data for analysis.
● Data discretization: Convert continuous data into discrete data.
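As a rough illustration of the data cleaning and data discretization goals listed above, the following minimal sketch fills missing values with the mean and then applies equal-width binning. The function names, the sample `ages` list, and the choice of mean imputation and three equal-width bins are assumptions made for this example; the lecture does not prescribe a particular method.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sample data: None marks a missing measurement.
ages = [22, None, 35, 58, 41, None, 64]
cleaned = fill_missing_with_mean(ages)   # data cleaning
bins = equal_width_bins(cleaned, 3)      # data discretization
print(cleaned)
print(bins)
```

Mean imputation is only one option; a median or a model-based estimate may suit skewed data better, and equal-frequency bins are a common alternative to equal-width bins.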
Goals of Data Pre-processing (continued)
● Data feature extraction: Abstract the original features into a set of properties with clear physical significance (Gabor; geometric features [corner points, invariants]; texture [LBP, HOG]) or statistical significance.
● Data transformation: Standardize and consolidate data drawn from different raw sources.
● Data normalization: Unify data from different sources into a common frame of reference to facilitate rapid convergence.
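To make the normalization goal above concrete, here is a minimal sketch of two widely used rescaling schemes: min-max scaling to [0, 1] and z-score standardization. The function names and the `heights_cm` sample are assumptions for illustration only; the slide does not say which scheme the lecture has in mind.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to 0.0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values at 0 and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature measured in centimetres.
heights_cm = [150, 160, 170, 180, 190]
scaled = min_max_scale(heights_cm)
standardized = z_score(heights_cm)
print(scaled)
print(standardized)
```

Putting features on a comparable scale in this way tends to help gradient-based training converge faster, which matches the "rapid convergence" motivation on the slide.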
2.5 Data Integration