Chapter 6. Data Structures: Classifying the Various Types of Data Sets
Basic Terminology
◆ Data set
  ⚫ Measurements of items
    ◼ e.g., Yearly sales volume for your 23 salespeople
    ◼ e.g., Cost and number produced, daily, for the past month
◆ Elementary units
  ⚫ The items being measured
    ◼ e.g., Salespeople, Days, Companies, Catalogs, …
◆ Variable
  ⚫ The type of measurement being done
    ◼ e.g., Sales volume, Cost, Productivity, Number of defects, …
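The three terms above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (`salesperson`, `yearly_sales`, `defects`), and the two helper functions are hypothetical and do not come from the slides.

```python
# Hypothetical data set: one record per elementary unit (a salesperson).
# All names and figures are invented for illustration only.
data_set = [
    {"salesperson": "A", "yearly_sales": 120_000, "defects": 3},
    {"salesperson": "B", "yearly_sales": 95_000, "defects": 1},
    {"salesperson": "C", "yearly_sales": 143_000, "defects": 0},
]


def elementary_units(records, id_key):
    """Return the items being measured (one per record)."""
    return [record[id_key] for record in records]


def variables(records, id_key):
    """Return the types of measurement, i.e. every field except the identifier."""
    return [key for key in records[0] if key != id_key]


print(elementary_units(data_set, "salesperson"))
print(variables(data_set, "salesperson"))
```

Here each record is an elementary unit, and each non-identifier field is a variable.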
How Many Variables?
◆ Univariate data set: One variable measured for each elementary unit
  ⚫ e.g., Sales for the top 30 computer companies
  ⚫ Can do: typical summary, diversity, special features
◆ Bivariate data set: Two variables measured for each elementary unit
  ⚫ e.g., Sales and Employees for the top 30 computer firms
  ⚫ Can also do: relationship, prediction
◆ Multivariate data set: Three or more variables measured for each elementary unit
  ⚫ e.g., Sales, Employees, Inventories, Profits, …
  ⚫ Can also do: predict one variable from all the other variables
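The classification by number of variables is a simple rule, sketched below. The function name `classify_by_variable_count` is a hypothetical helper introduced for illustration.

```python
def classify_by_variable_count(n_variables):
    """Name a data set by how many variables are measured per elementary unit."""
    if n_variables < 1:
        raise ValueError("a data set needs at least one variable")
    if n_variables == 1:
        return "univariate"
    if n_variables == 2:
        return "bivariate"
    return "multivariate"  # three or more variables


# Sales only; Sales and Employees; Sales, Employees, Inventories, Profits
for count in (1, 2, 4):
    print(count, classify_by_variable_count(count))
```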
Quantitative or Categorical?
◆ Quantitative Variable: measured on a numeric scale
  ⚫ e.g., Sales, Number of employees
  ⚫ Can add, rank, count
◆ Qualitative Variable: categorical (ordinal or nominal)
  ⚫ Ordinal Variable: Categories with meaningful ordering
    ◼ e.g., Bond rating (AA, A, B, …), Diamonds (VSI, SI, …)
    ◼ Can rank, count
  ⚫ Nominal Variable: Categories without meaningful ordering
    ◼ e.g., State, Type of business, Field of study
    ◼ Can count
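The "can add, rank, count" rules above can be written as a small lookup table. This is a minimal sketch: the table name, the `can` helper, and the sample ratings are hypothetical, and the ordering AA, A, B is taken from the bond rating example (best first).

```python
from collections import Counter

# Operations each kind of variable supports, as listed above.
ALLOWED_OPERATIONS = {
    "quantitative": {"add", "rank", "count"},
    "ordinal": {"rank", "count"},
    "nominal": {"count"},
}


def can(level, operation):
    """Return True if the operation is meaningful for this kind of variable."""
    return operation in ALLOWED_OPERATIONS[level]


# Ordinal example: ranking needs an explicit category order.
BOND_ORDER = ["AA", "A", "B"]          # assumed order, best rating first
ratings = ["B", "AA", "A", "AA"]       # invented sample values
ranked = sorted(ratings, key=BOND_ORDER.index)

# Counting works for every kind of variable, including nominal ones.
counts = Counter(ratings)

print(ranked)
print(counts["AA"])
print(can("nominal", "rank"))
```

Adding two bond ratings has no meaning, which is why `"add"` appears only under `"quantitative"`.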
Time-Series or Cross-Sectional?
◆ Time-Series Data: Data values recorded in a meaningful sequence, such as a stock market index
  ⚫ Elementary units might be days, quarters, or years
  ⚫ e.g., Daily Dow-Jones stock market average close for the past 90 days
  ⚫ e.g., Your firm's quarterly sales over the past 5 years
◆ Cross-Sectional Data: No meaningful sequence
  ⚫ e.g., Sales of 30 companies
  ⚫ e.g., Productivity of each sales division
  ⚫ Easier to analyze than time series!
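One way to see why sequence matters is to reorder the values and check which summaries survive. The sketch below uses invented quarterly sales figures and a hypothetical helper, `period_changes`; it is an illustration, not a method taken from the slides.

```python
from statistics import mean


def period_changes(values):
    """Change from each period to the next; depends on the order of the values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


quarterly_sales = [10, 12, 15, 11]      # invented figures, in time order
reordered = sorted(quarterly_sales)     # same values, sequence destroyed

# An order-free summary is unchanged by reordering.
print(mean(quarterly_sales), mean(reordered))

# A sequence-based summary is not.
print(period_changes(quarterly_sales))
print(period_changes(reordered))
```

For cross-sectional data only the order-free summaries apply, so reordering the elementary units loses nothing. For time-series data, reordering discards the period-to-period pattern.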