Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved

Opening Vignette (3 of 3)
Discussion Questions for the Opening Vignette
1. What is Watson? What is special about it?
2. What technologies were used in building Watson (both hardware and software)?
3. What are the innovative characteristics of the DeepQA architecture that made Watson superior?
4. Why did IBM spend all that time and money to build Watson? Where is the return on investment (ROI)?

Text Analytics and Text Mining (1 of 2)
• Text Analytics versus Text Mining
• Text Analytics =
  – Information Retrieval +
  – Information Extraction +
  – Data Mining +
  – Web Mining
• or simply Text Analytics = Information Retrieval + Text Mining

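Two of the components named above, information retrieval and information extraction, can be contrasted with a minimal sketch. Everything in it is an assumption made for illustration: the three sample documents are invented, and the helper names `retrieve` and `extract_years` are hypothetical, not part of any library or of the slides. Retrieval finds the documents that match a query; extraction pulls a structured fact (here, a year) out of free text.

```python
import re

# Invented sample documents, used only for illustration.
documents = {
    "doc1": "Customer complaint filed in 2016 about late delivery.",
    "doc2": "Quarterly report shows delivery times improved in 2017.",
    "doc3": "Meeting notes on the new pricing plan.",
}

def retrieve(query, docs):
    """Information retrieval: ids of documents containing every query word."""
    words = query.lower().split()
    hits = []
    for doc_id, text in docs.items():
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        if all(word in tokens for word in words):
            hits.append(doc_id)
    return hits

def extract_years(text):
    """Information extraction: pull four-digit years out of free text."""
    return re.findall(r"\b(?:19|20)\d{2}\b", text)

matching_ids = retrieve("delivery", documents)
years = extract_years(documents["doc1"])
print(matching_ids)  # ['doc1', 'doc2']
print(years)         # ['2016']
```

Real systems replace the exact word match with ranked search and the regular expression with trained extractors, but the division of labor is the same.
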
Text Analytics and Text Mining (2 of 2)
• Figure 5.2 Text Analytics, Related Application Areas, and Enabling Disciplines
[Figure 5.2 labels: Text Analytics; Information Retrieval; Text Mining; Document Matching; Search Engines; Link Analysis; Web Content Mining; Web Structure Mining; Web Usage Mining; Knowledge Discovery in Textual D…; enabling disciplines: Statistics, Management Science, Artificial Intelligence, Computer Science, Other Disciplines]

Text Mining Concepts (1 of 2)
• 85-90 percent of all corporate data is in some kind of unstructured form (e.g., text)
• Unstructured corporate data is doubling in size every 18 months
• Tapping into these information sources is not an option, but a need to stay competitive
• Answer: text mining
  – A semi-automated process of extracting knowledge from unstructured data sources
  – a.k.a. text data mining or knowledge discovery in textual databases

Data Mining Versus Text Mining
• Both seek novel and useful patterns
• Both are semi-automated processes
• Difference is the nature of the data:
  – Structured versus unstructured data
  – Structured data: in databases
  – Unstructured data: Word documents, PDF files, text excerpts, XML files, and so on
• To perform text mining: first, impose structure on the data, then mine the structured data

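The two-step recipe in the last bullet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used by any particular tool: the sample sentences are invented, the function names are hypothetical, the "structure" imposed is a plain term-by-document count matrix, and the "mining" step is simply ranking terms by total frequency.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_term_document_matrix(documents):
    """Step 1, impose structure: one row of term counts per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

def most_frequent_terms(vocabulary, matrix, top_n=3):
    """Step 2, mine the structured data: rank terms by total count."""
    totals = [sum(column) for column in zip(*matrix)]
    ranked = sorted(zip(vocabulary, totals), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]

# Invented sample documents, used only for illustration.
documents = [
    "Text mining extracts knowledge from text.",
    "Data mining works on structured data.",
    "Text mining first imposes structure on text.",
]

vocabulary, matrix = build_term_document_matrix(documents)
top_terms = most_frequent_terms(vocabulary, matrix)
print(top_terms)  # [('text', 4), ('mining', 3), ('data', 2)]
```

Once the text is in matrix form, any standard data mining technique (clustering, classification, association analysis) can be applied to the rows in place of the simple frequency ranking used here.
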