Thinking in (Text) Clustering
(No math, be not afraid)
Text Mining & NLP & ML

Yueshen Xu (lecturer)
ysxu@xidian.edu.cn / xuyueshen@163.com
Data and Knowledge Engineering Research Center
Xidian University
2017/4/13 Software Engineering

Outline
  Background
  What can be clustered?
  Problems in K-XXX (Means/Medoid/Center…)
    Similarity Measure
    Convex and Concave
  Problems in Gaussian Mixture Model
  Problems in Matrix Factorization
  Multinomial and Sparsity
  (Basics, not state-of-the-art)
  Keywords: Clustering, K-Means/Medoid, Similarity Computation, GMM, MF, Multinomial Distribution
Background
  Information Overload
    Big Data, Cloud Computing, Artificial Intelligence, Deep Learning, …
  We need
    Summarization
    Visualization
    Dimensionality Reduction
Background
  Dimensionality Reduction (DR)
    Clustering
      Text Clustering, Webpage Clustering, Image Clustering…
    Summarization
      Document Summarization, Image Summarization…
    Factorization
      Rating Matrix Factorization, Image Non-negative Factorization
  Basic Requirement
    Automatic, Applicable, Explainable
  → Clustering (Text)
Some Concepts
  Related Research Areas
    Dimensionality Reduction (DR)
    Text Mining
    Natural Language Processing
    Computational Linguistics
    Information Retrieval
    Artificial Intelligence
    (Text) Clustering
  [Figure: diagram of related areas, labeled Information Retrieval, Computational Linguistics, Natural Language Processing, LSA/Topic Model, Text Mining, DR, Data Mining, Artificial Intelligence, Machine Learning, Machine Translation, (Text) Clustering]
  We all know what (text) clustering is, right?
  A widely accepted topic, since everyone knows it