Challenges of Using World Knowledge
• Representation: data vs. knowledge representation
• Inference: knowledge specification; disambiguation
• Learning: scalability; domain adaptation; open domain classes
Networked Text Analysis Framework
• Text and World Knowledge Bases
• World Knowledge Specification
• World Knowledge Representation
• Learning
References:
– Wang et al. Incorporating World Knowledge to Document Clustering via Heterogeneous Information Networks. KDD'15.
– Wang et al. World Knowledge as Indirect Supervision for Document Clustering. TKDD'16.
World Knowledge Specification: Unsupervised Semantic Parsing for Documents
Semantic parsing is the task of mapping a piece of natural language text to a formal meaning representation.
Document: Obama is the president of the United States of America
Logic form: People.BarackObama ∩ PresidentofCountry.Country.USA
• Motivation: [Berant et al. EMNLP'13] aim to train a parser from question/answer pairs on a large knowledge base, Freebase
– Existing semantic parsing approaches require expert annotation
– Their approach scales to large knowledge bases and is supervised by the QA pairs
• No such training data exists for the document dataset.
World Knowledge Specification: Unsupervised Semantic Parsing for Documents
Document: Obama is the president of the United States of America
Derivation (bottom-up):
• lexicon: "Obama" → People.BarackObama
• lexicon: "president" → PresidentofCountry
• lexicon: "United States of America" → Country.USA
• join: PresidentofCountry, Country.USA → PresidentofCountry.Country.USA
• intersection: People.BarackObama ∩ PresidentofCountry.Country.USA
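The lexicon, join, and intersection steps on this slide can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the parser from the cited work: the `LEXICON` table, the function names, and the string encoding of logical forms are all hypothetical, and the phrase spans are given by hand rather than discovered by a parser.

```python
# Toy illustration of composing a logical form from lexicon, join,
# and intersection steps. All names here are hypothetical.

# Assumed lexicon: maps a text span to a knowledge-base predicate.
LEXICON = {
    "Obama": "People.BarackObama",
    "president": "PresidentofCountry",
    "United States of America": "Country.USA",
}

INTERSECT = "\u2229"  # the intersection symbol


def lexicon(phrase: str) -> str:
    """Look up the logical-form fragment for a text span."""
    return LEXICON[phrase]


def join(relation: str, entity: str) -> str:
    """Attach an entity as the argument of a relation."""
    return f"{relation}.{entity}"


def intersection(left: str, right: str) -> str:
    """Combine two logical forms that describe the same entity."""
    return f"{left} {INTERSECT} {right}"


def parse_example() -> str:
    """Build the logical form for the slide's example sentence."""
    subject = lexicon("Obama")
    relation = lexicon("president")
    country = lexicon("United States of America")
    return intersection(subject, join(relation, country))


logical_form = parse_example()
print(logical_form)
```

A real system would also have to choose among many candidate derivations for each sentence, which is where the absence of supervised training data for documents becomes the main difficulty.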