History
Turn-of-the-Millennium Full NLP QA
[Figure: architecture of the LCC (Harabagiu/Moldovan) QA system, circa 2003]
• Document Processing: Document Collection, Document Index, Passage Retrieval (Single Factoid Passages, Multiple List Passages, Multiple Definition Passages)
• Question Processing (Factoid Question, List Question): Question Parse, Semantic Transformation, Recognition of Expected Answer Type (for NER), Keyword Extraction
• Question Processing (Definition Question): Question Parse, Pattern Matching, Keyword Extraction
• Supporting resources: Named Entity Recognition (CICERO LITE), Answer Type Hierarchy (WordNet), Axiomatic Knowledge Base, Pattern Repository
• Factoid Answer Processing: Answer Extraction (NER), Answer Justification (alignment, relations), Answer Reranking (Theorem Prover), yielding the Factoid Answer
• List Answer Processing: Answer Extraction, Threshold Cutoff, yielding the List Answer
• Definition Answer Processing: Answer Extraction, Pattern Matching, yielding the Definition Answer
• Complex systems, but they did work fairly well on “factoid” questions
• Too complex!
AskMSR
AskMSR: Question Answering Using the Worldwide Web
[Figure: AskMSR pipeline. (1) Rewrite Query, (2) Search Engine, (3) Collect Summaries, Mine N-grams, (4) Filter N-Grams, (5) Tile N-Grams, yielding N-Best Answers. Example question: “Where is the Louvre Museum located?”; example answers: “in Paris France” 59%, “museums” 12%, “hostels” 10%.]
Banko M, Brill E, Dumais S, et al. Askmsr: Question answering using the worldwide web[C]//Proceedings of 2002 AAAI Spring Symposium on Mining Answers from Texts and Knowledge Bases. 2002: 7-9
AskMSR
Step 1: Rewrite queries
• Intuition: The user’s question is often syntactically quite close to sentences that contain the answer
• Where is the Louvre Museum located?
  • The Louvre Museum is located in Paris
• Who created the character of Scrooge?
  • Charles Dickens created the character of Scrooge
AskMSR
Query Rewriting: Variations
• Classify question into seven categories
  • Who is/was/are/were...?
  • When is/did/will/are/were...?
  • Where is/are/were...?
a. Category-specific transformation rules
   e.g. “For Where questions, move ‘is’ to all possible locations”
   “Where is the Louvre Museum located” →
   “is the Louvre Museum located”
   “the is Louvre Museum located”
   “the Louvre is Museum located”
   “the Louvre Museum is located”
   “the Louvre Museum located is”
   (Nonsense, but who cares? It’s only a few more queries)
b. Expected answer “Datatype” (e.g. Date, Person, Location, ...)
   When was the French Revolution? → DATE
• Hand-crafted classification/rewrite/datatype rules (Could they be automatically learned?)
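The “move ‘is’ to all possible locations” rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide, not the AskMSR implementation: the function names are hypothetical, and the wh-word to datatype table is an assumed mapping built from the datatypes the slide lists (only When → DATE is given explicitly).

```python
def move_verb_rewrites(question: str, verb: str = "is") -> list[str]:
    """Drop the leading wh-word, then place `verb` at every possible position."""
    tokens = question.rstrip("?").split()
    rest = tokens[1:]          # drop the wh-word, e.g. "Where"
    if verb not in rest:
        return []              # the rule does not apply to this question
    rest.remove(verb)          # remove the first occurrence of the verb
    rewrites = []
    for i in range(len(rest) + 1):
        rewrites.append(" ".join(rest[:i] + [verb] + rest[i:]))
    return rewrites


# Assumed, illustrative mapping; the slide only confirms "When" -> DATE.
EXPECTED_DATATYPE = {"who": "PERSON", "when": "DATE", "where": "LOCATION"}


def expected_datatype(question: str):
    """Return the expected answer datatype for a question, or None if unknown."""
    words = question.split()
    if not words:
        return None
    return EXPECTED_DATATYPE.get(words[0].lower())


if __name__ == "__main__":
    for rewrite in move_verb_rewrites("Where is the Louvre Museum located?"):
        print(rewrite)
    print(expected_datatype("When was the French Revolution?"))
```

Most of the generated strings are ungrammatical, which is acceptable here: they cost only a few extra queries, and the grammatical one (“the Louvre Museum is located”) is likely to match text that sits right next to the answer.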
AskMSR
Step 2: Query search engine
• Send all rewrites to a search engine
• Retrieve top N answers (100?)
• For speed, rely just on search engine’s “snippets”, not the full text of the actual document
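A minimal sketch of Step 2, assuming the search engine is available as a function that takes a query and a result limit and returns snippet strings. `collect_snippets`, `SearchFn`, and `toy_search` are hypothetical names, and the in-memory `toy_search` stands in for a real search backend purely so the example runs.

```python
from typing import Callable, Iterable

# Assumed interface: (query, top_n) -> list of snippet strings.
SearchFn = Callable[[str, int], list[str]]


def collect_snippets(
    rewrites: Iterable[str], search: SearchFn, top_n: int = 100
) -> list[tuple[str, str]]:
    """Send every rewrite to the search function and keep (rewrite, snippet) pairs."""
    collected = []
    for rewrite in rewrites:
        # Only snippets are kept; the full documents are never fetched.
        for snippet in search(rewrite, top_n)[:top_n]:
            collected.append((rewrite, snippet))
    return collected


def toy_search(query: str, top_n: int) -> list[str]:
    """Stand-in backend: case-insensitive phrase match over a tiny snippet list."""
    corpus = [
        "The Louvre Museum is located in Paris, France.",
        "Hostels near the Louvre Museum in Paris",
    ]
    hits = [s for s in corpus if query.lower() in s.lower()]
    return hits[:top_n]


if __name__ == "__main__":
    queries = ["the Louvre Museum is located", "is the Louvre Museum located"]
    for rewrite, snippet in collect_snippets(queries, toy_search):
        print(f"{rewrite!r} -> {snippet!r}")
```

Keeping the rewrite alongside each snippet is a design choice for this sketch: it lets later stages see which rewrite produced a given snippet.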