Xi'an Jiaotong University
Natural Language Processing with Deep Learning
Language Model & Distributed Representation (6)
Chen Li  cli@xjtu.edu.cn  2023
Outline
1. Pre-training LM
2. GPT
3. BERT
4. T5
Advanced LM: Taxonomy of pre-training LM

- Contextual?
  - Non-Contextual: CBOW, Skip-Gram, GloVe
  - Contextual: ELMo, GPT, BERT
- Architectures
  - LSTM: ELMo, CoVe
  - Transformer Enc.: BERT, SpanBERT, XLNet, RoBERTa
  - Transformer Dec.: GPT, GPT-2
  - Transformer: MASS, BART, XNLG, mBART
- Task Types
  - Supervised: MT: CoVe
  - Unsupervised / Self-Supervised:
    - LM: ELMo, GPT, GPT-2, UniLM
    - MLM: BERT, SpanBERT, RoBERTa, XLM-R
      - TLM: XLM
      - Seq2Seq MLM: MASS, T5
    - PLM: XLNet
    - DAE: BART
    - CTL:
      - RTD: CBOW-NS, ELECTRA
      - NSP: BERT, UniLM
      - SOP: ALBERT, StructBERT
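The "Task Types" branch above separates GPT-style causal language modeling (LM) from BERT-style masked language modeling (MLM). As a rough illustration of how the two pre-training objectives build their training targets, here is a minimal sketch; the 15% mask rate, the MASK_ID value, and the -100 ignore label are assumptions following common BERT/PyTorch conventions, not details from the slides.

```python
import random

MASK_ID = 103      # assumed id of the [MASK] token (BERT-style vocabulary)
IGNORE = -100      # positions carrying this label are excluded from the loss

def causal_lm_targets(token_ids):
    """GPT-style objective: predict every token from its left context.
    Inputs are the sequence without its last token; labels are the sequence shifted left."""
    inputs = token_ids[:-1]
    labels = token_ids[1:]
    return inputs, labels

def masked_lm_targets(token_ids, mask_prob=0.15):
    """BERT-style objective: corrupt ~15% of positions and predict only those.
    (The full BERT recipe also keeps or randomly replaces some masked tokens; omitted here.)"""
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < mask_prob:
            inputs.append(MASK_ID)   # hide the token in the input
            labels.append(tok)       # ...and ask the model to recover it
        else:
            inputs.append(tok)
            labels.append(IGNORE)    # unmasked positions contribute no loss
    return inputs, labels

if __name__ == "__main__":
    ids = [7592, 1010, 2129, 2024, 2017, 1029]   # toy token ids
    print(causal_lm_targets(ids))
    print(masked_lm_targets(ids))
```

The causal objective supervises every position using left context only, while the masked objective supervises only the corrupted positions using bidirectional context, which is one reason decoder-only models (GPT) and encoder-only models (BERT) sit on different branches of the taxonomy.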
Advanced LM: Taxonomy of pre-training LM

- Extensions
  - Knowledge-Enriched: ERNIE (THU), KnowBERT, K-BERT, SentiLR, KEPLER, WKLM
  - Multilingual
    - XLU: mBERT, Unicoder, XLM, XLM-R, MultiFit
    - XLG: MASS, mBART, XNLG
  - Language-Specific: ERNIE (Baidu), BERT-wwm-Chinese, NEZHA, ZEN, BERTje, CamemBERT, FlauBERT, RobBERT
  - Multi-Modal
    - Image: ViLBERT, LXMERT, VisualBERT, B2T2, VL-BERT
    - Video: VideoBERT, CBT
    - Speech: SpeechBERT
  - Domain-Specific: SentiLR, BioBERT, SciBERT, PatentBERT
- Model Compression
  - Model Pruning: CompressingBERT
  - Quantization: Q-BERT, Q8BERT
  - Parameter Sharing: ALBERT
  - Distillation: DistilBERT, TinyBERT, MiniLM
  - Module Replacing: BERT-of-Theseus
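Among the compression techniques listed above, distillation (DistilBERT, TinyBERT, MiniLM) trains a small student model to match a large teacher. Below is a minimal sketch of the standard soft-target distillation loss; the temperature T, mixing weight alpha, and the toy tensor shapes are illustrative assumptions, not the exact recipe of any of the models named above.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target distillation: KL divergence between temperature-scaled teacher
    and student distributions, mixed with ordinary cross-entropy on gold labels."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # T^2 rescales the soft term so its gradient magnitude stays comparable across temperatures
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: batch of 4 examples, 10-class output head
student = torch.randn(4, 10, requires_grad=True)
teacher = torch.randn(4, 10)
gold = torch.randint(0, 10, (4,))
loss = distillation_loss(student, teacher, gold)
loss.backward()
```

DistilBERT additionally aligns hidden states with a cosine loss, and TinyBERT and MiniLM also match attention behaviour layer by layer; the soft-target term shown here is only the common core of these methods.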