Proceedings of the 28th International Conference on Computational Linguistics, pages 139–149, Barcelona, Spain (Online), December 8-13, 2020

A Symmetric Local Search Network for Emotion-Cause Pair Extraction

Zifeng Cheng, Zhiwei Jiang∗, Yafeng Yin, Hua Yu, Qing Gu
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
chengzf@smail.nju.edu.cn, {jzw,yafeng}@nju.edu.cn, huayu.yh@smail.nju.edu.cn, guq@nju.edu.cn

Abstract

Emotion-cause pair extraction (ECPE) is a new task which aims at extracting the potential clause pairs of emotions and corresponding causes in a document. To tackle this task, a previous study proposed a two-step method which first extracted emotion clauses and cause clauses individually, then paired the emotion and cause clauses, and filtered out the pairs without causality. Different from this method, which separates the detection and the matching of emotion and cause into two steps, we propose a Symmetric Local Search Network (SLSN) model to perform detection and matching simultaneously by local search. SLSN consists of two symmetric subnetworks, namely the emotion subnetwork and the cause subnetwork. Each subnetwork is composed of a clause representation learner and a local pair searcher. The local pair searcher is a specially-designed cross-subnetwork component which can extract the local emotion-cause pairs. Experimental results on the ECPE corpus demonstrate the superiority of our SLSN over existing state-of-the-art methods.

1 Introduction

Emotion cause analysis is a research branch of sentiment analysis and has gained increasing popularity in recent years (Lee et al., 2010; Gui et al., 2016; Xia and Ding, 2019; Xia et al., 2019). Its goal is to identify the potential causes that lead to a certain emotion. This is very useful in fields such as electronic commerce, where sellers may be concerned about users' emotions towards their products as well as the causes of those emotions.
Previous studies on emotion cause analysis mainly focus on the task of emotion cause extraction (ECE), which is usually formalized as a clause-level sequence labeling problem (Chen et al., 2010; Gui et al., 2016; Li et al., 2018; Ding et al., 2019; Xia et al., 2019; Yu et al., 2019; Fan et al., 2019). Given an annotated emotion clause, the goal of the ECE task is to identify, for each clause in the document, whether that clause is the corresponding cause. However, in practice, emotion clauses are naturally not annotated, which may limit the application of the ECE task in real-world scenarios. Motivated by this, Xia and Ding (2019) first proposed the emotion-cause pair extraction (ECPE) task, which aims to extract all potential pairs of emotions and their corresponding causes in a document. As shown in Figure 1, the example document has 17 clauses, among which the emotion clauses are c4, c13, and c17 (marked as orange), and their corresponding cause clauses are c3, c12, and c15 (marked as blue). The goal of the ECPE task is to extract all emotion-cause pairs: (c4, c3), (c13, c12), and (c17, c15).

The ECPE task is a new and more challenging task. To tackle it, Xia and Ding (2019) proposed a two-step method, which has been demonstrated to be effective. In the first step, they extracted emotion clauses and cause clauses individually. In the second step, they used the Cartesian product to pair the clauses and then used a logistic regression model to filter out the emotion-cause pairs without causality. In this method, the detection of emotions and causes and the matching of emotions and causes are implemented separately in two steps.

∗ Corresponding Author

This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/
(c1) But when Hans heard these, (c2) he seemed very jealous. (c3) When Mr. Song had a son, (c4) Hans was also very happy. (c5) Hans had taught him to speak English since the boy was young. (c6) Hans also speaks Spanish and German, (c7) and he often went downstairs to the community to teach children English. (c8) During Martial Arts Festival, (c9) he also helped with a lot of translation work, (c10) and was rated as advanced worker. (c11) After the meeting, (c12) the city organized all participants to travel, (c13) Hans was very excited. (c14) But before getting on the bus, (c15) the tour guide said he was too old to go. (c16) Everyone can see, (c17) Hans was very lost.

Correct pairs: (c4, c3) ✓, (c13, c12) ✓, (c17, c15) ✓; incorrect pairs: (c4, c12) ✗, (c13, c15) ✗

Figure 1: An example document from the ECPE corpus

However, when humans deal with the ECPE task, they usually consider the detection and matching problems at the same time. This is mainly achieved through the process of local search. For example, as shown in Figure 1, if a clause is detected as an emotion clause (e.g., c4), humans will search its corresponding cause clause (i.e., c3) within its local context (i.e., c2, c3, c4, c5, c6). The advantage of local search is that wrong pairs (e.g., (c4, c12)) beyond the local context scope can be avoided. Additionally, when searching locally for the cause clause corresponding to a target emotion clause, humans not only judge whether a clause is a cause clause, but also consider whether it matches the target emotion clause. In this way, they can avoid extracting pairs (e.g., (c13, c15)) that are within the local context scope but mismatched. Similarly, when a cause clause is encountered, the corresponding emotion clause can also be searched within its local context scope.

Inspired by this local search process, we propose a Symmetric Local Search Network (SLSN) model. The model consists of two subnetworks with symmetric structures, namely the emotion subnetwork and the cause subnetwork.
Each subnetwork consists of two parts: a clause representation learner and a local pair searcher (LPS). The clause representation learner is designed to learn the emotion or cause representation of a clause. The local pair searcher is designed to perform the local search for emotion-cause pairs. Specifically, the LPS introduces a local context window to limit the scope of context for local search. In the process of local search, the LPS first judges whether the target clause is an emotion (cause) clause, and then judges whether each clause within the local context window is the corresponding cause (emotion). Finally, SLSN outputs the local pair labels (i.e., the labels of the target clause and the clauses within its local context window) for each clause in the document, from which we can obtain the emotion-cause pairs.

The main contributions of this work can be summarized as follows:

• We propose a symmetric local search network model, which is an end-to-end model and provides a new scheme for solving the ECPE task.

• We design a local pair searcher in SLSN, which allows detecting and matching emotions and causes simultaneously.

• Experimental results on the ECPE corpus demonstrate the superiority of our SLSN over existing state-of-the-art methods.

2 Symmetric Local Search Network

In this section, we first present the task definition. Then, we introduce the SLSN model, followed by its technical details. Finally, we discuss the connection between the SLSN model and the previous two-step method.
Figure 2: Overview of SLSN model

2.1 Task Definition

The task of emotion-cause pair extraction (ECPE) was first studied by Xia and Ding (2019). In the ECPE task, each document d in the dataset D consists of multiple clauses d = [c_1, c_2, ..., c_n]. A clause with emotional polarity (such as happiness, sadness, fear, anger, disgust, or surprise) is labeled as an emotion clause c^e. A clause that causes the emotion is called a cause clause c^c. The pair of an emotion clause and its corresponding cause clause is called an emotion-cause pair (c^e, c^c). The goal of the ECPE task is to extract all emotion-cause pairs in d:

P = {..., (c^e, c^c), ...}

Note that each document may contain several (at least one) emotion clauses, and each emotion clause may correspond to several (at least one) cause clauses. Besides, an emotion clause and its corresponding cause clause may be the same clause.

2.2 An Overview of SLSN

As shown in Figure 2, SLSN receives a sequence of clauses from a document as input and predicts the local pair labels for these clauses, which can be directly converted into the corresponding emotion-cause (E-C) pairs. For each clause c_i, SLSN predicts two types of local pair labels: the E-LC label ŷ_i^elc and the C-LE label ŷ_i^cle. The E-LC label ŷ_i^elc contains the emotion label (E-label) ŷ_i^e of the i-th clause and the local cause labels (LC-labels) (ŷ_{i−1}^lc, ŷ_i^lc, ŷ_{i+1}^lc) of the clauses near the i-th clause. Similarly, the C-LE label ŷ_i^cle contains the cause label (C-label) ŷ_i^c of the i-th clause and the local emotion labels (LE-labels) (ŷ_{i−1}^le, ŷ_i^le, ŷ_{i+1}^le) of the clauses near the i-th clause.
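To make the label scheme concrete, the following is a minimal sketch (assuming window size k = 1, 0-based clause indices, and a hypothetical `elc_to_pairs` helper that is not part of the paper's code) of reading E-C pairs off predicted E-LC labels:

```python
def elc_to_pairs(elc_labels, k=1):
    """Convert predicted E-LC labels to emotion-cause (E-C) pairs.

    elc_labels[i] = (e_label, lc_labels), where lc_labels covers the
    clauses i-k .. i+k around clause i (0-based indices here).
    """
    pairs = set()
    for i, (e_label, lc_labels) in enumerate(elc_labels):
        if e_label != 1:
            continue  # not an emotion clause: no local cause search
        for offset, lc in zip(range(-k, k + 1), lc_labels):
            if lc == 1:
                pairs.add((i, i + offset))  # (emotion index, cause index)
    return pairs

# Clause c1 (index 1) is an emotion clause whose cause is clause 0.
labels = [(0, (0, 0, 0)), (1, (1, 0, 0)), (0, (0, 0, 0))]
print(elc_to_pairs(labels))  # {(1, 0)}
```

The symmetric C-LE conversion would swap the roles of the two indices in each pair.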
Whether a clause is near the target clause is defined by the local context window, whose size is denoted as k (the case in Figure 2 is k = 1). That is, for a target clause, the scope of its local context includes the previous k clauses, the clause itself, and the following k clauses. Note that both the E-LC label ŷ_i^elc and the C-LE label ŷ_i^cle can be converted into their corresponding emotion-cause (E-C) pairs. For example, the E-C pair corresponding to ŷ_i^elc = (1, 1, 0, 0) is (c_i, c_{i−1}), and the E-C pair corresponding to ŷ_i^cle = (1, 1, 0, 0) is (c_{i−1}, c_i). We denote the E-C pair set corresponding to ŷ_i^elc as P^elc, and the E-C pair set corresponding to ŷ_i^cle as P^cle. The final E-C pair set of our method is the union of P^elc and P^cle. Alternatively, P^elc alone, P^cle alone, or the intersection of P^elc and P^cle can also be used as the final pair set.

2.3 Components of SLSN

As shown in Figure 3, SLSN contains two subnetworks: the emotion subnetwork, referred to as E-net, which is mainly responsible for E-LC label prediction, and the cause subnetwork, referred to as C-net, which is mainly responsible for C-LE label prediction. E-net and C-net have similar structures in terms of word embedding, clause encoder, and hidden state learning. After the hidden state learning layer, E-net and C-net use two types of local pair searchers (LPS) with symmetric structures for local pair label prediction. The local pair
searcher is a specially designed cross-subnetwork module, which uses the hidden states of the clauses in both subnetworks for prediction. In the following, we introduce the components of SLSN in technical detail.

Figure 3: Framework of SLSN model

2.3.1 Word Embedding

Before representing the clauses in the document, we first map each word in the clauses into a word embedding, which is a low-dimensional real-valued vector. Formally, given a sequence of clauses d = [c_1, c_2, ..., c_n], the clause c_i = [w_i^1, w_i^2, ..., w_i^{l_i}] consists of l_i words. We map each clause into its word-level representation v_i = [v_i^1, v_i^2, ..., v_i^{l_i}], where v_i^j is the word embedding of word w_i^j.

2.3.2 Clause Encoder

After word embedding, we use a Bi-LSTM layer followed by an attention layer as the clause encoder in both E-net and C-net to learn the representation of clauses. Formally, in E-net, given the word-level representation of the i-th clause v_i = [v_i^1, v_i^2, ..., v_i^{l_i}] as input, the word-level Bi-LSTM layer first maps it to the hidden states r_i = [r_i^1, r_i^2, ..., r_i^{l_i}]. Then, the attention layer maps r_i to the emotion representation s_i^e of the clause by weighting each word in the clause and then aggregating them through the following equations:

u_i^j = tanh(W_w r_i^j + b_w)    (1)

a_i^j = exp((u_i^j)^T u_s) / Σ_t exp((u_i^t)^T u_s)    (2)

s_i^e = Σ_j a_i^j r_i^j    (3)

where W_w, b_w, and u_s are a weight matrix, a bias vector, and a context vector, respectively, and a_i^j is the attention weight of r_i^j. Similarly, in C-net, the cause representation s_i^c of the i-th clause is obtained using a clause encoder with a similar structure.

2.3.3 Hidden State Learning

After the clause encoder, we use a hidden state learning layer to learn the contextualized representation of each clause in the document.
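Before moving on, the attention pooling of the clause encoder in Eqs. (1)–(3) can be sketched numerically. This is a minimal NumPy illustration with toy dimensions and random values standing in for the trained Bi-LSTM states and parameters (all of them assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4              # hidden size of the word-level Bi-LSTM (toy value)
l_i = 3            # number of words l_i in clause c_i
r = rng.standard_normal((l_i, d))    # word hidden states r_i^j

W_w = rng.standard_normal((d, d))    # weight matrix W_w
b_w = rng.standard_normal(d)         # bias vector b_w
u_s = rng.standard_normal(d)         # context vector u_s

u = np.tanh(r @ W_w.T + b_w)               # Eq. (1)
scores = u @ u_s                           # (u_i^j)^T u_s
a = np.exp(scores) / np.exp(scores).sum()  # Eq. (2): attention weights
s_e = a @ r                                # Eq. (3): clause representation s_i^e

print(s_e.shape)  # (4,): one d-dimensional vector per clause
```

The softmax in Eq. (2) guarantees the weights a_i^j sum to one, so s_i^e is a convex combination of the word hidden states.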
Formally, in E-net, given a sequence of emotion representations [s_1^e, s_2^e, ..., s_n^e] as input, a clause-level Bi-LSTM layer maps it to a sequence of emotion hidden states [h_1^e, h_2^e, ..., h_n^e]. Similarly, in C-net, the sequence of cause hidden states [h_1^c, h_2^c, ..., h_n^c] is obtained from the sequence of cause representations.

2.3.4 Local Pair Searcher

After obtaining the two types of hidden states, we design two types of local pair searchers (LPS) with symmetric structures in E-net and C-net, respectively, to predict the local pair labels of each clause.

In E-net, the LPS predicts the E-LC label for each clause, which contains an E-label and an LC-label. For the E-label prediction, the LPS only uses the emotion hidden state of the clause. Formally, given
the emotion hidden state h_i^e of the i-th clause, the LPS uses a softmax layer to predict its E-label ŷ_i^e:

ŷ_i^e = softmax(W_e h_i^e + b_e)    (4)

where W_e and b_e are a weight matrix and a bias vector, respectively.

For the LC-label prediction, there are two cases. If the predicted E-label of the i-th clause is false, the corresponding LC-label is a zero vector, since there is no need to predict an LC-label. Otherwise, the LPS predicts the LC-label for all clauses within the local context of the i-th clause. We call these clauses local context clauses. Assuming that the local context window size is k = 1 (the case in Figure 3), the local context clauses of the i-th clause are c_{i−1}, c_i, and c_{i+1}.

For the LC-label prediction, both the emotion and cause hidden states are used. Formally, given the emotion hidden state h_i^e of the i-th clause and the cause hidden states [h_{i−1}^c, h_i^c, h_{i+1}^c] of the corresponding local context clauses, the LPS first calculates an emotion attention ratio λ_j for each local context clause:

γ(h_i^e, h_j^c) = h_i^e · h_j^c    (5)

λ_j = exp(γ(h_i^e, h_j^c)) / Σ_{j'=i−k}^{i+k} exp(γ(h_i^e, h_{j'}^c))    (6)

where γ(h_i^e, h_j^c) is an emotion attention function which estimates the relevance between the local cause and the target emotion. We choose simple dot attention based on the experimental results (Luong et al., 2015). The emotion attention ratio λ_j is then used to scale the original cause hidden states:

q_j^lc = λ_j · h_j^c    (7)

where q_j^lc is the scaled cause hidden state of the j-th local context clause. The ⊗ used in Figure 3 refers to Eqs. (5), (6), and (7).
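The attention scaling in Eqs. (5)–(7) can be sketched as follows; again, the dimensions and random vectors are toy assumptions standing in for the learned hidden states:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 4, 1
h_e = rng.standard_normal(d)                # emotion hidden state h_i^e
h_c = rng.standard_normal((2 * k + 1, d))   # cause hidden states of the
                                            # 2k+1 local context clauses

gamma = h_c @ h_e                           # Eq. (5): dot-attention scores
lam = np.exp(gamma) / np.exp(gamma).sum()   # Eq. (6): attention ratios
q_lc = lam[:, None] * h_c                   # Eq. (7): scaled cause states

print(q_lc.shape)  # (3, 4): one scaled state per local context clause
```

Because the ratios λ_j are softmax-normalized over the window, the scaling emphasizes local context clauses most relevant to the target emotion while suppressing the rest.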
We further use a local Bi-LSTM layer to learn the contextualized representation of each local context clause:

→o_j = →LSTM_lc(q_j^lc),  j ∈ [i − k, i + k]    (8)

←o_j = ←LSTM_lc(q_j^lc),  j ∈ [i − k, i + k]    (9)

Finally, the LC-label ŷ_j^lc of the j-th local context clause is predicted as:

ŷ_j^lc = softmax(W_lc o_j + b_lc)    (10)

where o_j is the concatenation of →o_j and ←o_j, and W_lc and b_lc are a weight matrix and a bias vector, respectively.

Similarly, in C-net, an LPS whose structure is symmetric to the one in E-net is used to predict the C-LE label for each clause, which contains a C-label ŷ_i^c and LE-labels ŷ_j^le.

2.4 Model Training

The SLSN model consists of two subnetworks, i.e., E-net and C-net. Given a sequence of clauses as input, E-net is mainly used to predict their E-LC labels, and C-net is mainly used to predict their C-LE labels. Thus, the loss of SLSN is a weighted sum of two components:

L = αL^elc + (1 − α)L^cle    (11)

where α ∈ [0, 1] is a trade-off parameter. Both L^elc and L^cle consist of two parts:

L^elc = βL^e + (1 − β)L^lc    (12)

L^cle = βL^c + (1 − β)L^le    (13)
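The loss weighting of Eqs. (11)–(13) reduces to a few lines; `slsn_loss` is a hypothetical helper name used here purely for illustration, with the component losses passed in as scalars:

```python
def slsn_loss(L_e, L_lc, L_c, L_le, alpha=0.5, beta=0.5):
    """Weighted SLSN loss, Eqs. (11)-(13)."""
    L_elc = beta * L_e + (1 - beta) * L_lc       # Eq. (12)
    L_cle = beta * L_c + (1 - beta) * L_le       # Eq. (13)
    return alpha * L_elc + (1 - alpha) * L_cle   # Eq. (11)

print(slsn_loss(1.0, 2.0, 3.0, 4.0))  # 2.5 with alpha = beta = 0.5
```

Setting α and β lets the model trade off the emotion-side against the cause-side objective, and the detection losses against the local matching losses, respectively.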