J.Liu,H.A.Abbass,D.G.Green,and W.Zhong Table 1:The relationship between basic motifs and distance motifs Type of basic motifs Class of Distance Motifs No. Motifs Condition Class Condition Class Groupo Neutral Core dg≥d2 Guide d≥d Guide Group Core da<d2 Deceptive da>d Deceptive Core d2≥dg Guide d≥d Guide Group? d2<d3 d1>d3 Core Deceptive Deceptive Guide d≥d2and Core d1≥d3 Group3 d≥dg Guide di>d2 and d<da Core Deceptive d2>d3 Deceptive d2≥d1andd2≥ds Guide Group4 ● ● d2<d or d2<ds Deceptive d1≥d2 and d3≥d2 Guide Groups d<d2 or d3<d2 Deceptive Clearly,based on the edges between {n,n2}and {n2,n3},there are six types of basic motifs in total for all DFLNs,which are labelled as Groupo-Groups and shown in the first two columns of Table 1.These six types of basic motifs can be viewed as basic building blocks of a DFLN. 4.2 Distance Motifs An selection mechanism in EAs,such as binary tournament selection,causes the search heuristic to prefer high fitness regions.Thus,the problems would be easy if high fitness regions are close to global optima.On the contrary,if high fitness regions are far from global optima,the selection mechanism may lead the search heuristic in a wrong direc- tion.This indicates that the distance between candidate solutions and global optima is another important factor that impacts EAs'performance.Thus,we use the distance information to refine basic motifs.There are a number of ways to define the distance between a candidate solution and a global optimum.Naturally,the one that is most suitable for the searching process on fitness landscapes is the shortest path length.To calculate the shortest path length,we need to identify the nodes in a network that correspond to the global optima.However,for indirect encoding methods,like the per- mutation encoding for MKPs,the computational cost to match the points in the search 6 Evolutionary Computation Volume x,Number x
J. Liu, H. A. Abbass, D. G. Green, and W. Zhong Table 1: The relationship between basic motifs and distance motifs. Type of basic motifs Class of Distance Motifs No. Motifs Condition Class Condition Class Group0 Neutral Group1 d3 ≥ d2 Guide d3 ≥ d1 Core Guide d3 < d2 Deceptive d3 > d1 Core Deceptive Group2 d2 ≥ d3 Guide d1 ≥ d3 Core Guide d2 < d3 Deceptive d1 > d3 Core Deceptive Group3 d1 ≥ d3 Guide d1 ≥ d2 and Core d2 ≥ d3 Guide d1 < d3 Deceptive d1 > d2 and Core d2 > d3 Deceptive Group4 d2 ≥ d1 and d2 ≥ d3 Guide d2 < d1 or d2 < d3 Deceptive Group5 d1 ≥ d2 and d3 ≥ d2 Guide d1 < d2 or d3 < d2 Deceptive Clearly, based on the edges between {n1, n2} and {n2, n3}, there are six types of basic motifs in total for all DFLNs, which are labelled as Group0 - Group5 and shown in the first two columns of Table 1. These six types of basic motifs can be viewed as basic building blocks of a DFLN. 4.2 Distance Motifs An selection mechanism in EAs, such as binary tournament selection, causes the search heuristic to prefer high fitness regions. Thus, the problems would be easy if high fitness regions are close to global optima. On the contrary, if high fitness regions are far from global optima, the selection mechanism may lead the search heuristic in a wrong direction. This indicates that the distance between candidate solutions and global optima is another important factor that impacts EAs’ performance. Thus, we use the distance information to refine basic motifs. There are a number of ways to define the distance between a candidate solution and a global optimum. Naturally, the one that is most suitable for the searching process on fitness landscapes is the shortest path length. To calculate the shortest path length, we need to identify the nodes in a network that correspond to the global optima. However, for indirect encoding methods, like the permutation encoding for MKPs, the computational cost to match the points in the search 6 Evolutionary Computation Volume x, Number x
Motif Difficulty:A Problem Difficulty Measure space to the global optima may be quite high.For binary encoding,since the expected number of bits that are flipped in the 1/n mutation is 1,we use the 1-bit flip operator to build the DFLN,and then the Hamming distance is used as the distance measure, which is equivalent to the shortest path length in this case.For other encoding meth- ods,if the computational cost to identify the nodes corresponding to global optima and calculate the shortest path length is feasible,the shortest path length can be used as the distance measure;otherwise,an approximate way should be designed. Definition 3:Given a basic motif,M={Vb=(m1,n2,n3),Eme),the corre- sponding Distance Motif is Ma={VM,E,{di,d2,ds}),where di,d2,and d3 are respectively the distances attached to n,n2,and n. When the searching process visits a basic motif from m to n2 to n3,we can check di, d2,and d3 to see whether the searching process is heading towards the right direction or not.Clearly,when we check a direction is correct or not,we should take into account the information of both fitness and distance.Since the information of fitness is reflected by edges in DFLNs,we first define paths in a distance motif.Then,based on their contributions to the searching process,distance motifs are divided into three classes. Definition 4: Given a distance motif,Ma {VMa (m,n2,n3},Emd,{d,d2,d3)).For Vni,nj E VMd,if ni can reach nj by only visiting edges in Ev4,then there is a Path between n;and n.Furthermore,for each edge on a path in Ma,if the edge with inverse direction does not exist,then this path is an Effective Path. Definition 5:Given a distance motif,Md ={VMa,Em4,{di,d2,d3)).If there is no effective paths in Ma,then Md is a Neutral Motif.If all effective paths with the largest length(the number of edges in the path)in Md satisfies that the distance of the start node is not less than that of the end node,then Ma is a Guide Motif;otherwise,it is a Deceptive Motif. In a path,the node with lower fitness value always points to the node with higher fitness value.When the distance of an end node on a path is not larger than that of the other end node,then it means that when the fitness value increases,the distance decreases.In such a case,the motif has a positive effect on the searching process,thus, it is a guide motif.Table 1 illustrates how the three classes of distance motifs are related to the six types of basic motifs.For Groupo,since the set of effective paths is empty,all motifs in Groupo are neutral motifs.For Group,the only effective path is between #3 and n2.Thus,if d3 d2,then they are guide motifs;otherwise,deceptive motifs.The case for Group?is similar.For Group,there are three effective paths,namely the paths between n and n2,n2 and n3,and m and n3,whose lengths are 1,1,and 2,respectively, and only the path with the largest length need to be considered.Thus,if di>d3,then they are guide motifs;otherwise,deceptive motifs.For Group,there are two effective paths,namely the paths between n2 and n1,n2 and n3,whose lengths are both 1,and both of them need to be considered.Thus,if both d2>di and d2>d3,then they are guide motifs;otherwise,deceptive motifs.The case for Group,is similar. If we further analyse guide and deceptive motifs,paths with length two exist in some cases.On such paths,if the difference in distances and fitness values of each pair of nodes connected by an edge is consistent with that of two end nodes,then the corresponding motifs have a more explicit impact on the searching process.Thus,we further define two sub-classes of guide and deceptive motifs. Definition 6:Given a guide motif,MG=[VMc,EMc,{di,d2,d3)).If there exists a path with length two,labelled as {(mi,nk),(nk,nj)},where ni,nk,njE VMG and diz dj,satisfies the conditions that di>dk when (nk,ni)g Emc and dk >dj when Evolutionary Computation Volume x,Number x
Motif Difficulty: A Problem Difficulty Measure space to the global optima may be quite high. For binary encoding, since the expected number of bits that are flipped in the 1/n mutation is 1, we use the 1-bit flip operator to build the DFLN, and then the Hamming distance is used as the distance measure, which is equivalent to the shortest path length in this case. For other encoding methods, if the computational cost to identify the nodes corresponding to global optima and calculate the shortest path length is feasible, the shortest path length can be used as the distance measure; otherwise, an approximate way should be designed. Definition 3: Given a basic motif, Mb = {VMb = {n1, n2, n3}, −→E Mb }, the corresponding Distance Motif is Md = {VMb , −→E Mb , {d1, d2, d3}}, where d1, d2, and d3 are respectively the distances attached to n1, n2, and n3. When the searching process visits a basic motif from n1 to n2 to n3, we can check d1, d2, and d3 to see whether the searching process is heading towards the right direction or not. Clearly, when we check a direction is correct or not, we should take into account the information of both fitness and distance. Since the information of fitness is reflected by edges in DFLNs, we first define paths in a distance motif. Then, based on their contributions to the searching process, distance motifs are divided into three classes. Definition 4: Given a distance motif, Md = {VMd = {n1, n2, n3}, −→E Md , {d1, d2, d3}}. For ∀ni , nj ∈ VMd , if ni can reach nj by only visiting edges in −→E Md , then there is a Path between ni and nj . Furthermore, for each edge on a path in Md , if the edge with inverse direction does not exist, then this path is an Effective Path. Definition 5: Given a distance motif, Md = {VMd , −→E Md , {d1, d2, d3}}. If there is no effective paths in Md , then Md is a Neutral Motif. If all effective paths with the largest length (the number of edges in the path) in Md satisfies that the distance of the start node is not less than that of the end node, then Md is a Guide Motif; otherwise, it is a Deceptive Motif. In a path, the node with lower fitness value always points to the node with higher fitness value. When the distance of an end node on a path is not larger than that of the other end node, then it means that when the fitness value increases, the distance decreases. In such a case, the motif has a positive effect on the searching process, thus, it is a guide motif. Table 1 illustrates how the three classes of distance motifs are related to the six types of basic motifs. For Group0 , since the set of effective paths is empty, all motifs in Group0 are neutral motifs. For Group1 , the only effective path is between n3 and n2. Thus, if d3 ≥ d2, then they are guide motifs; otherwise, deceptive motifs. The case for Group2 is similar. For Group3 , there are three effective paths, namely the paths between n1 and n2, n2 and n3, and n1 and n3, whose lengths are 1, 1, and 2, respectively, and only the path with the largest length need to be considered. Thus, if d1 ≥ d3, then they are guide motifs; otherwise, deceptive motifs. For Group4 , there are two effective paths, namely the paths between n2 and n1, n2 and n3, whose lengths are both 1, and both of them need to be considered. Thus, if both d2 ≥ d1 and d2 ≥ d3, then they are guide motifs; otherwise, deceptive motifs. The case for Group5 is similar. If we further analyse guide and deceptive motifs, paths with length two exist in some cases. On such paths, if the difference in distances and fitness values of each pair of nodes connected by an edge is consistent with that of two end nodes, then the corresponding motifs have a more explicit impact on the searching process. Thus, we further define two sub-classes of guide and deceptive motifs. Definition 6: Given a guide motif, MG = {VMG , −→E MG , {d1, d2, d3}}. If there exists a path with length two, labelled as {(ni , nk), (nk, nj )}, where ni , nk, nj ∈ VMG and di ≥ dj , satisfies the conditions that di ≥ dk when (nk, ni) 6∈ −→E MG and dk ≥ dj when Evolutionary Computation Volume x, Number x 7
J.Liu,H.A.Abbass,D.G.Green,and W.Zhong (nj,nk)EMG,then MG is a Core Guide Motif. Definition 7:Given a deceptive motif,M=(VD,ED,{d,d2,d3}).If there exists a path with length two,labelled as {(ni,n),(nk,nj)},where ni,nk,n;E VMD and di<di,satisfies the conditions that di<dk when (nk,ni)EMD and dk<dj when (nj,nk)EMD,then MD is a Core Deceptive Motif. According to Definitions 6 and 7,none of the motifs in Group and Groups is core guide or deceptive motifs because no path with length two exists.The relationship with other groups of basic motifs is also shown in Table 1. 5 Predictive Difficulty Measures based on Distance Motifs Guide motifs can mostly help the searching process head towards the right direction. This is because when the searching process goes from nodes with lower fitness values to nodes with higher fitness values,it naturally and smoothly approaches global op- tima.On the contrary,deceptive motifs mostly lead the searching process in the wrong direction since when the searching process goes from nodes with lower fitness values to nodes with higher fitness values,it deviates away from global optima.For neutral motifs,fitness values of the three nodes are the same,thus,neither deceptive nor guide information is provided.Next,we first analyse DFLNs of some well-studied fitness functions to show how the motifs composing a DFLN are related to problem difficulty. Then,based on the analyses,predictive difficulty measures are proposed. 5.1 DFLNs of Well-Studied Fitness Functions 5.1.1 Effect of the Amount of Different Types of Motifs on Problem Difficulty Three classical fitness functions defined over the binary search space are used,namely ONEMAX,NIAH,and TRAP(Droste et al.,2002). ONEMAX(s)= ∑ (10) = 1 Ifs=s* NIAH(S)={0fs≠s (11) TRAP(s)=∑s+(n+1)·ΠI1-s) (12) 1=1 where s (s1,s2,...,sn)and si=0 or 1,i=1,2,...,n.s*is the global optimum. Proposition 1:If a fitness landscape,FL={S,f,N),where S is composed of binary strings,f=ONEMAX,and the 1-bit flip operator is used to construct N,then distance motifs in the corresponding directed fitness landscape network are guide motifs. Proof.Let a distance motif in the corresponding DFLN be M {m,n2,n3},EMa,{di,d2,d3}.If an edge (ni,nj)EMa,then according to the 1-bit flip operator and ONEMAX,we have f(n)=1+fn:)→fn)>f(n)→(,n)是EMd (13) Thus,Md belongs to Groups,Group,or Groups. Case 1:Md belongs to Group3. Evolutionary Computation Volume x,Number x
J. Liu, H. A. Abbass, D. G. Green, and W. Zhong (nj , nk) 6∈ −→E MG , then MG is a Core Guide Motif. Definition 7: Given a deceptive motif, MD = {VMD , −→E MD , {d1, d2, d3}}. If there exists a path with length two, labelled as {(ni , nk), (nk, nj )}, where ni , nk, nj ∈ VMD and di < dj , satisfies the conditions that di < dk when (nk, ni) 6∈ −→E MD and dk < dj when (nj , nk) 6∈ −→E MD , then MD is a Core Deceptive Motif. According to Definitions 6 and 7, none of the motifs in Group4 and Group5 is core guide or deceptive motifs because no path with length two exists. The relationship with other groups of basic motifs is also shown in Table 1. 5 Predictive Difficulty Measures based on Distance Motifs Guide motifs can mostly help the searching process head towards the right direction. This is because when the searching process goes from nodes with lower fitness values to nodes with higher fitness values, it naturally and smoothly approaches global optima. On the contrary, deceptive motifs mostly lead the searching process in the wrong direction since when the searching process goes from nodes with lower fitness values to nodes with higher fitness values, it deviates away from global optima. For neutral motifs, fitness values of the three nodes are the same, thus, neither deceptive nor guide information is provided. Next, we first analyse DFLNs of some well-studied fitness functions to show how the motifs composing a DFLN are related to problem difficulty. Then, based on the analyses, predictive difficulty measures are proposed. 5.1 DFLNs of Well-Studied Fitness Functions 5.1.1 Effect of the Amount of Different Types of Motifs on Problem Difficulty Three classical fitness functions defined over the binary search space are used, namely ONEMAX , NIAH , and TRAP (Droste et al., 2002). ONEMAX (s) = Xn i=1 si (10) NIAH (s) = ½ 1 If s = s ∗ 0 If s 6= s ∗ (11) TRAP(s) = Xn i=1 si + (n + 1) · Yn i=1 (1 − si) (12) where s = (s1, s2, · · · , sn) and si = 0 or 1, i = 1, 2, · · · , n. s ∗ is the global optimum. Proposition 1: If a fitness landscape, FL = {S, f, N}, where S is composed of binary strings, f=ONEMAX , and the 1-bit flip operator is used to construct N, then distance motifs in the corresponding directed fitness landscape network are guide motifs. Proof. Let a distance motif in the corresponding DFLN be Md = {{n1, n2, n3}, −→E Md , {d1, d2, d3}}. If an edge (ni , nj ) ∈ −→E Md , then according to the 1-bit flip operator and ONEMAX , we have f(nj ) = 1 + f(ni) ⇒ f(nj ) > f(ni) ⇒ (nj , ni) 6∈ −→E Md (13) Thus, Md belongs to Group3 , Group4 , or Group5 . Case 1: Md belongs to Group3 . 8 Evolutionary Computation Volume x, Number x
Motif Difficulty:A Problem Difficulty Measure Ewa={(1,2),(2,3)}→f(1)<f2)<f(n3) (14) →#1(1)<#1(2)<#1(13)】 where #1()denotes the number of '1'in a candidate solution.Since the global opti- mum is the string of all '1',we have di>d2>d3;that is,Md is a core guide motif. Case 2:Md belongs to Group. Exa={(n2,m),(n2,n3)}f(n2)<f(m)andf(n2)<f(n3) →#1(2)<#1()and#1(2)<#1(g) (15) Then,we have d2>di and d2>d3;that is,Md is a guide motif. Case 3:Md belongs to Groups. EMa={,),(n,2}→fh)<f2)andf(ns)<fn2】 (16) →#1()<#1(2)and#1(3)<#1(2) Then,we have d>d2 and d3>d2;that is,Md is a guide motif. To summarise,in all the three cases,Md is a guide motif. ◇ Before we give Proposition 2,we first prove some properties of DFLNs built by the 1-bit flip operator. Lemma 1:In a fitness landscape network built by the 1-bit flip operator,(1)the number of edges connecting any three nodes is less than three,(2)each node belongs to 3n()basic motifs where n is the dimension of the search space,(3)the number of all basic motifs is 2n-n(n-1). Proof.(1)Suppose there are three nodes {n1,n2,n3}connected by three edges. Without loss of generality,let the set of edges connecting these three nodes be n,n2,n2,n3,n3,n,and the configurations corresponding to these three nodes be s1,s2,s3.Then,we have H(s1,s3)=1 (17) where H(,)denotes the Hamming distance.But we also have H(s1,s2)=1 and H(s2,s3)=1 H(s1,s3)=0 or 2 (18) Clearly,(17)and(18)contradict each other.Thus,the number of edges connecting any three nodes is less than three. (2)The number of edges connected to each node is n.According to Lemmal(1), any three nodes can form only 1 basic motif at most.Thus,all distance motifs that a node m belongs to can be divided into two cases;that is,(a)both n2 and n3 connect to n and (b)n connects to n2 and n2 connects to n3.The number of basic motifs that n belongs to is )+n(n-1)=3nm- (19) 2 (3)The proof for Lemmal(2)shows that basic motifs that each node belongs to can be divided into two cases.In fact,Case 2 for n is equivalent to Case 1 for n2. Therefore,to prevent repetition,we need to count only Case 1 for each node to calculate Evolutionary Computation Volume x,Numberx 9
Motif Difficulty: A Problem Difficulty Measure −→E Md = {(n1, n2), (n2, n3)} ⇒ f(n1) < f(n2) < f(n3) ⇒ #1(n1) < #1(n2) < #1(n3) (14) where #1(·) denotes the number of ‘1’ in a candidate solution. Since the global optimum is the string of all ‘1’, we have d1 > d2 > d3; that is, Md is a core guide motif. Case 2: Md belongs to Group4 . −→E Md = {(n2, n1), (n2, n3)} ⇒ f(n2) < f(n1) and f(n2) < f(n3) ⇒ #1(n2) < #1(n1) and #1(n2) < #1(n3) (15) Then, we have d2 > d1 and d2 > d3; that is, Md is a guide motif. Case 3: Md belongs to Group5 . −→E Md = {(n1, n2), (n3, n2)} ⇒ f(n1) < f(n2) and f(n3) < f(n2) ⇒ #1(n1) < #1(n2) and #1(n3) < #1(n2) (16) Then, we have d1 > d2 and d3 > d2; that is, Md is a guide motif. To summarise, in all the three cases, Md is a guide motif. Before we give Proposition 2, we first prove some properties of DFLNs built by the 1-bit flip operator. Lemma 1: In a fitness landscape network built by the 1-bit flip operator, (1) the number of edges connecting any three nodes is less than three, (2) each node belongs to 3n(n−1) 2 basic motifs where n is the dimension of the search space, (3) the number of all basic motifs is 2 n−1n(n − 1). Proof. (1) Suppose there are three nodes {n1, n2, n3} connected by three edges. Without loss of generality, let the set of edges connecting these three nodes be {{n1, n2}, {n2, n3}, {n3, n1}}, and the configurations corresponding to these three nodes be s1, s2, s3. Then, we have H(s1, s3) = 1 (17) where H(·, ·) denotes the Hamming distance. But we also have H(s1, s2) = 1 and H(s2, s3) = 1 ⇒ H(s1, s3) = 0 or 2 (18) Clearly, (17) and (18) contradict each other. Thus, the number of edges connecting any three nodes is less than three. (2) The number of edges connected to each node is n. According to Lemma1(1), any three nodes can form only 1 basic motif at most. Thus, all distance motifs that a node n1 belongs to can be divided into two cases; that is, (a) both n2 and n3 connect to n1 and (b) n1 connects to n2 and n2 connects to n3. The number of basic motifs that n1 belongs to is ( n 2 ) + n(n − 1) = 3n(n − 1) 2 (19) (3) The proof for Lemma1(2) shows that basic motifs that each node belongs to can be divided into two cases. In fact, Case 2 for n1 is equivalent to Case 1 for n2. Therefore, to prevent repetition, we need to count only Case 1 for each node to calculate Evolutionary Computation Volume x, Number x 9