A Methodology for Analyzing Availability Weak Points in SOA Deployment Frameworks

Jing Luo (1), Ying Li (1), John A Pershing (2), Lei Xie (3), and Ying Chen (1)

(1) IBM China Research Lab, China, {jingluo, lying, yingch}@cn.ibm.com
(2) IBM T. J. Watson Research Center, Hawthorne, NY 10532, USA, pershing@alum.mit.edu
(3) Department of Computer Science, Nanjing University, China, xielei@dislab.nju.edu.cn

Abstract: The fundamental characteristics of SOA, loose coupling and on-demand integration, enable organizations to seek more flexibility and responsiveness from their business IT systems. However, this brings challenges to assure QoS, especially availability, which should be considered in an integrated way in an SOA environment. Traditionally, availability is measured for each IT resource, but within SOA environments, rather than being considered individually, availability should be analyzed from an end-to-end view from both business and IT perspectives. In this paper, to address the availability problem of SOA, we propose a methodology that analyzes availability weak points in SOA deployment frameworks, leveraging workflow definitions that specify availability requirements at business level. This methodology includes an effective way to calculate high-availability enhancement recommendations for a given SOA deployment topology with near-minimum cost, while meeting the business-level availability requirements. A prototype has been implemented as an extension to IBM’s SOA deployment framework. Its efficiency and performance are analyzed here.

I. INTRODUCTION

The Service-Oriented Architecture (SOA) provides on-demand integration capabilities by loosely composing one or more services. This loose coupling enables SOA to offer clear benefits [1] and opens up new opportunities for organizations to become more flexible and responsive.
However, this architecture brings challenges to Information Technology (IT) management and complicates Quality of Service (QoS) measurement precisely because of its loosely coupled nature: it is frequently difficult to determine which systems and services are contributing to an SOA service, and how they may be failing to deliver the required quality of service. Traditionally, availability is measured for each IT resource. But within SOA environments, rather than merely considering the availability of individual IT resources, one must take an end-to-end viewpoint. Moreover, the relationships between business workflows and the supporting IT resources are complex and dynamic. For example, one service can be invoked by several business workflows, while each business workflow usually invokes multiple services.

Manuscript received 1 April 2008, revised 29 October 2008, accepted 17 February 2009. The Associate Editor coordinating the review of this paper and approving it for publication was J.P. Martin-Flatin.

To ensure availability, redundancy-based High Availability (HA) solutions are the primary approach, including clustering [2], hot-failover [3], recursive restartability [4], and Redundant Array of Independent Disks (RAID). However, these solutions are usually expensive, and their cost, capabilities and implementation difficulty vary greatly. Thus, it becomes quite difficult to plan HA solutions in a cost-effective manner. Traditionally, IT architects rely on experience to decide which HA solutions should be applied to which IT resources, and with what degree of redundancy. However, this experience-based approach to HA is difficult to apply to SOA environments, because the large number of involved IT systems and the complex relationships among them are beyond the comprehension of most humans.
Moreover, even if all single points of failure have been eliminated, some of the (redundant) IT resources still may not exhibit the necessary availability level to satisfy the requirements of the business, so it may be necessary to introduce even more redundancy in order to meet the availability requirements. Hence, the key to delivering a cost-effective HA architecture is to determine the right HA level for each IT system, based on the trade-off between its outage loss and redundancy cost: too little redundancy could result in costly outages, and too much could be an expensive waste.

In this paper, we define an availability weak point as an IT resource that is not providing sufficient availability to meet current business requirements, but for which we can provide a cost-effective HA enhancement in order to meet these availability requirements. Therefore, to apply HA solutions [2][3] over SOA environments in a cost-effective manner, identifying and analyzing availability weak points is the starting point. In a previous article [5], we proposed a workflow-based weak-point analysis methodology to address this challenge. The methodology determines which deployed IT resources need to have their availability enhanced, and to what extent, in order to satisfy the business-level availability requirements while keeping the overall cost close to the minimum. In this paper, we further refine and evaluate our methodology, and analyze the efficiency and performance of the prototype that implements it as an extension to IBM’s SOA deployment framework.
The rest of the paper is organized as follows. In Section II, we describe the basic structure of our availability weak-point analysis methodology. In Section III, we present our algorithm for calculating a near-optimal solution. In Section IV, we describe our implementation and evaluate experimental results. In Section V, we study related work. In Section VI, we conclude the paper and discuss future work.

II. WEAK-POINT ANALYSIS

In this work, we define a three-level workflow hierarchy to enable weak-point analysis: business workflow, application workflow, and IT resource workflow. First, we assume that business workflows are defined in some machine-readable format, such as Business Process Execution Language (BPEL) [6], and that a business workflow includes “pointers” (e.g., Web service references) to the services that support the various steps of this business workflow. As services are implemented by given applications, we define an application workflow as the application chain that supports the given business workflow. Second, as applications are supported by the underlying IT resources that host them, we assume that the hosting and dependency relationships among the various IT resources are also available in some machine-readable format, either as standard deployment documents from the design phase, or as the result of a discovery process running against the IT infrastructure. By analyzing the hosting and dependency relationships, an IT resource workflow is defined as the IT resource chain that supports a given application workflow, which in turn supports the given business workflow.

Based on these assumptions, our weak-point analysis methodology first constructs the relationships between business workflows and IT resources by workflow mapping; then, based on these relationships, it calculates the optimized HA enhancement recommendation for the current SOA deployment topology. The main building blocks of our methodology are depicted in Fig. 1; they are grouped in three modules.

Fig. 1. Architecture for workflow-based weak-point analysis

A. Workflow Specification Module

The Workflow Specification Module maps business workflows to IT resources, where each business workflow is annotated with availability requirements. In this paper, availability requirements are defined by an uptime ratio, which represents the percentage of time a business workflow is available; for example, 99.9% means that end users tolerate a downtime of at most 86.4 seconds per day for this workflow (0.1% of the 86,400 seconds in a day). Such availability requirements are typically specified by business architects. The mapping is performed from the business level to the application and IT resource levels by inspecting the hosting and dependency relationships that are defined in the SOA deployment topology.

As Fig. 2 shows, through the hosting relationships specified over the SOA deployment topology, Workflow 1 and Workflow 2 are mapped to the IT resource level. However, in a more complex scenario, direct mapping based only on hosting relationships is inadequate: business workflow branching and implicit dependency discovery need to be considered.

Fig. 2. Workflow mapping over the SOA deployment topology

Business workflow branching describes a situation in which the business workflow contains conditional branches and the branch to be selected next depends on current conditions. For a business workflow that implements complex business functions, it is common to have several conditional branches. Fig. 3 illustrates a workflow with two conditional branches at points A and B. When this workflow runs, only one path through the workflow is executed. Therefore, modeling business workflows from an HA standpoint poses a problem in case of branching. On the one hand, the availability requirement is specified on the overall business workflow; on the other hand, only a subset of the service components is executed for any given runtime invocation, depending on branch conditions.
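To make the branching problem concrete, the following minimal Python sketch models a branching workflow as a small directed graph and lists the services that each possible execution path touches. The graph, the service names, and the helper `enumerate_paths` are hypothetical and chosen only for illustration; they are not a reproduction of Fig. 3.

```python
# Hypothetical branching workflow: each service maps to its possible successors.
# Two conditional branches (after "S1" and after "S2") give three paths.
WORKFLOW = {
    "S1": ["S2", "S3"],   # branch point A
    "S2": ["S4", "S5"],   # branch point B
    "S3": ["S6"],
    "S4": ["S8"],
    "S5": ["S8"],
    "S6": ["S8"],
    "S8": [],             # final step, no successors
}


def enumerate_paths(graph, start):
    """Return every start-to-end path of an acyclic workflow graph."""
    successors = graph[start]
    if not successors:
        return [[start]]
    paths = []
    for nxt in successors:
        for tail in enumerate_paths(graph, nxt):
            paths.append([start] + tail)
    return paths


paths = enumerate_paths(WORKFLOW, "S1")
for i, path in enumerate(paths, start=1):
    print(f"path {i}: {' -> '.join(path)}")
```

Each invocation of such a workflow follows exactly one of the listed paths, so no single run exercises every service, even though the availability requirement is attached to the workflow as a whole.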
Under these circumstances, mapping business-level HA requirements to applications and IT resources is not straightforward. To deal with this problem, we break up a complex workflow into several sub-workflows, where each sub-workflow represents a path through the complex workflow. Thus, to guarantee the availability of a complex workflow, we only need to guarantee the same availability for all its sub-workflows. This technique can be applied to all types of workflow, including business workflows.

Fig. 3. Business workflow branching

As depicted in Fig. 3, the complex workflow is transformed into three separate sub-workflows. We treat each sub-workflow as a complete business workflow, which can be directly taken as input by the Weak-Point Analysis Module.

Based on a “flat” (i.e., non-branching) business workflow, our mapping mechanism constructs lower-level application workflows and IT resource workflows by deriving dependencies from the business workflows according to hosting and dependency (“uses”) relationships. In this mechanism, besides noting the explicit dependencies of the higher-level workflows (e.g., from Web service references), implicit dependencies are also used to construct the lower-level workflows. An implicit dependency is a relationship that is not expressed in the higher-level workflows, but should be taken into account in the lower-level workflows.

When mapping from business workflows to application workflows, implicit business dependencies (which express the dependencies from applications to databases or to other application components in the application topology) should be considered for constructing the application workflows. Dependencies between Enterprise ARchive (EAR) modules and databases are a typical example. Thus, an application workflow is constructed as follows: an initial application workflow is constructed by following the explicit dependencies from the business workflow; then the related implicit dependencies are tracked down (e.g., from a prior discovery process) and inserted into the application workflow.
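The two-step construction described above (follow the explicit dependencies first, then insert the implicit ones) can be sketched as follows. This is a minimal illustration in Python, assuming that both kinds of dependency are already available as plain dictionaries and that the implicit dependencies contain no cycles; the names `EXPLICIT`, `IMPLICIT`, `build_application_workflow`, and the component labels are hypothetical and are not taken from the authors' prototype.

```python
# Hypothetical inputs.
# EXPLICIT: services referenced by the business workflow, mapped to the
#           application modules that implement them.
# IMPLICIT: dependencies reported by a prior discovery process.
EXPLICIT = {"Service1": ["Ear1"], "Service2": ["Ear2"]}
IMPLICIT = {"Ear1": ["DB"], "Ear2": ["DB"]}


def build_application_workflow(services, explicit, implicit):
    """Return the ordered application chain for a flat (non-branching) workflow."""
    workflow = []
    for service in services:
        # Step 1: follow the explicit dependencies of the business workflow.
        for component in explicit.get(service, []):
            workflow.append(component)
            # Step 2: insert implicit dependencies right after their source,
            # following chains of implicit dependencies (assumed acyclic).
            pending = list(implicit.get(component, []))
            while pending:
                dep = pending.pop(0)
                workflow.append(dep)
                pending.extend(implicit.get(dep, []))
    return workflow


app_workflow = build_application_workflow(["Service1", "Service2"], EXPLICIT, IMPLICIT)
print(app_workflow)  # ['Ear1', 'DB', 'Ear2', 'DB']
```

In this sketch a component that is reached more than once is deliberately kept each time it is referenced, so the result records every reference in the chain rather than a de-duplicated set.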
For example, the business workflow of a J2EE application usually describes the dependency between Web modules and EAR modules; to construct an end-to-end application workflow, the implicit dependencies expressing the relationships between each EAR module and the referenced databases are analyzed, and the databases are added as part of the application workflow.

Fig. 4. Implicit dependency discovery in workflow mapping

Similarly, implicit application dependencies should also be considered when an application workflow is mapped to an IT resource workflow. A typical example used for database HA solutions is that the actual database files of a database server are placed on a shared file system or Storage Area Network (SAN); thus, there is a dependency between the database server and the shared file system, which is not expressed in the application workflow. Fig. 4 shows an implicit business dependency between an EAR module and a database being inserted in the application workflow, and an implicit application dependency between the database server and a shared file system being inserted in the IT resource workflow.

After the workflow has been mapped from the business level to the IT resource level, we extract the list of IT resources that are involved in each workflow. Then, the workflow-resource relationship matrix for weak-point analysis is created, which contains the necessary information for the relevant IT resources for each workflow.

We assume there exist n business workflows over the SOA deployment topology, denoted W1, W2, W3, ..., Wn. These workflows are specified with availability requirements P1, P2, P3, ..., Pn, where 0 < Pi < 1. We also assume that there are m IT resources, denoted C1, C2, ..., Cm. Each resource consists of a “stack” of hardware and software components (e.g., an X86 server, a Linux Operating System, and a WebSphere Application Server).

TABLE I
The workflow-resource relationship matrix

          C1     C2     C3     ...    Cm
W1(P1)    R1,1   R1,2   R1,3   ...    R1,m
W2(P2)    R2,1   R2,2   R2,3   ...    R2,m
...       ...    ...    ...    ...    ...
Wn(Pn)    Rn,1   Rn,2   Rn,3   ...    Rn,m

Table I shows the workflow-resource relationship matrix; the relationship between business workflow Wi and IT resource Cj is Ri,j, where Ri,j is an integer count of the number of references to IT resource Cj from business workflow Wi. Ri,j is set to 0 when resource Cj is not included in the resource list of Wi. For example, Fig. 5 shows a business workflow with
two services, which are mapped to three IT resources, $C_1$, $C_2$ and $C_3$, plus one implicit resource $C_4$ that is not explicitly included in the business workflow. Note that, at the application level, Component 1 depends on Component 2 to implement Service 1, and Component 2 depends on Component 3 to implement Service 2; these dependencies are implicit business dependencies. We denote the availability capability of resource $C_i$ as $P(C_i)$; therefore, based on the implicit dependencies discovered above, the availabilities for the two services are $P(C_1) \cdot P(C_2) \cdot P(C_3)$ and $P(C_2) \cdot P(C_3)$. Thus, the availability for the workflow is $P(C_1) \cdot P(C_2)^2 \cdot P(C_3)^2$, and the matrix for business workflow $W_1$ is set to [1,2,2,0]. For a standalone service that has no dependency relationships, we can simply set $R_{i,j}$ to 1 for all its referenced resources, and 0 for its unreferenced resources.

Fig. 5. Example of BPEL workflow

B. Weak-Point Analysis Module

The Weak-Point Analysis Module uses the workflow-resource relationship matrix to calculate a near-optimal HA enhancement recommendation. Traditionally, HA analysis locates single points of failure in the IT infrastructure topology; for example, if there is no HA solution applied to a Web server, then it is regarded as a single point of failure. This method is simple to apply but has limitations in current SOA environments. For example, even if an IT resource has been made redundant, it still could be a weak point, and more redundancy could be required to satisfy the availability requirements of the corresponding business workflows. Furthermore, the cost and HA capabilities of redundancy vary for different IT resource types; thus, it is critical to find the points in the IT infrastructure where it is most cost-effective to apply an HA solution. Based on the workflow-resource relationship matrix, weak-point analysis identifies these weak points in the IT infrastructure and calculates the cost-effective HA enhancement parameters.
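The product rule behind the Fig. 5 example (each resource availability raised to its dependency degree, then multiplied) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prototype's code: the function name `workflow_availability` and the numeric availability values are hypothetical, and only the dependency row [1, 2, 2, 0] is taken from the example.

```python
from math import prod

def workflow_availability(resource_avail, dependency_row):
    """Multiply P(C_i) ** R_i over all resources of one workflow.

    resource_avail: availabilities P(C_1)..P(C_m), each in (0, 1].
    dependency_row: one row of the workflow-resource relationship matrix.
    """
    if len(resource_avail) != len(dependency_row):
        raise ValueError("one dependency degree is needed per resource")
    # A degree of 0 contributes a factor of 1, so unreferenced
    # resources have no effect on the result.
    return prod(p ** r for p, r in zip(resource_avail, dependency_row))

# Hypothetical availabilities for C1..C4 (illustrative values only).
availabilities = [0.99, 0.995, 0.999, 0.9]
row_w1 = [1, 2, 2, 0]  # row for W1 from the Fig. 5 example
avail_w1 = workflow_availability(availabilities, row_w1)
print(f"P(W1) = {avail_w1:.6f}")
```

Because C4 has dependency degree 0 for this workflow, changing its availability leaves the result unchanged, which is why such a resource would not be flagged as a weak point for W1.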
At first sight, an exhaustive iteration method would seem to be the way to identify the weak points and produce HA enhancement parameters. The algorithm would be as follows: first, identify all the possible enhancement solutions for the given deployment topology; next, select as candidates the solutions that satisfy the overall availability requirements; finally, among these candidates, select the one with minimum cost as the best solution. Unfortunately, this method can only be applied to simple scenarios, because the computational complexity grows exponentially with the number of IT resources in the IT infrastructure; consequently, the exhaustive iteration method can hardly be applied to real-world scenarios. Moreover, in SOA environments, the IT infrastructure must be very flexible so that it can quickly adapt to changing business requirements; therefore, the HA analysis may be invoked frequently and should be processed quickly to provide a cost-effective solution.

To address the above challenges, we describe a weak-point analysis methodology in Section III. It utilizes a Lagrange multiplier method of constrained optimization to calculate the optimal HA enhancement recommendation over the SOA deployment topology subject to a utility function, and produces the HA enhancement parameters for each relevant IT resource.

C. HA Pattern Mapping Module

Based on the optimized HA enhancement recommendation, the HA Pattern Mapping Module applies relevant HA patterns to the identified weak points. These patterns may be generic (e.g., clustering, hot standby) or product-specific (e.g., DB2 HADR: High Availability for Disaster Recovery). The ultimate goal of this module is to produce an HA-enhanced deployment topology that satisfies the business-level availability requirements with a minimum overall cost.

Fig. 6. HA pattern mapping and transformation

In this module (see Fig.
6), each HA pattern is associated with an applicable IT resource type (e.g., a J2EE application server or a DB2 database), and provides transformation and configuration logic for applying the pattern. For each weak point identified in the IT infrastructure, a list of compatible HA patterns is generated using two matching mechanisms. The first is the applicable-type match: if the weak point is a single IT resource, then HA patterns whose applicable type
is equal to this resource type are considered compatible. The second is the pattern match: if the weak point is already a redundant HA solution (e.g., a cluster), then HA patterns that can generate this HA solution are considered compatible. From the list of matched HA patterns, one is selected (perhaps under the guidance of the software architect) and configured with the HA enhancement parameters. Then, the pattern transformation and configuration logic is used to generate an HA solution that is deployable in IT environments.

III. ALGORITHM

A key contribution of our weak-point analysis methodology is to attach availability requirements to the business workflow and then to map these workflows to the IT infrastructure, carrying the availability requirements down to the level of the individual IT resources, where they can be analyzed. In this section, we describe a methodology for making HA enhancement recommendations to meet the business-level availability objectives, while keeping the overall cost close to the minimum. In this methodology, the current availability capability for each workflow is first calculated according to the component failure behavior parameters obtained from historical data and experience: Mean Time Between Failures (MTBF), Mean Time To Repair (MTTR), etc. Second, the methodology checks whether the availability requirements for each business workflow have been satisfied. Third, for the workflows whose requirements are not met, the relevant IT resources are identified as availability weak points, and appropriate HA patterns are recommended in order to meet the availability requirements.

A. Availability Optimization Problem

A simple example is depicted in Fig. 7(a) to illustrate the availability optimization problem. In this example, we have two business workflows ($W_1$ and $W_2$) and four underlying IT resources ($C_1$, $C_2$, $C_3$ and $C_4$). The workflow specification module constructs the workflow-resource relationship matrix (see Fig. 7(b)).
In this example, finding an optimized HA enhancement recommendation requires the iterative testing of various HA solutions and different redundancy degrees: for each IT resource, various HA solutions could be applicable, and for each redundancy-based HA solution, different redundancy degrees should be explored.

More generally, to make an optimal HA enhancement recommendation for an SOA deployment topology, three optimization dimensions should be iteratively explored: i) every IT resource in the deployment topology could be an HA enhancement candidate; ii) a variety of HA solutions could be applied to a given IT resource; and iii) each of these HA solutions could rely on different redundancy degrees. Such an iterative exploration requires exponential computation time. Because of this computational complexity, it is impractical for large-scale IT infrastructures, which are common in real-world SOA environments.

Let us now rigorously define this availability optimization problem.

Fig. 7. SOA deployment topology example for problem definition: (a) SOA deployment topology example; (b) workflow-resource relationship matrix

We denote the $n$ business workflows over the SOA deployment topology as $W_1, W_2, W_3, \ldots, W_n$. These workflows are specified with availability requirements $P_1, P_2, P_3, \ldots, P_n$, where $0 < P_i < 1$. We also assume that there are $m$ IT resources, denoted by $C_1, C_2, \ldots, C_m$. We construct a workflow-resource relationship matrix in which the IT resources depended upon by each workflow are identified; each matrix entry $R_{j,i}$ expresses the dependency degree from workflow $W_j$ to IT resource $C_i$, as previously described. For instance, if IT resource $C_i$ is referenced three times by a given business workflow $W_j$, its dependency degree $R_{j,i}$ will be recorded as 3.

A resource vector can be defined for a given workflow $W_j$ as $(\langle C_1, R_{j,1} \rangle, \langle C_2, R_{j,2} \rangle, \ldots, \langle C_m, R_{j,m} \rangle)$. For simplicity, we express this as $(R_{j,1}, R_{j,2}, \ldots, R_{j,m})$.
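The construction of the workflow-resource relationship matrix from reference counts can be sketched as follows. This is a hedged illustration rather than the workflow specification module itself: the function name `build_relationship_matrix` and the flat "list of referenced resources per workflow" input format are assumptions made here, and the second workflow's reference list is invented purely to show a dependency degree of 3.

```python
from collections import Counter

def build_relationship_matrix(workflow_refs, resources):
    """Return R, where R[j][i] is how often workflow j references resource i.

    workflow_refs: one list of referenced resource names per workflow
                   (hypothetical input format).
    resources:     ordered resource names C_1..C_m (the column order).
    """
    known = set(resources)
    matrix = []
    for refs in workflow_refs:
        counts = Counter(refs)
        unknown = set(counts) - known
        if unknown:
            raise ValueError(f"unknown resources: {sorted(unknown)}")
        # Counter returns 0 for resources the workflow never references.
        matrix.append([counts[name] for name in resources])
    return matrix

resources = ["C1", "C2", "C3", "C4"]
workflow_refs = [
    # Service 1 uses C1, C2, C3 and Service 2 uses C2, C3,
    # which yields the row [1, 2, 2, 0] of the BPEL example.
    ["C1", "C2", "C3", "C2", "C3"],
    # Invented workflow: C4 is referenced three times, so its degree is 3.
    ["C2", "C4", "C4", "C4"],
]
R = build_relationship_matrix(workflow_refs, resources)
for j, row in enumerate(R, start=1):
    print(f"W{j}: {row}")
```

Each row of `R` is the simplified resource vector of the corresponding workflow.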
Based on the above definition, for a given IT resource $C_i$, its intrinsic availability capability (e.g., based on historical experience) is denoted by $P(C_i)$, and its cost is denoted by $h(C_i, n_i)$, where $n_i$ is the redundancy degree (e.g., cluster size) of this resource. For a given workflow $W_j$, its current availability capability can be calculated by $P(W_j)$. These calculation functions will be discussed in Section III.B.

The optimization problem becomes finding a cost-effective HA enhancement recommendation described as $(\langle C_1, n'_1 \rangle, \langle C_2, n'_2 \rangle, \ldots, \langle C_m, n'_m \rangle)$, where $C_i$ denotes the original IT resource and $n'_i$ denotes the new redundancy degree for $C_i$. This enhancement recommendation keeps the overall cost minimal while satisfying all availability requirements for each business workflow. Formally, the availability optimization problem is defined as a constrained optimization problem as follows: