Caching In on the Enterprise Grid
Turbo-Charge Your Applications with OracleAS Web Cache
An Oracle Technical White Paper
September 2005
OracleAS Web Cache 10g (10.1.2) -- Technical White Paper
Copyright © 1999-2005 Oracle Corporation, All Rights Reserved

OracleAS Web Cache

Introduction to Grid Computing
    Information Technology Challenges Organizations Face Today
    Grid Computing and Oracle’s Grid Computing Offering
    Oracle Application Server 10g and its Benefits
Performance, Scalability, and QoS on the Grid
    The Dynamic Content Dilemma
    Peak Loads, Flash Crowds, Lights Out
    The End User is King
Introducing Oracle Application Server Web Cache
Efficient Use of Low-Cost, Existing Hardware
    Static and Dynamic Content Caching
    Invalidation for Cache Consistency
        Expiration Policies
        Invalidation Messages
    Partial-Page Caching and Personalized Page Assembly
        ESI for Java (JESI) – JSR 128
    Automatic Compression
    Linear Scalability on Commodity Hardware
Workload Management and Reliability
    Web Server Load Balancing, Failover, and Connection Pooling
    Surge Protection and Throttling
    Quality of Service Assurance
    Cache Clustering
        Partitioning the Web Object Space for Linear Capacity
        Content Provisioning
        Failure Detection and Auto-Restart for Hands-Off Manageability
        On-line Reconfiguration Without Loss of Service
        Reduced Load on the Origin Servers
End-User Experience Management
Flexible Deployment Options
    Integrated with the Oracle Technology Stack
    Heterogeneous Environments
    Branch Office Hierarchies
Customers and Return on Investment
    ROI: A Case Study of Digital River
        Turning Cache into Cash
Summary
More Information
INTRODUCTION TO GRID COMPUTING

Information Technology Challenges Organizations Face Today

The primary challenge facing organizations today is the high cost of their information technology infrastructure. This high cost arises from three related causes:

1. Excess Computing Capacity that is poorly utilized due to the need to build capacity for peaks, and the inability to use the spare capacity efficiently
2. Expensive Capacity Growth due to the inability to add capacity quickly, when needed, and in low-cost, modular units to avoid further compounding the problem of excess capacity
3. High Cost of Management due to the complexity of systems; the specialized tools, procedures, and skills required; and the large amounts of human intervention needed to manage systems

Grid Computing and Oracle’s Grid Computing Offering

Grid Computing is a new software architecture designed to effectively pool together large amounts of low-cost modular storage and servers to create a virtual computing resource across which work can be transparently distributed to use capacity efficiently, at low cost, and with high availability. The resources in a Grid can include storage, servers, databases, application servers, and applications. By pooling resources together, Grid Computing can offer dependable, consistent, pervasive, and inexpensive access to these resources regardless of their location and when needed, thereby fulfilling the need for computing capacity on-demand.
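The capacity argument behind pooling can be made concrete with a little arithmetic. The sketch below uses made-up hourly load figures for three hypothetical applications; the names and numbers are illustrative assumptions, not measurements from this paper. Sizing each application separately requires capacity for the sum of the individual peaks, whereas a shared pool only needs to cover the peak of the combined load.

```python
# Hypothetical hourly load (in server-equivalents) for three applications
# whose peaks fall at different times. All figures are illustrative only.
loads = {
    "payroll":    [2, 2, 8, 2],
    "storefront": [3, 9, 3, 3],
    "reporting":  [1, 1, 2, 7],
}


def siloed_capacity(loads):
    """Capacity needed when each application is sized for its own peak."""
    return sum(max(series) for series in loads.values())


def pooled_capacity(loads):
    """Capacity needed when all applications share one pool of servers."""
    combined = [sum(hour) for hour in zip(*loads.values())]
    return max(combined)


siloed = siloed_capacity(loads)
pooled = pooled_capacity(loads)
print(f"siloed: {siloed}, pooled: {pooled}")
```

With these figures the siloed design needs 24 server-equivalents and the shared pool needs 13. The gap narrows as the peaks move closer together in time and disappears when every application peaks in the same hour.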
While Grid Computing has primarily been used by the scientific community to solve very specialized problems, the rapid evolution of cost-effective networked storage; high-speed, high-density blade servers; high-speed network interconnects; and low-cost operating systems; together with the evolving capabilities of systems software (databases and application servers) to exploit these advances have now made it possible for enterprises to leverage the benefits of Grid Computing.

Oracle offers a comprehensive solution to manage information and run Enterprise Applications on Grids using Oracle Database 10g and Oracle Application Server 10g. Both Oracle Database 10g and Oracle Application Server 10g can be managed in a Grid Computing environment using Oracle Grid Control. Together these products address the challenges faced by IT organizations today:

• Radically Reduce or Eliminate Excess Computing Capacity by automatically load balancing workloads to use spare capacity efficiently, eliminating “islands of computation”
• Provide Inexpensive Capacity Growth by adding capacity on-demand in low-cost modular units
• Radically Lower Cost of Management by centralizing administration of the resources in a Grid and automating provisioning and administration tasks across these resources

Oracle Application Server 10g and its Benefits

Oracle Application Server 10g, the next generation of Oracle’s Integrated Software Infrastructure for Enterprise Applications, has been designed to enable Grid Computing. It has been designed to effectively pool together large numbers of low-cost servers to create a virtual computing resource across which enterprise applications can be transparently distributed to use capacity efficiently, at low cost, and with high availability. Any existing application that runs on Oracle Application Server can transparently take advantage of Grid Computing without any changes. Service-Oriented Applications will find additional benefits when deployed in a Grid. Oracle Application Server 10g provides a number of Grid Computing features, most importantly:

• Radically Reduce or Eliminate Excess Computing Capacity through Policy-Based Resource Management; Metrics-based Workload Management; and a variety of advanced back-up, disaster recovery, and clustered fail-over solutions to provide maximum availability in a Grid.
• Provide Modular, Inexpensive Capacity Growth through Automated Installation, Configuration, and Software Provisioning (including both software cloning and patch management) across hundreds of nodes in a Grid.
• Radically Lower Cost of Management and eliminate human errors in management through Centralized Systems Monitoring, Unified Application Server Cluster Management (including Cluster Monitoring, Cluster Optimization, and Cluster-wide Application Deployment), and centralized Identity Management across a Grid.
PERFORMANCE, SCALABILITY, AND QOS ON THE GRID

IT managers and software developers face strict performance, scalability, and quality of service (QoS) requirements for Web-based applications, whether or not their applications are deployed in a Grid environment. To be successful, IT administrators must protect against poor response times and system outages caused by peak loads; at the same time, they must also contain costs. For their part, successful application developers must consider performance and scalability at design time, not as a post-deployment tuning exercise. Today’s IT managers and software developers face three main performance-related challenges:

1. Dynamic, Web-based applications are compute-intensive, yet IT personnel are asked to do more with less
2. Unexpected traffic surges can cause delays and outages, yet planning for peak loads is cost-prohibitive
3. Users are demanding sub-second response times, yet visibility into end-user service levels is poor

The Dynamic Content Dilemma

While fast response times are crucial for revenue generation, companies must also retain users and control costs if they want to become (or remain) profitable. The problem of user retention is most often addressed by delivering dynamic, personalized content to each user. Yet as the Aberdeen Group aptly warns, customizing each Web page raises the ever-present cost specter. “Customized content generated at the moment of request often translates into high hardware, software and data management costs. Without an adequate infrastructure, customized content could do more harm than good – degrading the firm's brand, slowing customer service, or, in the worst case, causing a customer to turn away and go to a competitor.”¹

One alternative is to design Web pages using only static content. In terms of computation and resource utilization, static content is easy to generate and deliver, and most static Web sites will perform adequately under heavy load. One of the problems with this approach is that without a dynamic, database-driven infrastructure, content management becomes difficult. Every time an update is made, the static Web site has to be redesigned and republished.

While static content may have been sufficient for first-generation Web design, today’s e-businesses must offer customers a more compelling user experience. E-business is anything but static. Companies must exchange data in real-time with other companies, and customer retention demands an interactive, one-to-one relationship with consumers. For these reasons and more, database-driven, dynamically generated content is at the heart of today’s Web-based application architectures.
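The trade-off between static and dynamic content can be illustrated with a toy model. The sketch below is an illustration under stated assumptions, not code from any Oracle product: `FakeDatabase`, `serve_static`, `render_dynamic`, and `STATIC_PAGE` are hypothetical names, and the "database" is an in-memory dictionary that merely counts lookups. It shows that a static page costs no back-end work per request, while a personalized page performs a lookup on every request.

```python
class FakeDatabase:
    """In-memory stand-in for a database that counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def lookup(self, user):
        self.queries += 1
        return self.rows.get(user, "guest")


# A static page is built once, ahead of time, and is identical for every user.
STATIC_PAGE = "<html><body>Welcome to our store</body></html>"


def serve_static(user):
    """Return the pre-built page; no back-end work is performed."""
    return STATIC_PAGE


def render_dynamic(db, user):
    """Build a personalized page on request; costs one lookup each time."""
    name = db.lookup(user)
    return f"<html><body>Welcome back, {name}</body></html>"


db = FakeDatabase({"u1": "Alice", "u2": "Bob"})
requests = ["u1", "u2", "u1", "u1"]

static_pages = [serve_static(u) for u in requests]
dynamic_pages = [render_dynamic(db, u) for u in requests]

print("database lookups for static pages: 0")
print(f"database lookups for dynamic pages: {db.queries}")
```

In this model the back-end cost of dynamic content grows with the number of requests, even when the same user asks for the same page repeatedly, while the static page stays free to serve but cannot be personalized.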
Despite its prominent architectural role, dynamic content generation poses significant challenges for e-business managers who struggle to control costs without sacrificing performance. To paraphrase the Aberdeen Group report cited earlier, generating content on the fly involves several steps that inevitably utilize a large amount of computing power and, under load, can lead to performance bottlenecks.

The typical steps can be summarized as follows. In order for a Web browser to request content from a Web server, the user’s client machine must first connect to the Web server. Once connected, the browser’s HTTP request has to be parsed by the Web server. The HTTP request may contain parameters and header information that must be passed to a “presentation” mechanism for appropriate content retrieval and formatting. If the requested content requires formatting by a servlet or JSP, then the Web server must connect to a runtime environment (e.g., Apache Tomcat), which may be running on a separate machine. Once invoked, the servlet may instantiate a number of Java classes which query a database, requiring further network connections, since the database is typically running on a dedicated machine of its own. Ultimately, the formatted content is returned to the Web

¹ Aberdeen Group, Accelerating Web Site Performance by Caching Dynamic Content, January 2001