CHAPTER 1 Understanding object/relational persistence

modeler. The usual solution to this problem is to bend and twist the object model until it matches the underlying relational technology. This can be done successfully, but only at the cost of losing some of the advantages of object orientation. Keep in mind that relational modeling is underpinned by relational theory. Object orientation has no such rigorous mathematical definition or body of theoretical work. So, we can’t look to mathematics to explain how we should bridge the gap between the two paradigms: there is no elegant transformation waiting to be discovered. (Doing away with Java and SQL and starting from scratch isn’t considered elegant.)

The domain modeling mismatch problem isn’t the only source of the inflexibility and lost productivity that lead to higher costs. A further cause is the JDBC API itself. JDBC and SQL provide a statement- (that is, command-) oriented approach to moving data to and from an SQL database. A structural relationship must be specified at least three times (Insert, Update, Select), adding to the time required for design and implementation. The unique dialect for every SQL database doesn’t improve the situation.

Recently, it has been fashionable to regard architectural or pattern-based models as a partial solution to the mismatch problem. Hence, we have the entity bean component model, the data access object (DAO) pattern, and other practices to implement data access. These approaches leave most or all of the problems listed earlier to the application developer. To round out your understanding of object persistence, we need to discuss application architecture and the role of a persistence layer in typical application design.

1.3 Persistence layers and alternatives

In a medium- or large-sized application, it usually makes sense to organize classes by concern. Persistence is one concern. Other concerns are presentation, workflow, and business logic. There are also the so-called “cross-cutting” concerns, which may be implemented generically (by framework code, for example). Typical cross-cutting concerns include logging, authorization, and transaction demarcation.

A typical object-oriented architecture comprises layers that represent the concerns. It’s normal, and certainly best practice, to group all classes and components responsible for persistence into a separate persistence layer in a layered system architecture.

In this section, we first look at the layers of this type of architecture and why we use them. After that, we focus on the layer we’re most interested in, the persistence layer, and some of the ways it can be implemented.
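The earlier remark that a structural relationship must be specified at least three times with JDBC can be made concrete with a small sketch. The table USERS, its columns, and the class name UserJdbc are hypothetical and chosen only for illustration; the point is that the same column list has to be written out by hand in the insert, update, and select statements, and then bound or read position by position.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical table USERS(ID, USERNAME, EMAIL); all names are illustrative.
class UserJdbc {
    // The same structure (ID, USERNAME, EMAIL) is spelled out three times.
    static final String INSERT_SQL =
        "insert into USERS (ID, USERNAME, EMAIL) values (?, ?, ?)";
    static final String UPDATE_SQL =
        "update USERS set USERNAME = ?, EMAIL = ? where ID = ?";
    static final String SELECT_SQL =
        "select ID, USERNAME, EMAIL from USERS where ID = ?";

    static void insert(Connection con, long id, String username, String email)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, id);
            ps.setString(2, username);
            ps.setString(3, email);
            ps.executeUpdate();
        }
    }

    static void update(Connection con, long id, String username, String email)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(UPDATE_SQL)) {
            ps.setString(1, username);
            ps.setString(2, email);
            ps.setLong(3, id);
            ps.executeUpdate();
        }
    }

    // Returns {username, email}, or null when no row has the given id.
    static String[] select(Connection con, long id) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SELECT_SQL)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                return new String[] { rs.getString("USERNAME"), rs.getString("EMAIL") };
            }
        }
    }

    // Counts how many of the three statements mention a given column name.
    static int statementsMentioning(String column) {
        int count = 0;
        for (String sql : new String[] { INSERT_SQL, UPDATE_SQL, SELECT_SQL }) {
            if (sql.contains(column)) {
                count++;
            }
        }
        return count;
    }
}
```

Adding or renaming a column in this sketch means touching every statement and every method that binds or reads it, which is the repetition the text describes.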
1.3.1 Layered architecture

A layered architecture defines interfaces between code that implements the various concerns, allowing a change to the way one concern is implemented without significant disruption to code in the other layers. Layering also determines the kinds of interlayer dependencies that occur. The rules are as follows:

■ Layers communicate top to bottom. A layer is dependent only on the layer directly below it.
■ Each layer is unaware of any other layers except for the layer just below it.

Different applications group concerns differently, so they define different layers. A typical, proven, high-level application architecture uses three layers, one each for presentation, business logic, and persistence, as shown in figure 1.4.

[Figure 1.4: A persistence layer is the basis in a layered architecture. Diagram labels: Presentation Layer, Business Layer, Persistence Layer, Utility and Helper Classes, Database.]

Let’s take a closer look at the layers and elements in the diagram:

■ Presentation layer: The user interface logic is topmost. Code responsible for the presentation and control of page and screen navigation forms the presentation layer.
■ Business layer: The exact form of the next layer varies widely between applications. It’s generally agreed, however, that this business layer is responsible for implementing any business rules or system requirements that would be understood by users as part of the problem domain. In some systems, this layer has its own internal representation of the business domain entities. In others, it reuses the model defined by the persistence layer. We revisit this issue in chapter 3.
■ Persistence layer: The persistence layer is a group of classes and components responsible for data storage to, and retrieval from, one or more data stores. This layer necessarily includes a model of the business domain entities (even if it’s only a metadata model).
■ Database: The database exists outside the Java application. It’s the actual, persistent representation of the system state. If an SQL database is used, the database includes the relational schema and possibly stored procedures.
■ Helper/utility classes: Every application has a set of infrastructural helper or utility classes that are used in every layer of the application (for example, Exception classes for error handling). These infrastructural elements don’t form a layer, since they don’t obey the rules for interlayer dependency in a layered architecture.

Let’s now take a brief look at the various ways the persistence layer can be implemented by Java applications. Don’t worry, we’ll get to ORM and Hibernate soon. There is much to be learned by looking at other approaches.

1.3.2 Hand-coding a persistence layer with SQL/JDBC

The most common approach to Java persistence is for application programmers to work directly with SQL and JDBC. After all, developers are familiar with relational database management systems, understand SQL, and know how to work with tables and foreign keys. Moreover, they can always use the well-known and widely used DAO design pattern to hide complex JDBC code and nonportable SQL from the business logic.

The DAO pattern is a good one, so good that we recommend its use even with ORM (see chapter 8). However, the work involved in manually coding persistence for each domain class is considerable, particularly when multiple SQL dialects are supported. This work usually ends up consuming a large portion of the development effort.
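As a rough illustration of how a DAO keeps persistence details out of the business logic, here is a minimal sketch. Every name in it (User, UserDao, InMemoryUserDao, UserService) is hypothetical. The in-memory implementation is only a stand-in so the sketch runs without a database; in a hand-coded persistence layer, an implementation of the same interface would contain the JDBC calls and the dialect-specific SQL.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class used by both layers.
class User {
    private final long id;
    private final String username;

    User(long id, String username) {
        this.id = id;
        this.username = username;
    }

    long getId() { return id; }
    String getUsername() { return username; }
}

// Persistence-layer interface: the only persistence type the business layer sees.
interface UserDao {
    void save(User user);
    User findById(long id); // returns null when no user has this id
}

// Stand-in implementation (an assumption for this sketch). A hand-coded
// JDBC version would implement the same interface and hide its SQL here.
class InMemoryUserDao implements UserDao {
    private final Map<Long, User> rows = new HashMap<>();

    public void save(User user) { rows.put(user.getId(), user); }
    public User findById(long id) { return rows.get(id); }
}

// Business layer: depends only on the layer directly below it, via UserDao.
class UserService {
    private final UserDao dao;

    UserService(UserDao dao) { this.dao = dao; }

    User register(long id, String username) {
        if (username == null || username.isEmpty()) {
            throw new IllegalArgumentException("username is required");
        }
        if (dao.findById(id) != null) {
            throw new IllegalStateException("id already in use: " + id);
        }
        User user = new User(id, username);
        dao.save(user);
        return user;
    }
}
```

Because UserService is written against the interface, swapping the stand-in for a JDBC-backed DAO would not require changes to the business rule in register. The cost noted in the text remains: each domain class needs its own hand-written DAO implementation.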
Furthermore, when requirements change, a hand-coded solution always requires more attention and maintenance effort.

So why not implement a simple ORM framework to fit the specific requirements of your project? The result of such an effort could even be reused in future projects. Many developers have taken this approach; numerous homegrown object/relational persistence layers are in production systems today. However, we don’t recommend this approach. Excellent solutions already exist, not only the (mostly expensive) tools sold by commercial vendors but also open source projects with free licenses. We’re certain you’ll be able to find a solution that meets your
requirements, both business and technical. It’s likely that such a solution will do a great deal more, and do it better, than a solution you could build in a limited time.

Development of a reasonably full-featured ORM may take many developers months. For example, Hibernate is 43,000 lines of code (some of which is much more difficult than typical application code), along with 12,000 lines of unit test code. This might be more than your application. A great many details can easily be overlooked, as both the authors know from experience! Even if an existing tool doesn’t fully implement two or three of your more exotic requirements, it’s still probably not worth creating your own. Any ORM will handle the tedious common cases, the ones that really kill productivity. It’s okay that you might need to hand-code certain special cases; few applications are composed primarily of special cases.

Don’t fall for the “Not Invented Here” syndrome and start your own object/relational mapping effort just to avoid the learning curve associated with third-party software. Even if you decide that all this ORM stuff is crazy, and you want to work as close to the SQL database as possible, other persistence frameworks exist that don’t implement full ORM. For example, the iBATIS database layer is an open source persistence layer that handles some of the more tedious JDBC code while letting developers handcraft the SQL.

1.3.3 Using serialization

Java has a built-in persistence mechanism: Serialization provides the ability to write a graph of objects (the state of the application) to a byte-stream, which may then be persisted to a file or database. Serialization is also used by Java’s Remote Method Invocation (RMI) to achieve pass-by-value semantics for complex objects. Another usage of serialization is to replicate application state across nodes in a cluster of machines.

Why not use serialization for the persistence layer? Unfortunately, a serialized graph of interconnected objects can only be accessed as a whole; it’s impossible to retrieve any data from the stream without deserializing the entire stream. Thus, the resulting byte-stream must be considered unsuitable for arbitrary search or aggregation. It isn’t even possible to access or update a single object or subgraph independently. Loading and overwriting an entire object graph in each transaction isn’t an option for systems designed to support high concurrency.

Clearly, given current technology, serialization is inadequate as a persistence mechanism for high-concurrency web and enterprise applications. It has a particular niche as a suitable persistence mechanism for desktop applications.
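The whole-graph behavior described above can be seen in a short sketch using the standard java.io serialization classes. The Customer and Order classes and the GraphStore helper are hypothetical names introduced for this example; a byte array stands in for the file or database column that would hold the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes forming a small object graph with a cycle.
class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<Order> orders = new ArrayList<>();

    Customer(String name) { this.name = name; }
}

class Order implements Serializable {
    private static final long serialVersionUID = 1L;
    final String item;
    final Customer customer; // back-reference to the owning customer

    Order(String item, Customer customer) {
        this.item = item;
        this.customer = customer;
    }
}

class GraphStore {
    // Writes the root and everything reachable from it into one byte stream.
    static byte[] write(Customer root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the entire graph back; there is no way to ask for a single Order.
    static Customer read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Customer) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

To change one order in this sketch, the application has to call read to rebuild the complete graph, modify the object in memory, and call write to replace the whole stream. That is workable for a single-user desktop application but not for many concurrent transactions.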
1.3.4 Considering EJB entity beans

In recent years, Enterprise JavaBeans (EJBs) have been a recommended way of persisting data. If you’ve been working in the field of Java enterprise applications, you’ve probably worked with EJBs and entity beans in particular. If you haven’t, don’t worry: entity beans are rapidly declining in popularity. (Many of the developer concerns will be addressed in the new EJB 3.0 specification, however.)

Entity beans (in the current EJB 2.1 specification) are interesting because, in contrast to the other solutions mentioned here, they were created entirely by committee. The other solutions (the DAO pattern, serialization, and ORM) were distilled from many years of experience; they represent approaches that have stood the test of time. Unsurprisingly, perhaps, EJB 2.1 entity beans have been a disaster in practice. Design flaws in the EJB specification prevent bean-managed persistence (BMP) entity beans from performing efficiently. A marginally more acceptable solution is container-managed persistence (CMP), at least since some glaring deficiencies of the EJB 1.1 specification were rectified.

Nevertheless, CMP doesn’t represent a solution to the object/relational mismatch. Here are six reasons why:

■ CMP beans are defined in one-to-one correspondence to the tables of the relational model. Thus, they’re too coarse grained; they may not take full advantage of Java’s rich typing. In a sense, CMP forces your domain model into first normal form.
■ On the other hand, CMP beans are also too fine grained to realize the stated goal of EJB: the definition of reusable software components. A reusable component should be a very coarse-grained object, with an external interface that is stable in the face of small changes to the database schema. (Yes, we really did just claim that CMP entity beans are both too fine grained and too coarse grained!)
■ Although EJBs may take advantage of implementation inheritance, entity beans don’t support polymorphic associations and queries, one of the defining features of “true” ORM.
■ Entity beans, despite the stated goal of the EJB specification, aren’t portable in practice. Capabilities of CMP engines vary widely between vendors, and the mapping metadata is highly vendor-specific. Some projects have chosen Hibernate for the simple reason that Hibernate applications are much more portable between application servers.