cover_image

Table of Contents

Series Page

Title Page

Copyright

Dedication Page

Acknowledgements

Chapter 1: Introduction

References

Chapter 2: Middleware

2.1 Enterprise Information Systems

2.2 Communication

2.3 System and Failure Models

2.4 Remote Procedure Call

2.5 Message-Oriented Middleware

2.6 Web Services and Service-Oriented Architecture (SOA)

2.7 Cloud Computing

2.8 Naming and Discovery

2.9 Further Reading

References

Chapter 3: Data Modeling

3.1 Entities and Relationships

3.2 XML Schemas

3.3 Defining New Types

3.4 Derived Types

3.5 Document Hierarchies

3.6 Relationship Types in XML Schemas

3.7 Metaschemas and Metamodels

3.8 Further Reading

References

Chapter 4: Data Processing

4.1 Processing XML Data

4.2 Query Languages and XQuery

4.3 XML Databases

4.4 Web Services

4.5 Presentation Layer: JSON and JQUERY

References

Chapter 5: Domain-Driven Architecture

5.1 Software Architecture

5.2 Domain-Driven Design

5.3 Application Frameworks

5.4 Domain-Specific Languages (DSLs)

5.5 An Example API for Persistent Domain Objects

5.6 Domain-Driven Architecture

5.7 Further Reading

References

Chapter 6: Service-Oriented Architecture

6.1 Services and Procedures

6.2 Service-Oriented Architecture (SOA)

6.3 Service Design Principles

6.4 Service-Oriented Architecture (SOA) Governance

6.5 Standardized Service Contract

6.6 Service Loose Coupling

6.7 Service Abstraction

6.8 Service Reusability

6.9 Service Autonomy

6.10 Service Statelessness

6.11 Service Discoverability

6.12 Further Patterns

6.13 Further Reading

References

Chapter 7: Resource-Oriented Architecture

7.1 Representational State Transfer

7.2 RESTful Web Services

7.3 Resource-Oriented Architecture (ROA)

7.4 Interface Description Languages

7.5 An Example Application Program Interface (API) for Resource-Oriented Web Services

7.6 Hypermedia Control and Contract Conformance

7.7 Concluding Remarks

7.8 Further Reading

References

Appendix A: Introduction to Haskell

A.1 Types and Functions

A.2 Type Classes and Functors

A.3 Monads

A.4 Further Reading

References

Appendix B: Time in Distributed Systems

B.1 What Time Is It?

B.2 Time and Causality

B.3 Applications of Logical and Vector Time

B.4 Virtual Time

B.5 Further Reading

References

Index

Title Page
Title Page

To My Father

Acknowledgements

Thanks to Larry Bernstein for suggesting this book, and for his friendship.

Thanks to my students for their enthusiasm.

Thanks to my family for their support and encouragement.

D. D.

Chapter 1

Introduction

Communications protocol development has tended to follow two paths. One path has emphasized integration with the current language model. The other path has emphasized solving the problems inherent in distributed computing. Both are necessary, and successful advances in distributed computing synthesize elements from both camps.

Waldo et al. [1]

This book is about programming-in-the-large, a term coined by DeRemer and Kron [2] to distinguish the assembly of large applications from components, from the task of implementing those components themselves. Many of the principles of programming-in-the-large were earlier elucidated by Parnas in two articles [[3, 4]]. It is striking how many of the key ideas of programming-in-the-large that remain relevant today were in Parnas' seminal work.

In the 1980s, with the rise of local area networks (LANs) and object-oriented programming, distributed software buses appeared as solutions to the growing complexity of networked heterogeneous software systems. The synthesis of these trends saw the emergence of the Common Object Request Broker Architecture (CORBA) in the late 1980s. Long before then, industries such as banking and air travel had developed on-line networked transactional systems that are still in heavy use today. CORBA, as originally envisaged, was to be the glue for allowing those legacy systems to be incorporated into modern applications. The use of systems such as CORBA, and particularly of transaction processing monitors, as the backbone for enterprise middlewares has, indeed, been a success story in enterprise software environments. Unfortunately, and perhaps predictably, the challenge of heterogeneity remained, as a heterogeneous collection of software buses emerged for application deployment.

In the 1990s, a true sea change happened with the advent of the World Wide Web. Besides making the uniform resource locator (URL) a part of the daily culture, the spectacular success of the Web has required enterprises to consider how their business model can adapt to this newly popular communication medium. The first Netscape browser included support for the hypertext transfer protocol (HTTP) for the Web, but also supported the safe execution of dynamically downloaded Java applets in the Web browser. It also had support for the CORBA communication protocol to enable those applets to connect back to the server through sophisticated inter-object request broker protocols. Clearly, the thinking here was that a simple protocol, such as the Web protocol, should just be an entry point into the more sophisticated realm of distributed enterprise applications built on modern middleware.

The vision of ubiquitous distributed object systems connecting the world together did not materialize. It became impossible to ignore the significance of location that distributed object systems were supposed to mask. No amount of middleware layering can hide the fact that there is a network connecting the points together, and that network has effects such as latency and failures that cannot be masked from applications. At the same time, the Web protocols that were supposed to be just an entry point into more sophisticated protocols showed amazing staying power, while frustration grew with the complexity of the more “sophisticated” approaches. Most importantly, connecting applications across enterprise boundaries remains a difficult problem, perhaps inevitably so because market forces push against a standardized solution to the problem.

The first decade of the 21st century saw the rise of Web services, an alternative approach to distributed object systems for providing and consuming software services outside the enterprise. Rather than run over proprietary protocol stacks, Web services were proposed to operate over the same Web protocol that had been such a spectacular success for business-to-consumer e-commerce. Although business-to-business e-commerce over reliable messaging systems has been in use for many years, the clear trend has been to vastly increase the scale of this inter-operation by making use of the Web protocol stack. Enterprises must learn to leverage the Internet for rapid and agile application development by continually striving for the optimum mix of in-house expertise and outsourced software services. Thus, service-oriented architecture (SOA) emerged: partly driven by the enabling technology of Web services, partly building on work in component-based development for programming-in-the-large, and partly seeking to align the information technology (IT) architecture with the business goals of the enterprise.

The experience of the WS-* standardization effort has been an interesting one. Developers in the field resisted adoption of the WS-* stack, essentially citing the end-to-end argument underlying the original design of the Internet to decry the complexity of the various layers in the stack. The controversy has had an evident effect. Despite early enthusiasm for simple object access protocol (SOAP)- and Web services description language (WSDL)-based Web services, many companies have pulled back and are instead adopting so-called RESTful (representational state transfer) Web services. As an indication of the “zeitgeist”, one IT writer went so far as to declare the SOAP protocol stack to be the worst technology of the last decade [5].

Yet, despite the fact that difficult challenges remain, there is room for optimism. For all of its flaws, asynchronous JavaScript and XML (AJAX) has clearly moved the ball forward in terms of building responsive networked applications where code is easily deployed to where it needs to be executed—essentially anywhere on the planet. As we shall see in Chapter 5, this trend looks likely to intensify. Meanwhile, tools such as jQuery have emerged to tame the complexities of clumsier tools such as the Document Object Model (DOM), while RESTful Web services have emerged as a backlash against the complexity of SOAP-based Web services. If there is a lesson here, it may, perhaps, be in E. F. Schumacher's famous admonition: “Small is beautiful” [6]. Just because a multi-billion dollar industry is pushing a technology, does not necessarily mean developers will be forced to use it, and there is room for the individual with key insight and the transforming idea to make a difference.

However, platforms such as the SOAP stack have at least had a real motivation. Architects and developers tend to have opposing ideas on the subject of SOAP versus REST. REST, or some derivative of REST, may be argued to be the best approach for building applications over the internet, where challenges such as latency and independent failures are unavoidable. Nevertheless, the issues of enterprise collaboration and inter-operation that motivated SOAP and WSDL, however imperfect those solutions were, are not going away. Indeed, the move to cloud computing, which increasingly means outsourcing IT to third parties, ensures that these issues will intensify in importance for enterprises. The criticality of IT in the lives of consumers will also intensify as aging populations rely on healthcare systems that make increasing use of IT for efficiency, cost savings and personalized healthcare. Similarly, IT will play a crucial role in solving challenges with population pressure and resource usage that challenge all of mankind. For example, Bill Gates has made the case for improved healthcare outcomes as critical to reducing family size in developing countries; IT is a key component in delivering on these improved outcomes.

The intention of this book is to cover the principles underlying the enterprise software systems that are going to play an increasing part in the lives of organizations of all kinds and of individuals. Some of the issues that SOA claims to address have been with us for decades. It is instructive, therefore, while discussing the problems and proposed solutions, to also ensure that there is agreement on the definition of the principles. However the book must also be relevant and discuss current approaches. Part of the challenge is that, following the battles of the last decade, it is not clear what role technologies such as Java Enterprise Edition (Java EE) and Windows Communication Foundation (WCF), so central to the adoption of SOA, will play in future enterprise systems. Therefore, this book emphasizes principles in a discussion of some of the current practices, with a critical eye towards how the current approaches succeed or fail to live up to those principles, with passing speculation about how matters may be improved.

In this text the focus is on data modeling and software architecture, particularly SOA. SOA clearly dominates the discussion. This is not intended to denigrate any of the alternative architectural styles, but SOA brings many traditional issues in software engineering to the forefront in enterprise software architecture, and it has ambitious goals for aligning IT strategy with business goals. Domain-driven design is essentially object-oriented design, and it is principally discussed in order to counterpoint it with SOA. There is reason to believe that domain-driven architecture will play a more prominent role in enterprise software architecture, as experience with mobile code progresses. Resource-oriented architecture (ROA) is certainly the technology of choice for many developers, but it is not clear how well it will address the needs of application architects attempting to deal with the issues in enterprise applications. Many of the issues in software architecture highlighted by SOA will also eventually need to be addressed by ROA.

As noted by Waldo et al., the development of enterprise applications has broadly followed two courses: the “systems” approach, in which developers have focused on the algorithmic problems to be solved in building distributed systems, and the “programming environment” approach, which has attempted to develop suitable tools for enterprise application development. The success of the former has been substantially greater than the latter, where the retreat from SOAP-based to REST-based Web services can be viewed as the most recent failure to develop an acceptable programming model. Much of the focus of the systems approach has been on the proper algorithms and protocols for providing reliability and, increasingly, security for enterprise applications. The use of these tools in application development is still a challenge. Computer scientists, sometimes express surprise when they learn of the importance of these tools in the infrastructure of the cloud.

Service developers wrestle with the problems of network security and independent failures every day—even if the solutions are sometimes flawed because of imperfect understanding of the problems and solutions. While the last several decades have seen attempts to build middleware platforms that encapsulate the use of these tools, isolating application developers from the problems with distributed applications, the REST philosophy challenges the assumptions underlying this approach, at least for applications that run over the Web, based on an end-to-end argument that focuses responsibility for dealing with the issues on the application rather than the system. This is a basic rejection of three decades of distributed programming platforms, and challenges the research and developer community to develop the appropriate abstractions that make this effort scalable. Certainly, the application developer must have the appropriate tools provided in programming frameworks. However, even with these tools, the developer still has the responsibility of using and deploying these tools in their application. This approach is in opposition to the traditional middleware approach, which purports to allow developers to build applications without regard to distribution and rely on the middleware to handle the distribution aspects.

The material in this text comes primarily from two teaching courses: first, and primarily, a course on software architecture for enterprise applications; secondly, a course in distributed systems, with an evolving focus on cloud computing. The focus of the software architecture course, and of this text, is on software design and development, including data and process modeling, and design principles. The software architecture course involves hands-on development of an enterprise application, demonstrating aspects of the software architectures considered in this text, using current tools. This approach is highly recommended in order to keep the material “real”. However it is also recommended that students be provided with copious tutorials and hands-on guidance, for dealing with “nitty-gritty” details that should not distract from the overall enterprise application perspective. The distributed systems and cloud computing course involves more advanced assignments and programming techniques, that complements the enterprise applications course. For example, the material on failure models (Sect. 2.3), data replication (Sect. 6.9) and distributed protocols (App. B) is covered there, with this volume used as a recommended reference. In these courses, the author has chosen to emphasize REST-based Web services. A related course, in enterprise and cloud computing, has more time to consider SOAP-based Web services from the enterprise perspective, as well as the use of tools such as Hadoop and Cassandra for large-scale cloud-based data processing.

This tension between the “systems” and “programming environment” aspects of the material is reflected in two appendices to this volume. The first appendix reviews the concept of time in distributed systems, a fundamental consideration in the design of any distributed application. Some other aspects of distributed systems, in particular dealing with failures and replication, are also considered in other parts of the text and may be omitted in a course focusing on software architecture. The student wishing to gain a deep appreciation of the issues in enterprise software architecture will hopefully see these aspects addressed in a complementary distributed programming course. A more in-depth consideration of these topics could be addressed by supplementing the summary material in this text with some of the classic research papers from the literature. In any case, any developer of enterprise applications must have at least a basic understanding of the principles of distributed systems, hence the decision to include an introduction to distributed system concepts in this text.

The second appendix provides an introduction to the Haskell functional programming language. Concepts from Haskell are used in several examples in the text, for example when discussing component-based software re-use. Haskell provides an elegant language for describing abstractions, and Sect. 5.4 considers its use in defining embedded domain-specific languages. Sect. 7.6 considers the use of Haskell in verifying client-side conformance with RESTful business protocols, again making use of Haskell's rich support for defining programming abstractions. Finally, there is also a school of thought that the future of RESTful programming is as some form of functional programming. It remains to be seen if this will be the case but, in any case, this provides additional motivation for providing a Haskell tutorial in a text on enterprise software applications.

There is much that is not considered in this volume; a second volume will consider these other issues. Service-oriented architecture is, boiled down to its essence, two things: abstract data types and asychronous communication. In this volume, only the first of these is considered; the other requires further elucidation, particularly in relation to event-driven architecture (EDA). There are also important issues with distributed agreement that there has not been space to fully address. So-called orchestration and choreography of enterprise applications is an important and intellectually interesting topic that will be addressed in the sequel. We also consider the role of the cloud in enterprise applications in more depth in the sequel volume. Finally, security and privacy are obviously important concerns for enterprise applications, particularly when those applications run in the cloud, and we also consider these issues in the sequel volume.

References

1. Waldo J., Wyant G., Wollrath A., and Kendall S. A note on distributed computing. Technical report, Sun Microsystems Laboratories, 1994.

2. DeRemer F. and Kron H. H. Programming-in-the-large versus programming-in-the-small. IEEE Transactions on Software Engineering, 1976;2:80–86.

3. Parnas D. On the criteria to be used in decomposing systems into modules. Communications of the ACM, 1972;15:1053–1058.

4. Parnas D. A technique for software module specification. Communications of the ACM, 1972;15:330–336.

5. Turner J. The best and worst tech of the decade, December 2009.

6. Schumacher E.F. Small is beautiful: Economics as if people mattered. Hartley and Marks Publishers; 2000.