BigAdmin System Administration Portal

Archived from Sun's Dot-Com Builder Web Site

Scaling J2EE™ Technology Applications Based on Application Infrastructure Techniques - Part 1
February 22, 2001

by Rakesh Radhakrishnan

This article is the first in a four-part series that identifies multiple techniques for scaling applications based on J2EE technology to support many concurrent users.

Part 1
Introduction
Directory Server
Proxy Server
Part 2
Mail Server
Web Server
Application Server
Part 3
Database Server
Messaging Server
Transaction Server
Part 4
Certificate Server
CORBA Server

In any computing environment, especially in dot-coms, scaling applications to support millions of users, thousands of concurrent users, expanding data and high volumes of transactions is a must. Java technology, both a programming language and a technology platform, addresses this issue via many techniques.

The platform-independent nature of Java technology offers enormous value to application developers and users: vendors do not have to develop their products for multiple OS platforms, and users can access and run the applications from nearly any type of device on any type of network.

This flexibility is made possible by Java byte code. Java source code is compiled ahead of time into platform-neutral byte code, which a virtual machine (VM) built for a specific platform then interprets or compiles to native code at run time. Each VM is built for the OS and the CPU architecture on which the code has to be run.

This functionality, which makes platform independence possible, is a double-edged sword: on one side it offers the benefit of "Write Once, Run Anywhere," and on the other it can hinder performance because of the overhead of an additional layer, which we refer to as the "virtual platform layer."

Leveraging the Extensible Nature of Java Technology

Business services such as an employee portal or an online electronic store, when built on Java technologies such as JavaServer Pages (JSP), Servlets, and Enterprise JavaBeans (EJB), are expected to leverage the extensions made to the application infrastructure and so become more scalable, primarily because the infrastructure is multitiered.

For example, if an application such as an employee portal is suffering from bottlenecks associated with the employees' Java mail client, the natural approach is to redesign and extend the underlying basic infrastructure service. If the online catalog that provides detailed product literature for an online store (Servlet/JSP) is the bottleneck, the architecture of the Web and Web proxy servers can be extended.

Scaling a J2EE technology-based application can be achieved by scaling the application infrastructure, assuming the application is multitiered and built on top of these infrastructure services.

A typical application based on J2EE technology has the following components:

  • Servlets and JSPs on the Web server
  • EJB components served by the application server
  • SQLJ (SQL statements embedded in Java code) or stored procedures running on the database servers
  • Authentication/authorization running on the LDAP server
  • J2EE components signed and certified by the certificate server
  • XML-based inter-application integration components running on EAI or B2B application integration servers
  • Java Message Service (JMS) components (if the application executes asynchronous transactions, which indicates that a messaging server such as Tibco is in use)
  • Synchronous application transactions (which are run on a transaction server such as Tuxedo)

Overview of J2EE Application Infrastructure

The basic infrastructure services in a typical dot-com environment and their respective J2EE components include:

Table 1 - Dot-com Services and Their J2EE Components

Service               J2EE Components
Directory Server      JNDI
Proxy Server          Applets/Servlets
Mail Server           Java Mail
Web Server            JSP/Servlets
Application Server    JB/EJB
Database Server       JDBC/SQLJ
Messaging Server      JMS
Transaction Server    JTS
Certificate Server    PKI
CORBA Server          JavaIDL


Not all dot-com environments have implementations of all these basic services; often they consist of just a few of them. The methods associated with scaling these services are shown in Table 2.

Table 2 - Scaling Techniques

Service               Scaling Method(s)
Directory server      Access router and replication
Proxy server          Cache array routing and proxy chaining
Mail server           Mail multiplexing and mail store partitioning
Web server            IPLB and Web farm clustering
Application server    State/session synchronization (S3) between clusters and application partitioning
Database server       System-level clustering and parallel databases or data partitioning
Transaction server    Data-dependent routing and request prioritization
Messaging server      Redundant processes with distributed queues and one-in-n delivery
Certificate server    Functional partitioning (RM/CM) and data recovery replica
CORBA server          Distributed director and gatekeepers


Caching or traffic servers can scale the entire application environment by storing frequently requested information at the edges of a network. By caching information locally, traffic servers minimize the amount of "upstream" bandwidth required to service end-user requests for information. This approach reduces both the amount of traffic on the network and the time it takes end-users to retrieve information.
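The effect of caching at the edge can be illustrated with a small in-memory cache. This is only a sketch under stated assumptions: the article does not name an eviction policy, so least-recently-used (LRU) eviction is assumed here, and `EdgeCache`, `origin`, and `upstreamFetches` are hypothetical names introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical edge cache: serves repeat requests locally, fetches misses upstream. */
class EdgeCache {
    private final Map<String, String> store;
    private final Function<String, String> origin; // stands in for the upstream server
    private int upstreamFetches = 0;

    EdgeCache(final int capacity, Function<String, String> origin) {
        this.origin = origin;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.store = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity; // assumed policy: evict the LRU entry
            }
        };
    }

    /** Return the content for a URL, using upstream bandwidth only on a miss. */
    String get(String url) {
        String cached = store.get(url);
        if (cached != null) {
            return cached; // hit: answered from the edge
        }
        upstreamFetches++;
        String body = origin.apply(url);
        store.put(url, body);
        return body;
    }

    /** Number of requests that had to travel upstream. */
    int upstreamFetches() {
        return upstreamFetches;
    }
}
```

Every hit is a request that never crosses the upstream link, which is the bandwidth and latency saving described above.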

In the following sections we take an in-depth look at each of these scaling techniques.

Scaling Techniques

Directory Server - Access Router and Replication (e.g., iPlanet Directory Server)
The directory access router makes the directory infrastructure both highly scalable and highly available through automatic fail-over and fail-back. It also provides the capability to configure intelligent load balancing across multiple directory servers and replicas in multiple geographic locations.
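The fail-over, fail-back, and load-balancing behavior can be sketched in a few lines. This is an illustration rather than the product's actual algorithm: round-robin selection is an assumption, health is signaled by explicit calls instead of real probes, and `DirectoryAccessRouter`, `markDown`, and `markUp` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical directory access router: round-robin with fail-over and fail-back. */
class DirectoryAccessRouter {
    private final List<String> servers;            // e.g. "ldap://ds1.example.com:389"
    private final Set<String> down = new HashSet<>();
    private int next = 0;

    DirectoryAccessRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    /** Mark a server as failed; requests fail over to the remaining servers. */
    void markDown(String server) {
        down.add(server);
    }

    /** Mark a server as recovered; it rejoins the rotation (fail-back). */
    void markUp(String server) {
        down.remove(server);
    }

    /** Pick the next healthy server in round-robin order. */
    String route() {
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no directory server available");
    }
}
```

Clients always talk to the router, so a failed or recovered directory server changes the rotation without any client-side reconfiguration.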

Figure 1 shows a typical implementation of an LDAP server in an enterprise.

Figure 1: A typical enterprise LDAP implementation

In some cases, a set of proxy-server farms residing in the external DMZ (EDMZ) looks up the LDAP database in the internal DMZ (IDMZ) for authentication purposes. However, if the LDAP schema is implemented across multiple directory stores in multiple locations, the LDAP data must be synchronized. This often happens when two or more enterprises have just merged or one has been acquired.

Prior to the directory access router (LDAP gateway), the directory server scaled by using directory replication. Multiple replicas essentially acted as cached (read-only) LDAP data within an enterprise's network. This was based on one LDAP database, maintained by the directory master (with write permission). The master replica handled the actual data propagation to multiple replicas. This scheme offloaded the workload associated with LDAP data propagation to the master replica and allowed redundancy for the master LDAP server.
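The split between one writable master and several read-only replicas can be modeled as follows. This is a deliberately simplified sketch: in-memory maps stand in for directory stores, propagation is a full copy triggered by hand, and `ReplicatedDirectory`, `write`, `propagate`, and `read` are hypothetical names rather than any directory server API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-master directory: one writable master, read-only replicas. */
class ReplicatedDirectory {
    private final Map<String, String> master = new HashMap<>();
    private final List<Map<String, String>> replicas = new ArrayList<>();
    private int nextReplica = 0;

    ReplicatedDirectory(int replicaCount) {
        if (replicaCount < 1) {
            throw new IllegalArgumentException("need at least one replica");
        }
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(new HashMap<String, String>());
        }
    }

    /** All updates go to the master, the only store with write permission. */
    void write(String dn, String value) {
        master.put(dn, value);
    }

    /** Push the master's contents to every replica (simplified to a full copy). */
    void propagate() {
        for (Map<String, String> replica : replicas) {
            replica.clear();
            replica.putAll(master);
        }
    }

    /** Reads are spread across the replicas and never touch the master. */
    String read(String dn) {
        Map<String, String> replica = replicas.get(nextReplica);
        nextReplica = (nextReplica + 1) % replicas.size();
        return replica.get(dn);
    }
}
```

Because reads never reach the master, read capacity grows with the number of replicas; the cost is that a replica can return stale data until the next propagation.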


Access Routing
With the second, more advanced technique of LDAP access routing, the LDAP servers can scale across multiple networks as well as multiple LDAP schemas. The LDAP gateway also dynamically maps different LDAP client attributes to different directory server attributes. Figure 2 illustrates using LDAP gateways for scaling.

Figure 2: LDAP scaling between multiple networks and schemas.

If both user information and the client attributes associated with the client device are stored in network 1, the LDAP gateway in network 2 will forward the connection automatically to the LDAP gateway in network 1, which in turn will access the appropriate LDAP data from the LDAP replicas.
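The two jobs of the gateway described here, forwarding a request to the network that holds the data and translating attribute names between schemas, can be sketched as below. Routing by distinguished-name suffix is our assumption about how a gateway might decide where an entry lives, and `LdapGateway`, `addRoute`, `mapAttribute`, and the sample names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical LDAP gateway: routes by DN suffix and renames attributes. */
class LdapGateway {
    private final Map<String, String> suffixToNetwork = new LinkedHashMap<>();
    private final Map<String, String> attributeMap = new HashMap<>();

    /** Declare that entries under the given DN suffix live in the given network. */
    void addRoute(String dnSuffix, String network) {
        suffixToNetwork.put(dnSuffix.toLowerCase(Locale.ROOT), network);
    }

    /** Declare that a client-side attribute name maps to a server-side name. */
    void mapAttribute(String clientName, String serverName) {
        attributeMap.put(clientName.toLowerCase(Locale.ROOT), serverName);
    }

    /** Return the network whose directory holds the entry, or null if none matches. */
    String networkFor(String dn) {
        String lower = dn.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> route : suffixToNetwork.entrySet()) {
            if (lower.endsWith(route.getKey())) {
                return route.getValue();
            }
        }
        return null;
    }

    /** Translate a client attribute name; unmapped names pass through unchanged. */
    String serverAttribute(String clientName) {
        String mapped = attributeMap.get(clientName.toLowerCase(Locale.ROOT));
        return mapped != null ? mapped : clientName;
    }
}
```

A gateway in network 2 that resolves a request to network 1 would forward the connection there, while the attribute map hides schema differences from the client.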

The LDAP gateway approach to scaling is especially useful when LDAP data is distributed across multiple networks and different schemas (for example, after enterprise mergers). Additionally, when the access router is implemented, fail-over and fail-back are achieved between LDAP replicas on the same network or across multiple networks.


Proxy Server - Cache Array Routing and Proxy Chaining (e.g., iPlanet Proxy Server)
There are many reasons for using proxy servers:

  • Efficient caching
  • Limiting access to and from
  • URL filtering
  • Tunneling

Proxy servers are typically deployed on four-way systems located in the EDMZ. The proxy-server farm caches the most frequently accessed static Web content delivered by the Web server farm located at the IDMZ.

Just as directory access routers are also called "LDAP gateways," the proxy servers are also generally referred to as "HTTP gateways."

There are primarily two techniques associated with scaling a proxy implementation:

  • Proxy chaining, which scales logically across multiple proxy servers
  • Cache array routing, which scales physically across multiple proxy server machines

Proxy Chaining
Through proxy chaining, hierarchical caches are built to improve performance on internal networks.

Figure 3: Proxy chaining hierarchical caches

These hierarchical caches can be implemented within a proxy server or in a proxy farm implemented within the EDMZ. This is "logical" as opposed to "physical" because of the organized layout of the cached content. The cached data is hierarchically located within or between multiple proxy systems. A proxy hierarchy is shown in Figure 3.
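A proxy hierarchy of this kind can be sketched as a chain in which each proxy answers from its own cache and passes misses to its parent. This is a minimal illustration, not a description of any product's behavior: `ChainedProxy`, `top`, and `childOf` are hypothetical names, and the origin Web server is reduced to a function from URL to content.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical chained proxy: on a miss, ask the parent proxy before the origin. */
class ChainedProxy {
    private final ChainedProxy parent;              // null at the top of the hierarchy
    private final Function<String, String> origin;  // used only by the top proxy
    private final Map<String, String> cache = new HashMap<>();

    private ChainedProxy(ChainedProxy parent, Function<String, String> origin) {
        this.parent = parent;
        this.origin = origin;
    }

    /** Top-level proxy that talks to the origin Web server. */
    static ChainedProxy top(Function<String, String> origin) {
        return new ChainedProxy(null, origin);
    }

    /** Lower-level proxy that forwards its misses to a parent proxy. */
    static ChainedProxy childOf(ChainedProxy parent) {
        return new ChainedProxy(parent, null);
    }

    /** Serve from the local cache, else from the parent, else from the origin. */
    String fetch(String url) {
        String body = cache.get(url);
        if (body == null) {
            body = (parent != null) ? parent.fetch(url) : origin.apply(url);
            cache.put(url, body); // each level keeps its own copy on the way back
        }
        return body;
    }

    boolean isCached(String url) {
        return cache.containsKey(url);
    }
}
```

When two child proxies request the same page, the second request is satisfied by the shared parent, so the origin server sees only one request.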


Cache Array Routing
On the other hand, Cache Array Routing Protocol (CARP) techniques use a deterministic algorithm for dividing client requests among multiple proxies. The cache content is not logically organized and can be distributed between multiple proxies. If a proxy does not hold the content requested by a client, instead of predicting the proxy most likely to contain the cached content, it uses a deterministic algorithm to choose another proxy.

Figure 4: Cache array routing

When using this technique alone without proxy chaining, typically the proxy farm is vertically scaled within a four-way system and horizontally scaled from 2 to 12 systems. A typical cache array routing implementation is shown in Figure 4.
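The deterministic selection behind cache array routing can be illustrated with a simplified scoring scheme: every proxy name is combined with the requested URL, hashed, and the highest score wins. This is a sketch in the spirit of CARP, not the protocol itself. CRC32 is an illustrative stand-in for the protocol's own hash functions, load factors are ignored, and `CacheArrayRouter`, `score`, and `proxyFor` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

/** Simplified CARP-style router: the proxy with the highest combined hash wins. */
class CacheArrayRouter {
    private final List<String> proxies;

    CacheArrayRouter(List<String> proxies) {
        if (proxies.isEmpty()) {
            throw new IllegalArgumentException("need at least one proxy");
        }
        this.proxies = new ArrayList<>(proxies);
    }

    /** Score a (proxy, URL) pair. CRC32 is an assumption, not the CARP hash. */
    static long score(String proxy, String url) {
        CRC32 crc = new CRC32();
        crc.update((proxy + "|" + url).getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Any caller holding the same proxy list picks the same proxy for a URL. */
    String proxyFor(String url) {
        String best = null;
        long bestScore = -1;
        for (String proxy : proxies) {
            long s = score(proxy, url);
            if (s > bestScore) {
                bestScore = s;
                best = proxy;
            }
        }
        return best;
    }
}
```

Because the choice depends only on the URL and the member list, no proxy has to guess or query where content is cached, and removing one member only reassigns the URLs that member owned.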

In many dot-com enterprises, because the proxies boost performance for the entire site with cached content, both these techniques can be combined and adopted to offer efficient proxy scaling. The cached content may include static HTML pages, applets, documents, images, and so on.

