Fostering Software Reliability in an Increasingly Hostile World

Deepak Gaur - StorePerform Technologies Inc.

For the past few years, I have been involved in providing high-availability, high-performance enterprise software solutions for large retail clients. During this period, I took an active part in several phases of our enterprise applications: performance tuning; high-availability analysis; design, development, and deployment; and load, scale, and stability testing.

Today's enterprise software consists of many components, starting with a load balancer at the front end, followed by web servers. Further down, you may have a cluster of application servers running business logic and, on the back end, a database server storing enterprise and application data. Depending on the architecture of the application, there may also be a messaging engine providing asynchronous message processing and a directory server hosting enterprise resource information. Other components, including storage area networks (SAN) and the network infrastructure, also have to be considered when building high-performance applications. Any of these components, if ignored, can become a bottleneck or a single point of failure for the application and affect the reliability of the system. It would be interesting to discuss each of these components in this workshop from a reliability standpoint.

Running an application 24/7 is a goal every enterprise architect wishes to achieve, but because of limited resources and time constraints, some availability and reliability requirements are sacrificed. Even so, every attempt should be made to minimize system downtime. In-depth research should be conducted before deciding to introduce a technology into an application or a system. Classifying the components of an application from a high-availability and reliability standpoint is very important, because the availability of the individual components determines the overall reliability of the application.
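
As a rough illustration of why this classification matters, the sketch below estimates end-to-end availability for a chain of tiers. It relies on two standard approximations: tiers that must all be up multiply their availabilities, and N redundant copies of a tier fail only when all N fail. It assumes component failures are independent, which real deployments often violate. The tier names and availability figures are hypothetical, chosen only for illustration, and are not measurements from any real system.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def series_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability of `copies` independent replicas when one is enough."""
    return 1 - (1 - availability) ** copies


def downtime_hours_per_year(availability):
    """Expected yearly downtime implied by an availability figure."""
    return (1 - availability) * HOURS_PER_YEAR


# Hypothetical per-tier availabilities (assumptions, not measured values).
tiers = {
    "load_balancer": 0.999,
    "web_server": 0.995,
    "app_server": 0.995,
    "database": 0.999,
}

# Every tier deployed as a single instance.
single = series_availability(tiers.values())

# Same chain, but the web and application tiers run as two-node clusters.
clustered = dict(tiers)
clustered["web_server"] = redundant_availability(tiers["web_server"], 2)
clustered["app_server"] = redundant_availability(tiers["app_server"], 2)
redundant = series_availability(clustered.values())

print(f"single instances: {single:.5f} "
      f"(~{downtime_hours_per_year(single):.1f} h/year down)")
print(f"clustered tiers:  {redundant:.5f} "
      f"(~{downtime_hours_per_year(redundant):.1f} h/year down)")
```

Under these assumptions the chain is always less available than its weakest tier, and adding redundancy to the least available tiers gives the largest improvement. That is one way to decide where limited resources are best spent.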

In this workshop, I would like to learn how to build a cost-effective, highly reliable, and highly available enterprise solution. I would also be interested in knowing which technologies are already available and which new ones are emerging in this space. It would also be interesting to share viewpoints on what to look for when selecting components and technologies to build a reliable application. Other questions worth discussing include how reliable a given system is, what to do in case of a disaster, and how to plan in advance for one and recover from it. How to test an application for high availability and reliability would be another interesting topic to cover.