Database infrastructures have existed for more than half a century. They have evolved over time in response to growing data volumes and the need to access that data faster while keeping the cost of maintaining the infrastructure reasonable. Today, data is growing at a rapid pace, faster than ever. After the internet boom of the late 1990s and early 2000s, the onset of mobile, social media and a multitude of sensing technologies in the mid to late 2000s has ensured that data volumes, once within control, are now growing at a rate that is almost impossible to tame. Unstructured data is growing more rapidly than the structured data that we know so well. Current estimates put the data created in the world at 2000 Exabytes by the end of this year (1 Exabyte = 1000 Petabytes = 1 million Terabytes). In contrast, the amount of available storage is only 1000 Exabytes, which is half of what is required to store and effectively use the data. This is truly the age of ‘Big Data’.
In the BI world, the challenge is not storage alone, but accessing this huge volume of data and analyzing it within a reasonable time. Traditionally, the storage and database infrastructure was also sourced from different vendors. Managing and maintaining these was a big headache and a considerable cost for enterprises. To solve this specific problem, we saw the arrival of ‘BI Appliances’.
So, what is a ‘BI Appliance’? Just as a consumer appliance like an oven or a refrigerator has a specific function, is low on maintenance and comes assembled and ready to use out of the box, a ‘BI Appliance’ is a database infrastructure that performs a specific function, is low maintenance and works out of the box. A BI Appliance is essentially a combination of hardware (database servers, storage, etc.) and software (database, OS, etc.) that is packaged together and available as a single unit, supported by a single vendor and designed for large-volume, high-performance data processing.
Teradata was the first such BI Appliance in the market, but due to its high costs of ownership, implementation and maintenance, few customers were willing to pay the price. But now, with the ever-increasing need for faster and more economical processing of huge datasets, a new breed of BI Appliances such as Netezza (IBM), Greenplum (EMC), DATAllegro (Microsoft), Exadata (Oracle), Aster Data (Teradata) and others has emerged in the market. They claim to be extremely cost effective (~$20K/TB) and yet deliver processing power (50x) that is unmatched by traditional database infrastructures.
But with so many vendors in the market, how do you choose the one that suits your organization? As with procuring software, we have to ‘evaluate’ these appliances against a set of ‘selection criteria’. And yes, evaluating a BI Appliance is an art and a science in itself. If you adopt the methodology prescribed below, you can be assured of a successful evaluation and implementation.
Planning Phase: Understand the business problems you are trying to solve. List them clearly, so that at any time during the decision-making process you can refer to this list and determine the action. Apart from these, you may be trying to meet several organizational goals and objectives (such as infrastructure preparedness for future growth, which is not an immediate business problem). Set up a governance structure: a set of people within your organization who are the stakeholders and decision makers. It is important to get buy-in from each of them and to know their roles in the decision-making process. Then comes the important step of ‘Feasibility and Market Analysis’, where you determine the market trends and understand the different vendors and products available and the key selling features of each. It is important to know features like MPP, columnar compression, Hadoop, MapReduce, FPGA, Flash Cache, etc., which will help determine the advantages of each product. After the internal analysis is complete, it is always advisable to invite a few selected vendors (about 5 to 8) to respond to a request for information (RFI). This should typically include categories like vendor background, technical support, architecture, integration with existing BI systems, customer references, pricing, etc. The responses received can be used for further shortlisting. Example below.
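One possible way to turn RFI responses into a shortlist is a simple weighted score. The sketch below is purely illustrative: the category names follow the RFI categories listed above, but the weights, the 0-5 scores, the vendor names and the helper names `weighted_score` and `shortlist` are all made-up assumptions, not figures from any real evaluation.

```python
# Hypothetical RFI scoring sketch. Categories follow the article;
# weights, scores and vendor names are placeholder assumptions.
RFI_WEIGHTS = {
    "vendor_background": 0.10,
    "technical_support": 0.15,
    "architecture": 0.25,
    "bi_integration": 0.20,
    "customer_references": 0.10,
    "pricing": 0.20,
}


def weighted_score(scores, weights=RFI_WEIGHTS):
    """Return the weighted average of per-category scores (0-5 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[cat] * w for cat, w in weights.items()) / total_weight


def shortlist(responses, top_n=3):
    """Rank vendors by weighted score (best first) and keep the top_n."""
    ranked = sorted(
        responses,
        key=lambda vendor: weighted_score(responses[vendor]),
        reverse=True,
    )
    return ranked[:top_n]


# Placeholder RFI responses, scored by the evaluation team.
responses = {
    "Vendor A": {"vendor_background": 4, "technical_support": 3,
                 "architecture": 5, "bi_integration": 4,
                 "customer_references": 3, "pricing": 2},
    "Vendor B": {"vendor_background": 3, "technical_support": 4,
                 "architecture": 3, "bi_integration": 3,
                 "customer_references": 4, "pricing": 5},
    "Vendor C": {"vendor_background": 2, "technical_support": 2,
                 "architecture": 3, "bi_integration": 2,
                 "customer_references": 2, "pricing": 3},
}

if __name__ == "__main__":
    for vendor in shortlist(responses, top_n=2):
        print(vendor, round(weighted_score(responses[vendor]), 2))
```

The weights are where the governance group's priorities show up, so it is worth agreeing on them before the responses arrive rather than after.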
Evaluation Phase: The actual evaluation ‘must’ always include a Proof-of-Concept (POC) to prove that the appliance actually does what it is marketed to do. The first step is to work on a list of evaluation scenarios. These may include attributes like performance, scalability, compatibility, migration effort and cost, administration effort, architecture, etc. The evaluation scenarios must be backed by specific thresholds that define the ‘Selection Criteria’. Before the POC is conducted, a pre-POC benchmarking exercise is absolutely essential. This pre-POC benchmarking must be done on the existing database infrastructure against the same set of evaluation scenarios. Now, conduct the POC for the shortlisted vendors while ensuring that the evaluation scenarios are consistent across all the vendors and that each vendor abides by the rules. It is worth mentioning that the system configurations used by each vendor for the POC must be similar to avoid any bias. While the POC results are being compared and analyzed internally within the organization, ensure that the results are not shared with competing vendors. This helps the organization leverage the results in cost negotiations.
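To make the link between the pre-POC benchmark and the selection criteria concrete, here is a minimal sketch, assuming each scenario is a timed workload and the criterion is a minimum speedup over the existing infrastructure. The scenario names, timings, thresholds and the `Scenario`/`evaluate` names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    baseline_seconds: float  # pre-POC timing on the existing infrastructure
    min_speedup: float       # selection criterion: required improvement factor


def evaluate(scenarios, poc_seconds):
    """Compare one vendor's POC timings against the pre-POC baseline.

    Returns {scenario name: (speedup, meets_criterion)}.
    """
    results = {}
    for s in scenarios:
        speedup = s.baseline_seconds / poc_seconds[s.name]
        results[s.name] = (speedup, speedup >= s.min_speedup)
    return results


# Placeholder scenarios; the same list is used for every vendor.
scenarios = [
    Scenario("nightly_load", baseline_seconds=7200.0, min_speedup=4.0),
    Scenario("adhoc_report", baseline_seconds=600.0, min_speedup=10.0),
]

# Placeholder POC timings reported for one hypothetical vendor.
vendor_x = {"nightly_load": 900.0, "adhoc_report": 75.0}
report = evaluate(scenarios, vendor_x)

if __name__ == "__main__":
    for name, (speedup, ok) in report.items():
        print(f"{name}: {speedup:.1f}x, {'pass' if ok else 'fail'}")
```

Because every vendor is run through the same `scenarios` list, the comparison stays consistent, and a vendor can be fast in absolute terms yet still miss a criterion that was fixed before the POC began.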
Costs must also be compared on the basis of Total Cost of Ownership (TCO) over a period of 5 years. Example below.
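A 5-year TCO comparison can be sketched as a one-time cost plus recurring annual costs. The cost components chosen here (purchase, migration, support, administration, power and space) and every dollar figure are assumptions for illustration only; a real comparison would use the line items from each vendor's quote.

```python
def five_year_tco(purchase, migration, annual_support, annual_admin,
                  annual_power_space, years=5):
    """Total cost of ownership: one-time costs plus recurring costs over `years`."""
    recurring = annual_support + annual_admin + annual_power_space
    return purchase + migration + years * recurring


def cost_per_tb(tco, usable_tb):
    """Normalize TCO by usable capacity so differently sized appliances compare fairly."""
    return tco / usable_tb


# Placeholder figures (USD) for one hypothetical vendor quote.
tco = five_year_tco(
    purchase=1_000_000,
    migration=200_000,
    annual_support=150_000,
    annual_admin=100_000,
    annual_power_space=50_000,
)

if __name__ == "__main__":
    print(f"5-year TCO: ${tco:,}")
    print(f"TCO per usable TB: ${cost_per_tb(tco, usable_tb=100):,.0f}")
```

Normalizing by usable terabytes makes it easier to set quotes for differently sized configurations side by side.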
Once the vendor is finalized (based on the selection criteria), the next step is to acquire funding, which may involve a funding approval process to justify the costs. Upon approval, the procurement and legal teams typically negotiate the contract terms before the purchase order is released.
Implementation Phase: Once the purchase order is released, it is important to start working closely with the vendor teams to plan the implementation of the BI appliance. Implementation planning includes allocating space within the data center, arranging power supply and seismic isolation (in areas of high seismic activity), coordinating with the network engineering and network operations teams to secure IPs for the various subsystems and connectivity to the enterprise network, working with DBAs to determine the database migration strategy, and defining any post-implementation support requirements. The implementation phase is among the most resource-intensive phases because of the coordination required among the various teams involved from the vendor and your internal organization.
Finally, the success of a BI Appliance implementation rests solely on the appliance being put to relevant use.
Author: Ashish Mirji (VP Consulting Services, Eviga Inc.)