Data Grid Platforms & Green IT
Written by martcon   
Monday, 12 October 2009 07:56

(Note: In computing, the term Data Grid can also refer to a UI Component. We will not be dealing with that type of Data Grid in this Blog.)

Data Grids are highly concurrent distributed data structures. They typically let you address a large amount of memory and store data so that it is quick to access. They also tend to feature low-latency retrieval and maintain enough copies of each item across the network to provide resilience to server failure. A Data Grid is in essence a data structure in which data is evenly distributed across the network. As you add servers (or nodes) to the network you add storage space. Because the grid itself spreads data across the nodes, separate Load Balancing policies are generally not needed.
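To make the idea of even distribution with backup copies concrete, here is a minimal sketch in Python. It is a toy, not how any particular platform works: the class name ToyDataGrid, the owners function and the node names are all made up for illustration, and placement uses a simple hash-modulo rule. Real platforms generally use more elaborate schemes (consistent hashing, for instance) so that adding a node moves only a fraction of the data.

```python
import hashlib


def owners(key, nodes, copies=2):
    """Return the nodes that hold `key`: one primary followed by backups."""
    ordered = sorted(nodes)
    # A stable hash (unlike the built-in hash()) gives the same placement on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    count = min(copies, len(ordered))
    # The backups are simply the next nodes in sorted order, wrapping around.
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


class ToyDataGrid:
    """In-process stand-in for a grid: each 'node' is just a dict."""

    def __init__(self, nodes, copies=2):
        self.copies = copies
        self.stores = {name: {} for name in nodes}

    def put(self, key, value):
        # Write the value to every owner so a copy survives one node failing.
        for name in owners(key, self.stores, self.copies):
            self.stores[name][key] = value

    def get(self, key):
        # Any owner that has the key can answer the lookup.
        for name in owners(key, self.stores, self.copies):
            if key in self.stores[name]:
                return self.stores[name][key]
        return None
```

With three hypothetical nodes, `ToyDataGrid(["node-a", "node-b", "node-c"]).put("meter-17", 42.5)` stores the reading on two of the three nodes, and a later `get("meter-17")` can be served by either of them.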

Data Grids and Grid Computing are concepts that are distinct from Cloud Computing, even though the terms frequently go hand in hand. Grid Computing in turn is also distinct from a Data Grid. Cloud Computing is effectively an evolution of Grid Computing where hardware and software resources and services are provided in the Cloud (Internet). Grid Computing divides a program into pieces that run across many computers, potentially several thousand.

A Data Grid, on the other hand, is essentially Grid Computing that is concerned with data. In other types of Grid Computing, if one piece of a program fails, pieces on other resources can also fail. A good Data Grid, by contrast, keeps backup copies of data across the network. This distributed data will be shared and managed within the Cloud.

A number of Data Grid Platforms have been created for storing and managing data across computer networks. JBoss, for example, has recently launched Infinispan (http://www.jboss.org/infinispan), an open source Data Grid platform written in Java. The Infinispan project gives a good overview not just of its own platform but also of Data Grids generally. As Infinispan points out, if a server fails the grid creates new copies of the lost data and places them on other servers. Data lookups are no longer directed to a single database server, which greatly alleviates a major bottleneck for most Enterprise Applications. Gigaspaces (http://www.gigaspaces.com/xap) also offers an In-Memory Data Grid in its eXtreme Application Platform (XAP) that partitions data and distributes it across large numbers of servers.
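The recovery behaviour described above, where lost copies are recreated on the remaining servers, can be sketched roughly as follows. This is our own simplified illustration and not Infinispan's or Gigaspaces' actual algorithm: the function name heal_after_failure, the node names and the "least loaded node first" rule are all assumptions made for the example.

```python
def heal_after_failure(stores, failed, copies=2):
    """Return the surviving nodes with every entry copied back up to `copies` replicas.

    `stores` maps a node name to the dict of entries that node holds.
    The input is not modified; entries held only by the failed node are lost.
    """
    survivors = {name: dict(data) for name, data in stores.items() if name != failed}
    all_keys = set()
    for data in survivors.values():
        all_keys.update(data)
    for key in sorted(all_keys):
        holders = [n for n in sorted(survivors) if key in survivors[n]]
        value = survivors[holders[0]][key]
        # Candidate targets: surviving nodes without the key, least loaded first.
        spare = sorted(
            (n for n in survivors if n not in holders),
            key=lambda n: (len(survivors[n]), n),
        )
        missing = max(0, copies - len(holders))
        for target in spare[:missing]:
            survivors[target][key] = value
    return survivors
```

For example, if node "a" held a copy of every entry and then fails, the entries that are left with a single copy on "b" or "c" are copied across so that each is held twice again.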

In the context of Green IT, what role can Data Grids play? With respect to Wireless Sensor Networks (WSNs), it is predicted that millions of sensors will be deployed in mesh networks in years to come, in diverse applications including security, agriculture and energy management. All of these devices emit data readings that need to be stored and translated into information that can be used for decision making, which can result in millions of data sets being stored every minute. Given the potentially dispersed deployment of sensors throughout a wide area, capturing and storing this data becomes an issue. Data Grids can provide a solution here, whereby sensor data readers or the sensors themselves relay data to the nearest server on a Data Grid. A Data Grid would also avoid bottlenecks in analysing this data and retrieving it for Business Intelligence purposes.
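As a rough illustration of relaying a reading to the nearest server, the sketch below picks the closest grid node by great-circle (haversine) distance. The server names and coordinates are invented for the example, and a real deployment would more likely choose a node by network topology or latency than by geography alone.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_server(sensor_position, servers):
    """Return the name of the server closest to the sensor.

    `servers` maps a server name to its (latitude, longitude).
    """
    return min(servers, key=lambda name: haversine_km(sensor_position, servers[name]))


# Hypothetical grid nodes and sensor location, for illustration only.
servers = {"grid-east": (53.35, -6.26), "grid-south": (51.90, -8.47)}
print(nearest_server((53.0, -6.5), servers))
```

The sensor in this example sits much closer to the first node, so its readings would be relayed there.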

Smart Grids and the meters therein could also benefit from the provision of a Data Grid. In current field trials of Smart Meters, data is relayed back to a central server every half hour. Smart Meters will ultimately be deployed in every household and business premises in the vast majority of industrialised countries, so the issues of data volume and system bottlenecks that apply to WSNs are also relevant here. The issue of data management for Smart Grids is further exacerbated when we consider that Smart Meters will also be deployed for water and natural gas consumption as well as electricity. The other element we need to consider is the securing of Smart Meters and the data they emit. The encryption or signing of data will be an additional load that could be dealt with within a Data Grid rather than on a centralised server.
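To give a feel for the signing work involved, here is a minimal sketch that signs and verifies a meter reading with an HMAC from the Python standard library. The field names (meter_id, kwh) and the shared key are made up for the example. A production system would also need key distribution and rotation, and might well use public-key signatures instead.

```python
import hashlib
import hmac
import json


def sign_reading(reading, key):
    """Serialise a reading deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(reading, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode("utf-8"), "tag": tag}


def verify_reading(message, key):
    """Recompute the tag over the payload and compare it in constant time."""
    expected = hmac.new(key, message["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])
```

Each grid node that receives a reading could run the verification step itself, so the check is spread across the grid along with the data rather than queued on one central server.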

For Wind Turbines, the case for Data Grids initially seems less clear. Wind Farms can certainly benefit from Cloud Computing: for example, the optimisation of a Wind Farm using simulation services in the Cloud was recently demonstrated (see http://www.computerweekly.com/Articles/2009/07/21/236976/cloud-based-simulation-cuts-engineers-design-costs.htm). However, different Wind Turbine manufacturers have different ways of transmitting data and often provide different raw data for what is essentially the same metric, so this data will need to be captured and processed before it becomes meaningful information. Given the processing load, one possible solution could be to distribute data emitted from Wind Farms throughout a Data Grid on a per farm or per manufacturer basis. Furthermore, given that the energy and availability metrics determine a Wind Farm's revenue, this data will need to be retrieved in a timely fashion. This will be easier to do with a Data Grid than with a centralised server.
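The normalisation step can be pictured with a small sketch like the one below. The two vendor formats are entirely hypothetical (we have invented the field names and units to show the problem), as are the function names. One adapter per manufacturer maps the raw data to a common record, and the records are then grouped per farm, which is the unit we suggest distributing across the grid.

```python
def from_vendor_a(raw):
    # Hypothetical format: power is already reported in kW, as text.
    return {"farm": raw["farm"], "turbine": raw["id"], "power_kw": float(raw["power_kw"])}


def from_vendor_b(raw):
    # Hypothetical format: different field names, power reported in watts.
    return {"farm": raw["site"], "turbine": raw["unit"], "power_kw": raw["activePowerW"] / 1000.0}


# One adapter per manufacturer; adding a manufacturer means adding one entry.
ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalise(vendor, raw):
    """Convert a manufacturer-specific reading into the common record."""
    return ADAPTERS[vendor](raw)


def partition_by_farm(records):
    """Group normalised records by farm, the unit of distribution in this sketch."""
    by_farm = {}
    for record in records:
        by_farm.setdefault(record["farm"], []).append(record)
    return by_farm
```

Once readings are in a common form, the same power figure from two manufacturers compares directly, and each farm's group of records can be placed on and retrieved from its own part of the grid.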

Vertoda is examining Data Grids and their use in our framework. We are researching the development of a Cloud Edition of our framework and, given the volume of data that the networked devices we manage can emit, a Data Grid will certainly play a major role.

Last Updated ( Wednesday, 14 October 2009 10:41 )
 