Grid computing is a critical shift in thinking about how to maximize the value of computing resources. The technology is still fairly nascent, but here at the developerWorks grid computing zone, we're publishing a steady stream of new articles, tutorials, resources, and tools to bring developers up to speed on this important cutting-edge technology.
Introduction to grid computing
Many visitors interested in grid computing are asking some very basic questions:
* Where do we start?
* What do we do with all of this stuff?
* How do the pieces fit together?
* Does what I want to do fit in grid computing?
* Are there existing grids I can join?
This is your guide to start learning about the exciting benefits grid computing can offer. Here, we highlight the basics of grid computing in their proper context, and we tie together relevant developerWorks articles, tutorials and tips, IBM® learning services education programs, workshops, and IBM products for further investigation. We place information about grid computing into an intuitive framework, tying the pieces together and highlighting the important details.
If it appears that the grid story isn't quite finished yet, you're right. Grid technology is evolving rapidly. Grid computing has its roots in academia and has been steadily moving toward commercial adoption. Standards, frameworks, implementations, and applications are changing constantly. The state of grid computing today might remind you of the early days of the Web, or even of the emergence of XML and Web services, where things began slowly. But much like those technology areas, once solid standards and tools appear and coalesce, we predict there will be tremendous interest and growth in grid computing. We offer this guide so you can get in on the ground floor.
What is grid computing?
Because it is an emerging technology, grid computing can mean different things to different people. But here is a simple, serviceable definition for the concept of grid computing:
Grid computing allows you to unite pools of servers, storage systems, and networks into a single large system so you can deliver the power of multiple-systems resources to a single user point for a specific purpose. To a user, a data file, or an application, the system appears to be a single enormous virtual computing system.
Grid computing is the next logical step in distributed networking. Just as the Internet allows users to share ideas and files as the seeds of projects, grid computing lets us share the resources of disparate computer systems so people can actually start working on those projects. Grid computing takes the ability for computers (and their users) to communicate a step further: With grid computing, you can reach out and use computational or storage resources on machines other than your own.
With grid computing, an organization can transform its distributed and difficult-to-manage systems into a large virtual computer that can be set loose on problems and processes too complex for a single computer to handle efficiently. The problems to be solved can involve data processing, network bandwidth, or data storage. The systems linked in a grid might be in the same room or distributed around the world. They might be running different operating systems on many hardware platforms. They might even be owned by different organizations. Regardless of the depth of a grid's resources, all the grid user experiences is the processing resources of a very large virtual computer.
The major purpose of a grid is to virtualize resources to solve problems. The main resources grid computing is designed to give access to include, but are not limited to:
* Computing/processing power
* Data storage/networked file systems
* Communications and bandwidth
* Application software
Since the concept of putting grids into real-world practice is still relatively new, another good way to describe a grid is to describe what it isn't. The following are not grids:
* Clusters
* Network-attached storage devices
* Scientific instruments
* Networks
Each might be an important component of a grid, but by itself, doesn't constitute a grid. Being able to tie together several million computers (clusters, workstations, desktop PCs, supercomputers) with data storage, instruments, and visualization devices is the dream grid computing strives to achieve. If that dream is realized, grid computing could be revolutionary to science and industry.
So, what does it take to make the vision of the grid computing concept a reality? It requires standard, seamless, open, general-purpose protocols and interfaces, all of which are being defined now and are similar to those that enable access to information from the Web.
Why is grid computing important?
Grid computing is about getting computers to work together. Almost every organization is sitting atop enormous unused computing capacity that is widely distributed. UNIX® servers are actually "serving" something less than 10 percent of the time. And most PCs do nothing for 95 percent of a typical day. Imagine an airline with 90 percent of its fleet on the ground, an automaker with 40 percent of its assembly plants idle, a hotel chain with 95 percent of its rooms unoccupied.
Virtualization of the computing environment -- or grid computing -- is a key component of the IBM on demand strategy. Virtualization allows organizations to:
* Use otherwise idle computer resources to accelerate business processes.
* Speed applications so that processing time decreases, driving faster time to market.
* Enable the development of new and more productive applications.
* Drive down the costs of developing new applications.
* Increase collaboration and productivity capabilities.
* Maximize the resources available to users.
* Increase the resiliency and utilization of the IT environment.
Administrators and developers benefit from grid computing because it allows them to:
* Optimize the infrastructure to balance workloads and provide extra capacity for high-demand applications.
* Improve access to data, and support collaboration across disciplines, organizations, and businesses.
* Provide a more resilient infrastructure.
Businesses benefit from grid computing because it allows them to:
* Increase productivity by providing users the resources they need on demand.
* Use existing resources more efficiently.
* Respond quickly to changing business and market demands.
* Enable collaboration among dispersed entities.
* Create virtual organizations that can share resources and data.
One of the most important issues grid computing addresses for businesses is the utilization of existing resources. Companies have made significant investments in computing capacity, but much of it sits idle up to 90 percent of the time. Grid computing can help these businesses connect those under-utilized assets, harness their collective power, and manage them like a single large computer.
What can I do with grid computing?
The concept of grid computing sprang from the research and academic communities, much like that of the Internet. Businesses have recently started to catch on to the benefits grid computing can provide, including new types of financial and business models such as the following:
* In the financial services industry, grid computing can be used to speed trade transactions, crunch huge volumes of data, and provide a more stable IT environment in a mission-critical environment that doesn't tolerate much downtime.
* Government agencies can use grids to pool, secure, and integrate vast stockpiles of data. Many civilian and military agencies need the capabilities of cross-agency collaboration, data integrity and security, and lightning-fast information access across thousands of data repositories.
* Companies involved in the life sciences, such as those that do genome research and pharmaceutical development, can use parallel and grid computing to process, cleanse, cross-tabulate, and compare massive amounts of data. Faster processing means getting to market faster, and in those industries, a slight edge can be the deciding factor.
Not only can these new grid-oriented business models be implemented, some already have.
What are the key components to grid computing?
There are six major components to grid computing:
1. Security
2. User interface
3. Workload management
4. Scheduler
5. Data management
6. Resource management
Computers on a grid are networked and running applications. They can also be handling sensitive or extremely valuable data, so the security component of grid computing is of paramount concern. This component includes elements such as encryption, authentication, and authorization.
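To make the difference between authentication and authorization concrete, here is a minimal Python sketch. It assumes a shared-secret scheme: a request is authenticated with an HMAC tag and then authorized against a permission table. The names SHARED_KEYS, PERMISSIONS, sign_request, and is_allowed are hypothetical and do not come from any grid toolkit, and a real grid would more likely rely on certificates than on shared secrets.

```python
import hashlib
import hmac

# Hypothetical shared secrets and permission table, for illustration only.
SHARED_KEYS = {"alice": b"alice-secret-key"}
PERMISSIONS = {"alice": {"submit_job", "read_results"}}


def sign_request(user: str, message: bytes) -> str:
    """Return a hex HMAC-SHA256 tag the user attaches to a request."""
    return hmac.new(SHARED_KEYS[user], message, hashlib.sha256).hexdigest()


def authenticate(user: str, message: bytes, tag: str) -> bool:
    """Authentication: does the tag prove the sender holds the user's key?"""
    key = SHARED_KEYS.get(user)
    if key is None:
        return False
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


def authorize(user: str, action: str) -> bool:
    """Authorization: is this user allowed to perform the action?"""
    return action in PERMISSIONS.get(user, set())


def is_allowed(user: str, action: str, message: bytes, tag: str) -> bool:
    # A request must pass both checks before the grid acts on it.
    return authenticate(user, message, tag) and authorize(user, action)
```

Because the tag is computed over the message, a request that is altered in transit fails authentication, which also gives a basic form of message integrity.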
Accessing information on the grid is also quite important, and the user interface component handles this task for the user. It often comes in one of two ways:
* An interface provided by an application that the user is running
* An interface provided by the grid administrator, much like a Web portal that provides access to the applications and resources available on the grid in a single virtual space
The portal-style interface is also important because it can be the help space for users to learn how to query the grid.
Applications a user wants to run on a grid must be aware of the resources available. This is where a workload management service comes in handy. An application can communicate with the workload manager to discover the available resources and their status.
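As a rough illustration of resource discovery, the following Python sketch models a workload manager as an in-memory directory that nodes report to and applications query. The Resource and WorkloadManager classes, their fields, and the status values are assumptions made for this example, not the interface of any actual grid product.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cpus_free: int
    status: str  # "available", "busy", or "offline" (hypothetical states)


class WorkloadManager:
    """Toy directory of grid resources that applications can query."""

    def __init__(self) -> None:
        self._resources = {}

    def report(self, resource: Resource) -> None:
        # Each node reports (or updates) its own state under its name.
        self._resources[resource.name] = resource

    def find(self, min_cpus: int) -> list:
        # Return available resources with enough free CPUs, largest first.
        matches = [
            r for r in self._resources.values()
            if r.status == "available" and r.cpus_free >= min_cpus
        ]
        return sorted(matches, key=lambda r: r.cpus_free, reverse=True)
```

An application would call find() with its requirements and pick from the returned candidates; a busy or offline node never appears in the result, whatever its capacity.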
A scheduler is needed to locate the computers on which to run an application and to assign the jobs required. This can be as simple as taking the next available resource, but this task often involves prioritizing job queues, managing the load, finding workarounds when encountering reserved resources, and monitoring progress.
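The scheduling behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: jobs carry a numeric priority, reserved nodes are skipped, and each node runs one job at a time. The Scheduler class and its method names are hypothetical and only illustrate the idea.

```python
import heapq
import itertools


class Scheduler:
    """Toy scheduler: highest-priority job goes to the next unreserved node."""

    def __init__(self, nodes, reserved=()):
        blocked = set(reserved)
        # Work around reserved nodes by leaving them out of the free list.
        self._free = [n for n in nodes if n not in blocked]
        self._queue = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, job, priority):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._queue, (-priority, next(self._counter), job))

    def dispatch(self):
        """Assign queued jobs to free nodes; return a list of (job, node)."""
        assignments = []
        while self._queue and self._free:
            _, _, job = heapq.heappop(self._queue)
            assignments.append((job, self._free.pop(0)))
        return assignments

    def release(self, node):
        # A node that finished its job becomes available again.
        self._free.append(node)
```

Jobs that cannot be placed stay queued until release() frees a node, which is the simplest possible form of load management.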
If an application is running on a system that doesn't hold the data the application needs, a secure, reliable data management facility takes care of moving that data to the right place across various machines, encountering various protocols.
To handle such core tasks as launching jobs with specific resources, monitoring the status of those jobs, and retrieving results, a resource management facility is necessary.
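The launch, monitor, and retrieve cycle can be illustrated with Python's standard concurrent.futures module. In this sketch a local thread pool stands in for remote grid nodes, which is an assumption made purely for illustration; simulate and run_jobs are hypothetical names, and a real resource manager would dispatch work across machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(job_id: int) -> int:
    # Stand-in for real work performed on a grid node.
    return job_id * job_id


def run_jobs(job_ids, workers=2):
    """Launch jobs, wait for them, and collect results keyed by job id."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Launch: submit() returns a Future handle for each job.
        futures = {job_id: pool.submit(simulate, job_id) for job_id in job_ids}
        # Retrieve: result() blocks until that particular job has finished.
        results = {job_id: f.result() for job_id, f in futures.items()}
    # Monitor: once the pool has shut down, every Future reports done().
    assert all(f.done() for f in futures.values())
    return results
```

The Future objects play the role of job handles: the caller keeps one per job and uses it to check status or fetch the result.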
It's important to remember that grid computing doesn't operate in a vacuum. Just the opposite: It potentially involves every protocol and computer technology in operation today. With that in mind, we've provided links to other technologies and standards you might need to understand to fully appreciate the scope of grid computing's power.
What standards are associated with grid computing?
To better understand the evolving standards for grid computing, you need to understand how the grid architecture is defined. That definition comes from the Open Grid Services Architecture (OGSA), developed by members of the Global Grid Forum (GGF), now known as the Open Grid Forum (OGF).
The architecture -- OGSA defines what grid services are, and the overall structure and services to be provided in grid environments. Building on existing Web services standards, OGSA defines a grid service as a Web service that conforms to a particular set of conventions. For example, grid services are defined in terms of standard Web Services Description Language (WSDL) with minor extensions.
Why is this important? Because it gives us a common and open standards-based set of techniques to access various grid services using existing standards, such as SOAP, XML, and WS-Security. With this base, we can add and integrate additional services (such as life-cycle management) in a seamless manner. It provides a standard method to find, identify, and utilize new grid services as they become available.
And as an added benefit, OGSA will provide for interoperability between grids that might have been built using different underlying tools.
The specifications -- Grid specifications are evolving. Working groups in organizations like GGF and OASIS are busy defining an array of grid standards in areas like:
* Applications and programming models
* Architecture
* Data management
* Security
* Performance
* Scheduling and resource management
As you read through many of the standards, you will probably see references to the Open Grid Services Infrastructure (OGSI), which was published by GGF as a formal specification and infrastructure layer for OGSA. But OGSI is now obsolete and has been superseded by the Web Services Resource Framework (WSRF). The goal of WSRF is to evolve the grid architecture in a way that's more clearly aligned with the general evolution of Web services. Instead of defining a new type of grid service, these specifications will allow the services specified in the OGSA to be based completely on standard Web services.
How much do you need to know about the evolving grid standards? It depends. IBM® and other industry leaders, plus researchers and representatives from many grid software vendors, are actively involved in the work to define grid standards. Are you a corporate software developer? If so, you'll use the grid tools and products that will be based on the new standards as they unfold. You'll want to know about the standards and be generally aware of the work that's going on.
Can I build a grid today?
Sure. You can use both open source and vendors' proprietary tools and products to build a grid right this minute. Over time, as the grid standards solidify, you can expect vendors to enable their tools to comply with the new standards, making it easier to combine components that will work together.
What technologies are fundamental for building grids? Services are essential in grid computing. The services include:
* Job scheduling
* Data and storage management
* Data queries
* Processor requests
* Workload balancing
* Workflow management
* Bandwidth allocation
These services are called grid services. Some computers host grid services, and other computers run applications that contract grid services as clients. Grid services are essentially Web services with additional functionality, so Web services are an essential concept for understanding grid computing today.
Web services -- groups of application functions that can be invoked over a network -- allow applications to communicate with each other, regardless of the platforms or programming languages involved.
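To show why Web services are independent of platform and language, here is a small Python sketch that builds and parses a SOAP 1.1 request using only the standard library. The SOAP envelope namespace is the standard one; the service namespace http://example.com/grid, the SubmitJob operation, and its parameters are hypothetical and invented for this example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
SERVICE_NS = "http://example.com/grid"  # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Build a SOAP 1.1 envelope that calls `operation` with `params`."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(xml_bytes: bytes):
    """Recover the operation name and parameters on the receiving side."""
    envelope = ET.fromstring(xml_bytes)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]

    def strip(tag: str) -> str:
        # ElementTree tags look like "{namespace}name"; keep only the name.
        return tag.split("}", 1)[1]

    return strip(call.tag), {strip(child.tag): child.text for child in call}
```

The message on the wire is plain XML, so the sender and the receiver could be written in different languages and run on different operating systems. Note that values arrive as text; typing them is the job of the service description.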
To build a grid, you need tools. Grid tools fall into these general categories:
* Infrastructure components include file systems, schedulers and resource managers, messaging systems, security applications, certificate authorities, and file-transfer mechanisms, such as GridFTP.
* Systems on a grid must be able to discover what services are available to them. They must be able to define (and monitor) a grid's topology in order to share and collaborate. To do this, there are grid directory services implementations based on such existing successful models as LDAP, DNS, network management protocols, and indexing services.
* One of the main benefits of a grid is the ability to maximize efficiency. This is done through schedulers and load balancers. Schedulers ensure that jobs are completed in some order (such as priority, deadline, urgency, etc.), and load balancers distribute tasks and data management across systems to decrease the chance of bottlenecks.
* Developer tools for grid developers focus on different niches (file transfer, communications, environment control) and range from utilities to full-blown APIs.
* Security in a grid environment can mean authentication and authorization -- controlling who or what can access a grid's resources -- but it can also mean such crucial issues as message integrity and confidentiality.
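The load-balancing idea from the list above can be sketched as a greedy placement: each task goes to whichever node currently carries the least load. This is one simple strategy chosen for illustration, not the method of any particular grid product, and the balance function and its task format of (name, cost) pairs are assumptions.

```python
import heapq


def balance(tasks, nodes):
    """Assign each (name, cost) task to the currently least-loaded node."""
    heap = [(0, node) for node in nodes]  # entries are (current load, node)
    heapq.heapify(heap)
    placement = {node: [] for node in nodes}
    # Placing the most expensive tasks first tends to even out the loads.
    for name, cost in sorted(tasks, key=lambda t: t[1], reverse=True):
        load, node = heapq.heappop(heap)
        placement[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return placement
```

With tasks costing 5, 3, 3, and 2 units on two nodes, this approach yields loads of 7 and 6 rather than piling work onto one machine, which is how a balancer reduces the chance of bottlenecks.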
To build a grid today, a good place to start is to download the Globus Toolkit. Developed by The Globus Project, a research and development project that focuses on enabling the application of grid concepts to scientific and engineering computing, the toolkit is a set of services and software libraries designed to support grids and grid applications.
Built on top of the basic grid computing concepts are Commodity Grid Kits (CoG), which provide access to grid services through a particular framework, including Java™ technology and Python.
How do I enable my applications for grid?
It takes some planning.
Start by considering the basic structure of your grid and the services it provides. You have to understand how the infrastructure components fit together, including security, resource management, information services, and data management, which can affect the application architecture, design, and deployment.
Are there existing production grids I can join?
Yes, but it depends on what kind of applications you have and what you intend to do on the grid. Joining an existing grid can be extremely useful, but it is helpful to prepare your application ahead of time and have realistic ideas as to what you want to accomplish. The best thing to do is familiarize yourself with the current production grids out there and see if they will meet your needs.
What can I do next?
As you can see, building a grid and porting applications to the grid is a complicated process. There are many components that need to be dealt with -- including resource management and data management -- when putting a grid together. The next thing you can do is explore building higher-level services for grid users by simplifying and consolidating access. One example of these higher-level services is a Web portal, which offers a simplified, clear, and concise Web interface to complicated grid computing functionality.
FULL ARTICLE WITH EXTERNAL RESOURCES/LINKS from IBM Developerworks.