In cloud computing, resources such as processing and storage are offered for public use. Cloud computing adopts a virtualized, dynamically scalable paradigm for organizing, distributing, and using computing resources. In contrast to the traditional paradigm, which relies on desktop resources, computing resources provided in clouds are part of social infrastructure and may have a profound impact on information technologies and their applications. The development of cloud computing is changing software engineering, configuration of resources in network core and terminals, and acquisition of information and knowledge[1].
Since the concept was proposed in 2007, cloud computing has been promoted by academia and industry and has been transitioning from theory into practice. However, the development of cloud computing technologies and their widespread implementation will be a long-term process because of the profound impact that cloud computing services will bring to the public. Important topics such as the technical basis, service models, and commercial operation of cloud computing have already been widely discussed[2-7].
This article analyzes views on resource virtualization, the differences between grid computing and cloud computing, the relationship between high-performance computers and cloud computing centers, and the security and standardization of cloud computing.
1 Virtualization of Computing Resources
Wikipedia defines virtualization as the "abstraction of computing resources"[8]. Virtualization technologies emerged early in the history of computing. An operating system, for example, weakens the dependence of software application environments on hardware platforms, and can even isolate the two completely. Middleware likewise weakens the dependence of application software on the underlying software environment. Both are examples of virtualization technology. With the shift from stand-alone computers to the Internet, virtualization technologies gave birth to cloud computing. The structure and implementation details of a Web-based email system are virtualized when large volumes of email are sent, received, and managed through browsers. Searching and matching are virtualized when a search engine responds to a personalized search request. When network albums are used for storing and sharing pictures, the dynamic administration of the storage center is virtualized. Online transaction and payment systems are virtualized in the same way.
Computing resources as virtualized objects can be classified by computing capability, storage capability, and interaction capability. These classes correspond to the CPU, storage, and input/output of a traditional computer. Because of resource virtualization, computing is no longer necessarily the dominant element. If computing is dominant, the core facility is a computing center; if storage is dominant, the core facility is a storage center. Interaction can also be the dominant element, with computing and storage in supporting roles. Virtualization services on the Internet allow very large numbers of users to make highly personalized demands through natural interaction. Users need not consider the specific service model of the application software or whether the service is simultaneously being leased by others. Neither the operating system of the computing platform nor the physical configuration and administration of bottom-layer resources, such as the software environment, enters into consideration. The geographic location of the computing center is also irrelevant to users. Virtualization services such as Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) provide dynamic and scalable organization, distribution, and utilization of computing resources.
Looking back at how resource configuration on the Internet has evolved, as shown in Figure 1, helps in understanding the development of virtualization services. The earliest servers used the Client/Server (C/S) structure, which was followed by the Browser/Server (B/S) structure. Clients became thinner, and applications could be accessed directly through the abstraction of the browser. Servers then emerged that could perform diverse tasks, and many institutions built servers to host dedicated applications such as email, data, security, and video. This led to a proliferation of diverse servers. The large number of servers prompted the rise of server hosting, which relieved owners of the cost of maintaining servers. However, hosting did not solve the problem of using servers intensively. If virtualized services could be offered and servers turned into "services," the various kinds of computing resources could be better integrated.
The evolution of virtualized services shows that a service approach to offering computing resources was inevitable for cloud computing. This transition is similar in nature to the shift of traditional manufacturing during the Industrial Revolution from mass production to intensive, scalable, and specialized production. Today, the information industry is also becoming intensive, scalable, and specialized. Virtualization of computing resources allows resources to be configured and used rationally. The actual utilization of a single server is only around 15%, whereas utilization of a server cluster can reach more than 80%. This has direct implications for energy saving, carbon emission reduction, and green computing[9].
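The utilization figures quoted above imply a rough consolidation ratio. The following back-of-envelope sketch assumes identical servers and a load that can be pooled freely; both are simplifications that the article does not make, and the function name `servers_needed` is hypothetical.

```python
import math


def servers_needed(standalone_servers: int,
                   standalone_utilization: float = 0.15,
                   cluster_utilization: float = 0.80) -> int:
    """Estimate how many pooled servers could carry the same total load.

    Assumes identical machines and perfectly poolable load (illustrative only).
    """
    total_load = standalone_servers * standalone_utilization
    # Round before taking the ceiling so float noise cannot add a spurious server.
    return math.ceil(round(total_load / cluster_utilization, 9))


if __name__ == "__main__":
    print(servers_needed(100))  # 19
```

Under these assumptions, the load of 100 stand-alone servers at 15% utilization fits on 19 clustered servers at 80% utilization, which is where the energy-saving argument comes from.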
2 Differences between Cloud Computing and Grid Computing
Before the emergence of cloud computing, grid computing[10] had been studied for more than 10 years and had attracted widespread attention. In the first two years of cloud computing, general opinion held that cloud computing was hot in the industrial field but cold in the academic field. For grid computing, the boot was on the other foot. What is the main difference between cloud computing and grid computing? Generally, in a grid system multiple computers are networked into a grid to offer specific large-scale computing services—"multiple for one." In cloud computing, intensive and specialized Internet platforms are used to offer scalable services—"one for multiple."
Ian Foster, the father of grid computing, defined it as "a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities"[10]. Relying on private networks or the Internet, grid computing organizes idle computer resources that are scattered across different areas and have different capabilities, and forms them into a virtual supercomputer through unified scheduling. In this way, complex tasks, such as scientific tasks that require handling and processing massive amounts of data, can be completed. The basic application scenario of grid computing involves integrating computing resources belonging to different people in different locations in order to gain stronger computing capability.
Cloud computing tends to turn the powerful computing resources of certain Internet nodes into dynamic and scalable virtual resources, which are then provided as services to very large numbers of users. Cloud computing provides user-driven, on-demand, pay-per-use services that can be released when they are no longer needed. Its basic application scenario is oriented to the Internet: a centralized pool of computing resources is used to meet the massive and scattered demands of end users.
The biggest similarities between grid and cloud computing are resource sharing and virtual computing. Both of these provide users with shared Internet resources through virtualization so that resources are reasonably utilized.
However, they are different in some respects:
(1) Cloud computing is based on cluster computing; nodes in a cluster are self-governing and oriented to different service objects. Grid computing is based on parallel computing; computers in different areas are networked, and a unified scheduling system distributes tasks to different nodes for handling.
(2) Cloud computing accepts heterogeneity in the principle, scale, and capability of nodes. Inter-node resource sharing is achieved through the interoperability of services. Grid computing uses middleware to shield upper-layer applications from heterogeneous systems, presenting users with a uniform environment for resource sharing.
(3) Cloud computing is oriented towards persistent and diverse services, and the many cloud computing centers on the Internet usually offer diverse services for different application fields. Grid computing is primarily used for completing one-off, specific tasks that are set in advance.
(4) Cloud computing is operated commercially, providing best-effort multi-tenant services that are rented for agreed terms or charged on a pay-per-use basis. Grid computing relies on coordinated operation between organizations. Bandwidth and performance are guaranteed, but there is no obvious commercial model.
(5) Cloud computing is designed to meet the demands of mass numbers of users. People interact and communicate with one another and are involved in computing processes. Semantic handling and uncertainty processing are therefore required. Grid computing is oriented to scientific computing; program input/output is well-defined according to specifications and instructions, and people are not generally part of computing processes.
In short, cloud computing and grid computing are applied in different scenarios and have different goals.
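The "multiple for one" and "one for multiple" contrast drawn in this section can be sketched in a few lines. This is a toy illustration only: threads stand in for networked nodes, summing integers stands in for real work, and the names `grid_run` and `cloud_serve` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def grid_run(job: list, node_count: int) -> int:
    """'Multiple for one': a unified scheduler splits ONE job across many nodes."""
    # Deal the job's items out to the nodes round-robin.
    chunks = [job[i::node_count] for i in range(node_count)]
    with ThreadPoolExecutor(max_workers=node_count) as nodes:
        partial_sums = list(nodes.map(sum, chunks))
    # Merge the partial results into the single answer the job asked for.
    return sum(partial_sums)


def cloud_serve(requests: dict, pool_size: int) -> dict:
    """'One for multiple': one shared pool answers MANY independent requests."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = {user: pool.submit(sum, data) for user, data in requests.items()}
        # Each user gets an independent result; nothing is merged across users.
        return {user: future.result() for user, future in futures.items()}


if __name__ == "__main__":
    print(grid_run(list(range(1, 101)), node_count=4))
    print(cloud_serve({"alice": [1, 2, 3], "bob": [10, 20]}, pool_size=2))
```

In the grid case the nodes cooperate on one pre-set task and their outputs are combined; in the cloud case the requests never depend on each other, which is why loosely coupled commodity machines suffice.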
3 Performance of a Cloud Computing Center
A cloud computing center is based on cluster computing. The large number of nodes at a cloud computing center are interoperable and form user-oriented virtual servers. Many institutions have purchased high-performance computers and established computing centers. However, can these high-performance computers be used for cloud computing centers? Is a cloud computing center the same as a high-performance computing center? What is the relationship between high-performance computers and virtual servers of a cloud computing center?
If the largest, most intensive and specialized cloud computing centers (owned by Google, Amazon, and Salesforce) are considered, none of the world’s ten most powerful computers have been used in the formation of their service clusters. It is thought that Google’s computing center consists of more than 450,000 ordinary computers located in at least 25 places, while the computing centers of Amazon and Salesforce may be running on cluster systems of 100,000 and 1000 ordinary computers respectively[11]. Since cloud computing aims to meet independent demands of mass numbers of users, dependence and intercrossing between server cluster tasks when responding to different requests is greatly reduced. Such loosely coupled tasks even enable computers at the cloud computing center to be fixed "to tall metal racks with Velcro, making it easy to swap them out should they fail[12]." Through cooperation between clusters, a search task involving billions of microprocessor events and the reading of hundreds of megabytes of data can still be completed in several tenths of a second.
High-performance computers are mainly used in science. The world’s ten most powerful computers in 2009 were mostly deployed in scientific research institutes, universities, and national institutions. These include the Oak Ridge National Laboratory, Tennessee; the Argonne National Laboratory of the US Department of Energy (DOE); the US National Institute for Computational Sciences; Forschungszentrum Jülich (FZJ), Germany; and the National Supercomputer Center in Tianjin, China. These high-performance computers are applied in energy, manufacturing, weather forecasting, nuclear research, hydromechanics, and astronomy[13]. The XT5 (Jaguar) at Oak Ridge is currently the most powerful computer in the world, with a performance of 1.75 PFlops in the Linpack test. It has nearly 250,000 computing cores and a theoretical peak computing rate of up to 2.3 PFlops. An important goal of high-performance computers is to improve computing speed in order to achieve higher Linpack results.
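The two Jaguar figures quoted above can be combined into a Linpack efficiency, the ratio of measured to theoretical peak performance. The short calculation below uses only the numbers given in the text; the function name is hypothetical.

```python
def linpack_efficiency(measured_pflops: float, peak_pflops: float) -> float:
    """Fraction of the theoretical peak rate achieved in the Linpack test."""
    return measured_pflops / peak_pflops


if __name__ == "__main__":
    jaguar = linpack_efficiency(1.75, 2.3)
    print(f"{jaguar:.0%}")  # 76%
```

An efficiency of roughly three quarters on a tightly coupled benchmark is the kind of parameter a supercomputer is tuned for, in contrast to the per-request responsiveness that matters in a cloud computing center.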
Services provided by a cloud computing center are oriented towards the diverse applications of very large numbers of users, such as large-scale search, network storage, and online business. Cloud computing centers should therefore be capable of providing high-quality service environments for tens of millions of applications and of adapting effectively to user demands and service innovation. Unlike supercomputing centers, cloud computing centers do not follow the traditional task-oriented, single-computation model but provide service-oriented, scalable, and specialized services. High-performance computers deployed in high-performance computing centers are therefore appropriate for scientific problems requiring highly parallel computing but are not necessarily suitable for cloud computing.
4 Cloud Security
Resource sharing within cloud computing centers raises many security issues. Cloud computing is not a new weapon designed to solve security issues. It is a computing paradigm based on the Internet and shares some of the security issues of current information systems, including viruses, malicious attacks, and information disclosure while cloud services are being offered. Current information security technologies may therefore also be used to secure cloud computing centers while additional security measures are developed for cloud computing. Scalable, intensive, and specialized services change the situation in which information resources are scattered across end-user devices. The development of Security as a Service (SECaaS) is expected to improve Internet security. Intensive and specialized security services may be implemented in cloud computing centers, possibly eliminating the need for end users to install patches and remove viruses. Backup may likewise be offered as a dedicated cloud service.
The security focus in cloud services will therefore gradually turn to trust management. Traditional information security will develop into trust management between service providers and users. A bank deposit is a useful analogy for the trust relationship between users and cloud service centers. In the past, many people thought it was safest to hide their money in secret places. With the development of banks, however, people signed service contracts to have their wealth safely stored and managed. Sensitive information is similar to wealth. In the age of stand-alone computers and the early days of the Internet, leaving sensitive data in network service centers was risky because the centers lacked the management, mechanisms, and technical guarantees needed for trust. Instead, data was saved in private systems to prevent disclosure and protect privacy. Users themselves were responsible for system security, often installing firewalls and antivirus software and backing up data. The rapid development of cloud computing will change this picture because users may choose not to keep sensitive data themselves. The core model of cloud computing is service, which rests on the premise that there is a trust relationship between users and service providers. The most basic and important guarantee for establishing this relationship is the bottom-up force created by the democratic nature of the Internet, a force formed not by any single person but by interactions within communities. Trust is thus established not by one-off testing or a fixed set of indicators but by eliminating untrusted elements during these interactions. It is a quality that a cloud computing system accumulates during operation. One of the key issues to be solved in the trust administration of cloud systems is how best to abstract and apply the trust that emerges as a cloud computing system evolves.
Society, politics, and technology can jointly promote the establishment, maintenance, and administration of trust in cloud computing.
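The idea that trust accumulates through repeated interactions, rather than being fixed by a one-off test, can be illustrated with a running score. The sketch below is one possible model and is not proposed in this article: the smoothed success ratio, the 0.6 threshold, and all class and function names are assumptions chosen for illustration.

```python
class TrustRecord:
    """Running tally of one provider's interaction outcomes (hypothetical model)."""

    def __init__(self) -> None:
        self.successes = 0
        self.failures = 0

    def record(self, ok: bool) -> None:
        """Fold the outcome of one more interaction into the tally."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1

    def score(self) -> float:
        """Smoothed success ratio; 0.5 when there is no history yet."""
        return (self.successes + 1) / (self.successes + self.failures + 2)


def trusted_providers(records: dict, threshold: float = 0.6) -> list:
    """Keep only providers whose accumulated score meets the assumed threshold."""
    return sorted(name for name, rec in records.items() if rec.score() >= threshold)


if __name__ == "__main__":
    records = {"provider_a": TrustRecord(), "provider_b": TrustRecord()}
    for ok in [True] * 8 + [False]:
        records["provider_a"].record(ok)
    for ok in [True, False, False, False]:
        records["provider_b"].record(ok)
    print(trusted_providers(records))
```

The score moves with every interaction, so a provider is neither trusted nor excluded permanently; untrusted elements drop out only as evidence against them accumulates.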
5 Standardization of Cloud Computing
Cloud computing provides users with many types of services of varying granularity. Interconnection, interworking, and interoperation between these services form a key technical foundation for implementing an open cloud platform. In fact, interconnection, interworking, and interoperation are basic characteristics of networks at every stage of development. Protocols for Local Area Networks (LAN) and Wide Area Networks (WAN) are used for interworking between computing devices. Transmission Control Protocol/Internet Protocol (TCP/IP) enables inter-network connection. In the days of the WWW, Hypertext Transfer Protocol (HTTP) and Hypertext Markup Language (HTML) enabled interoperation between terminals and Web sites; Web browsers that use them are able to access the WWW seamlessly. Web services and Service-Oriented Architecture (SOA) open the door to service computing.
Resources in a cloud computing system exist in the form of services. Many commercial enterprises have built their own cloud computing platforms, offering data and services. However, syntactic and semantic differences between their data and services still block effective information sharing and exchange. Cloud computing will not subvert existing standards such as Simple Object Access Protocol (SOAP); Web Services Description Language (WSDL); and Universal Description, Discovery, and Integration (UDDI). Rather, cloud computing emphasizes service interoperation based on existing standards. Designing higher-level protocols and specifications for openness and interoperation is extremely important in order to facilitate interoperation between clouds and between services and end users.
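The way syntactic and semantic differences block sharing, and how a common model removes the obstacle, can be shown with a small sketch. Everything here is hypothetical: the two providers, their field names, and the common model are invented for illustration and do not correspond to any real platform or standard.

```python
def from_provider_a(record: dict) -> dict:
    """Provider A (hypothetical) reports memory in gigabytes under its own field names."""
    return {"cores": record["cpu_cores"], "memory_mb": record["mem_gb"] * 1024}


def from_provider_b(record: dict) -> dict:
    """Provider B (hypothetical) reports memory in megabytes and calls cores 'vcpus'."""
    return {"cores": record["vcpus"], "memory_mb": record["memory_mb"]}


# Registry of mappings from each provider's schema to the shared model.
ADAPTERS = {"provider_a": from_provider_a, "provider_b": from_provider_b}


def normalize(provider: str, record: dict) -> dict:
    """Translate a provider-specific description into the common model."""
    return ADAPTERS[provider](record)


if __name__ == "__main__":
    a = normalize("provider_a", {"cpu_cores": 4, "mem_gb": 8})
    b = normalize("provider_b", {"vcpus": 4, "memory_mb": 8192})
    print(a == b)  # True
```

The field names differ (a syntactic difference) and the memory units differ (a semantic one), yet both records describe the same resource once mapped. An agreed higher-level specification plays the role of the common model, so that each new provider needs one mapping rather than one per peer.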
ISO/IEC JTC1 SC32, a subcommittee of the joint technical committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), is responsible for the development of ISO/IEC 19763, the Metamodel Framework for Interoperability (MFI). MFI is a family of standards that provides a reference for model registration, ontology registration, and model mapping of registered information resources. It promotes interoperation of software services. China participated in the development of ISO/IEC 19763-3 within this family. ISO/IEC JTC1 SC7 and ISO/IEC JTC1 SC38 established cloud computing study groups in 2009, with the tasks of developing terminology related to cloud computing and drafting study reports on cloud computing standardization.
Industry organizations such as the Cloud Security Alliance (CSA)[14], Open Cloud Consortium[15], and Cloud Computing Interoperability Forum (CCIF)[16] are taking steps to develop relevant cloud computing standards for virtual machine image distribution, virtual machine deployment and control, communication between virtual machines within a cloud, persistent storage, and secure virtual machine configuration. These organizations are ahead of international standardization organizations in the development of cloud computing standards. The China Cloud Computing Technology and Industry Alliance (CCCTIA) also aims to contribute to the standardization of cloud computing.
References
[1] 李德毅, 张海粟. 超出图灵机的云计算 [J]. 中国计算机学会通讯, 2009(12).
LI Deyi, ZHANG Haisu. Cloud Computation Beyond Turing Machines [J]. Communications of CCF, 2009(12).
[2] ARMBRUST M, FOX A, GRIFFITH R, et al. Above the Clouds: A Berkeley View of Cloud Computing [R]. Berkeley, CA, USA: Distributed Systems Lab, University of California, 2009.
[3] FOSTER I, ZHAO Yong, RAICU I, et al. Cloud Computing and Grid Computing 360-Degree Compared [C]//Proceedings of the IEEE Grid Computing Environments Workshop (GCE’08), Nov 12-16, 2008, Austin, TX, USA. Piscataway, NJ, USA: IEEE, 2008: 10p.
[4] GERMAIN-RENAUD C, RANA O F. The Convergence of Clouds, Grids, and Autonomics [J]. IEEE Internet Computing, 2009, 13(6): 9.
[5] ERICKSON J S, SPENCE S, RHODES M, et al. Content-Centered Collaboration Spaces in the Cloud [J]. IEEE Internet Computing, 2009, 13(5): 34-42.
[6] LEIBA B. Having One’s Head in the Cloud [J]. IEEE Internet Computing, 2009, 13(5): 4-6.
[7] JENSEN M, SCHWENK J, GRUSCHKA N, et al. On Technical Security Issues in Cloud Computing [C]//Proceedings of the 2009 IEEE International Conference on Cloud Computing (CLOUD’09), Sep 21-25, 2009, Bangalore, India. Los Alamitos, CA, USA: IEEE Computer Society, 2009: 109-116.
[8] Cloud Computing [EB/OL]. [2009-05-23]. http://en.wikipedia.org/wiki/Cloud_computing.
[9] 祁金华. 国内虚拟化案例剖析 [N]. 网络世界, 2007-07-02.
QI Jinhua. Analysis of Domestic Virtualization Cases [N]. China NetworkWorld, 2007-07-02.
[10] FOSTER I, KESSELMAN C. The Grid: Blueprint for a New Computing Infrastructure [M]. San Francisco, CA, USA: Morgan Kaufmann Publishers, 1999.
[11] The Efficient Cloud: All of Salesforce Runs on Only 1000 servers [EB/OL]. [2009-05-23]. http://techcrunch.com/2009/03/23/the-efficient-cloud-all-of-salesforce-runs-on-only-1000-servers/.
[12] 尼古拉斯·卡尔. IT不再重要:互联网大转换的制高点—云计算 [M]. 闫鲜宁, 译. 北京: 中信出版社, 2008.
CARR N. The Big Switch: Rewiring the World, from Edison to Google [M]. Translated by YAN Xianning. Beijing: CITIC Press, 2008.
[13] TOP500 Supercomputing Sites [EB/OL]. [2009-06-25]. http://www.top500.org.
[14] Cloud Security Alliance (CSA)—Security Best Practices for Cloud.org [EB/OL]. [2009-06-28]. http://www.cloudsecurityalliance.org/.
[15] Open Cloud Consortium [EB/OL]. [2009-08-25]. http://opencloudconsortium.org/.
[16] Cloud Computing Interoperability Forum (CCIF) [EB/OL]. [2009-06-25]. http://www.cloudforum.org/.
[Abstract] ... suitable for cloud computing; security in cloud computing focuses on trust management between service providers and users; and, based on existing standards, standardization of cloud computing should focus on interoperability between services.
[Keywords] cloud computing; virtualization; grid computing; security of cloud computing; standards of cloud computing