Since Google first introduced cloud computing, the concept has spread rapidly through the Internet community, and the number of cloud computing services and platforms has grown quickly. Notable examples include Google’s File System (GFS), MapReduce, Bigtable, Chubby, and App Engine; Amazon’s Dynamo, Elastic Compute Cloud (EC2), Simple Storage Service (S3), Simple Queue Service (SQS), SimpleDB, and CloudFront; Microsoft’s Azure, SQL, .NET, and Live services; and VMware’s virtualization platform. There are also many open source platforms, such as Hadoop Distributed File System (HDFS), HBase, and Eucalyptus.
1 Core Technologies of Cloud Computing
Cloud computing is based on two core technologies, resource virtualization and distributed parallel architecture, both of which are supported by a large body of open source software, including Xen, Kernel-Based Virtual Machine (KVM), Lighttpd, Memcached, Nginx, Hadoop, and Eucalyptus. Cloud computing saves hardware investment, development, and maintenance costs for cloud service providers.
Virtualization technology was first introduced and applied by VMware on the X86 CPU. A virtualization platform is used to partition a server into multiple Virtual Machines (VMs) with configurable performance. It monitors and manages all VMs in the cluster system, and based on actual situations, flexibly allocates and schedules resources in a resource pool.
Distributed parallel architecture integrates a large number of computers into a supercomputer to provide mass data storage and processing. Using a distributed file system, a distributed database, and MapReduce, such a supercomputer provides programming methods and a runtime environment for mass file storage, mass structured data storage, and unified mass data processing [1-3].
2 Virtualization
Pooling and managing physical resources are primary functions of virtualization technology. Pooling involves partitioning and virtualizing a physical device into multiple minimum resource units with configurable performance. Management involves flexibly allocating and scheduling minimum resource units in a cluster according to resource availability, user requests, and certain policies. In this way, on-demand resource allocation can be achieved[4-7].
2.1 Pooling Physical Resources
Figure 1 illustrates a cloud computing management platform. The physical devices that can be virtualized include servers, storage, network, and security devices. Different virtualization technologies have been developed to solve system problems from different perspectives.
(1) Server Virtualization
Server virtualization involves abstracting the server into virtual resources and then pooling them. Specifically, one server is divided into several homogeneous virtual servers, and the virtual server resource pools in the cluster are managed as a whole.
(2) Storage Virtualization
Storage virtualization involves integrating heterogeneous traditional Storage Area Network (SAN) and Network Attached Storage (NAS) devices. All storage resources are consolidated by type into a unified large-capacity storage resource. This unified resource is then pooled by volume or sub-directory according to access authority, together with the corresponding resource management methods. Finally, virtual storage resources are allocated to different applications or assigned directly to end users.
(3) Network Virtualization
Network virtualization involves partitioning a physical network node into multiple virtual network devices (such as switches and load balancers) and managing these resources. In a cloud computing platform, virtual network devices provide cloud services for applications along with virtual machines and virtual storage space.
2.2 Resource Pool Management
Resource pool management involves not only unified management, scheduling, and monitoring of resource pools, but also proper use and maintenance of the cloud platform. A cloud computing management platform can be divided into four layers: device management, virtual resource management, service management, and tenant management.
(1) Device Management
This layer manages hardware devices on the cloud platform and raises an alarm when a device behaves abnormally. Specifically, the system administrator uses this layer during daily maintenance to check device performance and to monitor key indicators such as application server CPU usage, memory usage, hard disk usage, network interface usage, storage space availability, and Input/Output (IO) status. A user can set the monitoring threshold for a physical device based on its actual configuration. The system then automatically starts monitoring and raises an alarm when the threshold is exceeded.
(2) Virtual Resource Management
The virtual resource management layer implements unified management, allocation, and flexible scheduling of virtual resources for various applications. As in the device management layer, the system administrator checks the performance of each minimum virtual resource during daily maintenance and monitors key indicators of the virtual machines in use, such as CPU usage, memory usage, hard disk usage, network interface usage, virtual storage availability (for example, Amazon’s Elastic Block Store), and IO status. A user can set the monitoring threshold for a virtual resource based on its actual configuration. The system then automatically starts monitoring and raises an alarm when the threshold is exceeded.
(3) Service Management
The service management layer manages service templates, service instances, and service catalogs. Based on virtual resources, it promptly provides user-specified operating system and application software to tenants.
(4) Tenant Management
The tenant management layer manages the resource clusters of all tenants. It manages resource type, quantity, and distribution, as well as the tenant lifecycle: from application and approval, through normal operation, to suspension and cancellation.
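The threshold-based alarming described for the device management and virtual resource management layers can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the `Threshold` class, the `check_thresholds` function, the metric names, and the resource identifier are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str    # name of the monitored indicator, e.g. "cpu_usage" (hypothetical)
    limit: float   # an alarm is raised when the sampled value exceeds this limit


def check_thresholds(resource_id, sample, thresholds):
    """Return one alarm message per metric whose sampled value exceeds its limit."""
    alarms = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is not None and value > t.limit:
            alarms.append(f"ALARM {resource_id}: {t.metric}={value} exceeds {t.limit}")
    return alarms


# Thresholds set by a user for one physical or virtual resource
thresholds = [Threshold("cpu_usage", 90.0), Threshold("disk_usage", 80.0)]
# One monitoring sample, in percent
sample = {"cpu_usage": 95.5, "disk_usage": 40.0, "memory_usage": 70.0}

alarms = check_thresholds("vm-10.0.3.7", sample, thresholds)
for alarm in alarms:
    print(alarm)
```

The same check applies to both layers; only the source of the samples (physical device or virtual machine) differs.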
2.3 Cluster Fault Location and Maintenance
In Google’s cluster maintenance approach, maintenance staff push a handcart to the damaged machine, and locate the fault by checking fault indicators on a customized PC. (In Internet data centers, PCs are often used as the computing resource.) At present, all cloud computing management platforms use a machine’s IP address, either physical or virtual, as its serial number to monitor or raise alarms. For a physical machine hosting VMs, the IP address of its host OS module is its unique identification in the cluster. The IP address of a machine is often allocated in one of two ways: by enabling Dynamic Host Configuration Protocol (DHCP) to automatically obtain the IP, or by assigning it manually. Since there is usually a large number of machines in a cluster, manual assignment creates a heavy workload. Using DHCP to automatically obtain an IP address is therefore often adopted.
However, if IP addresses are obtained automatically, an address carries no information about where a machine is installed, so maintenance personnel cannot locate a faulty machine from its IP address when the management platform reports a physical device fault. In addition, a common PC has no auxiliary fault location functions such as a fault indicator. Locating the faulty physical machine is therefore complex and tedious.
In a virtualization cluster, this problem can be solved in a simple and effective way: by configuring a Universal Serial Bus (USB) key for each machine. The key stores location information such as the rack number. When a machine starts, it reads its physical location from the USB key, calculates its IP address from that location using an algorithm, and reports the IP address to the management platform. Each IP address therefore corresponds to a physical location, and when a fault occurs, maintenance personnel can determine the machine's location directly from its IP address.
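The text does not specify the algorithm that derives the IP address from the location. The sketch below assumes one simple possibility: the location is a (rack, slot) pair, and the address is formed inside a hypothetical management subnet `10.20.0.0/16` by using the rack number as the third octet and the slot number as the fourth. The function names are illustrative only.

```python
import ipaddress

# Hypothetical management subnet; a real deployment would choose its own
BASE_NETWORK = ipaddress.ip_network("10.20.0.0/16")


def ip_from_location(rack, slot):
    """Map (rack, slot) to an address: third octet = rack, fourth octet = slot."""
    if not (0 <= rack <= 255 and 1 <= slot <= 254):
        raise ValueError("rack must be 0-255 and slot must be 1-254")
    return BASE_NETWORK.network_address + rack * 256 + slot


def location_from_ip(ip):
    """Inverse mapping: recover (rack, slot) from an address reported in an alarm."""
    offset = int(ipaddress.ip_address(ip)) - int(BASE_NETWORK.network_address)
    return divmod(offset, 256)


# At boot, the machine reads rack 12, slot 7 from its USB key
print(ip_from_location(12, 7))         # 10.20.12.7
# When an alarm arrives, staff map the address back to the location
print(location_from_ip("10.20.12.7"))  # (12, 7)
```

Because the mapping is invertible, the management platform needs no separate table linking addresses to racks.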
2.4 Grouping of Heterogeneous Resource Pools
Sun, IBM, and other manufacturers have adopted their own server virtualization architectures for their minicomputers. These virtualization systems are not compatible with those based on the X86 architecture (such as Xen and KVM), so resources are wasted.
The heterogeneous resource pool problem in server virtualization can be solved by:
(1) Grouping resource pools by architecture: Servers and minicomputers adopting different architectures are virtualized into VMs, which are then put into different resource pools categorized by architecture. Different applications are allocated VMs according to the VM architecture.
(2) Using a service scheduler to customize services, integrate virtualization platforms of different architectures, and schedule heterogeneous VMs.
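The first measure, grouping VMs into pools by architecture and allocating from a pool the application supports, can be sketched as follows. The function names, VM identifiers, and architecture labels are hypothetical, and the allocation policy (first supported pool with a free VM) is an assumption made for illustration.

```python
from collections import defaultdict


def group_by_architecture(vms):
    """Build one resource pool (a list of VM ids) per architecture."""
    pools = defaultdict(list)
    for vm_id, arch in vms:
        pools[arch].append(vm_id)
    return pools


def allocate(pools, supported_archs):
    """Take a free VM from the first pool whose architecture the application supports."""
    for arch in supported_archs:
        if pools.get(arch):
            return arch, pools[arch].pop(0)
    return None  # no pool with a supported architecture has a free VM


vms = [("vm1", "x86"), ("vm2", "power"), ("vm3", "x86"), ("vm4", "hp")]
pools = group_by_architecture(vms)

# An application that has builds for both Power and X86, preferring Power
print(allocate(pools, ["power", "x86"]))
```

Once the only Power VM is taken, the same request falls back to the X86 pool, which is the behavior a service scheduler needs when it holds versions of a service for several architectures.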
Figure 2 illustrates the grouping of heterogeneous resource pools for unified management. IBM’s PowerSystems minicomputer cluster, HP’s minicomputer cluster, and X86 architecture-based computing resources are virtualized and grouped into different resource pools by their respective virtualization platforms: IBM’s PowerVM system, HP’s VSE system, and the Xen/KVM system. An application can be deployed in a resource pool depending on its service characteristics and operating system. Thus, virtualization of heterogeneous minicomputers is achieved. The resource pools based on the X86, PowerSystems, and HP architectures are managed by their respective virtualization management software: Virtual Machine Manager (VMM), Integrated Virtualization Manager (IVM), and Global Workload Manager (gWLM). Above VMM, IVM, and gWLM, an integrated Virtual Machine Manager (iVMM) is configured for unified management of the three computing resource pools.
Figure 3 shows the scheduling of heterogeneous virtual resources for different applications. This method comprises four core components: the iVMM; the service scheduler; versions of the service, provided by the service system, that offer the same application functions on different resource pool architectures; and the Open Cloud Computing Interface (OCCI) between the iVMM and the service scheduler.
In the service application layer, a module for service scheduling is added. This module sends requests to the iVMM to add or remove VMs and to adjust load balancing policies. The service system must also prepare different versions with the same functions for the different resource pool architectures. The procedure over the OCCI is as follows:
(1) The service scheduler makes a resource request to the cloud computing management platform via the OCCI, providing information such as operating systems supported by the service system, and priorities.
(2) After receiving the request, the management platform allocates resources depending on the availability of resources in the cloud. Meanwhile, it informs the service scheduler (via the OCCI) what resource it has allocated to the service system.
(3) The service scheduler sends information such as the operating system and service version required for the allocated resource to the management platform via the OCCI. The management platform then deploys the operating system and service program and hands the resources over to the service system for use.
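The three steps above can be sketched from the scheduler's side as follows. This is only an illustration of the message flow: the `ManagementPlatform` class, its methods, and the version strings are hypothetical stand-ins and do not represent the OCCI itself or any real platform API.

```python
class ManagementPlatform:
    """Stand-in for the cloud computing management platform (hypothetical API)."""

    def __init__(self, free_vms):
        # free_vms maps an operating system name to a list of free VM ids
        self.free_vms = free_vms
        self.deployed = {}

    def request_resources(self, supported_os, priority):
        # Step 2: allocate from whichever supported pool has capacity.
        # The priority is accepted but not used in this simplified sketch.
        for os_name in supported_os:
            if self.free_vms.get(os_name):
                return {"vm": self.free_vms[os_name].pop(0), "os": os_name}
        return None

    def deploy(self, vm, os_name, service_version):
        # Step 3: install the operating system and the matching service build
        self.deployed[vm] = (os_name, service_version)
        return vm


def scale_out(platform, versions, priority=1):
    """Run the three steps; versions maps each supported OS to a service build."""
    # Step 1: request resources, stating supported operating systems and priority
    grant = platform.request_resources(list(versions), priority)
    if grant is None:
        return None
    # Step 3: tell the platform which OS and service version to deploy
    return platform.deploy(grant["vm"], grant["os"], versions[grant["os"]])


platform = ManagementPlatform({"aix": [], "linux": ["vm-7"]})
vm = scale_out(platform, {"aix": "svc-1.2-power", "linux": "svc-1.2-x86"})
print(vm, platform.deployed[vm])
```

The scheduler does not choose the architecture in advance; it learns from the platform's reply which resource it received and then selects the matching service version.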
3 Distributed Technology
Google was the first to apply distributed technology on a wide scale. Its search services for global users must store massive amounts of data and process that data quickly. Google’s distributed architecture allows millions of cheap computers to work together. Mass data is stored in its distributed file system, and with the distributed programming model MapReduce, large tasks are broken down and performed in parallel on multiple computers. Google’s distributed database stores mass structured data. Many Internet operators now also use Key/Value-based distributed storage engines to quickly store and access large numbers of small storage objects.
3.1 Distributed File System
Distributed file systems such as Google’s GFS and Hadoop’s HDFS are designed for storing very large files. Such systems are configured with a pair of host computers, and applications access the system via a dedicated Application Programming Interface (API). However, distributed file systems are not widely applied, for two reasons: the host cannot respond quickly to every application, and the access interfaces are not open.
The host is the master node of a distributed file system. All metadata is stored in the host’s memory, so memory size determines the number of files the whole system can accommodate. The metadata of one million files may occupy nearly 1 GB of memory, yet in cloud storage applications files are often counted in the hundreds of millions, which at the same rate would require on the order of 100 GB or more. In addition, every file read or write requires access to the host, so the host often responds slowly, and its response speed directly affects the Input/Output Operations Per Second (IOPS) of the storage system. Slow host response can be addressed by:
(1) Buffering previously accessed metadata on the client. When an application accesses the file system, it first queries the metadata buffered on the client and accesses the host only if the required metadata is not found there. In this way, access to the host is reduced.
(2) Storing metadata on the host’s hard disk and buffering it in the host’s memory. This applies where the metadata of more than a hundred million large files must be stored. In addition, Solid-State Drives (SSDs) can be used in place of hard disks to enhance reliability and speed, which can improve performance by a factor of 10.
(3) Changing the standby mode of the hosts from 1:1 hot standby to 1:X (often 1:4, that is, one master host with four standby hosts). The master host is elected through a lock server. Requests that modify metadata are handled by the master host, while read-only metadata requests from applications are distributed among the standby hosts using a Distributed Hash Table (DHT) algorithm. This prevents the host from becoming a bottleneck in the system.
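Measures (1) and (3) can be combined in a small sketch: the client answers repeated lookups from its own buffer, sends remaining reads to a standby host chosen by hashing the path, and sends writes to the master. This is an illustration under simplifying assumptions: hosts are modeled as in-memory dictionaries, replication is simulated inside `update`, a plain hash-modulo choice stands in for a full DHT, and all class and method names are hypothetical.

```python
import hashlib


class MetadataClient:
    """Client-side view of one master host and several standby hosts (hypothetical)."""

    def __init__(self, master, standbys):
        self.master = master        # dict: path -> metadata, receives all writes
        self.standbys = standbys    # list of dicts holding read-only replicas
        self.cache = {}             # metadata this client has already seen
        self.remote_reads = 0       # number of lookups that had to reach a host

    def _standby_for(self, path):
        # Hash the path so that reads spread evenly over the standby hosts
        digest = hashlib.md5(path.encode("utf-8")).digest()
        return self.standbys[int.from_bytes(digest[:4], "big") % len(self.standbys)]

    def lookup(self, path):
        if path in self.cache:      # measure (1): answer from the client buffer
            return self.cache[path]
        self.remote_reads += 1      # measure (3): read from a standby, not the master
        meta = self._standby_for(path).get(path)
        if meta is not None:
            self.cache[path] = meta
        return meta

    def update(self, path, meta):
        # Writes go to the master; replication to the standbys is simulated here
        self.master[path] = meta
        for replica in self.standbys:
            replica[path] = meta
        self.cache[path] = meta


master = {}
writer = MetadataClient(master, [{} for _ in range(4)])   # 1:4 standby mode
writer.update("/videos/a.mp4", {"chunks": [1, 2, 3]})

reader = MetadataClient(master, writer.standbys)
reader.lookup("/videos/a.mp4")    # first lookup goes to a standby host
reader.lookup("/videos/a.mp4")    # second lookup is served from the client buffer
print(reader.remote_reads)        # 1
```

A production design would also need to invalidate buffered entries when metadata changes on the master, which this sketch leaves out.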
In a distributed file system, an external application must go through a dedicated API to access the system, which limits the range of applications the file system can serve. A standard Portable Operating System Interface (POSIX) can be provided through Filesystem in Userspace (FUSE), so that an application accesses the system as an ordinary file system, but at a cost of 10% to 20% of performance. Network File System (NFS) access can then be implemented on top of the POSIX interface by directly calling the NFS protocol stack of the Linux operating system.
3.2 Key/Value Storage Engine
The structure of a Key/Value storage engine is shown in Figure 4. Bucket A data are stored in Nodes B, C, and D. The biggest problem for a Key/Value engine is redistributing data quickly after a routing change. To solve this problem, the concept of a "virtual node" can be introduced. The ring space onto which Key values are mapped is divided into Q equally sized buckets, each of which corresponds to a virtual node. Message-Digest algorithm 5 (MD5) is recommended for Key mapping. Each physical node is responsible for the data in several buckets, depending on its hardware configuration. The data in a bucket can be stored on N nodes, where N is usually 3. Suppose Q for a DCACHE cluster is 100,000; that is, the entire ring space is partitioned into 100,000 buckets. If the maximum capacity of the entire DCACHE cluster is 50 TB, each bucket holds only 500 MB of data (50 TB / 100,000), and migrating 500 MB of data between nodes takes less than 10 s.
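The bucket scheme can be sketched as follows, using the values of Q and N from the example above. MD5 maps a Key onto a 128-bit ring, which is cut into Q equal slices. The replica placement shown here (N consecutive nodes in a list) is an assumption made for brevity; the text states only that each node takes several buckets according to its hardware configuration. Function and node names are hypothetical.

```python
import hashlib

Q = 100_000   # number of equally sized buckets (virtual nodes)
N = 3         # number of physical nodes that hold each bucket


def bucket_of(key):
    """Map a key onto the 128-bit MD5 ring, then onto one of Q equal-width buckets."""
    ring_position = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return (ring_position * Q) >> 128


def replicas_of(bucket, nodes):
    """Pick N distinct physical nodes for a bucket (simple placement, an assumption)."""
    return [nodes[(bucket + i) % len(nodes)] for i in range(min(N, len(nodes)))]


# Bucket size for a 50 TB cluster, in MB (decimal units)
bucket_size_mb = 50 * 10**6 // Q
print(bucket_size_mb)   # 500

nodes = ["A", "B", "C", "D", "E"]
b = bucket_of("user:42:avatar")
print(b, replicas_of(b, nodes))
```

Because a Key is bound to a bucket rather than to a physical node, a routing change only reassigns whole buckets, and each reassignment moves at most one bucket's worth of data.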
A Key/Value storage engine is a flat storage structure that uses a Hash algorithm to distribute stored contents evenly among all nodes. In some applications, however, the service needs batch operations based on a directory structure. For example, in a Content Delivery Network (CDN), when a website pushes content to CDN nodes, the content is added or deleted according to the directory structure of its web pages. A Key/Value storage engine cannot currently perform that function. To address the problem, a pair of directory servers can be added to the Key/Value storage engine to store the relationship between Key values and directories and to carry out operations on the directory structure. When an application reads or writes an individual object, it reaches the relevant nodes through the Hash algorithm as before; when a directory operation is required, the application operates on the Key/Value storage engine via the directory servers, which convert the directory operation into Key/Value operations. Since read operations are the most common type of operation in most projects, the directory servers seldom need to access the Key/Value storage engine, and a performance bottleneck is avoided.
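The directory server idea can be sketched as follows. The flat store hashes each Key to a node and knows nothing about directories, while the directory server keeps the directory-to-Key relationship and converts a directory deletion into individual Key deletions. Both classes are hypothetical simplifications: nodes are in-memory dictionaries, a single directory server stands in for the pair, and nested directories are not handled.

```python
import hashlib
from collections import defaultdict


class KeyValueStore:
    """Flat store: each key is hashed to a node; there is no notion of directories."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _node(self, key):
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:4], "big") % len(self.nodes)]

    def put(self, key, value):
        self._node(key)[key] = value

    def get(self, key):
        return self._node(key).get(key)

    def delete(self, key):
        self._node(key).pop(key, None)


class DirectoryServer:
    """Stores the directory -> keys relationship and turns directory operations into key operations."""

    def __init__(self, store):
        self.store = store
        self.children = defaultdict(set)

    def put(self, path, value):
        directory = path.rsplit("/", 1)[0] or "/"
        self.children[directory].add(path)
        self.store.put(path, value)

    def delete_directory(self, directory):
        # Deletes the keys directly under `directory`; nested directories are out of scope
        for path in self.children.pop(directory, set()):
            self.store.delete(path)


store = KeyValueStore(node_count=4)
dirs = DirectoryServer(store)
dirs.put("/site/img/a.png", b"png-a")
dirs.put("/site/img/b.png", b"png-b")
dirs.put("/site/index.html", b"<html>")

print(store.get("/site/index.html"))   # reads go straight to the store by hash
dirs.delete_directory("/site/img")     # batch delete goes through the directory server
print(store.get("/site/img/a.png"))    # None
```

Reads never touch the directory server, which is why it does not become a bottleneck when reads dominate.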
4 Conclusion
Building a cloud platform is challenging. Two core technologies of cloud computing have been described in detail: virtualization and distributed architecture. In the area of Infrastructure as a Service (IaaS), this article focuses on virtualization technology and discusses the building and application of heterogeneous virtualization platforms and the functions of the cloud computing management platform. In the area of distributed technology, the distributed file system and the Key/Value storage engine are discussed, and solutions are given for some related problems. A solution is proposed for the host bottleneck, and a standard storage interface is proposed for the distributed file system. A directory-based storage scheme for the Key/Value storage engine is also suggested.
References
[1] ZHANG Weimin, TANG Jianfeng, LUO Zhiguo, et al. Cloud Computing: Change the Future Profoundly [M]. Beijing: Science Press, 2009 (in Chinese).
[2] LIU Peng. Cloud Computing [M]. Beijing: Publishing House of Electronics Industry, 2010 (in Chinese).
[3] WANG Qingbo, JIN Xing, HE Le, et al. Virtualization and Cloud Computing [M]. Beijing: Publishing House of Electronics Industry, 2009 (in Chinese).
[4] GRANNEMAN S. Google Apps Deciphered: Compute in the Cloud to Streamline Your Desktop [M]. Upper Saddle River, NJ, USA: Prentice-Hall, 2009.
[5] REESE G. Cloud Application Architectures: Building Applications and Infrastructure in the Cloud [M]. Sebastopol, CA, USA: O’Reilly Media, 2009.
[6] ARRASJID J, EPPING D, KAPLAN S. Foundation for Cloud Computing with VMware vSphere 4 [M]. Berkeley, CA, USA: USENIX Association, 2010.
[7] Service Delivery Platforms and Telecom Web Services: An Industry-Wide Perspective [R]. The Moriana Group, 2004.
[Abstract] Virtualization and distributed parallel architecture are typical cloud computing technologies. In the area of virtualization technology, this article discusses physical resource pooling, resource pool management and use, cluster fault location and maintenance, resource pool grouping, and construction and application of heterogeneous virtualization platforms. In the area of distributed technology, distributed file system and Key/Value storage engine are discussed. A solution is proposed for the host bottleneck problem, and a standard storage interface is proposed for the distributed file system. A directory-based storage scheme for Key/Value storage engine is also proposed.
[Keywords] virtualization; distributed computing; cloud computing management platform; key/value storage engine