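How can I use log4j to read a log file and then store the log information in a database? (Source code wanted.) Thanks, it's urgent. This is my first time working with this.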


Bounty: 15 Posted: 2011-05-03 Asked by: Andy烦烦 (junior programmer)

    log4j.rootLogger=DEBUG,CONSOLE,ROLLING_FILE DEBUG,CONSOLE,FILE,ROLLING_FILE,MAIL,DATABASE
    log4j.addivity.org.apache=true
    ###################
    # Console Appender
    ###################
    log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
    log4j.logger.org.hibernate.SQL=DEBUG
    log4j.appender.Threshold=DEBUG
    log4j.appender.CONSOLE.Target=System.out
    log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
    log4j.appender.CONSOLE.layout.ConversionPattern=[tklmobileplatform] %d - %c -%-4r [%t] %-5p %c %x - %m%n
    log4j.appender.CONSOLE.layout.ConversionPattern=[start]%d{DATE}[DATE]%n%p[PRIORITY]%n%x[NDC]%n%t[THREAD] n%c[CATEGORY]%n%m[MESSAGE]%n%n
    #####################
    # File Appender
    #####################
    #log4j.appender.FILE=org.apache.log4j.FileAppender
    #log4j.appender.FILE.File=file.log
    #log4j.appender.FILE.Append=false
    #log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
    #log4j.appender.FILE.layout.ConversionPattern=[lian] %d - %c -%-4r [%t] %-5p %c %x - %m%n
    # Use this layout for LogFactor 5 analysis
    ########################
    # Rolling File
    ########################
    log4j.appender.ROLLING_FILE=org.apache.log4j.RollingFileAppender
    log4j.appender.ROLLING_FILE.Threshold=DEBUG
    log4j.appender.ROLLING_FILE.File=${webapp.root}/tklmobileplatform.log
    log4j.appender.ROLLING_FILE.Append=true
    log4j.appender.ROLLING_FILE.MaxFileSize=1000KB
    log4j.appender.ROLLING_FILE.MaxBackupIndex=1
    log4j.appender.ROLLING_FILE.layout=org.apache.log4j.PatternLayout
    log4j.appender.ROLLING_FILE.layout.ConversionPattern=[ejunnet] %d - %c -%-4r [%t] %-5p %c %x - %m%n
    ########################
    # JDBC Appender
    #######################
    log4j.appender.DATABASE=org.apache.log4j.jdbc.JDBCAppender
    log4j.appender.DATABASE.URL=jdbc:mysql://localhost:3306/tklpf_db
    log4j.appender.DATABASE.driver=com.mysql.jdbc.Driver
    log4j.appender.DATABASE.user=root
    log4j.appender.DATABASE.password=root
    log4j.appender.DATABASE.Threshold=DEBUG
    log4j.appender.DATABASE.sql=log4j.appender.DATABASE.sql=INSERT INTO log(optime,thread,infolevel,class,message) VALUES ('%d', '%t', '%p', '%l', '%m')
    log4j.appender.DATABASE.layout=org.apache.log4j.PatternLayout
    log4j.appender.DATABASE.layout.ConversionPattern=[framework] %d - %c -%-4r [%t] %-5p %c %x - %m%n
    log4j.appender.A1=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.A1.File=SampleMessages.log4j
    log4j.appender.A1.DatePattern=yyyyMMdd-HH'.log4j'
    log4j.appender.A1.layout=org.apache.log4j.xml.XMLLayout

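Follow-up: Our project needs various system logs, error logs, and so on. I know this amounts to a large volume of junk data, but my manager wants it done this way. I have only just started with log4j, so please provide source code. Thanks......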

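Accepted answer

2011-05-04 shadabing (senior programmer)

http://blog.csdn.net/ziruobing/archive/2009/02/22/3919501.aspx Read this article and you'll know how to do it.

Asker's comment on the answer: Thanks! I haven't used it yet, but thank you for your help anyway.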


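What is the point of reading the log file into a database? Isn't showing it on the console enough?

wenchenyangailiuyan (intermediate programmer) 2011-05-04

Oh, I only know how to configure log4j to write to the console and read the messages there, which is also quite convenient. Bumping this anyway while we wait for an expert.

wenchenyangailiuyan (intermediate programmer) 2011-05-04

Take a look at this: http://hi.baidu.com/%D1%A6%B9%A6%CF%B2/blog/item/2ef1d53f304a163571cf6c50.html

iihero (veteran programmer) 2011-05-08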


Advanced HTTPClient Info


Proxies

Support for proxies (including SOCKS) is fully implemented. However, using proxies in applets is subject to a number of security restrictions (see security for more information on the various security policies and the consequences that arise from them). If you are using an http proxy then use the HTTPConnection.setProxyServer() method to set the default proxy for all new HTTPConnections; HTTPConnection.setCurrentProxy() can be used to set a proxy for the current HTTPConnection only. You can also manipulate a list of hosts for which no proxy is to be used with the methods HTTPConnection.dontProxyFor() and HTTPConnection.doProxyFor(). Example:

    HTTPConnection.setProxyServer("my.proxy.dom", 8008);
    HTTPConnection.dontProxyFor("localhost");
    HTTPConnection.dontProxyFor(".mycompany.com");
    AuthorizationInfo.addBasicAuthorization("my.proxy.dom", 8008, realm, user, passwd);
    ...
    HTTPConnection con = new HTTPConnection(...);

This will cause all connections to use the proxy (and automatically authenticate with it) except when connecting to localhost or any other machine within the company.

If you are using SOCKS then the method to use is setSocksServer(). Note that both an http proxy and a SOCKS proxy can be set at the same time, in which case a request is sent via the SOCKS server to the proxy server, which in turn relays the request to the desired destination.

Some proxies will proxy for protocols other than http using http to contact the proxy itself. If you have such a proxy then you can use the HTTPClient to do requests for other protocols through the proxy. To do this you need to create an HTTPConnection to the proxy itself (i.e. don't use setCurrentProxy() or setProxyServer()) and specify the full URL of the file/article/whatever in the Get(), Put(), etc. Example: if you want to retrieve the file /pub/README via ftp from rtfm.mit.edu then you could use something like:

    HTTPConnection proxy = new HTTPConnection("my.proxy.dom", 8000);
    HTTPResponse resp = proxy.Get("ftp://rtfm.mit.edu/pub/README");
    ...

Timeouts

Sometimes one doesn't want to wait (almost) forever until a connection is established to the server or until the server answers. In this case a timeout can be set using the methods HTTPConnection.setDefaultTimeout() and HTTPConnection.setTimeout(). Setting a timeout will cause the client to limit the time it will spend trying to get the host's IP address and establishing a connection with the server, and it will also set the timeout on the socket while reading the response.

Note: if a timeout occurs while reading the response (i.e. an InterruptedIOException is thrown) then doing another read() may or may not lose data, depending on what streams have been pushed on the response input stream. In general all HTTPClient streams are restartable, but those from the JDK aren't (e.g. GZIPInputStream). This means that if such a stream has been pushed onto the stack (typically when a compression content-coding or transfer-coding was used) and an InterruptedIOException occurs then chances are the response data will be corrupted.
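To make the exception mentioned in the note concrete, here is a minimal sketch using only java.net, not the HTTPClient itself. It assumes a loopback connection to a server that never writes anything; the class name ReadTimeoutSketch and its method are made up for this illustration. With the JDK's own sockets, an expired read timeout surfaces as a SocketTimeoutException, which is a subclass of InterruptedIOException.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class ReadTimeoutSketch {
    /**
     * Opens a loopback connection to a server that never writes, then tries
     * to read one byte. Returns true if the read was abandoned because the
     * socket timeout expired.
     */
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silent = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silent.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // limits each blocking read
            try {
                client.getInputStream().read(); // no data will ever arrive
                return false;
            } catch (InterruptedIOException e) {
                // java.net.SocketTimeoutException extends InterruptedIOException
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

After such a timeout the raw socket stream can simply be read again; whether a decoding stream layered on top of it survives is exactly the concern the note raises.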

Output Streams

All request methods which may send data (such as POST and PUT) take either a byte[] or an HttpOutputStream. Using the latter may be beneficial when sending large amounts of data because it streams the data directly over the socket in many cases. However, restrictions on many servers may prevent the streaming, necessitating buffering of the data by the HttpOutputStream. The class documentation gives the full details, but in summary here are the things to consider: if you know the length of the data beforehand then use the appropriate constructor and you will be able to stream the data to any server. If you do not know the length beforehand then you may encounter problems, in which case you'll need to set the java system property HTTPClient.dontChunkRequests to true, and all data will be buffered by the client and only sent when the stream is close()'d.
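The name of the HTTPClient.dontChunkRequests property suggests that a body of unknown length is normally sent with the HTTP/1.1 chunked transfer coding. Assuming that is the case, the sketch below shows what that framing looks like on the wire and why no total length is needed up front. ChunkedSketch is a hypothetical helper written for this note, not part of the HTTPClient.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedSketch {
    /** Frames each piece as one HTTP/1.1 chunk, then appends the final zero-length chunk. */
    static byte[] encode(byte[]... pieces) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] piece : pieces) {
            if (piece.length == 0) {
                continue; // a zero-length chunk would terminate the body early
            }
            writeAscii(out, Integer.toHexString(piece.length) + "\r\n"); // size line in hex
            out.write(piece, 0, piece.length);
            writeAscii(out, "\r\n");
        }
        writeAscii(out, "0\r\n\r\n"); // last chunk, no trailer fields
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] framed = encode("Hello".getBytes(StandardCharsets.US_ASCII));
        System.out.print(new String(framed, StandardCharsets.US_ASCII));
    }
}
```

A server that cannot parse this framing is the kind of server for which buffering the whole body and sending a Content-Length instead becomes necessary.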

Various modules (such as the RedirectionModule and AuthorizationModule) need to resend certain requests in order to do their job. However, if the request used an HttpOutputStream then they cannot do so; in such cases the application will have to handle those statuses itself. In order to partially solve this problem, an application may set the java system property HTTPClient.deferStreamed to true. Requests which need resending (and which used an HttpOutputStream) are then remembered by the modules and the retry flag in the response is set to true. The application can then check this flag, and if it's true just blindly resend the request; the modules will detect the resend and apply the appropriate modifications (such as adding the necessary Authorization header field, or changing the request-uri to a new location).

Contexts

By default all authorization info, cookies, permanent redirection lists, etc. are shared between all instances of HTTPConnection. However, there are cases where one wants to simulate multiple independent clients or users within the same application. To enable this the HTTPClient supports a concept called contexts. Each HTTPConnection instance has an associated context, which can be any Object at all, and which can be set using HTTPConnection.setContext(). All instances of HTTPConnection which have the same context will share all info on cookies, authorization info, etc. This is done by having each module which keeps information on behalf of the application (such as the cookie module, the authorization module and the redirection module) use a separate list for each context. By setting the context for each HTTPConnection instance appropriately, you can control exactly which instances share data and which don't.

Note that you can't have more than one context per HTTPConnection, i.e. you need at least one separate instance of HTTPConnection for each client.

Example: suppose you want to simulate 3 independent clients, each having two HTTPConnection instances. Then you could do something like:

    HTTPConnection cl1_con1 = new HTTPConnection(...);
    HTTPConnection cl1_con2 = new HTTPConnection(...);
    cl1_con1.setContext("client1");
    cl1_con2.setContext("client1");

    HTTPConnection cl2_con1 = new HTTPConnection(...);
    HTTPConnection cl2_con2 = new HTTPConnection(...);
    cl2_con1.setContext("client2");
    cl2_con2.setContext("client2");

    HTTPConnection cl3_con1 = new HTTPConnection(...);
    HTTPConnection cl3_con2 = new HTTPConnection(...);
    cl3_con1.setContext("client3");
    cl3_con2.setContext("client3");

As mentioned above, the context may be any object at all. So, if in the above example you were going to run each client in a separate thread then you could use the thread itself as the context (e.g. con.setContext(Thread.currentThread());). If no context is set a default context is used which is the same for all HTTPConnection's. Therefore applications which don't need this feature can just ignore it.
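The "separate list for each context" idea can be pictured as a map keyed by the context object. The following is only a sketch of that idea under simplifying assumptions (cookies reduced to name/value strings, no synchronization); ContextStoreSketch and its methods are invented names and say nothing about how the HTTPClient modules are really implemented.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextStoreSketch {
    // Stand-in for the default context shared by all connections that set none.
    private static final Object DEFAULT_CONTEXT = new Object();

    // One independent cookie list per context object.
    private final Map<Object, Map<String, String>> cookiesByContext = new HashMap<>();

    private Map<String, String> listFor(Object context) {
        Object key = (context == null) ? DEFAULT_CONTEXT : context;
        return cookiesByContext.computeIfAbsent(key, k -> new HashMap<>());
    }

    void putCookie(Object context, String name, String value) {
        listFor(context).put(name, value);
    }

    String getCookie(Object context, String name) {
        return listFor(context).get(name);
    }

    public static void main(String[] args) {
        ContextStoreSketch store = new ContextStoreSketch();
        store.putCookie("client1", "session", "abc");
        System.out.println(store.getCookie("client1", "session")); // shared within client1
        System.out.println(store.getCookie("client2", "session")); // invisible to client2
    }
}
```

Because the key is an arbitrary Object, a Thread works as a context just as well as a String, which is what the thread-per-client suggestion above relies on.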

Persistent Connections (Keep-Alive's)

The Hypertext Transfer Protocol originally allowed only one request per TCP connection. However, establishing a TCP connection is fairly expensive time-wise, so some implementors of HTTP/1.0 added so-called Keep-Alive's to keep a connection open after a request was completed and to allow further requests to be made over that connection. Unfortunately, this was not well defined and is broken in the face of proxies. HTTP/1.1 defines persistent connections correctly and even makes them the default.

The HTTPClient will by default try to keep a connection alive for as many requests as possible, both when talking to HTTP/1.0 and HTTP/1.1 servers. To disable persistent connections you can specify a Connection header with the value close. Example:

    NVPair[] def_hdrs = { new NVPair("Connection", "close") };
    con.setDefaultHeaders(def_hdrs);

This will disable persistent connections for all future requests (unless overridden by a connection header on the request method call).

Keeping the socket connection to the server open after a request is fine as long as another request follows within a short period of time. However when you are done you should let the library know by passing the above "Connection: close" header with the last request. Furthermore, to limit the length of time the connection will be held open a timer is started after each request which will close the socket connection if no further requests arrive within the next 60 seconds.

Note that most of this is transparent as far as the functioning of the requests is concerned; the only difference you will notice is in the time required for a request to be sent. Also note that persistent connections are only done within the context of a given instance of HTTPConnection; so if you create two instances both pointing at the same server then they will create separate connections to the server.

Closing Connections

The HTTPClient will close a socket as soon as it has determined that no more data will arrive on the socket or that any outstanding data is to be discarded. However, this is not always easy to do, and in at least one common case the client will keep a connection open indefinitely unless a simple precaution is taken (see below).

A socket is closed when one of the five following conditions occurs:

  • an exception occurs during read or write.
  • the connection timeout triggers and no responses are outstanding (note this is not the timeout set by HTTPConnection.setTimeout(), but an internal, hard-coded 60 second timeout).
  • all response streams on the connection have been closed, up to and including a terminal response.
  • stop() is invoked on the HTTPConnection.
  • the stream demultiplexer is finalized.

A "terminal" response is a response for which the client has determined that the connection should be closed after it has been fully received. This includes the server sending a "Connection: close" (in the case of an HTTP/1.1 server) or not sending a "Connection: keep-alive" (in the case of an HTTP/1.0 server) with the response, the response having no Content-length and no self-delimiting body, or the receipt of certain error status codes.

As can be seen from above, a basic condition for the connection to be closed is that the response streams have been closed. This is necessary because the client can't distinguish between a slow application which takes its time to read the response and an application which is not interested in the response. Failing to close the response streams can lead to a lot of open connections hanging around, which will eventually cause you to run out of file descriptors (on a Unix box - socket descriptors on Windoze). Therefore it is important that you always close the response stream of every response. This can be done either by invoking resp.getData() or invoking close() on the stream from resp.getInputStream() (where resp is the HTTPResponse). If you are not interested in the response data you can just do a resp.getInputStream().close().
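One way to make sure a response stream is always closed, even when reading fails part-way, is Java's try-with-resources. The sketch below works on any InputStream and uses a small tracking stream in place of a real response; CloseAlwaysSketch and TrackingStream are illustrative names, and nothing here calls the HTTPClient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class CloseAlwaysSketch {
    /** Reads the stream to its end and closes it, whether or not reading succeeds. */
    static byte[] readFully(InputStream responseStream) {
        try (InputStream in = responseStream) { // closed on every exit path
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for a response stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) {
        TrackingStream body = new TrackingStream(new byte[] {1, 2, 3});
        System.out.println(readFully(body).length + " bytes, closed=" + body.closed);
    }
}
```

In application code the argument would be the stream obtained from the response, so that the connection can be released as described above.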

Pipelining

If the connection is kept open across requests then the requests may be pipelined. Pipelining here means that a new request is sent before the response to a previous request is received. Since this can obviously enhance performance by reducing the overall round-trip time for a series of requests to the same server, the HTTPClient has been written to support pipelining (at the expense of some extra code to keep track of the outstanding requests).

The programming model is always the same: for every request you send you get a response back which contains the headers and data of the server's response. Now, to support pipelining, the fields in the response aren't necessarily filled in yet when the HTTPResponse object is returned to the caller (i.e. the actual response headers and data haven't been read off the net), but the first call to any method in the response (e.g. getStatusCode()) will wait till the response has actually been read and parsed. Also any previous requests will be forced to read their responses if they have not already done so (so e.g. if you send two consecutive requests and receive responses r1 and r2, calling r2.getHeader("Content-type") will first force the complete response r1 to be read before reading the response r2). All this should be completely transparent, except for the fact that invoking a method on one response may sometimes take a few seconds to complete, while the same method on a different response will return immediately with the desired info.
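The lazy, in-order filling of responses described here can be modelled with a queue of outstanding responses. This is a toy model under strong assumptions: the "wire" is just an iterator of status codes that arrive in request order, and PipelineSketch and its nested Response class are hypothetical names, not the HTTPClient's classes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;

final class PipelineSketch {
    private final Iterator<Integer> wire; // status codes in arrival order
    private final Deque<Response> outstanding = new ArrayDeque<>();

    PipelineSketch(Iterator<Integer> wire) {
        this.wire = wire;
    }

    /** "Sends" a request and immediately returns a response that is not filled in yet. */
    Response send() {
        Response r = new Response();
        outstanding.addLast(r);
        return r;
    }

    final class Response {
        private Integer status; // null until read off the wire

        boolean isRead() {
            return status != null;
        }

        int getStatusCode() {
            // Responses arrive in request order, so every earlier
            // outstanding response has to be read before this one.
            while (status == null) {
                Response next = outstanding.removeFirst();
                next.status = wire.next();
            }
            return status;
        }
    }

    public static void main(String[] args) {
        PipelineSketch con = new PipelineSketch(Arrays.asList(200, 404).iterator());
        Response r1 = con.send();
        Response r2 = con.send();
        System.out.println("r2 status " + r2.getStatusCode() + ", r1 read: " + r1.isRead());
    }
}
```

Asking r2 for its status first is what forces r1 to be read, which mirrors the r1/r2 example in the paragraph above.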

Relationship between HTTPConnection instances and TCP connections

The relationship between an HTTPConnection instance and actual TCP connections (instances of java.net.Socket) is as follows: each HTTPConnection instance creates its own sockets, i.e. sockets are not shared between instances. Sockets are created on demand, just before a request is sent (so creating an HTTPConnection instance does not immediately cause a TCP connection to be opened). A new socket is only created if either the current socket (if any) has been closed, or the previous request/response over it has determined that it will be closed at the end of that response.

Some implications of the above are that 1) applications can control the number of parallel TCP connections to some degree by creating multiple instances of HTTPConnection pointing to the same server, but 2) that control is not absolute: if the server does not support keep-alive's and the application sends requests before having read the response to the previous request, then you may end up with multiple parallel TCP connections from the same HTTPConnection instance.

Multithreading

The HTTPClient is completely multithread safe (or at least should be ;-). A single HTTPConnection instance may be shared between multiple threads, multiple requests may be run concurrently on a single or multiple HTTPConnection instances, and you may even have multiple threads reading on a single response (though I'm not sure that the latter makes much sense). Because a number of parameters are on a per HTTPConnection instance basis (such as the module list, proxy settings, default headers, etc), care should be taken when modifying these parameters if the HTTPConnection instance is shared among multiple threads.

HttpURLConnection

The HttpURLConnection class provides a java.net.HttpURLConnection compatible interface to the HTTPClient. It exists mainly to allow the HTTPClient to be used as a direct plug-in replacement for Sun's HttpClient (part of the JDK); if you are writing HTTPClient specific code, then I recommend using the HTTPClient's "native" API, i.e. HTTPConnection et al.

The HttpURLConnection follows the model of java.net.URLConnection and java.net.HttpURLConnection in that each instance is only capable of doing a single request (i.e. a new instance must be created for each request, usually via URL.openConnection() or URL.openStream()). To enable reuse of connections, HttpURLConnection manages a pool of HTTPConnection's internally. Different instances of HttpURLConnection will use the same HTTPConnection if they have the same host and port. This way socket connections are reused as much as possible (i.e. you still get to take advantage of the persistent connections capability). In addition, connect() will only send the request but not read or parse the response; only when a method such as getInputStream(), getResponseCode(), etc, is invoked is the response read and parsed. This, together with the reuse of HTTPConnection's, allows requests to be pipelined by explicitly invoking connect(). Example:

    URLConnection con1 = url1.openConnection();
    con1.connect();                             // sends first request
    URLConnection con2 = url2.openConnection();
    con2.connect();                             // sends second request
    InputStream is1 = con1.getInputStream();    // waits for first response
    InputStream is2 = con2.getInputStream();    // waits for second response

Protocol Version

The request protocol version sent is always HTTP/1.1, except in a few circumstances when a broken server is encountered, in which case the version sent reverts to HTTP/1.0. An HTTP/0.9 request is never sent.

The protocol version returned by the server is used to select between different mechanisms for persistent connections. If the server advertises itself as being HTTP/1.1 compliant then HTTP/1.1 persistent connections are used; otherwise HTTP/1.0 keep-alive's are used (the difference is the tokens used in the Connection header for signaling persistence and the end of a connection). Apart from this, the only other distinction made between talking to an HTTP/1.0 or an HTTP/1.1 server is in how requests are automatically retried (retried requests with a body need to use slightly different mechanisms for determining how long to wait after sending the headers, before sending the body).

Modules

A number of functions of the HTTPClient, such as authorization and redirection handling, are broken out into modules. Modules can be dynamically added and removed, thereby allowing certain functionality to be disabled or new functionality to be added without having to modify any of the core code. Each connection has a list of modules it will use to process a request and response. When a request method is invoked (such as Get() or Post()) the request is first assembled into a Request instance. Then the request handler of each module is invoked in turn with this request. This handler may modify the request (such as add headers) or even generate a response directly (such as a cache might do). Only after all handlers have been invoked (and none of them generated a response) is the request actually sent over the wire. Similarly, when a response is read the response handlers in each module are invoked in turn. They may do certain things based on the status code (such as the redirection module) or the headers (such as the cookie module), modify the response, or even generate a new request (such as in the redirection and authentication modules). If a new request is generated the process starts from the top again.
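The request half of this processing is a classic handler chain. The sketch below illustrates the control flow only: the Module interface here is a deliberately simplified, hypothetical one (a single method returning a response string or null) and is not the HTTPClientModule interface, whose real signatures are given in the API docs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ModuleChainSketch {
    /** Simplified stand-in for a request: a URI plus header fields. */
    static final class Request {
        final String uri;
        final Map<String, String> headers = new TreeMap<>();

        Request(String uri) {
            this.uri = uri;
        }
    }

    /** Hypothetical module: return a response to short-circuit, or null to continue. */
    interface Module {
        String handleRequest(Request req);
    }

    /** Invokes the handlers in list order; only if none answers is the request "sent". */
    static String process(Request req, List<Module> modules) {
        for (Module m : modules) {
            String response = m.handleRequest(req);
            if (response != null) {
                return response; // e.g. a cache answering directly
            }
        }
        return "sent " + req.uri + " with headers " + req.headers;
    }

    public static void main(String[] args) {
        List<Module> modules = new ArrayList<>();
        modules.add(req -> { req.headers.put("Cookie", "a=b"); return null; }); // modifies the request
        modules.add(req -> req.uri.equals("/hit") ? "cached" : null);          // may answer itself
        System.out.println(process(new Request("/miss"), modules));
        System.out.println(process(new Request("/hit"), modules));
    }
}
```

Response handling would run a second pass over the same list, with the extra twist described above that a handler may produce a new request and restart the whole process.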

The currently supplied modules are the AuthorizationModule, the RedirectionModule, the ContentEncodingModule, the TransferEncodingModule, the CookieModule, the ContentMD5Module, the RetryModule and the DefaultModule. These are explained in more detail further down.

Modules can be dynamically added and removed to tailor the request and response processing desired. The methods HTTPConnection.addDefaultModule(), HTTPConnection.removeDefaultModule() and HTTPConnection.getDefaultModules() manipulate and return the list of default modules which is used when a new HTTPConnection is created. Similarly, the methods HTTPConnection.addModule(), HTTPConnection.removeModule() and HTTPConnection.getModules() manipulate and return the list of current modules for an HTTPConnection instance. Note that for the modules which are not public you must get their Class objects via Class.forName() (e.g. to remove the redirection module, use con.removeModule(Class.forName("HTTPClient.RedirectionModule"))).

The default list of modules is initialized from the java system property HTTPClient.Modules. This property must be a "|" separated list of class names. If this property is not set it defaults to all the classes listed below. Normally if during class initialization any module in the list does not exist or cannot be instantiated then an Error is thrown. However, if this is being used in an unsigned applet then the error is suppressed. This way applets can limit the modules loaded over the net by simply not providing them (remember, they can't set system properties due to security restrictions).

You may create your own modules and add them using the above methods. Any module you write must implement the HTTPClientModule interface. See the API docs for more info. Note: this interface may change. If you write a module and find the interface insufficient or difficult, please contact me. Also, if you write a module which you think might be of general usefulness and would like to make it freely available, let me know.

Here is a short description of each module.

AuthorizationModule

Authorization is briefly described in Getting Started. As mentioned, this module will handle both server and proxy authorization requests (status codes 401 and 407). In addition to the 'Basic' and 'Digest' authorization schemes, the AuthorizationModule can be made to handle other schemes as well, so long as they are not "too exotic" (i.e. they follow the simple challenge-response mechanism outlined in the http specs); this is done by setting your own AuthorizationHandler. Of course, if you need to do something more sophisticated you can always plug in your own authorization module.

When confronted with an authorization request the authorization module will query all known authorization info for a possible candidate (the match must be for the host, port, scheme and realm). If no suitable info is found, or if the server rejects any info found, an authorization handler is called to try and get the necessary info from the user; if the user does not give any information, or if the information she gives is also rejected, then the retrying is terminated and the last failure status returned to the caller. The default handler currently only understands requests for the 'Basic' and 'Digest' authorization schemes; you may however set your own handler via the AuthorizationInfo.setAuthHandler() method. The handler given must implement the AuthorizationHandler interface. To disable the handler completely give null for the handler.

A server (or proxy) may send multiple authorization challenges in the response, in which case the above algorithm is modified to go through the list of challenges in the same order as they were sent, trying to get authorization info for each challenge in the list and going to the next challenge if either no info was found or the server rejects that info. If the end of the list is reached without achieving authorization then the authorization handler is called on each challenge (in the same order) until either an authorization request is successful, the authorization handler returns null (e.g. when the Cancel button in the default popup box is activated) or the list is exhausted, in which case the response to the last failed request is returned.

Whenever the default authorization handler needs to ask the user for a username and password, it queries the current AuthorizationPrompter. This is responsible for popping up a dialog box or otherwise acquiring the necessary info. A default prompter is provided which will pop up a simple dialog box, if the AWT is running, or use a command line prompt if it isn't and the currently running system is Unix or VMS; you may set your own prompter using DefaultAuthHandler.setAuthorizationPrompter(). To prevent the prompter from being queried (e.g. to prevent the popup box from appearing) you can either set the prompter to null, or use con.setAllowUserInteraction(false).

RedirectionModule

This module handles the redirection status codes 301, 302, 303, 305 and 307. 301 and 307 responses are only redirected if the request method was GET or HEAD; this is because redirecting, say, a POST blindly might lead to undesired behaviour, as the circumstances leading to the POST might have changed. 302 and 303 are treated identically: the new request to the new location is done using GET (this is what many cgi's expect - they are basically directing you to a prefabricated answer). 305 is only honored if the connection is not already using a proxy.
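The rules in this paragraph amount to a small decision table, sketched below as a pure function. RedirectRuleSketch is an illustrative name, and one detail is an assumption on my part: for 305 the text does not say which method the repeated request uses, so the sketch keeps the original method.

```java
final class RedirectRuleSketch {
    /**
     * Returns the method to use for the follow-up request, or null if the
     * response should be handed back to the caller without redirecting.
     */
    static String followUpMethod(int status, String method, boolean usingProxy) {
        boolean getOrHead = method.equals("GET") || method.equals("HEAD");
        switch (status) {
            case 301:
            case 307:
                return getOrHead ? method : null; // never re-POST blindly
            case 302:
            case 303:
                return "GET";                     // fetch the prefabricated answer
            case 305:
                return usingProxy ? null : method; // assumption: method is kept
            default:
                return null;                      // not a handled redirection status
        }
    }

    public static void main(String[] args) {
        System.out.println(followUpMethod(302, "POST", false));
        System.out.println(followUpMethod(301, "POST", false));
    }
}
```

A caller would combine the returned method with the Location header of the response to build the new request.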

This module also keeps a list of permanently redirected URL's (status code 301) and will preemptively redirect requests for these. This list is volatile (i.e. it will be lost when the application exits).

CookieModule

This module implements cookies as defined by Netscape's cookie spec and the latest HTTP State Management Mechanism spec. During response processing it intercepts and parses the Set-Cookie and Set-Cookie2 header fields (removing them from the response in the process), and during request processing it adds all eligible cookies to the requests.

Because of privacy issues surrounding cookies, whenever the server tries to set a cookie a policy handler is invoked to see whether this cookie should be accepted. The default handler brings up a popup describing the cookie and allowing the user to accept or reject it; the user may also summarily accept or reject cookies from whole domains. You may substitute your own policy handler using the setCookiePolicyHandler() method. If you set the handler to null then all cookies will be silently accepted. If you do not want any cookies to be accepted then either remove the CookieModule from the list of modules or set your own handler which always returns false. The handler must implement the CookiePolicyHandler interface.

The CookieModule provides methods to view the currently stored cookies, and to add and remove cookies from that list. If you want to manually provide additional cookies for a request then you must add them to the CookieModule's internal list - any Cookie header fields you create yourself and pass to a request will be removed by the CookieModule.

The CookieModule will save and restore cookies at the start and end of the application only if the java system property HTTPClient.cookies.save is set to true. All cookies which are meant to persist across invocations are then saved to a file when the application exits and read from that file when the CookieModule is loaded. The format of the file is a serialized Hashtable of Cookies. The name of the file can be controlled through the system property HTTPClient.cookies.jar. If this is not set, a system dependent name is used:

  • Unix: $HOME/.httpclient_cookies
  • Mac: System Folder:Preferences:HTTPClientCookies
  • OS/2: $HOME/.httpclient_cookies
  • Win95: $JAVA_HOME/.httpclient_cookies
  • WinNT: $HOME/.httpclient_cookies

(where $HOME is the value of the java system property "user.home", and $JAVA_HOME is the value of the java system property "java.home").

ContentEncodingModule

Servers may apply various content encodings to the content. The most widely used encodings are compressions: gzip, compress, and deflate. This module will handle these three content encodings by pushing an appropriate decoding stream; this means that the data read from getInputStream() will be the clear text. The Content-Encoding header is also modified appropriately.

You might want to consider disabling this module if you are using the HTTPClient for things like web-copying where storing the compressed document makes sense.
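"Pushing a decoding stream" means wrapping the raw body stream so that callers read decoded bytes. Here is a rough sketch of that idea using the JDK's own decoders; it covers gzip and deflate only (the JDK has no decoder for compress), and DecodingStreamSketch with its helper methods is a made-up name rather than the module's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

final class DecodingStreamSketch {
    /** Wraps the raw body stream in a decoder chosen by the Content-Encoding value. */
    static InputStream pushDecoder(InputStream raw, String encoding) throws IOException {
        if ("gzip".equalsIgnoreCase(encoding)) return new GZIPInputStream(raw);
        if ("deflate".equalsIgnoreCase(encoding)) return new InflaterInputStream(raw);
        return raw; // anything else: hand the bytes through untouched
    }

    /** Demo helper: gzip-compresses a string the way a server might. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads the decoded stream to its end and returns the clear text. */
    static String decode(byte[] body, String encoding) {
        try (InputStream in = pushDecoder(new ByteArrayInputStream(body), encoding)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(gzip("clear text"), "gzip"));
    }
}
```

A web-copying tool that wants the compressed bytes would simply skip the wrapping step, which is the effect of disabling the module.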

TransferEncodingModule

This is similar to the ContentEncoding module except that it applies to transfer encodings. It also handles the gzip, compress, and deflate encodings by pushing an appropriate decoding stream onto the response input stream and modifies the Transfer-Encoding header appropriately.

ContentMD5Module

Some servers may generate a Content-MD5 header which contains an MD5 hash of the message body (after any content encoding, but before any transport encoding is applied). If this header is present this module will push a stream which calculates the MD5 hash of the body. When the stream is closed or the end of the data is reached the calculated hash is compared to the one in the Content-MD5 header and if they don't match an IOException is thrown.
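The check described here can be sketched with the JDK's DigestInputStream, which hashes bytes as they are read. The sketch assumes the header value is the Base64 encoding of the 16-byte MD5 digest, as the Content-MD5 header is defined; ContentMd5Sketch and its methods are illustrative names, not the module's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class ContentMd5Sketch {
    private static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required of every JDK
        }
    }

    /** Header value a server would send for this body. */
    static String headerFor(byte[] body) {
        return Base64.getEncoder().encodeToString(md5().digest(body));
    }

    /** Reads the body to its end through a hashing stream; throws on mismatch. */
    static void readAndVerify(InputStream body, String contentMd5) throws IOException {
        MessageDigest digest = md5();
        try (InputStream in = new DigestInputStream(body, digest)) {
            byte[] chunk = new byte[4096];
            while (in.read(chunk) != -1) {
                // every byte read is fed into the digest
            }
        }
        byte[] expected = Base64.getDecoder().decode(contentMd5.trim());
        if (!MessageDigest.isEqual(expected, digest.digest())) {
            throw new IOException("Content-MD5 mismatch");
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean matches(byte[] body, String contentMd5) {
        try {
            readAndVerify(new ByteArrayInputStream(body), contentMd5);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(body, headerFor(body)));
    }
}
```

Note that the comparison can only happen once the end of the data is reached, which is why a mismatch shows up as an exception late in the read rather than up front.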

DefaultModule

This handles the response status codes 408 (request timeout) and 411 (length required).

RetryModule

This module is special. It is responsible for automatically retrying requests which were aborted due to an IOException on the socket. It is unlike other modules in that it is closely tied in to the core code, instead of just manipulating the request and response structures as other modules do. The code in this module could have been put in with the rest of the core code, but moving it to a module has the advantage that this automatic retrying of requests may be disabled using the standard mechanism of removing modules.

Ordering the Modules

The handlers in the modules are invoked in the order the modules are placed in the list. Because of certain constraints between modules this order is important. The default order for the supplied modules is:

  1. RetryModule
  2. CookieModule
  3. RedirectionModule
  4. AuthorizationModule
  5. DefaultModule
  6. TransferEncodingModule
  7. ContentMD5Module
  8. ContentEncodingModule

However, the constraints impose only a partial ordering, so the above order may be changed as long as the following restrictions are observed:

  • The RetryModule must be first. It catches a special RetryException, which is thrown by the core code. If it were not the first module it would not be able to catch this exception.
  • The TransferEncoding module must precede any module which handles the response message (such as the ContentEncoding module and the ContentMD5Module). There are a number of reasons for this, but basically it's because all the headers refer to the message before transfer encoding is applied.
  • The ContentMD5 module must precede the ContentEncoding module, as the hash is calculated for the encoded message.
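The three restrictions can be written down as a small check over a list of module names. This is only a sketch: the checker class is hypothetical, and for the second restriction it looks only at the two modules the text names explicitly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical checker, not part of HTTPClient.
final class ModuleOrderCheck {

    /** Returns true if the order satisfies the three restrictions listed above. */
    static boolean isValidOrder(List<String> modules) {
        int retry = modules.indexOf("RetryModule");
        int transfer = modules.indexOf("TransferEncodingModule");
        int md5 = modules.indexOf("ContentMD5Module");
        int content = modules.indexOf("ContentEncodingModule");

        // 1. The RetryModule, if present, must be first.
        if (retry > 0) return false;
        // 2. TransferEncoding must precede the modules that handle the response body.
        if (transfer >= 0 && md5 >= 0 && md5 < transfer) return false;
        if (transfer >= 0 && content >= 0 && content < transfer) return false;
        // 3. ContentMD5 must precede ContentEncoding.
        if (md5 >= 0 && content >= 0 && content < md5) return false;
        return true;
    }

    public static void main(String[] args) {
        List<String> defaults = Arrays.asList(
                "RetryModule", "CookieModule", "RedirectionModule", "AuthorizationModule",
                "DefaultModule", "TransferEncodingModule", "ContentMD5Module",
                "ContentEncodingModule");
        System.out.println(isValidOrder(defaults));
    }
}
```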

Java System Properties recognized by HTTPClient

There are a number of java system properties which are used by the HTTPClient. Most are documented somewhere in the api docs. Some of the properties may contain a list of elements, in which case the elements are separated by vertical bars ("|"). White space is ignored, except that a "| |" produces an empty element whereas "||" is treated like a single delimiter (i.e. "|").
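One possible reading of the list rule above, sketched in Java. The parser class is hypothetical and the real HTTPClient parsing code may differ in edge cases (for example a trailing bar); here an empty token between two bars is skipped, while a whitespace-only token becomes an empty element.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser, not part of HTTPClient.
final class BarListParser {

    /** Splits a property value on '|' following the rule described above. */
    static List<String> parse(String value) {
        List<String> elements = new ArrayList<>();
        for (String token : value.split("\\|", -1)) {
            if (token.isEmpty()) {
                continue; // "||" acts like a single delimiter
            }
            elements.add(token.trim()); // "| |" trims down to an empty element
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(parse("host1 | host2||host3"));
    }
}
```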

The properties are read once, when the class that uses them is loaded (i.e. they're read in the class' static initializer). This means they need to be set either on the command line via the -D option, as in

    java -DHTTPClient.dontChunkRequests=true MyApp

or early in your application before the class is first accessed:

    System.getProperties().put("HTTPClient.dontChunkRequests", "true");
    ...
    HTTPConnection.setDefaultHeaders(...);
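The "read once" behaviour is a general property of Java static initializers and can be observed without HTTPClient. In this sketch the nested Settings class stands in for a library class, and the property name demo.dontChunkRequests is made up for the demonstration.

```java
import java.util.Arrays;

final class StaticPropertyDemo {

    /** Stand-in for a library class that reads its configuration once, at class initialization. */
    static final class Settings {
        static final boolean DONT_CHUNK = Boolean.getBoolean("demo.dontChunkRequests");
    }

    /** Sets the property, touches Settings, changes the property, and reads Settings again. */
    static boolean[] observe() {
        System.setProperty("demo.dontChunkRequests", "true");
        boolean first = Settings.DONT_CHUNK;   // first access runs the static initializer
        System.setProperty("demo.dontChunkRequests", "false");
        boolean second = Settings.DONT_CHUNK;  // later changes are not picked up
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));
    }
}
```

Both reads return the value that was in place when the class was first touched, which is why the property has to be set before any code refers to the class that reads it.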

Here is a summary of all properties recognized:

  • http.proxyHost, http.proxyPort - Read by HTTPConnection. Used to specify the http proxy to use. See setProxyServer() for more info. These are the same properties that are used by Sun's JDK 1.1 (and later).
  • proxySet, proxyHost, proxyPort - Read by HTTPConnection. Obsolete. Used by Sun's JDK 1.0.2. If http.proxyHost is not set and proxySet is true, then the default proxy is set using the values in proxyHost and proxyPort.
  • HTTPClient.nonProxyHosts - Read by HTTPConnection. Used to specify a list of hosts for which no http proxy is to be used. See dontProxyFor() for more info.
  • http.nonProxyHosts - Read by HTTPConnection. Used to specify a list of hosts for which no http proxy is to be used. See dontProxyFor() for more info. This is the same property that is used by Sun's JDK 1.1 (and later).
  • HTTPClient.socksHost - Read by HTTPConnection. Used to specify the SOCKS proxy host. See setSocksServer() for more info.
  • HTTPClient.socksPort - Read by HTTPConnection. Used to specify the SOCKS proxy port. See setSocksServer() for more info.
  • HTTPClient.socksVersion - Read by HTTPConnection. Used to specify the SOCKS proxy version. See setSocksServer() for more info.
  • HTTPClient.Modules - Read by HTTPConnection. Used to define the default list of modules. See HTTPConnection.addDefaultModule() for more info.
  • HTTPClient.disable_pipelining - Read by HTTPConnection. Used to disable all pipelining. This should never be needed, but you may encounter a server which displays problems when pipelining requests. Setting this property to true will cause the HTTPClient to stall each request until the headers from the response to the previous request have been received and parsed.
  • HTTPClient.disableKeepAlives - Read by HTTPConnection. Used to disable keep-alives. This is exactly equivalent to (and is implemented as) con.setDefaultHeaders(new NVPair[] { new NVPair("Connection", "close") }).
  • HTTPClient.forceHTTP_1.0 - Read by HTTPConnection. Used to force the client to pretend it's only capable of HTTP/1.0. Setting this to true causes all requests to be sent as if the server could only handle HTTP/1.0, and causes the HTTP-version in the request-line to be sent as HTTP/1.0.
  • HTTPClient.dontChunkRequests - Read by HTTPConnection. Used to prevent the client from using the chunked transfer encoding on requests. Setting this to true causes all data written to an HttpOutputStream (that was not constructed with a content length) to be buffered and only sent when the stream is closed.
  • HTTPClient.deferStreamed - Read by HTTPConnection. When a request is sent which uses an HttpOutputStream, and handling of the response by modules would require the module to resend the request, then modules have to punt because they do not have the request data (i.e. the data written to the output stream). In order to make things easier on the application, an application can set this property to true and then check the retry flag in the response for whether it needs to resend the request. If the value is false, the retry flag will always be false. The default is false in order to prevent memory leaks in applications which aren't prepared for it.
  • HTTPClient.dontTimeoutRespBody - Read by RespInputStream. Used to restore pre-V0.3-3 behaviour of disabling any timeout while reading the response body. Setting this to true causes read()'s on the response input stream to block indefinitely until some data arrives or the connection closes, irrespective of whether a timeout has been set or not.
  • HTTPClient.cookies.hosts.accept - Read by CookieModule. Used to initialize the list of hosts and domains from which to always accept cookies. See setCookiePolicyHandler() for more info.
  • HTTPClient.cookies.hosts.reject - Read by CookieModule. Used to initialize the list of hosts and domains from which to always reject cookies. See setCookiePolicyHandler() for more info.
  • HTTPClient.cookies.save - Read by CookieModule. If set to true then the CookieModule will read stored cookies at startup time, and save cookies that should persist at application exit time.
  • HTTPClient.cookies.jar - Read by CookieModule. This specifies the cookie jar to use for reading and saving cookies. If not set, a system dependent name is used.
  • HTTPClient.log.file - Read by Log. If defined, this specifies the name of the file to which log messages are to be written. By default log messages are written to System.err.
  • HTTPClient.log.mask - Read by Log. If defined, this specifies which debug messages should be logged. The value is a number which is the bitwise OR ('|') of the values of the facilities for which logging should be enabled. E.g. the value -1 enables logging for all facilities. By default logging is disabled.
  • HTTPClient.HttpURLConnection.AllowUI - Read by HttpURLConnection. If set to true then set the defaultAllowUserInteraction to true. By default this is set to false in java.net.URLConnection.
  • http.agent - Read by HttpURLConnection. If set then the "User-Agent" header is set to this property's value.
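The HTTPClient.log.mask value is a bit mask, so the usual bitwise test applies. The facility constants below are invented for the example; the real values are defined by HTTPClient's Log class and are not reproduced here.

```java
// Hypothetical illustration of a facility bit mask; constants are NOT HTTPClient's real values.
final class LogMaskSketch {
    static final int FACILITY_A = 1 << 0;
    static final int FACILITY_B = 1 << 1;
    static final int FACILITY_C = 1 << 2;

    /** Parses the property value; an unset property means logging is disabled (mask 0). */
    static int parseMask(String propertyValue) {
        return propertyValue == null ? 0 : Integer.parseInt(propertyValue.trim());
    }

    /** True if logging is enabled for the given facility under this mask. */
    static boolean enabled(int mask, int facility) {
        return (mask & facility) != 0;
    }

    public static void main(String[] args) {
        int mask = FACILITY_A | FACILITY_C; // bitwise OR of the facilities to enable
        System.out.println(enabled(mask, FACILITY_B));
    }
}
```

A mask of -1 has every bit set, which is why it enables all facilities regardless of how many are defined.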

HTTP Header Fields

All request methods accept optional header fields to be sent with the request. Here is a list of possible request and response header fields as defined in the HTTP/1.1 spec. I have added some comments to some of them, but for further info I recommend getting the specs (every header field is described in a paragraph of its own in the spec, so you can read just the part that interests you and ignore the rest).

Request Header Fields

  • Cache-Control
  • Connection - used for persistent connections; need not be set by applications
  • Date - use only in PUT/POST; even then optional
  • Keep-Alive - used for persistent connections (HTTP/1.0 only); set by the HTTPClient as needed.
  • Pragma
  • Transfer-Encoding - set by the client as necessary
  • Upgrade
  • Via

  • Accept
  • Accept-Charset
  • Accept-Encoding - set by the ContentEncoding module
  • Accept-Language
  • Authorization - generated by HTTPClient as necessary
  • From
  • Host - set by HTTPClient; may be overridden by application
  • If-Modified-Since - the server returns 304 if the document was not modified
  • If-Match - the server returns 412 if the match fails
  • If-None-Match - the server returns 304 if the match succeeds
  • If-Range
  • If-Unmodified-Since - the server returns 412 if modified
  • Max-Forwards - used with TRACE and OPTIONS
  • Proxy-Authorization - generated by HTTPClient as necessary
  • Range
  • Referer
  • User-Agent - generated by HTTPClient; can be modified

  • Allow - only with PUT
  • Content-Base
  • Content-Encoding - examples: gzip, compress, deflate
  • Content-Language
  • Content-Length - set by HTTPClient
  • Content-MD5
  • Content-Range
  • Content-Type

Response Header Fields

  • Cache-Control
  • Connection - used for persistent connections
  • Date - date that the document was delivered (not when it was last modified)
  • Keep-Alive - used for persistent connections (HTTP/1.0 only)
  • Pragma
  • Transfer-Encoding - HTTPClient handles chunked in the core code; gzip and deflate are handled by the TransferEncoding module.
  • Upgrade - sent with a 101 response;
  • Via - carries a list of proxies that were involved in returning the request; especially interesting for TRACE and OPTIONS requests.

  • Age - from caches
  • Location - sent with a redirection status code; since HTTPClient automatically handles 301, 302, 303, 305 and 307 status codes this field is seldom seen.
  • Proxy-Authenticate - read by HTTPClient
  • Public
  • Retry-After - optionally sent with a 503 status
  • Vary
  • Warning
  • WWW-Authenticate - read by HTTPClient

  • Accept-Ranges
  • Allow - with a 405 status
  • Content-Base
  • Content-Encoding - gzip, deflate, and compress are handled by the ContentEncoding module.
  • Content-Language
  • Content-Length
  • Content-Location
  • Content-MD5 - handled by the ContentMD5 module.
  • Content-Range
  • Content-Type
  • ETag
  • Expires - the date when the document expires
  • Last-Modified - the date the document was last modified

Further Reading

  • General HTTP Info at W3C
  • HTTP/1.0 Spec (RFC 1945)
  • HTTP/1.1 Spec (RFC 2616)
  • Digest Authentication Spec (RFC 2617)

Ronald Tschalär / 6. May 2001 / ronald@innovation.ch.

HttpClient

HttpClient - HttpClient Logging Practices

Last published: 08 February 2008 | Doc for 3.1

Logging Practices

As a library, HttpClient should not dictate which logging framework the user has to use. Therefore HttpClient utilizes the logging interface provided by the Commons Logging package. Commons Logging provides a simple and generalized log interface to various logging packages. By using Commons Logging, HttpClient can be configured for a variety of different logging behaviours. That means the user has to choose which logging framework to use. By default Commons Logging supports the following logging frameworks:

  • Log4J
  • java.util.logging
  • SimpleLog (internal to Commons Logging)

By implementing some simple interfaces Commons Logging can be extended to support basically any other custom logging framework. Commons Logging tries to automatically discover the logging framework to use. If it fails to select the expected one, you must configure Commons Logging by hand. Please refer to the Commons Logging documentation for more information.

HttpClient performs two different kinds of logging: the standard context logging used within each class, and wire logging.

Context Logging

Context logging contains information about the internal operation of HttpClient as it performs HTTP requests. Each class has its own log named according to the class's fully qualified name. For example, the class HttpClient has a log named org.apache.commons.httpclient.HttpClient. Since all classes follow this convention, it is possible to configure context logging for all classes using the single log named org.apache.commons.httpclient.
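The reason one parent name is enough is that logger names form a dot-separated hierarchy and children inherit the parent's level. The sketch below shows this with java.util.logging from the JDK rather than with Commons Logging itself; the demo class name is made up.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggerHierarchyDemo {
    // Strong references are kept because LogManager only holds loggers weakly.
    static final Logger PACKAGE_LOG = Logger.getLogger("org.apache.commons.httpclient");
    static final Logger CLASS_LOG = Logger.getLogger("org.apache.commons.httpclient.HttpClient");

    /** Sets the level on the package logger only and checks that the class logger inherits it. */
    static boolean classLogInheritsDebug() {
        PACKAGE_LOG.setLevel(Level.FINE);
        // The class logger has no level of its own, so it uses its parent's.
        return CLASS_LOG.getLevel() == null && CLASS_LOG.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        System.out.println(classLogInheritsDebug());
    }
}
```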

Wire Logging

The wire log is used to log all data transmitted to and from servers when executing HTTP requests. This log should only be enabled to debug problems, as it will produce an extremely large amount of log data, some of it in binary format.

Because the content of HTTP requests is usually less important for debugging than the HTTP headers, these two types of data have been separated into different wire logs. The content log is httpclient.wire.content and the header log is httpclient.wire.header.

Configuration Examples

Commons Logging can delegate to a variety of loggers for processing the actual output. Below are configuration examples for Commons Logging, Log4j and java.util.logging.

Commons Logging Examples

Commons Logging comes with a basic logger called SimpleLog. This logger writes all logged messages to System.err. The following examples show how to configure Commons Logging via system properties to use SimpleLog.

Note: The system properties must be set before a reference to any Commons Logging class is made.

Enable header wire + context logging - Best for Debugging

    System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog");
    System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
    System.setProperty("org.apache.commons.logging.simplelog.log.httpclient.wire.header", "debug");
    System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient", "debug");

Enable full wire (header and content) + context logging

    System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog");
    System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
    System.setProperty("org.apache.commons.logging.simplelog.log.httpclient.wire", "debug");
    System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient", "debug");

Enable just context logging

    System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog");
    System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
    System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient", "debug");

Log4j Examples

The simplest way to configure Log4j is via a log4j.properties file. Log4j will automatically read and configure itself using a file named log4j.properties when it's present at the root of the application classpath. Below are some Log4j configuration examples.

Note: Log4j is not included in the HttpClient distribution.

Enable header wire + context logging - Best for Debugging

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p [%c] %m%n
    log4j.logger.httpclient.wire.header=DEBUG
    log4j.logger.org.apache.commons.httpclient=DEBUG

Enable full wire (header and content) + context logging

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p [%c] %m%n
    log4j.logger.httpclient.wire=DEBUG
    log4j.logger.org.apache.commons.httpclient=DEBUG

Log wire to file + context logging

    log4j.rootLogger=INFO
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p [%c] %m%n
    log4j.appender.F=org.apache.log4j.FileAppender
    log4j.appender.F.File=wire.log
    log4j.appender.F.layout=org.apache.log4j.PatternLayout
    log4j.appender.F.layout.ConversionPattern=%5p [%c] %m%n
    log4j.logger.httpclient.wire=DEBUG, F
    log4j.logger.org.apache.commons.httpclient=DEBUG, stdout

Enable just context logging

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%5p [%c] %m%n
    log4j.logger.org.apache.commons.httpclient=DEBUG

Note that the default configuration for Log4J is very inefficient as it causes all the logging information to be generated but not actually sent anywhere. The Log4J manual is the best reference for how to configure Log4J. It is available at http://logging.apache.org/log4j/docs/manual.html

java.util.logging Examples

Since JDK 1.4 there has been a package java.util.logging that provides a logging framework similar to Log4J. By default it reads a config file from $JAVA_HOME/jre/lib/logging.properties which looks like this (comments stripped):

    handlers=java.util.logging.ConsoleHandler
    .level=INFO
    java.util.logging.FileHandler.pattern = %h/java%u.log
    java.util.logging.FileHandler.limit = 50000
    java.util.logging.FileHandler.count = 1
    java.util.logging.FileHandler.formatter = java.util.logging.XMLFormatter
    java.util.logging.ConsoleHandler.level = INFO
    java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
    com.xyz.foo.level = SEVERE

To customize logging, a custom logging.properties file should be created in the project directory. The location of this file must be passed to the JVM as a system property. This can be done on the command line like so:

    $JAVA_HOME/bin/java -Djava.util.logging.config.file=$HOME/myapp/logging.properties -classpath $HOME/myapp/target/classes com.myapp.Main

Alternatively, LogManager#readConfiguration(InputStream) can be used to pass it the desired configuration.
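As a sketch of the LogManager route, the configuration can be fed in from memory instead of from a file. The class name is illustrative. One assumption beyond the text: the ConsoleHandler's own level defaults to INFO, so the sketch also raises it, otherwise FINEST records would be filtered out by the handler.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

final class InlineLoggingConfig {
    // Same keys as a logging.properties file, held in a string for the demo.
    static final String CONFIG =
            ".level=INFO\n"
          + "handlers=java.util.logging.ConsoleHandler\n"
          + "java.util.logging.ConsoleHandler.level=FINEST\n"
          + "java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter\n"
          + "httpclient.wire.header.level=FINEST\n"
          + "org.apache.commons.httpclient.level=FINEST\n";

    /** Loads CONFIG into the global LogManager and returns the named logger. */
    static Logger configureAndGet(String loggerName) {
        try {
            // Properties streams are read as ISO-8859-1.
            LogManager.getLogManager().readConfiguration(
                    new ByteArrayInputStream(CONFIG.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Logger.getLogger(loggerName);
    }

    public static void main(String[] args) {
        Logger wire = configureAndGet("httpclient.wire.header");
        System.out.println(wire.getLevel());
    }
}
```

Note that readConfiguration replaces the logging configuration for the whole JVM, so it is best called once, early in startup.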

Enable header wire + context logging - Best for Debugging

    .level=INFO
    handlers=java.util.logging.ConsoleHandler
    java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
    httpclient.wire.header.level=FINEST
    org.apache.commons.httpclient.level=FINEST

Enable full wire (header and content) + context logging

    .level=INFO
    handlers=java.util.logging.ConsoleHandler
    java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
    httpclient.wire.level=FINEST
    org.apache.commons.httpclient.level=FINEST

Enable just context logging

    .level=INFO
    handlers=java.util.logging.ConsoleHandler
    java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
    org.apache.commons.httpclient.level=FINEST

More detailed information is available from the Java Logging documentation. © 2001-2008, Apache Software Foundation

Using log4j to show Quartz debug output

Blog: 键盘是用来敲的不是看的 (http://corejava2008.iteye.com)

2011-01-14

Category: Java programming. This is very simple: just set the log level for Quartz in the log4j configuration.

    log4j.rootLogger=INFO, Console
    log4j.appender.Console=org.apache.log4j.ConsoleAppender
    log4j.appender.Console.layout=org.apache.log4j.PatternLayout
    log4j.appender.Console.layout.ConversionPattern=(%r ms) [%t] %-5p: %c#%M %x: %m%n
    log4j.logger.com.genuitec.eclipse.sqlexplorer=WARN
    log4j.logger.org.apache=WARN
    log4j.logger.org.hibernate=WARN
    log4j.logger.org.hibernate.sql=WARN
    log4j.appender.R=org.apache.log4j.RollingFileAppender
    log4j.appender.R.File=${catalina.home}/logs/out.log
    log4j.appender.R.MaxFileSize=1024KB
    log4j.appender.R.MaxBackupIndex=1
    log4j.appender.R.layout=org.apache.log4j.PatternLayout
    log4j.appender.R.layout.ConversionPattern=%p %t %c - %m%n
    log4j.logger.org.quartz=DEBUG

The last line sets Quartz's log output to the DEBUG level.

OpenStack_Hadoop

An analysis of OpenStack and Hadoop, two pieces of software from different domains

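1 OpenStack

1.1 Introduction

OpenStack is open-source virtual machine management software that manages many virtual machines spread across multiple physical machines.

Virtualization is the technique of providing multiple virtual machines on a single physical machine.

On a networked cluster of several computers (PCs or minicomputers), OpenStack can freely allocate the resources of each individual machine, without spanning physical machines, and virtualize them into one or more PCs or local networks for users to work with and to provide services.

Advantage: almost all of the (finite) resources of a single machine can be freely allocated and customized to emulate a nearly unlimited variety of low-computation business environments.

Disadvantage: once all resources are in use, the effective utilization of the real resources is low (consider the duplicated operating system environments of multiple virtual machines, the redundant copies kept by distributed storage of virtual machine images, and the overhead of the VM mechanism itself).

It is better suited to office, development, and test environments.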

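1.2 Web article: Misconceptions about virtualization

Ten misconceptions about server virtualization technology

2008-09-08 10:20, excerpted from http://datacenter.chinabyte.com/280/8297280.shtml

Misconception 1: Virtualization can consolidate the resources of multiple physical servers, so a single application can run across several pieces of physical hardware

In fact, virtualization cannot spread one application across multiple physical machines; that is the problem distributed computing sets out to solve. Distributed computing environments and virtualized environments are two different ways of consolidating resources. Running one application across physical platforms through virtualization is technically feasible, but synchronizing CPU- and memory-level instructions and data between different hardware requires special technology such as Infiniband, which greatly increases system complexity and cost. Virtualization products based on this idea have been built in the lab but could not be brought to market because of cost and other factors. None of the server virtualization solutions available today lets one application run across physical servers. In other words, the most resources a single application can use in a virtualized environment is one standalone physical server.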

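Misconception 2: Server virtualization leaves you in the awkward position of putting many eggs in one basket

Virtualization improves server utilization and flexibility. But it also means that a single server runs several independent virtual machines, that is, several different applications. Previously we ran one application per server, so server maintenance and upgrades affected only that one application. With virtualization, maintaining or upgrading a server affects all the virtual machines and applications running on it. This leads to what many people see as a problem: the "eggs and basket" problem of placing multiple virtual machines on one server.

In fact, VMware recognized this problem early on, and it can be addressed through two capabilities. The first is how to maintain and upgrade the physical server hardware after virtualization. The second is how to protect the virtual machines when a physical server fails.

First, VMware invented VMotion, which solves the problem of upgrading and maintaining physical servers after virtualization. With VMotion, VMware can dynamically migrate virtual machines to other physical servers when a server needs maintenance or an upgrade. Using memory replication, it ensures that none of the services offered by any virtual machine is interrupted, so the physical hardware stops but the applications do not. The figure below shows how VMotion works; more than 50% of VMware customers have already deployed VMotion.

Second, VMware introduced VMware HA to protect against physical server failure. When a physical server fails, VMware HA detects the event and quickly restarts the affected virtual machines on other physical servers, keeping the virtual machines safe and reliable.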

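Misconception 3: Live virtual machine migration can cross any hardware

VMware's signature feature VMotion can migrate running virtual machines between hardware servers. But there is a compatibility precondition: the two physical servers must be compatible at the CPU instruction level, with either identical CPUs or CPUs from the same family. If the CPU instructions are incompatible, the new machine's CPU cannot recognize them after the memory has been copied, and the system crashes. VMotion determines automatically whether the CPUs are instruction-level compatible.

If you can migrate the virtual machine offline, you can migrate across any ESX-compatible hardware, with no constraint on CPU model.

Misconception 4: Virtualizing the data center saves on license costs for the software running inside virtual machines

Virtualization does not change how software licenses are issued, so it does not imply savings on operating system or application licenses unless the vendors adjust their licensing policies. The idea that virtualization reduces application license costs is therefore mistaken. Equally, adopting virtualization does not increase operating system or application license costs.

Misconception 5: Data center virtualization only suits peripheral applications; critical or resource-hungry applications cannot yet be virtualized

PC server virtualization is quite mature and widely used in the United States and Europe. In fact, many critical business applications already run on virtualized platforms. Resource-intensive applications need sensible planning before they are migrated; even if a machine's resource consumption is very large, it can still be moved smoothly to a high-end PC server by upgrading the server's memory and CPU. The maximum resources a single virtual machine can support are still limited, however: a virtual machine on VMware ESX Server 3.0, for example, supports at most 16 GB of memory and 4 virtual CPUs. If that still cannot meet an application's needs, the application cannot run on the virtualized platform. In general, most resource-intensive applications can still run safely on a virtualized platform.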

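Misconception 6: Intel and AMD now support virtualization at the CPU level, so there is no need to buy virtualization software

The CPU vendors Intel and AMD are both promoting CPU-based virtualization. In practice, CPU-level virtualization just adds a number of virtualization instructions to the CPU instruction set. It does not mean users no longer need virtualization software, because CPU-level virtualization needs virtualization software in order to be used. At present none of the common operating systems supports CPU-level virtualization. VMware's virtualization platform uses the virtualization instructions provided by Intel and AMD to improve virtualization efficiency, raise virtual machine performance, and reduce virtualization overhead, which greatly accelerates data center virtualization. CPU virtualization therefore gives server virtualization a strong push rather than limiting the adoption of products like VMware's.

Misconception 7: Data center virtualization greatly reduces server performance

There are two basic virtualization architectures, hosted and bare-metal, shown in the figure below. A hosted architecture sits on top of a conventional operating system, so its overhead is high and it often has a large impact on server performance. A bare-metal architecture is built on a layer designed specifically for virtualization, which greatly reduces the overhead, can substantially improve virtual machine performance, and is the preferred architecture for enterprise data center virtualization.

To meet applications' performance requirements, users are therefore advised to adopt the enterprise-grade bare-metal architecture, which keeps the impact of data center virtualization on server performance as low as possible, generally below 10%.

The figure below shows the impact of bare-metal virtualization on application performance, as measured by VMware at a customer site in China; it shows that the overhead introduced by virtualization is low.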

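Misconception 8: Virtualization is still immature, so data center virtualization cannot yet be put on the agenda

Virtualization is already widely used: every Fortune 100 company has deployed VMware's virtualization solutions, and more than 800 of the Fortune 1000 are VMware customers. VMware has more than 20,000 enterprise customers and more than four million users in total. VMware's server virtualization solutions are well proven and a much-discussed topic across the IT industry; virtualization has become a key tool for enterprises building new data centers and a reliable, stable enterprise-grade solution.

Misconception 9: Because virtualization introduces a new layer, it makes the data center harder to manage

Virtualization does add a layer to the data center, but that does not make management harder. Virtualization management software manages the virtual platform well while also simplifying the management of a sprawl of servers, which greatly reduces the management complexity of large data centers. VMware VirtualCenter is a good example: it provides an intuitive management interface and rich data for monitoring the consolidated virtualization center, gives the data center a powerful means of efficient management, and has become an essential tool for new virtualized data centers. The figure below shows VirtualCenter's management interface for virtual machines.

Misconception 10: Server virtualization sounds great, but migrating from the existing architecture to a virtual one is time-consuming, laborious, and potentially very risky

How to migrate to a virtualized platform is one of many users' concerns, because virtualization is an architectural decision. VMware has done a great deal of work to simplify migration from physical to virtual architectures. VMware Converter lets users move an existing physical server to the virtual platform by packaging it, without reinstalling the operating system or applications. This simplifies the process and reduces the overall migration risk, and many enterprise users are already benefiting from VMware Converter. The figure below shows the main screen of VMware Converter; users can download a free trial of VMware Converter from VMware's website to try out a migration.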

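1.3 Web article: Comparable systems and how they work

I recently studied and tried out several well-known virtualization management packages in broad terms: Eucalyptus, OpenNebula, OpenStack, OpenQRM, XenServer, Oracle VM, CloudStack, and ConVirt. This series of articles is an interim summary of what I learned over the past month.

Excerpted from http://zhumeng8337797.blog.163.com/blog/static/1007689142011112035330566/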

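(1) Comparison of licensing terms, license management, and pricing

Eucalyptus

  • Licensing terms: the community edition is under GPLv3; the enterprise edition uses a proprietary commercial license.
  • License management: the community edition needs no license installed; the enterprise edition needs a license installed on the cloud controller (CLC) node.
  • Business model: the community edition is free; the enterprise edition is charged by total number of processor cores, and a purchased license is valid permanently for a specific version.

OpenStack

  • Licensing terms: Apache 2.0.
  • License management: no license needed.
  • Business model: free to use.

OpenNebula

  • Licensing terms: Apache 2.0.
  • License management: no license needed.
  • Business model: the community edition is free. The enterprise edition repackages the community edition and gives access to patches and similar programs so that it is easier to install, configure, and manage; it is offered as a subscription and charged by number of physical servers, at 250 euros per physical server per year.

OpenQRM

  • Licensing terms: the community edition is under GPLv2; the enterprise edition uses a proprietary commercial license.
  • License management: no license needed.
  • Business model: the community edition is free. The enterprise edition repackages the community edition and gives access to patches and similar programs so that it is easier to install, configure, and manage; it is offered as a subscription. Basic, standard, and premium service cost 480, 960, and 1920 euros per month respectively.

XenServer

  • Licensing terms: all Citrix XenServer products use a proprietary commercial license; Xen Cloud Platform, which is based on XenServer, is under GPLv2.
  • License management: both XenServer and Xen Cloud Platform need a license installed on every server, renewed once a year.
  • Business model: the free edition of XenServer and the open-source Xen Cloud Platform are free to use. XenServer Advanced, Enterprise, and Platinum editions are charged per physical server at 1000, 2500, and 5000 US dollars respectively. A purchased license is valid permanently for a specific version.

Oracle VM

  • Licensing terms: Oracle VM Server is based on Xen and released under GPLv2; the source can be downloaded from Oracle's website, although Oracle does not advertise this. Oracle VM Manager uses a proprietary commercial license. Oracle VM VirtualBox binaries use a proprietary commercial license, while the source code is under GPLv2.
  • License management: no license needed.
  • Business model: free to use, with optional paid support at 8184 RMB per physical server per year.

CloudStack

  • Licensing terms: the community edition is under GPLv3; the enterprise edition uses a proprietary commercial license.
  • License management: the community edition needs no license installed; the enterprise edition needs a license installed on the management server.
  • Business model: the community edition is free; the enterprise edition offers extra features and technical support, with an unknown pricing model.

ConVirt

  • Licensing terms: the community edition is under GPLv2; the enterprise edition uses a proprietary commercial license.
  • License management: the community edition needs no license installed; the enterprise edition needs a license installed on the management server.
  • Business model: the community edition is free; the enterprise edition offers extra features and technical support and is charged per physical server at 1090 US dollars per node. A purchased license is valid permanently for a specific version.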

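(2) Comparison of project history and team, community size and activity, and communication

Eucalyptus

  • Project history and team: it began as an HPC research project at UCSB, and a company was founded in early 2009 to run it commercially. The current CEO is Marten Mickos, formerly CEO of MySQL; the current SVP of engineering, Tim Cramer, was executive director of the NetBeans and OpenSolaris projects at Sun. The management team has extensive experience in managing and running open-source projects.
  • Community size and activity: among comparable open-source projects, Eucalyptus has the largest and most active community. The main reason is that it grew out of a university research project; a secondary reason is the management team's strong commitment to open source. Ubuntu 10.04 Server chose Eucalyptus as the basis of UEC, which greatly helped its adoption.
  • Communication: questions posted to the community forum usually get a reply within 48 hours, and questions sent to the support email address usually get a reply within 24 hours. Eucalyptus has offices in Beijing and Shenzhen and engineers in China providing support.

OpenStack

  • Project history and team: OpenStack is an open-source project started jointly by the hosting company RackSpace and NASA. RackSpace and NASA clearly lack experience in managing and running open-source projects. Criticism of the project centers on (1) RackSpace's overly strong desire to control it, (2) its workings being largely opaque to community members, and (3) its being too aggressive towards comparable open-source projects.
  • Community size and activity: the community is small, and most participants are staff of the companies that support or take part in the project. There are a few public mailing lists with very little traffic. Because the project is fairly new, few installation and configuration articles are available online. Ubuntu 11.04 Server supports both Eucalyptus and OpenStack as the basis of UEC, which should help OpenStack's adoption.
  • Communication: technical questions on the mailing lists usually get a reply within 48 hours. Business enquiries by email got no reply.

OpenNebula

  • Project history and team: a research project started in 2005, with the first open-source release in early 2008 and a strong push to build an open-source community from early 2010.
  • Community size and activity: the community is small, and most participants are staff of the companies that support or take part in the project, plus a few users. There are a few public mailing lists with slightly more traffic than OpenStack's. The Chinese installation and configuration articles found online mostly repeat one another's mistakes and cannot be followed in practice. There are not many English articles either, and even fewer that work.
  • Communication: technical questions on the mailing lists usually get a reply within 48 hours.

OpenQRM

  • Project history and team: it originated as cluster management software, opened its source in 2006, and was released free of charge in 2008; the current version is 4.8. The team running the project is small and seems to consist of Matt Rechenburg alone.
  • Community size and activity: there are scattered users but essentially no community. Features are still being added, but the user documentation dates from 2008. The forums are less active than those of OpenStack and OpenNebula.
  • Communication: about 50% of questions posted to the forum get no reply. Business enquiries by email are answered quickly, within 24 hours.

XenServer

  • Project history and team: a Citrix product that has developed more or less in step with the Xen project.
  • Community size and activity: there are some open-source projects around Xen Cloud Platform that replace XenCenter with desktop or browser-based management.
  • Communication: initial business communication was fairly quick.

Oracle VM

  • Project history and team: an Oracle product with a small user base. Oracle VM is only one part of the Oracle user ecosystem and not a core business for Oracle.
  • Community size and activity: there are a certain number of users but no community. There is little discussion of Oracle VM online. The Oracle VM team has a blog, but its two most recent posts are dated November 2010 and January 2008. Product downloads are slow.
  • Communication: initial business communication was fairly quick. On the technical side, Oracle has no technical staff in China providing support.

CloudStack

  • Project history and team: it originated with VMOps, a company founded in 2008, which began using the cloud.com domain in May 2010 and took part in launching the OpenStack project in June 2010.
  • Community size and activity: there are few users and the forum is not very active. The official documentation is very complete, and following it at least gets you through installation and configuration. Some workable installation and configuration guides can be found online (helped by the fact that CloudStack is fairly simple to install and configure).
  • Communication: business communication is difficult; questions raised on the community forum and by email got no reply.

ConVirt

  • Project history and team: it originated with the XenMan project started in 2006 and has developed more or less in step with the Xen project. The current version is ConVirt 2.0. The current CEO and the EVP of engineering both came from Oracle.
  • Community size and activity: the user base is comparable to Eucalyptus, and the forum is very active. The official documentation is very complete, and following it at least gets you through installation and configuration. The Chinese and English installation guides found online are also mostly usable.
  • Communication: business communication is very smooth. Questions posted to the community forum usually get a reply within 48 hours, and questions sent to the support email address usually get a reply within 24 hours.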

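(3) Overall assessment

In general, virtualization management software does not yet have many users. Most projects have small communities that are not very active. Apart from Eucalyptus, which actively encourages community users to take part in development and testing, the projects treat open source mainly as a marketing strategy. Setting aside technical and price factors, the packages most worth choosing are clearly Eucalyptus and ConVirt. These two projects have the largest and most active user communities, and communication between their development and operations teams and potential customers is the smoothest. XenServer is also worth considering, but the XenServer community edition requires the license on every physical server to be renewed once a year. For a company with a large number of physical servers, managing and maintaining hundreds or thousands of licenses would be a headache.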

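Architecture:

(1) Comparison of system architectures

Eucalyptus

Eucalyptus is an IaaS system compatible with Amazon EC2. It consists of the cloud controller (CLC), Walrus, the cluster controller (CC), the storage controller (SC), and the node controller (NC). The CLC is the core of the whole Eucalyptus system and handles high-level resource scheduling, such as requesting compute resources from a CC. Walrus is a storage service similar to Amazon S3, used mainly to store virtual machine images and user data. The CC is the front end of a cluster; it coordinates the compute resources within the cluster and manages the cluster's network traffic. The SC is a block storage service similar to Amazon EBS and can be used to store business data. The NC is the actual compute node, which starts and stops virtual machines by calling the virtualization technology at the operating system level. All compute nodes (NC) within one cluster (CC) must be on the same subnet. A cluster (CC) usually needs one storage server (SC) to provide data storage for its compute nodes.

Eucalyptus manages compute resources through agents. Every compute node must run a eucalyptus-nc service. Once that service has registered with the cluster controller (CC), the cloud controller (CLC) can, through the cluster controller (CC), copy the virtual machine image (EMI) to be run onto that compute node and run it there.

Eucalyptus stores virtual machine images on Walrus. When a user starts a virtual machine instance, Eucalyptus first copies the corresponding image (EMI) from Walrus to the compute node (NC) that will run the instance. When a user shuts down an instance (or it restarts unexpectedly), changes made to the virtual machine are not written back to the original image (EMI) on Walrus, so all changes to that virtual machine are lost. To keep a modified virtual machine, the user has to use the euca2ools tools to save the instance as a new image (EMI). To keep data, the user has to use the elastic block devices provided by the storage server (SC).

OpenStack

OpenStack is an IaaS system compatible with Amazon EC2. It consists of two parts, OpenStack Compute and OpenStack Object Storage.

OpenStack Compute in turn contains several modules: a web front end, the compute service, the storage service, the identity service, the block device (volume) service, the network service, task scheduling, and others. The modules of OpenStack Compute share no state and communicate by passing messages. Different modules can therefore run on different servers or on the same server.

OpenStack Object Store can use commodity servers to build a scalable, very large data store and protects data through redundancy. Each piece of data has replicas on several servers; removing a failed server from the cluster does not affect data integrity, and when a new server is added the system automatically creates new replicas of the affected files on it. Functionally, OpenStack Object Store covers both the Walrus service and the elastic block device (SC) service of Eucalyptus. However, OpenStack Object Store is not a file system and cannot guarantee that data is current in real time. From this point of view it is better suited to static data that needs long-term storage, such as operating system images and multimedia data.

OpenStack manages compute resources through agents. Every compute node must run the nova-network and nova-compute services. Once these services have started, they can interact with the cloud controller through the message queue.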

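OpenNebula

The OpenNebula architecture has three parts: the driver layer, the core layer, and the tools layer. The driver layer deals directly with the operating system; it creates, starts, and shuts down virtual machines, allocates storage for them, and monitors the state of physical and virtual machines. The core layer manages virtual machines, storage devices, virtual networks, and so on. The tools layer provides user interaction through command-line and browser interfaces, and programmatic access through an API.

OpenNebula uses shared storage (for example NFS) to provide the virtual machine image service, so that every compute node can reach the same image resources. When a user needs to start or shut down a virtual machine, OpenNebula logs in to the compute node over SSH and runs the corresponding virtualization management commands directly on that node. This is also called agentless mode; because no extra software (or service) has to be installed on the compute nodes, the system is correspondingly less complex.

OpenQRM

OpenQRM is a virtualization management framework developed for managing mixed virtualization environments. It consists of a base layer (the framework) and plugins. The role of the base layer is to manage the different plugins, while the management of virtual resources (compute, storage, and image resources) is all implemented through plugins. The OpenQRM framework resembles an interface in the Java language: it defines a set of methods for managing the life cycle of virtual machine resources, such as creating, starting, and shutting down virtual machines. On top of this framework, OpenQRM implements different plugins for different virtualization platforms (Xen, KVM) to manage different physical and virtual resources. When a new kind of resource needs to be supported, writing a new OpenQRM plugin is enough to integrate it seamlessly into the existing environment.

OpenQRM plugins also work in agentless mode. When the target node to be managed offers SSH login, the OpenQRM plugin logs in to the compute node over SSH and runs the corresponding virtualization management commands directly on it. When the target node offers an HTTP/HTTPS/XML-RPC remote call interface, the OpenQRM plugin manages the target platform through that interface.

OpenQRM is a virtualization management platform and does not provide a cloud management interface compatible with Amazon EC2.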
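The comparison of the OpenQRM framework to a Java interface can be made concrete with a small sketch. Everything here is hypothetical: the interface, the two plugin classes, and the strings they return are illustrations of the framework-plus-plugins idea, not OpenQRM's actual API, and the plugins only describe the action they would take instead of running real commands.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PluginFrameworkSketch {

    /** Life-cycle contract that every virtualization plugin implements. */
    interface VirtualizationPlugin {
        String create(String vmName);
        String start(String vmName);
        String stop(String vmName);
    }

    /** Hypothetical Xen plugin; returns a description of the action only. */
    static final class XenPlugin implements VirtualizationPlugin {
        public String create(String vmName) { return "[xen] create " + vmName; }
        public String start(String vmName)  { return "[xen] start " + vmName; }
        public String stop(String vmName)   { return "[xen] stop " + vmName; }
    }

    /** Hypothetical KVM plugin; returns a description of the action only. */
    static final class KvmPlugin implements VirtualizationPlugin {
        public String create(String vmName) { return "[kvm] create " + vmName; }
        public String start(String vmName)  { return "[kvm] start " + vmName; }
        public String stop(String vmName)   { return "[kvm] stop " + vmName; }
    }

    /** The base layer knows only the interface, never a concrete platform. */
    static final class Framework {
        private final Map<String, VirtualizationPlugin> plugins = new LinkedHashMap<>();

        void register(String platform, VirtualizationPlugin plugin) {
            plugins.put(platform, plugin);
        }

        String start(String platform, String vmName) {
            VirtualizationPlugin plugin = plugins.get(platform);
            if (plugin == null) {
                throw new IllegalArgumentException("no plugin registered for " + platform);
            }
            return plugin.start(vmName);
        }
    }

    /** Builds a framework with both demo plugins registered. */
    static Framework demoFramework() {
        Framework framework = new Framework();
        framework.register("xen", new XenPlugin());
        framework.register("kvm", new KvmPlugin());
        return framework;
    }

    public static void main(String[] args) {
        System.out.println(demoFramework().start("kvm", "vm1"));
    }
}
```

Supporting a new platform then means writing one more class that implements the interface and registering it; the framework code does not change.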

XenServer

XenServer 是对Xen虚拟化技术的进一步封装,在Dom0上提供一系列命令行和远程调用接口,独立的管理软件XenCenter通过远程调用这些接口来管理多台物理 服务器。XenSever在标准Xen实现之上所实现的远程调用接口类似于其他虚拟化管理平台中所实现的Agent,因此XenServer是通过 Agent方式工作的。由于只考虑对Xen虚拟化技术的支持,XenServer的构架相对简单。

XenServer 是一个虚拟化管理平台,不提供与Amazon EC2兼容的云管理接口。管理软件XenCenter是运行在Windows操作系统上的,对于需要随时随地访问管理功能的系统管理员来说有点不便。目前 有一些第三方提供的开放源代码的基于浏览器的XenServer管理工具,但是都还处于比较早期的阶段。

Oracle VM

Oracle VM包括Oracle VM Server和Oracle VM Manager两个部分。Oracle VM Server在支持Xen的Oracle Linux上(Dom0)运行一个与Xen交互的Agent,该Agent为Oracle VM Manager提供了远程调用接口。Oracle VM Manager通过一个Java应用程序来对多台Oracle VM Server上的虚拟资源进行管理和调度,同时提供基于浏览器的管理界面。由于只考虑对Xen虚拟化技术的支持,Oracle VM Server / Manager的构架相对简单。

Oracle VM是一个虚拟化管理平台,不提供与Amazon EC2兼容的云管理接口。

值 得注意的是,Oracle VM Manager还通过Web Service的方式提供了虚拟机软件生命周期管理的所有接口,使得用户可以自己使用不同的编程语言来调用这些接口来开发自己的虚拟化管理平台。不过由于 Oracle在开放源代码方面的负面形象,似乎没有看到有这方面的尝试。

CloudStack

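Like OpenQRM, CloudStack uses a "framework + plugins" architecture and supports different virtualization technologies through different plugins. For standard Xen / KVM compute nodes, CloudStack requires an agent on the compute node to interact with the control node; for XenServer / VMWare compute nodes, CloudStack talks to the compute node through the XML-RPC remote call interface that XenServer / VMWare provides.

CloudStack itself is a virtualization management platform, but through CloudBridge it offers a cloud management interface compatible with Amazon EC2 and can provide IaaS to the outside.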

ConVirt

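ConVirt is a virtualization management platform that works in agentless mode. When the target node offers SSH login, ConVirt logs in to the compute node over SSH and runs the corresponding virtualization management command there. When the target node offers an HTTP/HTTPS/XML-RPC remote call interface, the ConVirt plugin manages the target platform through that interface.

ConVirt is a virtualization management platform and does not provide a cloud management interface compatible with Amazon EC2. However, ConVirt 3.0 provides a user interface to Amazon EC2 / Eucalyptus, so ConVirt users can manage the virtual compute resources offered by Amazon EC2 / Eucalyptus from the same web management interface.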

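(2) Cloud management platform or virtualization management platform?

At the IaaS level, the concepts of cloud management and virtualization management are very close, but there are some subtle differences.

Virtualization is the ability to provide multiple virtual machines (including CPU, memory, storage, network, and other compute resources) on one physical machine. Each virtual machine can run a complete operating system and execute normal applications just like an ordinary physical machine. When only a few physical machines are involved, virtual machine life cycle management (resource configuration, start, shutdown, and so on) can be done by hand. When many physical machines are involved, scripts or programs are needed to automate it. A system whose purpose is to manage and schedule a large number of physical and virtual compute resources is a virtualization management system. Such a system is usually used to manage an organization's internal compute resources.

Cloud computing is the practice of accessing physical or virtual computers over a network and using their compute resources. Generally, a cloud provider offers compute resources to users as virtual machines. Users need not know the physical resources behind a virtual machine, only the quota of compute resources available to them. Virtualization is therefore the foundation of cloud computing, and every cloud management platform is built on top of a virtualization management platform. If a virtualization management platform serves only one organization internally, it can also be called a "private cloud"; if it serves the public, it can also be called a "public cloud". Different audiences place different requirements on the architecture and features of the platform.

A private cloud serves the departments (or applications) inside an organization and emphasizes flexible scheduling of virtual resources. System administrators need to tailor virtual machines to each department (or application) and adjust the compute resources assigned to particular virtual machines according to demand. In this sense, OpenQRM, XenServer, Oracle VM, CloudStack, and ConVirt are better suited to private cloud services.

A public cloud serves the public and emphasizes standardized virtual resources. By cutting compute resources into standardized virtual machine configurations (several product lines, each product with a fixed amount of CPU, memory, disk space, and network traffic quota), a public cloud provider can sell compute resources at standard prices under a standard service contract (Service Level Agreement, SLA). When a user's demand changes, the user simply reduces or increases the number of products in use. Because Amazon EC2 is currently a relatively successful public cloud provider, most cloud management platforms imitate the Amazon EC2 architecture to some degree. In this sense, Eucalyptus, OpenNebula, and OpenStack, which provide interfaces compatible with or similar to Amazon EC2, are better suited to public cloud services.

The boundary between public and private clouds, like the notions of "internal/external" and "department/partner", is not very sharp. It may be interpreted differently depending on project requirements.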

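Features:

(1) Supported virtualization technologies

Technologies compared (table columns): Xen, KVM, XenServer / XCP, VMWare, LXC, openVZ

Number of technologies marked "Y" for each product (the column positions of the individual marks are not preserved here):

Product      Technologies marked Y
Eucalyptus   3
OpenStack    5
OpenNebula   3
OpenQRM      6
XenServer    1
Oracle VM    1
CloudStack   3
ConVirt      2

As the table shows, Xen and KVM are currently the most widely supported virtualization technologies, followed by VMWare. Note that XenServer is a further layer of packaging around Xen and can be regarded as a new virtualization platform in its own right (users cannot run Xend commands directly on XenServer).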

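(2) System installation and configuration

For each product, the notes below cover the front end, the compute node, and remarks.

Eucalyptus

Front end: Use Ubuntu 10.04 or CentOS 5.5 and install the binary packages directly with apt-get install or yum install to build a front end that contains CLC, Walrus, SC, and CC. Following the documentation on the official website, this is fairly easy.

Compute node: Use Ubuntu 10.04 or CentOS 5.5 and install the binary packages directly with apt-get install or yum install to build a compute node that provides the NC service. Following the documentation on the official website, this is fairly easy.

Remarks: Eucalyptus includes a dhcpd, which can cause trouble if it is configured badly. In addition, the compute node (NC) and the cluster controller (CC) must be in the same class C subnet (for example, netmask 255.255.255.0). If NC and CC sit in a supernet (for example, netmask 255.255.0.0), problems appear when registering the services.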
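A quick way to check the subnet condition before registering an NC is to compare the two addresses under the netmask in use. The sketch below relies only on Python's standard `ipaddress` module; the IP addresses are made-up examples.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True if ip_a and ip_b fall in the same network under netmask."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Hypothetical CC and NC addresses.
cc_ip = "192.168.1.10"
nc_ok = "192.168.1.20"
nc_far = "192.168.2.20"

print(same_subnet(cc_ip, nc_ok, "255.255.255.0"))
print(same_subnet(cc_ip, nc_far, "255.255.255.0"))
```

With a 255.255.0.0 mask the second pair would also count as one network, which is exactly the supernet layout the remark above warns about.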

OpenStack

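Front end: Installed on Ubuntu 10.04 with the nova-install script from the official website; essentially no problems were encountered.

Compute node: Installed on Ubuntu 10.04 with the nova-install script from the official website; essentially no problems were encountered.

Remarks: For a simple system, installation and configuration are fairly easy.

OpenNebula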

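Front end: Use CentOS 5.5, configure the CentOS Karan repository, and enable the kbs-CentOS-Testing entry. Download the matching rpm packages and run yum localinstall --nogpgcheck opennebula/*.rpm to complete the installation. Following the official documentation, create the /srv/cloud/one and /srv/cloud/images directories and share /srv/cloud over NFS. Create the cloud user group and a oneadmin user that belongs to it.

Compute node: Following the official documentation, create the /srv/cloud/one and /srv/cloud/images directories and share /srv/cloud over NFS. Create the cloud user group and a oneadmin user that belongs to it.

Copy the ssh key of the oneadmin user on the front end server into the authorized_keys of the oneadmin user on the compute node. Only then can the front end log in to the compute node over SSH.

Remarks: When installing on CentOS 5.5 x86_64 according to the documentation on the official website (first setting up the required software dependencies and then installing opennebula), an error about a wrong xmlrpc-c package version appears.

Some installation and configuration documents and tutorials can be found online, but a developer who knows Linux and not OpenNebula will find it hard to complete installation and configuration by following them.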

OpenQRM

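Front end: On Ubuntu 10.04, check out the OpenQRM source code from SVN, enter the source directory, and run make, make install, and make start in turn. Create the database as described in the official documentation, then continue installation and configuration through the web interface.

Compute node: Apart from setting up a network bridge and virtualization support, the compute node needs no special installation or configuration. Enabling the corresponding plugin in the OpenQRM management interface is enough to manage the compute node through that plugin.

Remarks: When installing the front end on Ubuntu 10.04, dhcp3-server may have to be installed by hand.

The workflow for enabling plugins to manage virtual resources is not intuitive and lacks detailed documentation.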

XenServer

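Front end: The front end is XenCenter, which runs on Windows. It can be installed on Windows XP and requires .NET Framework Update 2. Installation is very simple and needs essentially no configuration.

Compute node: Download the ISO from the Citrix website, burn it to disc, and install directly on bare metal. Once the compute node is installed, add the new compute resource to the resource pool in XenCenter.

Remarks: Every XenServer server needs a license obtained from Citrix, and the license must be renewed once a year.

Oracle VM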

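Front end: Installed on CentOS 5.5 x86_64. Mount the ISO file and run runinstaller.sh.

Compute node: Download the ISO from the Oracle website, burn it to disc, and install directly on bare metal. Once the compute node is installed, add the new compute resource to the resource pool in Oracle VM Manager.

Remarks: It is best to download from Oracle's official website, although the download is slow. Files obtained through Xunlei (Thunder) or similar channels look fine, but after the ISO is burned to disc, errors occur when the operating system installer starts.

If Oracle VM Manager is installed on an Oracle VM Server, make the / partition larger when partitioning; otherwise Oracle VM Manager cannot be installed for lack of disk space.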

CloudStack

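Front end: On CentOS 5.5 and Ubuntu 10.04, following the installation documentation on the official website step by step works essentially without problems.

Compute node: The corresponding agent must be installed on the compute node.

Remarks: Installation and configuration are relatively simple, but there are quite a few problems when deleting physical resources.

ConVirt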

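Front end: On CentOS 5.5 and Ubuntu 10.04, following the installation documentation on the official website step by step works essentially without problems.

Installing the enterprise edition on Ubuntu 10.04 requires running sudo apt-get install libmysqlclient-dev by hand.

Compute node: The root user on the compute node must allow the user that runs the ConVirt service on the management node to log in with key authentication.

Remarks: Installation and configuration are relatively simple.

Different virtualization management products follow different design philosophies, use different system architectures, describe similar concepts with different terms, and have different learning curves. For most users, virtualization management software is still something new. Even a rough attempt to install, configure, and test a minimal private cloud with several different products takes considerable time and effort, and running into all kinds of problems along the way is unavoidable. Still, only by experiencing these problems first-hand can one appreciate the strengths and weaknesses of each product and, after analysis and summary, form one's own view.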

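(3) User interface

For each product, the notes below cover an overview, user permissions, and resource pool and virtual machine management.

Eucalyptus

Eucalyptus provides a simple browser-based user interface for user registration, downloading credentials, simple configuration of the offered product types, and so on. Resource pool and virtual machine life cycle management must be done on the command line with euca2ools.

euca2ools is a set of command-line tools that can interact with Web Services compatible with Amazon EC2/S3. The tools can manage cloud services based on Amazon EC2, Eucalyptus, OpenStack, and OpenNebula.

The main functions of euca2ools include:

  • Querying the available zones
  • Managing SSH keys
  • Virtual machine life cycle management
  • Security group management
  • Managing volumes and snapshots
  • Managing virtual machine images
  • Managing IP addresses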

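The Eucalyptus community edition has only two types of user: administrator and ordinary user. The enterprise edition adds user groups; a user who belongs to a group can manage the compute resources that belong to that group.

The administrator can register or deregister compute nodes and configure the compute resources (CPU, memory, storage) of the standard product types. Ordinary users can only create, start, and shut down virtual machines based on the standard configurations; they cannot customize the compute resources they need.

Creating virtual machine image files (EMI) and managing the virtual machine life cycle must be done on the command line with euca2ools. In the FireFox browser, the ElasticFox plugin can be used to start, monitor, and shut down virtual machines. The ElasticFox interface is not attractive and offers very limited functionality.

Eucalyptus does not provide a console feature. Users can connect over SSH to the virtual machines they manage.

Every published virtual machine image (EMI) is a template. When a user creates a virtual machine instance, the system copies the image for the chosen EMI to the target compute node and runs it there. Eucalyptus decides automatically, according to some algorithm, which physical server will run the user's virtual machine; the user knows nothing about the state of the physical servers.

A virtual machine instance in Eucalyptus is only a copy of the original image (EMI); changes made inside a running instance are not saved back to the original image. If the user shuts the instance down (for example with shutdown), every change made to the virtual machine is lost. To keep changes, the user can store data on an elastic block device or publish the running instance as a new EMI. (Amazon EC2 automatically saves a stopped instance as a new AMI until the user destroys the instance. The user can therefore shut down an instance and still keep the changes made to it until choosing to destroy it.)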

OpenStack

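OpenStack does not provide a browser-based user interface by default. The system administrator has to create users by hand, and most management operations are carried out on the command line. Although the architectures of OpenStack and Eucalyptus differ greatly, the interfaces they expose to users are similar (both imitate the Amazon EC2 user interface specification). OpenStack can therefore also be managed with the euca2ools provided by Eucalyptus.

OpenStack's openstack-dashboard and django-nova projects provide a browser-based user interface. It is not integrated into the OpenStack installation script and must be installed separately.

OpenStack divides users into the following categories:

admin - cloud service administrator, with all management permissions.

itsec - IT security administrator, with permission to quarantine problematic virtual machine instances.

projectmanager - project administrator, who can add new users to the project, manage virtual machine images, and manage the virtual machine life cycle.

netadmin - network administrator, responsible for IP allocation and firewall management.

developer - developer, who can log in to the virtual machines of the project and manage the virtual machine life cycle.

Among the cloud platforms that imitate Amazon EC2 (Eucalyptus, OpenStack, OpenNebula), OpenStack offers the most fine-grained user permission model.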
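The role list above can be read as a small permission table. The sketch below is only a model of that description: the role names come from the text, while the permission identifiers and the `is_allowed` helper are hypothetical and do not correspond to any OpenStack API.

```python
# Role names follow the text; permission names are made up for illustration.
ROLE_PERMISSIONS = {
    "admin": {"all"},
    "itsec": {"quarantine_instance"},
    "projectmanager": {"add_project_user", "manage_images", "manage_vm_lifecycle"},
    "netadmin": {"allocate_ip", "manage_firewall"},
    "developer": {"login_project_vm", "manage_vm_lifecycle"},
}


def is_allowed(role, action):
    """Return True if the role may perform the action."""
    permissions = ROLE_PERMISSIONS.get(role, set())
    return "all" in permissions or action in permissions


print(is_allowed("developer", "manage_vm_lifecycle"))
print(is_allowed("developer", "manage_firewall"))
```

Five distinct roles, each with a narrow set of actions, is what makes this model finer-grained than the two-level administrator/user split described for Eucalyptus and OpenNebula.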

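As with Eucalyptus, creating virtual machine image files (EMI) and managing the virtual machine life cycle must be done on the command line with euca2ools. Likewise, in the FireFox browser the ElasticFox plugin can be used to start, monitor, and shut down virtual machines.

OpenStack does not provide a virtual machine console feature. Users can connect over SSH to the virtual machines they manage.

openstack-dashboard, which is still under development, provides fairly complete browser-based resource pool management and virtual machine life cycle management. The interface is still simple, but it is already usable.

OpenStack's template and instance mechanism is similar to that of Eucalyptus. As in Eucalyptus, OpenStack decides automatically, according to some algorithm, which physical server will run the user's virtual machine; the user knows nothing about the state of the physical servers.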

OpenNebula

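OpenNebula does not provide a browser-based user interface by default. The system administrator has to create users by hand, and most management operations are carried out on the command line.

OpenNebula currently has two browser-based user interfaces, SunStone and OneMC. Both projects must be installed separately.

OpenNebula likewise provides a Web Service interface compatible with Amazon EC2, so the ElasticFox plugin for FireFox and the euca2ools tool set from Eucalyptus can be used to interact with an OpenNebula cloud.

OpenNebula has only two types of user: administrator and ordinary user.

In early versions, the OpenNebula administrator could manage resource pools and the virtual machine life cycle from the command line on the back end. Likewise, in the FireFox browser the ElasticFox plugin can be used to start, monitor, and shut down virtual machines.

SunStone and OneMC both provide fairly complete resource pool management and virtual machine life cycle management. Both interfaces are simple but basically usable. SunStone does not provide a virtual machine console; OneMC provides one over the VNC protocol.

OpenNebula's template and instance mechanism is similar to that of Eucalyptus, but it does not use euca2ools as its default tool.

As in Eucalyptus, OpenNebula decides automatically, according to some algorithm, which physical server will run the user's virtual machine; the user knows nothing about the state of the physical servers.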

OpenQRM

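Browser-based user interface with a fairly rich feature set.

The OpenQRM management interface has only two kinds of user: administrative user and ordinary user. Ordinary users have view permission only and no management permission.

Different compute resources can be managed by enabling different plugins. All resource pool and virtual machine life cycle operations can be carried out in the browser interface.

OpenQRM's novnc plugin provides a virtual machine console based on the VNC protocol.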

XenServer

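XenCenter is a Windows desktop application. It is very easy to install and operate, attractive, and powerful.

Of the 8 products evaluated, XenCenter has the best user interface. A Windows desktop application responds quickly to the user's clicks, which improves the user experience.

After logging in to XenCenter, the system administrator can assign management permissions at the user and user-group level in combination with Active Directory.

Authorized users can conveniently manage resource pools and the virtual machine life cycle through the graphical interface, which also shows the compute resource usage (CPU, memory, storage, network activity) of physical servers and virtual machines at a glance.

A VNC-based virtual machine console is provided.

New virtual machines can be deployed from templates.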

Oracle VM

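Oracle VM Manager provides a browser-based management interface.

Oracle VM Manager offers both roles and groups. A role defines the permissions a user has; users in the same group hold the permissions granted to that group.

Oracle VM Manager provides three roles:

user - has virtual machine life cycle management permission for the designated resource pools.

manager - has all management permissions except user management.

administrator - has management permission over the whole system.

Authorized users can conveniently manage resource pools and the virtual machine life cycle through the graphical interface, which also shows the compute resource usage (CPU, memory, storage, network activity) of physical servers and virtual machines at a glance.

A VNC-based virtual machine console is provided.

New virtual machines can be deployed from templates.

CloudStack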

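Browser-based user interface, feature-rich and attractive.

CloudStack divides users into three types according to their role:

admin - global administrator.

domain-admin - domain administrator, who can manage the physical and virtual resources under a given domain.

user - individual user, who can manage the virtual machine resources under their own name.

CloudStack's management of physical resources fully mirrors a real machine room. It organizes physical servers as "Zones (machine rooms) -> Pods (racks) -> Cluster -> Server", so that administrators can map the compute resources in the management interface one-to-one onto the compute resources in the machine room.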
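The Zone -> Pod -> Cluster -> Server hierarchy maps naturally onto nested records. The sketch below is a plain data model for illustration; the class names, field names, and sample names are assumptions and not CloudStack's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    name: str


@dataclass
class Cluster:
    name: str
    servers: List[Server] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)


def server_paths(zone):
    """List every server as zone/pod/cluster/server, like a machine-room map."""
    return [
        f"{zone.name}/{pod.name}/{cluster.name}/{server.name}"
        for pod in zone.pods
        for cluster in pod.clusters
        for server in cluster.servers
    ]


# Hypothetical machine room with one rack and one cluster of two servers.
zone = Zone("zone1", [Pod("pod1", [Cluster("cluster1", [Server("host1"), Server("host2")])])])
print(server_paths(zone))
```

Each path reads like a physical location, which is the one-to-one correspondence between the interface and the machine room that the paragraph above describes.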

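Authorized users can conveniently manage resource pools and the virtual machine life cycle through the graphical interface, which also shows the compute resource usage (CPU, memory, storage, network activity) of physical servers and virtual machines at a glance.

A VNC-based virtual machine console is provided.

New virtual machines can be deployed from templates.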

ConVirt

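Browser-based user interface, feature-rich and attractive.

The community edition allows multiple registered users and lets them be organized into user groups, but all users hold the same global management permissions. The enterprise edition offers a finer-grained permission mechanism and, in addition, LDAP support.

Authorized users can conveniently manage resource pools and the virtual machine life cycle through the graphical interface, which also shows the compute resource usage (CPU, memory, storage, network activity) of physical servers and virtual machines at a glance. A VNC-based virtual machine console is provided.

New virtual machines can be deployed from templates.

ConVirt's greatest strength is that it uses time-series charts to show, at several levels, the utilization and health of compute resources (both physical and virtual). At the data center and resource pool level, ConVirt shows in real time the number of resource pools, the number of physical servers and virtual machines, virtual machine density, storage usage, and the N most heavily loaded physical servers and virtual machines. At the physical server and virtual machine level, ConVirt shows CPU and memory usage in real time, and operators can use the CPU and memory charts to spot or investigate anomalies promptly.

Of all the virtualization management products evaluated, XenServer / XCP and ConVirt have the best graphical user interfaces. The strength of XenCenter's interface is its unique user experience; the strength of ConVirt's interface is that it shows the health of everything from the machine room down to the virtual machine graphically. CloudStack's interface looks impressive but is less practical than ConVirt's. Judging by CloudStack's current momentum, however, the next version may be worth looking forward to.

Because the evaluation period was short and the test systems were small, the stability, robustness, scalability, and other key properties of each product cannot be assessed for now.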

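Business:

There are many virtualization management products on the market, and the ones mentioned in this series of articles are only a few representatives. Any institution or company moving to virtualization inevitably faces the question of which software to select. As the last article in the series, this one offers some modest opinions from both the business and the functional side.

(1) Business evaluation

In selecting software from the business side, value for money is usually the decisive factor. Assuming that all candidates meet the technical requirements, a company (institution) needs to consider whether the license agreement is friendly, how hard license management is, how much the software and services cost, the reputation of the team behind the product, the size and activity of the developer and user communities, and how easy business and technical communication is.

License agreement / license management: fully open source scores 10; partly open source (for example, advanced features offered only in an enterprise edition, or special installation packages and patches offered as a service) loses 1 point. A commercial version that needs a license on the control node loses nothing; needing a license on every compute node loses 1 point; a license that must be renewed every year loses 1 point.

Price index: all features free of charge scores 10; for software that offers its full feature set as an enterprise edition, every 500 US dollars spent per physical server loses 1 point.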
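The price index rule can be written down as a one-line calculation. The sketch below makes two assumptions that the text does not state: a partial 500-dollar step is not penalized, and the score never drops below 0. The function name is hypothetical.

```python
def price_index(cost_per_server_usd):
    """Score 10 for free software, minus 1 point per full 500 USD per server."""
    if cost_per_server_usd < 0:
        raise ValueError("cost must not be negative")
    # Assumption: only complete 500 USD steps count, and the floor is 0.
    deduction = int(cost_per_server_usd // 500)
    return max(0, 10 - deduction)


print(price_index(0))
print(price_index(1000))
```

Under this reading, a product that costs 1000 US dollars per physical server would score 8 on the price index.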

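Team: scored on the size, background, and influence of the team behind the product; fairly subjective.

Community: scored on the size and activity of the developer and user communities; fairly subjective.

Communication: scored on how smoothly an individual can communicate with the team, the developer community, and the user community; fairly subjective.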

Product      License agreement / license management   Price index   Team   Community   Communication   Total
Eucalyptus   9                                        8             9      9           10              45
OpenStack    10                                       10            8      8           7               43
OpenNebula   9                                        9             7      8           9               42
OpenQRM      9                                        8             6      7           8               37
XenServer    7                                        8             9      10          9               43
Oracle VM    9                                        7             7      6           7               36
CloudStack   9                                        8             7      6           7               37
ConVirt      9                                        8             8      9           10              44

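(2) Functional evaluation

In selecting virtualization management software from the functional side, the factors to consider include the virtualization technologies the software supports, how hard it is to install and configure, how thorough the development and user documentation is, whether the features are complete and the user interface intuitive and friendly, how hard secondary development is, and whether monitoring reports for physical and virtual resources are provided.

Virtualization technology support: supporting only one virtualization technology scores 6; each additional technology adds 1 point, capped at 10.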
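The scoring rule for virtualization technology support is simple enough to check by hand. The sketch below applies it to the number of "Y" marks each product has in the earlier table of supported technologies; the function name is hypothetical.

```python
def virtualization_support_score(supported_count):
    """6 points for one technology, +1 per additional technology, capped at 10."""
    if supported_count < 1:
        raise ValueError("at least one supported technology is required")
    return min(10, 6 + (supported_count - 1))


# Number of supported technologies per product, taken from the table above.
supported = {
    "Eucalyptus": 3, "OpenStack": 5, "OpenNebula": 3, "OpenQRM": 6,
    "XenServer": 1, "Oracle VM": 1, "CloudStack": 3, "ConVirt": 2,
}
scores = {name: virtualization_support_score(n) for name, n in supported.items()}
print(scores)
```

OpenQRM, with six supported technologies, would reach 11 without the cap, which is why the rule stops at 10.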

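Installation and configuration: scored on how hard it is to install and configure the product by following the official documentation; fairly subjective.

Development/user documentation: scored on how thorough the official development and user documentation is; the more thorough, the higher the score.

Features and interface: a combined score covering how easy it is to manage physical and virtual resources, manage the virtual machine life cycle, and access virtual machine and storage resources, how attractive and usable the interface is, and the overall user experience.

Secondary development: base score 6; a programmatic interface compatible with Amazon EC2 adds 3 points; a secondary development interface that is not compatible with Amazon EC2 adds 2 points.

Monitoring reports: base score 6, with extra points according to how detailed the monitoring and analysis features are.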

Product      Virtualization support   Installation   Documentation   Features and interface   Secondary development   Monitoring reports   Total
Eucalyptus   8                        8              9               4                        9 (Amazon WS)           6                    44
OpenStack    10                       8              8               4                        9 (Amazon WS)           6                    45
OpenNebula   8                        8              7               4                        9 (Amazon WS)           6                    42
OpenQRM      10                       9              5               10                       6 (OS)                  7                    47
XenServer    6                        10             10              10                       8 (Plugin)              9                    53
Oracle VM    6                        9              8               7                        8 (WS)                  7                    45
CloudStack   8                        9              8               10                       6 (OS)                  8                    49
ConVirt      7                        10             10              10                       8 (API)                 10                   55

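(3) Overall evaluation

From the business perspective, Eucalyptus and ConVirt lead the other options by a narrow margin. Eucalyptus is a pioneer among private cloud management platforms. Ubuntu 10.04 chose to bundle Eucalyptus as the infrastructure of UEC, which gives Eucalyptus more users and a more active community than other private cloud management platforms. In addition, Eucalyptus has sales and technical support staff in China, which makes communication easier than with the other products. ConVirt ranks second, fundamentally because its sales and technical support team communicates actively and effectively with (potential) customers. Citrix XenServer only ties for third place with two other options, held back by its overly strict license management policy. Indeed, installing a license separately on more than 100 servers and renewing it every year is no fun.

From the functional perspective, ConVirt and XenServer are far ahead of the other options. Although ConVirt supports only two virtualization technologies, Xen and KVM, it is relatively simple to install and configure, thoroughly documented, full-featured, and attractive, which makes it easy to get started with. More importantly, ConVirt's monitoring reports show CPU and memory utilization from the data center down to the virtual machine, so the health of the whole data center is clear at a glance. Likewise, although XenServer supports only Xen, it is no worse than ConVirt in installation and configuration, operating documentation, and user interface. For users with no strong objection to a Windows-based interface, XenServer is well worth considering.

Taking all of the above together, for a company (institution) that wants to use virtualization management software to raise hardware utilization and automate virtualization management, ConVirt is the recommended choice for managing its compute resources. If the network administrators do not want to learn Linux in depth and the number of physical servers is limited, XenServer is also a good choice. ConVirt's browser interface is open source, so users can customize it and add other functions they need to the same interface. XenCenter offers a plugin mechanism through which users can integrate their own functions into XenCenter.

But does your infrastructure need to be compatible with Amazon EC2? In other words, do your users need to reach your compute resources with the same scripts and tools they use to access and operate Amazon EC2? If so, you may have to choose between Eucalyptus and OpenStack (CloudStack and OpenNebula also provide an operating interface compatible with Amazon EC2, but CloudStack scores low on the business side and OpenNebula scores low on the functional side). Eucalyptus has a slightly longer history than OpenStack, a larger user base, and a more active community. On the other hand, OpenStack's backer NASA has far deeper pockets than Eucalyptus, and Ubuntu 11.04 also integrates OpenStack as one of the infrastructures of its UEC, which shows that OpenStack has won the community's attention and support. All in all, open-source cloud architecture is still something new and evolving. The author can only advise users to install and try each product themselves and, based on their own experience and needs, arrive at the choice that suits them best.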

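Comparison of virtualization management software -- slides

Building on the recent evaluation of different virtualization management products, I prepared a set of slides for a talk. The PDF version can be downloaded here. Anyone who needs the ODP version can contact me directly.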

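1.4 Relevance to our company

We recommend using ConVirt rather than OpenStack

when it is combined with cloud terminals such as sunde (which let one real machine provide and allocate a runtime environment for multiple virtual machines, with end users connecting to their own virtual machines through sunde's non-PC terminals).

On top of a cluster of real computers managed by OpenStack, virtual parent machines can be set up by business type (development, testing, office, demonstration, call center, training). The sunde management side then runs on each virtual parent machine to manage the end users' virtual child machines, which form their own virtual subnets and the company's shared network. Finally, end users connect with non-PC terminals to the one or more virtual machines they use and do their work.

Environments outside mobile technology:

What is scarcer than the personal workstations used by developers, testers, and sales staff is servers of all kinds, for example:

Operations: a complete simulated test environment for production, backups of the production environment, file sharing servers for the working environment, a test environment for the company website, and so on.

Development: VS, PD, and DB design servers, complete simulated environments for development testing, an architecture test lab, and so on.

Testing: QC servers, servers for testing multiple browser versions, complete simulated environments for testing individual projects, and so on.

Call Center: an office environment with simple applications that many people need to work in.

It can also serve as an environment management node for managing multiple virtual instances (one machine, one environment), and even for production backups.

However, because it cannot schedule, communicate, or pool resources across physical machines, and because virtual machines are inefficient, it cannot be used for:

Business nodes that control distributed storage of large data volumes, storage control nodes of a distributed data center, and other environments that need to pool physical machine resources for distributed parallel processing.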

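2 Hadoop

2.1 Introduction

A platform for distributed data storage and business system architecture, developed by the Apache Foundation. Users can write distributed programs without understanding the low-level details of distribution, and make full use of the cluster for high-speed computation and storage.

On a network cluster made up of many computers (PCs or minicomputers), it allocates the resources of the individual machines across physical machines in a unified way, to serve many kinds of business workloads that can be processed in a distributed, parallel manner.

2.2 Principles

From bottom to top, its architecture is HDFS -> MapReduce + BigTable (NoSQL DB).

In HDFS:

The structure is a star, with a NameNode at the core and DataNodes around it. The NameNode is the bottleneck. When the NameNode outgrows a single machine, the only options are to replicate an entire additional cluster for distribution, or to split the single-machine NameNode by tiering its keys and applying the Hadoop approach to it as well.

Effective capacity ratio:

Suppose the number of machines allowed to fail is x and the total number of machines is y.

Then the effective data volume of y machines is y/(x+1), and the effective capacity ratio is (y/(x+1))/y = 1/(x+1).

Robustness ratio:

Suppose the number of machines allowed to fail is x and the total number of machines is y.

Then the robustness ratio of y machines (how few machines are needed to stay stable) is 1-(y-x)/y = x/y.

Balancing the effective capacity ratio against the robustness ratio gives 1/(x+1) = x/y, so y = x(x+1).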
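The three formulas above can be checked with exact fractions. The sketch below simply restates them in code; the function names are hypothetical, and x = 2 is an arbitrary example value.

```python
from fractions import Fraction


def effective_capacity_ratio(x):
    """Share of raw capacity holding unique data when x failures are tolerated."""
    return Fraction(1, x + 1)


def robustness_ratio(x, y):
    """Share of the y machines that may fail: 1 - (y - x)/y = x/y."""
    return Fraction(x, y)


def balanced_cluster_size(x):
    """Cluster size y at which both ratios are equal: y = x(x + 1)."""
    return x * (x + 1)


x = 2  # example: tolerate two failed machines
y = balanced_cluster_size(x)
print(y, effective_capacity_ratio(x), robustness_ratio(x, y))
```

For x = 2 the balanced cluster has 6 machines, and both ratios come out as 1/3, which matches solving 1/(x+1) = x/y by hand.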

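The MapReduce algorithm:

A typical scatter-then-gather distributed processing algorithm.

It is built for business workloads whose data can be processed in separate pieces and then aggregated.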
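A word count is the usual small example of "process in pieces, then aggregate". The sketch below runs all phases in one process and does not use Hadoop; the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    pairs = []
    for document in documents:  # each document could be mapped on a different node
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


counts = word_count(["a b a", "b c"])
print(counts)
```

Because each document is mapped independently, the map calls can run on different machines, and only the grouped pairs need to be brought together for the reduce step.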

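2.3 Relevance to our company

Its distributed, parallel MapReduce algorithm can be used for many non-transactional business processes that do not need immediate feedback. It also does not depend on HDFS as the only implementation; any storage that supports computation on the nodes will do.

Its HDFS system can be used to store large numbers of data files. However, regardless of the number of nodes, the effective capacity ratio is determined by the number of machine failures that must be tolerated. HDFS therefore leans toward stability rather than efficient parallel processing and load handling. It is aimed more at search-type workloads such as index building than at commerce-type workloads with few writes and many reads, and the experience of Baidu and Taobao confirms this.

So on top of an HDFS-style design, a central NoSQL cache extension and linear NoSQL cache load balancing for each data node should be added according to actual conditions. Only together with the various optimizations of the business servers themselves can this become the company's design framework for distributed DB and FS load handling. The design ideas of Hadoop can be borrowed but not copied wholesale.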