OpenStack Source Code Study Notes 1

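OpenStack is a fairly mature open-source IaaS solution. It has shipped 19 releases; the current stable release is Stein, and the next one, Train, is expected in October. Its code base is already somewhat complex for a beginner, but the core components are the following: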

  1. Nova: virtual machines.
  2. Glance: images.
  3. Cinder: storage.
  4. Neutron: networking.
  5. Keystone: authentication and service registration.

The overall architecture is shown in the figure below:

[Figure: all.png]

Prerequisites

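Before diving into the OpenStack source, some background knowledge is needed. It does not have to be deep, but you should know what each piece is for.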

Virtualization

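Virtualization is a fairly complex subject in its own right, spanning hardware, operating systems and software. At a minimum you should know these four terms: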

  1. KVM
  2. QEMU
  3. QEMU-KVM
  4. Libvirt

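KVM virtualizes the CPU and memory, QEMU virtualizes I/O, and QEMU-KVM combines the two. Libvirt provides a more convenient wrapper on top: virtual machines are managed through the libvirtd service and the virt-* and virsh commands. OpenStack wraps this one level further and manages different kinds of virtual machines through driver plugins.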

[Figure: qemu-kvm-libvert]

WSGI

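OpenStack's philosophy is that components communicate through APIs and message queues. Since ports have to be exposed to the outside, WSGI is naturally involved. In short, WSGI decouples the web server from the web application, so developers can focus on implementing features rather than on handling network protocols.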
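To make the server/application split concrete, here is a minimal sketch using only the standard library. The names hello_app and call_app are made up for illustration; call_app plays the role of the web server by building the environ dict and supplying start_response.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: a callable taking (environ, start_response)."""
    body = ("Hello from %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    # The return value is an iterable of byte strings.
    return [body]


def call_app(app, path):
    """Act as the server side: build environ, call the app, collect the reply."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

The application never touches sockets or HTTP parsing; any WSGI-compliant server can host it unchanged.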

Paste Deployment

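PasteDeploy is something I only learned about when I started reading the OpenStack code. It establishes the link between the server and the application. In short, it provides a kind of service discovery and hides the application's implementation details. More on this when we look at the Nova code.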

Common Patterns

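OpenStack's GitHub organization splits the components into separate repositories, but they almost all follow a similar structure. Three files deserve particular attention: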

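  1. api.py: provides the externally facing interface; a good place to start tracing how each feature is implemented.
  2. rpcapi.py: wraps the RPC requests, most of which are asynchronous calls.
  3. manager.py: implements the RPC methods; the names correspond essentially one-to-one with the calls in rpcapi.py.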
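The relationship between the three files can be sketched as a toy, in-process model. Everything here is hypothetical (the class names, build_instance, and FakeRPCClient, which stands in for the message queue); it only illustrates how a method name travels from api.py through rpcapi.py to manager.py.

```python
class Manager:
    """Server side (manager.py): implements the methods named in RPC messages."""

    def __init__(self):
        self.built = []

    def build_instance(self, context, name):
        self.built.append(name)
        return "built %s" % name


class FakeRPCClient:
    """Stand-in for the message bus: finds the method by name on the manager."""

    def __init__(self, manager):
        self.manager = manager

    def send(self, context, method, **kwargs):
        return getattr(self.manager, method)(context, **kwargs)


class RPCAPI:
    """Client side (rpcapi.py): one thin method per remote method, same name."""

    def __init__(self, client):
        self.client = client

    def build_instance(self, context, name):
        return self.client.send(context, "build_instance", name=name)


class API:
    """Entry point (api.py): validates input, then delegates to the RPC API."""

    def __init__(self, rpcapi):
        self.rpcapi = rpcapi

    def create(self, context, name):
        if not name:
            raise ValueError("name must not be empty")
        return self.rpcapi.build_instance(context, name)
```

Because the method name is passed as a string, searching manager.py for the same name is usually enough to find the implementation of a given RPC call.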

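Also worth attention is the setup.cfg file in the repository root, especially its console_scripts section. The functions behind the commands available after installing OpenStack are all listed there, which makes it a signpost for studying the source.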
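A console_scripts entry has the form "command = package.module:function". As a rough sketch of how such a string maps to a callable (the real wrapper scripts are generated by the packaging tools; resolve_entry_point is a made-up helper), the lookup amounts to an import followed by attribute access:

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # The part after the colon may be dotted, e.g. 'Class.method'.
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return obj
```

With Nova installed, resolve_entry_point("nova.cmd.api:main") would return the main function that the nova-api command runs; the standard-library target "json:dumps" works as a stand-in for trying it out.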

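One more point: OpenStack's directory layout is organized by function. For example, the code in Nova's compute directory does not necessarily all run on nova-compute nodes; rather, everything related to creating virtual machines lives there.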

Configuration Loading and Route Binding

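Taking the Stein release of Nova as an example: according to the architecture diagram, nova-api is the entry point for all VM-related requests. First look at the definitions in Nova's setup.cfg: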

... (omitted) ...

nova.compute.monitors.cpu =
    virt_driver = nova.compute.monitors.cpu.virt_driver:Monitor

console_scripts =
    nova-api = nova.cmd.api:main
    nova-api-metadata = nova.cmd.api_metadata:main
    nova-compute = nova.cmd.compute:main
    nova-conductor = nova.cmd.conductor:main
    nova-console = nova.cmd.console:main
    nova-manage = nova.cmd.manage:main
    nova-network = nova.cmd.network:main
    nova-scheduler = nova.cmd.scheduler:main

nova.scheduler.driver =
    filter_scheduler = nova.scheduler.filter_scheduler:FilterScheduler
    fake_scheduler = nova.tests.unit.scheduler.fakes:FakeScheduler
... (omitted) ...

The configuration makes it clear that nova-api corresponds to the main() function in nova/cmd/api.py:

def main():
    config.parse_args(sys.argv)
    logging.setup(CONF, "nova")
    objects.register_all()
    gmr_opts.set_defaults(CONF)
    if 'osapi_compute' in CONF.enabled_apis:
        # NOTE(mriedem): This is needed for caching the nova-compute service
        # version.
        objects.Service.enable_min_version_cache()
    log = logging.getLogger(__name__)
    gmr.TextGuruMeditation.setup_autorun(version, conf=CONF)
    launcher = service.process_launcher()
    started = 0
    for api in CONF.enabled_apis:
        should_use_ssl = api in CONF.enabled_ssl_apis
        try:
            server = service.WSGIService(api, use_ssl=should_use_ssl)
            launcher.launch_service(server, workers=server.workers or 1)
            started += 1
        except exception.PasteAppNotFound as ex:
            log.warning("%s. ``enabled_apis`` includes bad values. "
                        "Fix to remove this warning.", ex)
    if started == 0:
        log.error('No APIs were started. '
                  'Check the enabled_apis config option.')
        sys.exit(1)
    launcher.wait()

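main() essentially does one thing: it starts the WSGI services enabled in the configuration file. By default the configuration file is /etc/nova/nova.conf, which specifies which APIs are enabled along with settings for Glance, Cinder, Keystone, libvirt, the database and so on. A brief note on oslo.config (imported as oslo_config): it is a base library, split out of OpenStack, dedicated to handling configuration files.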
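The loop in main() can be illustrated without oslo.config. The sketch below uses the standard-library configparser as a stand-in (an assumption for illustration only; Nova reads these options through oslo.config), and SAMPLE_CONF, parse_list and plan_services are hypothetical names. It shows how enabled_apis and enabled_ssl_apis decide which services are started and whether each uses SSL.

```python
import configparser

# Hypothetical nova.conf fragment; only the two options used by main().
SAMPLE_CONF = """
[DEFAULT]
enabled_apis = osapi_compute,metadata
enabled_ssl_apis =
"""


def parse_list(value):
    """Split a comma-separated option into a list, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def plan_services(conf_text):
    """Return (api_name, use_ssl) pairs, mirroring the for loop in main()."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    defaults = parser.defaults()
    enabled = parse_list(defaults.get("enabled_apis", ""))
    ssl_apis = parse_list(defaults.get("enabled_ssl_apis", ""))
    return [(api, api in ssl_apis) for api in enabled]
```

Each pair corresponds to one service.WSGIService(api, use_ssl=...) in the real code; an empty result is the case where main() logs an error and exits.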

Next, look at the definition of WSGIService in nova/service.py:

class WSGIService(service.Service):
    """Provides ability to launch API from a 'paste' configuration."""

    def __init__(self, name, loader=None, use_ssl=False, max_url_len=None):
        """Initialize, but do not start the WSGI server.

        :param name: The name of the WSGI server given to the loader.
        :param loader: Loads the WSGI application using the given name.
        :returns: None

        """
        self.name = name
        # NOTE(danms): Name can be metadata, osapi_compute, per
        # nova.service's enabled_apis
        self.binary = 'nova-%s' % name

        LOG.warning('Running %s using eventlet is deprecated. Deploy with '
                    'a WSGI server such as uwsgi or mod_wsgi.', self.binary)

        self.topic = None
        self.manager = self._get_manager()
        self.loader = loader or api_wsgi.Loader()
        self.app = self.loader.load_app(name)
        # inherit all compute_api worker counts from osapi_compute
        if name.startswith('openstack_compute_api'):
            wname = 'osapi_compute'
        else:
            wname = name
        self.host = getattr(CONF, '%s_listen' % name, "0.0.0.0")
        self.port = getattr(CONF, '%s_listen_port' % name, 0)
        self.workers = (getattr(CONF, '%s_workers' % wname, None) or
                        processutils.get_worker_count())
        if self.workers and self.workers < 1:
            worker_name = '%s_workers' % name
            msg = (_("%(worker_name)s value of %(workers)s is invalid, "
                     "must be greater than 0") %
                   {'worker_name': worker_name,
                    'workers': str(self.workers)})
            raise exception.InvalidInput(msg)
        self.use_ssl = use_ssl
        self.server = wsgi.Server(name,
                                  self.app,
                                  host=self.host,
                                  port=self.port,
                                  use_ssl=self.use_ssl,
                                  max_url_len=max_url_len)
        # Pull back actual port used
        self.port = self.server.port
        self.backdoor_port = None
        setup_profiler(name, self.host)

Here, self.app = self.loader.load_app(name) is the part mentioned above that uses Paste Deployment to connect the server and the application. The relevant code is in nova/api/wsgi.py:

class Loader(object):
    """Used to load WSGI applications from paste configurations."""

    def __init__(self, config_path=None):
        """Initialize the loader, and attempt to find the config.

        :param config_path: Full or relative path to the paste config.
        :returns: None

        """
        self.config_path = None

        config_path = config_path or CONF.wsgi.api_paste_config
        if not os.path.isabs(config_path):
            self.config_path = CONF.find_file(config_path)
        elif os.path.exists(config_path):
            self.config_path = config_path

        if not self.config_path:
            raise exception.ConfigNotFound(path=config_path)

    def load_app(self, name):
        """Return the paste URLMap wrapped WSGI application.

        :param name: Name of the application to load.
        :returns: Paste URLMap object wrapping the requested application.
        :raises: `nova.exception.PasteAppNotFound`

        """
        try:
            LOG.debug("Loading app %(name)s from %(path)s",
                      {'name': name, 'path': self.config_path})
            return deploy.loadapp("config:%s" % self.config_path, name=name)
        except LookupError:
            LOG.exception(_LE("Couldn't lookup app: %s"), name)
            raise exception.PasteAppNotFound(name=name, path=self.config_path)

By default the paste configuration file is /etc/nova/api-paste.ini, with the following content:

############
# Metadata #
############
[composite:metadata]
use = egg:Paste#urlmap
/: meta

[pipeline:meta]
pipeline = cors metaapp

[app:metaapp]
paste.app_factory = nova.api.metadata.handler:MetadataRequestHandler.factory

#############
# OpenStack #
#############

[composite:osapi_compute]
use = call:nova.api.openstack.urlmap:urlmap_factory
/: oscomputeversions
# v21 is an exactly feature match for v2, except it has more stringent
# input validation on the wsgi surface (prevents fuzzing early on the
# API). It also provides new features via API microversions which are
# opt into for clients. Unaware clients will receive the same frozen
# v2 API feature set, but with some relaxed validation
/v2: openstack_compute_api_v21_legacy_v2_compatible
/v2.1: openstack_compute_api_v21

[composite:openstack_compute_api_v21]
use = call:nova.api.auth:pipeline_factory_v21
noauth2 = cors http_proxy_to_wsgi compute_req_id faultwrap request_log sizelimit osprofiler noauth2 osapi_compute_app_v21
keystone = cors http_proxy_to_wsgi compute_req_id faultwrap request_log sizelimit osprofiler authtoken keystonecontext osapi_compute_app_v21

[composite:openstack_compute_api_v21_legacy_v2_compatible]
use = call:nova.api.auth:pipeline_factory_v21
noauth2 = cors http_proxy_to_wsgi compute_req_id faultwrap request_log sizelimit osprofiler noauth2 legacy_v2_compatible osapi_compute_app_v21
keystone = cors http_proxy_to_wsgi compute_req_id faultwrap request_log sizelimit osprofiler authtoken keystonecontext legacy_v2_compatible osapi_compute_app_v21

[filter:request_log]
paste.filter_factory = nova.api.openstack.requestlog:RequestLog.factory

[filter:compute_req_id]
paste.filter_factory = nova.api.compute_req_id:ComputeReqIdMiddleware.factory

[filter:faultwrap]
paste.filter_factory = nova.api.openstack:FaultWrapper.factory

[filter:noauth2]
paste.filter_factory = nova.api.openstack.auth:NoAuthMiddleware.factory

[filter:osprofiler]
paste.filter_factory = nova.profiler:WsgiMiddleware.factory

[filter:sizelimit]
paste.filter_factory = oslo_middleware:RequestBodySizeLimiter.factory

[filter:http_proxy_to_wsgi]
paste.filter_factory = oslo_middleware.http_proxy_to_wsgi:HTTPProxyToWSGI.factory

[filter:legacy_v2_compatible]
paste.filter_factory = nova.api.openstack:LegacyV2CompatibleWrapper.factory

[app:osapi_compute_app_v21]
paste.app_factory = nova.api.openstack.compute:APIRouterV21.factory

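A PasteDeploy configuration file can contain multiple sections. Each section has a name and a set of key-value pairs. The section name consists of a type and a name separated by a colon. Each key-value pair takes one line, with the key and value separated by an equals sign. Commonly used types are: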

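  • composite: dispatches requests to other applications. The key-value pair use = egg:Paste#urlmap means the urlmap application from Paste is used; urlmap maps requests to different applications by path prefix.
  • app: a basic WSGI application, typically declared as paste.app_factory = <module>:<class>.<class method>.
  • filter-app: a filter whose next key names what handles the request next; the target can be a plain WSGI application or another filter.
  • filter: also a filter, but used differently from filter-app: other applications refer to it with filter-with.
  • pipeline: chains several filters together, ending with a WSGI application.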
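What a pipeline amounts to can be sketched in plain Python. This is a simplification and not PasteDeploy's implementation: make_tagging_filter, final_app and build_pipeline are hypothetical names, and the filters do nothing except record that they saw the request.

```python
def make_tagging_filter(tag):
    """Return a filter: a function that wraps one WSGI app in another."""
    def filter_(app):
        def wrapped(environ, start_response):
            # Record that this filter saw the request, then pass it on.
            environ.setdefault("demo.trace", []).append(tag)
            return app(environ, start_response)
        return wrapped
    return filter_


def final_app(environ, start_response):
    """The WSGI application at the end of the pipeline."""
    environ.setdefault("demo.trace", []).append("app")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def build_pipeline(filters, app):
    """Wrap app so that the filters run in the order listed, the app last."""
    for filter_ in reversed(filters):
        app = filter_(app)
    return app
```

Wrapping in reverse order is what makes the first name in a line such as "pipeline = cors metaapp" the outermost layer, so it sees the request first.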

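For example, metadata uses Paste's own URL mapping directly, whereas osapi_compute uses Nova's own urlmap_factory. Taking the v2 API as an example: an incoming request is handed to openstack_compute_api_v21_legacy_v2_compatible, passes through the filters listed in the pipeline, and is finally handled by osapi_compute_app_v21. The code behind that application is the factory method of the APIRouterV21 class in nova/api/openstack/compute/routes.py, which essentially creates an instance of the class and sets up the mapping between routes and handler functions.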
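The prefix-based dispatch of the osapi_compute composite can be illustrated with a small sketch. It is a simplification under the assumption that the longest matching prefix wins; make_urlmap is a hypothetical helper and returns application names rather than WSGI apps.

```python
def make_urlmap(mapping):
    """Build a dispatcher from {path_prefix: app_name}, longest prefix first."""
    prefixes = sorted(mapping, key=len, reverse=True)

    def dispatch(path):
        for prefix in prefixes:
            stripped = prefix.rstrip("/")
            # Match whole path segments only, so "/v2" does not claim "/v2.1".
            if path == stripped or path.startswith(stripped + "/"):
                return mapping[prefix]
        return None

    return dispatch
```

Matching on whole segments matters here because "/v2.1/servers" also begins with the characters "/v2".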

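Once this is done, a wsgi.Server is created with eventlet and bound to the listening IP and port. This lives in the Server class in nova/wsgi.py; the code is not reproduced here.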

Creating a Virtual Machine

Nova-Api

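According to the documentation, creating a virtual machine is simply a matter of sending a POST request to /servers. As seen above, nova/api/openstack/compute/routes.py defines the corresponding handler: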

('/servers', {
        'GET': [server_controller, 'index'],
        'POST': [server_controller, 'create']
    })
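A route entry like this pairs an HTTP method with a controller object and the name of one of its methods. The following sketch shows one way such a table could be consulted; the ServersController here is a hypothetical stub and dispatch is a made-up helper, not Nova's router.

```python
class ServersController:
    """Hypothetical stand-in for the real controller."""

    def index(self, req):
        return {"servers": []}

    def create(self, req, body):
        return {"server": {"name": body["server"]["name"]}}


server_controller = ServersController()

# Same shape as the entry above: path -> {HTTP method: [controller, action]}.
ROUTES = {
    "/servers": {
        "GET": [server_controller, "index"],
        "POST": [server_controller, "create"],
    },
}


def dispatch(method, path, req=None, **kwargs):
    """Look up (path, method) and call the named controller method."""
    controller, action = ROUTES[path][method]
    return getattr(controller, action)(req, **kwargs)
```

Storing the action as a string means the table can be read as plain data: to trace POST /servers, look for a method named create on the controller.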

This server_controller is the ServersController class in nova/api/openstack/compute/servers.py. Its create method is defined as follows:

def create(self, req, body):
    """Creates a new server for a given user."""
    context = req.environ['nova.context']
    server_dict = body['server']
    password = self._get_server_admin_password(server_dict)
    name = common.normalize_name(server_dict['name'])

    ......

    availability_zone = server_dict.pop("availability_zone", None)
    image_uuid = self._image_from_req_data(server_dict, create_kwargs)
    self._process_networks_for_create(
        context, target, server_dict, create_kwargs)
    flavor_id = self._flavor_id_from_req_data(body)
    try:
        inst_type = flavors.get_flavor_by_flavor_id(
                flavor_id, ctxt=context, read_deleted="no")
        supports_multiattach = common.supports_multiattach_volume(req)
        supports_port_resource_request = \
            common.supports_port_resource_request(req)
        (instances, resv_id) = self.compute_api.create(context,
            inst_type,
            image_uuid,
            display_name=name,
            display_description=description,
            availability_zone=availability_zone,
            forced_host=host, forced_node=node,
            metadata=server_dict.get('metadata', {}),
            admin_password=password,
            check_server_group_quota=True,
            supports_multiattach=supports_multiattach,
            supports_port_resource_request=supports_port_resource_request,
            **create_kwargs)
    except (exception.QuotaError,
            exception.PortLimitExceeded) as error:
        raise exc.HTTPForbidden(
            explanation=error.format_message())
    
    ......

After some fetching and validation, it calls self.compute_api.create(). This compute_api is the API class defined in nova/compute/api.py, whose create function is defined as follows:

@hooks.add_hook("create_instance")
def create(self, context, instance_type,
            image_href, kernel_id=None, ramdisk_id=None,
            min_count=None, max_count=None,
            display_name=None, display_description=None,
            key_name=None, key_data=None, security_groups=None,
            availability_zone=None, forced_host=None, forced_node=None,
            user_data=None, metadata=None, injected_files=None,
            admin_password=None, block_device_mapping=None,
            access_ip_v4=None, access_ip_v6=None, requested_networks=None,
            config_drive=None, auto_disk_config=None, scheduler_hints=None,
            legacy_bdm=True, shutdown_terminate=False,
            check_server_group_quota=False, tags=None,
            supports_multiattach=False, trusted_certs=None,
            supports_port_resource_request=False,
            requested_host=None, requested_hypervisor_hostname=None):
    ......
    return self._create_instance(
        context, instance_type,
        image_href, kernel_id, ramdisk_id,
        min_count, max_count,
        display_name, display_description,
        key_name, key_data, security_groups,
        availability_zone, user_data, metadata,
        injected_files, admin_password,
        access_ip_v4, access_ip_v6,
        requested_networks, config_drive,
        block_device_mapping, auto_disk_config,
        filter_properties=filter_properties,
        legacy_bdm=legacy_bdm,
        shutdown_terminate=shutdown_terminate,
        check_server_group_quota=check_server_group_quota,
        tags=tags, supports_multiattach=supports_multiattach,
        trusted_certs=trusted_certs,
        supports_port_resource_request=supports_port_resource_request,
        requested_host=requested_host,
        requested_hypervisor_hostname=requested_hypervisor_hostname)


def _create_instance(self, context, instance_type,
               image_href, kernel_id, ramdisk_id,
               min_count, max_count,
               display_name, display_description,
               key_name, key_data, security_groups,
               availability_zone, user_data, metadata, injected_files,
               admin_password, access_ip_v4, access_ip_v6,
               requested_networks, config_drive,
               block_device_mapping, auto_disk_config, filter_properties,
               reservation_id=None, legacy_bdm=True, shutdown_terminate=False,
               check_server_group_quota=False, tags=None,
               supports_multiattach=False, trusted_certs=None,
               supports_port_resource_request=False,
               requested_host=None, requested_hypervisor_hostname=None):
    # Normalize and setup some parameters
    ......
    if image_href:
        image_id, boot_meta = self._get_image(context, image_href)
    else:
        # This is similar to the logic in _retrieve_trusted_certs_object.
        if (trusted_certs or
            (CONF.glance.verify_glance_signatures and
                CONF.glance.enable_certificate_validation and
                CONF.glance.default_trusted_certificate_ids)):
            msg = _("Image certificate validation is not supported "
                    "when booting from volume")
            raise exception.CertificateValidationFailed(message=msg)
        image_id = None
        boot_meta = self._get_bdm_image_metadata(
            context, block_device_mapping, legacy_bdm)

    self._check_auto_disk_config(image=boot_meta,
                                    auto_disk_config=auto_disk_config)

    ......
    self.compute_task_api.schedule_and_build_instances(
        context,
        build_requests=build_requests,
        request_spec=request_specs,
        image=boot_meta,
        admin_password=admin_password,
        injected_files=injected_files,
        requested_networks=requested_networks,
        block_device_mapping=block_device_mapping,
        tags=tags)
    return instances, reservation_id

In the end this function calls self.compute_task_api.schedule_and_build_instances. This compute_task_api is the ComputeTaskAPI class defined in nova/conductor/api.py, and the method is defined as follows:

def schedule_and_build_instances(self, context, build_requests,
                                request_spec, image,
                                admin_password, injected_files,
                                requested_networks, block_device_mapping,
                                tags=None):
    self.conductor_compute_rpcapi.schedule_and_build_instances(
        context, build_requests, request_spec, image,
        admin_password, injected_files, requested_networks,
        block_device_mapping, tags)

This function is simple: it just calls the schedule_and_build_instances method of the ComputeTaskAPI class in nova/conductor/rpcapi.py:

def schedule_and_build_instances(self, context, build_requests,
                                request_specs,
                                image, admin_password, injected_files,
                                requested_networks,
                                block_device_mapping,
                                tags=None):
    version = '1.17'
    kw = {'build_requests': build_requests,
            'request_specs': request_specs,
            'image': jsonutils.to_primitive(image),
            'admin_password': admin_password,
            'injected_files': injected_files,
            'requested_networks': requested_networks,
            'block_device_mapping': block_device_mapping,
            'tags': tags}

    if not self.client.can_send_version(version):
        version = '1.16'
        del kw['tags']

    cctxt = self.client.prepare(version=version)
    cctxt.cast(context, 'schedule_and_build_instances', **kw)

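cast means this is an asynchronous call, and schedule_and_build_instances is the name of the method being invoked. Although the directories traversed go from api to compute to conductor, everything above runs inside the nova-api process. Once this asynchronous request has been sent, the nova-api phase ends, the response is returned, and the virtual machine's state is building.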
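The version check in the rpcapi method above is worth a closer look: if the receiving side cannot accept version 1.17, the sender falls back to 1.16 and drops the argument that only the newer version understands. Here is a self-contained sketch of that fallback, with a hypothetical FakeClient whose version comparison is simplified to comparing numeric tuples.

```python
def _as_tuple(version):
    """Turn '1.17' into (1, 17) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class FakeClient:
    """Hypothetical client that knows the highest version its peer accepts."""

    def __init__(self, max_version):
        self.max_version = max_version
        self.sent = []

    def can_send_version(self, version):
        return _as_tuple(version) <= _as_tuple(self.max_version)

    def cast(self, method, version, **kwargs):
        # Fire and forget: record the message and return nothing.
        self.sent.append((method, version, kwargs))


def schedule(client, build_requests, tags=None):
    """Send with the newest version the peer supports, trimming arguments."""
    version = "1.17"
    kw = {"build_requests": build_requests, "tags": tags}
    if not client.can_send_version(version):
        version = "1.16"
        del kw["tags"]  # the older peer does not know this argument
    client.cast("schedule_and_build_instances", version, **kw)
```

This kind of negotiation is what lets services of different releases keep talking to each other during a rolling upgrade.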

Nova-Conductor-1

As mentioned above, manager.py holds the implementations of the RPC methods, so the function can be found in nova/conductor/manager.py:

def schedule_and_build_instances(self, context, build_requests,
                                     request_specs, image,
                                     admin_password, injected_files,
                                     requested_networks, block_device_mapping,
                                     tags=None):
    # Add all the UUIDs for the instances
    instance_uuids = [spec.instance_uuid for spec in request_specs]
    try:
        host_lists = self._schedule_instances(context, request_specs[0],
                instance_uuids, return_alternates=True)
    except Exception as exc:
        LOG.exception('Failed to schedule instances')
        self._bury_in_cell0(context, request_specs[0], exc,
                            build_requests=build_requests,
                            block_device_mapping=block_device_mapping,
                            tags=tags)
        return
    
    ......

def _schedule_instances(self, context, request_spec,
                        instance_uuids=None, return_alternates=False):
    scheduler_utils.setup_instance_group(context, request_spec)
    with timeutils.StopWatch() as timer:
        host_lists = self.query_client.select_destinations(
            context, request_spec, instance_uuids, return_objects=True,
            return_alternates=return_alternates)
    LOG.debug('Took %0.2f seconds to select destinations for %s '
                'instance(s).', timer.elapsed(), len(instance_uuids))
    return host_lists

This function first calls _schedule_instances, which in turn calls select_destinations. self.query_client is an instance of the SchedulerQueryClient class in nova/scheduler/query.py, and its code is very straightforward:

    def select_destinations(self, context, spec_obj, instance_uuids,
            return_objects=False, return_alternates=False):
        return self.scheduler_rpcapi.select_destinations(context, spec_obj,
                instance_uuids, return_objects, return_alternates)

Seeing rpcapi, sure enough the corresponding function is in nova/scheduler/rpcapi.py:

def select_destinations(self, ctxt, spec_obj, instance_uuids,
            return_objects=False, return_alternates=False):
    ......
    cctxt = self.client.prepare(
        version=version, call_monitor_timeout=CONF.rpc_response_timeout,
        timeout=CONF.long_rpc_timeout)
    return cctxt.call(ctxt, 'select_destinations', **msg_args)

Note that this uses call rather than cast, so it is a synchronous call: nova-conductor blocks until nova-scheduler returns.
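The difference between the two styles can be sketched with a thread and a queue standing in for the remote service and the message bus. Worker and its handlers are hypothetical; this is not how oslo.messaging is implemented, only an illustration of blocking versus non-blocking semantics.

```python
import queue
import threading


class Worker:
    """Hypothetical service: a thread consuming (method, args, reply) messages."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            method, args, reply = self.inbox.get()
            result = self.handlers[method](*args)
            if reply is not None:
                reply.put(result)  # only call() waits for an answer
            self.inbox.task_done()

    def cast(self, method, *args):
        """Asynchronous: enqueue the message and return immediately."""
        self.inbox.put((method, args, None))

    def call(self, method, *args, timeout=5):
        """Synchronous: enqueue the message, then block until the reply arrives."""
        reply = queue.Queue(maxsize=1)
        self.inbox.put((method, args, reply))
        return reply.get(timeout=timeout)
```

call returns the handler's result, while cast returns None right away and the work completes later on the worker thread.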

Nova-Scheduler

Following the usual pattern, the function is found in nova/scheduler/manager.py:

@messaging.expected_exceptions(exception.NoValidHost)
def select_destinations(self, ctxt, request_spec=None,
        filter_properties=None, spec_obj=_sentinel, instance_uuids=None,
        return_objects=False, return_alternates=False):
    ......
    selections = self.driver.select_destinations(ctxt, spec_obj,
            instance_uuids, alloc_reqs_by_rp_uuid, provider_summaries,
            allocation_request_version, return_alternates)
    # If `return_objects` is False, we need to convert the selections to
    # the older format, which is a list of host state dicts.
    if not return_objects:
        selection_dicts = [sel[0].to_dict() for sel in selections]
        return jsonutils.to_primitive(selection_dicts)
    return selections

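This function is essentially another layer of wrapping; what it ultimately calls is the scheduler driver (scheduler.driver) set in nova.conf. Taking the built-in FilterScheduler as an example, it is in nova/scheduler/filter_scheduler.py and its entry point is _schedule. It first gets all hosts, then filters and weighs them in _get_sorted_hosts, and returns the hosts that satisfy the request to nova-conductor.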
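The filter-then-weigh idea can be sketched in a few lines. Hosts are plain dicts and the filter and weigher functions are hypothetical examples; Nova's real filters and weighers are classes with more elaborate logic.

```python
def ram_filter(host, spec):
    """Keep hosts with enough free memory for the request."""
    return host["free_ram_mb"] >= spec["ram_mb"]


def disk_filter(host, spec):
    """Keep hosts with enough free disk for the request."""
    return host["free_disk_gb"] >= spec["disk_gb"]


def ram_weigher(host):
    """Prefer hosts with more free memory."""
    return host["free_ram_mb"]


def get_sorted_hosts(hosts, spec, filters, weigher):
    """Drop hosts failing any filter, then order the rest by weight, best first."""
    passing = [h for h in hosts if all(f(h, spec) for f in filters)]
    return sorted(passing, key=weigher, reverse=True)
```

Filtering answers "can this host run the instance at all", and weighing answers "which of the remaining hosts is the best choice".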

Nova-Conductor-2

Back to the schedule_and_build_instances method in nova/conductor/manager.py:

def schedule_and_build_instances(self, context, build_requests,
                                     request_specs, image,
                                     admin_password, injected_files,
                                     requested_networks, block_device_mapping,
                                     tags=None):
    
    ......

    zipped = six.moves.zip(build_requests, request_specs, host_lists,
                            instances)
    for (build_request, request_spec, host_list, instance) in zipped:
        
        ......

        with obj_target_cell(instance, cell) as cctxt:
            # send a state update notification for the initial create to
            # show it going from non-existent to BUILDING
            # This can lazy-load attributes on instance.
            notifications.send_update_with_states(cctxt, instance, None,
                    vm_states.BUILDING, None, None, service="conductor")
            objects.InstanceAction.action_start(
                cctxt, instance.uuid, instance_actions.CREATE,
                want_result=False)
            instance_bdms = self._create_block_device_mapping(
                cell, instance.flavor, instance.uuid, block_device_mapping)
            instance_tags = self._create_tags(cctxt, instance.uuid, tags)

        # Update mapping for instance. Normally this check is guarded by
        # a try/except but if we're here we know that a newer nova-api
        # handled the build process and would have created the mapping
        inst_mapping = objects.InstanceMapping.get_by_instance_uuid(
            context, instance.uuid)
        inst_mapping.cell_mapping = cell
        inst_mapping.save()
        
        ......
        
        legacy_secgroups = [s.identifier
                            for s in request_spec.security_groups]
        with obj_target_cell(instance, cell) as cctxt:
            self.compute_rpcapi.build_and_run_instance(
                cctxt, instance=instance, image=image,
                request_spec=request_spec,
                filter_properties=filter_props,
                admin_password=admin_password,
                injected_files=injected_files,
                requested_networks=requested_networks,
                security_groups=legacy_secgroups,
                block_device_mapping=instance_bdms,
                host=host.service_host, node=host.nodename,
                limits=host.limits, host_list=host_list)

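Once the host list is available, a for loop is used because several instances may be created at the same time. After some database operations an RPC call is made, defined in nova/compute/rpcapi.py: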

def build_and_run_instance(self, ctxt, instance, host, image, request_spec,
        filter_properties, admin_password=None, injected_files=None,
        requested_networks=None, security_groups=None,
        block_device_mapping=None, node=None, limits=None,
        host_list=None):
    ......    
    cctxt.cast(ctxt, 'build_and_run_instance', **kwargs)

Nova-Compute

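The pattern is familiar by now, so go straight to nova/compute/manager.py. The core function is _build_and_run_instance, which prepares image and network resources and performs various validations, state changes and notifications. Finally it calls the spawn method of the configured driver. Taking libvirt as an example, the file is nova/virt/libvirt/driver.py: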

def spawn(self, context, instance, image_meta, injected_files,
            admin_password, allocations, network_info=None,
            block_device_info=None, power_on=True):
    disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
                                        instance,
                                        image_meta,
                                        block_device_info)
    injection_info = InjectionInfo(network_info=network_info,
                                    files=injected_files,
                                    admin_pass=admin_password)
    gen_confdrive = functools.partial(self._create_configdrive,
                                        context, instance,
                                        injection_info)
    self._create_image(context, instance, disk_info['mapping'],
                        injection_info=injection_info,
                        block_device_info=block_device_info)

    # Required by Quobyte CI
    self._ensure_console_log_for_instance(instance)

    # Does the guest need to be assigned some vGPU mediated devices ?
    mdevs = self._allocate_mdevs(allocations)

    xml = self._get_guest_xml(context, instance, network_info,
                                disk_info, image_meta,
                                block_device_info=block_device_info,
                                mdevs=mdevs)
    self._create_domain_and_network(
        context, xml, instance, network_info,
        block_device_info=block_device_info,
        post_xml_callback=gen_confdrive,
        destroy_disks_on_failure=True,
        power_on=power_on)
    LOG.debug("Guest created on hypervisor", instance=instance)

    def _wait_for_boot():
        """Called at an interval until the VM is running."""
        state = self.get_info(instance).state

        if state == power_state.RUNNING:
            LOG.info("Instance spawned successfully.", instance=instance)
            raise loopingcall.LoopingCallDone()

    if power_on:
        timer = loopingcall.FixedIntervalLoopingCall(_wait_for_boot)
        timer.start(interval=0.5).wait()
    else:
        LOG.info("Instance spawned successfully.", instance=instance)

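Here the image is fetched and the root disk is created, the XML is generated (by default under /etc/libvirt/qemu), the network and domain are defined and started, and the virtual machine's state becomes running. At this point, creation of the virtual machine is complete.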
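The last part of spawn polls the hypervisor at a fixed interval until the guest reports that it is running, and the polled function signals completion by raising an exception. A standard-library sketch of that pattern follows; LoopDone and poll_until_done are hypothetical names, and the max_attempts limit is an addition for safety that the code above does not have.

```python
import time


class LoopDone(Exception):
    """Raised by the polled function to stop the loop."""


def poll_until_done(func, interval=0.5, max_attempts=None):
    """Call func every `interval` seconds until it raises LoopDone.

    Returns the number of attempts made. max_attempts is an assumption
    added here so the sketch cannot loop forever.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            func()
        except LoopDone:
            return attempts
        time.sleep(interval)
    raise TimeoutError("gave up after %d attempts" % attempts)
```

Using an exception as the stop signal keeps the polled function free of return-value conventions: it simply returns while there is nothing to report.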