OAR Cloud Project
This project aims at creating a lightweight cloud computing architecture on top of the batch scheduler OAR. The main goal is to bring the best of cloud computing to the Grid world.
You can access the Git repository on github (https://github.com/mickours/oar-cloud).
- Project Report (https://github.com/mickours/oar-cloud/blob/master/doc/presentation_RICM4/RICM4_Projet_Cloud_Computing_Rapport_CALVI_MAURICE.pdf?raw=true)
- Presentation (https://github.com/mickours/oar-cloud/blob/master/doc/presentation_RICM4/RICM4_Projet_Cloud_Computing_Presentation_CALVI_MAURICE.pdf?raw=true)
- Flyer (https://raw.github.com/mickours/oar-cloud/master/doc/presentation_RICM4/RICM4_Projet_Cloud_Computing_Flyer_CALVI_MAURICE.png)
Project Members
This project is proposed by:
Olivier Richard - teacher and researcher in the RICM training at Polytech Grenoble
Three students from RICM are working on it:
- Jordan Calvi (RICM4)
- Alexandre Maurice (RICM4)
- Michael Mercier (RICM5)
State of the art
This project began in January 2013. Here is the current state of the art concerning cloud computing technologies and the underlying virtualization technologies.
Cloud computing
Definition
Cloud computing is the use of computing and storage resources through a network. It provides a stack of services, from the hardware up to the applications, each named "Something as a Service". You can read a more exhaustive definition on Wikipedia.
IaaS
The layer we are interested in is the Infrastructure as a Service (IaaS) layer. It provides access to computation resources through a hardware abstraction made by virtualization. The leader in this domain is Amazon EC2. Its proprietary IaaS platform is considered the industry standard, and almost every other cloud computing solution is compatible with its API. Through a Command Line Interface (CLI) and a Web interface, it provides a simple way to deploy scalable web applications and services. It is an on-demand self-service, and you pay only when you use it. The main features of Amazon EC2 are (from the Amazon EC2 web site):
- Select a pre-configured, templated Amazon Machine Image (AMI) to get up and running immediately. Or create an AMI containing your applications, libraries, data, and associated configuration settings.
- Configure security and network access on your Amazon EC2 instance.
- Choose which instance type(s) you want, then start, terminate, and monitor as many instances of your AMI as needed, using the web service APIs or the variety of management tools provided.
- Determine whether you want to run in multiple locations, utilize static IP endpoints, or attach persistent block storage to your instances.
- Pay only for the resources that you actually consume, like instance-hours or data transfer.
But there are lots of alternatives to Amazon. Several companies provide the same kind of services using proprietary or open source solutions. You can see a comparison of some of these solutions here.
Virtualization technologies
Hardware virtualization
There are many system virtualization technologies. For system virtualization Amazon uses the Xen technology, like most of its competitors. As we are looking for open source technologies, there are two serious alternatives to consider: KVM and LXC.
- Xen
- Well known and used in large cloud solutions, Xen is considered reliable. It is a bare-metal (type 1) hypervisor that runs directly on the hardware. This means that every operating system running on top of this hypervisor needs to be ported to fit the paravirtualization API. It has also recently added hardware-assisted virtualization, like KVM.
- KVM
- Included in the Linux kernel, KVM (Kernel-based Virtual Machine) uses hardware-assisted virtualization for x86 processors with virtualization extensions (Intel VT or AMD-V). It seems roughly equivalent to Xen, but some say it is easier to use.
- LXC
- The most recent one, LXC (LinuX Containers), uses another kind of virtualization technique based on isolation. The major drawback is that it only works for Linux operating systems and programs. Also, the current stable version (0.7.5) is still a work in progress and the API is not stabilized yet. There are many security issues that are not really solved for now[1] [2]. The major advantages are its speed and the fact that LXC is directly available in the kernel.
We want to know if LXC can be a good alternative to Xen or KVM.
Libvirt is a useful tool for virtualization. It allows using the different virtualization tools mentioned above through a single interface. It uses XML configuration files to describe the VMs and a dedicated driver for each kind of virtualization tool.
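For illustration, here is how the same virsh front-end can drive different hypervisor drivers. This is only a sketch using standard libvirt connection URIs; the domain name ubuntu1 is a placeholder:

virsh -c lxc:/// list --all          # list containers known to the LXC driver
virsh -c qemu:///system list --all   # the same command against the QEMU/KVM driver
virsh -c lxc:/// dumpxml ubuntu1     # show the XML description of a domain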
Network virtualization
In a cloud computing infrastructure we need to connect the Virtual Machines (VMs); to achieve this, network virtualization is used. In the Linux world the most common virtual network relies on the virtual interface named bridge. The bridge forwards packets at Layer 2, so it is independent of upper layer protocols. It can be managed via the brctl command. Advanced features, like filtering or firewalling, are provided by the ebtables project.
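As an illustration, a minimal bridge setup with brctl could look like the following sketch (interface names are placeholders and depend on the host):

brctl addbr br0          # create the bridge
brctl addif br0 eth0     # attach a physical interface to it
ip link set br0 up       # bring the bridge up
brctl show               # list bridges and their ports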
A more complete tool named Open vSwitch is open source and commonly used in many cloud computing solutions. It is basically a virtual switch that can connect VMs inside a node or across multiple nodes over real switches.
To meet our network requirements we will try to use Open vSwitch to build our solution.
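For comparison with the Linux bridge, the equivalent basic setup with Open vSwitch relies on ovs-vsctl (again a sketch, with placeholder names):

ovs-vsctl add-br ovsbr0          # create a virtual switch
ovs-vsctl add-port ovsbr0 eth0   # attach a physical interface to it
ovs-vsctl show                   # display the switch configuration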
Conception
Here is the big picture of what the entire solution could be. It aims at giving us an overview of the project and at helping us design a roadmap.
Context
There are two kinds of actors dealing with OAR Cloud: users and administrators. The F.u.* and F.a.* identifiers below refer to the user and administrator features, respectively.
Features
User
Main features:
F.u.0 Connect to an account
F.u.1 Launch and configure one or more instances
F.u.2 Deploy an image on one or more instances
F.u.3 Modify and save images
F.u.4 Set up alarms based on rules using metrics
F.u.5 Be informed by e-mail and/or notification of interesting events
Advanced features:
F.u.6 Automated resizing of an instance (adapting its resources) using predefined rules and schedules
F.u.7 Load balancing between several instances
F.u.8 Advanced network configuration for users: ACLs, subnets, VPN...
Administrator
F.a.0 Create/delete user account
F.a.1 Add/remove and manage resources
F.a.2 Visualize resource and instance states
F.a.3 Install and update node operating systems
F.a.4 Handle user access rights
F.a.5 Set up alarms based on rules using metrics
F.a.6 Be informed by e-mail and/or notification of interesting events
Logical Views
Here is the logical view of the OAR Cloud system. Every component on this diagram represents a software component type. The links between these components represent the communication between them.
Version 0.1
This is the first version of what could be the entire system architecture. It is designed to support all the features above.
Description of the main components:
- AccountManager
- Handles user and admin access rights
- AccessPoint
- The system access point reached by the different access tools
- InstanceManager
- Manages the creation, configuration and deletion of instances across the nodes. It also handles appliance persistence and deployment
- UserCLI & AdminCLI
- Command line access tools for users and admins
Version 0.2
Not relevant : skip it!
Version 0.3
This logical view was made after the M1 milestone, based on the knowledge gained at that step. It is a much more minimal approach. However, it only covers the F.u.0 to F.u.3 and F.a.0 to F.a.4 features.
Description of the main component:
- OARVirtCLI
- A command line tool (CLI) installed on the user's computer. It allows the user to:
- create an instance
- display the available images
- run one or more images on an instance
- connect to a running image
- stop a running image
- delete an instance
How it works
- The user creates an instance
- He gets the available images from his instance
- He launches the chosen image
- Then he connects to the image using ssh (see the example session below):
ssh root@IP
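To make this workflow concrete, here is a hypothetical session; the oarvirt command name and its sub-commands are only illustrative, since the exact CLI syntax is not fixed at this stage:

oarvirt create-instance             # hypothetical: reserve resources and create an instance
oarvirt list-images                 # hypothetical: display the images available on the instance
oarvirt run-image debian-base       # hypothetical: launch the chosen image
ssh root@<instance-IP>              # connect to the running image
oarvirt stop-image debian-base      # hypothetical: stop the image
oarvirt delete-instance             # hypothetical: release the instance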
Milestones
This table presents the milestones of the project. Each milestone is described below.
| | OAR Resource reservation | LXC | libvirt | OpenVSwitch | VM SSH Access | Access Security | Network isolation | Resource cleanup | Client CLI | Storage |
|---|---|---|---|---|---|---|---|---|---|---|
| M1 | X | X | X | | | | | | | |
| M2 | X | X | X | | | | | | | |
| M3 | X | X | X | X | X | X | | | | |
M1
In an Ubuntu 12.04 LTS environment:
- Install and configure OAR
- Install and configure LXC
- Make an OAR reservation
- Launch one or more VMs using LXC
- Connect to the VM
- Check that killing the job also kills the VM
- Script the whole procedure
The CiGri devel appliance was used as a configuration example for this. The Ubuntu 12.04 LTS distribution was chosen because it seems to be one of the few distributions where LXC works out of the box.
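A rough sketch of the manual M1 procedure, assuming OAR and LXC are already installed and configured on the node (container name and walltime are placeholders):

oarsub -I -l nodes=1,walltime=2:00:00   # reserve a node interactively with OAR
# inside the job shell: create and start an LXC container
lxc-create -t ubuntu -n test0
lxc-start -n test0 -d
lxc-console -n test0                    # connect to the container
# after the job ends (or is killed with oardel), check that the container is gone
lxc-ls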
Encountered problems
OAR settings
- the job manager "job_resource_manager_cgroups.pl" generates cpuset errors:
[job_resource_manager_cgroups][41][DEBUG] init
mount: special device none does not exist
rm: cannot remove `/dev/cpuset': Is a directory
ln: failed to create symbolic link `/dev/cpuset/oar_cgroups': Operation not permitted
[job_resource_manager_cgroups][41][ERROR] Failed to mount cgroup pseudo filesystem
- the job manager "job_resource_manager.pl" generates cpuset errors too:
[job_resource_manager][40][DEBUG] init
[debug] [2013-02-21 20:03:15.153] [MetaSched] Start of meta scheduler
sh: 1: cannot create /dev/cpuset//oar/cpu_exclusive: Permission denied
[job_resource_manager][40][ERROR] Failed to create cpuset /oar
- I thought the problem came from a database conflict, so I tried to reset the database:
% sudo oar-database --reset
Are you sure you want to reset your database ? (The database content will be lost) [y/N]: y
resetting the database 'oar'...
ERROR 1064 (42000) at line 2: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'schema' at line 1
Fail to execute /usr/lib/oar/database/mysql_reset_structure.sql
. at /usr/sbin/oar-database line 188, <FIN> line 1.
- I tried to run the `update_cpuset_id.sh` script, but it shows error messages too:
% sudo /etc/oar/update_cpuset_id.sh 127.0.0.1
The authenticity of host '[127.0.0.1]:6667 ([127.0.0.1]:6667)' can't be established.
RSA key fingerprint is 72:91:a6:40:29:60:b2:c2:18:ba:b7:66:4a:c5:d7:2f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[127.0.0.1]:6667' (RSA) to the list of known hosts.
Permission denied (publickey,keyboard-interactive).
DBD::mysql::st execute failed: Unknown column 'ip' in 'where clause' at /usr/share/perl5/OAR/IO.pm line 4774.
DBD::mysql::st fetchrow_hashref failed: fetch() without execute() at /usr/share/perl5/OAR/IO.pm line 4776.
The problem comes from the cgroup-lite service, which runs by default on Ubuntu 12.04. Stopping this service with
service cgroup-lite stop
solves the problem for OAR but breaks LXC.
I found a trick to make OAR and LXC work together: disabling the cpuset feature of OAR. In /etc/oar/oar.conf (a copy is kept in the M1 folder) I commented out CPUSET_PATH and set OARSUB_FORCE_JOB_KEY to yes, as suggested in the CPUSET_PATH comment.
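The corresponding excerpt of /etc/oar/oar.conf looks roughly like this (illustrative; the actual file is kept in the M1 folder):

# CPUSET_PATH commented out to disable OAR's cpuset management
#CPUSET_PATH="/oar"
# forced job keys are required when cpusets are disabled, as stated in the CPUSET_PATH comment
OARSUB_FORCE_JOB_KEY="yes"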
Thus, I could run an LXC container inside a job, and the container vanished when the job was killed.
Questions
- Is the OAR cpuset mandatory, even if LXC manages it?
- Is LXC mature enough to be used in production?
M1 Conclusion
During this milestone we learned several things:
- libvirt uses a separate implementation of LXC and not the LXC package itself. It can also be harder to use and removes some LXC functionality, such as rebooting or shutting down a container.
- By default, OAR and LXC (as set up on Ubuntu 12.04) are in conflict because of OAR's use of cgroups. The current default cgroup mount point used with LXC is /sys/fs/cgroup/. Each subsystem (cpuset, cpuacct, memory, ...) is mounted separately on /sys/fs/cgroup/<cgroup-subsystem>, where <cgroup-subsystem> is one of the subsystem names (see the listing after this list). It seems that rewriting the cgroups management part of the job_resource_manager_cgroups.pl file to fit these mount points should eventually make them work together. After some tries, we found out that oarsh and oarsh_shell should be modified too, as stated in the documentation, but the functions to modify do not exist. It might still be possible to configure OAR to work with LXC if we dig deeper.
- LXC security is weak by default. Some distributions use SELinux to secure the containers, but OAR does not work with SELinux. Even so, there are several ways to secure a container, using AppArmor or systemd with a sandbox[3].
- LXC is quite a young project and is not stabilized yet. The documentation is rather poor, and making LXC work depends on a lot of parameters. Moreover, the LXC configuration differs completely from one Linux distribution to another. This makes the tool really hard to understand and configure.
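For reference, the per-subsystem cgroup layout mentioned above typically looks like this on Ubuntu 12.04 with cgroup-lite (illustrative listing):

% ls /sys/fs/cgroup
cpu  cpuacct  cpuset  devices  freezer  memory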
We can also mention systemd-nspawn, a kind of container launcher based on systemd. It is currently just a hack tool, but it might become an LXC alternative for systemd-based distributions.
M2
In Ubuntu 12.04 LTS (a command sketch follows the task list):
- Install and configure LXC, libvirt and Open vSwitch
- Launch at least 2 VMs
- Make the VMs ping each other
- Script the whole procedure
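A sketch of the commands involved, with placeholder names; it assumes each container's veth interface is attached to the OVS bridge through its LXC network configuration or an up-script:

apt-get install lxc libvirt-bin openvswitch-switch   # install the tools
ovs-vsctl add-br ovsbr0                              # create an OVS bridge replacing the default Linux bridge
# optional: link the bridges of two hosts with a GRE tunnel (remote IP is a placeholder)
ovs-vsctl add-port ovsbr0 gre0 -- set interface gre0 type=gre options:remote_ip=<other-host-IP>
lxc-start -n vm1 -d                                  # start the two containers
lxc-start -n vm2 -d
lxc-console -n vm1                                   # from vm1, ping the address of vm2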
Problems encountered during M2
We found this page explaining how to use Open vSwitch (OVS) to connect two EC2 instances with a GRE tunnel. The script attached to that page was used to set up an OVS bridge replacing the default Linux bridge. Two problems appeared:
- containers attached to the OVS bridge take more than 2 minutes to launch, while it takes less than 10 seconds with Linux bridges
- when using an OVS bridge between two containers, it is possible to reach one from the other, but sometimes one of the containers loses its IP address, which breaks the connection
M2 Conclusion
Using OVS as the main bridge for an LXC container is unstable. It seems that LXC and Open vSwitch are not really compatible for now.
M3
TODO
Tools
Libvirt and LXC
LXC
Libvirt
Open vSwitch
Internal links
- UML (Proj-2012-2013-OAR-Cloud_UML)
External links
- a getting started page for using LXC with libvirt
- the cgroups documentation
- a slide show about LXC security.
Journal
08/04
- add a script to configure LXC with Open vSwitch and connect 2 VMs
19/03
- add content to M2 part
15/03
- begin presentation (for Michael, who is leaving the project on 21/03)
12/03
- add 0.3 version in conception part
11/03
- start of M2
09/03
- add State of the art part
08/03
- M1 is closed. OAR cpuset and LXC do not work concurrently on the same node (see why)
22/02
- M1 works in simple mode (OAR does not use cpuset)
19/02
- milestone definitions
07/02
- add Conception part
04/02
- We specified the subject
- Distributed the work among us:
- Jordan: LXC and Libvirt
- Alexandre: OpenVSwitch and Libvirt
- Michael: OAR and global architecture