Closing out 2018…and welcoming in 2019!!

2018 has been another exciting year for OpenNebula. It has brought continued developments and advancements in the OpenNebula product capabilities. At the same time, we’ve seen a fervor and a steady commitment by the User Community which continues to bring unmatched value. At OpenNebula Systems, we have our sights set on continued improvement for 2019, and we are excited about several promising, emerging developments. But again, one of the key dynamics of the project is that we wouldn’t be able to grow without you.

For that reason, one of the recent developments in the community has been our request for your participation in our 2018 User Survey.  This is a simple vehicle to allow us to learn about the use cases, platforms, and overarching technical needs of the OpenNebula User Community.  We look to remain in sync with your needs, and to develop alongside you. Fill it out, and share your thoughts!

Speaking of developments…

Among the various version releases this year, we released version 5.6 “Blue Flash” with a huge set of improvements both at the core level and for vCenter integration. And from there, we have jumped right into focused development on the upcoming version 5.8.  In it, we have been working on many different features – long-awaited support for LXD containers being one of them.

2018 has seen a certain dedicated focus on the emerging developments surrounding Edge Computing, and while we have been working closely with customers and partners, learning the details of evolving use cases, we have also made developments around integrations with OpenNebula along “the edge”.  Earlier this year, we released an initial prototype of “oneProvision”, allowing users to provision and deploy bare-metal resources directly within an OpenNebula cluster.  Upcoming development of oneProvision will include being able to deploy not only one host, but a cluster of hosts.  At the same time, we partnered with Packet to demonstrate our continued focus in bringing capabilities to the edge.

Recent releases of new capabilities like miniONE, the VirtualNetwork Scheduler, and the Image Converter to/from VMDK and QCOW2 all demonstrate our determined effort to make OpenNebula the easiest-to-use platform out there.

The “Calendar of Events”

In 2018, we held several OpenNebula TechDay events throughout Europe – in Sofia hosted by StorPool, in Barcelona hosted by CSUC, and in Frankfurt hosted by LINBIT – and in the US – in Santa Clara, CA hosted by Hitachi Vantara and in Cambridge, MA hosted by OpenNebula Systems.  We also held our OpenNebulaConf in Amsterdam.  We thank our sponsors and hosts for collaborating to put these events together.

Events schedule in 2019

The lineup for OpenNebula TechDays for this coming year will tentatively be in the following locations, with dates and details to be determined:

  • Frankfurt
  • Barcelona
  • Vienna
  • Sofia
  • Boston

And plan to attend OpenNebulaConf 2019 in Barcelona on October 21-22.

Great support from the Community

Lastly, as we continuously try to make clear, the OpenNebula project would not have the vitality nor the reach it has if it weren’t for our dedicated User Community.  We’ve seen continued growth of OpenNebula Champions.  Throughout the year, users have taken the time to publish tutorials like these from Pandora FMS, Virtuozzo 7, and CSUC.  Our OpenNebula Blog has been used by many from the Community to publish and share insights and experiences. This year, we also created our Partner Ecosystem, another instrument to show and share integrations between ONE and other great technologies.

This has been an exciting year for OpenNebula! We give you our utmost thanks, and we look forward to our collaboration going into 2019!!

Stay Connected!

The Scientific IT Services (SIS) of ETH Zurich offers scientific computing, research data management and analysis support, as well as software engineering expertise to ETH researchers.

To support the personalized health research community, SIS built and continues to actively develop “Leonhard Med”: a secure and powerful high-performance platform designed for computing, storage, management, interoperability and controlled sharing of confidential research data (e.g., biomedical patient data). Leonhard Med is operated by the Scientific IT Services (SIS) of ETH Zurich and is part of the emerging BioMedIT national network, whose role is to provide secure and interoperable data and computing infrastructures for research projects in the Swiss personalized health programs.

While it has been in production use since the beginning of 2018, Leonhard Med must be constantly developed further to keep up with new and changing requirements within a rapidly evolving scientific environment. For example, our customers needed additional services that could not be hosted on a regular HPC infrastructure (e.g., databases, terminal servers, web apps or data management applications). This brought us to the idea of providing a cloud solution. We had some previous experience running vCloud Director (VMware) and we also had a close look at OpenStack, but both came with a high price tag, either in terms of license costs or manpower. Luckily, one of our consultants introduced OpenNebula to us, and after a few weeks of testing we fell in love with it. It met all our requirements, and we found it quite intuitive and easy to maintain and support. We were looking for a lightweight but powerful product that is easy to maintain with few IT personnel resources, while being well aware of the challenges lying ahead of us when integrating OpenNebula into the secure environment of Leonhard Med.

We began deploying and integrating OpenNebula almost 4 months ago, using 2 physical hosts from the cluster (new hardware), and set up OpenNebula and 2x KVM nodes on them. We now have a fully functional and productive installation ready to serve our consumer needs, and we achieved this with only a few sysadmins working on the project part time over the four months. Our private cloud running OpenNebula sits in a restricted zone without Internet access. Access is done via proxy servers using two-factor authentication, and Sunstone is only reachable via a SOCKS proxy. For reproducibility purposes, the installation and all processes running inside the cloud have been automated with Ansible.

Challenges: We did face a couple of challenges during installation and later on during the upgrade to v5.6. For example, we had to search for a couple of Ruby gems, build RPMs and move them into our secured environment. These were mostly related to our network security restrictions. Nevertheless, as a “nice to have”, we would suggest including all dependencies required during installation or upgrade within OpenNebula’s repository for RH/CentOS platforms.

Just a few minutes of your time…

As we continue to focus on improvements of OpenNebula, we need direction from you, the User Community.

Please take a few minutes to fill out this OpenNebula Survey 2018 – to help us understand how you are using OpenNebula, and what you need going forward.  All information collected is confidential, and will not be shared.

Many, many thanks!

Our newsletter contains the highlights of the OpenNebula project and its Community throughout the month.

Technology

There are a lot of OpenNebula features currently being worked on that deserve some attention:

  • We are working on an upcoming feature aimed at simplifying the management of VM templates that can be deployed on multiple clusters, by creating an automated selection process for the VM networks.  Check out the recent post.
  • Migrating workloads between KVM and VMware hypervisors will soon be as simple as bread and butter for breakfast. Check out the recent post.
  • We are also working on a “self-provisioning” method for Virtual Networks.  No longer will Virtual Networks be created only by cloud administrators, but rather, end-users can be given the ability to make changes at the logic level, like changes to IP ranges, to the DNS server, etc.

Keep an eye out for the upcoming version 5.8!

Community

November has been an exciting month for the User Community.

Outreach

  • We currently have our 2019 OpenNebula TechDay Call for Hosts open.  Take a look at your calendars, and think about planning a TechDay of your own!
  • We want your feedback!!
    • In the coming weeks, we are going to be sending out an OpenNebula survey, with the intention of learning a bit about how you are using OpenNebula.  Please plan to take the time to fill it out, as its purpose is for us to be able to serve you better!
    • If you attended the OpenNebulaConf in Amsterdam, and you haven’t submitted your feedback survey, please do so, and let us know what you think!

Let’s welcome in December!

Stay Connected!

We are opening the Call for Hosts for the OpenNebula TechDays in 2019!

Why don’t you host an OpenNebula TechDay of your own?

The OpenNebula Cloud TechDays are day-long educational and networking events to learn about OpenNebula.  Join our technical experts from OpenNebula Systems for a one-day, hands-on workshop on cloud installation and operation.  You’ll get a comprehensive overview of OpenNebula and will be equipped with the skills and insight to take back to your company and implement right away.

OpenNebula TechDays started in March 2014 and we have already held over 30 TechDays in the Netherlands, Belgium, Spain, United States, Romania, Czech Republic, France, Canada, Malaysia, Bulgaria, Germany and Ireland. They have been hosted by organizations like:

  • BestBuy
  • Telefonica
  • BIT.nl
  • Transunion
  • Hitachi
  • Microsoft
  • BlackBerry
  • Harvard University
  • Netways
  • and many others

Think about hosting a Cloud TechDay – we would love to work with you.  We only require that you provide a room with enough capacity for the attendees and some essential materials (WiFi, projector, etc…).

Go to the TechDay Guidelines and Registration Form.

The deadline for this call is December 11, 2018.  We look forward to hearing from you!

Infrastructure as Code (IaC) is changing the way we do things. Some people see it as the motorway we have to follow to stay aligned with the business; in short, they want us to be agile.

The arrival of tools such as Ansible, Puppet, SaltStack, and Chef has enabled sysadmins to maintain modular, automatable infrastructure. This time I would like to introduce Terraform.

Terraform is a declarative provisioning tool based on the Infrastructure as Code paradigm. It is a multipurpose composition tool: it composes multiple tiers (SaaS/PaaS/IaaS).

Terraform is not a cloud-agnostic tool, but in combination with OpenNebula it can be amazing. By taking advantage of the template concept, it allows us to deploy VMs agnostically across different cloud providers, such as AWS, Azure or on-premises cloud infrastructure.

Within the OpenNebula community, several Terraform providers have been developed. The first example is the project started by the Runtastic team, which has recently been enhanced by BlackBerry.

After this little introduction to Terraform, let’s move on to a tutorial where a Rancher PaaS platform is deployed in an automated way with Terraform and RKE.

Deploy Rancher HA in OpenNebula with Terraform and RKE

Install Terraform

To install Terraform, find the appropriate package for your system and download it

$ curl -O https://releases.hashicorp.com/terraform/0.11.10/terraform_0.11.10_linux_amd64.zip

After downloading Terraform, unzip the package

$ sudo mkdir /bin/terraform
$ sudo unzip terraform_0.11.10_linux_amd64.zip -d /bin/terraform

After installing Terraform, verify the installation worked by opening a new terminal session and checking that Terraform is available.

$ export PATH=$PATH:/bin/terraform
$ terraform --version
Add Terraform providers for OpenNebula and RKE

You need to install go first: https://golang.org/doc/install

After go is installed and set up, just type:

$ go get github.com/blackberry/terraform-provider-opennebula
$ go install github.com/blackberry/terraform-provider-opennebula 

Copy your terraform-provider-opennebula binary into a folder, such as /usr/local/bin, and write this in ~/.terraformrc:

providers {
  opennebula = "/usr/local/bin/terraform-provider-opennebula"
}

providers {
  rke = "/usr/local/bin/terraform-provider-rke"
}

For the RKE provider, download the binary and copy it into the same folder:

$ wget https://github.com/yamamoto-febc/terraform-provider-rke/releases/download/0.5.0/terraform-provider-rke_0.5.0_linux-amd64.zip 
$ sudo unzip terraform-provider-rke_0.5.0_linux-amd64.zip -d /usr/local/bin/terraform-provider-rke

Install Rancher

Clone this repo:
$ git clone https://github.com/CSUC/terraform-rke-paas.git
Create infrastructure

First we have to initialize Terraform simply with:

$ terraform init

We let Terraform create a plan, which we can review:

$ terraform plan

The plan command lets you see what Terraform will do before actually doing it.

Now we execute:

$ terraform apply

That’s it – you should have a functional Rancher server:

Now, you can install the Docker Machine OpenNebula Driver and deploy new Kubernetes clusters in your Rancher platform:

The complete tutorial is available at Github:

https://github.com/CSUC/terraform-rke-paas

If you are interested in more details, don’t miss the talk Hybrid Clouds: Dancing with “Automated” Virtual Machines at OpenNebulaConf 2018 in Amsterdam.

See you cloudadmins!

Barcelona UserGroup Team –  www.cloudadmins.org

 

Here’s a quick “Thank you” to LINBIT for hosting the OpenNebula TechDay in Frankfurt.  Thank you for your continued partnership and support!   It was a great opportunity for everyone to meet up and share insights and experiences.

We are glad to let everyone know that the slide presentations are now available!

Here’s a link to the complete agenda.

Tino kicking off the OpenNebula TechDay in Frankfurt.

 

Sergio walking the group through the Hands-On Tutorial.

 

A live-demo of OpenNebula and LINSTOR integration done by Philipp Reisner.

Prolog

Building and maintaining a cloud RSS reader requires resources. Lots of them! Behind the deceptively simple user interface there is a complex backend with a huge datastore that should be able to fetch millions of feeds in time, store billions of articles indefinitely and make any of them available in just milliseconds – either by searching or simply by scrolling through lists. Even calculating the unread counts for millions of users is enough of a challenge that it deserves a special module for caching and maintaining. The very basic feature that every RSS reader should have – being able to filter only unread articles – requires so much resource power that it contributes around 30% of the storage pressure on our first-tier databases.

Until recently we were using bare-metal servers to operate our infrastructure, meaning we deployed services like database and application servers directly on the operating system of the server. We were not using virtualization except for some really small micro-services; it was practically one physical server with local storage broken down into several VMs. Last year we reached a point where we had a 48U (rack unit) rack full of servers. More than half of those servers were databases, each with its own storage, usually 4 to 8 spinning disks in RAID-10 mode with expensive RAID controllers equipped with cache modules and BBUs. All this was required to keep up with the needed throughput.

There is one big issue with this setup. Once a database server fills up (usually at around 3TB) we buy another one, and the full one becomes read-only. CPUs and memory on those servers remain heavily underutilized while the storage is full. For a long time we knew we had to do something about it, otherwise we would soon need to rent a second rack, which would have doubled our bill. The cost was not the primary concern. It just didn’t feel right to have a rack full of expensive servers that we couldn’t fully utilize because their storage was full.

Furthermore, redundancy was an issue too. We had redundancy on the application servers, but for databases of this size it’s very hard to keep everything redundant and fully backed up. Two years ago we had a major incident that almost cost us an entire server with 3TB of data, holding several months’ worth of article data. We completely recovered all the data, but it was a close call.

 

Big changes were needed!

While the development of new features is important, we had to stop for a while and rethink our infrastructure. After some long sessions and meetings with vendors we have made a final decision:

We will completely virtualize our infrastructure and we will use OpenNebula + KVM for virtualization and StorPool for distributed storage.

 

 

Cloud Management

We have chosen this solution not only because it is practically free if you don’t need enterprise support, but also because it has proven to be very effective. OpenNebula is now mature enough and has so many use cases that it’s hard to ignore. It is completely open source, with a big community of experts, and has optional enterprise support. KVM is now used as the primary hypervisor for EC2 instances in Amazon AWS. This alone says a lot, and OpenNebula is primarily designed to work with KVM too. Our experience with OpenNebula in the past few months didn’t make us regret this decision even once.

 

Storage

Now, a crucial part of any virtualized environment is the storage layer. You aren’t really doing anything if you are still using the local storage on your servers. The whole idea of virtualization is that your physical servers are expendable. You should be able to tolerate a server outage without any data loss or service downtime. How do you achieve that? With separate, ultra-high-performance, fault-tolerant storage connected to each server via a redundant 10G network.

There’s EMC’s enterprise solution, which can cost millions and uses proprietary hardware, so it’s out of our league. Also, big vendors don’t usually play well with small clients like us. There’s a chance that we would just have to sit and wait for a ticket resolution if something breaks, which contradicts our vision.

Then there’s Red Hat’s Ceph, which comes completely free of charge, but we were a bit afraid to use it, since nobody on the team had the expertise required to run it in production without any doubt that, in the event of a crash, we would be able to recover all our data. We were on a very tight schedule with this project, so we didn’t have any time to send someone for training. Performance figures were also not very clear to us and we didn’t know what to expect. So we decided not to risk it for our main datacenter. We are now using Ceph in our backup datacenter, but more on that later.

Finally, there’s one still relatively small vendor that just so happens to be located some 15 minutes away from us – StorPool. They were recommended to us by colleagues running similar services and we had a quick kick-start meeting with them. After the meeting it was clear to us that those guys know what they are doing at the lowest possible level.
Here’s what they do in a nutshell (quote from their website):

StorPool is a block-storage software that uses standard hardware and builds a storage system out of this hardware. It is installed on the servers and creates a shared storage pool from their local drives in these servers. Compared to traditional SANs, all-flash arrays, or other storage software StorPool is faster, more reliable and scalable.

Doesn’t sound very different from Ceph, so why did we choose them? Here are just some of the reasons:

  • They offer full support for a very reasonable monthly fee, saving us the need to have a trained Ceph expert onboard.
  • They promise higher performance than Ceph.
  • They have their own OpenNebula storage addon (yeah, Ceph does too, I know).
  • They are a local company, and we can always pick up the phone and resolve any issues in minutes rather than hours or days, as it usually ends up with big vendors.

 

The migration

You can read the full story of our migration, with pictures and detailed explanations, in our blog.

I will try to keep it short and tidy here. Basically, we managed to slim down our inventory to half of the previous rack space. This allowed us to reduce our costs and create enough room for later expansion, while immediately and greatly increasing our compute and storage capacities. We mostly reused our old servers in the process, with some upgrades to make the whole OpenNebula cluster homogeneous (same CPU model and memory across all servers), which allowed us to use “host=passthrough” to improve VM performance without the risk of a VM crash during a live migration. The process took us less than 3 months, with the actual migration happening in around two weeks. While we waited for the hardware to arrive we had enough time to play with OpenNebula in different scenarios, try out VM migrations and different storage drivers, and overall try to break it while it was still in a test environment.

 

The planning phase

So after we made our choice for virtualization it was time to plan the project. This happened in November 2017, so not very long ago. We rented a second rack in our datacenter. The plan was to install the StorPool nodes there and gradually move servers and convert them into hypervisors. Once we had moved everything, we would remove the old rack.

We ordered 3 servers for the StorPool storage. Each of those servers has room for 16 hard disks. We only ordered half of the needed hard disks, because we knew that once we started virtualizing servers, we would salvage a lot of drives that wouldn’t be needed otherwise.

We have also ordered the 10G network switches for the storage network and new Gigabit switches for the regular network to upgrade our old switches. For the storage network we chose Quanta LB8. Those beasts are equipped with 48x10G SFP+ ports, which is more than enough for a single rack. For the regular Gigabit network, we chose Quanta LB4-M. They have additional 2x10G SFP+ modules, which we used to connect the two racks via optic cable.

We also ordered a lot of other smaller stuff like 10G network cards and a lot of CPUs and DDR memory. Initially we didn’t plan to upgrade the servers before converting them to hypervisors, in order to cut costs. However, after some benchmarking we found that our current CPUs were not up to the task. We were using mostly dual-CPU servers with Intel Xeon E5-2620 (Sandy Bridge) and they were already dragging even before the Meltdown patches. After some research we chose to upgrade all servers to the E5-2650 v2 (Ivy Bridge), an 8-core (16 threads with Hyper-Threading) CPU with a turbo frequency of 3.4 GHz. We already had two of these, and benchmarks showed a two-fold increase in performance compared to the E5-2620.

We also decided to boost all servers to 128GB of RAM. We had different configurations, but most servers had 16-64GB and only a handful were already at 128GB. So we made some calculations and ordered 20+ CPUs and 500+GB of memory.

After we placed all orders we had about a month before everything arrived, so we used that time to prepare what we could without the additional hardware.

 

The preparation phase

We used the whole of December and part of January, while waiting for our equipment to arrive, to prepare for the coming big migration. We learned how OpenNebula works, and tried everything that came to our minds to break it and to see how it behaves in different scenarios. This was very important for avoiding production mistakes and downtime later.
We didn’t just sit around waiting for our hardware to arrive, though. We purchased one old but still powerful server with lots of memory to temporarily hold some virtual machines. The idea was to free up some physical servers, so we could shut them down, upgrade them and convert them into hypervisors in the new rack.

 

The execution phase

After the hardware arrived it was time to install it in the new rack. We started with the StorPool nodes and the network. This way we were able to bring up the storage cluster prior to adding any hypervisor hosts.
      
Now it was time for StorPool to finalize the configuration of the storage cluster and to give us green light to connect our first hypervisor to it. Needless to say, they were quick about it and on the next day we were able to bring in two servers from the old rack and to start our first real OpenNebula instance with StorPool as a storage.

After we had our shiny new OpenNebula cluster with StorPool storage fully working it was time to migrate the virtual machines that were still running on local storage. The guys from StorPool helped us a lot here by providing us with a migration strategy that we had to execute for each VM. If there is interest we can post the whole process in a separate post.

From here on we gradually migrated physical servers to virtual machines. The strategy was different for each server: some of them were databases, others application and web servers. We managed to migrate all of them with anywhere from a few seconds of downtime to no downtime at all. At first we didn’t have much room for virtual machines, since we had only two hypervisors, but at each iteration we were able to convert more and more servers at once.

     

After that each server went through a complete change. CPUs were upgraded to 2x E5-2650 v2 and memory was bumped to 128GB. The expensive RAID controllers were removed from the expansion slots and in their place we installed 10G network cards. Large (>2TB) hard drives were removed and smaller drives were installed just for the OS. After the servers were re-equipped, they were installed in the new rack and connected to the OpenNebula cluster. The guys from StorPool configured each server to have a connection to the storage and verified that it is ready for production use. The first 24 leftover 2TB hard drives were immediately put to work into our StorPool.

 

The result

In just a couple of weeks of hard work we managed to migrate everything!

In the new rack we have a total of 120TB of raw storage, 1.5TB of RAM and 400 CPU cores. Each server is connected to the network with 2x10G network interfaces.

That’s roughly 4 times the capacity and 10 times the network performance of our old setup with only half the physical servers!

The flexibility of OpenNebula and StorPool allows us to use the hardware very efficiently. We can spin up virtual machines in seconds with any combination of CPU, memory, storage and network interfaces, and later we can change any of those parameters just as easily. It’s DevOps heaven!

This setup will be enough for our needs for a long time, and we have more than enough room for expansion if the need arises.

 

Our OpenNebula cluster

We now have more than 60 virtual machines because we have split some physical servers into several smaller VMs with load balancers for better load distribution and we have allocated more than 38TB of storage.

We have 14 hypervisors with plenty of resources available on each of them. All of them are using the same model CPU, which gives us the ability to use the “host=passthrough” setting of QEMU to improve VM performance without the risk of VM crash during a live migration.
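For reference, here is a minimal sketch of how such a setting can be applied through OpenNebula’s RAW attribute, which hands the CPU mode straight to libvirt/QEMU; the template ID (42) is a placeholder, not a value from this story:

# Hypothetical template ID; the RAW data is passed verbatim to libvirt
cat > cpu_passthrough.txt <<'EOF'
RAW = [ TYPE = "kvm", DATA = "<cpu mode='host-passthrough'/>" ]
EOF
onetemplate update 42 cpu_passthrough.txt --append

This only pays off when all hypervisors share the same CPU model, as noted above, since live migration between differing CPUs would otherwise become unsafe.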

We are very happy with this setup. Whenever we need to start a new server, it only takes minutes to spin up a new VM instance with whatever CPU and memory configuration we need. If a server crashes, all VMs will automatically migrate to another server. OpenNebula makes it really easy to start new VMs, change their configurations, manage their lifecycle and even completely manage your networks and IP address pools. It just works!

StorPool on the other hand takes care that we have all the needed IOPS at our disposal whenever we need them.

 

Goodies

We are using Graphite + Grafana to plot some really nice graphs for our cluster.

We have borrowed the solution from here. That’s what’s so great about open software!

Our team is constantly informed about the health and utilization of our cluster. A glance at our wall-mounted TV screen is enough to tell that everything is all right. We can see both our main and backup data centers, both running OpenNebula. It’s usually all green :)

 

StorPool is also using Grafana for their performance monitoring and they have also provided us with access to it, so we can get insights about what the storage is doing at the moment, which VMs are the biggest consumers, etc. This way we can always know when a VM has gone rogue and is stealing our precious IOPS.

 

Epilog

If you made it this far – Congratulations! You have geeked out as much as we did building this infrastructure with the latest and greatest technologies like OpenNebula and StorPool.

Intro

Now that AWS offers a bare metal service as another choice of EC2 instance, you are able to deploy virtual machines based on HVM technologies, like KVM, without the heavy performance overhead imposed by nested virtualization. This enables you to leverage the highly scalable and available AWS public cloud infrastructure in order to deploy your own cloud platform based on full virtualization.

Architecture Overview

The goal is to have a private cloud running KVM virtual machines that can communicate with each other, with the hosts, and with the Internet, running on remote and/or local storage.

Compute

i3.metal instances, besides being bare metal, have a very high compute capacity. We can create a powerful private cloud with a small number of these instances playing the role of worker nodes.

Orchestration

Since OpenNebula is a very lightweight cloud management platform, and the control plane doesn’t require virtualization extensions, you can deploy it on a regular HVM EC2 instance. You could also deploy it as a virtual instance using a hypervisor running on an i3.metal instance, but this approach adds extra complexity to the network.

Storage

We can leverage the i3.metal’s high bandwidth and fast storage in order to have local-backed storage for our image datastore. However, having a shared NAS-like datastore is more practical. Although we could have a regular EC2 instance providing an NFS server, AWS offers a service specifically designed for this use case: EFS.

Networking

OpenNebula requires a service network for the infrastructure components (frontend, nodes and storage), and instance networks for the VMs to communicate. This guide will use Ubuntu 16.04 as the base OS.

Limitations

AWS provides a network stack designed for EC2 instances, and you don’t really control interconnection devices like the internet gateway, the routers or the switches. This model conflicts with the networking required for OpenNebula VMs.

  • AWS filters traffic based on IP – MAC association
    • Packets with a source IP can only flow if they have the specific MAC
    • You cannot change the MAC of an EC2 instance NIC
  • EC2 Instances don’t get public IPv4 directly attached to their NICs.
    • They get private IPs from a subnet of the VPC
    • AWS internet gateway (an unmanaged device) has an internal record which maps public IPs to private IPs of the VPC subnet.
    • Each private IPv4 address can be associated with a single Elastic IP address, and vice versa
  • There is a limit of 5 Elastic IP addresses per region, although you can get more non-elastic public IPs.
  • Elastic IPs are bound to a specific private IPv4 from the VPC, which binds them to specific nodes
  • Multicast traffic is filtered

If you try to assign an IP of the VPC subnet to a VM, traffic won’t flow, because AWS interconnection devices don’t know the IP has been assigned and there isn’t a MAC associated with it. Even if it had been assigned, it would have been bound to a specific MAC. This rules out Linux bridges, and 802.1Q is not an option either, since you would need to tag the switches and you can’t do that. VXLAN relies on multicast in order to work, so it is not an option. Openvswitch suffers the same link and network layer restrictions AWS imposes.

Workarounds

OpenNebula can manage networks using the following technologies. In order to overcome the AWS network limitations, it is suitable to create an overlay network between the EC2 instances. Overlay networks would ideally be created using the VXLAN drivers; however, since multicast is disabled by AWS, we would need to modify the VXLAN driver code to use unicast instead. A simpler alternative is to use a VXLAN tunnel with openvswitch. This lacks scalability, since the tunnel works between two remote endpoints and adding more endpoints breaks the networking, but you can still get a total of 144 cores and 1TB of RAM in terms of compute power. The result is the AWS network for OpenNebula’s infrastructure, plus a network for the VMs that is isolated from AWS and encapsulated over the transport layer, ignoring whatever AWS network issues you might have. It is required to lower the MTU of the guest interfaces to match the VXLAN overhead.

In order to grant VMs Internet access, it is required to NAT their traffic in an EC2 instance with a public IP associated to its private IP, so as to masquerade the connections originated by the VMs. Thus, you need to set an IP belonging to the VM network on the openvswitch switch device of an i3.metal instance, enabling that EC2 instance to act as a router for Internet-VM intercommunication.

In order for your VMs to be publicly available from the Internet, you need to own a pool of available public IP addresses ready to assign to the VMs. The problem is that those IPs are matched to particular private IPs of the VPC. You can assign several pairs of private IPs and Elastic IPs to an i3.metal NIC. This results in i3.metal instances having several available public IPs. Then you need to DNAT the traffic destined to an Elastic IP to the VM’s private IP. You can make the DNATs and SNATs static to a particular set of private IPs and create an OpenNebula subnet for public visibility containing this address range. The DNATs need to be applied on every host in order to give VMs public visibility wherever they are deployed. Note that OpenNebula won’t know the addresses of the pool of Elastic IPs, nor the matching private IPs of the VPC. So there will be a double mapping: the first by AWS, and the second by the OS (DNAT and SNAT) in the i3.metals:

Elastic IP → VPC IP → VM IP → VPC IP → Elastic IP

→ IN → Linux → OUT

 

 

Setting up AWS Infrastructure for OpenNebula

You can disable public access to the OpenNebula nodes (since they only need to be accessible from the frontend) and access them via the frontend, by assigning them the frontend’s ubuntu user public key or by using sshuttle.
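For example, a minimal sshuttle session from your workstation might look like this (the frontend public IP and the VPC subnet below are placeholders):

# Route traffic for the VPC subnet through the frontend over SSH
sshuttle -r ubuntu@203.0.113.10 10.0.0.0/24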

Summary

  • We will launch 2 x i3.metal EC2 instances acting as virtualization nodes
  • OpenNebula will be deployed on an HVM EC2 instance
  • An EFS will be created in the same VPC the instances will run on
  • This EFS will be mounted with an NFS client on the instances for OpenNebula to run a shared datastore
  • The EC2 instances will be connected to a VPC subnet
  • Instances will have a NIC for the VPC subnet and a virtual NIC for the overlay network

Security Groups Rules

  1. TCP port 9869 for one frontend
  2. UDP port 4789 for VXLAN overlay network
  3. NFS inbound for EFS datastore (allow from one-x instances subnet)
  4. SSH for remote access
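If you prefer the AWS CLI over the console, the rules above can be sketched roughly as follows (the group ID, VPC ID and CIDRs are placeholders, not values from this guide):

aws ec2 create-security-group --group-name one-cloud --description "OpenNebula" --vpc-id vpc-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 9869 --cidr 0.0.0.0/0   # Sunstone (frontend only)
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol udp --port 4789 --cidr 10.0.0.0/24 # VXLAN overlay
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 2049 --cidr 10.0.0.0/24 # NFS for EFS
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 22 --cidr 0.0.0.0/0     # SSH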

Create one-0 and one-1

This instance will act as a router for VMs running in OpenNebula.

  1. Click on Launch an Instance on EC2 management console
  2. Choose an AMI with an OS supported by OpenNebula, in this case we will use Ubuntu 16.04.
  3. Choose an i3.metal instance, should be at the end of the list.
  4. Make sure your instances will run on the same VPC as the EFS.
  5. Load your key pair into your instance
  6. This instance will require SGs 2 and 4
  7. Elastic IP association
    1. Assign several private IP addresses to one-0 or one-1
    2. Allocate Elastic IPs (up to five)
    3. Associate Elastic IPs in a one-to-one fashion to the assigned private IPs of one-0 or one-1
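Step 7 can also be sketched with the AWS CLI; here is a hypothetical example (the interface and allocation IDs are placeholders):

# Add secondary private IPs to the node's NIC
aws ec2 assign-private-ip-addresses --network-interface-id eni-xxxxxxxx --secondary-private-ip-address-count 5
# Allocate an Elastic IP and associate it with one of those private IPs
aws ec2 allocate-address --domain vpc
aws ec2 associate-address --allocation-id eipalloc-xxxxxxxx --network-interface-id eni-xxxxxxxx --private-ip-address 10.0.0.41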

Create one-frontend

  1. Follow the same steps of the nodes creation, except
    1. Deploy a t2.medium EC2 instance
    2. SG 1 is also required

Create EFS

  1. Click on create file system on EFS management console
  2. Choose the same VPC the EC2 instances are running on
  3. Choose SG 3
  4. Add your tags and review your config
  5. Create your EFS

After installing the nodes and the frontend, remember to follow the shared datastore setup in order to deploy VMs using the EFS. In this case you need to mount the filesystem exported by the EFS on the corresponding datastore ID, the same way you would with a regular NFS server. Take a look at the EFS documentation for more information.
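As a rough sketch, on each node the EFS filesystem can be mounted over NFS onto the datastore path; the filesystem ID, region and datastore ID below are placeholders:

apt install nfs-common
mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /var/lib/one/datastores/100

Add the corresponding entry to /etc/fstab so the mount persists across reboots.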

Installing OpenNebula on AWS infrastructure

Follow Front-end Installation.

Setup one-x instances as OpenNebula nodes

Install the opennebula-node package, following the KVM Node Installation guide. Then follow the openvswitch setup, but don’t add the physical network interface to the openvswitch bridge.

You will create an overlay network for VMs in a node to communicate with VMs in the other node using a VXLAN tunnel with openvswitch endpoints.

Install openvswitch and create the bridge. This configuration will persist across power cycles.

apt install openvswitch-switch

ovs-vsctl add-br ovsbr0

Create the VXLAN tunnel. The remote endpoint will be one-1’s private IP address.

ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.0.0.12

This is one-0’s configuration; repeat the configuration above on one-1, changing the remote endpoint to one-0’s private IP address.
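For example, assuming one-0’s private IP address is 10.0.0.11 (a placeholder; use your instance’s actual address), the one-1 side would be:

ovs-vsctl add-br ovsbr0

ovs-vsctl add-port ovsbr0 vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.0.0.11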

Setting up one-0 as the gateway for VMs

Set the network configuration for the bridge.

ip addr add 192.168.0.1/24 dev ovsbr0

ip link set up ovsbr0

In order to make the configuration persistent

echo -e "auto ovsbr0\niface ovsbr0 inet static\n    address 192.168.0.1/24" >> /etc/network/interfaces

Set one-0 as a NAT gateway for VMs in the overlay network to access the Internet. Make sure you SNAT to a private IP with an associated public IP.

iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -j SNAT --to-source 10.0.0.41

Write the mappings for the public visibility in both one-0 and one-1 instance.

iptables -t nat -A PREROUTING -d 10.0.0.41 -j DNAT --to-destination 192.168.0.250

iptables -t nat -A PREROUTING -d 10.0.0.42 -j DNAT --to-destination 192.168.0.251

iptables -t nat -A PREROUTING -d 10.0.0.43 -j DNAT --to-destination 192.168.0.252

iptables -t nat -A PREROUTING -d 10.0.0.44 -j DNAT --to-destination 192.168.0.253

iptables -t nat -A PREROUTING -d 10.0.0.45 -j DNAT --to-destination 192.168.0.254

Make sure you save your iptables rules so that they persist across reboots. Also, check that /proc/sys/net/ipv4/ip_forward is set to 1; the opennebula-node package should have done that.
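One way to do this on Ubuntu 16.04, as a sketch, is with the iptables-persistent package:

apt install iptables-persistent

netfilter-persistent save

# Verify that IP forwarding is enabled (should print 1)
cat /proc/sys/net/ipv4/ip_forward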

Defining Virtual Networks in OpenNebula

You need to create openvswitch networks with the guest MTU set to 1450. Set the bridge to the one carrying the VXLAN tunnel, in this case ovsbr0.

For the public net you can define a network with the address range limited to the IPs with DNATs and another network (SNAT only) in a non-overlapping address range or in an address range containing the DNAT IPs in the reserved list. The gateway should be the i3.metal node with the overlay network IP assigned to the openvswitch switch. You can also set the DNS to the AWS provided DNS in the VPC.

Public net example:

BRIDGE = "ovsbr0"
DNS = "10.0.0.2"
GATEWAY = "192.168.0.1"
GUEST_MTU = "1450"
NETWORK_ADDRESS = "192.168.0.0"
NETWORK_MASK = "255.255.255.0"
PHYDEV = ""
SECURITY_GROUPS = "0"
VLAN_ID = ""
VN_MAD = "ovswitch"
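Assuming the template above is saved to a file called public.net (with a NAME attribute added, e.g. NAME = "public"), the network can be created from the frontend with the OpenNebula CLI:

onevnet create public.net

onevnet list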

Testing the Scenario

You can import a virtual appliance from the marketplace to make the tests. This should work flawlessly since it only requires a regular frontend with internet access. Refer to the marketplace documentation.

VM-VM Intercommunication

Deploy a VM in each node and ping one from the other.
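As a quick check from the frontend, assuming the two test VMs received the placeholder addresses 192.168.0.100 and 192.168.0.101 from the overlay network and the appliance allows SSH as root:

onevm list

ssh root@192.168.0.100 ping -c 3 192.168.0.101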

Internet-VM Intercommunication

Install apache2 using the default OS repositories and view the default index.html file when accessing port 80 of the corresponding public IP from your workstation.

First check your public IP

Then access port 80 of the public IP. Just enter the IP address in the browser.
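As a sketch, inside the VM (assuming an Ubuntu-based appliance) and then from your workstation (the Elastic IP below is a placeholder):

# Inside the VM
apt install apache2

# From your workstation, against the Elastic IP that is DNATed to the VM
curl http://203.0.113.20/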

 

 

 

The OpenNebula Project is proud to announce the agenda and line-up of speakers for the seventh OpenNebula Conference to be held in Amsterdam on the 12-13 of November 2018.

OpenNebulaConf is your chance to get an up-close look at OpenNebula’s latest product updates, hear the project’s vision and strategy, get hands-on tutorials and workshops, and get lots of opportunities to network and share ideas with your peers. You’ll also get to attend all the parties and after-parties to keep the networking and the good times going long after the show floor closes for the day.

Keynotes

The agenda includes three keynote speakers:

Educational Sessions 

This year we will have two pre-conference tutorials:

Community Sessions

We had a big response to the call for presentations. Thanks for submitting a talk proposal!

Like in previous editions, we will have a single track with 15-minute talks, to keep all the audience focused and interested. We have given our very best to get the perfect balance of topics.

We will also have a Meet the Experts session, providing an informal atmosphere where delegates can interact with experts who will give their undivided attention for knowledge, insight and networking, as well as a session for 5-minute lightning talks. If you would like to talk in these sessions, please contact us!

Besides its amazing talks, there are multiple goodies packed with the OpenNebulaConf registration. You have until September 15th to get your OpenNebulaConf tickets for the deeply discounted price of just €400 (plus taxes) apiece. However, space is limited, so register asap.

We are looking forward to welcoming you personally in Amsterdam!