The primary season is in full swing for the 2016 presidential elections but there’s another important ballot that needs your vote! Community voting is now open for OpenStack’s Austin Summit discussion topics. Check out ZeroStack’s abstracts and cast your votes here before February 17th.
Put your OpenStack deployment on Autopilot
If we can put a plane on auto-pilot, why not do the same for an OpenStack cloud? Enterprises encounter several challenges in managing an OpenStack deployment today. In particular, the operational overhead increases with the size and complexity of the private cloud. The burden on the operations team should not increase with growth in the private cloud deployment. The common technique of creating redundant services and restarting a service doesn’t scale well and requires human intervention to heal the controllers in case of a failure.
In addition, just checking a service’s health without knowing about other services doesn’t work due to inter-dependencies. For example, nova depends on MySQL and RabbitMQ to work.
In this talk, we describe a set of techniques to put your environment on auto-pilot by creating a self-healing, highly available OpenStack-based cloud.
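The point about inter-dependencies can be illustrated with a minimal sketch. It assumes an acyclic dependency map and a per-service probe result; the function name `unhealthy_roots` and the map layout are hypothetical, and only the nova to MySQL/RabbitMQ dependency comes from the abstract.

```python
from typing import Dict, List, Set

# Hypothetical dependency map. Only the nova -> mysql/rabbitmq edges are
# taken from the abstract; a real deployment would have many more entries.
DEPENDS_ON: Dict[str, List[str]] = {
    "nova": ["mysql", "rabbitmq"],
    "mysql": [],
    "rabbitmq": [],
}


def unhealthy_roots(service: str,
                    probe: Dict[str, bool],
                    deps: Dict[str, List[str]] = DEPENDS_ON) -> Set[str]:
    """Return the services that must be healed first for `service` to work.

    `probe` maps each service to the result of its own health check.
    The dependency graph is assumed to be acyclic.
    """
    roots: Set[str] = set()
    for dep in deps.get(service, []):
        roots |= unhealthy_roots(dep, probe, deps)
    # Blame the service itself only when none of its dependencies is at fault.
    if not roots and not probe[service]:
        roots.add(service)
    return roots


if __name__ == "__main__":
    # nova's own probe passes, but MySQL is down, so nova cannot really work.
    print(unhealthy_roots("nova", {"nova": True, "mysql": False, "rabbitmq": True}))
```

Walking the dependencies before judging the service itself is what lets a healer restart MySQL rather than repeatedly restarting nova.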
Virtual Machine High Availability
High availability of virtual machines is a feature that OpenStack does not support right now. This hinders OpenStack’s adoption in the enterprise. In many cases, administrators want workloads to get restarted in case of hardware failures. This talk describes a reference architecture and techniques for providing VM high availability. We will discuss various options and their pros/cons along with a reference design.
The techniques outlined will be conducive to automation. Automating the failure detection and VM failover actions leads to higher reliability due to decreased response time and avoidance of operator error. We will also describe a software platform we have built which automates these techniques. Administrators and users will learn how to use and leverage this feature for better application availability.
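As a rough illustration of automated failure detection and failover planning, the sketch below flags hosts whose last heartbeat is older than a timeout and spreads their VMs round-robin over the surviving hosts. This is not the platform described in the abstract: the heartbeat approach, the function names and the data layout are all assumptions, and the sketch deliberately ignores fencing and capacity checks.

```python
from typing import Dict, List


def failed_hosts(last_heartbeat: Dict[str, float],
                 now: float,
                 timeout_s: float) -> List[str]:
    """Hosts whose last heartbeat is older than `timeout_s` seconds."""
    return sorted(h for h, seen in last_heartbeat.items() if now - seen > timeout_s)


def failover_plan(vms_by_host: Dict[str, List[str]],
                  failed: List[str]) -> Dict[str, str]:
    """Map each VM on a failed host to a surviving host, round-robin."""
    survivors = sorted(h for h in vms_by_host if h not in failed)
    if not survivors:
        return {}  # nowhere to restart anything
    plan: Dict[str, str] = {}
    slot = 0
    for host in failed:
        for vm in vms_by_host.get(host, []):
            plan[vm] = survivors[slot % len(survivors)]
            slot += 1
    return plan


if __name__ == "__main__":
    heartbeats = {"h1": 100.0, "h2": 70.0, "h3": 99.0}
    vms = {"h1": ["a"], "h2": ["b", "c"], "h3": []}
    down = failed_hosts(heartbeats, now=101.0, timeout_s=30.0)
    print(down, failover_plan(vms, down))
```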
Lessons learnt from operating RabbitMQ and MySQL within OpenStack
Almost all OpenStack API services are stateless by design. In typical deployments, MySQL serves as the source of truth for the cluster-wide state, which includes configurations as well as internally generated state. RabbitMQ, by virtue of being the message broker, forms the central nervous system of the OpenStack cluster. In this talk, we share our operational experiences with MySQL and RabbitMQ in the context of providing stability, detecting/remediating problems and increasing the overall availability of the cluster.
Python Basics for Operators Troubleshooting OpenStack
As any OpenStack operator knows, a straightforward request, such as creating a VM, requires multiple services to successfully communicate with each other and perform a designated function. When this doesn’t go to plan, there are a number of potential log files to analyze, each potentially supplying a piece of the puzzle.
This talk will show how an operator with minimal experience of the Python programming language can extract the information they need to find the root cause of issues and resolve them.
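One way to piece together log files from several services is to group lines by request ID. The sketch below assumes log lines carry an ID of the form `req-<uuid>`, as OpenStack service logs commonly do. The sample lines are invented for illustration, and the helper names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed request ID shape: "req-" followed by a UUID.
REQ_ID = re.compile(
    r"req-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)

# Invented sample lines as (source file, log line) pairs.
SAMPLE = [
    ("nova-api.log",
     "2016-02-01 10:00:00.123 1234 INFO nova.api "
     "[req-11111111-2222-3333-4444-555555555555 admin demo] POST /servers"),
    ("nova-compute.log",
     "2016-02-01 10:00:02.456 4321 ERROR nova.compute.manager "
     "[req-11111111-2222-3333-4444-555555555555 admin demo] Instance failed to spawn"),
    ("nova-api.log",
     "2016-02-01 10:00:05.000 1234 INFO nova.api "
     "[req-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee admin demo] GET /servers"),
]


def group_by_request(lines: Iterable[Tuple[str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (source, line) pairs by the request ID found in each line."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, line in lines:
        match = REQ_ID.search(line)
        if match:
            grouped[match.group(0)].append((source, line.rstrip("\n")))
    return dict(grouped)


def failing_requests(grouped: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Request IDs that have at least one ERROR line in any service log."""
    return sorted(
        req for req, entries in grouped.items()
        if any(" ERROR " in line for _, line in entries)
    )


if __name__ == "__main__":
    grouped = group_by_request(SAMPLE)
    for req in failing_requests(grouped):
        print(req)
        for source, line in grouped[req]:
            print("  ", source, "|", line)
```

With real files, the `(source, line)` pairs could come from reading each log in turn and tagging every line with its file name.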
Smart Placement: Get the Most Out of Your OpenStack Cloud
It has long been recognized that typical applications deployed in cloud and virtual environments can receive performance benefits such as reduced latency, increased interconnect throughput and isolation from noisy neighbors by the careful placement of VMs on a cluster of hosts. In addition, neighbor-aware placement can also be used to effectively enforce business requirements such as licensing rules and service contracts.
In this talk, we begin by explaining several sample performance metrics, availability concerns and business requirements for an example multi-VM application. Next, we survey a large number of KVM and OpenStack resource controls and scheduling features that can be used to achieve these requirements. We also explain how custom filters and weights can be developed to realize specific goals. Finally, we show how these requirements and principles can be tied together with Heat templates to deliver application-aware clouds via smart placement.
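The filter-then-weigh idea behind custom filters and weights can be sketched in plain Python. This deliberately does not use the nova scheduler's own classes. `Host`, `Request`, `place` and the individual filter and weight functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Host:
    name: str
    free_ram_mb: int
    # Anti-affinity groups that already have a VM on this host.
    vm_groups: Set[str] = field(default_factory=set)


@dataclass
class Request:
    ram_mb: int
    anti_affinity_group: str = ""


def ram_filter(host: Host, req: Request) -> bool:
    """Keep only hosts with enough free memory."""
    return host.free_ram_mb >= req.ram_mb


def anti_affinity_filter(host: Host, req: Request) -> bool:
    """Reject hosts that already run a VM from the same group."""
    return (not req.anti_affinity_group
            or req.anti_affinity_group not in host.vm_groups)


def free_ram_weight(host: Host, req: Request) -> float:
    """Prefer the host with the most free memory (spreading)."""
    return float(host.free_ram_mb)


def place(hosts: List[Host],
          req: Request,
          filters: List[Callable[[Host, Request], bool]],
          weigher: Callable[[Host, Request], float]) -> Optional[Host]:
    """Drop hosts that fail any filter, then pick the highest-weighted one."""
    candidates = [h for h in hosts if all(f(h, req) for f in filters)]
    if not candidates:
        return None
    return max(candidates, key=lambda h: weigher(h, req))


if __name__ == "__main__":
    hosts = [Host("a", 8192, {"db"}), Host("b", 4096), Host("c", 1024)]
    chosen = place(hosts, Request(2048, "db"),
                   [ram_filter, anti_affinity_filter], free_ram_weight)
    print(chosen.name if chosen else "no valid host")
```

Filters express hard requirements such as licensing or anti-affinity rules, while the weigher expresses a preference among the hosts that remain.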
Control Plane Architectures: Design Solutions
Building an OpenStack private cloud can be very complex. Underpinning the entire operational capabilities of your cloud is the Control Plane. It’s common knowledge you should treat your cloud resources “like cattle” and not “pets”, yet most clouds treat the control plane like a favored pet.
How should the modern cloud architect design the control plane of a private cloud?
Backing up OpenStack VM workloads
It is widely recognized that backing up your data is as important as generating it. Disasters can endanger data availability and cause loss of data. The ability to back up all data at the virtual machine level is hence immensely important for business continuity. Here, we present our system for backing up VM data using cinder primitives and protecting it.
Capacity Planning and Infrastructure Sizing: Stop Overpaying!
Capacity planning is one of the most important tasks of a cloud admin. Since a cloud can be built using diverse technologies in terms of compute servers, storage and networking, sizing each one correctly gets very complicated. Better planning can not only reduce costs, but also let you take advantage of hardware improvements over time. The cloud has to be correctly sized initially, and the size needs to be revisited periodically to keep the footprint ahead of the growth. This problem also manifests itself at the project level, where quotas may need to be adjusted based on usage.
In this talk, we will showcase a tool to do sizing based on a set of workloads. The tool can do the initial sizing using a set of hardware nodes and compute the configuration in terms of minimal cost or minimal number of nodes. We will also discuss which metrics and aggregations of stats one should use to do growth projections and estimate the time available per resource.
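To make the sizing and projection ideas concrete, here is a minimal sketch that is not the tool from the abstract. It sizes on memory only using a first-fit decreasing heuristic, which gives a good but not necessarily minimal node count. It also reads "time available" as the time until a resource fills up at a constant daily growth rate. Both choices, and the function names, are assumptions.

```python
from typing import List


def nodes_needed(vm_ram_gb: List[int], node_ram_gb: int) -> int:
    """Estimate the node count with a first-fit decreasing heuristic."""
    free: List[int] = []  # remaining RAM on each node opened so far
    for ram in sorted(vm_ram_gb, reverse=True):
        if ram > node_ram_gb:
            raise ValueError(f"a {ram} GB VM does not fit on a {node_ram_gb} GB node")
        for i, left in enumerate(free):
            if left >= ram:
                free[i] -= ram  # place the VM on the first node with room
                break
        else:
            free.append(node_ram_gb - ram)  # open a new node
    return len(free)


def days_until_full(capacity: float, used: float, growth_per_day: float) -> float:
    """Days before `used` reaches `capacity`, assuming linear growth."""
    if growth_per_day <= 0:
        return float("inf")
    return max(capacity - used, 0.0) / growth_per_day


if __name__ == "__main__":
    print(nodes_needed([32, 16, 16, 8, 8, 4], node_ram_gb=64))
    print(days_until_full(capacity=1000, used=400, growth_per_day=20))
```

A fuller model would size CPU, storage and network together and attach a cost to each node type, which is where minimizing cost and minimizing node count start to diverge.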
Quantifying the Noisy Neighbor Problem in OpenStack
Two of the desirable features for private clouds are better control and predictable performance. Although public clouds have been extensively researched to characterize their unpredictable performance, private clouds have received less scrutiny.
In this talk, we will present how production workloads interfere with each other in an OpenStack-based cloud. We draw lessons from a several-month-long study of running workloads in different configurations on a highly available implementation of OpenStack. We study the impact of noisy neighbors on the network and storage IO performance of applications. We also look at the performance metrics of the OpenStack control plane and how API calls are impacted by a growing number of entities such as networks, routers, VMs and volumes. Our study relies on a tool that we developed to create clean and noisy workload deployments, using micro-benchmarks as well as enterprise workloads such as Hadoop, Jenkins and Redis.
Learning From Integrating AD With Kilo
Active Directory and other LDAP solutions are prevalent in enterprises, and most customers who deploy OpenStack would like to integrate with their existing AD/LDAP. This is easier said than done. This talk focuses on my personal journey as I worked through the process of learning how Keystone works with AD/LDAP. The reason I want to share this is so that others do not have to spend as much time as I had to.
Archiving and Restoring an OpenStack Project
An OpenStack Project is where the real work happens. That’s where customers’ workloads live. Enterprise workloads need to be movable from one part of the enterprise to another and need to be spruced up when sprawl occurs (which happens more often than you’d think). Deleting a project that has VMs, networks, routers, security groups, volumes, etc. is not a trivial job. The same holds for archiving and restoring it. OpenStack does not have a set of tools that make it easy to do such tasks. This talk will focus on these use cases and a possible approach to solving them.