The primary season is in full swing for the 2016 presidential elections, but there is another important ballot that needs your vote! Community voting is now open for OpenStack’s Austin Summit discussion topics. Check out ZeroStack’s abstracts and cast your votes here before February 17th.
Put Your OpenStack Deployment On Autopilot
If we can put a plane on auto-pilot, why not do the same for an OpenStack cloud? Enterprises often encounter several challenges in managing an OpenStack deployment. In particular, the operational overhead increases with the size and complexity of the private cloud. The burden placed on the operations team should not increase with growth in the private cloud deployment. The common technique of creating redundant services and restarting a failed service does not scale well. It also requires human intervention to heal the controllers if they fail.
In addition, checking a service’s health in isolation is not enough, because services depend on one another. For example, nova depends on MySQL and RabbitMQ to work.
In this talk, we describe a set of techniques to put your environment on auto-pilot by creating a self-healing highly available OpenStack-based cloud.
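The interdependency point above can be illustrated with a minimal sketch. It is not the technique presented in the talk; the dependency map, the function names and the health inputs are all hypothetical. It only shows why a service such as nova should be reported unhealthy when MySQL or RabbitMQ is down, even if nova's own check passes.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: service -> services it needs in order to work.
DEPENDS_ON: Dict[str, List[str]] = {
    "mysql": [],
    "rabbitmq": [],
    "nova": ["mysql", "rabbitmq"],
}


def effective_health(raw: Dict[str, bool],
                     deps: Dict[str, List[str]]) -> Dict[str, bool]:
    """A service is effectively healthy only if its own check passes
    and every service it depends on is effectively healthy."""
    result: Dict[str, bool] = {}

    def visit(name: str, seen: Set[str]) -> bool:
        if name in result:
            return result[name]
        if name in seen:
            # Dependency cycle: conservatively treat as unhealthy.
            return False
        ok = raw.get(name, False) and all(
            visit(dep, seen | {name}) for dep in deps.get(name, [])
        )
        result[name] = ok
        return ok

    for service in deps:
        visit(service, set())
    return result


def root_causes(raw: Dict[str, bool],
                deps: Dict[str, List[str]]) -> List[str]:
    """Services whose own check fails; these are the ones to heal first."""
    return sorted(s for s in deps if not raw.get(s, False))
```

With MySQL failing its own check, `effective_health` marks both `mysql` and `nova` as unhealthy while `root_causes` points only at `mysql`, which is the service a self-healing loop would need to repair.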
Virtual Machine High Availability
High availability of virtual machines is a feature that OpenStack does not currently support. This hinders OpenStack’s adoption in the enterprise. In many cases, administrators want workloads to restart in case of hardware failures. This talk describes a reference architecture and techniques for providing VM high availability. We will discuss various options and their pros/cons, along with a reference design.
The techniques outlined will be conducive to automation. Automating the failure detection and VM failover actions leads to higher reliability due to decreased response time and avoidance of operator error. We will also describe a software platform that we have built which automates these techniques. Administrators and users will learn how to use this feature for better application availability.
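As a rough illustration of automated failure detection and failover planning, here is a minimal sketch. It assumes a heartbeat-timeout detector and a simple "most free slots first" placement rule; both are assumptions made for this example, not the reference design from the talk, and it does not call any OpenStack API.

```python
from typing import Dict, List


def failed_hosts(last_heartbeat: Dict[str, float],
                 now: float, timeout: float) -> List[str]:
    """Hosts whose last heartbeat is older than `timeout` seconds."""
    return sorted(h for h, t in last_heartbeat.items() if now - t > timeout)


def plan_failover(vms_on_host: Dict[str, List[str]],
                  free_slots: Dict[str, int],
                  failed: List[str]) -> Dict[str, str]:
    """Map each VM on a failed host to a surviving host with spare capacity.

    VMs that cannot be placed are left out of the plan so that an
    operator (or a later pass) can deal with them.
    """
    survivors = {h: n for h, n in free_slots.items() if h not in failed}
    plan: Dict[str, str] = {}
    for host in failed:
        for vm in vms_on_host.get(host, []):
            candidates = [h for h, n in survivors.items() if n > 0]
            if not candidates:
                continue  # no capacity left for this VM
            # Prefer the host with the most free slots; break ties by name.
            target = min(candidates, key=lambda h: (-survivors[h], h))
            plan[vm] = target
            survivors[target] -= 1
    return plan
```

In a real deployment the detector would also need fencing of the failed host before restarting its VMs elsewhere; that step is omitted here.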
Lessons Learnt From Operating RabbitMQ And MySQL Within OpenStack
Almost all OpenStack API services are stateless by design. In typical deployments, MySQL serves as the source of truth for the cluster-wide state, which includes configurations as well as internally generated states. RabbitMQ, by virtue of being the message broker, forms the central nervous system of the OpenStack cluster. In this talk, we share our operational experiences with MySQL and RabbitMQ in the context of providing stability, detecting/remediating problems and increasing the overall availability of the cluster.
Python Basics For Operators Troubleshooting OpenStack
As any OpenStack operator knows, a straightforward request, such as creating a VM, requires multiple services to successfully communicate with each other and perform a designated function. When this does not go according to plan, there are a number of potential log files to analyze, each potentially supplying a piece of the puzzle.
This talk shows how an operator with minimal Python experience can extract the information needed to find the root cause of an issue and resolve it.
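One small example of the kind of Python that helps here is collecting every log line that belongs to one request across several log files. The sketch below assumes the log lines carry a `req-<uuid>` request ID, as OpenStack service logs typically do. The sample lines, file names and function names are made up for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Matches request IDs of the form req-<uuid>.
REQUEST_ID = re.compile(
    r"req-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)

# Made-up sample input: (log file name, log line) pairs.
SAMPLE: List[Tuple[str, str]] = [
    ("nova-api.log",
     "2016-02-01 10:00:00.101 INFO nova.api "
     "[req-11111111-2222-3333-4444-555555555555 demo demo] POST /servers"),
    ("nova-scheduler.log",
     "2016-02-01 10:00:00.350 ERROR nova.scheduler "
     "[req-11111111-2222-3333-4444-555555555555 demo demo] No valid host was found"),
    ("nova-api.log",
     "2016-02-01 10:00:01.000 INFO nova.api "
     "[req-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee demo demo] GET /servers"),
]


def group_by_request(
    lines: Iterable[Tuple[str, str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (source, line) pairs by the request ID found in each line."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, line in lines:
        match = REQUEST_ID.search(line)
        if match:
            grouped[match.group(0)].append((source, line))
    return dict(grouped)


def errors_for(grouped: Dict[str, List[Tuple[str, str]]],
               request_id: str) -> List[Tuple[str, str]]:
    """Only the ERROR-level lines recorded for one request."""
    return [(src, line) for src, line in grouped.get(request_id, [])
            if " ERROR " in line]
```

Calling `group_by_request(SAMPLE)` and then `errors_for(...)` with the first request ID returns the single scheduler line, which shows which service the failed request reached.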
Smart Placement: Get The Most Out Of Your OpenStack Cloud
It has long been recognized that typical applications deployed in cloud and virtual environments can receive performance benefits such as reduced latency, increased interconnect throughput and isolation from noisy neighbors through the careful placement of VMs on a cluster of hosts. In addition, neighbor-aware placement can be used to effectively enforce business requirements such as licensing rules and service contracts.
In this talk, we begin by explaining several sample performance metrics, availability concerns and business requirements (e.g., multi-VM applications). Next, we survey a large number of KVM and OpenStack resource controls and scheduling features that can be used to achieve these requirements. We also explain how custom filters and weights can be developed to realize specific goals. Finally, we show how these requirements and principles can be tied together with Heat templates to deliver application-aware clouds via smart placement.
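The filter-then-weigh idea behind custom filters and weights can be sketched in a few lines. This is a standalone illustration of the pattern only: it does not use the nova scheduler classes, and the `Host` and `Request` types, the two filters and the weigher are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Host:
    name: str
    free_ram_mb: int
    vm_groups: Set[str]  # application groups that already have a VM here


@dataclass
class Request:
    ram_mb: int
    group: str
    anti_affinity: bool  # True: do not share a host with the same group


Filter = Callable[[Host, Request], bool]
Weigher = Callable[[Host, Request], float]


def ram_filter(host: Host, req: Request) -> bool:
    """Reject hosts without enough free RAM."""
    return host.free_ram_mb >= req.ram_mb


def anti_affinity_filter(host: Host, req: Request) -> bool:
    """Reject hosts that already run a VM of the same group, if requested."""
    return not (req.anti_affinity and req.group in host.vm_groups)


def free_ram_weigher(host: Host, req: Request) -> float:
    """Prefer hosts with more free RAM (spreads load)."""
    return float(host.free_ram_mb)


def schedule(hosts: List[Host], req: Request,
             filters: List[Filter], weighers: List[Weigher]) -> Optional[str]:
    """Drop hosts that fail any filter, then pick the highest total weight."""
    candidates = [h for h in hosts if all(f(h, req) for f in filters)]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: sum(w(h, req) for w in weighers))
    return best.name
```

Filters encode hard requirements such as capacity or licensing rules, while weighers express preferences among the hosts that remain, so the two can be developed and combined independently.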
Control Plane Architectures: Design Solutions
Building an OpenStack private cloud can be very complex. Underpinning the entire operational capabilities of your cloud is the Control Plane. It is commonly known that you should treat your cloud resources like “cattle” and not “pets”, yet most clouds treat the control plane like a favored pet.
How should the modern cloud architect design the control plane of a private cloud?
Backing Up OpenStack VM Workloads
It is widely recognized that backing up your data is as important as generating it. Disasters can endanger data availability and cause loss of data. The ability to back up all data at the virtual machine level is immensely important for business continuity. Here, we present our system for backing up and protecting VM data using cinder primitives.
Capacity Planning And Infrastructure Sizing: Stop Overpaying!
Capacity planning is one of the most important tasks of a cloud admin. Since a cloud can be built using diverse technologies in terms of compute servers, storage and networking, sizing each one correctly gets very complicated. Better planning can not only reduce costs, but also let you take advantage of hardware improvements over time. The cloud has to be correctly sized initially, and the size needs to be revisited periodically to keep the footprint ahead of growth. The same problem appears at the project level, where quotas may need to be adjusted based on usage.
In this talk, we will showcase a tool that sizes the cloud based on a set of workloads. The tool can do the initial sizing by using a set of hardware nodes and can compute the configuration in terms of minimal cost or minimal number of nodes. We will also discuss which metrics and aggregations of stats one should use to project growth and estimate the time remaining before each resource runs out.
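To make the sizing idea concrete, here is a minimal sketch, not the tool from the talk. It assumes a single node type per configuration, sizes by aggregate demand only (it ignores packing fragmentation and overcommit), and projects growth linearly. All names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NodeType:
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int
    cost: float


@dataclass(frozen=True)
class Workload:
    vcpus: int
    ram_gb: int
    disk_gb: int
    count: int  # number of identical VMs


def nodes_needed(node: NodeType, workloads: List[Workload]) -> int:
    """Lower bound on node count: the most constrained resource decides."""
    vcpus = sum(w.vcpus * w.count for w in workloads)
    ram = sum(w.ram_gb * w.count for w in workloads)
    disk = sum(w.disk_gb * w.count for w in workloads)
    return max(math.ceil(vcpus / node.vcpus),
               math.ceil(ram / node.ram_gb),
               math.ceil(disk / node.disk_gb))


def size_cloud(node_types: List[NodeType], workloads: List[Workload],
               objective: str = "cost") -> Tuple[str, int, float]:
    """Return (node type, node count, total cost) for the chosen objective.

    objective is "cost" (minimal total cost) or "nodes" (minimal node count).
    """
    options = [(n.name, nodes_needed(n, workloads)) for n in node_types]
    costs = {n.name: n.cost for n in node_types}
    if objective == "cost":
        name, count = min(options, key=lambda o: (o[1] * costs[o[0]], o[1]))
    else:
        name, count = min(options, key=lambda o: (o[1], o[1] * costs[o[0]]))
    return name, count, count * costs[name]


def days_until_full(used: float, capacity: float,
                    growth_per_day: float) -> float:
    """Linear projection of the time left before a resource is exhausted."""
    if growth_per_day <= 0:
        return math.inf
    return max(capacity - used, 0.0) / growth_per_day
```

For example, ten VMs of 4 vCPUs, 16 GB RAM and 100 GB disk fit on three hypothetical 16-vCPU nodes or one 48-vCPU node, so the cheapest and the smallest configurations can differ depending on node prices.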
Quantifying The Noisy Neighbor Problem In OpenStack
Two of the desirable features for private clouds are better control and predictable performance. Although public clouds have been extensively researched to characterize their unpredictable performance, private clouds have received less scrutiny.
In this talk, we will present how production workloads interfere with each other in an OpenStack-based cloud. We draw lessons from a several-month-long study of running workloads in different configurations on a highly available implementation of OpenStack. We study the impact of noisy neighbors on the network and storage IO performance of applications. We also look at the performance metrics of the OpenStack control plane and how API calls are affected as the number of entities such as networks, routers, VMs and volumes grows. Our study relies on a tool that we developed to create clean and noisy workload deployments, using micro-benchmarks as well as enterprise workloads such as Hadoop, Jenkins and Redis.
Learning From Integrating AD With Kilo
Active Directory and other LDAP solutions are prevalent in enterprises, and most customers who deploy OpenStack would like to integrate it with their existing AD/LDAP. This is easier said than done. This talk focuses on my personal journey as I learned how Keystone works with AD/LDAP. I want to share this so that others do not have to spend as much time as I did.
Archiving And Restoring An OpenStack Project
An OpenStack Project is where the real work happens. That is where customers’ workloads live. Enterprise workloads need to be movable from one part of the enterprise to another and cleaned up when sprawl occurs (this happens more often than one may think). Deleting a project that has VMs, networks, routers, security groups, volumes, etc. is not a trivial job. The same applies to archiving and restoring. OpenStack does not have a set of tools that make such tasks easy. This talk will focus on these use cases and a possible approach to solving them.