

Design Summit
Monday, April 15
 

9:50am

Design Summit 101

Is this your first OpenStack Summit? Unlike most conferences, you are invited to participate and play an active role. But... where to start? We will explain the rationale and the organization that make such a uniquely collaborative design process possible. This is your chance to get answers and get the most out of it!

During this session we will give attendees some details about the summit, including who will be attending, the different tracks and their purposes, which sessions/talks are the most suitable for beginners, and how they can participate. This short introduction will be followed by a lively presentation of the most common situations and how to handle them. It will be a miniature experience of a first-time participation in the summit.


Speakers

Victoria Martínez de la Cruz

Software Engineer, Red Hat
Victoria is a software developer at Red Hat and a core member of the Trove and Zaqar projects. She is a former GNOME Outreach Program for Women intern and Google Summer of Code intern. She is passionate about FOSS and loves helping newcomers get involved with OpenStack.

Loïc Dachary

Developer, SecureDrop
Loïc Dachary has been involved with the Free Software Movement since 1987, when he started distributing GNU tapes to the general public. In 2017 he became a core developer for SecureDrop to help journalists communicate with their sources securely and anonymously. In the past de... Read More →


Monday April 15, 2013 9:50am - 10:30am
B119

9:50am

How to run multiple heat-engines and scaling

Currently we have an architecture that should support scaling, but some code is missing.

How does the heat-api find the correct engine to talk to?
How would a "heat list" work?

https://etherpad.openstack.org/heat-multiple-engines
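
As an illustration only (not Heat's actual implementation), one possible answer to "how does heat-api find the correct engine" is to hash the stack ID onto a per-engine message topic; the topic names and engine count below are hypothetical.

    # Hypothetical sketch: route stack operations to one of several
    # heat-engine workers via per-engine RPC topics.
    import hashlib

    ENGINE_TOPICS = ["engine.0", "engine.1", "engine.2"]  # assumed worker topics

    def topic_for_stack(stack_id):
        """Map a stack ID to a stable engine topic."""
        digest = hashlib.sha1(stack_id.encode()).hexdigest()
        return ENGINE_TOPICS[int(digest, 16) % len(ENGINE_TOPICS)]

    # heat-api would then cast the request on that topic, e.g.
    # rpc.cast(context, topic_for_stack(stack_id), {"method": "list_stacks"})
    print(topic_for_stack("91c9d1a7-0000-0000-0000-000000000000"))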

(Session proposed by Angus Salkeld)



Monday April 15, 2013 9:50am - 10:30am
B110

9:50am

OpenStack Networking Development Process
The Grizzly cycle was a successful one for the OpenStack Networking team, as we added many new features and worked to resolve reported issues. In this session, we'll look back on Grizzly and then discuss the OpenStack Networking development process for Havana. We'll also review the existing common components of OpenStack Networking to ensure they meet the needs of the team.



(Session proposed by Mark McClain)


Monday April 15, 2013 9:50am - 10:30am
B114

9:50am

OpenStack-on-OpenStack Overview

In this session, I will talk about some of the problems we face in deploying openstack by using openstack. I'll point at areas where the bare metal driver needs better integration with other services (eg. Quantum and Cinder), how we really need an inventory database (I'm looking at you, HealthNMon), how the Nova scheduler needs to be aware of hardware, and how Heat is taking over the world. I might even propose that it's possible to bootstrap yourself right out of your own boots!


(Session proposed by Devananda van der Veen)


Monday April 15, 2013 9:50am - 10:30am
B113

9:50am

Swift extensions for the real world (operator's view)
Swift extensions for the real world!

While operating a Swift service in a public environment, we have tackled several necessary feature extensions:
- full S3 compatibility
- server-side encryption, like AWS
- the large-container problem
- near-realtime sync for container-to-container sync

We would like to gather all Swift-related players (current commercial service providers, system integrators, solution builders, etc.) to share the technical requirements they get from customers, select some high-demand features to build the Havana roadmap, and identify who can work together on each feature's development.


(Session proposed by jinkyung hwang)


Monday April 15, 2013 9:50am - 10:30am
B116

11:00am

Heat credentials management/delegation
Heat has two problem areas related to managing keystone identities:

1 - Storing user credentials when creating a stack, such that subsequently we can perform actions on behalf of the user who created the stack (HA actions, Autoscaling events etc)

2 - We allow credentials (keystone ec2 keypair) to be deployed inside each instance, such that authentication with our APIs is possible, for the purposes of reading updated resource metadata, and writing metric data used for Alarm evaluation.

(1) is likely to be solved by the Trusts work recently merged into keystone, but I'd like to clarify the details/design of how we will use trusts to perform actions on behalf of the stack-owner in a secure way.

For (2) we currently have a sub-optimal solution, but no clear path to improving it. I'd like to present the current state of our in-instance credentials management and brainstorm the way forward; I expect some requirements for additional keystone features to come out of this.
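
As a hedged illustration of point (1), here is a sketch of how Heat might create a trust with python-keystoneclient's v3 trusts API so that the heat service user can later act on behalf of the stack owner; the endpoint, IDs and delegated role are placeholder assumptions, not a confirmed design.

    # Illustrative sketch only: delegating the stack owner's role to the
    # heat service user via a Keystone v3 trust.
    from keystoneclient.v3 import client

    STACK_OWNER_TOKEN = "<token of the user who created the stack>"
    STACK_OWNER_ID = "<trustor user id>"
    HEAT_SERVICE_USER_ID = "<trustee (heat service) user id>"
    PROJECT_ID = "<stack's project id>"

    keystone = client.Client(token=STACK_OWNER_TOKEN,
                             endpoint="http://keystone:5000/v3")

    # Delegate the owner's role(s) on the project to the heat service user.
    trust = keystone.trusts.create(trustor_user=STACK_OWNER_ID,
                                   trustee_user=HEAT_SERVICE_USER_ID,
                                   project=PROJECT_ID,
                                   role_names=["heat_stack_owner"],  # assumed role
                                   impersonation=True)

    # Heat would later authenticate with trust.id to obtain a trust-scoped
    # token and perform autoscaling/HA actions on the owner's behalf.
    print(trust.id)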

(Session proposed by Steven Hardy)


Monday April 15, 2013 11:00am - 11:40am
B110

11:00am

OpenStack Networking and Nova - Part 1
Let's get the OpenStack Networking and Nova teams together to discuss the future of OpenStack Networking being the default network provider for Nova. Part 1 will be a session held early in the week and will cover topics such as:

- Status of OpenStack networking / nova-network feature parity
- Goals for Havana
- Make OpenStack Networking the default? If so, when in the cycle?
- Documentation impact?
- How to migrate an existing deployment using nova-network?
- And more!

By establishing this as a high priority goal for both Nova and OpenStack Networking early in the week, everyone should have it in mind for the rest of the week while discussing other work. We'll get back together later in the week to recap and identify specific tasks to go work on.

(Session proposed by Russell Bryant)


Monday April 15, 2013 11:00am - 11:40am
B114

11:00am

python-novaclient
This session will include the following subject(s):

python-novaclient:

python-novaclient needs fixing; let's fix it.

Some current issues with the client:

* Poor testing (nova commands are commonly broken against the latest API)
* No testing across multiple releases (should work against Essex, Folsom, Grizzly, etc.)
* Doesn't cover many nova APIs (including extensions)
* Should be able to list the commands the current endpoint supports (by using list-extensions)


(Session proposed by Joe Gordon)

novaclient experience for end users:

There are some aspects of the novaclient experience that can be improved for users. This is similar to what's being proposed at http://summit.openstack.org/cfp/details/74 but with a focus on users.

A few issues:
* --help is overwhelming to a new user, and lists capabilities that may not be available from a particular provider.
* Admin functions are listed even for a non-admin user.
* Should it be possible to set up credentials in a config file?
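
As one possible shape for the config-file idea (entirely hypothetical; novaclient reads OS_* environment variables today, and this file and its options are not an existing feature):

    # Hypothetical ~/.novaclient.conf sketch -- not an existing novaclient feature.
    [default]
    os_username = demo
    os_password = secret
    os_tenant_name = demo-project
    os_auth_url = http://keystone.example.com:5000/v2.0/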

(Session proposed by Andrew Laski)


Monday April 15, 2013 11:00am - 11:40am
B113

11:00am

Restructure documentation
The original documentation layout for OpenStack was designed a couple of years ago. We have created a lot of documentation since then, and learnt much about how our various users interact with it. We now also have a new book: the OpenStack Operations Guide. Given these changes, we should restructure the documentation for greatest effectiveness.

This should address bugs like:

https://bugs.launchpad.net/openstack-manuals/+bug/1110137
#1110137 "running openstack" guide is exhaustive to the point of being past useful


Blueprint:
https://blueprints.launchpad.net/openstack-manuals/+spec/restructure-documentation

Fledgling ideas on the wiki:
https://wiki.openstack.org/wiki/Blueprint-restructure-documentation

(Session proposed by Tom Fifield)


Monday April 15, 2013 11:00am - 11:40am
B119

11:00am

Swift API Cleanup
In this session we will discuss the warts in the current Swift API (v1) that could be fixed with a minor bump to the API version.

The following wiki has been set up to begin identifying candidates for fixing:

https://wiki.openstack.org/wiki/SwiftNextAPI

(Session proposed by creiht)


Monday April 15, 2013 11:00am - 11:40am
B116

11:50am

Adding support for OASIS TOSCA to Heat
Heat provides an orchestration layer in OpenStack on top of base compute, network and storage capabilities and allows for defining more advanced patterns on top of those core capabilities. Heat is currently based on the Amazon CloudFormation syntax, so users have to adopt that specific format. Adding support for another, standardized format such as the OASIS TOSCA standard would also allow deploying patterns that users have created using that standardized format. TOSCA also provides features for expressing requirements and capabilities of pattern components, which allow for composing patterns out of (i.e. by re-using) other patterns. Therefore, as part of ongoing refactoring and evolutionary work on Heat, we propose to adopt TOSCA as one supported pattern format, and to align enhancements with concepts found in TOSCA (such as requirements/capabilities).
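
For context, today's Heat templates follow the CloudFormation JSON shape shown in this minimal fragment (image and flavor values are placeholders); a TOSCA service template would describe the same server as a node template with explicit requirements and capabilities, which is what enables pattern composition and re-use.

    {
      "AWSTemplateFormatVersion": "2010-09-09",
      "Resources": {
        "MyServer": {
          "Type": "AWS::EC2::Instance",
          "Properties": {"ImageId": "fedora-18", "InstanceType": "m1.small"}
        }
      }
    }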

(Session proposed by Thomas Spatzier)


Monday April 15, 2013 11:50am - 12:30pm
B110

11:50am

Local File System
Status of LFS

Work at Gluster and Nexenta is ongoing, and code is finally on GitHub for both.

(Session proposed by Pete Zaitcev)


Monday April 15, 2013 11:50am - 12:30pm
B116

11:50am

OpenStack Networking API Update
This session will include the following subject(s):

Networking API update:

The usual update on the status of the OpenStack Networking API, and the direction for Havana.

Agenda:
- New extensions introduced in Grizzly
- Pagination and sorting in the OpenStack Networking API
- Areas where the API needs improvement
- OpenStack Networking API documentation


(Session proposed by Salvatore Orlando)

Decouple AuthZ from business logic:

The aim of this session is to gather feedback on the already ongoing activities of the blueprint: https://blueprints.launchpad.net/quantum/+spec/make-authz-orthogonal

The final aim of this blueprint is to remove explicit authZ checking from the Quantum code and move it into a module which could potentially become a middleware of its own.

Also, discuss whether the resulting code modules, which should be independent of Quantum, should then be moved to the Oslo repository.

(Session proposed by Salvatore Orlando)


Monday April 15, 2013 11:50am - 12:30pm
B114

11:50am

Review Operations Manual and Plan Future Work
Now that we have an Operations Manual at http://openstack.booktype.pro/openstack-operations-guide/ I'd like to discuss its current state and how we'd like to work on it. Covers these blueprints:
https://blueprints.launchpad.net/openstack-manuals/+spec/openstack-operations-manual
https://blueprints.launchpad.net/openstack-manuals/+spec/deployment-template

(Session proposed by Anne Gentle)


Monday April 15, 2013 11:50am - 12:30pm
B119

11:50am

Versioned internal objects
In order to support rolling upgrades, we need to get away from having our internal objects directly mirror the DB schema.

Up for discussion are versioned objects, automatic serialization and deserialization over RPC, etc.
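
A minimal, purely illustrative sketch of what a versioned internal object might look like: a version string plus explicit serialization, so an older node can be handed a primitive form it understands rather than a raw DB row. The class, field and version names are assumptions, not Nova's actual design.

    # Illustrative sketch of a versioned, RPC-serializable object.
    class VersionedObject:
        VERSION = "1.1"       # bumped whenever fields change
        fields = ("uuid", "host", "vm_state")

        def __init__(self, **kwargs):
            for f in self.fields:
                setattr(self, f, kwargs.get(f))

        def to_primitive(self, target_version=None):
            """Serialize for RPC; can down-convert for an older peer."""
            data = {f: getattr(self, f) for f in self.fields}
            if target_version == "1.0":
                data.pop("vm_state", None)   # hypothetical field added in 1.1
            return {"version": target_version or self.VERSION, "data": data}

        @classmethod
        def from_primitive(cls, primitive):
            return cls(**primitive["data"])

    obj = VersionedObject(uuid="abc", host="node1", vm_state="active")
    print(obj.to_primitive(target_version="1.0"))  # safe to send to an N-1 node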

(Session proposed by Dan Smith)


Monday April 15, 2013 11:50am - 12:30pm
B113

1:50pm

Abstracting the AWS out of Heat
Heat adoption by cloud providers may be slowed by the fact that users are required to advertise for the competition just to write a template. While the format of Heat templates is about as basic as it comes, users should be able to express any capability of Heat entirely with OpenStack/Heat resources. It should also be possible for someone deploying Heat to disable usage of the AWS namespace entirely.
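
To make the namespace question concrete, here is an illustrative template fragment showing a resource in the current AWS namespace next to a hypothetical OpenStack-native equivalent; the "OS::Nova::Server" type and its property names are assumptions for the sake of the example, not an agreed design.

    "Resources": {
      "MyServerAws": {
        "Type": "AWS::EC2::Instance",
        "Properties": {"ImageId": "fedora-18", "InstanceType": "m1.small"}
      },
      "MyServerNative": {
        "Type": "OS::Nova::Server",
        "Properties": {"image": "fedora-18", "flavor": "m1.small"}
      }
    }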

(Session proposed by Clint Byrum)


Monday April 15, 2013 1:50pm - 2:30pm
B110

1:50pm

Database Status Redux
A lot of work was done on Nova's database back-end during Grizzly, including but not limited to:

- no direct db access from nova-compute
- cleaning up the db API
- better use of sessions
- adding real unique keys
- archiving deleted rows
- better support for postgres

Let's all get together and discuss what went well, what we didn't finish, and what we want to do next!

(Session proposed by Devananda van der Veen)


Monday April 15, 2013 1:50pm - 2:30pm
B113

1:50pm

Making OpenStack Networking Simpler/Easier
We spend a lot of time making OpenStack Networking more complicated... adding new config options, agents, apis, etc. I'd like to spend some time on how to make it simpler.
- How can we make it more bullet-proof to get a reasonable multi-node deployment up and running?
- How can we make it easier for people to detect when they have made common errors (e.g., not running an l2-agent on the node where they are running the l3-agent?).
- Can we write tools to help people validate a basic setup?

This is a brainstorming session. I will have some wacky ideas to propose, but the goal is to have everyone present ideas.

(Session proposed by dan wendlandt)


Monday April 15, 2013 1:50pm - 2:30pm
B114

1:50pm

Swift with OpenStack: what's next
Swift now is pretty happy talking with the other OpenStack services.

During this session we will review the current status of the integration and discuss where we can go from there.



(Session proposed by Chmouel Boudjnah)


Monday April 15, 2013 1:50pm - 2:30pm
B116

1:50pm

Translation management enhancement
1. Role based translation management

Access to OpenStack in Transifex is set to "Free for all": any logged-in (registered) user can submit translations to it. Git review is used to review the translations. The current solution has some disadvantages:
- Git reviewers might not understand non-English languages.
- Even if we invite some non-English speakers to review a patch, reviewer feedback does not easily reach the translators.
- "Free for all" allows any registered user to upload a PO file with his translations to Transifex, which might cause regressions.

Setting limited access to the OpenStack project in Transifex may be better. After changing to limited access, there will be a team for each language. There are three different roles in the translation management: translators, reviewers and coordinators.

We need to discuss:
- The requirements for a coordinator. Coordinators are the leads of a translation team; they are responsible for setting up the team and for controlling quality and progress.
- How to organize groups with the permissions that translators want?

2. How to leverage professional translations?

Since some companies are starting to productize OpenStack, they may generate professional translation assets, both for messages and for documents. If these companies want to contribute their translations, how should we handle the relationship between community translation and professional translation?

Possible ways:
- Use the professional translation as translation memories
- Use the professional translation instead of community translation
- Anything else?

Which one is the best way?

(Session proposed by Ying Chun Guo)


Monday April 15, 2013 1:50pm - 2:30pm
B119

2:40pm

Autoscaling API for Heat
Heat has done an enormous amount of work towards feature-parity with AWS AutoScaling and CloudWatch. The intent of this session is to discuss what work remains to be done, to fully explore the scope of that work, and to come to a consensus on a high-level implementation plan.

In particular, there is interest in creating an API for manipulating scaling groups, responding to alerts, scaling up/down, etc. There are many ways an API could be provided, and at least two different places it could live. This session aims to resolve any ambiguities around this.

Ideally, the outcome of this session would be 1) a well-defined approach for any additional implementation needed in Heat, and 2) a plan for creating an autoscaling API that the group agrees is the best approach.

(Session proposed by Duncan McGreggor)


Monday April 15, 2013 2:40pm - 3:20pm
B110

2:40pm

Documentation for Newly Integrated Projects
The Ceilometer and Heat projects have recently become integrated projects under the OpenStack umbrella. Both teams have a combined goal of starting to work more closely with the documentation team, now that we're official. I am proposing this session to ask for help in getting up to speed on doc processes, etc. as well as for us to discuss what documentation is already done.

(Session proposed by Doug Hellmann)


Monday April 15, 2013 2:40pm - 3:20pm
B119

2:40pm

OpenStack Networking DB improvements
The database layer, which acts as a backend for the OpenStack Networking API, has become a critical component as far as the performance, scalability, and reliability of the OpenStack Networking service are concerned.

This session should cover the following points:
- Database access APIs
Database access is currently spread throughout the plugins. This leads to occasional errors such as atomicity not being properly handled, or conflicting concurrent updates. The aim here is to discuss alternatives for improving the DB logic in OpenStack Networking, possibly leveraging Oslo and using other OpenStack projects as 'inspiration' (i.e. shamelessly copying).
- Database Upgrades
During the Grizzly time frame, Mark McClain gave us DB upgrades; we would now like to collect feedback from the developer and user community about whether the current approach is suitable for deployments which chase OpenStack Networking trunk. It might also be worth discussing whether the current approach needs to be tweaked to handle service plugins.
- Usage of SqlAlchemy models
The aim of this point in the agenda is to discuss how DB models are used in OpenStack Networking, identifying what can be regarded as good practice, and what instead might not be a good practice, especially when the size of the data set increases dramatically.
- DB Profiling at scale
This point is related to the previous one and is aimed at discussing a set of procedures to assess how the OpenStack Networking database, and the modules operating on it, can be validated at scale.
Note: The list of referenced blueprints is provisional

(Session proposed by Salvatore Orlando)


Monday April 15, 2013 2:40pm - 3:20pm
B114

2:40pm

Swift drive workloads
We are analyzing the workload generated at the disk drive level on Swift clusters. This includes analyses such as the following.
- load versus time
- spatio-temporal locality
- heat density
- operation types
- operation correlation by spatial area (block address)
- operation lengths including simple sequential sequences

(Session proposed by Tim Feldman)


Monday April 15, 2013 2:40pm - 3:20pm
B116

2:40pm

Zero downtime service upgrades
Discuss strategies for working towards zero-downtime maintenance.

1) Stopping services gracefully, completing existing requests before quitting.
2) Make computes 'kill'-friendly by enabling long-running processes (e.g. resize) to resume after a restart.
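
As a hedged sketch of point 1, one common pattern is to catch SIGTERM, stop accepting new work, and drain in-flight requests before exiting; the worker loop below is purely illustrative and not any specific OpenStack service.

    # Illustrative graceful-shutdown sketch (not actual OpenStack code).
    import queue
    import signal
    import threading
    import time

    requests = queue.Queue()
    shutting_down = threading.Event()

    # On SIGTERM, stop accepting new requests; finish the ones already queued.
    signal.signal(signal.SIGTERM, lambda *_: shutting_down.set())

    def handle(job):
        time.sleep(1)                 # placeholder for real request handling

    def worker():
        while True:
            try:
                job = requests.get(timeout=0.5)
            except queue.Empty:
                if shutting_down.is_set():
                    return            # queue drained; exit cleanly
                continue
            handle(job)               # complete work already accepted

    threading.Thread(target=worker).start()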

(Session proposed by Brian Elliott)


Monday April 15, 2013 2:40pm - 3:20pm
B113

3:40pm

Concurrent Resource Scheduling in Heat
Heat currently uses a strategy of only creating/deleting/updating one resource per stack at a time. It does this in dependency-topology order, which keeps the code simple and guarantees things happen in the intended order. With large stacks, this will mean a significant amount of time spent waiting unnecessarily. We should discuss the problem space and various strategies to reduce waiting.
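
A minimal sketch, assuming a small acyclic dependency graph, of one possible strategy: process resources in waves, where every resource whose dependencies are already complete can be created concurrently. This is illustrative only, not Heat's implementation.

    # Illustrative: group resources into concurrent "waves" by dependency depth.
    deps = {                      # resource -> resources it depends on (example graph)
        "network": set(),
        "volume": set(),
        "server": {"network", "volume"},
        "floating_ip": {"server"},
    }

    done = set()
    while len(done) < len(deps):  # assumes the graph is acyclic
        wave = [r for r, d in deps.items() if r not in done and d <= done]
        # Each wave could be created in parallel (greenthreads, workers, ...).
        print("create concurrently:", sorted(wave))
        done.update(wave)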

(Session proposed by Clint Byrum)


Monday April 15, 2013 3:40pm - 4:20pm
B110

3:40pm

Oslo Status and Plans
In this session, we will review the current status of Oslo and discuss improvements to the concept and process.

Topics up for discussion will include releasing individual library packages, versioning, PyPI uploads, our "managed copy and paste" process, and what shared code we should be focusing our efforts on.

More topics are welcome, please come armed with your ideas!

(Session proposed by Mark McLoughlin)


Monday April 15, 2013 3:40pm - 4:20pm
B119

3:40pm

Reviewing OpenStack Networking Unit Tests
During the past year the OpenStack Networking codebase has dramatically increased in size, and the same happened to unit tests.

OpenStack Networking now has over 5000 unit tests, which take about 10 minutes to execute (at least on my machine).

While at first glance this might seem great, most of these unit tests are actually not unit tests but rather integration tests performed against plugins (which sometimes mock their backends with fake drivers). The coverage of these tests probably deserves to be reviewed as well, as proven by the fact that odd syntax errors are sometimes still found in OpenStack Networking.

The aim of this session is to have a discussion around the current state of OpenStack Networking unit tests and to decide, together with the community, on an attack plan to improve coverage, reduce execution time and resource usage, and define guidelines for writing unit tests for OpenStack Networking.

Proposed agenda for the session:
1 - Assessment of the current state of unit tests, with emphasis on plugin unit tests
2 - Discuss and decide what is a unit test and what is an integration test
2.b - If we agree that many tests are actually integration tests, do we deem them still useful? Should they be part of a gate job?
3 - Consider alternatives for plugin unit tests
4 - Fake libraries. Are they a simple and handy way for simulating a complex system, or just added burden for unit tests?
5 - Parallel testing (this should probably not even need discussion!)
6 - Define/Assign blueprints and bugs

(Session proposed by Salvatore Orlando)


Monday April 15, 2013 3:40pm - 4:20pm
B114

3:40pm

Speeding up the object server
Due to the awesome* state of async disk IO on Linux, the object server doesn't do a very good job of keeping large numbers of disks busy. Let's figure out how to make it better on boxes with lots of disks.


* ly bad

(Session proposed by Samuel Merritt)


Monday April 15, 2013 3:40pm - 4:20pm
B116

3:40pm

The (continuing) path to live upgrade
Still more work needs to be done to provide a true path to rolling upgrade, such as RPC client N-1 version support.

(Session proposed by Dan Smith)


Monday April 15, 2013 3:40pm - 4:20pm
B113

4:30pm

Mini-Sessions: Network Proximity & Python Library and CLI
This session will include the following subject(s):

Network proximity:

Application performance can be enhanced by ensuring that images are deployed as close as possible to one another on the underlying physical network. The scheduler will need to be aware of the network "proximity" of hosts to one another. The session will propose an API to OpenStack Networking that will return the proximity between hosts on an OpenStack Networking network to enable a "network proximity" group scheduling policy.

(Session proposed by Gary Kotton)

OpenStack Networking Client 3.0.0:

In this session, we'll discuss the underlying changes to the 3.0.0 python library and discuss how we can update the CLI to address some of the common issues experienced by users.

(Session proposed by Mark McClain)


Speakers

Gary Kotton

Staff Engineer at VMware, VMware
Gary is a core Neutron developer working at VMware who also spends a lot of his time these days writing + reviewing code for Nova. Prior to working at VMware Gary worked at Red Hat, Radware and at Algorithmic Research. Gary holds a B.Sc. in Mathematics and Computer Science from the... Read More →

Mark McClain

CTO, Akanda
Mark McClain is the Chief Technical Officer of Akanda Inc, a member of the OpenStack Technical Committee and a core reviewer for several teams (Neutron, Requirements and Stable). Mark was the Program Technical Lead for the OpenStack Networking during the Havana and Icehouse cycles... Read More →


Monday April 15, 2013 4:30pm - 5:10pm
B114

4:30pm

Nova v3 API
The Nova v2 API has several problems with it that cannot be fixed by modifying the v2 API, as doing so would cause compatibility problems with existing clients. The purpose of this summit session is to discuss the problems with the v2 API and how we will fix them with the v3 API. Issues include but are not restricted to:

* Clarification of what is to be core and non-core
* Decide on objective criteria for deciding how to classify this
* Promotion/Demotion of functionality into or out of core
* Consistency of return codes
* Fix extensions which don't follow REST principles
* Fix XML/JSON inconsistencies
* How to handle versioning for extensions (no more extending extensions!)
* Copy v2 tree and rework in place?
* Timeline for work (h1/h2/h3 targets)
* Make sure we have enough time to bring up tests as well as convert extensions developed during the Havana cycle, so v2 only has to exist for the I cycle before being deprecated.
* Develop "good practice" guide for writing extensions


(Session proposed by Christopher Yeoh)


Monday April 15, 2013 4:30pm - 5:10pm
B113

4:30pm

Pecan/WSME Status
At the Grizzly summit I proposed replacing the WSGI framework in Oslo with a combination of Pecan and WSME. We have done that in ceilometer's v2 API, and this session will discuss lessons learned.

(Session proposed by Doug Hellmann)


Monday April 15, 2013 4:30pm - 5:10pm
B119

4:30pm

Rolling Updates and Instance Specific Metadata
Rolling updates is a proposed feature for Havana that will allow updating metadata for instance groups in a controlled fashion.

There is also a need to have per-instance metadata for sharing things like database credentials, as it is more robust to have per-instance credentials than shared credentials.

We need to have a discussion on how those two features will interact.

(Session proposed by Clint Byrum)


Monday April 15, 2013 4:30pm - 5:10pm
B110

4:30pm

swift performance analysis
We did a deep-dive Swift performance analysis. In this session, we will present our data, describe several possible performance bottlenecks, and propose related optimizations.

(Session proposed by jiangang)


Monday April 15, 2013 4:30pm - 5:10pm
B116

5:20pm

Benchmarking Swift
Let's discuss benchmarking Swift.

The swift-bench tool that ships with Swift is a relatively simple load-generator, but deployers evaluating Swift need more.

I've written a new Swift benchmark tool, ssbench (https://github.com/swiftstack/ssbench/#what-is-this), but there's also COSBench, and probably some other home-grown benchmarking tools folks are using.

I'd like to put our heads together and discuss:

* What Swift benchmarks should evaluate/track (request duration, time-to-first-byte for GETs, etc.)
* How to scale benchmarking to Swift clusters containing many nodes (both from high req/s with small requests to effectively saturating the network bandwidth of many, many 10G-attached storage servers)
* How to track/report on server-side metrics during benchmark runs (e.g. a StatsD server integrated with the benchmarking tool which collects stats from Swift nodes during a benchmark and reports on those as well as client-side metrics in the first bullet, above)
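
As a small illustration of the first bullet, here is a sketch of measuring total request duration and time-to-first-byte for a single GET; the host and path are placeholders, and this is not how ssbench or COSBench are implemented.

    # Illustrative: measure total duration and time-to-first-byte for one GET.
    import http.client
    import time

    HOST = "swift.example.com"                          # placeholder
    PATH = "/v1/AUTH_test/container/object"             # placeholder

    conn = http.client.HTTPConnection(HOST)
    start = time.time()
    conn.request("GET", PATH)
    resp = conn.getresponse()
    resp.read(1)                          # forces the first byte off the wire
    ttfb = time.time() - start
    resp.read()                           # drain the rest of the body
    total = time.time() - start
    print("status=%s ttfb=%.3fs total=%.3fs" % (resp.status, ttfb, total))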


(Session proposed by Darrell Bishop)


Monday April 15, 2013 5:20pm - 6:00pm
B116

5:20pm

Networking services' insertion, chaining, steering
In Grizzly we took a significant step forward by defining a network service and incorporating one service insertion model in the form of 'routed-service-insertion'. The proposed session aims to revisit this topic and move the discussion forward by exploring use cases that require other modes of L2/L3 service insertion.
The other complementary aspect of this discussion is services' chaining and possibly the requirement for steering. Currently there is one service defined and implemented in OpenStack Networking in the form of Loadbalancer, but more services such as Firewalls will also need to be added. With the possibility of more than one service it becomes relevant to explore the model of how multiple services can be requested and sequenced.

(Session proposed by Sumit Naiksatam)


Monday April 15, 2013 5:20pm - 6:00pm
B114

5:20pm

No-downtime DB migrations
Operators want to keep services running while deploying upgrades, or at least close enough to running that clients don't notice. Shutting down all services, upgrading them all, then running migrations, then starting them all up is an uptime nightmare.

Let's detail the steps needed to allow migrations to happen without [non-trivial] visible downtime.
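
One widely used pattern worth discussing is the expand/contract approach: add new schema (nullable columns, new tables) before upgrading services, backfill while old and new versions coexist, and only drop old schema after every service has moved on. The sketch below uses Alembic-style operations purely for illustration (Nova used sqlalchemy-migrate at the time), and the table and column names are made up.

    # Illustrative expand/contract migration steps (Alembic-style operations;
    # table/column names are hypothetical).
    import sqlalchemy as sa
    from alembic import op

    def expand():
        # Step 1 (before upgrading services): additive, backwards-compatible.
        op.add_column("instances",
                      sa.Column("new_state", sa.String(36), nullable=True))

    def migrate_data():
        # Step 2 (while old and new services coexist): backfill, ideally in batches.
        op.execute("UPDATE instances SET new_state = old_state "
                   "WHERE new_state IS NULL")

    def contract():
        # Step 3 (after every service runs the new code): drop the old column.
        op.drop_column("instances", "old_state")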

(Session proposed by Robert Collins)


Monday April 15, 2013 5:20pm - 6:00pm
B119

5:20pm

Nova API Extension framework
The development of the Nova v3 API will give us the opportunity to rework the extension framework. The current framework suffers from:

* extension-specific code being required in the core code for specific extensions to work
* compromises such as non-PEP 8 class names because of the extension framework

The purpose of this summit session is to discuss proposals for how the new extension framework will work, and hopefully to benefit from previous experience with extension points used in other places such as scheduler hints, keypairs, etc.

(Session proposed by Christopher Yeoh)


Monday April 15, 2013 5:20pm - 6:00pm
B113
 
Tuesday, April 16
 

11:00am

Load Balancing as a Service - in Havana
As of Grizzly, LBaaS is supported for HAProxy only.
Multiple topics for discussion at the summit are summarized on the etherpad: https://etherpad.openstack.org/havana-quantum-lbaas


(Session proposed by Samuel Bercovici)


Tuesday April 16, 2013 11:00am - 11:40am
B114

11:00am

Nova scheduler features
This is a session on features people would like to add to the current scheduler in Nova. It's packed with ideas. We will do our best to discuss as much as we can in the time available. Topics that need more time can be scheduled for the unconference track later in the week.

This session will include the following subject(s):

Extend the Nova Scheduler view of mutable resource:

Currently the Nova host_manager keeps track of the Disk, Memory and vCPU resources of each host which are then available for use in scheduler filters. These are also reflected in the flavour definitions.

I’d like to discuss extending this to cover additional resource types such as:

Compute Units: a measure of CPU capacity that is independent of the physical CPU performance of a server model and independent of the vCPU configuration of a VM. Each server would report its total Compute Unit capacity.
The flavor "vcpu_weight" would seem to meet the requirement in terms of definition, but it seems to be something of a hidden value (the instance_types create() method doesn't support it, for example) and it's not currently tracked as a mutable resource in the host_manager.


Network Bandwidth: a measure of network bandwidth that is independent of the network capacity of a server. Each server would report its total network bandwidth capacity. The current rxtx_factor in the flavours looks like it could logically be used to represent this, but its current usage seems to conflict with being an arbitrary measure of bandwidth, since it represents a percentage of the rxtx_base value of a network. A Nova system could include hosts with 1Gb, 10Gb, or multiple 10Gb network configurations connected to the same logical network.


These are two examples of additional flavour and host attributes; there will probably be others, either now or in the future. Flavors already provide extra_specs that could in theory be used to define these, but there is no way to expose the corresponding host capacity to scheduler filters. The host manager does support a "capabilities" feature, but this is more of a binary value than a consumable resource.

Possible options are:
- Add the existing vcpu_weight and rxtx_factor as specific mutable resources in the host manager. There may be a conflict here between the current usage in Xen and the more general definitions of the resources.

- Add additional flavour & host manager resources, to avoid overload / conflict with current usage of vcpu_weight and rxtx_factor.

- Provide a generic mechanism for the host manager to support additional mutable objects that correspond to and can be consumed by flavour extra_spec values.

In addition to making this data available to the scheduler, it also needs to be consumable by a resource management layer that may be to some extent independent of the virtualisation library. For example, it is an established design pattern to implement resource management via an agent running outside of Nova itself – for example, an agent triggered via a libvirt hook when a VM is created. Currently such an approach only has access to the flavour aspects which are part of the VM definition. This proposal would (for libvirt) create an additional XML file per VM that contains the full flavour definition.
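
To make the idea concrete, here is a standalone sketch (not Nova code) of a scheduler-filter-style check that compares a hypothetical "compute_units" flavor extra_spec against a host's remaining capacity; the attribute and extra_spec names are assumptions.

    # Illustrative sketch of a host filter for a hypothetical consumable
    # "compute_units" resource (names are assumptions, not Nova's API).
    class ComputeUnitsFilter:
        def host_passes(self, host_state, filter_properties):
            extra_specs = filter_properties.get("instance_type", {}).get("extra_specs", {})
            requested = int(extra_specs.get("compute_units", 0))
            free = (host_state.get("total_compute_units", 0)
                    - host_state.get("used_compute_units", 0))
            return requested <= free

    # Example: a host reporting 100 units with 80 already consumed.
    host = {"total_compute_units": 100, "used_compute_units": 80}
    request = {"instance_type": {"extra_specs": {"compute_units": "16"}}}
    print(ComputeUnitsFilter().host_passes(host, request))  # True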


(Session proposed by Phil Day)

List supported Scheduler Hints via API:

https://etherpad.openstack.org/HavanaNovaSchedulerHintsAPI

The Nova API for instance creation supports a scheduler_hints mechanism whereby the user can pass additional placement related information into the Nova scheduler.

The implementation of scheduler_hints lies (mostly) in the various scheduler filters, and the set of hints which are supported on any system therefore depends on the filters that have been configured (this could include non-standard filters). It is not currently possible for a user of the system to determine which hints are available. Hints that are not supported will be silently ignored by the scheduler.

We propose to add an API extension to make the list of supported hints available to users.
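
For context, this is roughly how scheduler hints are passed today via python-novaclient; the "same_host" hint corresponds to one of the standard affinity filters, and the credentials, image and flavor values are placeholders.

    # Illustrative: passing a scheduler hint with python-novaclient
    # (credentials, image and flavor IDs are placeholders).
    from novaclient.v1_1 import client

    nova = client.Client("user", "password", "tenant",
                         "http://keystone.example.com:5000/v2.0/")
    nova.servers.create(name="web-2",
                        image="<image-uuid>",
                        flavor="<flavor-id>",
                        scheduler_hints={"same_host": "<existing-server-uuid>"})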

(Session proposed by Phil Day)

Rack aware scheduling in Nova:

https://etherpad.openstack.org/HavanaNovaRackAwareScheduling

A common requirement for analytical applications (amongst others) is to place related workloads on the same rack so as to take advantage of the increased network bandwidth.

This is similar to the existing affinity scheduler filter, but requires additional per-server attributes to be exposed. We would like to discuss whether this can be achieved by extending the existing capabilities mechanism.

(Session proposed by Phil Day)

Add "whole host" allocation capability to Nova:

https://etherpad.openstack.org/HavanaNovaWholeHostAllocation

Allow a tenant to allocate all of the capacity of a host for their exclusive use. The host remains part of the Nova configuration, i.e. this is different from bare metal provisioning in that the tenant is not getting access to the Host OS - just a dedicated pool of compute capacity. This gives the tenant guaranteed isolation for their instances, at the premium of paying for a whole host.

We will present a proposal that could achieve this by building on existing host aggregate and scheduler filters.

Extending this further in the future could form the basis of hosted private clouds - i.e. the semantics of having a private cloud without the operational overhead.

The required features are explored by stepping through the main use cases in the Etherpad.

(Session proposed by Phil Day)

Make scheduler host state extensible:

Overview:

The nova scheduler is periodically sent updates from each of the compute managers about the latest host capabilities / stats. This includes available ram, the amount of IOPS, the types of supported cpus, number of instances running, etc. Scheduler filters can then be defined, and guided using scheduler hints, to use this information to improve instance scheduling.

It would be useful if the information could be generalized so that services other than compute could also update the scheduler's host information.

Benefits:

* 3rd party extensions can feed information into the scheduler that their SchedulerFilters can interpret and make better scheduling decisions.

* This could be a good step in moving more scheduler code into oslo-incubator. The common scheduler code can accept updates for any source (e.g. cinder manager in cinder's case or compute manager in nova's case).

* If the scheduler does become a full independent service then this type of functionality will be required (i.e. the scheduler will need to make a decision based on information from both Cinder and Nova).

Related Summit Talks

* http://summit.openstack.org/cfp/details/121
Features like this could be implemented as small daemons on the hosts that can cast a message to the scheduler about the host rack ID.

* http://summit.openstack.org/cfp/details/120
Since we are opening up more customization to the type of data available to the scheduler, and the potential filters installed, this would be a good feature.

* http://summit.openstack.org/cfp/details/36
Updates the information that is sent to the Scheduler.


(Session proposed by David Scannell)

Coexistence of different schedulers:

Today in Nova only one scheduling algorithm can be active at any point in time. For environments that comprise multiple kinds of hardware (potentially optimized for different classed of workloads), it would be desirable to introduce flexibility in choosing different scheduler algorithms for different resource pools (aggregates). This can be done ei

Tuesday April 16, 2013 11:00am - 11:40am
B113

11:00am

Rootwrap improvements for the Havana cycle
During the Grizzly timeframe, Rootwrap moved to Oslo and that version was adopted by Nova and Cinder. This session will look into further work planned during the Havana timeframe:

- Make Quantum's rootwrap use the common version (including introducing the extra features from quantum-rootwrap into the common one)
- Add a PathFilter for obvious path-constrained operations
- Execute snippets of Python instead of shelling out
- Graduate oslo.rootwrap off Oslo incubation



(Session proposed by Thierry Carrez)


Tuesday April 16, 2013 11:00am - 11:40am
B119

11:00am

Tempest Scope
Tempest has grown from smoke tests for Nova, to incorporating most of the core projects, to including CLI testing.

Purpose of the summit session:
* get a clear definition of Tempest scope that we are good with for Havana (we can revisit at each future summit)
* figure out what other kinds of tests we'd welcome
* if we are increasing scope in a single project, do we have guidelines for reviewing / contributing to get people into core? (make sure people are reviewing past just one subdir to ensure we don't have culture fragmentation)


(Session proposed by Sean Dague)


Tuesday April 16, 2013 11:00am - 11:40am
B110

11:00am

Testing in Horizon
The main aspect: the Horizon unit tests can be quite complex for new contributors and for people extending Horizon to wrap their heads around: mocking with mox, the Django unit testing framework, the OpenStack-specific parts of the testing framework, Selenium, fixtures/test data handling, QUnit...

This session could work as a tutorial/tips-and-tricks session on the different testing components: common errors being thrown and how to debug them. If people could bring up their pain points, that would also be useful.


If there is time, it would be interesting to also address the issue from another angle and think about how to improve what we have, particularly on the Selenium front, which has been quite unstable.

(Session proposed by Julie Pichon)


Tuesday April 16, 2013 11:00am - 11:40am
B116

11:50am

Common packaging support and code analysis tools
At the end of the Grizzly cycle, work was done to make openstack.common.setup/version and hacking.py standalone projects - i.e. oslo.packaging and oslo.hacking.

This session will discuss the scope of these, whether they should actually be in Oslo or not, and how we should manage them going forward - as well as the technology choices involved in making them separate (such as d2to1 and flake8)

(Session proposed by Monty Taylor)


Tuesday April 16, 2013 11:50am - 12:30pm
B119

11:50am

Keystone v3 and multi endpoint/region support
With the introduction of the Keystone v3 API, there are numerous new features and complexities added to Identity Management. Let’s discuss how Horizon will present these added complexities and new constructs.

Additionally, let’s discuss how multiple service endpoints and multiple region support will be represented and utilized.


(Session proposed by David Lyle)


Tuesday April 16, 2013 11:50am - 12:30pm
B116

11:50am

Scheduling across nova/cinder/networking
This session will include the following subject(s):

Scheduling across nova/cinder/networking:

The goal is to be able to schedule resources across nova/cinder/networking efficiently. A simple example could be the 'boot from volume' scenario, where it would be highly desirable to make sure that the instance and the volume reside "close enough", as reflected by the underlying network topology (provided by OpenStack Networking, probably).

(Session proposed by Alex Glikson)

Unified resource placement for OpenStack:

Background:

The current Nova scheduler takes into account neither the connectivity requested among a group of VMs (bundled VMs) nor the physical topology. For example, there is no API that allows users to specify that all VMs should be placed on compute nodes that are no more than 2 hops apart, with at least 50Mbps of residual bandwidth available on the path connecting them. With the current scheduler APIs, users could specify the exact hosts on which to place the VMs to achieve a certain network affinity, but this requires very detailed physical topology information which might not be available to tenants. Another example is that the block storage manager (nova-volume or Cinder) currently selects nodes randomly. This could cause unnecessary load on the network as well as latency for the application.

How to address these problem:

Ideally, a higher-level manager should be developed to oversee and coordinate the compute, storage, and network managers for more efficient resource usage. Another important benefit is that tenants could then specify their requests for compute, storage, and network in a bundled manner. They should not really have to care about the fine-grained details of how to connect their application servers, or worry about competing for network and storage resources with other tenants. There has been a lot of interest in this area (see for example https://etherpad.openstack.org/the-future-of-orch).

Goal of this session:
This session should serve as a good venue to discuss the problem definition, the related blueprints, potential solutions, pitfalls we want to avoid, and the timeline to implement a sound placement module for OpenStack. We should also discuss whether we should separate the workflow management from the resource (node) selection for better modularity.

(Session proposed by Senhua Huang)


Tuesday April 16, 2013 11:50am - 12:30pm
B113

11:50am

Strategies for Gating in a growing project
Tempest reached an important milestone recently when the whole suite became a gating job for all projects. This is good, but as

1. more tests pour in to projects
2. more projects have multiple API versions
3. more projects become Integrated
4. Tempest moves towards a real acceptance test for OpenStack

it is unlikely that "gate all projects on the full test suite of every other project" is sustainable. We should be able to come up with a better strategy to decide which parts of Tempest should be gating and make gating test coverage more modular while minimizing the risk of regression.

(Session proposed by David Kranz)


Tuesday April 16, 2013 11:50am - 12:30pm
B110

11:50am

VPN-as-a-Service
As Quantum is now gearing towards a multi-plugin support approach, one of the service types is VPN. In this session we will discuss how VPNs can be configured, provisioned and managed as a service through Quantum. This is part 1 of 2.

(Session proposed by Mark McClain)


Tuesday April 16, 2013 11:50am - 12:30pm
B114

1:50pm

API Extension Mini Sessions
In this session we will discuss potential new APIs extensions to be implemented in Havana.

This session will include the following subject(s):

QoS API for OpenStack Networking:

A blueprint has been filed:
https://blueprints.launchpad.net/quantum/+spec/quantum-qos-api

QoS features can be complex. There are standards and there are vendor-specific bells and whistles. This session will bring interested parties together to discuss what we would like to see in the initial common QoS API for OpenStack Networking, and how extensions can be handled.

(Session proposed by Henry Gessau)

Port Isolation in Networks:

It would be interesting to propose an option on network creation that enables isolation between ports in the same broadcast domain (network), similar to a common use of private VLANs with isolated-port technologies (RFC 5517).

(Session proposed by Édouard Thuleau)


Tuesday April 16, 2013 1:50pm - 2:30pm
B114

1:50pm

Group scheduling
This session will include the following subject(s):

Placement groups in Nova:

EC2 recently introduced Placement Group APIs to ensure proximity of HPC instances within a group (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html).
The proposal is to introduce similar APIs in Nova, potentially also supporting additional placement policies/strategies, such as anti-affinity for availability purposes. As a next step, the goal is to refine the notion of proximity/anti-affinity to be able to reflect different levels of "proximity"/"isolation"/etc. (e.g., "different racks"). As a result, it will be possible to reflect membership of instances in groups in a more efficient way (compared to using metadata today), and to drive placement decisions by the scheduler in a more flexible and intelligent way.
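
For reference, this is roughly how EC2 placement groups are used today via boto; the region, group name, AMI and instance type are placeholders, and the Nova equivalent is exactly what this session would define.

    # Illustrative: EC2 placement groups via boto (values are placeholders).
    import boto.ec2

    conn = boto.ec2.connect_to_region("us-east-1")
    conn.create_placement_group("hpc-group", strategy="cluster")
    conn.run_instances("ami-12345678",
                       instance_type="cc2.8xlarge",
                       placement_group="hpc-group",  # keep instances close together
                       min_count=4, max_count=4)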

(Session proposed by Alex Glikson)

VM Ensembles (Group Scheduling) - API:

The session will propose and discuss the notion of group scheduling. An API will be proposed to enable the user to perform the scheduling of a group. The group scheduling will enable the tenant to benefit from filters that provide anti affinity and network proximity to provide elevated service levels in terms of (high) availability and performance to applications that run in the cloud.


(Session proposed by Gary Kotton)


Speakers

Alex Glikson

Manager, Cloud Operating System Technologies, IBM Research
Mr Alex Glikson is leading a research group at IBM Haifa Research Lab working on research and development of innovative cloud infrastructure management technologies - focusing on OpenStack (http://www.research.ibm.com/haifa/dept/stt/cloud_sys.shtml). In this role Alex closely col... Read More →

Gary Kotton

Staff Engineer at VMware, VMware
Gary is a core Neutron developer working at VMware who also spends a lot of his time these days writing + reviewing code for Nova. Prior to working at VMware Gary worked at Red Hat, Radware and at Algorithmic Research. Gary holds a Bs.C in Mathematics and Computer Science from the... Read More →


Tuesday April 16, 2013 1:50pm - 2:30pm
B113

1:50pm

Multi-node Openstack Testing
This session will include the following subject(s):

OpenStack in OpenStack:

In this presentation, we discuss how to deploy a multi-node OpenStack environment inside a typical OpenStack environment, where you usually get a VM with an internal IP and optionally a public floating IP via NAT. Such a virtual deployment option is desirable for developing and testing new features that require a multi-node setup, such as networking, high availability, live upgrades, etc.

We demonstrate a few key enabling technologies:

* nested server virtualization;
* software-defined networking and policy routing for virtual private and public layer-2 networks;
* chef for automatic configuration and deployment.


(Session proposed by Yun Mao)

multi-node openstack testing:

Discuss how the CI system and the Gate can deploy and test multi-node setups.

Specifically, the TripleO team has been working on a heat-based installation of OpenStack that is also hopefully going to be referenced/used as part of the refstack work. This work allows us to do multi-node deploys without needing to pull in non-OpenStack deployment technology that may be contentious such as puppet, chef, juju or maas. Additionally, work has already started on integrating this work with OpenStack CI.

Talk about the approach in general, and what the status of CI integration is.

(Session proposed by Monty Taylor)


Tuesday April 16, 2013 1:50pm - 2:30pm
B110

1:50pm

OpenStack Networking/Horizon integration
We did lots of great work with the OpenStack Networking team in Grizzly, and there's lots more to be done. Late-breaking features need polish and the area of "rich network topologies" is an exciting UX challenge. Let's see where else we can take things in the Havana cycle.

(Session proposed by Gabriel Hurley)


Tuesday April 16, 2013 1:50pm - 2:30pm
B116

1:50pm

RPC API review
The RPC API is one of the next APIs we propose to release as a new standalone library in the Oslo family.

Once released standalone, we will be making a firm commitment to the stability of the API. This session will review the current API and discuss areas where we're not comfortable about our ability to maintain the API into the future.

(Session proposed by Mark McLoughlin)


Tuesday April 16, 2013 1:50pm - 2:30pm
B119

2:40pm

Firewall as a Service
Quantum now has the ability to load multiple service plugins. Firewall features could be managed and exposed via a Firewall service plugin (similar to the LBaaS service plugin).

Discussion topics:
- Defining the resource abstractions and CRUD operations
- SQLAlchemy data model
- Backend "fake" driver for testing

(Session proposed by Sumit Naiksatam)


Tuesday April 16, 2013 2:40pm - 3:20pm
B114

2:40pm

FITS testing of public clouds
We run tempest against our code on commit - but it's been suggested/requested that we configure something to be able to run tempest against existing public clouds. There are some logistical issues to consider - such as when to run (every commit to tempest?) and what to do with the results (publish them? tweet about them? carrier pigeon?)

(Session proposed by Monty Taylor)


Tuesday April 16, 2013 2:40pm - 3:20pm
B110

2:40pm

Heat GUI integration into Horizon
The Thermal software was created for Horizon at the start of Grizzly. The purpose of this session is to discuss what Heat needs out of a GUI and what the Horizon community would like to see out of such an integration, and to develop a plan for integrating the Thermal functionality into Horizon.

Please see this 6 minute screencast:
http://fedorapeople.org/~radez/thermal20121205.ogv

(Session proposed by Steven Dake)


Tuesday April 16, 2013 2:40pm - 3:20pm
B116

2:40pm

Nova Compute Cells Status
This session will include the following subject(s):

Nova Compute Cells Status:

In this session, I plan to discuss the current status of cells in trunk. Now that a basic working version has landed, I think it would be useful to refresh people on what functionality trunk provides. There are a number of features that cells does not support. I plan to discuss those and other TODO items.


(Session proposed by Chris Behrens)

Cell support in Ceilometer:

We need to review the changes we may need to make in Ceilometer to support Nova Cells.


(Session proposed by Nick Barcet)


Tuesday April 16, 2013 2:40pm - 3:20pm
B113

2:40pm

ZeroMQ RPC for Ceilometer and OpenStack Networking
The ZeroMQ RPC driver was initially developed for Nova and its actor and scheduler driven design. Since moving into Oslo, new projects have begun using the RPC code for different messaging patterns. Some of those patterns are particularly un-actor-like, which poses challenges for the ZeroMQ driver. I've been attacking these challenges head-on and we've made good progress in Grizzly. I would like to discuss what challenges remain, identify new and upcoming challenges for Havana, set expectations, and identify gaps where new blueprints may be necessary.

(Session proposed by Eric Windisch)


Tuesday April 16, 2013 2:40pm - 3:20pm
B119

3:40pm

Bare Metal Networking Support
This session will cover two aspects of improving bare metal support within Quantum: Additional VLAN support and PXE DHCP.

This session will include the following subject(s):

bare metal HA PXE dhcp-agent support:

Bare metal machines need PXE fields set in DHCP to provision properly. We currently have someone working on that, but only at the minimum set of functionality - we hope to have that landed before the summit.

However, to be production ready, having active-active DHCP agents would be much better.

In this session we will work through all the required changes to run quantum-dhcp-agent on multiple nodes (for the same network) concurrently.

The linked blueprint is for the initial work.
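
For illustration, the PXE-related DHCP fields mentioned above are the kind of thing expressed in a dnsmasq configuration fragment like the following; the boot file name, TFTP root and server address are placeholder values.

    # Illustrative dnsmasq fragment for PXE-booting bare metal nodes
    # (values are placeholders).
    enable-tftp
    tftp-root=/tftpboot
    dhcp-boot=pxelinux.0,tftpserver,192.0.2.10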

(Session proposed by Robert Collins)

Bare Metal VLAN network support:

This session will cover several aspects of bare metal and Quantum.

VLAN network:

With bare metal nova we cannot add or remove ethernet cards in response to user network requests. We can however give nodes access to additional L2 networks using VLANs.

For TripleO, running OpenStack on OpenStack, we want VLANs as well, to partition operator and tenant traffic more thoroughly.

In both cases we need to be able to hand information to the instance which will be booting and have only native-VLAN access [but that is sufficient to access the metadata service].

In this session we will discuss the available options and decide on the route forward.

HA PXE dhcp-agent:
Bare metal machines need PXE fields set in DHCP to provision properly. We currently have someone working on that, but only at the minimum set of functionality - we hope to have that landed before the summit.

However, to be production ready, having active-active DHCP agents would be much better.


(Session proposed by Robert Collins)


Tuesday April 16, 2013 3:40pm - 4:20pm
B114

3:40pm

Ceilometer/Horizon Integration
A discussion of the best ways to present data from Ceilometer in Horizon. How can we best empower admins and end users with more understanding of what's happening in the system and better visualizations of that information?

(Session proposed by Gabriel Hurley)


Tuesday April 16, 2013 3:40pm - 4:20pm
B116

3:40pm

Gating/Validation of OpenStack Deployments
Even though projects are being gated with Tempest, we've found that there are a number of defects that do not appear until OpenStack components are deployed in a larger, more realistic test environment. We'd like to discuss the level of rigor we've defined to gate our deployments and see how we can push that work upstream.

(Session proposed by Daryl Walleck)


Tuesday April 16, 2013 3:40pm - 4:20pm
B110

3:40pm

Message queue access control
The AMQP server provides the message bus for OpenStack, so its security significantly affects the security of OpenStack overall. A lot of effort has been spent on authentication of the sender/recipient and on confidentiality/integrity protection of the messages. However, a compromised Nova component, e.g. a hypervisor, can pass authentication as normal (before the compromise is detected and corrected) and send malicious but legitimate messages to the bus, and hence mess up the OpenStack system. Fine-grained access control, throttling of messages, etc. for authenticated AMQP clients is needed to counter this.

One way to do that is to implement the access control, authorization, throttling, etc. in the Nova code, but this implementation would be duplicated everywhere AMQP messages are examined and/or consumed.

This session proposes implementing access control with flexible authorization based on roles and other metrics, message authenticity, throttling/rate-limiting, etc. at the AMQP level, either via an AMQP proxy or as a plugin to an AMQP server. It can also help with access control in multi-cluster scenarios.

If accepted, a 45-minute talk will be prepared or a brainstorming session will be conducted to outline and discuss the details of how it works. Note that it's not just for OpenStack; any system that uses AMQP as a message bus can leverage the capabilities provided here.

About the author: Jiangang Zhang, a.k.a. JZ, is a veteran in architecting and managing the whole development lifecycle of highly scalable, highly available and highly performant software and systems, and in practicing pretty much all aspects of information security; he is currently a Distinguished Architect at Yahoo. JZ can be reached via jz@yahoo-inc.com (business) or jgzhang@hotmail.com (personal).


(Session proposed by Jiangang JZ)


Tuesday April 16, 2013 3:40pm - 4:20pm
B119

3:40pm

The future of state management in nova
This could likely be joined with http://summit.openstack.org/cfp/details/72, but I would like to focus more on the prototype that NTT and I have been working on, show the code and the architecture, and discuss how to go from here to getting this code fully used throughout Nova.

Talking points:
* Current architecture and thoughts on the current and future architecture of state management and how the future provides more benefits than the current status-quo.
* What has been prototyped and how far it has gotten.
* State transitions & rollbacks.
* Some fundamental changes that need to be applied...
* How this relates to conductor effort? Does it?
* What name to call this effort (people seem to get confused by anything called orchestration)?
* Where this can go from here and other feedback...
* State management in other components.
* How to get from here to there.
* How to help out!

Hopefully the code, or even a demo of the prototype, can be shown, depending on the time allocated. Mainly I would like to ensure the prototyped design is discussed and its benefits shown, get everyone on board that this is the desired future way of doing things, and get people involved in fixing up nova to make it be all it can be.

(Session proposed by Joshua Harlow)


Tuesday April 16, 2013 3:40pm - 4:20pm
B113

4:30pm

Are we ready for real-time data in Horizon?
Between Ceilometer and the Oslo common message bus, we're getting to the point where real-time data streams are looking feasible. Past summits laid out plans for what we'd like to be able to do, but I'd like to deep-dive on what we can realistically accomplish in Havana with the tools now becoming available.

(Session proposed by Gabriel Hurley)


Tuesday April 16, 2013 4:30pm - 5:10pm
B116

4:30pm

Nova Scheduler as a Ceilometer client?
This should be the continuation of a discussion with Vishvananda about how the nova scheduler could use ceilometer as the source of its measurements.

(Session proposed by Nick Barcet)


Tuesday April 16, 2013 4:30pm - 5:10pm
B113

4:30pm

RPC Message Signing and Encryption
With message signing on the horizon to provide message authenticity, it is expected that many will desire the next step: encrypted messaging to provide confidentiality. It is not expected we can have confidentiality in Havana, but we will need to plan for it in our changes to the rpc envelope, matchmaker, etc.

With message envelopes introduced in Grizzly and to be enabled with Havana, we have the ability to make the envelope immutable. I am proposing a blueprint to work toward immutable messaging and would like to discuss the challenges of achieving this goal and the requirements for the next version of the RPC envelope.
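
As a rough illustration only (this is not the actual oslo rpc envelope format), a signed envelope built around a pre-shared per-sender key might look like:

import hashlib
import hmac
import json

SHARED_KEY = b"per-host-secret"   # assumption: distributed out of band


def seal(payload, sender):
    """Wrap an RPC payload in a signed envelope; any later change breaks the MAC."""
    body = json.dumps(payload, sort_keys=True)
    sig = hmac.new(SHARED_KEY, (sender + body).encode(), hashlib.sha256).hexdigest()
    return {"version": "2.0", "sender": sender, "body": body, "signature": sig}


def verify(envelope):
    """Recompute the MAC and reject tampered envelopes."""
    expected = hmac.new(SHARED_KEY,
                        (envelope["sender"] + envelope["body"]).encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["signature"]):
        raise ValueError("envelope signature mismatch")
    return json.loads(envelope["body"])

Treating the signed body as an opaque, immutable string is what makes the envelope itself immutable; encryption would later wrap that same body.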

(Session proposed by Eric Windisch)


Tuesday April 16, 2013 4:30pm - 5:10pm
B119

4:30pm

SDN controller improvement
Quite a few limitations in the current implementation prevent OVS or Linux Bridge from being used in large scale deployments. The goals of this session are to:
* present the results of a first analysis done by CloudWatt for their deployment
* propose a few solutions to solve those issues using various techniques
* collect other ideas and suggestions from the community.

This session will be led by Edouard Thuleau (CloudWatt) helped by Nick Barcet (eNovance).

(Session proposed by Nick Barcet)


Tuesday April 16, 2013 4:30pm - 5:10pm
B114

4:30pm

Upgrade testing and Grenade
Grenade is now running in a non-voting mode for some projects. It is time to firm up the plans for how to use it for gating and the process for tracking failures.

(Session proposed by Dean Troyer)


Tuesday April 16, 2013 4:30pm - 5:10pm
B110

5:20pm

Beyond the API - End to End Testing of OpenStack
While we've done a good job of testing the APIs with the Tempest project, the majority of the defects have to do with the actual artifacts generated by API requests (the servers, their networking, etc). I'd like to discuss what a reasonable and achievable goal of extending our test coverage beyond the API would look like.

(Session proposed by Daryl Walleck)


Tuesday April 16, 2013 5:20pm - 6:00pm
B110

5:20pm

BoF: Building on Horizon
Extensibility is a key design tenet of the Horizon project, but just how easy is it to build on and customize?

This session will aim to gather those who consume and develop Horizon for a discussion on pain points and where necessary flexibility in the architecture is missing.

(Session proposed by Cody A.W. Somerville)


Tuesday April 16, 2013 5:20pm - 6:00pm
B116

5:20pm

Hardware Driver interface for OVS
This blueprint describes a generic hardware driver interface within the OpenStack Networking plugins which will enable support for different hardware backends for L2 network segregation (VLANs, tunneling, etc.). This API is available as a common driver library under quantum/common/hardware_driver and may be used by any OpenStack Networking plugin. Currently we have modified only the popular OVSPlugin to use this driver API.

This may be useful for existing data centers with hardware switches which need to be used along with the OpenStack infrastructure. In this case a hardware vendor may introduce a hardware driver which conforms to this driver API, allowing the vendor's hardware to be used within OpenStack along with Open vSwitch virtual switches to provide L2 network segregation.

This will allow automatic L2 network provisioning on the hardware devices alongside the Open vSwitch provisioning in the compute node hypervisors.

We have implemented the driver API proposed in this blueprint and are providing the source code for an Arista driver which supports provisioning of Arista TOR (top-of-the-rack) switches alongside Open vSwitch.
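
A minimal sketch of what such a driver interface might look like (class and method names here are illustrative, not the actual quantum/common/hardware_driver API):

class HardwareDriver(object):
    """Base class an L2 hardware back-end driver could implement."""

    def create_network(self, tenant_id, network_id, vlan_id):
        """Provision an L2 segment (e.g. a VLAN) on the physical switch."""
        raise NotImplementedError()

    def delete_network(self, tenant_id, network_id):
        """Remove the L2 segment from the physical switch."""
        raise NotImplementedError()

    def plug_host(self, network_id, host_id):
        """Trunk the segment to the switch port facing a compute host."""
        raise NotImplementedError()


class AristaDriver(HardwareDriver):
    def create_network(self, tenant_id, network_id, vlan_id):
        # Would call out to the TOR switch's management API here.
        pass

The plugin would call the configured driver at the same points where it programs Open vSwitch, so virtual and physical provisioning stay in step.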

(Session proposed by Satish Mohan)


Tuesday April 16, 2013 5:20pm - 6:00pm
B114

5:20pm

Unified Event Notifications
We have:
1. The AMQP notifications,
2. The DB InstanceFault table,
3. Exception handling decorators,
4. The conductor mechanisms you've highlighted,
5. and, to a lesser extent, logging

Can we unify some or all of these efforts to clean up the code?

(Session proposed by Sandy Walsh)


Tuesday April 16, 2013 5:20pm - 6:00pm
B113

5:20pm

Zipkin tracing in OpenStack
At y! we have been working on integrating Zipkin into OpenStack to get a 'live' tracing mechanism hooked into the various OpenStack components.

This session would cover how we did this, what the benefits are, and how it can be expanded in the future to provide more in-depth tracing with more context, something sorely lacking in OpenStack.

We'd like to share this code with others and let others see the potential of such a system and also be able to use it for themselves...

See: http://engineering.twitter.com/2012/06/distributed-systems-tracing-with-zipkin.html
See: http://research.google.com/pubs/pub36356.html

If time permits a demo would be applicable, showing zipkin live tracing nova/keystone calls (or a subset of).

(Session proposed by Joshua Harlow)


Tuesday April 16, 2013 5:20pm - 6:00pm
B119
 
Wednesday, April 17
 

11:00am

Adding new integrated projects to Tempest
The Ceilometer and Heat projects are now integrated projects, and need to be included in the Tempest test suite. We need guidance on how to approach that (what it means, what has to happen, how to avoid pitfalls, whatever). Share your wisdom with us!

https://etherpad.openstack.org/havana-adding-projects-to-tempest

(Session proposed by Doug Hellmann)


Wednesday April 17, 2013 11:00am - 11:40am
B110

11:00am

Getting Glance Ready for Public Clouds
Currently Glance is exposed to users through Nova; this is becoming a problem because new Glance features require a Nova extension. It would be better to have Glance as a first-class member of the OpenStack ecosystem. But in order for this to happen, we (as in OpenStack cloud providers) would need at least:
- more robust user roles to allow per-user:
  - quotas
  - (anything else?)
- protected image properties
- image-related restrictions
  - e.g., there may be contractual reasons why you wouldn't want to allow download of specific images based not on the user, but on the image itself; might be the case for other actions
- other API changes from increased load
Protected properties are scheduled for Havana; there is a blueprint but no details yet.
There are currently blueprints for rate limits, but an alternative approach would be to think that rate limiting should be done in front of Glance by Repose or a similar system that understands Keystone.

(Session proposed by Iccha Sethi)


Wednesday April 17, 2013 11:00am - 11:40am
B116

11:00am

i18n strategy for OpenStack services
This session will start with a quick overview of the current approach that OpenStack services take to i18n and some of the challenges faced.

We will then step back and look at the bigger picture - OpenStack services currently do immediate translation of messages to the local server locale, yet there are two use cases for the translation of messages from OpenStack services:

1) As an OpenStack technical support provider, I need to get log messages in my locale so that I can debug and troubleshoot problems.

2) As an OpenStack API user, I want responses in my locale so that I can interpret the responses.

If we want to translate log messages (i.e. use case 1), they should be in a separate translation domain from the messages we return to users of the REST API (i.e. use case 2). The problem with having them both in the same translation domain is that translators have no way of prioritizing the REST API messages nor do administrators have any way of disabling the translation of log messages without the translation of the REST API messages.

Another tactic that may help is to delay the translation of messages by creating a new Oslo object that saves away the original text and injected information to be translated at output time. When these messages reach an output boundary, they can be translated into the server locale to mirror today's behavior or to a locale determined by the output mechanism (e.g. log handler or HTTP response writer).
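
A minimal sketch of such a delayed-translation object (not the actual Oslo implementation) might look like:

import gettext


class Message(object):
    """Store the original msgid and parameters; translate only at output time."""

    def __init__(self, msgid, domain="nova"):
        self.msgid = msgid
        self.domain = domain
        self.params = None

    def __mod__(self, params):
        # Capture the injected values instead of interpolating immediately.
        self.params = params
        return self

    def translate(self, locale):
        t = gettext.translation(self.domain, languages=[locale], fallback=True)
        text = t.gettext(self.msgid)
        return text % self.params if self.params is not None else text

A log handler could then call translate() with the server locale, while an HTTP response writer could use the locale from the request's Accept-Language header.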

As part of this session we will look at some of the difficulties encountered in the implementation of delayed messages: use of gettext.install(…) to install the _() function into Python's __builtin__ namespace (also known as "domain change issue") and the expectation that _() is returning a string.

(Session proposed by Mark McLoughlin)


Wednesday April 17, 2013 11:00am - 11:40am
B119

11:00am

Modular L2 and L3
In this session we will look at modularization efforts for both L2 and L3 services.

This session will include the following subject(s):

L3 API modularization:

The purpose is to discuss issues related to L3 routing provided by separate service plugins as opposed to being integrated in core plugins. Even though L3 routing is the focus, this will have relevance to Quantum SI in general.

Things to cover could be API-related: if L3 resources become part of core, how will Quantum support those being provided by something other than core plugins? Also, would there be a benefit in breaking apart some of the API/resources today bundled together as "L3" (e.g. NAT)? Another thing to cover could be handling of state dependencies between plugins (one plugin has state dependent on a resource handled by another plugin, floating IP being one example); can Quantum support this in a generic way?

A service plugin for L3 routing (but also other network services implemented as part of Quantum's SI framework) could be implemented by relying on Nova-managed VMs. Connecting and disconnecting such service VMs to and from different subnets/networks could be simplified if Quantum supported something analogous to VLAN trunks on physical switches. This has been proposed in a blueprint. But how should that be represented/implemented in Quantum: by extending ports or as a new resource?

(Session proposed by Bob Melander)

Modular L2 Plugin - Design, Status and Future:

A modular layer 2 plugin (ml2) was proposed and discussed during the grizzly design summit. In contrast to existing monolithic plugins, ml2 uses drivers to simultaneously support an extensible set of network types and an extensible set of mechanisms for accessing networks of those types.
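
As a rough illustration of that split (interface names and signatures are illustrative, not the actual ml2 code):

class TypeDriver(object):
    """Knows how a segment of one network type (VLAN, GRE, ...) is allocated."""

    def allocate_segment(self, session):
        """Reserve a segmentation id, e.g. a VLAN tag or a tunnel key."""
        raise NotImplementedError()


class MechanismDriver(object):
    """Knows how one back-end (agent, controller, switch) realizes a segment."""

    def create_network(self, context, segment):
        """Apply the segment on this mechanism's back-end."""
        raise NotImplementedError()

Because the two driver sets are orthogonal, a single ml2 deployment could mix, say, VLAN and tunnel segments with both agent-based and controller-based mechanisms.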

During grizzly, an initial design for ml2 was created, implementation was begun, and a work-in-progress patch set was reviewed. Meanwhile there has been growing community interest in achieving the goals that the ml2 plugin attempts to address, such as:

* better support for mixing heterogeneous networking technologies
* supporting complex network topologies (i.e. multi-segment L2 networks)
* reducing the amount of code that needs to be written and maintained for each supported networking technology

This session will start with a brief overview of the ml2 plugin's initial goals, its current design, and its development status. This overview will be followed by open discussion of its future in havana and beyond. Possibilities include:

* replacing the existing non-controller plugins (linuxbridge, openvswitch, hyperv) in havana, using their existing L2 agents
* replacing the existing L2 agents with a driver-based modular L2 agent
* supporting new networking technologies in havana via drivers rather than adding new monolithic plugins
* replacing existing controller-based monolithic plugins with ml2 drivers
* replacing the quantum core monolithic plugin interface with a set of more-modular driver interfaces
* extending the quantum API to control physical network topology and other deployment details currently handled via configuration files
* addressing orchestration of higher-level activities that cross the various networking layers and the available mechanisms at each layer

(Session proposed by Robert Kukura)


Wednesday April 17, 2013 11:00am - 11:40am
B114

11:00am

Richer and more flexible APIs for block devices
A number of blueprints relate to block device configuration and controlling Cinder volumes. They not only propose enhancements but also raise questions that suggest a deep rework of the internal API between the compute manager and compute drivers.

This session would be about

* what should be reworked in the internal API between compute manager and compute drivers
* explaining how the blueprints all fit together
* defining what can be achieved for Havana
* what should be next although it's unrealistic for Havana

The API currently implements an abstraction (inherited from EC2) that does not map to the abstraction of the virtualisation layer. The external API should be modified first and then it will be easier to rework the internals. For backward compatibility, the old API would be preserved.

The virtualization abstraction follows the EC2 API exactly (a hash from the EC2 abstraction is passed to the virtualization driver, which requires guessing from the driver). The virtualization abstraction needs to be modified to be able to take advantage of a better API.

Josh Durgin and Nikola Đipanov are implementing the blueprints.


(Session proposed by Loic Dachary)


Wednesday April 17, 2013 11:00am - 11:40am
B113

11:50am

Common XenAPI library
Nova and Cinder now both have code to access XenAPI that builds on the standard XenAPI library. They can make assumptions about the way OpenStack uses xenapi.

It would be good to have this code shared in a managed way between the projects. Something like oslo-xenapi seems like one possible good place. The code that has been added into Cinder is probably a good place to start when bringing together the code in both Nova and Cinder.

(Session proposed by John Garbutt)


Wednesday April 17, 2013 11:50am - 12:30pm
B119

11:50am

Image Interchange
We (as in OpenStack) want to allow users to build in various clouds, but in order to do this we need to provide image conversion tools (or organize existing tools to make it easier on users). We'd like to get a community consensus on a starting point for this. Key issues are:
- what formats to support
- what format(s) for transfer
- where the conversion will happen:
  - offline (on user side) before upload
  - in flight
  - on download
  - offline (on cloud side, background job)
- interactions with image caching, snapshots and backups
- where the code to do this should live:
  - in Glance
  - in another service
  - as a toolset
In addition to the image format problem, there's the further problem of the extra software (e.g., cloud-init, Xen agent, drivers) necessary for good VM performance. We need some discussion on whether/how this can be injected into the image, or what the best way to do this is.

(Session proposed by Alex Meade)


Wednesday April 17, 2013 11:50am - 12:30pm
B116

11:50am

Nova Updates for Disk Encryption
There are two blueprints for doing disk encryption with Nova: 1)
Encryption of Attached Cinder Volumes (has a working prototype) and 2)
Encryption of Local Ephemeral Disks. This session presents the
architecture of our prototype implementation to solicit feedback from
the community and to discuss open questions regarding the current
implementation prior to its submission for Havana. Consensus from the
community is also desired regarding how to extend the existing prototype to support encrypted ephemeral storage.

(Session proposed by Laura Glendenning)


Wednesday April 17, 2013 11:50am - 12:30pm
B113

11:50am

PNI&VNI Pluggable architecture
In this session we will propose a new pluggable architecture model for the OpenStack Networking back-end, where the Physical Networking Infrastructure (PNI) and Virtual Networking Infrastructure (VNI) are managed by different types of plugins. These plugins will be technology-specific and their scope is limited to the capabilities of the domain where they belong. OpenStack Networking users will decide which PNI and/or VNI plugin to include in their deployment; OpenStack Networking should allow one plugin per type, though including multiple plugins could be discussed.



(Session proposed by Edgar Magana Perdomo)


Wednesday April 17, 2013 11:50am - 12:30pm
B114

11:50am

testr / testtools feedback/next-steps
A number of projects have adopted testrepository / testtools at this point. ODS seems like a good point to take stock, gather pain points and plan how to address them.

(Session proposed by Robert Collins)


Wednesday April 17, 2013 11:50am - 12:30pm
B110

1:50pm

Calling all Interns and Mentors: Ideas to improve
We know there's always a time crunch, but we want to find good ways to bring in newcomers like interns during the time frame of their availability as students and the open windows for coding. There's a lot of needed bug triage in all the projects; would interns be interested in fixing bugs for the entire internship? How about interns for QA, docs, or infrastructure: is there a need, and does that fit better with the timeframes we have? Let's discuss ways to improve our internship programs across multiple projects. We'll bring our experiences from the GNOME Outreach Program for Women. We can also brainstorm ways to strengthen our Google Summer of Code application.

(Session proposed by Anne Gentle)


Wednesday April 17, 2013 1:50pm - 2:30pm
B119

1:50pm

Image Cloning to Other Regions
Many Cloud providers have clouds in segregated "regions". AWS just announced the ability for customers to copy images to other regions -- we want to implement the same capability in OpenStack. One difference would be that we'd like to have the same UUID in every region (because the "bits" of the image would be the same).

It would make sense for this to be a Swift-to-Swift transfer, so we'd need to cooperate with Swift on this. (Swift doesn't currently have anything like this for individual files; it's currently a full-container transfer.) The reason we're proposing it as a Glance topic is:
- it would make sense for Glance to be the endpoint for this service (once Glance is ready for exposure in public clouds)
- we want this to be a user operation, not an admin operation
- it would make sense that people will want some kind of metadata sync (determining exactly what this would be is part of the focus of this session)
- Glance may need some enhancements with respect to API calls and notifications to support this

(Session proposed by nikhil komawar)


Wednesday April 17, 2013 1:50pm - 2:30pm
B116

1:50pm

OpenStack Networking and Ceilometer
In this session, we will discuss how OpenStack Networking and Ceilometer can work together to better meter certain aspects of an OpenStack deployment.

(Session proposed by Mark McClain)


Wednesday April 17, 2013 1:50pm - 2:30pm
B114

1:50pm

Roadmap for Ceph integration with OpenStack
The content of this session is collaboratively edited at https://etherpad.openstack.org/roadmap-for-ceph-integration-with-openstack

Although Ceph is already integrated with OpenStack for object and block storage, it can be improved for an easier and more flexible configuration, authentication, metering, monitoring, security, placement, encryption etc.
For instance, volume encryption ( https://blueprints.launchpad.net/nova/+spec/encrypt-cinder-volumes ) and the associated blueprint ( https://wiki.openstack.org/wiki/VolumeEncryption ) should probably be aligned with the experimental dmcrypt support for Ceph ( http://marc.info/?l=ceph-devel&m=136089346527439&w=4 ).
The OpenStack components either have intimate knowledge of Ceph (nova, cinder, ...) or are loosely coupled with it (ceilometer, ...).
Tentative agenda:
* The state of Ceph integration in:
  * Nova
  * Cinder
  * Ceilometer
  * ...
* High level overview of the existing blueprints related to Ceph.
* Overview of the current state of the Ceph integration with OpenStack
* Presentation of the blueprints that are related to Ceph
* What should be in Havana?
* What should be in I*?
* Who can commit to what?
Related blueprints
* https://blueprints.launchpad.net/nova/+spec/encrypt-cinder-volumes
* https://blueprints.launchpad.net/nova/+spec/flexible-block-device-config
* https://blueprints.launchpad.net/nova/+spec/block-mapping-model
* https://blueprints.launchpad.net/nova/+spec/improve-block-device-handling
* https://blueprints.launchpad.net/nova/+spec/improve-boot-from-volume
* RBD backups to secondary clusters within Openstack
** Geo-replication with RADOS GW http://marc.info/?l=ceph-devel&m=135939566407623&w=4
** Geographic DR for RGW http://marc.info/?l=ceph-devel&m=136191479931880&w=4


(Session proposed by Loic Dachary)


Wednesday April 17, 2013 1:50pm - 2:30pm
B113

1:50pm

Tempest Best Practice Guide
Problem:
* we seem to have conflicting reviewer opinions on what a good patch is. We should try to consolidate on a single culture for good contributions, as it will confuse new contributors less.
* personally, I would like to formalize that we're starting from the Nova guidelines and moving on from there.

purpose of summit session:
* get agreement on HACKING rules that we want to enforce
* figure out what additional rules / collateral / documentation we need to help new contributors onboard
* come up with guidelines on what an ideal test looks like
* what refactorings we should do early in Havana to get closer to ideal
* what good changesets look like (agreement on scope and patch series)


(Session proposed by Sean Dague)


Wednesday April 17, 2013 1:50pm - 2:30pm
B110

2:40pm

Firewall-as-a-Service.
Quantum now has the ability to load multiple service plugins. Firewall features could be managed and exposed via a Firewall service plugin (similar to LBaaS service plugin). This is a follow up session from the prior day to discuss the path to implementing FWaaS in Havana.


(Session proposed by Aaron Rosen)


Wednesday April 17, 2013 2:40pm - 3:20pm
B114

2:40pm

Image Performance
A fair number of proposals have been floated for increasing image download/upload performance by giving Nova direct access to the underlying image storage for Glance. In this session we will discuss the right way for Glance to enable these kinds of image access and concrete improvements that can be made in Havana.

One particularly hairy issue is exposing underlying image locations, which sometimes contain sensitive information that cannot be revealed to end users of Glance.

Additional possible performance-boosting topics:
* booting from volumes-as-images
* Image diffs
* Data transfer service

(Session proposed by Mark Washenberger)


Wednesday April 17, 2013 2:40pm - 3:20pm
B116

2:40pm

Improve Host Maintenance and Migrate APIs
This session will include the following subject(s):

Improve Host Maintenance and Migrate APIs:

If there is a pending server failure (e.g. several disks have failed in a RAID array) then you want to evacuate the host to perform the maintenance.

If you are patching your hypervisor, you might want to suspend all the VMs onto the local disk, upgrade, then resume those VMs.

To do this manually, it would be good to list all VMs on a host. Ideally we would have some orchestration to help do the full operation.

A related issue is that migration and live-migration grew up independently. Let's come up with a plan that unifies migrate and live-migrate behind a single (extended) migrate API.

There is an Etherpad:
https://etherpad.openstack.org/HavanaUnifyMigrateAndLiveMigrate

(Session proposed by John Garbutt)

Streamlining mobility-related scenarios in Nova:

Today in Nova there are multiple scenarios involving VM instance mobility, each following different design approach (target selection, resource tracking, ability to verify success and/or undo, etc). As a first step, these operations must be refactored, to follow a single design approach, and to provide consistent and robust capabilities. Then, higher-level scenarios such as host maintenance, ongoing optimization, VM HA can be properly implemented, leveraging the improved individual mobility operations (also involving certain orchestration logic on top). The goal of this session would be to discuss and agree on these design principles, and outline a roadmap to make the necessary changes during Havana release cycle.

(Session proposed by Alex Glikson)


Wednesday April 17, 2013 2:40pm - 3:20pm
B113

2:40pm

Python3 in OpenStack
Eventually, OpenStack projects should support Python 3. We can't do this all in Havana, but we can get a start.

This session seeks to organize the efforts of the projects toward Python 3 compatibility and to identify how and where we can and should ask for help from the Foundation, the TC, and the library and infrastructure projects.

(Session proposed by Eric Windisch)


Wednesday April 17, 2013 2:40pm - 3:20pm
B119

2:40pm

Rackspace testing engine Case Study/Overview
RAX is open sourcing its internal testing framework used to test our deployed OpenStack instances. We would like to have a session to demo and discuss our approach, how it could be used in the community, etc.

(Session proposed by Sam Danes)


Wednesday April 17, 2013 2:40pm - 3:20pm
B110

3:40pm

Layer 3 Networks
OpenStack Networking networks are currently flat L2 domains - every machine can reach every other machine in one hop, the domain is a broadcast domain and OpenStack Networking's task is to ensure this holds true regardless of where VMs are placed. Limitations on L2 networks hold true - L2 networks are not great at scaling, which is why there's an 'inter' in 'internet', and joining these networks together involves various sorts of encapsulation.

We can add simple layer 3 networks to OpenStack Networking. The network is no longer flat, scales better, and has significant advantages for public-facing services, and standard internet routing holds sway with no encapsulation; it is a big win when you're willing to lose L2 and the features that depend upon it.

We will discuss how we could do this with a minimum of complexity, and why it's a good idea.

(Session proposed by Kyle Mestery)


Wednesday April 17, 2013 3:40pm - 4:20pm
B114

3:40pm

Mothball a server
This session will include the following subject(s):

Mothball a server:

Today when you stop a server, it remains on the hypervisor host.

Many users, with less cloud like workloads, want the ability to stop servers when they don't need them, but to retain IP addresses and current disk state (including ephemeral), so they can start it up at some time in the future.

From an operator perspective, this should take up minimal resources. Preferably, only storage space.

Lets look at how best to implement this within Havana. See Etherpad for discussions:
https://etherpad.openstack.org/HavanaMothballServer


(Session proposed by John Garbutt)

Make Nova Stop/ Start operations release physical :

The current Nova stop/start semantics leave an instance associated with the same host, so the scheduler has to keep the resources on that host allocated so that the instance can be re-started. From a service provider perspective this means that it's hard to offer any financial advantage for stopped instances, since they are in effect still consuming the same amount of logical resources.

It’s already noted in the code that the current stop/start should really be renamed power-off/power-on since that is what they actually do.

We would like to be able to exploit boot from volume so that users can "stop" an instance, preserving all of its network settings, metadata, instance_id, etc., but remove the consumption of any physical resources. On start, the instance would be re-scheduled to a new host.

The basic operations would be:
Stop: Power down the instance, and remove any local storage associated with the instance

Start: Reschedule the instance to a new host, and start the instance on that host as if it had been newly created (from a local disk perspective)

Restart: A combined Stop/Start command

Stop should always complete, even if the host the instance is currently running on is down.

Any data in volumes (including boot volumes) would be preserved. Any data in ephemeral disks would be lost.

It would seem logical to take this opportunity to rename the current stop/start to power-off/power-on and re-use the stop/start verbs for this operation.

(Session proposed by Phil Day)


Wednesday April 17, 2013 3:40pm - 4:20pm
B113

3:40pm

Preparation for Rolling DB migrations in Glance
Thanks to some refactoring during Grizzly, Glance is getting closer to supporting rolling db upgrades. But it will still take a lot of work.

Let's take a session to discuss what working support for db upgrades would look like in glance, and how we can move towards it during Havana.

(Session proposed by Mark Washenberger)


Wednesday April 17, 2013 3:40pm - 4:20pm
B116

3:40pm

Stable Branch
The Grizzly cycle was the third cycle where we maintained a stable branch for the previous release. We will look back over the stable/folsom maint efforts and identify successes and failures.

The stable-maint process still has some rough edges and so we will discuss ideas for improvements, ideally setting some specific goals for stable/grizzly maintenance.


(Session proposed by Mark McLoughlin)


Wednesday April 17, 2013 3:40pm - 4:20pm
B119

3:40pm

Tempest - Gap Analysis - Identify new tests to develop

1. Identify test gaps for all core services: Swift, Nova, Keystone, Cinder and Quantum.
2. From the gaps, identify new tests to be written to improve coverage.
3. Discuss as part of the design session, leading to new blueprints and blueprint ownership for the Havana release.

Etherpad: https://etherpad.openstack.org/havana-gap-analysis

(Session proposed by Ravikumar Venkatesan)


Wednesday April 17, 2013 3:40pm - 4:20pm
B110

4:30pm

Cinder API v2: a new hope for better validation
In this session I'll be presenting the state of the Cinder REST API v2 as of Grizzly. In addition, I'll be going over some small things that need to happen in Havana.

Duncan Thomas will then lead a discussion of the improvements that need to happen in Havana for better validation in the REST API. Overall, we need to accept the fact that users don't usually have access to the logs, so we need to present errors up front when possible.

(Session proposed by Mike Perez)


Wednesday April 17, 2013 4:30pm - 5:10pm
B110

4:30pm

Handling long running periodic tasks
nova-periodic caused a lot of discussion on the mailing list. I now have a proposed implementation, and I'd like to discuss with others whether we like the proposed approach, or have a better plan.

(Session proposed by Michael Still)


Wednesday April 17, 2013 4:30pm - 5:10pm
B113

4:30pm

Intro to Ceilometer Architecture
This session will provide a walk-through of the existing pieces of ceilometer as an introduction for new contributors and a refresher for the existing team, to serve as a basis for the rest of the discussions during the summit.

(Session proposed by Doug Hellmann)


Wednesday April 17, 2013 4:30pm - 5:10pm
B116

4:30pm

OpenStack Networking Mini Sessions
This session will include the following subject(s):

Network: Multi-VLAN Registration Protocol support:

MVRP is a standards-based layer-2 protocol for the automatic configuration of VLANs. With MVRP, VLANs can be configured at the end-port, or the network device connected to the switch port, and the VLAN trunks are dynamically created on MVRP-enabled switches. Support for MVRP gives OpenStack Networking a standards-based alternative to proprietary VLAN provisioning and configuration systems, and simplifies the work of configuring nova with VLAN networking.

(Session proposed by ChristopherMacGown)

SR-IOV NIC support :

Although generic SR-IOV support is mostly a Nova story, SR-IOV NICs have some specific features from a networking point of view; for example, how to keep network isolation when a VF NIC is assigned to a VM directly, and how to live-migrate a VM with a VF NIC. These deserve careful discussion at the Havana summit.

(Session proposed by jiang, yunhong)

Integration tests based on NaaS requirements:

co-author: Mahankali, Sridhar
co-author: Luo, Xuan

* requires one full session

The problem with Quantum tests in Tempest is that the test cases are tightly coupled to what the Quantum API provides. Therefore the test cases have no significant difference from unit tests other than not using stubbed objects.

In order to prove that Quantum and its plugins are production ready, we need a thorough collection of integration tests based on NaaS requirements.

I've gathered some NaaS features as a checklist in the document below.

https://docs.google.com/document/d/1y8RoCPoYMTT8l6oUgzIRBz8S_zDjfXYQY3v6IOc_CYM/edit?usp=sharing

I would like to go through the list with our members and discuss the test scheme and priority for each of the features.

(Session proposed by Zhongyue Luo)


Wednesday April 17, 2013 4:30pm - 5:10pm
B114

4:30pm

Vulnerability management: infra needs, scoring...
This session will be focused on improvements to the Vulnerability Management process.

In particular we'll review infrastructure plans to better support the Vulnerability Management Team (VMT) workflow (simplified and more reliable patch testing, patch review/approval...), as well as explore the option of rating the vulnerabilities (CVSS scoring...).

If there is time remaining, we'll look into other improvements we could make to the process.

(Session proposed by Thierry Carrez)


Wednesday April 17, 2013 4:30pm - 5:10pm
B119

5:20pm

Cinder Smart Shutdown
Cinder needs a nice way to gracefully shut down its services and queue up new tasks for the next restart. Once services come back up, any queued tasks should be kicked off. Any tasks already in progress when the shutdown is requested should be allowed to finish before the Cinder process terminates.

(Session proposed by Walt)


Wednesday April 17, 2013 5:20pm - 6:00pm
B110

5:20pm

Feedback from Ceilometer users
The goal of this session is to gather feedback from Ceilometer users. We should invite anyone who has deployed Ceilometer to join this session and quickly explain to us (5 min per user):
* their architecture
* their pains
* their successes
so that we can learn and improve.

Users can be admins, devops, dev, anyone that had to deploy or interface with Ceilometer.

(Session proposed by Nick Barcet)


Wednesday April 17, 2013 5:20pm - 6:00pm
B116

5:20pm

Nova API validation framework
Overview:
Nova has many RESTful APIs, and not all API parameters are completely validated.
To implement comprehensive validation, I'd like to propose a Nova API validation framework.
The benefits of this framework will be the following (a sketch appears after this list):
* Validate every API parameter.
* Unify the error message format of the response when the cause is the same.
ex) ".. is too short.", ".. is too long.", ".. is not an integer."
* Clarify the API parameter definitions.
* Clean up code by consolidating error handling methods.
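
A minimal sketch of one possible approach, assuming the jsonschema library and a hypothetical decorator rather than the framework's actual interface:

import functools

import jsonschema

# Illustrative schema for one API; every API would declare its own.
create_server_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 1, "maxLength": 255},
        "flavorRef": {"type": "string"},
    },
    "required": ["name", "flavorRef"],
    "additionalProperties": False,
}


def validated(schema):
    """Reject request bodies that do not match the schema, with a uniform message."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, req, body):
            try:
                jsonschema.validate(body, schema)
            except jsonschema.ValidationError as e:
                # A real controller would map this to an HTTP 400 response.
                raise ValueError("Invalid input: %s" % e.message)
            return func(self, req, body)
        return wrapper
    return decorator

Declaring the parameters as data rather than ad-hoc checks is what makes the error messages uniform and the definitions easy to read.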

Talking points:
I created a prototype for API validation framework, and found some points which need discussions.
I'd like to talk/discuss about the following in this session:
* Overview of API validation framework
* Plans for Havana release
** Current development status
** What are TODOs for Havana?
** Migration plans (how to apply the framework to all APIs smoothly)
   Need Nova community agreement because of the many implementations and reviews involved.
   Need a useful framework so each API's validation can be implemented step by step.
   Need some implementation rules so all APIs are covered by the Havana release.
* Next features of the framework
* And more

Related Summit Talk:
There was a similar session in Grizzly Design Summit, and the session contained good discussions for API validation:
* Refactoring API Input Validation:
https://etherpad.openstack.org/grizzly-nova-input-validation
In this session I'd like to discuss how to achieve validation implementations for all APIs based on the prototype.


(Session proposed by Ken'ichi Ohmichi)


Wednesday April 17, 2013 5:20pm - 6:00pm
B113

5:20pm

Technical Committee membership evolution
The Technical Committee is a representation of the technical contributors to the OpenStack project, tasked with solving cross-project issues, serving as an appeals board, and considering new projects for inclusion. The membership is currently set to all PTLs + 5 directly-elected members.

This session is a workshop to discuss the future evolution of this membership for the "I" cycle, with two goals in mind: avoid committee bloat as we grow the number of projects in OpenStack, and maintain good representation.

(Session proposed by Thierry Carrez)


Wednesday April 17, 2013 5:20pm - 6:00pm
B119

5:20pm

VPN-as-a-Service
As Quantum is now gearing towards a multi-plugin approach, one of the service types is VPN. In this session we will discuss how VPNs can be configured, provisioned and managed as a service through Quantum. This is the follow-up session to continue the previous day's discussion, with the goal of determining the work to be completed during Havana.

(Session proposed by Mark McClain)


Wednesday April 17, 2013 5:20pm - 6:00pm
B114
 
Thursday, April 18
 

9:00am

Cinder Update for Disk Encryption
There is a Nova blueprint called Encryption of Attached Cinder Volumes
that has a working prototype. Encrypted volumes add new challenges for
Cinder operations such as snapshots and cloning, particularly with
regard to key management. The goal of this session is to enumerate the
Cinder operations that must become "encryption-aware" in Havana and to
outline different implementation strategies to support these operations.

(Session proposed by Laura Glendenning)


Thursday April 18, 2013 9:00am - 9:40am
B110

9:00am

Continuous-deployment for upstream Openstack
Continuous deployment - the production version of the well-known continuous integration process - is very appealing to large deployers: it reduces the time to roll out security patches and reduces the risk of each production push.

However, CD isn't something that can be bolted on - like CI it requires the upstream code change process to support it (for instance, with CI the test suite has to be fast enough to run per-commit).

With CD, trunk has to be always deployable, services have to run stably with skewed versions, and DB migrations have to be extremely fast.

In this session I want to get the ball rolling on us stepping up to CD from our current CI system, for at least all the core projects.

I expect this to involve assessing how much interest there is from contributors, as well as technical discussion on the logistics and overheads of delivering a CD ready trunk.

(Session proposed by Robert Collins)


Thursday April 18, 2013 9:00am - 9:40am
B119

9:00am

Incremental improvement grab-bag
This session will include the following subject(s):

Incremental improvement grab-bag:

There are several incremental improvements that we should talk about, but that won't require a full hour session to discuss (I hope).



(Session proposed by Doug Hellmann)

Enable/Disable/Configure a pollster in runtime:

When using ceilometer for monitoring, users sometimes want to enable/disable at runtime pollsters which are only for testing/debugging purposes, without modifying the configuration file and restarting the agent.

Besides, some users might want to ask a pollster to monitor only part of the resources available to it, e.g. only one specific nova instance. The users need to pass the instance UUID as a configuration parameter to the pollster at runtime.

We might need to design a framework to allow the user to use the "management API" to do the following things at runtime (a rough sketch follows below):
- enable/disable a pollster
- get/set configuration parameters for a pollster
- ask a pollster to immediately start polling, instead of waiting for other pollsters in the same polling task to finish before it can start polling.

That framework could also be extended to manage publishers.
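
A rough sketch of what such a runtime management interface might expose (class and method names are hypothetical):

class PollsterManager(object):
    """Hypothetical runtime control over pollsters, without editing config files."""

    def __init__(self, pollsters):
        self._pollsters = pollsters        # name -> pollster object
        self._enabled = set(pollsters)     # everything enabled by default
        self._params = {}                  # name -> runtime parameters

    def enable(self, name):
        self._enabled.add(name)

    def disable(self, name):
        self._enabled.discard(name)

    def set_param(self, name, key, value):
        # e.g. set_param("instance", "resource_id", "<instance-uuid>")
        self._params.setdefault(name, {})[key] = value

    def poll_now(self, name):
        """Run one pollster immediately, outside its normal polling task."""
        if name not in self._enabled:
            return []
        # poll() stands in for whatever collection method the pollster exposes.
        return self._pollsters[name].poll(**self._params.get(name, {}))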

(Session proposed by Lianhao Lu)


Thursday April 18, 2013 9:00am - 9:40am
B116

9:00am

OpenStack Networking and Nova - Part 2
OpenStack Networking and Nova, part 2. See part 1 here, in the Networking track:

http://summit.openstack.org/cfp/Details/161

Part 2 is where we will get back together near the end of the design summit to recap our plans for OpenStack Networking+Nova and identify specific work that needs to get done by both projects to accomplish our goals.

(Session proposed by Russell Bryant)


Thursday April 18, 2013 9:00am - 9:40am
B113

9:00am

SAML, OAuth 2, and SCIM
This session will include the following subject(s):

SAML, OAuth 2, and SCIM - Overview and Application:

A discussion of how identity standards may apply to keystone, and how keystone may wish to align itself with these standards through Havana and beyond.

A brief tour will be given to level set the room on the following current and approaching standards. It is recommended that anyone wishing to participate in the discussion read the attached links for background information in order to prepare.

- SAML
An XML-based identity assertion format commonly used for cross-domain single sign-on
(A.K.A Federation) for Web SSO and Web Services (WS-*).
IETF drafts describe use with OAuth 2.0.

Executive Overview: http://bit.ly/16Hn35X

- OAuth 2 - token based authentication for web applications and APIs. Defines the client
software as a role. Separates issuing tokens from how you use a token. Token issuance is
defined both for browsers and for REST clients using a username/password. Token
format is not defined by OAuth2, but one proposed standard format is JWT.

OAuth2 Simplified: http://bit.ly/14aaH6U

- JWT - JSON Web Tokens, an upcoming standard format for structured tokens
(containing data) which are integrity protected and optionally encrypted (see the sketch after this list).

JWT spec: http://bit.ly/15YAKMx

- SCIM - cross-domain user account creation and management. REST API for CRUD
operations around user accounts

Overview: http://www.simplecloud.info/
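
As a rough illustration of the JWT structure mentioned above (a sketch, not a production implementation and not tied to any OpenStack code), an HS256 token is just two base64url-encoded JSON parts plus an HMAC over them:

import base64
import hashlib
import hmac
import json


def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def make_jwt(claims, key):
    """Build header.payload.signature with HMAC-SHA256 integrity protection."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = header + b"." + payload
    signature = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return signing_input + b"." + signature


token = make_jwt({"sub": "some-user-id", "exp": 1366300800}, b"shared-secret")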

(Session proposed by David Waite)


Thursday April 18, 2013 9:00am - 9:40am
B114

9:50am

Dependency Management
In light of the recent threads and issues surrounding PyPI, openstack/requirements and version pinning, I'd like to discuss dependency management as it relates to OpenStack from a much higher vantage point. I think we need to go all the way back to use cases and requirements and set them out very clearly. Then, armed with that, we can assess various technology solutions for dealing with them, both for CI and for people who consume our software.

(Session proposed by Monty Taylor)


Thursday April 18, 2013 9:50am - 10:30am
B119

9:50am

Double Entry Auditing of collected metrics in CM
In order to offer SEC-compliant billing we need to validate collected metrics from two sources. There needs to be an audit trail for important metrics such as instance lifecycle, bandwidth and storage usage. How might this be accomplished with CM?

(Session proposed by Sandy Walsh)


Thursday April 18, 2013 9:50am - 10:30am
B116

9:50am

Generic support for external authn/authz
We already support some mechanisms for plugging in authentication methods, but we need to evolve the keystone architecture to enable deployers and cloud providers to use their chosen set of authn/authz. This is not just about plugging things into the back of keystone, but needs to take into account how tokens might be validated (e.g. we do PKI today, but how would we cleanly enable some other standard token format?)

Goals for session:
- Agree proposal for how we split the authentication & authorization in terms of API, laying the groundwork for us to expand the supported set of technologies
- Agree how alternative authn/z and their token generation fits into the above structure
- Agree where and how plugin points will be provided for such alternatives, including within auth_token middleware
- Show an example proposal for OAuth (Authorization) with OpenID Connect (Authentication) that matches the above.

(Session proposed by Henry Nash)


Thursday April 18, 2013 9:50am - 10:30am
B114

9:50am

Multi-Attach and Read Only Volumes
There have been some discussions about introducing a special read-only volume that can be attached to multiple instances simultaneously. Initial thoughts are something like creating a R/O volume from an existing volume, or converting an existing volume.

Additionally a number of folks would like to see R/W multi-attach which can be discussed as well.

(Session proposed by John Griffith)


Thursday April 18, 2013 9:50am - 10:30am
B110

9:50am

VMware compute driver roadmap session
This session will include the following subject(s):

VMware compute driver roadmap session:

Brainstorming session to help coordinate among everyone who is planning on contributing to the VMware compute driver during the Havana cycle.

Goal is to avoid any more duplicate work, identify key gaps, and to prioritize what new features will be most valuable.


(Session proposed by dan wendlandt)

Nova Proxy compute driver for vCenter:

This session is to cover the following blue prints.

1. Multiple VMware vCenter Clusters managed using single compute service
2. VMware Resource Pools as compute node
3. VMware vCenter compute driver and scheduler enhancements to publish and consume cloud capacity of clusters and resource pools
4. VMware vCenter templates available as glance images
5. Nova changes for VMware vCenter templates

(Session proposed by Srinivasa Acharya)


Thursday April 18, 2013 9:50am - 10:30am
B113

11:00am

API improvements for Ceilometer
This session will include the following subject(s):

API improvements for Ceilometer:

The API needs to evolve in order to answer more advanced questions from billing engines, such as:

- Give me the maximum usage of a resource that lasted more than 1h
- Give me the use of a resource over a period of time, listing changes by increment of X volume over a period of Y time
- Provide a GROUP BY function
- Provide additional statistical function (Deviation, Median, Variation, Distribution, Slope, etc...) which could be given as multiple results for a given data set collection


(Session proposed by Nick Barcet)

Ceilometer API extensions:

Some enhancements to the API would allow support for a broader set of use cases.

(Session proposed by Phil Neal)


Thursday April 18, 2013 11:00am - 11:40am
B116

11:00am

libvirt driver in Havana
This session is to discuss improvements we would like to make to the libvirt driver in the Havana cycle. Feel free to show up with additional ideas you would like to discuss.

This session will include the following subject(s):

libvirt console logging:

libvirt console logging is still broken, and it has been for ages. We've re-written it at least three times. Let's come up with a plan for fixing it that will actually work this time.


(Session proposed by Michael Still)

LXC Block Devices:

Use the libvirt-lxc functionality to mount block devices rather than have Nova do it for us. Newer versions of libvirt allow block devices to be made accessible inside the container, so we should change the way Nova mounts the images before we start the container.


(Session proposed by Chuck Short)


Thursday April 18, 2013 11:00am - 11:40am
B113

11:00am

refactor attach code
Cinder now has copy/paste code from Nova's libvirt driver to attach a volume to the cinder node. This is done so cinder can copy volume contents into an image and then stuff the image into glance. This currently only works for iSCSI, but we need the same capability for Fibre Channel.

We shouldn't copy/paste code from Nova's libvirt driver. We should look at migrating the attach code from Nova into oslo and reusing that code in both Nova and Cinder.


Other options are using a worker VM to do the copying or a library in cinder.

(Session proposed by Walt)


Thursday April 18, 2013 11:00am - 11:40am
B110

11:00am

Scaling & Performance
This session will include the following subject(s):

Scaling/Performance of Keystone:

A number of recent reports have indicated that we have potential scaling and performance issues with Keystone. Examples:
1) Sequential execution when hit by a large batch of requests (200 x create user)
2) Inefficient use of SQL for individual requests (e.g. we don't use PK access for filtering by a PK variable)

The solution is likely multi-fold:
a) Should Keystone use the same multi-process wsgi approach as other projects?
b) Re-evaluation of the backend drivers to enable most efficient use of SQL/LDAP

(Session proposed by Henry Nash)


Thursday April 18, 2013 11:00am - 11:40am
B114

11:00am

Sorting out test runners, wrappers and venvs
We've gotten ourselves into a pickle. Across the projects, we have a combination of tox and run_tests.sh, and some projects have migrated to testr from nose and some haven't. People are confused about how to run tests, and about how Jenkins runs tests - especially since most projects have at least 2 different ways IN THE TREE.

We need to sit down, in a room, make an actual plan for moving forward, and then do it.

(Session proposed by Monty Taylor)


Thursday April 18, 2013 11:00am - 11:40am
B119

11:50am

Alarm Threshold Evaluation

This session will include the following subject(s):

Distributed & scalable alarm threshold evaluation:

A simple method of detecting threshold breaches for alarms is to do so directly "in-stream" as the metric datapoints are ingested. However this approach is overly restrictive when it comes to wide dimension metrics, where a datapoint from a single source is insufficient to perform the threshold evaluation. The in-stream evaluation approach is also less suited to the detection of missing or delayed data conditions.

An alternative approach is to use a horizontally scaled array of threshold evaluators, partitioning the set of alarm rules across these workers. Each worker would poll for the aggregated metric corresponding to each rule they've been assigned.

The allocation of rules to evaluation workers could take into account both locality (ensuring rules applying to the same metric are handled by the same workers if possible) and fairness (ensuring the workload is evenly balanced across the current population of workers).
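
One way to combine that locality and fairness (a minimal sketch, not Ceilometer code; the rule and worker structures are assumed):

import hashlib


def assign_rules(rules, workers):
    """Same metric -> same worker (locality); metrics spread by hash (fairness)."""
    assignments = dict((w, []) for w in workers)
    for rule in rules:
        digest = hashlib.md5(rule["metric"].encode()).hexdigest()
        worker = workers[int(digest, 16) % len(workers)]
        assignments[worker].append(rule)
    return assignments


rules = [{"name": "cpu_high", "metric": "cpu_util"},
         {"name": "cpu_low", "metric": "cpu_util"},
         {"name": "disk_full", "metric": "disk.usage"}]
print(assign_rules(rules, ["worker-1", "worker-2"]))

A production scheme would more likely use consistent hashing, so that adding or removing a worker only moves a small fraction of the rules.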

Logical combination of alarm states:

A mechanism to combine the states of multiple basic alarms into overarching meta-alarms could be useful in reducing noise from detailed monitoring. 

We would need to determine: 

* whether the meta-alarm threshold evaluation should be based on notification from basic alarms, or on re-evaluation of the underlying conditions 

* what complexity of logical combination we should support (number of basic alarms; &&, ||, !, subset-of, etc.) 

* whether an extended concept of simultaneity is required to handle lags in state changes

The polling cycle would also provide a logical point to implement policies such as:

* correcting for metric lag
* gracefully handling sparse metrics versus detecting missing expected datapoints
* selectively excluding chaotic data.

This design session will discuss and agree on the best approaches to manage this distributed threshold evaluation, while seamlessly handling up- and down-scaling of the worker pool (i.e. fairly re-balancing and avoiding duplicate evaluation).




Speakers
avatar for Eoghan Glynn

Eoghan Glynn

Principal Engineer, Red Hat
Eoghan is a Principal Software Engineer at the Red Hat OpenStack Infrastructure group, and is serving as Technical Lead for the OpenStack Telemetry Program over the Juno & Kilo cycles. Prior to OpenStack, Eoghan was at Amazon working on AWS monitoring services.


Thursday April 18, 2013 11:50am - 12:30pm
B116

11:50am

Availability zone and region management
Scenario:

A large deployment has a single Keystone endpoint serving multiple regions, each with multiple availability zones.

A user wishes to get a list of regions in a deployment. This is currently not possible.

A user wishes to get a list of availability zones in each region in a deployment. This also is currently not possible.

A user wishes to use novaclient (or any of the OpenStack clients besides keystone) to operate against availability zone 2 in region A of a deployment. The user would like to be able to specify --os-availability-zone on the command line, just as they specify --os-region-name today, and have novaclient negotiate to the correct availability zone's compute endpoint automatically, in much the same way as is currently done if there is only a single compute endpoint returned in the service catalog part of the authentication response.

Unfortunately, this, too, is currently not possible. The user must know ahead of time which compute endpoint URI to supply to novaclient for availability zone 2 in region A and pass that URI to novaclient with the --uri CLI option.
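
For illustration, endpoint selection from the service catalog today keys only on region; extending it to availability zones could look roughly like this (assuming the v2-style catalog layout, and an availability_zone field that does not exist in today's catalog):

def pick_endpoint(catalog, service_type, region, availability_zone=None):
    """Pick a publicURL from the service catalog, optionally narrowing by AZ."""
    for service in catalog:
        if service["type"] != service_type:
            continue
        for ep in service["endpoints"]:
            if ep["region"] != region:
                continue
            # Hypothetical: catalog entries annotated with an availability zone.
            if availability_zone and ep.get("availability_zone") != availability_zone:
                continue
            return ep["publicURL"]
    raise LookupError("no matching endpoint")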

Just because this is how Amazon Web Services requires one to target compute or volume operations against different availability zones does not mean this is either user-friendly or good design -- it probably has more to do with the homegrown way that Amazon Web Services grew up than any architectural design the AWS developers made.

The natural place for information about a deployment's regions and availability zones is Keystone. It is NOT Nova, since availability zones and regions affect all services, not just compute.

I'd like to discuss adding support for region and availability zone CRUD to the v3 or v4 API of Keystone.

(Session proposed by Jay Pipes)


Thursday April 18, 2013 11:50am - 12:30pm
B114

11:50am

Enhanced Platform Awareness – For PCIe Devices
This session will include the following subject(s):

Enhanced Platform Awareness – For PCIe Devices:

Background:
There is a growing movement in the telecommunications industry to transform the network. This transformation includes the distinct, but mutually beneficial disciplines of Software Defined Networking and Network Functions Virtualization. One of the challenges of virtualizing appliances in general, and virtualizing network functions in particular, is to deliver near native (i.e. non-virtualized) performance. Many virtual appliances have intense I/O requirements, many also could benefit from access to high performance accelerators for workloads such as cryptography, and others would like direct access to GPUs.

There is also a growing demand for the cloud OS to have greater awareness of the capabilities of the platforms it controls. The Enhanced Platform Awareness (EPA) related updates proposed to OpenStack aim to enable better informed decision making related to VM placement and help drive tangible improvements for cloud tenants. This EPA proposal focuses on how to leverage PCIe devices in cloud infrastructure, and looks in particular at Single Root IO Virtualization (SR-IOV) as one technology that can be used to dramatically improve the performance in the virtual machine.

Proposal:
During this design session, the proposal is that the following topics will be covered:
• Discuss the use cases for OpenStack managed PCI devices and allocation of Virtual Functions with SR-IOV.
• Discuss the design proposal for enabling an SR-IOV solution. This shall include:
o Nova enhancements to include a level of awareness of the PCI devices that are included in the platforms.
o Scheduler/filter extensions to find platforms with a specified PCI device.
o Hypervisor driver (libvirt in this first instance) additions to provision a VM with an SR-IOV PCI device.
• The majority of the focus will be on accelerator-type devices. However, some concept proposals relating to how this could be extended to the allocation of SR-IOV Virtual Functions for NIC devices will be discussed.
• Agree on a direction for a solution.
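
To ground the scheduler/filter point, here is a minimal sketch of a host filter that passes only hosts advertising a requested PCI device. It is loosely modeled on Nova's host-filter pattern; the pci_devices capability and pci_request spec are illustrative assumptions, not an agreed interface.

    # Illustrative sketch only: a filter that passes hosts advertising a
    # requested PCI device (e.g. an SR-IOV virtual function). The
    # "pci_devices" capability and "pci_request" spec are assumed names.

    class PciDeviceFilter(object):

        def host_passes(self, host_state, filter_properties):
            request = filter_properties.get('pci_request')
            if not request:
                return True  # nothing requested, any host will do
            for device in host_state.get('pci_devices', []):
                if (device['vendor_id'] == request['vendor_id'] and
                        device['product_id'] == request['product_id'] and
                        device['free_vfs'] > 0):
                    return True
            return False


    host = {'pci_devices': [{'vendor_id': '8086', 'product_id': '10ca',
                             'free_vfs': 4}]}
    spec = {'pci_request': {'vendor_id': '8086', 'product_id': '10ca'}}
    print(PciDeviceFilter().host_passes(host, spec))  # True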

This design session and the related blueprint builds upon ideas already proposed in forum discussions and the following blueprints:
nova/xenapi-gpu-passthrough
nova/pci-passthrough
nova/pci-passthrough-and-sr-iov


(Session proposed by Adrian Hoban)

libvirt pcipassthru support:

Add libvirt PCI passthrough support to Nova.

(Session proposed by Chuck Short)


Thursday April 18, 2013 11:50am - 12:30pm
B113

11:50am

Failure management in the gate
In the Grizzly run-up we had a few really bad days where gate resets were fast and furious, and it would take, on average, 6 to 8 hours to merge. This led to a conversation in which the Nova dev team seriously considered turning off gate checking entirely.

This session would be on brainstorming the ways to find and get to the bottom of failures in the gate faster, and hopefully reduce them over time.

It would include:
* ways to optimize gate resets. When we know a test has failed, can we reset early, instead of waiting for the train wreck to complete?
* ways to get to the bottom of fails fast - the recheck page was a good start, but it turned out to be pretty static info, and people really corrupted the data by picking bugs poorly
* ways to analyze fails (some sort of failure dashboard), figure out the infra restriction on tooling for this.
* ways to alert users, so the answer to "is there a problem?" isn't "keep an -infra tab open and scroll back".

(Session proposed by Sean Dague)


Thursday April 18, 2013 11:50am - 12:30pm
B119

11:50am

Standardizing vol type extra spec as driver input
Cinder back-end drivers can provide differentiated service by looking into volume type extra specs. Right now, how a driver treats extra specs is up to the driver developer. To make volume type/extra spec definitions more portable, it is important to standardize the way drivers extract requirements from volume type extra specs. In this session, we'll discuss possible solutions.
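
As a strawman for the discussion, one convention could be to namespace the specs a driver is expected to act on, e.g. a hypothetical "capabilities:" prefix. The sketch below simply extracts those keys into a driver input dict; the prefix and key names are illustrative assumptions, not an agreed standard.

    # Strawman sketch: extract only the namespaced extra specs that a
    # driver should act on. The "capabilities:" prefix is illustrative.

    def extract_driver_specs(extra_specs, prefix='capabilities:'):
        requirements = {}
        for key, value in extra_specs.items():
            if key.startswith(prefix):
                requirements[key[len(prefix):]] = value
        return requirements


    volume_type = {'extra_specs': {'capabilities:compression': 'true',
                                   'capabilities:qos_tier': 'gold',
                                   'vendor:internal_knob': '42'}}
    print(extract_driver_specs(volume_type['extra_specs']))
    # {'compression': 'true', 'qos_tier': 'gold'}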

(Session proposed by Huang Zhiteng)


Thursday April 18, 2013 11:50am - 12:30pm
B110

1:30pm

Bare Metal Testing
This session will include the following subject(s):

Bare Metal Testing:

Discussion around testing full bare-metal deploys on every commit. We've got a lot of the other pieces figured out at this point, and bare metal support landed in Nova for Grizzly - we need to test it, and test using it.

(Session proposed by Monty Taylor)

refstack - a description of an OpenStack cloud:

Recent discussions around FITS testing led to the idea of a thing called "refstack", which would allow us to describe and deploy a reference OpenStack against which we can test. There have been some discussions on implementation, with the leading contender being a Heat template which has a clear API contract with the nodes it's creating and the metadata it's going to pass them, such that various technologies (Chef, Puppet, Juju) could be designed to expect the same per-node metadata. However, we should probably actually talk about that in person.

(Session proposed by Monty Taylor)


Thursday April 18, 2013 1:30pm - 2:10pm
B119

1:30pm

Cinder Capability Standardization
Currently volume types are used for two main purposes: for the capability filter scheduler to decide where to place volumes, and for drivers to create volumes with certain parameters/capabilities.

It is important for volume types to be both standardized and flexible. For example, if two different back-ends support feature foo, then they should report it in the same way. The scheduler should choose either back-end, and that back-end should have a key to enable feature foo for the given volume.

We propose to:
- Maintain a list of mandatory capabilities that all drivers must report (one current example is free space). These capabilities must be generic and make sense for all back-ends.
- Maintain a list of recommended capabilities that drivers may report. These should still be generic, but well-defined, and used across back-ends.
- Drivers may report any additional capabilities that they want where they are specific to that back-end.
- Administrators should be able to specify capabilities for storage via the configuration file if the driver doesn't report them.

The goal of this session is to discuss the mechanisms for managing capabilities (e.g., proposing new ones, listing existing ones), and perhaps to come away with an initial list.
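
To illustrate the split proposed above, a back-end might report a mix of mandatory, recommended and vendor-specific capabilities, with the scheduler matching a volume type's requested capabilities against them. All key names below are examples, not a finalized list.

    # Illustration only: reported capabilities grouped by convention, and
    # a simple match of requested capabilities against them.

    reported = {
        'free_capacity_gb': 512,        # mandatory example
        'compression': True,            # recommended example
        'acme:dedupe_mode': 'inline',   # vendor-specific example
    }

    def backend_matches(reported, requested):
        return all(reported.get(k) == v for k, v in requested.items())

    print(backend_matches(reported, {'compression': True}))        # True
    print(backend_matches(reported, {'thin_provisioning': True}))  # False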

(Session proposed by Avishay Traeger)


Thursday April 18, 2013 1:30pm - 2:10pm
B110

1:30pm

Key Manager
OpenStack services such as Cinder are offering encryption for volumes, an off-shoot implementation for Swift exists, and Glance is a logical next candidate. Encryption involves keys: their creation, access control, and secure maintenance. Several blueprints touch on it. Let us design and develop a high-availability solution, perhaps a sub-service of Keystone (on par with Identity). With PKI tokens and X509 certificates in OpenStack now, the encryption keys could be encrypted before being saved; for example, a volume encryption key would be "owned" by Cinder, so it could be encrypted using Cinder's public key. Would it be useful to have a reference count associated with keys, so that when all objects associated with a key are deleted, the key itself may be deleted? Support key caching on the services to reduce chattiness with Keystone.
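
A minimal sketch of the reference-counting idea, assuming an in-memory registry keyed by key ID; persistence, key wrapping and access control are deliberately out of scope here.

    # Minimal sketch: a key is only removed once the last object using it
    # is deleted. Purely illustrative.

    class KeyRegistry(object):

        def __init__(self):
            self._keys = {}      # key_id -> key material
            self._refcount = {}  # key_id -> number of objects using it

        def add_reference(self, key_id, key_material=None):
            if key_id not in self._keys:
                self._keys[key_id] = key_material
                self._refcount[key_id] = 0
            self._refcount[key_id] += 1

        def release(self, key_id):
            self._refcount[key_id] -= 1
            if self._refcount[key_id] == 0:
                del self._keys[key_id]
                del self._refcount[key_id]


    registry = KeyRegistry()
    registry.add_reference('vol-key-1', 'secret')
    registry.add_reference('vol-key-1')
    registry.release('vol-key-1')
    registry.release('vol-key-1')            # last reference gone
    print('vol-key-1' in registry._keys)     # False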

(Session proposed by Malini Bhandaru)


Thursday April 18, 2013 1:30pm - 2:10pm
B114

1:30pm

Time series data manipulation in nosql stores

This session will include the following subject(s):

Time series data manipulation in nosql stores:

Ceilometer currently supports multiple storage drivers (mongodb, sqlalchemy, hbase) behind a well-defined abstraction.

The purpose of this design session is to discuss how well suited the existing nosql stores are to the efficient manipulation of time-series data.

The questions to be decided would include:

* whether we could optimize/improve our existing schemas in this regard

* whether we should consider a storage driver based on Cassandra in order to take advantage of its well-known suitability for TSD

(Session proposed by Eoghan Glynn)

The dotted line between metering and metric/alarms:

There is clear commonality in the data acquisition & transformation layers for gathering metering and metric data.

However, the further we venture through the pipeline, there are also operational concerns around over-sharing of common infrastructure in the transport and storage layers.

We need to tie down exactly where we see the dotted line between the handling of metering and metric data, deciding whether:

* a common conduit in the form of AMQP should be used for publication (for example given that during a brownout in the RPC layer, we would need a timely metric flow more than ever)

* a common storage layer should be used for persistence (for example given that data retention requirements may differ significantly)

* a common API layer should provide aggregation (for example given that certain aggregations such as percentile may make far more sense for metric than for metering data)




Speakers
avatar for Eoghan Glynn

Eoghan Glynn

Principal Engineer, Red Hat
Eoghan is a Principal Software Engineer at the Red Hat OpenStack Infrastructure group, and is serving as Technical Lead for the OpenStack Telemetry Program over the Juno & Kilo cycles. Prior to OpenStack, Eoghan was at Amazon working on AWS monitoring services.


Thursday April 18, 2013 1:30pm - 2:10pm
B116

1:30pm

XenAPI Roadmap for Havana
Get all developers working on XenAPI related features in the same room. Share ideas and set priorities for Havana.

Ideas are forming on etherpad:
https://etherpad.openstack.org/HavanaXenAPIRoadmap

(Session proposed by John Garbutt)


Thursday April 18, 2013 1:30pm - 2:10pm
B113

2:20pm

ACL (access control list) Rule for Cinder Volumes
Currently a volume can only be accessed by a certain user in a certain project; there is no ACL rule for a Cinder volume. Adding ACL configuration would allow a volume to be read or written by other users or other projects. The volume creator has the capability to edit the ACL rule. The ACL model can be similar to the one in Amazon S3.
Use case: several users can share the data in one volume.
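
For instance, an S3-style grant list attached to a volume could look something like the sketch below; the field names are illustrative assumptions for discussion, not an existing Cinder schema.

    # Illustrative only: an S3-style ACL on a volume, and a check used
    # before honoring a read/write request.

    volume_acl = {
        'volume_id': 'vol-0001',
        'owner': {'project': 'proj-a', 'user': 'alice'},
        'grants': [
            {'grantee': {'project': 'proj-b', 'user': 'bob'},
             'permission': 'read'},
            {'grantee': {'project': 'proj-a', 'user': '*'},
             'permission': 'write'},
        ],
    }

    def is_allowed(acl, project, user, permission):
        if acl['owner'] == {'project': project, 'user': user}:
            return True
        for grant in acl['grants']:
            grantee = grant['grantee']
            if grantee['project'] == project and grantee['user'] in (user, '*'):
                # a "write" grant implies read access in this toy model
                if grant['permission'] in (permission, 'write'):
                    return True
        return False

    print(is_allowed(volume_acl, 'proj-b', 'bob', 'read'))   # True
    print(is_allowed(volume_acl, 'proj-b', 'bob', 'write'))  # False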

(Session proposed by houshengbo)


Thursday April 18, 2013 2:20pm - 3:00pm
B110

2:20pm

Baremetal Hypervisor Driver - Next Steps
Nova now includes a functional baremetal driver with support for PXE-based imaging and IPMI-based power control. This has generated a lot of interest from datacenter operations teams, and many are asking when it will support: hardware discovery, configuration, and BIOS / firmware management; other types of hardware, such as ARM processors and PDU power control; and the possibility to interface with proprietary control modules (iLO / DRAC / etc).

I believe the baremetal driver should provide a means for hardware vendors to easily extend it and take advantage of their control modules, while also providing generic tooling to accomplish these common operational tasks. As the complexity of the baremetal driver grows, its database needs will also grow, and the possible interactions between this "hardware inventory database" and quantum, cinder, healthnmon, etc, will need to be understood.

(Session proposed by Devananda van der Veen)


Thursday April 18, 2013 2:20pm - 3:00pm
B113

2:20pm

LDAP Integration
This session will include the following subject(s):

Keystone LDAP Integration for Enterprises:

In this proposal we present use cases for how Keystone needs to integrate with an enterprise's existing LDAP infrastructure. We have identified the following as valid use cases for which Keystone should be extended to ensure a seamless integration with existing LDAP environments:

1. A Keystone LDAP integration model whereby Keystone is integrating with a read-only LDAP and leveraging the contents of this LDAP for both authenticating users and also for authorization (e.g. project, roles, etc).

2. A Keystone LDAP integration model whereby Keystone is integrating into multiple LDAPs such that it utilizes a read-only LDAP for authenticating users and then leverages a separate read/write LDAP for performing authorization.

3. A Keystone LDAP integration model whereby Keystone is integrating into multiple LDAPs such that it utilizes a read-only LDAP for authenticating users and group information and then leverages a separate read/write LDAP for accessing projects and role information

4. A Keystone LDAP integration model whereby Keystone is integrating into multiple LDAPs such that it utilizes a read-only LDAP for authenticating users and then leverages a separate read/write SQL backend for performing authorization.

5. A Keystone LDAP integration model whereby a separate LDAP or SQL backend can be chosen for authentication/authorization for each domain and mechanism are in place to prevent leakage of sensitive data from one domain to another. Note: This topic will be covered in more detail in http://summit.openstack.org/cfp/edit/158.

6. Multiple Keystone Active Directory integration models that are similar to the ones listed above except integration is with Active Directory instead of other implementations of LDAP.

In this session our expectation is to discuss these use cases, receive feedback, and identify other use cases during the session as well.

(Session proposed by Brad Topol)


Thursday April 18, 2013 2:20pm - 3:00pm
B114

2:20pm

OpenStack CI logging
OpenStack continuous integration generates a fair amount of logs. Devstack, tempest, and testr are quite verbose in what they have done. This information is very valuable, but is also very dense. Let's discuss which logs are important, how long they should be archived for, how they should be archived, and how they should be indexed.

Current ideas are that we might try to archive six months of logs and use logstash/elasticsearch to index a smaller subset of the logs. However, this is not set in stone and feedback will be very useful hence this design session.

This session is motivated by the recent issues with the log server running out of disk space and disagreement on which logs we actually need. The infra team wants to make sure we are logging information that is useful.

(Session proposed by Clark Boylan)


Thursday April 18, 2013 2:20pm - 3:00pm
B119

2:20pm

Simple messaging action for Alerting
As we develop alerting in Ceilometer, it might be a good idea to provide simple destination endpoints to which alerts can be forwarded (a minimal dispatch sketch follows this list), such as:
- events on the oslo RPC bus
- emails (SMTP)
- SMS
- Nagios alerts
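
A minimal dispatch sketch, assuming alerts arrive as plain dicts and each destination type maps to a callable; the handler names are placeholders, not existing Ceilometer code.

    # Minimal sketch: route an alert to a configured destination type.
    # Handlers are stubs; real RPC/SMTP/SMS/Nagios wiring is out of scope.

    def publish_rpc(alert):
        print('RPC notification: %s' % alert['name'])

    def send_email(alert):
        print('email to operator: %s' % alert['name'])

    HANDLERS = {
        'rpc': publish_rpc,
        'email': send_email,
    }

    def dispatch(alert, destination):
        handler = HANDLERS.get(destination)
        if handler is None:
            raise ValueError('unknown destination: %s' % destination)
        handler(alert)


    dispatch({'name': 'cpu_high', 'state': 'alarm'}, 'email')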

(Session proposed by Nick Barcet)


Thursday April 18, 2013 2:20pm - 3:00pm
B116

3:20pm

Alarm state and history management

We need to tie down the requirements for managing the state and history of alarms, for example providing:

* an API to allow users to define and modify alarm rules

* an API to query current alarm state and modify this state for testing purposes

* a period for which alarm history is retained and is accessible to the alarm owner (likely to have less stringent data retention requirements than regular metering data)

* an administrative API to support across-the-board querying of state transitions for a particular period (useful when assessing the impact of operational issues in the metric pipeline)




Speakers
avatar for Eoghan Glynn

Eoghan Glynn

Principal Engineer, Red Hat
Eoghan is a Principal Software Engineer at the Red Hat OpenStack Infrastructure group, and is serving as Technical Lead for the OpenStack Telemetry Program over the Juno & Kilo cycles. Prior to OpenStack, Eoghan was at Amazon working on AWS monitoring services.


Thursday April 18, 2013 3:20pm - 4:00pm
B116

3:20pm

Centralized Quotas
The quota situation is getting out of control. There are already a number of quota implementations for Swift. I expect that Glance will get its own implementation of quotas next, followed closely by Quantum and probably Cinder. Each service will have its own implementation, and a different API to boot. This is definitely suboptimal.

We need centralized quotas, otherwise it will become an operations nightmare.

This session will be focused on how that goal can be achieved and who would be willing and able to contribute to it.

I've started an Etherpad at https://etherpad.openstack.org/HavanaCentralizedQuotas for this session and to track the existing efforts on quotas. Feel free to add any implementations I've missed.

To keep the session focused we won't be discussing Nova quotas. If we can nail down a way forward for centralized quotas, perhaps one day we can roll Nova quotas into it.

(Session proposed by Everett Toews)


Thursday April 18, 2013 3:20pm - 4:00pm
B114

3:20pm

Cinder FC SAN Zone/Access Control Manager
Fibre Channel (FC) block storage support was added in the Grizzly release (cinder/fibre-channel-block-storage blueprint).

Currently there is no support for automated SAN zoning, which requires FC SANs to be either pre-zoned or open-zoned.

We propose to:

- Add support for FC SAN Zone/Access Control management feature - allowing automated zone lifecycle management in the attach/detach entry points of volume manager (when fabric zoning is enabled).
- Introduce FibreChannelZoneManager plug-in interface API for automated zone lifecycle management - enables SAN vendors to add support for pluggable implementations.

Use Cases

- Defaults and capabilities - support for FC SAN configuration settings (e.g. zoning mode, zoning capabilities etc.)
- Add active zone interface to add the specified zone to the active zone set
- Remove active zone interface to remove the specified zone from the active zone set
- Support for provisioning and enumerating SAN/Fabric contexts

In a future OpenStack release, this can be extended as a general SAN access control mechanism to support both iSCSI and FC.

The goal of this session is to discuss the mechanisms for automated zone lifecycle management in OpenStack/Cinder.
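
One possible shape for such a plug-in interface, offered purely as a discussion aid (method names and signatures are assumptions, not the proposed API):

    # Discussion aid only: a possible shape for a pluggable FC zone manager.

    class FibreChannelZoneManagerBase(object):
        """Possible plug-in interface; SAN vendors would subclass this."""

        def add_zone(self, initiator_wwn, target_wwn, fabric):
            """Create a zone for the initiator/target pair and activate it."""
            raise NotImplementedError()

        def remove_zone(self, initiator_wwn, target_wwn, fabric):
            """Remove the zone from the active zone set."""
            raise NotImplementedError()

        def get_capabilities(self, fabric):
            """Report zoning mode and other fabric capabilities."""
            raise NotImplementedError()


    class LoggingZoneManager(FibreChannelZoneManagerBase):
        """Trivial implementation that only logs what it would do."""

        def add_zone(self, initiator_wwn, target_wwn, fabric):
            print('zone %s <-> %s on %s' % (initiator_wwn, target_wwn, fabric))

        def remove_zone(self, initiator_wwn, target_wwn, fabric):
            print('unzone %s <-> %s on %s' % (initiator_wwn, target_wwn, fabric))

        def get_capabilities(self, fabric):
            return {'zoning_mode': 'fabric'}


    zm = LoggingZoneManager()
    zm.add_zone('10:00:00:00:c9:00:00:01', '50:06:01:60:00:00:00:01', 'fabricA')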

(Session proposed by Varma Bhupatiraju)


Thursday April 18, 2013 3:20pm - 4:00pm
B110

3:20pm

Multi-hypervisor-clouds
Operators deploying openstack using openstack currently have to run two clouds: one undercloud using bare metal, and one overcloud that is fully virtualised and runs on the underlying cloud. This is a bit of a pain, as things like quantum really are common - and operator workloads that could be virtualised cannot be.

So - let's figure out what's needed to run a single cloud combining baremetal and virtual hypervisors. (Obvious issues are different default networks depending on hypervisor, and scheduling workloads onto virtual/real machines).

(Session proposed by Robert Collins)


Thursday April 18, 2013 3:20pm - 4:00pm
B113

3:20pm

Review of non-infra-managed tooling
A number of tools we use (and run on our infrastructure) are not under direct Core Infrastructure team management. We should review them and see if they are still necessary, what is the plan to bring them under control (if any), and what improvements can be made to them over the next cycle.

This includes:
* ODSREG
* Release status, bugdaystats
* openstack.org website
* ...

(Session proposed by Thierry Carrez)


Thursday April 18, 2013 3:20pm - 4:00pm
B119

4:10pm

Ceilometer support for advanced billing models
We need to ensure the metering architecture can support advanced billing models like direct Nova billing, Windows instance and PaaS service controller-owned Nova instances, application billing, and variations on quantity and overage billing.

(Session proposed by Phil Neal)


Thursday April 18, 2013 4:10pm - 4:50pm
B116

4:10pm

Cinder Volume Migration
We propose to add a volume migration feature to OpenStack - allowing for a volume to be transferred to a location and/or have its type changed. The following use cases will clarify the intent:

1. An administrator wants to bring down a physical storage device for maintenance without interfering with workloads. The admin can first migrate the volumes to other back-ends, which would result in moving volume data to the other back-ends.
2. An administrator would like to modify the properties of a volume (volume_type). In this use case, the back-end that stores the volume supports the new volume type, and so the driver may choose to make the change internally, without moving data to another back-end.
3. Same as case #2, but where the current back-end does not support the new volume type. In this case, the data will be moved to a back-end that does support the new type, in a manner similar to case #1.
4. A Cinder component (be it the scheduler or a new service) may optimize volume placement for improved performance, availability, resiliency, or cost.

The goal is for the migration to be as transparent as possible to users and workloads - it should work for both attached and detached volumes, as well as (possibly) attaching/detaching mid-operation.

Volume migration will be enabled for all back-ends (for example, by using a generic copy function for detached volumes, and QEMU's live block copy feature for attached volumes). In addition the API will allow for storage back-ends to implement optimizations depending on source and target support (for example, using the storage controllers' remote mirroring function); this will require additions to the driver interface to report on the capabilities of the storage (based upon existing mechanisms) and to invoke storage functions.
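
A sketch of how the driver-optimization hook might look, assuming a driver can decline and fall back to a generic host-side copy; the migrate_volume name and its return convention are assumptions for discussion, not a settled interface.

    # Sketch for discussion: a driver may implement an optimized migration
    # and return True, or decline (return False) so a generic copy is used.

    class BaseDriver(object):
        def migrate_volume(self, volume, destination_host):
            return False  # no optimization; caller falls back to generic copy


    class MirroringDriver(BaseDriver):
        def migrate_volume(self, volume, destination_host):
            if destination_host.startswith('same-vendor'):
                print('using controller remote mirroring for %s' % volume['id'])
                return True
            return False


    def migrate(driver, volume, destination_host):
        if driver.migrate_volume(volume, destination_host):
            return 'driver-optimized'
        # generic path: create a new volume at the destination and copy data
        print('generic copy of %s to %s' % (volume['id'], destination_host))
        return 'generic-copy'


    print(migrate(MirroringDriver(), {'id': 'vol-1'}, 'same-vendor-backend-2'))
    print(migrate(MirroringDriver(), {'id': 'vol-1'}, 'other-backend'))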

(Session proposed by Avishay Traeger)


Thursday April 18, 2013 4:10pm - 4:50pm
B110

4:10pm

Endpoint Filtering
Currently Keystone returns all endpoints in the service catalog, regardless of whether the user has access to them or not. This is neither necessary nor efficient.
We need to establish a project-endpoints relationship so we can effectively assign endpoints to a given project, and be able to filter the endpoints returned in the service catalog based on the token scope. Furthermore, we need to be able to optionally ask for the service catalog on token validation instead of always returning it.
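
A minimal sketch of the filtering, assuming a project-to-endpoint-ID association table; the data layout is illustrative only.

    # Minimal sketch: filter the catalog to the endpoints associated with
    # the token's project. The association table layout is illustrative.

    project_endpoints = {
        'proj-a': {'ep-nova-1', 'ep-swift-1'},
    }

    catalog = [
        {'id': 'ep-nova-1', 'type': 'compute',
         'publicURL': 'http://nova.example.com'},
        {'id': 'ep-swift-1', 'type': 'object-store',
         'publicURL': 'http://swift.example.com'},
        {'id': 'ep-heat-1', 'type': 'orchestration',
         'publicURL': 'http://heat.example.com'},
    ]

    def filtered_catalog(catalog, project_id):
        allowed = project_endpoints.get(project_id, set())
        return [ep for ep in catalog if ep['id'] in allowed]

    print([ep['type'] for ep in filtered_catalog(catalog, 'proj-a')])
    # ['compute', 'object-store']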


(Session proposed by Guang Yee)


Thursday April 18, 2013 4:10pm - 4:50pm
B114

4:10pm

Hyper-V Havana roadmap
Discuss Havana features for Hyper-V OpenStack integration.

(Session proposed by Alessandro Pilotti)


Thursday April 18, 2013 4:10pm - 4:50pm
B113

4:10pm

Projects (re)naming
We have two names for each of our projects. One is the official name ("OpenStack Compute") and the other is the codename ("Nova").

Codenames have a number of drawbacks: we don't actively protect their trademark, they are confusing to newcomers, and they tend to shadow their more official counterparts. But codenames also have benefits: they are highly-convenient short names (to be used in conversations, executables, modules...), and they separate the project itself from its functional scope, so they remain valid even if that scope evolves.

This session is about whether we should proactively drop usage of codenames in OpenStack, reduce our dependency on them, or just keep them the way they are. In case of a trademark attack, should we switch to another codename, or abandon usage of a codename and switch to some short version of the "official" name? Finally, we'll look at how we should proceed to fully rename a project.

(Session proposed by Thierry Carrez)


Thursday April 18, 2013 4:10pm - 4:50pm
B119

5:00pm

Cinder Library for Local Storage
Problems
========
a) storage related code is in multiple locations
b) available space and quotas are managed in two locations
c) locality specification is limited to availability zones
d) volume and instances can't be scheduled together

Goals
=====

a) share code relating to volumes
b) manage storage related quotas and data in one place
c) enable better locality specification
d) allow scheduling of instances and volumes as a unit


(Session proposed by Duncan Thomas)


Thursday April 18, 2013 5:00pm - 5:40pm
B110

5:00pm

Fine-grained access control
In a large implementation, there can be many users, each having some level of access to a shared pool of resources. Not all users need that much access, though, and there are cases where access must be restricted further. V3 introduces policies, and that works for restricting access to certain capabilities (only a user with the role "admin" or group "foo" can create a server in nova, etc). Policies bloat up, though, if they need to get down to the resource level (only joe can delete server "ABC").

This session is to discuss possible solutions for providing fine-grained access control to OpenStack services.
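
To make the bloat concrete: a resource-level rule essentially adds a per-object entry on top of the role check, as in the toy check below, which illustrates why enumerating such rules in a static policy file does not scale. All names are made up for the example.

    # Toy illustration: role-based rules stay small, resource-level rules
    # grow with the number of objects.

    role_rules = {'compute:delete_server': {'admin'}}             # capability level
    resource_rules = {('compute:delete_server', 'ABC'): {'joe'}}  # per-object

    def allowed(action, resource_id, user, roles):
        if roles & role_rules.get(action, set()):
            return True
        return user in resource_rules.get((action, resource_id), set())

    print(allowed('compute:delete_server', 'ABC', 'joe', set()))      # True
    print(allowed('compute:delete_server', 'XYZ', 'joe', set()))      # False
    print(allowed('compute:delete_server', 'XYZ', 'ops', {'admin'}))  # True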


(Session proposed by Joe Savak)


Thursday April 18, 2013 5:00pm - 5:40pm
B114

5:00pm

Havana release schedule
Let's review the proposed schedule for the Havana cycle, as well as any potential changes in freezes, intermediary milestones, release candidates management, etc.

(Session proposed by Thierry Carrez)


Thursday April 18, 2013 5:00pm - 5:40pm
B119

5:00pm

RHEV-M/oVirt Clusters as compute resources
This session proposes that customers who have RHEV-M/oVirt management solutions in their data centers be able to use RHEV-M/oVirt clusters as an elastic compute pool in OpenStack.

The proposal is to have a nova-compute proxy connect to RHEV-M/oVirt through the oVirt REST APIs and expose configured clusters as compute hosts.

• To allow a single RHEV-M/oVirt driver to model multiple Clusters in RHEV-M/oVirt as multiple nova-compute nodes.
• To allow the RHEV-M/oVirt driver to be configured to represent a set of clusters as compute nodes.
• To dynamically create / update / delete nova-compute nodes based on changes in RHEV-M/oVirt Clusters.

With this driver, there is no change expected in the scheduler. The RHEV-M templates which are available as glance images can be used to create instances.

(Session proposed by Srinivasa Acharya)


Thursday April 18, 2013 5:00pm - 5:40pm
B113

5:00pm

Supporting rich data types and aggregation ...
This session will review some of the real-world metrics collected at Rackspace and discuss how this data might be stored in Ceilometer.

The potential ramifications on the query interface and multi-publisher will also be discussed.

(Session proposed by Sandy Walsh)


Thursday April 18, 2013 5:00pm - 5:40pm
B116