OpenFlow v0.9

From OpenFlow Wiki

Overview

The plan for v0.9 is to add a final set of features for OpenFlow that complete its functionality and enable production deployments. Unless critical missing features are found in OpenFlow 0.9, it will have the same features as OpenFlow 1.0. In this sense, v0.9 serves as a release candidate for OpenFlow 1.0.

To be ready for release, the official OpenFlow v0.9 release code must:

  • pass all regression tests (black box and learning switch) on all supported Linux distros (Debian unstable, Ubuntu 8.04, CentOS 5.2) in a VM with virtual Ethernet pairs, for both types (kernel and user-space)
  • be easily installed by someone with no experience doing it (sanity check)

Until all these are done, any v0.9 code will be clearly identified as a release candidate. Full implementation of all features in the NetFPGA is not a criterion for release.

Current Status

All tests/features were due on Jun 15th, AT THE LATEST.

Status of specific OpenFlow components that must be updated for v0.9 is:

Task                                                        Status                Owner                  Reviewer
Specification:                                              Done                  Brandon
Reference Implementation:
  3.1 Failover (preliminary)                                Done                  Mikio/Bob              Bob
  3.2 Emergency Flow Cache                                  Done                  Mikio/Tatsuya          David E
  3.4.1 Barrier Command                                     Done                  Mikio                  David E
  3.4.2 Match VLAN Priority                                 Done                  Tatsuya                David E
  3.4.3 Selective Flow Exp.                                 Done                  Glen                   David E
  3.4.4 Flow Mod Behavior                                   Done                  Yiannis                Mikio
  3.4.5 Flow Exp. Ambiguity                                 Done                  Glen                   David E
  3.4.6 Flow Expiration from Deletes                        Done                  Glen                   David E
  3.4.7 Rewrite DSCP/ToS field                              Done                  Jean/Brandon/Tatsuya   Mikio
  3.4.8 Start Port Enumeration at 1                         Done                  Mikio                  N/A
  4.1.1 Protocols statistical information                   Done                  Mikio                  David E
Wireshark Plugin:                                           Done                  Everyone
Regression tests:                                           Done                  Everyone
Rigging/Outfitting:                                         Done                  Brandon/Mikio
Test for Linux distros:
  VETH/Ubuntu 9.04 (kernel 2.6.30)                          Done
  VETH/Debian 5 unstable (kernel 2.6.2x)                    Done
  VETH/CentOS 5.3 (kernel 2.6.2x)                           Done
  VETH/Fedora 11 (kernel 2.6.2x)                            Done
  BOX/Ubuntu 9.04 (kernel 2.6.2x)                           Done
  BOX/Debian 5 unstable (kernel 2.6.2x)                     Done
  BOX/CentOS 5.2 (kernel 2.6.2x)                            Done
  BOX/Fedora 10 (kernel 2.6.2x)                             Done
Applying user feedback:
  Emergency flow cache: flow table consistency              Done                  Brandon/DavidE
  Emergency flow cache: default set of emergency flows      Done                  Brandon/DavidE
  Emergency flow cache: flow removal message                Done                  Brandon/DavidE
  Emergency flow cache: error code name                     Done                  Mikio/Brandon
  VLAN priority match: reordering match field               Done                  Mikio/Brandon
  Clarification of datapath ID                              Done                  Brandon
  Clarification of miss-send-len/max-len of output action   Done                  Brandon
  OFPP_NONE and FLOW_MOD w/o actions                        No issue              N/A
  Clarification of error notification scheme                Done                  Brandon
  Simultaneous multiple controllers                         Postponed until 1.0   N/A
  Stats handling with specific port number                  Postponed until 1.0   N/A
  PACKET_IN suppression                                     Postponed until 1.0   N/A
  PACKET_IN with ofp_match                                  Postponed until 1.0   N/A
NetFPGA:                                                    Done                  Tatsuya
Instructions on Wiki:                                       Done                  Mikio
Final Packaging and Release:                                Done

Feature List

This list focuses on items that require implementation changes. Minor spec changes are listed at the bottom.

Failover (preliminary)

Note: This feature is not finalized yet and may change

What: A simple but robust failover mechanism

Why: The vast majority of real-world OpenFlow deployments will run with redundant controllers. Technically failover is not part of the OpenFlow protocol, but it seems close enough to include it in the OpenFlow Protocol Specification.

How: An OpenFlow switch should be configurable with multiple IP addresses of available controllers. How these IP addresses are entered in the switch is not specified by the protocol. The switch will connect to one IP address at a time. If that controller fails (as a result of a ping timeout, or SSL session timeout), the switch will attempt to connect to another IP address in the pool. The ordering of the controller IP addresses is not specified by the protocol.

File:0.9.0-simple-failover.pdf
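
As a rough illustration of the mechanism described above, the following sketch keeps a pool of controller addresses and moves to the next one when the current controller is declared failed. The struct and function names are hypothetical, and the round-robin ordering is an assumption made only for this sketch, since the protocol leaves the ordering unspecified.

```c
#include <stddef.h>

/* Hypothetical pool of configured controller addresses (not part of the spec). */
struct controller_pool {
    const char *const *addrs; /* configured controller IP addresses */
    size_t count;             /* number of addresses in the pool */
    size_t current;           /* index of the controller currently in use */
};

/* Address the switch is connected to (or trying), or NULL for an empty pool. */
const char *pool_current(const struct controller_pool *p)
{
    return p->count ? p->addrs[p->current] : NULL;
}

/* Called after a ping timeout or SSL session timeout: advance to the next
 * address, wrapping around. Round-robin is an assumption of this sketch;
 * the protocol does not specify the ordering. */
const char *pool_failover(struct controller_pool *p)
{
    if (p->count == 0)
        return NULL;
    p->current = (p->current + 1) % p->count;
    return p->addrs[p->current];
}
```

A switch would call pool_failover() once per detected failure and then attempt a fresh connection to the returned address, so only one controller connection is active at a time.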


Emergency Flow Cache

What: Define behavior of the switch if all connections to controllers are lost

Why: If a switch becomes disconnected, it should fall back to a well known state. Behavior in this state may depend on what role the OpenFlow switch is fulfilling (e.g. L2, L3, other protocol). The most general mechanism is to have the controller define the emergency state dynamically.

How: At a high level:

  • Add one status bit to flow mod messages: emergency.

Emergency-specific flow entries are inactive until a switch loses connectivity to the controller. If this happens, the switch invalidates all normal flow table entries and copies all emergency flows into the normal flow table.

Upon connecting to a controller again, all entries in the flow cache stay active. The controller then has the option of resetting the flow cache if needed.

This mechanism enables a controller to specify the following switch behaviors in emergency mode:

  • Fade Away - leave flow entries as-is but continue to expire them as normal
  • Lockdown - drop all packets, except for expected controller connections
  • Hub - broadcast all traffic with single wildcard broadcast entry
  • Normal - single wildcard entry with OFPAT_NORMAL action type

Note that many switches act as a learning L2 switch in their normal mode. The last option therefore provides failover to a learning switch.

How: At a low level:

  • Add single-bit field to struct ofp_flow_mod for table type
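
The switch-side behavior described in this section might be sketched as follows. This is a simplified model under stated assumptions: flows are reduced to integer ids, the two tables have a fixed hypothetical size, and all names (sw_state, flow_mod_add, on_controller_lost) are invented for illustration rather than taken from the reference implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FLOWS 8 /* hypothetical table size for this sketch */

struct flow_table {
    int flows[MAX_FLOWS]; /* stand-in for match + actions */
    size_t n;             /* number of valid entries */
};

struct sw_state {
    struct flow_table normal;    /* entries used for forwarding */
    struct flow_table emergency; /* inactive until the controller is lost */
};

/* Handle a flow mod "add"; the emergency bit selects the target table. */
bool flow_mod_add(struct sw_state *sw, int flow_id, bool emergency)
{
    struct flow_table *t = emergency ? &sw->emergency : &sw->normal;
    if (t->n == MAX_FLOWS)
        return false; /* table full */
    t->flows[t->n++] = flow_id;
    return true;
}

/* All controller connections lost: drop the normal entries and install
 * copies of the emergency flows. The emergency cache itself is kept. */
void on_controller_lost(struct sw_state *sw)
{
    sw->normal = sw->emergency; /* struct copy replaces the whole table */
}
```

With this model, a Lockdown or Hub policy is simply the set of emergency flows the controller chose to install before the connection was lost.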

In-Band Control (not in standard)

Note: After much discussion we decided not to include this in the 0.9 specification. While it is a very important feature and is implemented in the majority of existing implementations, the specific implementation mechanism varies and we did not yet feel comfortable mandating a specific method.


Minor Features

Barrier Command

What: Mechanism to get notified when an OpenFlow message has finished executing on the switch.

Why: Faster handover, testing, and monitoring

Let's say you're making a mobility controller. You want this controller to support handover between APs with no dropped packets and a minimum of latency, and you want to modify the flow entry in the crossover switch right when you've confirmed that flow entries have been added to the switches downstream of the crossover switch. Or you're making a network monitor, and want an accurate view of the network, without having to constantly poll the switches. Or you're running an experiment where you want predictable performance, and want to verify that an entry has been inserted before beginning. Or you're writing a continuous testing program and want to send packets as soon as you're sure a flow has been added to hardware.

How: OpenFlow 0.9 adds two new messages to the protocol:

  • Barrier Request (from controller to switch)
  • Barrier Reply (from switch to controller)

When a switch receives a Barrier message it must first complete all commands sent before the Barrier message before executing any commands after it. When all commands before the Barrier message have completed, it must send a Barrier Reply message back to the controller.
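
The ordering rule above can be sketched with a small state machine, assuming a switch that starts commands asynchronously and learns later that each one has completed. The names are hypothetical, and echoing the request's transaction id in the reply is an assumption of this sketch rather than something stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

struct barrier_state {
    unsigned in_flight;      /* commands started but not yet completed */
    bool barrier_pending;    /* a Barrier Request is waiting for completion */
    uint32_t barrier_xid;    /* transaction id of the pending request */
    unsigned replies_sent;   /* number of Barrier Replies emitted so far */
    uint32_t last_reply_xid; /* xid carried by the most recent reply */
};

/* Emit the Barrier Reply once everything before the barrier has finished. */
static void maybe_reply(struct barrier_state *s)
{
    if (s->barrier_pending && s->in_flight == 0) {
        s->barrier_pending = false;
        s->last_reply_xid = s->barrier_xid; /* assumed: reply echoes the xid */
        s->replies_sent++;
    }
}

/* Returns false if the command must wait behind a pending barrier. */
bool sw_try_start_command(struct barrier_state *s)
{
    if (s->barrier_pending)
        return false;
    s->in_flight++;
    return true;
}

/* Returns false if another barrier is already pending and this one must wait. */
bool sw_barrier_request(struct barrier_state *s, uint32_t xid)
{
    if (s->barrier_pending)
        return false;
    s->barrier_pending = true;
    s->barrier_xid = xid;
    maybe_reply(s); /* replies immediately when nothing is in flight */
    return true;
}

/* Called when a previously started command has finished executing. */
void sw_command_done(struct barrier_state *s)
{
    if (s->in_flight > 0)
        s->in_flight--;
    maybe_reply(s);
}
```

A caller that gets false from sw_try_start_command() would queue the message and retry after the reply has been sent.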

Match on VLAN Priority Bits

Currently, the VLAN id is a field used in identifying a flow, but the priority bits in the VLAN tag are not. The proposal is simply to include the priority bits as a separate field to identify flows. Matching is possible as either an exact match on the 3 priority bits, or as a wildcard for the entire 3 bits.

Not all hardware can support this feature, so it is optional and its support is advertised through OpenFlow capabilities.

How: Add a dl_vlan_pcp field to struct ofp_match and add OFPFW_DL_VLAN_PCP to enum ofp_flow_wildcards. The relevant portions of openflow.h become:

/* Fields to match against flows */
struct ofp_match {
   uint32_t wildcards;           /* Wildcard fields. */
   uint16_t in_port;             /* Input switch port. */
   uint8_t dl_src[OFP_ETH_ALEN]; /* Ethernet source address. */
   uint8_t dl_dst[OFP_ETH_ALEN]; /* Ethernet destination address. */
   uint16_t dl_type;             /* Ethernet frame type. */
   uint16_t dl_vlan;             /* Input VLAN id. */
   uint8_t dl_vlan_pcp;          /* Input VLAN priority. */
   uint8_t nw_proto;             /* IP protocol. */
   uint32_t nw_src;              /* IP source address. */
   uint32_t nw_dst;              /* IP destination address. */
   uint16_t tp_src;              /* TCP/UDP source port. */
   uint16_t tp_dst;              /* TCP/UDP destination port. */
};
OFP_ASSERT(sizeof(struct ofp_match) == 36);
/* Flow wildcards. */
enum ofp_flow_wildcards {
   OFPFW_IN_PORT     = 1 << 0,  /* Switch input port. */
   OFPFW_DL_VLAN     = 1 << 1,  /* VLAN id. */
   OFPFW_DL_VLAN_PCP = 1 << 2,  /* VLAN priority. */
   OFPFW_DL_SRC      = 1 << 3,  /* Ethernet source address. */
   OFPFW_DL_DST      = 1 << 4,  /* Ethernet destination address. */
   OFPFW_DL_TYPE     = 1 << 5,  /* Ethernet frame type. */
   OFPFW_NW_PROTO    = 1 << 6,  /* IP protocol. */
   OFPFW_TP_SRC      = 1 << 7,  /* TCP/UDP source port. */
   OFPFW_TP_DST      = 1 << 8,  /* TCP/UDP destination port. */
 [...]
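
To make the exact-or-wildcard rule concrete, here is a minimal sketch of how a switch might compare the priority bits. The helper names and the SKETCH_ macro are hypothetical; the macro reuses the bit position of OFPFW_DL_VLAN_PCP shown above, and the extraction helper relies on the standard 802.1Q layout in which the priority occupies the top 3 bits of the 16-bit tag control field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as OFPFW_DL_VLAN_PCP in the excerpt above. */
#define SKETCH_OFPFW_DL_VLAN_PCP (1u << 2)

/* Extract the 3 priority bits from an 802.1Q tag control field
 * (layout: 3 bits priority, 1 bit DEI/CFI, 12 bits VLAN id). */
uint8_t vlan_tci_to_pcp(uint16_t tci)
{
    return (uint8_t)(tci >> 13);
}

/* Either the whole 3-bit field is wildcarded, or all 3 bits must match. */
bool vlan_pcp_matches(uint32_t wildcards, uint8_t flow_pcp, uint8_t pkt_pcp)
{
    if (wildcards & SKETCH_OFPFW_DL_VLAN_PCP)
        return true;
    return (flow_pcp & 0x07) == (pkt_pcp & 0x07);
}
```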

Selective Flow Expirations

What: Make flow expiration messages happen on a per-flow, rather than per-switch granularity.

Why: This could reduce flow expiration traffic for both single-owner OpenFlow networks and shared ones. In a single-owner OpenFlow network, you might only care about logging expiration events for specific flows. In a shared network, if anyone wants a flow expiration then everyone has to receive them. With per-flow expirations a virtualized controller sharing OpenFlow instances between two students does not need to enable expirations on a switch and filter those to specific students. In general, state should be per-flow rather than per-switch wherever possible, for flexibility and easier virtualization.

Another reason is to reduce processor load. The CPUs in embedded switches are not the fastest, and selectively sending flow expirations may reduce dropped packets.

How: Remove OFPC_SEND_FLOW_EXP in ofp_config_flags and add a newly defined send_flow_exp bit to each flow mod. When the switch software component is timing out flows, it checks this bit.
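
A minimal sketch of the per-flow check, assuming a simple timeout sweep over an array of flows. The struct layout and function name are hypothetical; only the send_flow_exp bit comes from the proposal above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct flow {
    uint16_t idle_timeout; /* seconds; 0 means no idle timeout */
    uint32_t idle_for;     /* seconds since the last matching packet */
    bool send_flow_exp;    /* per-flow bit taken from the flow mod */
    bool in_table;         /* false once the flow has been removed */
};

/* Remove idle flows and return how many expiration messages would be
 * sent. Flows without the bit are removed silently. */
size_t expire_flows(struct flow *flows, size_t n)
{
    size_t sent = 0;
    for (size_t i = 0; i < n; i++) {
        struct flow *f = &flows[i];
        if (!f->in_table || f->idle_timeout == 0 ||
            f->idle_for < f->idle_timeout)
            continue; /* not present, never idles out, or still fresh */
        f->in_table = false;
        if (f->send_flow_exp)
            sent++; /* only flows that asked for it generate a message */
    }
    return sent;
}
```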

Flow Mod Behavior

What: Define behavior for mods of conflicting flows which have the same priority.

Why: The spec currently does not allow conflicting flows with the same priority, but enforcing this efficiently is difficult for switch vendors. For debugging, however, all switch forwarding behavior should be defined.

How: Add a CHECK_OVERLAP flag to flow mods which requires the switch to do the (potentially more costly) check that there doesn't already exist a conflicting flow with the same priority. If there is one, the mod fails and an error code is returned. Support for this flag is required in an OpenFlow switch.
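
The overlap check could look roughly like the sketch below, which assumes that two flows conflict when they have the same priority and some packet could match both. The match is reduced to three hypothetical fields, and every name here is illustrative rather than taken from openflow.h. Note that an identical flow also counts as a conflict in this simplified version.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical wildcard bits; real matches have more fields. */
enum { FW_IN_PORT = 1 << 0, FW_DL_VLAN = 1 << 1, FW_TP_DST = 1 << 2 };

struct mini_flow {
    uint32_t wildcards; /* a set bit means the field is wildcarded */
    uint16_t in_port, dl_vlan, tp_dst;
    uint16_t priority;
};

/* A field allows a common packet if either side wildcards it or the
 * two values agree. */
static bool field_overlaps(const struct mini_flow *a, const struct mini_flow *b,
                           uint32_t bit, uint16_t va, uint16_t vb)
{
    return ((a->wildcards | b->wildcards) & bit) || va == vb;
}

bool flows_conflict(const struct mini_flow *a, const struct mini_flow *b)
{
    return a->priority == b->priority
        && field_overlaps(a, b, FW_IN_PORT, a->in_port, b->in_port)
        && field_overlaps(a, b, FW_DL_VLAN, a->dl_vlan, b->dl_vlan)
        && field_overlaps(a, b, FW_TP_DST, a->tp_dst, b->tp_dst);
}

/* Returns false when CHECK_OVERLAP was requested and a conflicting flow
 * exists, in which case the mod fails and an error is returned. */
bool flow_add_allowed(const struct mini_flow *table, size_t n,
                      const struct mini_flow *new_flow, bool check_overlap)
{
    if (!check_overlap)
        return true; /* the costly scan is skipped without the flag */
    for (size_t i = 0; i < n; i++)
        if (flows_conflict(&table[i], new_flow))
            return false;
    return true;
}
```

The linear scan is what makes the check potentially costly, which is why it is only performed when the flag is set.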

Fix Flow Expiration Duration Ambiguity

What: Fix an ambiguity regarding the "duration" field in the Flow Expiration message.

Why: The code and most of the spec indicate that it is the amount of time the flow has been in the flow table. However, on page 31, it states that it is the amount of time the flow received traffic.

To illustrate, suppose a flow expired because the idle timer went off. Traffic was received for 45 seconds and the idle timer was set to 30 seconds. In the "duration" field, we can return either 45 or 75. Right now, the code returns 75 (the amount of time the flow was active). The alternative is to send 45 (the amount of time the flow received traffic).

Clearly, if the controller always sets the idle timeout to 30 seconds, it's trivial to derive one from the other. If the controller uses different idle timeouts, the controller will need to store the idle timeouts for each flow. And if a hard timeout expired, then you have no idea if you've received traffic the entire time or it stopped at some time before an idle timeout would have expired.

How: In the "ofp_flow_expired" structure, add a 16-bit "idle_timeout" field in the location of "pad2" (this will reduce the pad by two bytes, but not grow the overall structure's size). The "idle_timeout" value will be whatever value the controller used when the flow was originally added. The hard timeout value will be obvious if it is the reason for the deletion, since the "reason" will be "OFPER_HARD_TIMEOUT" and the "duration" will be the hard timeout value.
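
With the idle timeout carried in the message, a controller can recover the second interpretation on its own. The following sketch shows the arithmetic from the 45/75 example; the enum and function names are hypothetical stand-ins, not identifiers from openflow.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the expiration reasons. */
enum sketch_reason { SKETCH_IDLE_TIMEOUT, SKETCH_HARD_TIMEOUT };

/* Derive how long the flow received traffic from the reported duration.
 * Returns false when this cannot be derived: after a hard timeout the
 * time of the last packet is unknown. */
bool seconds_receiving_traffic(enum sketch_reason reason, uint32_t duration,
                               uint16_t idle_timeout, uint32_t *out)
{
    if (reason != SKETCH_IDLE_TIMEOUT || duration < idle_timeout)
        return false;
    *out = duration - idle_timeout; /* e.g. 75 - 30 = 45 seconds */
    return true;
}
```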

Flow expiration from Deletes

What: Notify the controller of flows that are forcefully deleted

Why: Currently, the switch only notifies the controller when flows expire. When a switch is told to delete flows, it just silently discards them. If the switch sends a notification for deleted flows as well, the controller can get precise packet and byte counts for the flow. Without this, the controller must retrieve flow statistics and then delete the flow, during which time the counts could change. This could also aid in debugging if multiple applications or programs controlling the same switch are deleting each other's flows.

How: Add an "OFPER_DELETE" reason to "ofp_flow_expired_reason". When a flow is explicitly deleted and the "send flow expiration" bit is set, send a flow expiration message to the controller. Since this is no longer limited to expirations, the message type should be renamed "OFPT_FLOW_REMOVED" and the structure should be renamed "ofp_flow_removed".


Rewrite DSCP in IP ToS header

What: Flow action to rewrite the DiffServ CodePoint bits part of the IP ToS field in the IP header

Why: Enable basic QoS support with OpenFlow in the short term, without requiring a full QoS framework. Many existing switches can prioritize traffic based on the VLAN PCP or IP ToS fields.

How: To do basic QoS, the operator of a network could set up rules for different DiffServ classes directly on the switch, mapping the various DiffServ CodePoints to various QoS services, such as different priorities. The operator can then use OpenFlow to set the DSCP bits correctly on a per-flow basis.

The same can be done with the VLAN PCP, but using the IP ToS gives more flexibility (6 bits instead of 3) and easier integration into existing DiffServ environments. Because not all hardware can support it, this feature should be optional.

An additional possibility would be to add IP ToS matching to OpenFlow. Most hardware seems able to do that (HP, Broadcom), but it was not considered at this point.

Start Port Enumeration at 1

What: Start port IDs at 1 instead of 0

Why: The current reference design begins assigning OpenFlow port identifiers at zero. A number of protocols such as SNMP and STP start counting ports at one. To increase compatibility, we should consider starting at one instead of zero. This requires no protocol change, and is only a spec change and implementation detail.


Spec Changes

Official OpenFlow Port

What: Make 6633/TCP the official OpenFlow Port

Why: Older versions of OpenFlow used 975/TCP for plain TCP and 976/TCP for SSL-wrapped OpenFlow. Since the 0.8.9 release, we've been using port 6633/TCP for both the TCP and SSL versions of OpenFlow. This is not reflected in the spec yet.

Note: The long-term goal is to get an IANA-approved port for OpenFlow.

Remove Type(s)

What: Remove all references to "Type 0" and "Type 1"

Why: People tend to be confused about "Type 0" and "Type 1" vs. "Version 1.0" and "Version 2.0". To simplify we'll drop the reference to types and instead label features as REQUIRED or OPTIONAL in future versions of OpenFlow.

Clarify Matching Behavior for Flow Modification and Stats

The spec does not currently define how wildcard matching should behave for flow modification (specifically DELETE and MODIFY commands) and stats. A match will occur when a flow entry exactly matches or is more specific than the description in the flow_mod command. For example, if a flow delete command says to delete all flows with a destination port of 80, then a flow entry that is all wildcards will not be deleted. However, a flow delete command that is all wildcards will delete an entry that matches all port 80 traffic.
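
The "exactly matches or is more specific" rule can be sketched field by field. The two-field match and all names below are hypothetical simplifications: a flow entry is selected only if, for every field the command's description pins down, the entry pins down the same value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced, hypothetical wildcard bits for a two-field match. */
enum { FW_NW_PROTO = 1 << 0, FW_TP_DST = 1 << 1,
       FW_ALL = FW_NW_PROTO | FW_TP_DST };

struct mini_match {
    uint32_t wildcards; /* a set bit means the field is wildcarded */
    uint8_t nw_proto;
    uint16_t tp_dst;
};

static bool field_covered(uint32_t desc_wc, uint32_t entry_wc, uint32_t bit,
                          unsigned desc_val, unsigned entry_val)
{
    if (desc_wc & bit)
        return true;  /* the description accepts any value here */
    if (entry_wc & bit)
        return false; /* the entry is less specific than the description */
    return desc_val == entry_val;
}

/* True if a DELETE/MODIFY/stats description selects this flow entry. */
bool entry_selected(const struct mini_match *desc,
                    const struct mini_match *entry)
{
    return field_covered(desc->wildcards, entry->wildcards, FW_NW_PROTO,
                         desc->nw_proto, entry->nw_proto)
        && field_covered(desc->wildcards, entry->wildcards, FW_TP_DST,
                         desc->tp_dst, entry->tp_dst);
}
```

This reproduces the port 80 example: a description with only tp_dst = 80 set does not select an all-wildcard entry, while an all-wildcard description selects the port 80 entry.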

Clarify Spanning Tree

Modify spec to make explicit that packets received on ports that are disabled by spanning tree must follow the normal flow table processing path.

Clarify Transaction ID in Error Messages

Add to section 5.4.4, error messages: "If the error message is in response to a specific message from the controller, e.g., OFPET_BAD_REQUEST, OFPET_BAD_ACTION, OFPET_FLOW_MOD_FAILED, then the transaction ID in the header should match that of the offending message."

Clarify Format for Strip VLAN Action

Clarify that the OFPAT_STRIP_VLAN action takes no argument and strips the VLAN tag if one is present. The header is the generic action header.

Clarify Buffer Behavior

The spec doesn't have a policy about how buffered packets should be handled in a switch. In the reference implementation, packets are held in a circular buffer and guaranteed not to be reused for one second or until the controller specifies an action, whichever happens first. The spec should specify requirements and recommendations for buffered packets. For example: Switches MUST gracefully handle not getting a response from the controller about a buffered packet. The switch SHOULD prevent a buffer from being reused until it's been handled by the controller or some amount of time has passed.
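
One way to picture the circular-buffer policy is the sketch below. It is not the reference implementation's code: the ring size, names, and millisecond clock are assumptions, and a real switch would also need a way to tell a stale buffer id from a reused slot.

```c
#include <stdbool.h>
#include <stdint.h>

#define N_BUFFERS 4  /* hypothetical ring size */
#define HOLD_MS 1000 /* minimum hold time: one second */

struct pkt_buffer {
    bool held;             /* still waiting for the controller */
    uint64_t stored_at_ms; /* time the packet was buffered */
};

struct buffer_ring {
    struct pkt_buffer slot[N_BUFFERS];
    unsigned next; /* slot to try for the next packet */
};

/* Buffer a packet; returns the buffer id, or -1 if the next slot is still
 * protected (held and younger than the hold time). */
int ring_store(struct buffer_ring *r, uint64_t now_ms)
{
    struct pkt_buffer *b = &r->slot[r->next];
    if (b->held && now_ms - b->stored_at_ms < HOLD_MS)
        return -1;
    b->held = true;
    b->stored_at_ms = now_ms;
    int id = (int)r->next;
    r->next = (r->next + 1) % N_BUFFERS;
    return id;
}

/* The controller specified an action for this buffer. Returns false for an
 * unknown or already released id, so a late reply is handled gracefully. */
bool ring_release(struct buffer_ring *r, int id)
{
    if (id < 0 || id >= N_BUFFERS || !r->slot[id].held)
        return false;
    r->slot[id].held = false;
    return true;
}
```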

Add EPERM Error Type

An OpenFlow hypervisor might choose to reject an OpenFlow request, but currently has no message type defined for this error. It is not a vendor-specific thing, thus we should define a new error code, OFPFMC_EPERM. Also define similar error code additions for OFPET_BAD_REQUEST and OFPET_BAD_ACTION.

Fix Flow Table Matching Diagram

Figure 3 in the spec incorrectly shows Ethernet type = 0x8000 for IP matching, instead of 0x800.

Reference Implementation

This list focuses on items that are purely implementation changes, not spec features.

Minor Features

Protocol statistical information

What: Collect protocol statistical information

Why: During protocol development and plugfests, OpenFlow protocol statistical information is useful for debugging.

How: Add a protocol stats collector and PRIVATE_EXTENSION messages to communicate between OpenFlow control plane process and user interface process.

File:0.9.0-protocol-stat.pdf