Data model requirements

From OpenFlow Wiki

Note:

The goal of this writeup is to find a minimal feature set for a config language that is still generally useful to OpenFlow controller developers. While it does describe particulars (e.g. data types, function names), these are mainly for illustration. What matters is the expressiveness of the API.

The goal of the configuration interface is to provide a mechanism by which the controller can (efficiently) maintain state consistency with the configuration (and possibly stats) values of the switch.

Input collected from the following developers:

 Ben Pfaff, Teemu Koponen, Justin Pettit, KK Yap, Dave Erickson,
 Kyriakos Zarifis, Brandon Heller, Murphy McCaully, Peter Balland,
 Natasha Gude, Keith Amidon, Martin Casado, Arsalan Tavakoli


Data Model

We have agreement that the config protocol should go beyond simple key-value pairs and should support more complex relationships.

It is also generally agreed that a basic table model would be sufficient for expressing the data. Given the transactional and trigger requirements (below), this is probably the right model to ease implementation, since an implementation could rely on an existing db such as sqlite. We suggest the following data model:

Named tables

Tables can have multiple named attributes (columns). Attribute values can be:

   * scalars (64 bit signed int, doubles, strings, NULL values)
   * sets 
   * maps
   * record_uuid 

The record_uuids allow for inter-relation references without requiring the specification of primary keys (described in more detail later). There have been multiple suggestions that this be an RFC 4122 UUID, which simplifies the control logic. Arguments for this are: hashing is trivial, comparing object identity is trivial even across different relations, foreign keys don't need to include a table reference, and durability across crashes is simpler (there is no need to carefully create unique OIDs).

It would also be nice to have the database enforce uniqueness for the values of certain attributes. We consider this a want.

For example:

 Table: switch
 +------+-------+------------+-------------------+
 | OID  | name  | dpid       | description       |
 +------+-------+------------+-------------------+
 | 0x99 | "sw1" | 0xdeadbeef | "a great switch!" |
 +------+-------+------------+-------------------+
 | 0x98 | "sw2" | 0xdeadd0d0 | "a great switch!" |
 +------+-------+------------+-------------------+
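
As a rough illustration of the data model above, the following Python sketch stores rows keyed by an RFC 4122 UUID and checks that every attribute value is one of the listed kinds. The names `Table`, `insert`, and `is_valid_value` are hypothetical and not part of any agreed API. The sketch also assumes that set elements and map keys/values are themselves valid attribute values, which the writeup does not specify.

```python
import uuid

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def is_valid_value(value):
    """Return True if value is a scalar, set, map, or record UUID."""
    if isinstance(value, bool):
        return False  # bool subclasses int in Python but is not a listed type
    if isinstance(value, uuid.UUID):
        return True  # record_uuid reference
    if isinstance(value, int):
        return INT64_MIN <= value <= INT64_MAX  # 64-bit signed int
    if value is None or isinstance(value, (float, str)):
        return True  # NULL, double, string
    if isinstance(value, (set, frozenset)):
        return all(is_valid_value(v) for v in value)
    if isinstance(value, dict):
        return all(is_valid_value(k) and is_valid_value(v)
                   for k, v in value.items())
    return False


class Table:
    """A named table with a fixed list of named columns (hypothetical API)."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = tuple(columns)
        self.rows = {}  # record UUID -> {column: value}

    def insert(self, **values):
        if set(values) != set(self.columns):
            raise ValueError("row must supply exactly the table's columns")
        for column, value in values.items():
            if not is_valid_value(value):
                raise TypeError(f"unsupported value for column {column!r}")
        record_id = uuid.uuid4()  # random RFC 4122 UUID, no primary key needed
        self.rows[record_id] = dict(values)
        return record_id


switch = Table("switch", ["name", "dpid", "description"])
sw1 = switch.insert(name="sw1", dpid=0xDEADBEEF, description="a great switch!")
sw2 = switch.insert(name="sw2", dpid=0xDEADD0D0, description="a great switch!")
```

Because the identifiers are generated rather than chosen, a row in another table can refer to `sw1` by storing that UUID directly, without naming the `switch` table.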

Schema

We've identified no strong need for dynamic schemas. Further, supporting a dynamic schema could limit the performance of the implementation.

Therefore, the schema (still to be determined) will be static and not modifiable at runtime (that is, no create_table). There are a number of requests for the schema to contain tables with meta-information which allow enumeration and description of the current schema.

The schema should be versioned for backwards compatibility.
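
One way to picture the requested meta-information is a set of read-only tables derived from the static schema. The sketch below is only an assumption about what such tables might look like: the table names `_schema_version`, `_tables`, and `_columns`, the type labels, and the version string are all hypothetical, since the schema itself is still to be determined.

```python
# Hypothetical static schema: table name -> {column name: type label}.
SCHEMA_VERSION = "1.0"  # placeholder value, not an agreed version
SCHEMA = {
    "switch": {"name": "string", "dpid": "int64", "description": "string"},
}


def meta_tables(schema, version):
    """Build meta tables that let a client enumerate the schema."""
    tables = [{"table": name} for name in sorted(schema)]
    columns = [
        {"table": name, "column": column, "type": type_label}
        for name in sorted(schema)
        for column, type_label in schema[name].items()
    ]
    return {
        "_schema_version": [{"version": version}],
        "_tables": tables,
        "_columns": columns,
    }


meta = meta_tables(SCHEMA, SCHEMA_VERSION)
```

A controller could read `_schema_version` first and decide whether it understands the rest of the schema before issuing any other query.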

Data Access Model

Because this is intended for configuration management, it is assumed the data access model supports read/write semantics. That is, the API should provide the ability to read all rows in a table, modify rows, and delete rows.

Query Model

The query model is used to read the tables, to determine whether to run a transaction (described below), and to set triggers. There is unanimous agreement that the query interface should support comparison operators, including greater than/less than.

There have been some suggestions for supporting boolean operators, but this is not unanimously supported due to potential implementation difficulties. Therefore, we characterize this as a "want". One suggestion is to support comparators and AND.

  • Note that there appears to be a strong desire for more complex query operations specifically with regards to statistics. For the purposes of this writeup, we assume that the configuration interface does not necessarily extend to statistics (though it very well may) and therefore we forgo the additional implementation complexity needed to support more complex queries. Once the query mechanism is in place, it may be extended (in the future) to support a more sophisticated interface.
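
A minimal sketch of the "comparators and AND" suggestion follows. A query is written as a list of `(column, operator, value)` conditions that must all hold. The representation, the function names `matches` and `select`, and the sample `port` rows are illustrative assumptions. The sketch also assumes that a NULL value never satisfies a greater than/less than comparison.

```python
import operator

# Only the operators named in the text: equality, greater than, less than.
OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(row, conditions):
    """True if the row satisfies every (column, op, value) condition (AND)."""
    for column, op, value in conditions:
        actual = row.get(column)
        if op != "==" and (actual is None or value is None):
            return False  # assumption: NULL fails ordering comparisons
        if not OPS[op](actual, value):
            return False
    return True


def select(rows, conditions):
    """Return the rows that match all conditions, in their original order."""
    return [row for row in rows if matches(row, conditions)]


ports = [
    {"name": "eth0", "speed": 1000, "vlan": 10},
    {"name": "eth1", "speed": 100, "vlan": 10},
    {"name": "eth2", "speed": 1000, "vlan": None},
]

# All ports on VLAN 10 that are faster than 100.
fast_on_vlan_10 = select(ports, [("vlan", "==", 10), ("speed", ">", 100)])
```

Restricting queries to a flat conjunction keeps evaluation to one pass over the rows, which is part of why AND alone was suggested as a compromise.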

Transactional Model

In order to ensure consistency on write, the API must support a transactional model. This is especially critical given the nested nature of the data being managed. We suggest the following API:

Access/Modify/Delete commands can be submitted in batch and applied as a transaction (either all of them take effect or none do). Which list of commands is committed can depend on the result of a query. All transactions are uniquely identified by a transaction ID.

For example

submit_batch(query, list_of_commands_if_true, list_of_commands_if_false).

The semantics of this function are to execute the query, and if the result is true to transactionally apply the list of true commands, otherwise it transactionally applies the list of false commands.

The transaction ID can be used to determine what has changed in the DB since the transaction was executed.
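
The semantics of `submit_batch` can be sketched as follows. This is an in-memory illustration under several assumptions: the query is modeled as a Python callable over the tables, commands are plain tuples, and atomicity is obtained by applying the commands to a copy that is swapped in only if every command succeeds. The `Database` class and the command format are hypothetical.

```python
import copy
import itertools


class Database:
    """In-memory sketch of all-or-nothing batches (hypothetical API)."""

    def __init__(self, table_names):
        # Static schema: the set of tables is fixed at construction time.
        self.tables = {name: {} for name in table_names}
        self._txn_ids = itertools.count(1)

    def submit_batch(self, query, commands_if_true, commands_if_false):
        """Run query, then apply one command list atomically."""
        commands = commands_if_true if query(self.tables) else commands_if_false
        staged = copy.deepcopy(self.tables)
        for command in commands:
            self._apply(staged, command)  # an exception here aborts the batch
        self.tables = staged  # commit: all commands succeeded
        return next(self._txn_ids)  # unique transaction ID

    @staticmethod
    def _apply(tables, command):
        kind, table, row_id, *args = command
        rows = tables[table]
        if kind == "insert":
            if row_id in rows:
                raise KeyError(f"row {row_id!r} already exists")
            rows[row_id] = dict(args[0])
        elif kind == "modify":
            rows[row_id].update(args[0])  # KeyError if the row is missing
        elif kind == "delete":
            del rows[row_id]  # KeyError if the row is missing
        else:
            raise ValueError(f"unknown command {kind!r}")


db = Database(["switch"])
txn = db.submit_batch(
    lambda tables: len(tables["switch"]) == 0,      # query
    [("insert", "switch", "a", {"name": "sw1"})],   # applied if true
    [("modify", "switch", "a", {"name": "sw1"})],   # applied if false
)
```

Copying the whole database per batch is wasteful and is used here only to make the all-or-nothing behavior obvious. A real implementation would more likely use a journal or the transaction support of an underlying db.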

Trigger Model

To make state synchronization efficient, it would be useful to have a mechanism by which the controller can determine when state has changed on the switch without polling or doing periodic full state replication.

Therefore, at minimum, the API should support the placement of ephemeral triggers. Triggers are represented by a query which will fire if data covered by that query *might* have changed (false positives are OK). Once triggers have fired, they are removed from the DB and must be reapplied by the application.
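
The one-shot behavior described above might look like the sketch below. The `TriggerTable` class, its method names, and the use of a Python predicate in place of a query are assumptions made for illustration. The callback receives only the table name, reflecting that a trigger signals that data might have changed rather than delivering the change itself.

```python
class TriggerTable:
    """Ephemeral triggers: each fires at most once, then is removed."""

    def __init__(self):
        self._triggers = {}  # trigger id -> (table, predicate, callback)
        self._next_id = 1

    def register(self, table, predicate, callback):
        trigger_id = self._next_id
        self._next_id += 1
        self._triggers[trigger_id] = (table, predicate, callback)
        return trigger_id

    def notify_change(self, table, row):
        """Fire and remove every trigger whose query covers the changed row."""
        fired = [
            trigger_id
            for trigger_id, (t, predicate, _) in self._triggers.items()
            if t == table and predicate(row)
        ]
        for trigger_id in fired:
            _, _, callback = self._triggers.pop(trigger_id)
            callback(table)  # the application must re-register if still interested
        return fired


events = []
triggers = TriggerTable()
first = triggers.register("port", lambda row: row["speed"] > 100, events.append)
no_match = triggers.notify_change("port", {"speed": 10})      # not covered
fired_once = triggers.notify_change("port", {"speed": 1000})  # fires, removed
fired_again = triggers.notify_change("port", {"speed": 1000}) # nothing left
```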

Note that missed updates have to be handled by the application. To aid in this, the API can provide a way to get all data changed since a given transaction ID (representing only the current state). Even if triggers are persistent, the code will have to handle lost triggers, so the choice between persistent and non-persistent triggers does not affect code complexity. (However, non-persistent triggers do simplify state replication if needed.)
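
To recover from missed updates, a "changes since transaction ID" call only needs to remember the last transaction that touched each row. The sketch below assumes transaction IDs increase monotonically and reports a deleted row as `None`. The `VersionedTable` class and both conventions are hypothetical.

```python
class VersionedTable:
    """Tracks, per row, the ID of the last transaction that wrote it."""

    def __init__(self):
        self.rows = {}      # row id -> current row contents
        self.last_txn = {}  # row id -> txn id of the latest write or delete

    def write(self, txn_id, row_id, row):
        """Store a row, or record a deletion when row is None."""
        if row is None:
            self.rows.pop(row_id, None)
        else:
            self.rows[row_id] = dict(row)
        self.last_txn[row_id] = txn_id

    def changes_since(self, txn_id):
        """Current state of every row written after txn_id (None = deleted)."""
        return {
            row_id: self.rows.get(row_id)
            for row_id, last in self.last_txn.items()
            if last > txn_id
        }


table = VersionedTable()
table.write(1, "a", {"name": "sw1"})
table.write(2, "b", {"name": "sw2"})
table.write(3, "a", {"name": "sw1-renamed"})
table.write(4, "b", None)  # delete row "b"

since_2 = table.changes_since(2)
```

Only the current state is returned: a controller that last synchronized at transaction 2 sees the final contents of row `a` and the deletion of row `b`, not the intermediate steps.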

Based on discussions with Teemu (who has done multiple iterations on implementing exactly this sort of API), this trigger model nicely balances implementation complexity and expressibility.

  • Note there is some concern that implementing triggers may tax some hardware platforms due to the overhead of polling. Therefore, we suggest making this an optional feature. Switches which do not support triggers will return an error when the controller tries to register the trigger.

Referential Integrity Model

The database may (but is not required to) enforce all references within tables (attributes whose values are OIDs). This means that a modify or a delete cannot leave a dangling pointer anywhere in the database.

  • Note: this isn't a strict requirement, but it would help prevent errant controllers from creating inconsistent data at the switch and is in line with having a static schema.
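
If references are globally unique UUIDs, a dangling-pointer check does not need to know which table a reference points into. The following sketch scans every attribute value, including sets and maps, for UUIDs that no longer name a live row. The function names and the table layout (table name to rows keyed by UUID) are assumptions for illustration.

```python
import uuid


def referenced_ids(value):
    """Yield every record UUID contained in a scalar, set, or map value."""
    if isinstance(value, uuid.UUID):
        yield value
    elif isinstance(value, (set, frozenset)):
        for element in value:
            yield from referenced_ids(element)
    elif isinstance(value, dict):
        for key, element in value.items():
            yield from referenced_ids(key)
            yield from referenced_ids(element)


def dangling_references(tables):
    """Return (table, row id, missing id) for each reference to a missing row."""
    live = {row_id for rows in tables.values() for row_id in rows}
    return [
        (name, row_id, ref)
        for name, rows in tables.items()
        for row_id, row in rows.items()
        for value in row.values()
        for ref in referenced_ids(value)
        if ref not in live
    ]


sw, p1, p2 = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
tables = {
    "switch": {sw: {"name": "sw1", "ports": {p1, p2}}},
    "port": {p1: {"switch": sw}, p2: {"switch": sw}},
}
before_delete = dangling_references(tables)

del tables["port"][p2]  # an errant delete that skips the integrity check
after_delete = dangling_references(tables)
```

A database that enforces integrity could run such a check on the staged result of a transaction and reject the commit if the returned list is not empty.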

Durability

The only durability requirements are that all written state must retain transactional and referential integrity (a failure should not create an inconsistent db). However, there is no requirement to support durability per transaction. One recommendation is to provide an explicit "sync" command to the API so that the controller writer can specify their durability needs explicitly. If a switch is unable to persist the data (for example if it is rate limiting writes to persistent store) it can refuse a sync command with an error code.
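
The explicit "sync" recommendation could behave roughly as sketched below: writes land in volatile state, and `sync` either persists a consistent snapshot or refuses with an error when syncs arrive too quickly. The `Store` class, the `SyncRefused` exception, the `min_interval` parameter, and the injected clock are all hypothetical. The persistent store is simulated with an in-memory copy.

```python
import copy


class SyncRefused(Exception):
    """Raised when the switch declines to persist state right now."""


class Store:
    def __init__(self, min_interval, clock):
        self.min_interval = min_interval  # minimum seconds between syncs
        self.clock = clock                # callable returning the current time
        self.last_sync = None
        self.volatile = {}                # state after committed transactions
        self.persistent = {}              # stand-in for flash or disk

    def sync(self):
        """Persist the current state, or refuse if rate limited."""
        now = self.clock()
        if self.last_sync is not None and now - self.last_sync < self.min_interval:
            raise SyncRefused("persistent writes are rate limited")
        # Snapshot whole state so the persisted copy is always consistent.
        self.persistent = copy.deepcopy(self.volatile)
        self.last_sync = now


now = [0.0]  # fake clock so the example is deterministic
store = Store(min_interval=5.0, clock=lambda: now[0])
store.volatile["name"] = "sw1"
store.sync()  # first sync succeeds
```

A controller that receives the error can retry later. In the meantime the switch keeps operating on its volatile state, and the persisted copy remains the last consistent snapshot.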

Versioning

No data versioning information (db, table, etc) needs to be available through the API. The purpose of the config protocol is for synchronization of the current state, and there is no (compelling) use case to iterate over intermediate states.

As mentioned previously, schemas should be versioned for backwards compatibility.