<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-li-idr-mpls-path-programming-03"
     ipr="trust200902">
  <front>
    <title abbrev="BGP Extensions for MPLS Path Programming">BGP Extensions
    for Service-Oriented MPLS Path Programming (MPP)</title>

    <author fullname="Zhenbin Li" initials="Z. " surname="Li">
      <organization>Huawei Technologies</organization>

      <address>
        <postal>
          <street>Huawei Bld., No.156 Beiqing Rd.</street>

          <city>Beijing</city>

          <code>100095</code>

          <country>China</country>
        </postal>

        <email>lizhenbin@huawei.com</email>
      </address>
    </author>

    <author fullname="Shunwan Zhuang" initials="S. " surname="Zhuang">
      <organization>Huawei Technologies</organization>

      <address>
        <postal>
          <street>Huawei Bld., No.156 Beiqing Rd.</street>

          <city>Beijing</city>

          <code>100095</code>

          <country>China</country>
        </postal>

        <email>zhuangshunwan@huawei.com</email>
      </address>
    </author>

    <author fullname="Sujian Lu" initials="S." surname="Lu">
      <organization>Tencent</organization>

      <address>
        <postal>
          <street>Tengyun Building, Tower A, No. 397 Tianlin Road</street>

          <city>Shanghai</city>

          <region>Xuhui District</region>

          <code>200233</code>

          <country>China</country>
        </postal>

        <phone/>

        <facsimile/>

        <email>jasonlu@tencent.com</email>

        <uri/>
      </address>
    </author>

    <date day="05" month="May" year="2016"/>

    <abstract>
      <t>Service-oriented MPLS path programming (SoMPP) provides customized
      service processing based on flexible label combinations. BGP plays an
      important role in MPLS path programming: it downloads the programmed
      MPLS path and maps the service path to the transport path. This document
      defines BGP extensions to support service-oriented MPLS path
      programming.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>The label stack capability of MPLS is well suited to flexible path
      programming that satisfies a wide range of service requirements. In a
      distributed environment, however, this programming capability is
      difficult to implement and is usually confined to reachability. With
      the introduction of central control in the network, flexible MPLS
      programming becomes possible owing to two factors: 1. It becomes easier
      to allocate labels for purposes other than reachability; 2. It is easy
      to calculate the MPLS path with a global network view. Moreover, the
      MPLS path programming capability can be used to satisfy further
      service-bearing requirements in the service layer; this is defined as
      Service-oriented MPLS path programming. BGP plays an important role in
      MPLS path programming: it downloads the programmed MPLS path and maps
      the service path to the transport path. This document defines BGP
      extensions to support Service-oriented MPLS path programming.</t>
    </section>

    <section title="Terminology">
      <t>BGP: Border Gateway Protocol</t>

      <t>EVPN: Ethernet VPN</t>

      <t>L2VPN: Layer 2 VPN</t>

      <t>L3VPN: Layer 3 VPN</t>

      <t>MPP: MPLS Path Programming</t>

      <t>MVPN: Multicast VPN</t>

      <t>RR: Route Reflector</t>

      <t>SR-Path: Segment Routing Path</t>

      <t>NLRI: Network Layer Reachability Information</t>
    </section>

    <section title="Architecture and Use Cases of SoMPP">
      <t/>

      <section title="Architecture">
        <t>The architecture of BGP-based MPLS path programming is shown in
        Figure 1. Central control plays an important role in MPLS path
        programming and makes the capability easy to extend. The central
        controller can calculate paths with a global network view and program
        MPLS paths to satisfy different service requirements. The result of
        MPLS path programming is advertised from the central controller to the
        client nodes, such as the ingress PEs, through BGP extensions. When a
        client node receives the result of MPLS path programming, it installs
        the MPLS forwarding entry for the specified BGP prefix to implement
        the service processing.</t>

        <figure>
          <artwork><![CDATA[               +-------------------+              
               |      Central      |              
               |     Controller    |              
    |----------|(Path Calculation  |--------|     
    |          | /Path Programming)|        |     
    |          +-------------------+        |     
    |                                       |     
MPLS Path                                MPLS Path
    |                                       |     
    |                                       |     
    |                                       |     
 +--------+         +--------+         +--------+ 
 | CLIENT |         | CLIENT |         | CLIENT | 
 |        | ......  |        | ......  |        | 
 |  (PE)  |         |  (P)   |         |  (PE)  | 
 |        |         |        |         |        | 
 +--------+         +--------+         +--------+ 
                                                  
     Figure 1 BGP-based MPLS Path Programming     
]]></artwork>
        </figure>

        <t/>
      </section>

      <section title="Use Cases">
        <t/>

        <section title="Deterministic ECMP">
          <t>The entropy label <xref target="RFC6790"/> was introduced to
          improve ECMP by encapsulating an entropy label in the MPLS label
          stack. Existing implementations typically calculate the entropy
          label in the ingress node by applying a specific hash algorithm to
          the packet headers. That is, the entropy label is determined locally
          by the ingress node. This method can improve the hashing of packets
          in the network for load sharing. However, since the ingress node
          lacks knowledge of the global traffic pattern of the network and
          calculates the entropy label by itself, it may not be able to
          improve ECMP accurately, and in some cases it may worsen the
          load-sharing imbalance.</t>

          <t>With centrally controlled MPLS path programming, the central
          controller can collect global traffic pattern information for the
          network and, based on that information, deterministically calculate
          the entropy label for specific flows to improve load sharing in the
          network. The central controller can then download the label stack
          information, including the deterministic entropy label, to the
          ingress PEs for the specific BGP prefix. The ingress node can
          install the MPLS forwarding entry shown in the following figure to
          optimize ECMP for the flow specified by the BGP prefix, and thereby
          ECMP across the whole network.</t>

          <figure>
            <artwork><![CDATA[+----------+      +----------+----------+               
|   BGP    | ---> |  Entropy |BGP Prefix| ---> Transport
|  Prefix  |      |   Label  |   Label  |        Tunnel 
+----------+      +----------+----------+               
]]></artwork>
          </figure>

          <t/>
        </section>

        <section title="Centralized Mapping of Service to Tunnels">
          <t>A network can contain multiple tunnels to one specific
          destination, each satisfying different constraints. Traditionally,
          tunnels are set up by the distributed forwarding nodes. With the
          introduction of PCE-initiated LSP setup <xref
          target="I-D.ietf-pce-pce-initiated-lsp"/>, tunnels with different
          constraints can be set up in a centrally controlled way. To satisfy
          different service requirements, it is necessary to map each service
          flexibly to the tunnels whose constraints satisfy its requirements.
          Since the central controller has a view of the whole network, it can
          effectively map the service (such as L3VPN or L2VPN) to the tunnel
          and advertise the mapping information to the ingress PE of the
          service to guide the mapping in the forwarding node.</t>

          <t>There are two ways to map a service to a tunnel:</t>

          <t>1. Specify the tunnel type: with this method, BGP carries the
          tunnel type information for the BGP prefix. When the ingress PE
          receives the information, it uses the tunnel type and the nexthop
          address (or another specified target IP address) to look up the
          corresponding tunnels to bear the flow specified by the BGP prefix.
          If there is more than one tunnel, the ingress PE load shares the
          traffic across all the tunnels.</t>

          <t>2. Specify the specific tunnel: for MPLS TE/SR-TE tunnels, there
          can be multiple tunnels from one ingress PE to a specific
          destination with different constraints. BGP can carry the tunnel
          identifier information for the BGP prefix from the controller to the
          ingress node. When the ingress PE receives the information, it uses
          the tunnel identifier information to look up the corresponding
          tunnels to bear the flow specified by the BGP prefix. If there are
          multiple tunnel identifiers, the ingress PE load shares the traffic
          across all the tunnels.</t>
        </section>
      </section>
    </section>

    <section title="Advertising Label Stacks in BGP">
      <t>According to the service requirements, the central controller can
      combine MPLS labels flexibly and then download the resulting service
      label combination for a specific prefix. BGP extensions are necessary
      to advertise label stacks for the prefix in the NLRI field.</t>

      <figure>
        <artwork align="center"><![CDATA[  +---------------------------+
  |   Length (1 octet)        |
  +---------------------------+
  |   Label (3 octets)        |
  +---------------------------+
  .............................
  +---------------------------+
  |   Prefix (variable)       |
  +---------------------------+
Figure 2: NLRI Definition in RFC3107

]]></artwork>
      </figure>

      <t><xref target="RFC3107"/> defines the above NLRI to advertise a label
      binding for a specific prefix. The label field can carry one or more
      labels. Each label is encoded as 3 octets, where the high-order 20 bits
      contain the label value, and the low-order bit contains "Bottom of
      Stack". However, the other AFI/SAFIs that use label binding, such as
      IPv4 Flowspec, IPv6 Flowspec, VPNv4, VPNv6, EVPN, MVPN, etc., do not
      support carrying more than one label for a specific prefix. Moreover,
      the AFI/SAFIs that did not originally support label binding, but may
      now adopt MPLS path programming, have no label field in the NLRI. In
      order to support flexible MPLS path programming, this document defines
      a new BGP attribute called the "Extended Label attribute". This is an
      optional transitive BGP attribute. The attribute type code is (TBA by
      IANA). The value field of this attribute is defined as follows:</t>

      <figure>
        <artwork align="center"><![CDATA[  +---------------------------+
  |   Label 1 (3 octets)      |
  +---------------------------+
  |   Label 2 (3 octets)      |
  +---------------------------+
  .............................
  +---------------------------+
  |   Label n (3 octets)      |
  +---------------------------+
Figure 3: Extended Label Attribute

]]></artwork>
      </figure>

      <t>The Label field carries one or more labels (corresponding to the
      stack of labels defined in <xref target="RFC3032"/>). Each label is
      encoded as 3 octets, where the high-order 20 bits contain the label
      value, and the low-order bit contains "Bottom of Stack" (the S bit, as
      defined in <xref target="RFC3032"/>). In the last label, the S bit MUST
      be "1"; in the other labels, the S bit MUST be "0".</t>
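
      <t>As a non-normative illustration, the encoding rules above can be
      sketched in Python as follows. The function names are hypothetical, and
      the sketch assumes that the three bits between the label value and the
      S bit are sent as zero, which this document does not specify.</t>

      <figure>
        <artwork><![CDATA[
```python
def encode_label_stack(labels):
    """Encode labels as consecutive 3-octet entries (value field sketch)."""
    if not labels:
        raise ValueError("at least one label is required")
    out = bytearray()
    for index, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError("label value must fit in 20 bits")
        # The S bit is 1 only on the last label of the stack.
        s_bit = 1 if index == len(labels) - 1 else 0
        # High-order 20 bits: label value; low-order bit: S.
        # Assumption: the 3 bits in between are sent as zero.
        entry = (label << 4) | s_bit
        out += entry.to_bytes(3, "big")
    return bytes(out)


def decode_label_stack(value):
    """Decode the value field and enforce the S-bit rule."""
    if len(value) == 0 or len(value) % 3 != 0:
        raise ValueError("value must be a non-empty multiple of 3 octets")
    labels = []
    for offset in range(0, len(value), 3):
        entry = int.from_bytes(value[offset:offset + 3], "big")
        is_last = offset + 3 == len(value)
        if (entry & 1) != (1 if is_last else 0):
            raise ValueError("S bit must be set on the last label only")
        labels.append(entry >> 4)
    return labels
```
]]></artwork>
      </figure>

      <t>Under these assumptions, encode_label_stack([100, 200]) yields the
      six octets 00 06 40 00 0c 81, and decode_label_stack rejects any value
      in which the S bit appears on a label other than the last.</t>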

      <t>The "Extended Label attribute" can be used for various BGP address
      families. Before using this attribute, two nodes need to negotiate the
      capability to support MPLS path programming for the specific BGP
      address family. If negotiation fails, a node MUST NOT send this
      attribute and MUST discard this attribute if it is received.</t>

      <section title="Download of MPLS Path">
        <t>The Central Controller for MPLS path programming can build a
        route with the Extended Label attribute and send it to the ingress
        routers.</t>

        <t>Upon receiving such a route from the Central Controller, the
        ingress router SHOULD select such a route as the best path. When a
        packet arriving at the ingress router uses such a path, the ingress
        router pushes the label stack derived from the Extended Label
        attribute of the route onto the packet and forwards the packet along
        the path.</t>
      </section>

      <section title="Mapping Traffic to MPLS Path">
        <t>The Extended Label attribute can be used for BGP Flowspec address
        families. BGP advertises the Flowspec route with the Extended Label
        attribute, so that the matching packets can be redirected to the MPLS
        path derived from the Extended Label attribute.</t>
      </section>
    </section>

    <section title="Download of Mapping of Service Path to Transport Path">
      <t/>

      <section title="Specify Tunnel Type">
        <t><xref target="I-D.ietf-idr-tunnel-encaps"/> proposes the Tunnel
        Encapsulation Attribute, which can be used without the BGP
        Encapsulation SAFI to specify a set of tunnels. It defines a series of
        Encapsulation Sub-TLVs for particular tunnel types. It also defines
        the Remote Endpoint Attributes Sub-TLV to specify the remote tunnel
        endpoint address for each tunnel, which can be different from the BGP
        nexthop. The Tunnel Encapsulation Attribute can be reused for MPLS
        path programming to specify the tunnel types, the encapsulation, and
        the remote tunnel endpoint address, which together determine the set
        of tunnels to which the service can be mapped. Currently, only a
        limited set of MPLS tunnel types is defined for the Tunnel
        Encapsulation Attribute. In order to support MPLS path programming,
        the following MPLS tunnel types are to be defined:</t>

        <figure>
          <artwork><![CDATA[
     Value                  Tunnel Type
    -------      ---------------------------------------------------
      TBD        LDP LSP
      TBD        RSVP-TE LSP
      TBD        MPLS-based Segment Routing Best-effort Path
      TBD        MPLS-based Segment Routing Traffic Engineering Path

]]></artwork>
        </figure>

        <t/>
      </section>

      <section title="Specify Specific Tunnel">
        <t>Besides specifying the tunnel types to determine the set of tunnels
        to which the service traffic can be mapped, specific tunnels can be
        specified directly by their tunnel identifiers when mapping the
        service traffic to the path. A BGP extension is necessary so that the
        identifier of the transport path can be carried in a BGP attribute
        when the specific prefix is advertised.</t>

        <t>In order to support this application, this document defines a new
        BGP attribute called the "Extended Unicast Tunnel attribute". This is
        an optional transitive BGP attribute. The attribute type code is (TBA
        by IANA). The value field of this attribute is defined as follows:</t>

        <t><figure>
            <artwork align="center"><![CDATA[+--------------------------------------------------+
| First Tunnel entry (variable)                    |
+--------------------------------------------------+
| Second Tunnel entry (variable)                   |
+--------------------------------------------------+
| ...                                              |
+--------------------------------------------------+
| N-th Tunnel entry (variable)                     |
+--------------------------------------------------+

]]></artwork>
          </figure>The Tunnel entry is defined as follows:</t>

        <figure>
          <artwork align="center"><![CDATA[+------------------------------------------------+
|  Flags (1 octet)                               |
+------------------------------------------------+
|  Tunnel Type (1 octet)                         |
+------------------------------------------------+
|  Tunnel Identifier (variable)                  |
+------------------------------------------------+
| Tunnel Specific Attributes (Variable)(Optional)|
+------------------------------------------------+

]]></artwork>
        </figure>

        <t>The Flags field is reserved and must be set to zero. The Tunnel
        Type identifies the type of the tunneling technology used for the
        unicast service path. The tunnel type determines the syntax and
        semantics of the Tunnel Identifier field. This document defines the
        following Tunnel Types:</t>

        <t><list style="empty">
            <t hangText="+">+ 0 - No tunnel information present</t>

            <t hangText="+">+ 1 - RSVP-TE LSP</t>

            <t hangText="+">+ 2 - MPLS-based Segment Routing Traffic
            Engineering Path</t>
          </list>The Tunnel Specific Attributes field contains the attributes
        of the tunnel. The field is optional, and its value depends on the
        tunnel type. It will be defined in future versions of this
        document.</t>

        <t>When the Tunnel Type is set to "No tunnel information present", the
        Tunnel attribute carries no tunnel information (no Tunnel Identifier).
        When this type is used, the tunnel used for the service path is
        determined by the ingress router.</t>

        <t>When the Tunnel Type is set to RSVP - Traffic Engineering (RSVP-TE)
        Label Switched Path (LSP), the Tunnel Identifier is &lt;C-Type, Tunnel
        Sender Address, Tunnel ID, Tunnel End-point Address&gt; as specified
        in <xref target="RFC3209"/>. If C-Type = 7, the Tunnel Sender Address
        and Tunnel End-point Address are IPv4 addresses of 4 octets. If C-Type
        = 8, the Tunnel Sender Address and Tunnel End-point Address are IPv6
        addresses of 16 octets. The other fields in the RSVP-TE LSP Identifier
        are the same as specified in <xref target="RFC3209"/>.</t>

        <t>When the Tunnel Type is set to MPLS-based Segment Routing Traffic
        Engineering Path, the Tunnel Identifier is &lt;C-Type, Tunnel Sender
        Address, Tunnel ID, Tunnel End-point Address&gt;. If C-Type = 7, the
        Tunnel Sender Address and Tunnel End-point Address are IPv4 addresses
        of 4 octets. If C-Type = 8, the Tunnel Sender Address and Tunnel
        End-point Address are IPv6 addresses of 16 octets. The tunnel
        identifier is similar to that of the RSVP-TE LSP.</t>
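
        <t>As a non-normative illustration, one Tunnel entry for the RSVP-TE
        or Segment Routing Traffic Engineering tunnel types might be built as
        in the Python sketch below. The function and constant names are
        hypothetical. The sketch assumes a 1-octet C-Type and a 2-octet Tunnel
        ID, widths that this document does not state, and it omits the
        optional Tunnel Specific Attributes.</t>

        <figure>
          <artwork><![CDATA[
```python
import ipaddress
import struct

TUNNEL_TYPE_RSVP_TE = 1  # RSVP-TE LSP
TUNNEL_TYPE_SR_TE = 2    # MPLS-based Segment Routing TE Path
C_TYPE_IPV4 = 7
C_TYPE_IPV6 = 8


def encode_tunnel_entry(tunnel_type, sender, tunnel_id, endpoint):
    """Build Flags, Tunnel Type, and Tunnel Identifier for one entry."""
    if tunnel_type not in (TUNNEL_TYPE_RSVP_TE, TUNNEL_TYPE_SR_TE):
        raise ValueError("unsupported tunnel type for this sketch")
    src = ipaddress.ip_address(sender)
    dst = ipaddress.ip_address(endpoint)
    if src.version != dst.version:
        raise ValueError("sender and end-point must share an address family")
    # C-Type 7 selects 4-octet IPv4 addresses, C-Type 8 16-octet IPv6.
    c_type = C_TYPE_IPV4 if src.version == 4 else C_TYPE_IPV6
    flags = 0  # reserved, set to zero
    # Assumed widths: C-Type 1 octet, Tunnel ID 2 octets.
    header = struct.pack("!BBB", flags, tunnel_type, c_type)
    return header + src.packed + struct.pack("!H", tunnel_id) + dst.packed
```
]]></artwork>
        </figure>

        <t>With these assumed widths, an IPv4 entry occupies 13 octets and an
        IPv6 entry 37 octets. Because the Tunnel entry carries no length
        field, a receiver in this sketch would need the Tunnel Type and C-Type
        to find the start of the next entry.</t>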

        <t>BGP can carry multiple Tunnel entries in one Extended Unicast
        Tunnel attribute for a specific prefix. If there are multiple tunnel
        entries, the ingress PE can load share the service traffic determined
        by the specific BGP prefix across all the specified tunnels, or
        select primary and backup tunnels from among the tunnel entries.</t>

        <t>The "Redirect-to-Tunnel Action" for BGP Flowspec is described in
        <xref target="I-D.hao-idr-flowspec-redirect-tunnel"/>. This document
        reuses the tunnel identifier and defines it in the Extended Unicast
        Tunnel attribute, which can be used for the "Redirect-to-Tunnel
        Action".</t>
      </section>
    </section>

    <section title="Route Flag Extended Community">
      <t>For MPLS path programming to take effect, the route advertised by
      the central controller after MPLS path programming should be selected
      by the ingress PE over other routes for the same BGP prefix. There are
      two BGP extension options for this purpose:</t>

      <t>Option 1: A new BGP Extended Community called the "Route Flag
      Extended Community" can be introduced. The Type value is to be assigned
      by IANA.</t>

      <t>The Route Flag Extended Community is used to carry the flag set by
      the BGP central controller.</t>

      <t>The format of this extended community is defined as follows:</t>

      <t><figure>
          <artwork><![CDATA[    0     1     2     3     4     5     6     7   
 +-----+-----+-----+-----+-----+-----+-----+-----+
 |    Type   |  Reserved                   |Flag |
 +-----+-----+-----+-----+-----+-----+-----+-----+
                                                  
 Flag = 0, Treat as normal route
 Flag = 1, Treat as best route

]]></artwork>
        </figure>When a router receives a BGP route with a Route Flag Extended
      Community and the Flag set to "1", it SHOULD use the route as the best
      route when selecting among multiple routes for a specific prefix.</t>

      <t>Option 2: <xref target="I-D.ietf-idr-custom-decision"/> defines a new
      Extended Community, called the Cost Community, which can be used for
      tie-breaking during the best path selection process. The Cost Community
      can be reused for MPLS path programming by setting the "Point of
      Insertion" to 128 so that the route advertised by the central
      controller is chosen.</t>
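
      <t>As a non-normative illustration of Option 1, the Python sketch below
      builds and checks a Route Flag Extended Community. It assumes that each
      column in the format figure above is one octet of an 8-octet Extended
      Community (a 2-octet Type, 5 reserved octets, and a 1-octet Flag), which
      is one reading of the figure. The Type value is still to be assigned, so
      it is passed in as a parameter. All function names are hypothetical, and
      select_route assumes its candidates are already ordered by the normal
      BGP decision process.</t>

      <figure>
        <artwork><![CDATA[
```python
def encode_route_flag(type_field, best):
    """Build the 8-octet community: Type (2), Reserved (5), Flag (1)."""
    if len(type_field) != 2:
        raise ValueError("type_field must be exactly 2 octets")
    flag = 1 if best else 0  # 1: treat as best route, 0: normal route
    return bytes(type_field) + bytes(5) + bytes([flag])


def is_best_route(community):
    """Return True when the Flag octet marks the route as best."""
    if len(community) != 8:
        raise ValueError("an extended community is 8 octets long")
    return community[7] == 1


def select_route(candidates):
    """Pick a flagged route if present, else the first candidate.

    candidates is a list of (name, community_or_None) pairs, assumed to
    be ordered by the normal BGP decision process.
    """
    for name, community in candidates:
        if community is not None and is_best_route(community):
            return name
    return candidates[0][0]
```
]]></artwork>
      </figure>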
    </section>

    <section title="Destination Node Attribute">
      <t>This document defines a new BGP attribute called the "Destination
      Node attribute", whose Type value is to be assigned by IANA. The
      Destination Node attribute is an optional non-transitive attribute
      that can be applied to any address family.</t>

      <t>The Destination Node attribute carries a list of node addresses,
      which identify the nodes where the route with this attribute SHOULD be
      considered. If a node receives a BGP route with a Destination Node
      attribute, it MUST check the node address list. If one address in the
      list belongs to this node, the route MUST be used by this node.
      Otherwise the route MUST be silently ignored.</t>

      <t>The format of this attribute is defined as follows:</t>

      <t><figure>
          <artwork><![CDATA[ 0                   1                   2                   3   
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               AFI             |       SAFI    |    Reserved   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                                                               ~
~               Destination Node Address List                   ~
~                                                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


]]></artwork>
        </figure>AFI: Address Family Identifier (16 bits).</t>

      <t>SAFI: Subsequent Address Family Identifier (8 bits).</t>

      <t>Reserved: One octet reserved for special flags.</t>

      <t>Destination Node Address List: The list of IPv4 (AFI=1) or IPv6
      (AFI=2) addresses.</t>
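
      <t>As a non-normative illustration, the receive-side check described in
      this section might look like the Python sketch below. The function names
      are hypothetical, and the sketch assumes that the address list is a
      plain concatenation of fixed-length addresses (4 octets for AFI=1, 16
      octets for AFI=2) with no per-address length field.</t>

      <figure>
        <artwork><![CDATA[
```python
import ipaddress
import struct


def parse_destination_nodes(value):
    """Return the addresses carried in a Destination Node attribute."""
    if len(value) < 4:
        raise ValueError("attribute value is shorter than its header")
    afi, _safi, _reserved = struct.unpack("!HBB", value[:4])
    # AFI 1 means IPv4 (4 octets); AFI 2 means IPv6 (16 octets).
    addr_len = {1: 4, 2: 16}.get(afi)
    if addr_len is None:
        raise ValueError("unsupported AFI")
    body = value[4:]
    if len(body) % addr_len != 0:
        raise ValueError("truncated address list")
    return [ipaddress.ip_address(body[i:i + addr_len])
            for i in range(0, len(body), addr_len)]


def route_applies(value, local_addresses):
    """True if any listed address belongs to this node."""
    local = {ipaddress.ip_address(a) for a in local_addresses}
    return any(addr in local for addr in parse_destination_nodes(value))
```
]]></artwork>
      </figure>

      <t>A node in this sketch would use the route when route_applies returns
      True and silently ignore it otherwise.</t>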
    </section>

    <section title="Capability Negotiation">
      <t>It is necessary to negotiate the capability to support MPLS path
      programming. The MPLS-Path-Programming Capability is a new BGP
      capability <xref target="RFC5492"/>. The Capability Code for this
      capability is to be assigned by IANA. The Capability Length field of
      this capability is variable. The Capability Value field consists of one
      or more of the following tuples:</t>

      <t><figure>
          <artwork align="center"><![CDATA[+--------------------------------------------------+
|  Address Family Identifier (2 octets)            |
+--------------------------------------------------+
|  Subsequent Address Family Identifier (1 octet)  |
+--------------------------------------------------+
|  Send/Receive (1 octet)                          |
+--------------------------------------------------+
]]></artwork>
        </figure>The meaning and use of the fields are as follows:</t>

      <t>Address Family Identifier (AFI): This field is the same as the one
      used in <xref target="RFC4760"/>.</t>

      <t>Subsequent Address Family Identifier (SAFI): This field is the same
      as the one used in <xref target="RFC4760"/>.</t>

      <t>Send/Receive: This field indicates whether the sender is (a) willing
      to receive programmed MPLS paths from its peer (value 1), (b) would
      like to send programmed MPLS paths to its peer (value 2), or (c) both
      (value 3) for the &lt;AFI, SAFI&gt;.</t>
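
      <t>As a non-normative illustration, the Capability Value field and a
      possible negotiation check can be sketched in Python as follows. The
      function names are hypothetical. The rule in may_send (the local node
      sends only if it advertised "send" and the peer advertised "receive" for
      the same &lt;AFI, SAFI&gt;) is an assumption about how the negotiation
      result would be evaluated; this document does not spell it out.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

RECEIVE, SEND, BOTH = 1, 2, 3


def encode_capability_value(tuples):
    """Encode (afi, safi, send_receive) tuples, 4 octets each."""
    return b"".join(struct.pack("!HBB", afi, safi, sr)
                    for afi, safi, sr in tuples)


def decode_capability_value(value):
    """Return a dict mapping (afi, safi) to the Send/Receive value."""
    if len(value) == 0 or len(value) % 4 != 0:
        raise ValueError("value must hold one or more 4-octet tuples")
    result = {}
    for afi, safi, sr in struct.iter_unpack("!HBB", value):
        if sr not in (RECEIVE, SEND, BOTH):
            raise ValueError("unknown Send/Receive value")
        result[(afi, safi)] = sr
    return result


def may_send(local, peer, family):
    """Assumed rule: local advertised send and peer advertised receive."""
    local_sends = bool(local.get(family, 0) & SEND)
    peer_receives = bool(peer.get(family, 0) & RECEIVE)
    return local_sends and peer_receives
```
]]></artwork>
      </figure>

      <t>Here local and peer are the dictionaries produced by
      decode_capability_value for each side of the session, and family is an
      (AFI, SAFI) pair such as (1, 128).</t>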
    </section>

    <section title="Acknowledgments">
      <t>The authors of this document would like to thank Lucy Yong, Susan
      Hares, Eric Wu, Weiguo Hao, Pinan Li and Jie Dong for their reviews of
      and comments on this document.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>TBD.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The security considerations of <xref target="RFC4271"/> and <xref
      target="RFC5575"/> are applicable.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.4271'?>

      <?rfc include='reference.RFC.3032'?>

      <?rfc include='reference.RFC.3209'?>

      <?rfc include='reference.RFC.5036'?>

      <?rfc include='reference.RFC.4760'?>

      <?rfc include='reference.RFC.5492'?>

      <?rfc include='reference.RFC.5575'?>

      <?rfc include='reference.I-D.ietf-idr-tunnel-encaps'?>

      <?rfc include='reference.I-D.ietf-idr-custom-decision'?>

      <?rfc include='reference.I-D.hao-idr-flowspec-redirect-tunnel'?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.3107"?>

      <?rfc include='reference.RFC.6790'?>

      <?rfc include='reference.I-D.ietf-pce-pce-initiated-lsp'?>
    </references>
  </back>
</rfc>
