<?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
  <!-- generated by https://github.com/cabo/kramdown-rfc2629 version 1.0.28 -->

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
]>

<?rfc toc="yes"?>
<?rfc tocindent="yes"?>
<?rfc sortrefs="yes"?>
<?rfc symrefs="yes"?>
<?rfc strict="yes"?>
<?rfc compact="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>

<rfc ipr="trust200902" docName="draft-kerwin-http2-encoded-data-08" category="exp">

  <front>
    <title abbrev="http2-encoded-data">HTTP/2 Gzipped Data</title>

    <author initials="M." surname="Kerwin" fullname="Matthew Kerwin">
      <organization></organization>
      <address>
        <email>matthew@kerwin.net.au</email>
        <uri>http://matthew.kerwin.net.au/</uri>
      </address>
    </author>

    <date year="2016"/>

    <area>Applications and Real-Time</area>
    
    <keyword>HTTP</keyword> <keyword>H2</keyword> <keyword>GZIP</keyword>

    <abstract>


<t>This document introduces a new frame type for transporting gzip-encoded data between peers in the
Hypertext Transfer Protocol Version 2 (HTTP/2), and an associated error code for handling
invalid encoding.</t>

<t><spanx style="strong">Note to Readers</spanx></t>

<t>The issues list for this draft can be found at <eref target="https://github.com/phluid61/internet-drafts/labels/HTTP%2F2%20Gzipped%20Data">https://github.com/phluid61/internet-drafts/labels/HTTP%2F2%20Gzipped%20Data</eref></t>

<t>The most recent (often unpublished) draft is at <eref target="http://phluid61.github.io/internet-drafts/http2-encoded-data/">http://phluid61.github.io/internet-drafts/http2-encoded-data/</eref></t>



    </abstract>


  </front>

  <middle>


<section anchor="intro" title="Introduction">

<t>This document introduces a mechanism for applying gzip encoding <xref target="RFC1952"/> to data
transported between two endpoints in the Hypertext Transfer Protocol Version 2 (HTTP/2) <xref target="RFC7540"/>,
analogous to Transfer-Encoding in HTTP/1.1 <xref target="RFC7230"/>.</t>

<section anchor="notational-conventions" title="Notational Conventions">

<t>The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in <xref target="RFC2119"/>.</t>

</section>
</section>
<section anchor="additions" title="Additions to HTTP/2">

<t>This document introduces a new HTTP/2 frame type (<xref target="RFC7540"/>, Section 11.2),
a new HTTP/2 setting (<xref target="RFC7540"/>, Section 11.3),
and a new HTTP/2 error code (<xref target="RFC7540"/>, Section 7), to allow the compression
of data.</t>

<t>Note that while compressing some or all data in a stream might affect the total length of the
corresponding HTTP message body, the <spanx style="verb">content-length</spanx> header, if present, should continue to
reflect the total length of the <spanx style="emph">uncompressed</spanx> data. This is particularly relevant when detecting
malformed messages (<xref target="RFC7540"/>, Section 8.1.2.6).</t>
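
<t>As a rough, non-normative illustration, the following Python sketch compares a declared
content-length against the uncompressed size of a body carried in a mix of DATA and GZIPPED_DATA
frames. The function name and the (frame type, data) tuple representation are assumptions made
for this example only; they are not defined by this document.</t>

<figure><artwork><![CDATA[
```python
import gzip


def uncompressed_body_length(frames):
    """Sum the uncompressed size of a body split across frames.

    `frames` is a hypothetical list of (frame_type, data) tuples, where
    data is the frame's Data field with any padding already removed.
    """
    total = 0
    for frame_type, data in frames:
        if frame_type == "GZIPPED_DATA":
            total += len(gzip.decompress(data))
        else:
            total += len(data)
    return total


frames = [
    ("DATA", b"hello, "),
    ("GZIPPED_DATA", gzip.compress(b"world" * 100)),
]
declared_content_length = 7 + 500
# The message is well-formed only if the uncompressed total matches.
length_ok = uncompressed_body_length(frames) == declared_content_length
```
]]></artwork></figure>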

<section anchor="accept-gzipped-data" title="SETTINGS_ACCEPT_GZIPPED_DATA">

<t>SETTINGS_ACCEPT_GZIPPED_DATA (0xTBA) is used to indicate the sender’s ability and
willingness to receive GZIPPED_DATA frames. An endpoint MUST NOT send a GZIPPED_DATA
frame unless it receives this setting with a value of 1.</t>

<t>The initial value is 0, which indicates that GZIPPED_DATA frames are not supported. Any
value other than 0 or 1 MUST be treated as a connection error (<xref target="RFC7540"/>, Section 5.4.1)
of type PROTOCOL_ERROR.</t>

<t>An endpoint may advertise support for GZIPPED_DATA frames and later decide that it no longer
supports them.  After sending an ACCEPT_GZIPPED_DATA setting with the value 0, the endpoint
SHOULD continue to accept GZIPPED_DATA frames for a reasonable amount of time to account for
frames that may already be in flight.</t>
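
<t>The handling of this setting might be sketched in Python as follows. This is a minimal
illustration under stated assumptions: the class and exception names are hypothetical, and
because the setting identifier is not yet assigned, a placeholder value is used.</t>

<figure><artwork><![CDATA[
```python
# Placeholder only: the real identifier is "TBA" in this document.
SETTINGS_ACCEPT_GZIPPED_DATA = 0xF000
PROTOCOL_ERROR = 0x1  # error code defined in RFC 7540, Section 7


class H2ConnectionError(Exception):
    """Hypothetical exception representing an HTTP/2 connection error."""

    def __init__(self, error_code):
        super().__init__(f"connection error, code {error_code:#x}")
        self.error_code = error_code


class PeerSettings:
    """Tracks whether the peer has advertised GZIPPED_DATA support."""

    def __init__(self):
        self.accept_gzipped_data = 0  # initial value: not supported

    def apply(self, identifier, value):
        if identifier != SETTINGS_ACCEPT_GZIPPED_DATA:
            return  # other settings are out of scope for this sketch
        if value not in (0, 1):
            raise H2ConnectionError(PROTOCOL_ERROR)
        self.accept_gzipped_data = value

    def may_send_gzipped_data(self):
        return self.accept_gzipped_data == 1
```
]]></artwork></figure>

<t>A sender would consult may_send_gzipped_data() before emitting each GZIPPED_DATA frame, since
the peer can withdraw support at any time.</t>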

</section>
<section anchor="gzipped-data" title="GZIPPED_DATA">

<t>GZIPPED_DATA frames (type code=0xTBA) are semantically identical to DATA frames
(<xref target="RFC7540"/>, Section 6.1), but their payload is encoded using gzip compression.
In particular, the relative order of DATA and GZIPPED_DATA frames is semantically significant,
and GZIPPED_DATA frames are subject to flow control (<xref target="RFC7540"/>, Section 5.2).
Gzip compression is an LZ77 coding with a 32-bit CRC that is commonly produced
by the gzip file compression program <xref target="RFC1952"/>.</t>

<t>Any compression or decompression context for a GZIPPED_DATA frame is unique to that frame.
An endpoint MAY interleave DATA and GZIPPED_DATA frames on a single stream.</t>
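
<t>The per-frame context and the interleaving of frame types can be illustrated with a short
Python sketch using the standard gzip module. The helper names and the (frame type, data)
tuples are assumptions for this example, not part of the protocol.</t>

<figure><artwork><![CDATA[
```python
import gzip


def encode_frame_data(chunk):
    """Compress one chunk as a complete, self-contained gzip stream."""
    return gzip.compress(chunk)


def decode_frame_data(data):
    """Decompress one frame's Data field; no state spans frames."""
    return gzip.decompress(data)


# One stream carrying a DATA frame followed by a GZIPPED_DATA frame.
frames = [
    ("DATA", b"plain, "),
    ("GZIPPED_DATA", encode_frame_data(b"compressed")),
]
body = b"".join(
    decode_frame_data(data) if kind == "GZIPPED_DATA" else data
    for kind, data in frames
)
```
]]></artwork></figure>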

<figure title="GZIPPED_DATA Frame Payload"><artwork><![CDATA[
  +---------------+
  |Pad Length? (8)|
  +---------------+-----------------------------------------------+
  |                            Data (*)                         ...
  +---------------------------------------------------------------+
  |                           Padding (*)                       ...
  +---------------------------------------------------------------+
]]></artwork></figure>

<t>The GZIPPED_DATA frame contains the following fields:</t>

<t><list style="symbols">
  <t>Pad Length:
An 8-bit field containing the length of the frame padding in units
of octets. This field is optional and is only present if the
PADDED flag is set.</t>
  <t>Data:
Encoded application data. The amount of encoded data is the remainder of the frame
payload after subtracting the length of the other fields that are
present.</t>
  <t>Padding:
Padding octets that contain no application semantic value. Padding
octets MUST be set to zero when sending and ignored when receiving.</t>
</list></t>
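
<t>The field layout above might be handled as in the following Python sketch. The function names
are hypothetical, ValueError stands in for whatever error signalling an implementation uses, and
the check on oversized padding assumes the same rule as for DATA frames in
<xref target="RFC7540"/>, Section 6.1.</t>

<figure><artwork><![CDATA[
```python
def build_payload(data, pad_length=None):
    """Assemble a frame payload; pad_length=None means PADDED is unset."""
    if pad_length is None:
        return data
    # Padding octets are set to zero when sending.
    return bytes([pad_length]) + data + b"\x00" * pad_length


def split_payload(payload, padded):
    """Return (data, pad_length) for a received frame payload.

    `padded` reflects the PADDED flag from the frame header.
    """
    if not padded:
        return payload, 0
    if len(payload) < 1:
        raise ValueError("PADDED set but Pad Length field is missing")
    pad_length = payload[0]
    if pad_length > len(payload) - 1:
        # Assumption: treated as for DATA frames (RFC 7540, Section 6.1).
        raise ValueError("padding exceeds remaining payload")
    # Data is what remains after removing Pad Length and Padding.
    data = payload[1:len(payload) - pad_length]
    return data, pad_length
```
]]></artwork></figure>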

<t>The GZIPPED_DATA frame defines the following flags:</t>

<t><list style="symbols">
  <t><spanx style="verb">END_STREAM</spanx> (0x1):
When set, bit 0 indicates that this frame is the last that the
endpoint will send for the identified stream. Setting this flag
causes the stream to enter one of the “half-closed” states or the
“closed” state (<xref target="RFC7540"/>, Section 5.1).</t>
  <t><spanx style="verb">PADDED</spanx> (0x8):
When set, bit 3 indicates that the Pad Length field is present.</t>
</list></t>
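
<t>As a small illustration, the two flags can be tested against the flags octet of the frame
header with bit masks. The function name and the dictionary it returns are assumptions for this
sketch.</t>

<figure><artwork><![CDATA[
```python
END_STREAM = 0x1
PADDED = 0x8


def decode_flags(flags):
    """Interpret the flags octet of a GZIPPED_DATA frame header."""
    return {
        "end_stream": bool(flags & END_STREAM),
        "padded": bool(flags & PADDED),
    }
```
]]></artwork></figure>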

<t>A GZIPPED_DATA frame MUST NOT be sent if the ACCEPT_GZIPPED_DATA setting
of the peer is set to 0.  See <xref target="experiment"/>.</t>

<t>An intermediary, on receiving a GZIPPED_DATA frame, MAY decode the data and forward it to its
downstream peer in one or more DATA frames. If the downstream peer has not advertised support
for GZIPPED_DATA frames (by sending an ACCEPT_GZIPPED_DATA setting with the value 1),
the intermediary MUST decode the data before forwarding it.</t>
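
<t>One way an intermediary could apply this rule is sketched below in Python. The function name,
the (frame type, data) tuples, and the choice to pass frames through unchanged when the
downstream peer accepts them are all assumptions for illustration; the default of 16384 octets
is the initial maximum frame size from <xref target="RFC7540"/>. Padding and the END_STREAM flag
are left out of this sketch.</t>

<figure><artwork><![CDATA[
```python
import gzip


def forward(frame_type, data, downstream_accepts_gzip,
            max_frame_size=16384):
    """Return the list of (frame_type, data) tuples to send downstream."""
    if frame_type != "GZIPPED_DATA" or downstream_accepts_gzip:
        # Nothing to convert; pass the frame through unchanged.
        return [(frame_type, data)]
    # Downstream has not advertised support: decode before forwarding.
    decoded = gzip.decompress(data)
    chunks = [
        decoded[i:i + max_frame_size]
        for i in range(0, len(decoded), max_frame_size)
    ]
    # An empty body still needs one (empty) DATA frame.
    return [("DATA", chunk) for chunk in chunks] or [("DATA", b"")]
```
]]></artwork></figure>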

<t>If an endpoint detects that the payload of a GZIPPED_DATA frame is not encoded correctly,
for example with an incorrect checksum, the endpoint MUST
treat this as a stream error (see <xref target="RFC7540"/>, Section 5.4.2) of type
DATA_ENCODING_ERROR (<xref target="error"/>). The endpoint MAY then choose to immediately send an
ACCEPT_GZIPPED_DATA setting with the value 0.</t>
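
<t>Detection of an invalid encoding might look like the following Python sketch, which maps the
failures reported by the standard gzip module onto a stream error. The exception class is
hypothetical, and because the error code is not yet assigned, a symbolic placeholder is used.</t>

<figure><artwork><![CDATA[
```python
import gzip
import zlib

# Symbolic placeholder: the numeric code is "TBA" in this document.
DATA_ENCODING_ERROR = "DATA_ENCODING_ERROR"


class H2StreamError(Exception):
    """Hypothetical exception representing an HTTP/2 stream error."""

    def __init__(self, error_code):
        super().__init__(f"stream error: {error_code}")
        self.error_code = error_code


def decode_or_stream_error(data):
    """Decode one GZIPPED_DATA Data field or raise a stream error."""
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error) as exc:
        # Covers bad headers, CRC mismatches, truncation, and
        # corrupt deflate data.
        raise H2StreamError(DATA_ENCODING_ERROR) from exc
```
]]></artwork></figure>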

<t>If an intermediary propagates a GZIPPED_DATA frame from the source peer to the destination peer
without modifying the payload or its encoding, and receives a DATA_ENCODING_ERROR from the
receiving peer, it SHOULD pass the error on to the source peer.</t>

<t>GZIPPED_DATA frames MUST be associated with a stream. If a GZIPPED_DATA frame is received whose
stream identifier field is 0x0, the recipient MUST respond with a connection error
(<xref target="RFC7540"/>, Section 5.4.1) of type PROTOCOL_ERROR.</t>

<t>GZIPPED_DATA frames are subject to flow control and can only be sent when a stream is in the
“open” or “half-closed (remote)” state. The entire GZIPPED_DATA frame payload is included in flow
control, including the Pad Length and Padding fields if present. If a
GZIPPED_DATA frame is received whose stream is not in the “open” or “half-closed (local)” state, the
recipient MUST respond with a stream error (<xref target="RFC7540"/>, Section 5.4.2) of type
STREAM_CLOSED.</t>
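
<t>The receive-side checks on stream identifier and stream state, together with the flow-control
accounting, can be summarised in a short Python sketch. The function names, the string labels
for stream states, and the (scope, error type) return value are assumptions for this example.</t>

<figure><artwork><![CDATA[
```python
RECEIVABLE_STATES = {"open", "half-closed (local)"}


def check_received_frame(stream_id, stream_state):
    """Return None if acceptable, else a (scope, error type) pair."""
    if stream_id == 0x0:
        return ("connection", "PROTOCOL_ERROR")
    if stream_state not in RECEIVABLE_STATES:
        return ("stream", "STREAM_CLOSED")
    return None


def flow_control_cost(payload):
    """The whole payload counts, including Pad Length and Padding."""
    return len(payload)
```
]]></artwork></figure>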

<t>GZIPPED_DATA frames can include padding.  Padding fields and flags are identical to those defined
for DATA frames (<xref target="RFC7540"/>, Section 6.1).</t>

</section>
<section anchor="error" title="DATA_ENCODING_ERROR">

<t>The following new error code is defined:</t>

<t><list style="symbols">
  <t><spanx style="verb">DATA_ENCODING_ERROR</spanx> (0xTBA):
The endpoint detected that its peer sent a GZIPPED_DATA frame with an invalid encoding.</t>
</list></t>

</section>
</section>
<section anchor="experiment" title="Experimental Status">

<t>This extension is classified as an experiment because it alters the base semantics of HTTP/2;
a change that, if specified insufficiently or implemented incorrectly, could result in data loss
that is hard to detect or diagnose.</t>

<t><xref target="RFC7540"/>, Section 5.5, mandates that “implementations MUST discard frames that have unknown
or unsupported types”; so if an endpoint or intermediary mishandles GZIPPED_DATA frames, for
example by incorrectly emitting an ACCEPT_GZIPPED_DATA setting or propagating GZIPPED_DATA
frames, and those frames are subsequently discarded, data will be lost.  There is no reliable
mechanism to detect such a loss[*].</t>

<t>The experiment therefore is to explore the robustness of the HTTP/2 ecosystem in the presence of
such potential failures.</t>

<t>[*] For some unreliable mechanisms (i.e. not guaranteed to be in use in all cases, and/or
requiring inspection of HTTP headers) see:</t>

<t><list style="symbols">
  <t>Section 8.1.2.6 of <xref target="RFC7540"></xref>, for using the content-length header to detect malformed messages</t>
  <t><xref target="RFC3230"></xref>, for HTTP instance digests</t>
</list></t>

</section>
<section anchor="security" title="Security Considerations">

<t>Further to the Use of Compression in HTTP/2 (<xref target="RFC7540"/>, Section 10.6),
intermediaries MUST NOT apply compression to DATA frames, or alter the compression of
GZIPPED_DATA frames other than decompressing, unless additional information is available
that allows the intermediary to identify the source of data. In particular, an intermediary
cannot compress the contents of DATA frames into GZIPPED_DATA frames, and cannot merge
separately compressed frames into a single compressed frame.</t>

</section>
<section anchor="iana" title="IANA Considerations">

<t>This document updates the registries for frame types, settings, and error codes in
the “Hypertext Transfer Protocol (HTTP) 2 Parameters” section.</t>

<section anchor="http2-frame-type-registry-update" title="HTTP/2 Frame Type Registry Update">

<t>This document updates the “HTTP/2 Frame Type” registry
(<xref target="RFC7540"/>, Section 11.2).  The entries in the
following table are registered by this document.</t>

<texttable>
      <ttcol align='left'>Frame Type</ttcol>
      <ttcol align='left'>Code</ttcol>
      <ttcol align='left'>Section</ttcol>
      <c>GZIPPED_DATA</c>
      <c>TBD</c>
      <c><xref target="gzipped-data"/></c>
</texttable>

</section>
<section anchor="http2-settings-registry-update" title="HTTP/2 Settings Registry Update">

<t>This document updates the “HTTP/2 Settings” registry
(<xref target="RFC7540"/>, Section 11.3).  The entries in the
following table are registered by this document.</t>

<texttable>
      <ttcol align='left'>Name</ttcol>
      <ttcol align='left'>Code</ttcol>
      <ttcol align='left'>Initial Value</ttcol>
      <ttcol align='left'>Specification</ttcol>
      <c>ACCEPT_GZIPPED_DATA</c>
      <c>TBD</c>
      <c>0</c>
      <c><xref target="accept-gzipped-data"/></c>
</texttable>

</section>
<section anchor="http2-error-code-registry-update" title="HTTP/2 Error Code Registry Update">

<t>This document updates the “HTTP/2 Error Code” registry
(<xref target="RFC7540"/>, Section 11.4).  The entries in the
following table are registered by this document.</t>

<texttable>
      <ttcol align='left'>Name</ttcol>
      <ttcol align='left'>Code</ttcol>
      <ttcol align='left'>Description</ttcol>
      <ttcol align='left'>Specification</ttcol>
      <c>DATA_ENCODING_ERROR</c>
      <c>TBD</c>
      <c>Invalid encoding detected</c>
      <c><xref target="error"/></c>
</texttable>

</section>
</section>
<section anchor="acknowledgements" title="Acknowledgements">

<t>Thanks to Keith Morgan for his advice, input, and editorial contributions.</t>

</section>


  </middle>

  <back>

    <references title='Normative References'>





<reference  anchor='RFC1952' target='http://www.rfc-editor.org/info/rfc1952'>
<front>
<title>GZIP file format specification version 4.3</title>
<author initials='P.' surname='Deutsch' fullname='P. Deutsch'><organization /></author>
<date year='1996' month='May' />
<abstract><t>This specification defines a lossless compressed data format that is compatible with the widely used GZIP utility.  This memo provides information for the Internet community.  This memo does not specify an Internet standard of any kind.</t></abstract>
</front>
<seriesInfo name='RFC' value='1952'/>
<seriesInfo name='DOI' value='10.17487/RFC1952'/>
</reference>



<reference  anchor='RFC2119' target='http://www.rfc-editor.org/info/rfc2119'>
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials='S.' surname='Bradner' fullname='S. Bradner'><organization /></author>
<date year='1997' month='March' />
<abstract><t>In many standards track documents several words are used to signify the requirements in the specification.  These words are often capitalized. This document defines these words as they should be interpreted in IETF documents.  This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract>
</front>
<seriesInfo name='BCP' value='14'/>
<seriesInfo name='RFC' value='2119'/>
<seriesInfo name='DOI' value='10.17487/RFC2119'/>
</reference>



<reference  anchor='RFC7540' target='http://www.rfc-editor.org/info/rfc7540'>
<front>
<title>Hypertext Transfer Protocol Version 2 (HTTP/2)</title>
<author initials='M.' surname='Belshe' fullname='M. Belshe'><organization /></author>
<author initials='R.' surname='Peon' fullname='R. Peon'><organization /></author>
<author initials='M.' surname='Thomson' fullname='M. Thomson' role='editor'><organization /></author>
<date year='2015' month='May' />
<abstract><t>This specification describes an optimized expression of the semantics of the Hypertext Transfer Protocol (HTTP), referred to as HTTP version 2 (HTTP/2).  HTTP/2 enables a more efficient use of network resources and a reduced perception of latency by introducing header field compression and allowing multiple concurrent exchanges on the same connection.  It also introduces unsolicited push of representations from servers to clients.</t><t>This specification is an alternative to, but does not obsolete, the HTTP/1.1 message syntax.  HTTP's existing semantics remain unchanged.</t></abstract>
</front>
<seriesInfo name='RFC' value='7540'/>
<seriesInfo name='DOI' value='10.17487/RFC7540'/>
</reference>




    </references>

    <references title='Informative References'>





<reference  anchor='RFC7230' target='http://www.rfc-editor.org/info/rfc7230'>
<front>
<title>Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing</title>
<author initials='R.' surname='Fielding' fullname='R. Fielding' role='editor'><organization /></author>
<author initials='J.' surname='Reschke' fullname='J. Reschke' role='editor'><organization /></author>
<date year='2014' month='June' />
<abstract><t>The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems.  This document provides an overview of HTTP architecture and its associated terminology, defines the &quot;http&quot; and &quot;https&quot; Uniform Resource Identifier (URI) schemes, defines the HTTP/1.1 message syntax and parsing requirements, and describes related security concerns for implementations.</t></abstract>
</front>
<seriesInfo name='RFC' value='7230'/>
<seriesInfo name='DOI' value='10.17487/RFC7230'/>
</reference>



<reference  anchor='RFC3230' target='http://www.rfc-editor.org/info/rfc3230'>
<front>
<title>Instance Digests in HTTP</title>
<author initials='J.' surname='Mogul' fullname='J. Mogul'><organization /></author>
<author initials='A.' surname='Van Hoff' fullname='A. Van Hoff'><organization /></author>
<date year='2002' month='January' />
<abstract><t>HTTP/1.1 defines a Content-MD5 header that allows a server to include a digest of the response body.  However, this is specifically defined to cover the body of the actual message, not the contents of the full file (which might be quite different, if the response is a Content-Range, or uses a delta encoding).  Also, the Content-MD5 is limited to one specific digest algorithm; other algorithms, such as SHA-1 (Secure Hash Standard), may be more appropriate in some circumstances.  Finally, HTTP/1.1 provides no explicit mechanism by which a client may request a digest.  This document proposes HTTP extensions that solve these problems.  [STANDARDS-TRACK]</t></abstract>
</front>
<seriesInfo name='RFC' value='3230'/>
<seriesInfo name='DOI' value='10.17487/RFC3230'/>
</reference>




    </references>


<section anchor="changelog" title="Changelog">

<t>Since -07:</t>

<t><list style="symbols">
  <t>define “reliable” in the ‘experimental’ section, and provide pointers
to potential workarounds</t>
  <t>remove fragmentation, since the text added no value</t>
</list></t>

<t>Since -06:</t>

<t><list style="symbols">
  <t>change document title from “Encoded” to “Gzipped”</t>
  <t>improve text under GZIPPED_DATA (<xref target="gzipped-data"/>)</t>
  <t>clarify that GZIPPED_DATA and DATA can be interleaved</t>
  <t>explain experimental status and risks of broken implementations</t>
</list></t>

<t>Since -05:</t>

<t><list style="symbols">
  <t>changed ACCEPT_ENCODED_DATA back from a frame to a setting, since it
carries a single scalar value now</t>
</list></t>

<t>Since -04:</t>

<t><list style="symbols">
  <t>reduced encoding options to only gzip (suggested by Martin Thomson)</t>
  <t>remove fragmentation and segment stuff, including reference to ‘http2-segments’ I-D</t>
  <t>updated HTTP/2 reference from I-D to (freshly published) RFC7540</t>
</list></t>

<t>Since -03:</t>

<t><list style="symbols">
  <t>added ‘identity’ encoding; removed ‘compress’ and ‘zlib’ (suggested by PHK)</t>
  <t>added SEGMENT flag, for segments that don’t continue</t>
  <t>clarified that ACCEPT is for a connection, and ENCODED_DATA is for a stream</t>
  <t>copied “padding” text from HTTP/2 draft</t>
</list></t>

<t>Since -02:</t>

<t><list style="symbols">
  <t>moved all discussion of fragmentation and segments to its own section</t>
</list></t>

<t>Since -01:</t>

<t><list style="symbols">
  <t>referenced new draft-kerwin-http2-segments to handle fragmentation</t>
</list></t>

<t>Since -00:</t>

<t><list style="symbols">
  <t>changed ACCEPT_ENCODED_DATA from a complex setting to a frame</t>
  <t>improved IANA Considerations section (with lots of input from Keith Morgan)</t>
</list></t>

</section>


  </back>
</rfc>

