HTTP                                                       M. Nottingham
Internet-Draft                                          18 November 2024
Intended status: Standards Track
Expires: 22 May 2025

                  Retrofit Structured Fields for HTTP
                   draft-ietf-httpbis-retrofit-latest

Abstract

This specification nominates a selection of existing HTTP fields whose values are compatible with Structured Fields syntax, so that they can be handled as such (subject to certain caveats).

To accommodate some additional fields whose syntax is not compatible, it also defines mappings of their semantics into Structured Fields. It does not specify how to convey them in HTTP messages.

About This Document

This note is to be removed before publishing as an RFC.

Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-httpbis-retrofit/.

Discussion of this document takes place on the HTTP Working Group mailing list (mailto:ietf-http-wg@w3.org), which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/. Working Group information can be found at https://httpwg.org/.

Source for this draft and an issue tracker can be found at https://github.com/httpwg/http-extensions/labels/retrofit.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 22 May 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Using Retrofit Structured Fields
     1.2.  Notational Conventions
   2.  Compatible Fields
     2.1.  Caveats
   3.  Mapped Fields
     3.1.  URLs
     3.2.  Dates
     3.3.  ETags
     3.4.  Cookies
   4.  IANA Considerations
   5.  Security Considerations
   6.  Normative References
   Author's Address

1.  Introduction

Structured Field Values for HTTP [STRUCTURED-FIELDS] introduced a data model with associated parsing and serialization algorithms for use by new HTTP field values. Fields that are defined as Structured Fields can bring advantages that include:

*  Improved interoperability and security: precisely defined parsing and serialisation algorithms are typically not available for fields defined with just ABNF and/or prose.

*  Reuse of common implementations: many parsers for other fields are specific to a single field or a small family of fields.

*  Canonical form: because a deterministic serialisation algorithm is defined for each type, Structured Fields have a canonical representation.

*  Enhanced API support: a regular data model makes it easier to expose field values as a native data structure in implementations.

*  Alternative serialisations: while [STRUCTURED-FIELDS] defines a textual serialisation of that data model, other, more efficient serialisations of the underlying data model are also possible.

However, a field needs to be defined as a Structured Field for these benefits to be realised.
Many existing fields are not, making up the bulk of header and trailer fields seen in HTTP traffic on the internet.

This specification defines how a selection of existing HTTP fields can be handled as Structured Fields, so that these benefits can be realised -- thereby making them Retrofit Structured Fields.

It does so using two techniques. Section 2 lists compatible fields -- those that can be handled as if they were Structured Fields due to the similarity of their defined syntax to that in Structured Fields. Section 3 lists mapped fields -- those whose syntax needs to be transformed into an underlying data model which is then mapped into that defined by Structured Fields.

1.1.  Using Retrofit Structured Fields

Retrofitting data structures onto existing and widely-deployed HTTP fields requires careful handling to assure interoperability and security. This section highlights considerations for applications that use Retrofit Structured Fields.

While the majority of field values seen in HTTP traffic should be able to be parsed or mapped successfully, some will not. An application using Retrofit Structured Fields will need to define how unsuccessful values will be handled.

For example, an API that exposes field values using Structured Fields data types might make the field value available as a string in cases where the field did not successfully parse or map.

The mapped field values described in Section 3 are not compatible with the original syntax of their fields, and so cannot be used unless parties processing them have explicitly indicated their support for that form of the field value. An application using Retrofit Structured Fields will need to define how to negotiate support for them.

For example, an alternative serialization of fields that takes advantage of Structured Fields would need to establish an explicit negotiation mechanism to assure that both peers would handle that serialization appropriately before using it.
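
As an illustration of the string fallback described above, the following Python sketch exposes a field such as Age as a Structured Fields Integer when it parses, and as the original string when it does not. The names FieldResult and read_integer_field are hypothetical, and the sketch assumes a bare Integer Item without Parameters; a real application would use a complete Structured Fields parser.

```python
import re
from dataclasses import dataclass
from typing import Union

# Structured Fields Integers: an optional "-" followed by 1 to 15 digits.
_SF_INTEGER = re.compile(r"-?[0-9]{1,15}")


@dataclass
class FieldResult:
    structured: bool        # True if the value parsed as a Structured Field
    value: Union[int, str]  # the parsed Integer, or the raw string on failure


def read_integer_field(raw: str) -> FieldResult:
    """Try to read a field value as a bare Structured Fields Integer Item."""
    candidate = raw.strip(" ")  # leading and trailing SP are discarded
    if _SF_INTEGER.fullmatch(candidate):
        return FieldResult(True, int(candidate))
    # Fallback (one possible policy): hand the caller the original string.
    return FieldResult(False, raw)
```

Returning an explicit flag alongside the value lets a caller tell a parsed Integer from an opaque string without inspecting types; an application could equally choose to reject such values outright.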
See also the security considerations in Section 5.

1.2.  Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.  Compatible Fields

The HTTP fields listed in Table 1 have values that can be handled as Structured Field Values according to the parsing and serialisation algorithms in [STRUCTURED-FIELDS] corresponding to the listed top-level type, subject to the caveats in Section 2.1.

The top-level types are chosen for compatibility with the defined syntax of the field as well as with actual internet traffic. However, not all instances of these fields will successfully parse as a Structured Field Value. This might be because the field value is clearly invalid, or it might be because it is valid but not parseable as a Structured Field.

An application using this specification will need to consider how to handle such field values. Depending on its requirements, it might be advisable to reject such values, treat them as opaque strings, or attempt to recover a Structured Field Value from them in an ad hoc fashion.

+==================================+=================+
| Field Name                       | Structured Type |
+==================================+=================+
| Accept                           | List            |
+----------------------------------+-----------------+
| Accept-Encoding                  | List            |
+----------------------------------+-----------------+
| Accept-Language                  | List            |
+----------------------------------+-----------------+
| Accept-Patch                     | List            |
+----------------------------------+-----------------+
| Accept-Post                      | List            |
+----------------------------------+-----------------+
| Accept-Ranges                    | List            |
+----------------------------------+-----------------+
| Access-Control-Allow-Credentials | Item            |
+----------------------------------+-----------------+
| Access-Control-Allow-Headers     | List            |
+----------------------------------+-----------------+
| Access-Control-Allow-Methods     | List            |
+----------------------------------+-----------------+
| Access-Control-Allow-Origin      | Item            |
+----------------------------------+-----------------+
| Access-Control-Expose-Headers    | List            |
+----------------------------------+-----------------+
| Access-Control-Max-Age           | Item            |
+----------------------------------+-----------------+
| Access-Control-Request-Headers   | List            |
+----------------------------------+-----------------+
| Access-Control-Request-Method    | Item            |
+----------------------------------+-----------------+
| Age                              | Item            |
+----------------------------------+-----------------+
| Allow                            | List            |
+----------------------------------+-----------------+
| ALPN                             | List            |
+----------------------------------+-----------------+
| Alt-Svc                          | Dictionary      |
+----------------------------------+-----------------+
| Alt-Used                         | Item            |
+----------------------------------+-----------------+
| Cache-Control                    | Dictionary      |
+----------------------------------+-----------------+
| CDN-Loop                         | List            |
+----------------------------------+-----------------+
| Clear-Site-Data                  | List            |
+----------------------------------+-----------------+
| Connection                       | List            |
+----------------------------------+-----------------+
| Content-Encoding                 | List            |
+----------------------------------+-----------------+
| Content-Language                 | List            |
+----------------------------------+-----------------+
| Content-Length                   | List            |
+----------------------------------+-----------------+
| Content-Type                     | Item            |
+----------------------------------+-----------------+
| Cross-Origin-Resource-Policy     | Item            |
+----------------------------------+-----------------+
| DNT                              | Item            |
+----------------------------------+-----------------+
| Expect                           | Dictionary      |
+----------------------------------+-----------------+
| Expect-CT                        | Dictionary      |
+----------------------------------+-----------------+
| Host                             | Item            |
+----------------------------------+-----------------+
| Keep-Alive                       | Dictionary      |
+----------------------------------+-----------------+
| Max-Forwards                     | Item            |
+----------------------------------+-----------------+
| Origin                           | Item            |
+----------------------------------+-----------------+
| Pragma                           | Dictionary      |
+----------------------------------+-----------------+
| Prefer                           | Dictionary      |
+----------------------------------+-----------------+
| Preference-Applied               | Dictionary      |
+----------------------------------+-----------------+
| Retry-After                      | Item            |
+----------------------------------+-----------------+
| Sec-WebSocket-Extensions         | List            |
+----------------------------------+-----------------+
| Sec-WebSocket-Protocol           | List            |
+----------------------------------+-----------------+
| Sec-WebSocket-Version            | Item            |
+----------------------------------+-----------------+
| Server-Timing                    | List            |
+----------------------------------+-----------------+
| Surrogate-Control                | Dictionary      |
+----------------------------------+-----------------+
| TE                               | List            |
+----------------------------------+-----------------+
| Timing-Allow-Origin              | List            |
+----------------------------------+-----------------+
| Trailer                          | List            |
+----------------------------------+-----------------+
| Transfer-Encoding                | List            |
+----------------------------------+-----------------+
| Upgrade-Insecure-Requests        | Item            |
+----------------------------------+-----------------+
| Vary                             | List            |
+----------------------------------+-----------------+
| X-Content-Type-Options           | Item            |
+----------------------------------+-----------------+
| X-Frame-Options                  | Item            |
+----------------------------------+-----------------+
| X-XSS-Protection                 | List            |
+----------------------------------+-----------------+

                 Table 1: Compatible Fields

2.1.  Caveats

Note the following caveats regarding compatibility:

Parsing differences:  Some values may fail to parse as Structured Fields, even though they are valid according to their originally specified syntax. For example, HTTP parameter names are case-insensitive (per Section 5.6.6 of [HTTP]), but Structured Fields require them to be all-lowercase. Likewise, many Dictionary-based fields (e.g., Cache-Control, Expect-CT, Pragma, Prefer, Preference-Applied, Surrogate-Control) have case-insensitive keys. Similarly, the parameters rule in HTTP (see Section 5.6.6 of [HTTP]) allows whitespace before the ";" delimiter, but Structured Fields does not. In addition, Section 5.6.4 of [HTTP] allows backslash-escaping most characters in quoted strings, whereas Structured Field Strings only escape "\" and DQUOTE. The vast majority of fields seen in typical traffic do not exhibit these behaviors.

Error handling:  Parsing algorithms specified (or just widely implemented) for current HTTP headers may differ from those in Structured Fields in details such as error handling. For example, HTTP specifies that repeated directives in the Cache-Control header field have a different precedence than that assigned by a Dictionary structured field (which Cache-Control is mapped to).

Token limitations:  In Structured Fields, tokens are required to begin with an alphabetic character or "*", whereas HTTP tokens allow a wider range of characters. This prevents the use of values that begin with one of those additional characters. For example, media types, field names, methods, range-units, character and transfer codings that begin with a number or special character other than "*" might be valid HTTP protocol elements, but will not be able to be represented as Structured Field Tokens.

Integer limitations:  Structured Fields Integers can have at most 15 digits; larger values will not be able to be represented in them.

IPv6 Literals:  Fields whose values contain IPv6 literal addresses (such as CDN-Loop, Host, and Origin) are not able to be represented as Structured Fields Tokens, because the brackets used to delimit them are not allowed in Tokens.

Empty Field Values:  Empty and whitespace-only field values are considered errors in Structured Fields. For compatible fields, an empty field indicates that the field should be silently ignored.

Alt-Svc:  Some ALPN tokens (e.g., h3-Q43) do not conform to key's syntax, and therefore cannot be represented as a Token. Since the final version of HTTP/3 uses the h3 token, this shouldn't be a long-term issue, although future tokens may again violate this assumption.

Content-Length:  Note that Content-Length is defined as a List because it is not uncommon for implementations to mistakenly send multiple values. See Section 8.6 of [HTTP] for handling requirements.

Retry-After:  Only the delta-seconds form of Retry-After can be represented; a Retry-After value containing an HTTP-date will need to be converted into delta-seconds to be conveyed as a Structured Field Value.

3.  Mapped Fields

Some HTTP field values have syntax that cannot be successfully parsed as Structured Field values. Instead, it is necessary to map them into a Structured Field value.
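
The kind of mapping this section describes can be sketched in Python for date-valued fields: parse the original value, then emit the Structured Fields Date form ("@" followed by integer seconds since the Unix epoch). The function name map_http_date is hypothetical, and the sketch assumes that Python's email.utils.parsedate_to_datetime is an acceptable stand-in for the HTTP-date parsing in Section 5.6.7 of [HTTP]; it is exercised here only with the preferred IMF-fixdate format.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def map_http_date(value: str) -> str:
    """Map an HTTP-date field value to a Structured Fields Date string."""
    parsed = parsedate_to_datetime(value)  # raises on unparseable input
    if parsed.tzinfo is None:
        # Assumption: a value without zone information is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return f"@{int(parsed.timestamp())}"


print(map_http_date("Sun, 06 Nov 1994 08:49:37 GMT"))  # @784111777
```

A caller would still need to decide what to do when parsing raises, in line with the handling of unsuccessful values discussed in Section 1.1.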

For example, the Date HTTP header field carries a date:

   Date: Sun, 06 Nov 1994 08:49:37 GMT

Its value would be mapped to:

   @784111777

Unlike those listed in Section 2, these representations are not compatible with the original fields' syntax, and MUST NOT be used unless they are explicitly and unambiguously supported. For example, this means that sending them to a next-hop recipient in HTTP requires prior negotiation. This specification does not define how to do so.

3.1.  URLs

The field names in Table 2 have values that can be mapped into Structured Field values by treating the original field's value as a String.

+==================+
| Field Name       |
+==================+
| Content-Location |
+------------------+
| Location         |
+------------------+
| Referer          |
+------------------+

 Table 2: URL Fields

For example, this Location field:

   Location: https://example.com/foo

would have a mapped value of:

   "https://example.com/foo"

3.2.  Dates

The field names in Table 3 have values that can be mapped into Structured Field values by parsing their payload according to Section 5.6.7 of [HTTP] and representing the result as a Date.

+=====================+
| Field Name          |
+=====================+
| Date                |
+---------------------+
| Expires             |
+---------------------+
| If-Modified-Since   |
+---------------------+
| If-Unmodified-Since |
+---------------------+
| Last-Modified       |
+---------------------+

 Table 3: Date Fields

For example, an Expires field's value could be mapped as:

   @1659578233

3.3.  ETags

The field value of the ETag header field can be mapped into a Structured Field value by representing the entity-tag as a String, and the weakness flag as a Boolean "w" parameter on it, where true indicates that the entity-tag is weak; if false or absent, the entity-tag is strong.

For example, this ETag header field:

   ETag: W/"abcdef"

would have a mapped value of:

   "abcdef"; w

If-None-Match's field value can be mapped into a Structured Field value which is a List of the structure described above.
When a field value contains "*", it is represented as a Token.

Likewise, If-Match's field value can be mapped into a Structured Field value in the same manner.

For example, this If-None-Match field:

   If-None-Match: W/"abcdef", "ghijkl", *

would have a mapped value of:

   "abcdef"; w, "ghijkl", *

3.4.  Cookies

The field values of the Cookie and Set-Cookie fields [COOKIES] can be mapped into Structured Fields Lists.

In each case, a cookie is represented as an Inner List containing two Items: the cookie name and value. The cookie name is always a String; the cookie value is a String, unless it can be successfully parsed as the textual representation of another, bare Item structured type (e.g., Byte Sequence, Decimal, Integer, Token, or Boolean).

Cookie attributes map to Parameters on the Inner List, with the parameter name being forced to lowercase. Cookie attribute values are Strings unless a specific type is defined for them. This specification defines types for existing cookie attributes in Table 4.

+================+=================+
| Parameter Name | Structured Type |
+================+=================+
| Domain         | String          |
+----------------+-----------------+
| HttpOnly       | Boolean         |
+----------------+-----------------+
| Expires        | Date            |
+----------------+-----------------+
| Max-Age        | Integer         |
+----------------+-----------------+
| Path           | String          |
+----------------+-----------------+
| Secure         | Boolean         |
+----------------+-----------------+
| SameSite       | Token           |
+----------------+-----------------+

 Table 4: Set-Cookie Parameter Types

The Expires attribute is mapped to a Date representation of parsed-cookie-date (see Section 5.1.1 of [COOKIES]).
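
A minimal Python sketch of the Set-Cookie mapping, using the attribute types from Table 4, might look like the following. The function names sf_string and map_set_cookie are hypothetical. The sketch makes several simplifying assumptions: the cookie value is always emitted as a String (it does not attempt to detect the other bare Item types described above), email.utils.parsedate_to_datetime stands in for the parsed-cookie-date algorithm of [COOKIES], and Parameters are joined with "; " in the style of the examples in this section.

```python
from email.utils import parsedate_to_datetime

BOOLEAN_ATTRS = {"httponly", "secure"}  # Boolean per Table 4
TOKEN_ATTRS = {"samesite"}              # Token per Table 4


def sf_string(text: str) -> str:
    """Serialise text as a Structured Fields String, escaping backslash and DQUOTE."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def map_set_cookie(value: str) -> str:
    """Map one Set-Cookie field value to its Structured Fields form."""
    pair, *attrs = [part.strip() for part in value.split(";")]
    name, _, cookie_value = pair.partition("=")
    # Assumption: the cookie value is always treated as a String here.
    out = f"({sf_string(name)} {sf_string(cookie_value)})"
    for attr in attrs:
        if not attr:
            continue  # tolerate a trailing ";"
        key, _, val = attr.partition("=")
        key, val = key.strip().lower(), val.strip()  # names forced to lowercase
        if key in BOOLEAN_ATTRS:
            out += f"; {key}"  # a bare key is Boolean true
        elif key == "expires":
            out += f"; {key}=@{int(parsedate_to_datetime(val).timestamp())}"
        elif key == "max-age":
            out += f"; {key}={int(val)}"
        elif key in TOKEN_ATTRS:
            out += f"; {key}={val}"
        else:
            out += f"; {key}={sf_string(val)}"  # Domain, Path, unknown attributes
    return out
```

Keeping the typed attributes in small lookup sets mirrors the table-driven definition above, so a type added to the registry would only need a new entry rather than new control flow.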

For example, this Set-Cookie field:

   Set-Cookie: Lang=en-US; Expires=Wed, 09 Jun 2021 10:18:14 GMT;
                  samesite=Strict; secure

would have a mapped value of:

   ("Lang" "en-US"); expires=@1623233894; samesite=Strict; secure

And this Cookie field:

   Cookie: SID=31d4d96e407aad42; lang=en-US

would have a mapped value of:

   ("SID" "31d4d96e407aad42"), ("lang" "en-US")

4.  IANA Considerations

Please add the following note to the "Hypertext Transfer Protocol (HTTP) Field Name Registry":

   A prefix of "*" in the Structured Type column indicates that it is a retrofit type (i.e., not natively Structured); see RFC nnnn.

Then, add a new column, "Structured Type", with the values from Section 2 assigned to the nominated registrations, prefixing each with "*" to indicate that it is a retrofit type.

Finally, add a new column to the "Cookie Attribute Registry" established by [COOKIES] with the title "Structured Type", using information from Table 4.

5.  Security Considerations

Section 2 identifies existing HTTP fields that can be parsed and serialised with the algorithms defined in [STRUCTURED-FIELDS]. Variances from existing parser behavior might be exploitable, particularly if they allow an attacker to target one implementation in a chain (e.g., an intermediary). However, given the considerable variance in parsers already deployed, convergence towards a single parsing algorithm is likely to have a net security benefit in the longer term.

Section 3 defines alternative representations of existing fields. Because downstream consumers might interpret the message differently based upon whether they recognise the alternative representation, implementations are prohibited from generating such values unless they have negotiated support for them with their peer. This specification does not define such a mechanism, but any such definition needs to consider the implications of doing so carefully.

6.  Normative References

[COOKIES]  Bingler, S., West, M., and J.
Wilander, "Cookies: HTTP State Management Mechanism", Work in Progress, Internet-Draft, draft-ietf-httpbis-rfc6265bis-15, 21 July 2024.

[HTTP]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[STRUCTURED-FIELDS]  Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", Work in Progress, Internet-Draft, draft-ietf-httpbis-sfbis-06, 21 April 2024.

Author's Address

Mark Nottingham
Prahran
Australia
Email: mnot@mnot.net
URI:   https://www.mnot.net/