Internet Engineering Task Force (IETF)                     M. Nottingham
Request for Comments: 8288                                September 2017
Obsoletes: 5988
Category: Standards Track
ISSN: 2070-1721

Web Linking


Abstract

This specification defines a model for the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

It also defines the serialisation of such links in HTTP headers with the Link header field.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc8288.

Copyright Notice

Copyright © 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

1. Introduction

This specification defines a model for the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

HTML [W3C.REC-html5-20141028] and Atom [RFC4287] both have well-defined concepts of linking; Section 2 generalises this into a framework that encompasses linking in these formats and (potentially) elsewhere.

Furthermore, Section 3 defines an HTTP header field for conveying such links.

1.1. Notational Conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses the Augmented Backus-Naur Form (ABNF) [RFC5234] notation of [RFC7230], including the #rule, and explicitly includes the following rules from it: quoted-string, token, SP (space), BWS (bad whitespace), OWS (optional whitespace), RWS (required whitespace), LOALPHA, DIGIT.

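Additionally, the following rules are included:

  • URI and URI-Reference from [RFC3986],
  • type-name and subtype-name from [RFC6838],
  • media-query-list from [W3C.REC-css3-mediaqueries-20120619], and
  • Language-Tag from [RFC5646].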

1.2. Conformance and Error Handling

The requirements regarding conformance and error handling highlighted in [RFC7230], Section 2.5 apply to this document.

4. IANA Considerations

5. Security Considerations

The content of the Link header field is not secure, private, or integrity-guaranteed. Use of Transport Layer Security (TLS) with HTTP [RFC2818] is currently the only end-to-end way to provide these properties.

Link applications ought to consider the attack vectors opened by automatically following, trusting, or otherwise using links gathered from HTTP header fields.

For example, Link header fields that use the “anchor” parameter to associate a link’s context with another resource cannot be trusted since they are effectively assertions by a third party that could be incorrect or malicious. Applications can mitigate this risk by specifying that such links should be discarded unless some relationship between the resources is established (e.g., they share the same authority).
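The same-authority mitigation mentioned above could be sketched as follows. This is a minimal illustration, not a requirement of this specification: the function name keep_anchored_link is hypothetical, both arguments are assumed to be absolute URIs that have already been resolved, and "same authority" is approximated by comparing host and port as reported by Python's urllib.parse.

```python
from urllib.parse import urlsplit


def keep_anchored_link(anchor_uri: str, request_uri: str) -> bool:
    """Return True when the anchor and the requested URI share host and port.

    Hypothetical helper: an application would discard the link when this
    returns False. Both arguments are assumed to be absolute URIs.
    """
    anchor = urlsplit(anchor_uri)
    request = urlsplit(request_uri)
    # urlsplit lowercases the hostname; the port is None when not given.
    return (anchor.hostname, anchor.port) == (request.hostname, request.port)
```

An application with stricter needs might also compare the scheme, or apply its own notion of a trusted relationship between the two resources.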

Dereferencing links has a number of risks, depending on the application in use. For example, the Referer header [RFC7231] can expose information about the application’s state (including private information) in its value. Likewise, cookies [RFC6265] are another mechanism that, if used, can become an attack vector. Applications can mitigate these risks by carefully specifying how such mechanisms should operate.

The Link header field makes extensive use of IRIs and URIs. See [RFC3987], Section 8 for security considerations relating to IRIs. See [RFC3986], Section 7 for security considerations relating to URIs. See [RFC7230], Section 9 for security considerations relating to HTTP header fields.

6. Internationalisation Considerations

Link targets may need to be converted to URIs in order to express them in serialisations that do not support IRIs. This includes the Link HTTP header field.

Similarly, the anchor parameter of the Link header field does not support IRIs; therefore, IRIs must be converted to URIs before inclusion there.
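As a rough illustration of the conversion described above, the sketch below percent-encodes each non-ASCII character of an IRI as its UTF-8 octets. It is a simplification of the IRI-to-URI mapping in [RFC3987]: the function name iri_to_uri is hypothetical, and the sketch assumes the input is already a well-formed IRI, performs no normalisation, and applies no special handling to the host component.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode every non-ASCII character of iri as UTF-8 octets.

    Simplified sketch: ASCII characters (including existing percent-escapes)
    are passed through unchanged.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)
```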

Relation types are defined as URIs, not IRIs, to aid in their comparison. It is not expected that they will be displayed to end users.

Note that registered Relation Names are required to be lowercase ASCII letters.

7. References

7.1. Normative References

[RFC2119]
Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC3864]
Klyne, G., Nottingham, M., and J. Mogul, “Registration Procedures for Message Header Fields”, BCP 90, RFC 3864, DOI 10.17487/RFC3864, September 2004, <https://www.rfc-editor.org/info/rfc3864>.
[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax”, STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <https://www.rfc-editor.org/info/rfc3986>.
[RFC3987]
Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs)”, RFC 3987, DOI 10.17487/RFC3987, January 2005, <https://www.rfc-editor.org/info/rfc3987>.
[RFC5234]
Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF”, STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008, <https://www.rfc-editor.org/info/rfc5234>.
[RFC5646]
Phillips, A., Ed. and M. Davis, Ed., “Tags for Identifying Languages”, BCP 47, RFC 5646, DOI 10.17487/RFC5646, September 2009, <https://www.rfc-editor.org/info/rfc5646>.
[RFC6838]
Freed, N., Klensin, J., and T. Hansen, “Media Type Specifications and Registration Procedures”, BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, <https://www.rfc-editor.org/info/rfc6838>.
[RFC7230]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing”, RFC 7230, DOI 10.17487/RFC7230, June 2014, <https://www.rfc-editor.org/info/rfc7230>.
[RFC7231]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content”, RFC 7231, DOI 10.17487/RFC7231, June 2014, <https://www.rfc-editor.org/info/rfc7231>.
[RFC8126]
Cotton, M., Leiba, B., and T. Narten, “Guidelines for Writing an IANA Considerations Section in RFCs”, BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <https://www.rfc-editor.org/info/rfc8126>.
[RFC8174]
Leiba, B., “Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words”, BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8187]
Reschke, J., “Indicating Character Encoding and Language for HTTP Header Field Parameters”, RFC 8187, DOI 10.17487/RFC8187, September 2017, <https://www.rfc-editor.org/info/rfc8187>.
[W3C.REC-css3-mediaqueries-20120619]
Rivoal, F., “Media Queries”, World Wide Web Consortium Recommendation REC-css3-mediaqueries-20120619, June 2012, <http://www.w3.org/TR/2012/REC-css3-mediaqueries-20120619>.

7.2. Informative References

[RFC2046]
Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types”, RFC 2046, DOI 10.17487/RFC2046, November 1996, <https://www.rfc-editor.org/info/rfc2046>.
[RFC2818]
Rescorla, E., “HTTP Over TLS”, RFC 2818, DOI 10.17487/RFC2818, May 2000, <https://www.rfc-editor.org/info/rfc2818>.
[RFC4287]
Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format”, RFC 4287, DOI 10.17487/RFC4287, December 2005, <https://www.rfc-editor.org/info/rfc4287>.
[RFC6265]
Barth, A., “HTTP State Management Mechanism”, RFC 6265, DOI 10.17487/RFC6265, April 2011, <https://www.rfc-editor.org/info/rfc6265>.
[W3C.REC-html5-20141028]
Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Navara, E., O'Connor, T., and S. Pfeiffer, “HTML5”, World Wide Web Consortium Recommendation REC-html5-20141028, October 2014, <http://www.w3.org/TR/2014/REC-html5-20141028>.

B. Algorithms for Parsing Link Header Fields

This appendix outlines a set of non-normative algorithms: one for parsing the Link header field(s) out of a header set, one for parsing a Link header field value, and others for parsing generic parts of the field value.

These algorithms are more permissive than the ABNF defining the syntax might suggest; the error handling embodied in them is a reasonable approach, but not one that is required. As such they are advisory only, and in cases where there is disagreement, the correct behaviour is defined by the body of this specification.

B.1. Parsing a Header Set for Links

This algorithm can be used to parse the Link header fields that an HTTP header set contains. Given a header_set of (string field_name, string field_value) pairs, assuming ASCII encoding, it returns a list of link objects.

  1. Let field_values be a list containing the members of header_set whose field_name is a case-insensitive match for “link”.
  2. Let links be an empty list.
  3. For each field_value in field_values:
    1. Let value_links be the result of Parsing A Link Field Value (Appendix B.2) from field_value.
    2. Append each member of value_links to links.
  4. Return links.
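
The steps above might be sketched in Python as follows. This is an illustration only: the name parse_header_set is hypothetical, and the field-value parser (Appendix B.2) is passed in as a callable so that the sketch stays self-contained.

```python
from typing import Callable, Iterable, List, Tuple


def parse_header_set(
    header_set: Iterable[Tuple[str, str]],
    parse_link_field_value: Callable[[str], list],
) -> List:
    """Collect the links from every field whose name matches "link"."""
    links = []
    for field_name, field_value in header_set:
        # Field names are compared case-insensitively.
        if field_name.lower() == "link":
            links.extend(parse_link_field_value(field_value))
    return links
```

In use, parse_link_field_value would be an implementation of Appendix B.2; any callable that returns a list of link objects for a field value fits.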

B.2. Parsing a Link Field Value

This algorithm parses zero or more comma-separated link-values from a Link header field. Given a string field_value, assuming ASCII encoding, it returns a list of link objects.

  1. Let links be an empty list.
  2. While field_value has content:
    1. Consume any leading OWS.
    2. If the first character is not “<”, return links.
    3. Discard the first character (“<”).
    4. Consume up to but not including the first “>” character or end of field_value and let the result be target_string.
    5. If the next character is not “>”, return links.
    6. Discard the leading “>” character.
    7. Let link_parameters be the result of Parsing Parameters (Appendix B.3) from field_value (consuming zero or more characters of it).
    8. Let target_uri be the result of relatively resolving (as per [RFC3986], Section 5.2) target_string. Note that any base URI carried in the payload body is NOT used.
    9. Let relations_string be the second item of the first tuple of link_parameters whose first item matches the string “rel”, or the empty string (“”) if no such tuple is present.
    10. Split relations_string on RWS (removing it in the process) into a list of string relation_types.
    11. Let context_string be the second item of the first tuple of link_parameters whose first item matches the string “anchor”. If it is not present, context_string is the URL of the representation carrying the Link header [RFC7231], Section 3.1.4.1, serialised as a URI. Where the URL is anonymous, context_string is null.
    12. Let context_uri be the result of relatively resolving (as per [RFC3986], Section 5.2) context_string, unless context_string is null, in which case context_uri is null. Note that any base URI carried in the payload body is NOT used.
    13. Let target_attributes be an empty list.
    14. For each tuple (param_name, param_value) of link_parameters:
      1. If param_name matches “rel” or “anchor”, skip this tuple.
      2. If param_name matches “media”, “title”, “title*” or “type” and target_attributes already contains a tuple whose first element matches the value of param_name, skip this tuple.
      3. Append (param_name, param_value) to target_attributes.
    15. Let star_param_names be the set of param_names in the (param_name, param_value) tuples of link_parameters where the last character of param_name is an asterisk (“*”).
    16. For each star_param_name in star_param_names:
      1. Let base_param_name be star_param_name with the last character removed.
      2. If the implementation does not choose to support an internationalised form of a parameter named base_param_name for any reason (including, but not limited to, it being prohibited by the parameter’s specification), remove all tuples from link_parameters whose first member is star_param_name, and skip to the next star_param_name.
      3. Remove all tuples from link_parameters whose first member is base_param_name.
      4. Change the first member of all tuples in link_parameters whose first member is star_param_name to base_param_name.
    17. For each relation_type in relation_types:
      1. Case-normalise relation_type to lowercase.
      2. Append a link object to links with the target target_uri, relation type of relation_type, context of context_uri, and target attributes target_attributes.
  3. Return links.
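
Steps 8 to 14 and step 17 of this algorithm, which turn an extracted target and its parameters into link objects, might look like the following in Python. This is a partial sketch under several assumptions: the name build_links and the dictionary shape of a link object are hypothetical; base_uri stands in for the URL of the representation and is assumed to be a non-anonymous absolute URI; urllib.parse.urljoin is used as an approximation of [RFC3986], Section 5.2 resolution; and the handling of star parameters (steps 15 and 16) is omitted.

```python
from urllib.parse import urljoin

# Parameters for which only the first occurrence is kept (step 14.2).
SINGLE_OCCURRENCE = {"media", "title", "title*", "type"}


def build_links(target_string, link_parameters, base_uri):
    """Build link objects from one link-value's target and parameters."""
    target_uri = urljoin(base_uri, target_string)

    # First "rel" parameter, or the empty string when absent.
    relations = next((v for n, v in link_parameters if n == "rel"), "")
    relation_types = relations.split()  # splits on runs of whitespace

    # First "anchor" parameter; default context is the representation's URL.
    anchor = next((v for n, v in link_parameters if n == "anchor"), None)
    context_uri = base_uri if anchor is None else urljoin(base_uri, anchor)

    target_attributes = []
    for name, value in link_parameters:
        if name in ("rel", "anchor"):
            continue
        seen = any(n == name for n, _ in target_attributes)
        if name in SINGLE_OCCURRENCE and seen:
            continue
        target_attributes.append((name, value))

    return [
        {
            "target": target_uri,
            "rel": relation_type.lower(),
            "context": context_uri,
            "attributes": list(target_attributes),
        }
        for relation_type in relation_types
    ]
```

Note that a link-value with no "rel" parameter produces no link objects here, since the loop over relation types has nothing to iterate.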

B.3. Parsing Parameters

This algorithm parses the parameters from a header field value. Given input, an ASCII string, it returns a list of (string parameter_name, string parameter_value) tuples that it contains. input is modified to remove the parsed parameters.

  1. Let parameters be an empty list.
  2. While input has content:
    1. Consume any leading OWS.
    2. If the first character is not “;”, return parameters.
    3. Discard the leading “;” character.
    4. Consume any leading OWS.
    5. Consume up to but not including the first BWS, “=”, “;”, “,” character or end of input and let the result be parameter_name.
    6. Consume any leading BWS.
    7. If the next character is “=”:
      1. Discard the leading “=” character.
      2. Consume any leading BWS.
      3. If the next character is DQUOTE, let parameter_value be the result of Parsing a Quoted String (Appendix B.4) from input (consuming zero or more characters of it).
      4. Else, consume the contents up to but not including the first “;” or “,” character, or up to the end of input, and let the results be parameter_value.
      5. If the last character of parameter_name is an asterisk (“*”), decode parameter_value according to [RFC8187]. Continue processing input if an unrecoverable error is encountered.
    8. Else:
      1. Let parameter_value be an empty string.
    9. Case-normalise parameter_name to lowercase.
    10. Append (parameter_name, parameter_value) to parameters.
    11. Consume any leading OWS.
    12. If the next character is “,” or the end of input, stop processing input and return parameters.
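
One possible Python rendering of these steps is shown below. It is a sketch rather than a reference implementation: the function and helper names are hypothetical, the function returns the unparsed remainder instead of modifying its input in place, and the [RFC8187] decoding of star parameters (step 7.5) is left out, so such values are returned undecoded.

```python
OWS = " \t"  # optional/bad whitespace: SP and HTAB


def _parse_quoted_string(text):
    """Unquote a leading quoted-string; return (value, remaining)."""
    output, i = [], 1
    while i < len(text):
        if text[i] == "\\":
            i += 1
            if i >= len(text):
                break
            output.append(text[i])
        elif text[i] == '"':
            return "".join(output), text[i + 1:]
        else:
            output.append(text[i])
        i += 1
    return "".join(output), ""


def _take_until(text, stops):
    """Split text before the first character found in stops."""
    end = 0
    while end < len(text) and text[end] not in stops:
        end += 1
    return text[:end], text[end:]


def parse_parameters(text):
    """Return (parameters, remaining) for a leading run of parameters."""
    parameters = []
    while text:
        text = text.lstrip(OWS)
        if not text.startswith(";"):
            return parameters, text
        text = text[1:].lstrip(OWS)
        name, text = _take_until(text, OWS + "=;,")
        text = text.lstrip(OWS)
        if text.startswith("="):
            text = text[1:].lstrip(OWS)
            if text.startswith('"'):
                value, text = _parse_quoted_string(text)
            else:
                value, text = _take_until(text, ";,")
            # Assumption: RFC 8187 decoding of "name*" values is not done here.
        else:
            value = ""
        parameters.append((name.lower(), value))
        text = text.lstrip(OWS)
        if not text or text.startswith(","):
            return parameters, text
    return parameters, text
```

As in the steps above, an unquoted value runs up to the next ";" or ",", so any whitespace before that delimiter remains part of the value.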

B.4. Parsing a Quoted String

This algorithm parses a quoted string, as per [RFC7230], Section 3.2.6. Given input, an ASCII string, it returns an unquoted string. input is modified to remove the parsed string.

  1. Let output be an empty string.
  2. If the first character of input is not DQUOTE, return output.
  3. Discard the first character.
  4. While input has content:
    1. If the first character is a backslash (“\”):
      1. Discard the first character.
      2. If there is no more input, return output.
      3. Else, consume the first character and append it to output.
    2. Else, if the first character is DQUOTE, discard it and return output.
    3. Else, consume the first character and append it to output.
  5. Return output.
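
A compact Python sketch of this algorithm follows. The name parse_quoted_string is hypothetical, and the function returns the remaining input alongside the unquoted string rather than modifying its argument in place.

```python
def parse_quoted_string(text: str):
    """Return (unquoted string, remaining input)."""
    if not text.startswith('"'):
        return "", text
    output = []
    i = 1  # skip the opening DQUOTE
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 1
            if i >= len(text):
                break  # dangling backslash: no more input
            output.append(text[i])
        elif ch == '"':
            return "".join(output), text[i + 1:]
        else:
            output.append(ch)
        i += 1
    # Input ended before a closing DQUOTE was found.
    return "".join(output), ""
```

Like the steps above, the sketch is lenient: an unterminated quoted string yields whatever was read before the input ran out.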

C. Changes from RFC 5988

This specification has the following differences from its predecessor, RFC 5988:

  • The initial relation type registrations were removed, since they’ve already been registered by RFC 5988.
  • The introduction has been shortened.
  • The “Link Relation Application Data” registry has been removed.
  • Incorporated errata.
  • Updated references.
  • Link cardinality was clarified.
  • Terminology was changed from “target IRI” and “context IRI” to “link target” and “link context”, respectively.
  • Made assigning a URI to registered relation types serialisation specific.
  • Removed misleading statement that the Link header field is semantically equivalent to HTML and Atom links.
  • More carefully defined and used “link serialisations” and “link applications.”
  • Clarified the cardinality of target attributes (generically and for “type”).
  • Corrected the default link context for the Link header field, to be dependent upon the identity of the representation (as per RFC 7231).
  • Defined a suggested parsing algorithm for the Link header.
  • The value space of target attributes and their definition has been specified.
  • The ABNF has been updated to be compatible with [RFC7230]. In particular, whitespace is now explicit.
  • Some parameters on the HTTP header field can now appear as a token.
  • Parameters on the HTTP header can now be valueless.
  • Handling of quoted strings is now defined by [RFC7230].
  • The type header field parameter now needs to be quoted (as token does not allow “/”).

Author's Address

Mark Nottingham
Email: mnot@mnot.net
URI: https://www.mnot.net/