Internet-Draft RECALL_DEVICE March 2024
Haynes Expires 20 September 2024
Workgroup: Network File System Version 4
Internet-Draft: draft-haynes-nfsv4-recalldevice-00
Published: March 2024
Intended Status: Standards Track
Expires: 20 September 2024
Author: T. Haynes, Hammerspace

Add CB_LAYOUTRECALL_DEVICE to NFSv4.2

Abstract

The Parallel Network File System (pNFS) allows the metadata server to use CB_LAYOUTRECALL to recall layouts from a client by file id, by file system id, or all at once. It also allows the server to use CB_NOTIFY_DEVICEID to delete a deviceid. It does not provide a mechanism for the metadata server to recall all layouts that have a data file on a specific deviceid. This document presents an extension to RFC8881 that allows the server to recall layouts from clients based on deviceid.

This note is to be removed before publishing as an RFC.

Discussion of this draft takes place on the NFSv4 working group mailing list (nfsv4@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/nfsv4/. Working Group information can be found at https://datatracker.ietf.org/wg/nfsv4/about/.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 20 September 2024.

1. Introduction

In the Network File System version 4 (NFSv4) with a Parallel NFS (pNFS) metadata server ([RFC8881]), there is no mechanism for the metadata server to recall layouts from the client when a particular deviceid (see Section 3.3.14 of [RFC8881]) becomes unavailable, whether temporarily or permanently.

One use case is when the deviceids in a layout are separated by power fault domains. Each layout might describe 3 different devices, each contained in a different power fault domain. In such a scenario, power can be removed from a single fault domain without causing the loss of access to the data. However, client I/O will be impacted, as the client still has to attempt WRITEs (see Section 18.32 of [RFC8881]) to the unavailable device and then send LAYOUTERRORs (see Section 15.6 of [RFC7862]) to inform the metadata server of NFS4ERR_NXIO (see Section 15.1.16.3 of [RFC8881]).

If the metadata server had the means to recall layouts by deviceid, much of this unnecessary traffic could be eliminated. And while the metadata server could recall the affected layouts one by one, that too is unnecessary traffic; the work of finding the matching layouts can be offloaded to the client.

Besides the use case above, consider a metadata server that wants to set NOTIFY4_DEVICEID_DELETE in the CB_NOTIFY_DEVICEID callback (see Section 20.12 of [RFC8881]). This flag cannot be set if a layout is outstanding for a deviceid. While the metadata server can revoke all such layouts, it has no way of knowing that the client has acknowledged the revocation and hence has stopped doing I/O to the other data files in the layout. The metadata server could fence those layouts as well (see Section 12.5.5 of [RFC8881]), but that can be an expensive operation.

Using the process detailed in [RFC8178], the revisions in this document become an extension of NFSv4.2 [RFC7862]. They are built on top of the external data representation (XDR) [RFC4506] generated from [RFC7863].

1.1. Do we need [RFC8435]?

This section is to be removed before publishing as an RFC.

The authors have tried to introduce this new functionality outside of a particular pNFS Layout Type. Does that work?

1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Extension to Operation 5: CB_LAYOUTRECALL - Recall Layout from Client

The original union layoutrecall4 (see Section 20.3.1 of [RFC8881]) is:

<CODE BEGINS> file "new_union_layoutrecall4"

enum layoutrecall_type4 {
        LAYOUTRECALL4_FILE = LAYOUT4_RET_REC_FILE,
        LAYOUTRECALL4_FSID = LAYOUT4_RET_REC_FSID,
        LAYOUTRECALL4_ALL  = LAYOUT4_RET_REC_ALL
};

union layoutrecall4 switch(layoutrecall_type4 lor_recalltype) {
   case LAYOUTRECALL4_FILE:
           layoutrecall_file4 lor_layout;
   case LAYOUTRECALL4_FSID:
           fsid4              lor_fsid;
   case LAYOUTRECALL4_ALL:
           void;
   };


<CODE ENDS>

The proposed extension is:

<CODE BEGINS> file "new_union_layoutrecall4"

///    const LAYOUT4_RET_REC_DEVICEID  = 4;
///
///    enum layoutrecall_type4 {
///           LAYOUTRECALL4_FILE = LAYOUT4_RET_REC_FILE,
///           LAYOUTRECALL4_FSID = LAYOUT4_RET_REC_FSID,
///           LAYOUTRECALL4_ALL  = LAYOUT4_RET_REC_ALL,
///           LAYOUTRECALL4_DEVICEID = LAYOUT4_RET_REC_DEVICEID
///   };
///
/// union layoutrecall4 switch(layoutrecall_type4 lor_recalltype) {
///   case LAYOUTRECALL4_FILE:
///           layoutrecall_file4 lor_layout;
///   case LAYOUTRECALL4_FSID:
///           fsid4              lor_fsid;
///   case LAYOUTRECALL4_DEVICEID:
///           deviceid4          lor_deviceid;
///   case LAYOUTRECALL4_ALL:
///           void;
///   };


<CODE ENDS>
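
To illustrate how the new arm would appear on the wire, the following is a minimal Python sketch, not part of the protocol definition. It assumes the discriminant value 4 from the constant in the proposed XDR and that deviceid4 is the 16-byte fixed-length opaque defined in Section 3.3.14 of [RFC8881]. The function names are hypothetical, and only the layoutrecall4 union is encoded, not the full CB_LAYOUTRECALL arguments.

```python
import struct

# Discriminant values for layoutrecall_type4. FILE, FSID, and ALL follow the
# existing LAYOUT4_RET_REC_* constants; DEVICEID = 4 is the value proposed here.
LAYOUTRECALL4_FILE = 1
LAYOUTRECALL4_FSID = 2
LAYOUTRECALL4_ALL = 3
LAYOUTRECALL4_DEVICEID = 4

NFS4_DEVICEID4_SIZE = 16  # deviceid4 is a fixed-length opaque of 16 bytes


def encode_layoutrecall4_deviceid(deviceid: bytes) -> bytes:
    """Encode the LAYOUTRECALL4_DEVICEID arm of layoutrecall4 (hypothetical helper)."""
    if len(deviceid) != NFS4_DEVICEID4_SIZE:
        raise ValueError("deviceid4 must be exactly 16 bytes")
    # An XDR enum is a 4-byte big-endian integer. A fixed-length opaque is sent
    # as raw bytes with no length prefix; 16 is a multiple of 4, so no padding.
    return struct.pack(">i", LAYOUTRECALL4_DEVICEID) + deviceid


def decode_layoutrecall4(data: bytes):
    """Decode the discriminant and, for the deviceid arm only, its body."""
    (recalltype,) = struct.unpack_from(">i", data, 0)
    if recalltype == LAYOUTRECALL4_DEVICEID:
        return recalltype, data[4:4 + NFS4_DEVICEID4_SIZE]
    return recalltype, None  # other arms are not decoded in this sketch


wire = encode_layoutrecall4_deviceid(bytes(range(16)))
```

The encoded union is 20 bytes: 4 for the discriminant and 16 for the deviceid.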

With this minimal change, all of the semantics of CB_LAYOUTRECALL (see Section 20.3 of [RFC8881]) remain the same, i.e., the client and server already know how to interact with each other over CB_LAYOUTRECALL. The one issue to be investigated is what happens if an NFSv4.2 client that does not implement this extension sees a LAYOUTRECALL4_DEVICEID in a CB_LAYOUTRECALL. It SHOULD return NFS4ERR_UNION_NOTSUPP, but implementations might not be compliant with [RFC8178]. As such, a survey should be conducted of the major implementations.
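
The expected behavior of a client that predates this extension can be sketched as follows. This is an illustrative Python model only; the function name and the use of strings for status codes are assumptions made for the example, and real clients may behave differently, which is the reason for the survey.

```python
# Union arms known to a client that predates this extension
# (LAYOUTRECALL4_FILE, LAYOUTRECALL4_FSID, LAYOUTRECALL4_ALL).
KNOWN_ARMS = {1, 2, 3}
LAYOUTRECALL4_DEVICEID = 4  # value taken from the proposed XDR


def legacy_cb_layoutrecall(recalltype: int) -> str:
    """Hypothetical dispatch for a client without the deviceid arm."""
    if recalltype not in KNOWN_ARMS:
        # An unknown arm of an otherwise supported operation is reported
        # with NFS4ERR_UNION_NOTSUPP rather than treated as an XDR error.
        return "NFS4ERR_UNION_NOTSUPP"
    # Placeholder: the existing FILE/FSID/ALL handling would run here.
    return "NFS4_OK"
```

A metadata server receiving NFS4ERR_UNION_NOTSUPP could then fall back to the existing recall types; that fallback is a design choice for the server and is not specified here.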

Finally, when the client does handle a LAYOUTRECALL4_DEVICEID in a CB_LAYOUTRECALL, it MUST return all layouts that reference the given deviceid. The server can determine that the client no longer has any layouts with the given deviceid once the client replies with NFS4ERR_NOMATCHING_LAYOUT.
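
The exchange described above can be modeled with a short Python sketch. Everything here is a simplification for illustration: the class and function names are hypothetical, layouts are keyed by an arbitrary string, and the layout return happens synchronously inside the callback, whereas a real client would reply to the callback and then send LAYOUTRETURN operations.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOMATCHING_LAYOUT = "NFS4ERR_NOMATCHING_LAYOUT"


class SketchClient:
    """Toy model of a client's layout state (hypothetical)."""

    def __init__(self, layouts):
        # Maps a layout key to the set of deviceids that layout references.
        self.layouts = dict(layouts)
        self.returned = []  # layouts that would be sent in LAYOUTRETURN

    def cb_layoutrecall_deviceid(self, deviceid: bytes) -> str:
        matching = [k for k, devs in self.layouts.items() if deviceid in devs]
        if not matching:
            return NFS4ERR_NOMATCHING_LAYOUT
        for key in matching:
            # A real client would flush outstanding I/O before returning.
            del self.layouts[key]
            self.returned.append(key)
        return NFS4_OK


def recall_by_deviceid(client, deviceid: bytes, max_attempts: int = 10) -> int:
    """Server side: recall until the client reports no matching layout.

    Returns the number of callbacks sent.
    """
    for attempt in range(1, max_attempts + 1):
        if client.cb_layoutrecall_deviceid(deviceid) == NFS4ERR_NOMATCHING_LAYOUT:
            return attempt
    raise RuntimeError("client still holds layouts for the deviceid")
```

In this model, a client holding layouts for the recalled deviceid answers the first callback with NFS4_OK and the second with NFS4ERR_NOMATCHING_LAYOUT, at which point the server knows no layout for that deviceid is outstanding.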

3. Extraction of XDR

This document contains the external data representation (XDR) [RFC4506] description of the extension to CB_LAYOUTRECALL. The XDR description is embedded in this document in a way that makes it simple for the reader to extract into a ready-to-compile form. The reader can feed this document into the following shell script to produce the machine-readable XDR description of the extension:

<CODE BEGINS>
#!/bin/sh
grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??'


<CODE ENDS>

That is, if the above script is stored in a file called "extract.sh", and this document is in a file called "spec.txt", then the reader can do:

<CODE BEGINS>
sh extract.sh < spec.txt > layout_wcc.x


<CODE ENDS>

The effect of the script is to remove leading white space from each line, plus a sentinel sequence of "///". XDR descriptions with the sentinel sequence are embedded throughout the document.
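
For readers without a POSIX shell, the same extraction can be sketched in Python. This is an illustrative equivalent under the assumption that it should mirror the grep and sed pipeline above step by step; the function name is hypothetical.

```python
import re


def extract_xdr(text: str) -> str:
    """Mirror the grep/sed pipeline: keep sentinel lines, strip the sentinel."""
    out = []
    for line in text.splitlines():
        # grep '^ *///': keep only lines that start with the sentinel.
        if not re.match(r" *///", line):
            continue
        # sed 's?^ */// ??': drop leading white space, "///", and one space.
        line = re.sub(r"^ */// ", "", line)
        # sed 's?^ *///$??': a bare sentinel becomes an empty line.
        line = re.sub(r"^ *///$", "", line)
        out.append(line)
    return "\n".join(out)
```

Indentation that follows the single space after the sentinel is preserved, so the extracted XDR keeps its layout.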

Note that the XDR code contained in this document depends on types from the NFSv4.2 nfs4_prot.x file (generated from [RFC7863]). This includes both NFS types that end with a 4, such as offset4, length4, etc., as well as more generic types such as uint32_t and uint64_t.

While the XDR can be appended to that from [RFC7863], the various code snippets belong in their respective areas of that XDR.

4. Security Considerations

There are no new security considerations beyond those in [RFC7862].

5. IANA Considerations

IANA should use the current document (RFC-TBD) as the reference for the new entries.

6. References

6.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC4506]
Eisler, M., Ed., "XDR: External Data Representation Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May 2006, <https://www.rfc-editor.org/info/rfc4506>.
[RFC7862]
Haynes, T., "Network File System (NFS) Version 4 Minor Version 2 Protocol", RFC 7862, DOI 10.17487/RFC7862, November 2016, <https://www.rfc-editor.org/info/rfc7862>.
[RFC7863]
Haynes, T., "Network File System (NFS) Version 4 Minor Version 2 External Data Representation Standard (XDR) Description", RFC 7863, DOI 10.17487/RFC7863, November 2016, <https://www.rfc-editor.org/info/rfc7863>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8178]
Noveck, D., "Rules for NFSv4 Extensions and Minor Versions", RFC 8178, DOI 10.17487/RFC8178, July 2017, <https://www.rfc-editor.org/info/rfc8178>.
[RFC8434]
Haynes, T., "Requirements for Parallel NFS (pNFS) Layout Types", RFC 8434, DOI 10.17487/RFC8434, August 2018, <https://www.rfc-editor.org/info/rfc8434>.
[RFC8435]
Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible File Layout", RFC 8435, DOI 10.17487/RFC8435, August 2018, <https://www.rfc-editor.org/info/rfc8435>.
[RFC8881]
Noveck, D., Ed. and C. Lever, "Network File System (NFS) Version 4 Minor Version 1 Protocol", RFC 8881, DOI 10.17487/RFC8881, August 2020, <https://www.rfc-editor.org/info/rfc8881>.

6.2. Informative References

IETF Trust, "Legal Provisions Relating to IETF Documents", <http://trustee.ietf.org/docs/IETF-Trust-License-Policy.pdf>.
[RFC1813]
Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 Protocol Specification", RFC 1813, DOI 10.17487/RFC1813, June 1995, <https://www.rfc-editor.org/info/rfc1813>.

Appendix A. Acknowledgments

Trond Myklebust and Paul Saab were involved in the initial requirements for this functionality.

Author's Address

Thomas Haynes
Hammerspace