Design and Implementation of the Sun Network Filesystem: R. Sandberg, D. Goldberg, S. Kleinman, D. Walsh, R. Lyon

NFS was the first commercially successful network file system developed by Sun Microsystems in the 1980s. It was designed to be robust, maintain UNIX semantics transparently across a network, and provide adequate performance. The key aspects of NFS included its stateless protocol design for easy crash recovery, use of file handles and lookups to access remote files, and caching at the client for performance with consistency protocols to maintain semantics. While NFS had limitations, it succeeded due to its robustness, reasonable efficiency through tuning, and ability to evolve over time.

DESIGN AND IMPLEMENTATION

OF THE
SUN NETWORK FILESYSTEM

R. Sandberg, D. Goldberg
S. Kleinman, D. Walsh, R. Lyon
Sun Microsystems
What is NFS?
• First commercially successful network file system:
– Developed by Sun Microsystems for their
diskless workstations
– Designed for robustness and “adequate
performance”
– Sun published all protocol specifications
– Many many implementations
Paper highlights
• NFS is stateless
– All client requests must be self-contained
• The virtual filesystem interface
– VFS operations
– VNODE operations
• Performance issues
– Impact of tuning on NFS performance
Objectives (I)
• Machine and Operating System Independence
– Could be implemented on low-end machines of the mid-80’s
• Fast Crash Recovery
– Major reason behind stateless design
• Transparent Access
– Remote files should be accessed in exactly the same
way as local files
Objectives (II)
• UNIX semantics should be maintained on client
– Best way to achieve transparent access
• “Reasonable” performance
– Robustness and preservation of UNIX
semantics were much more important
• Contrast with Sprite and Coda
Basic design
• Three important parts
– The protocol
– The server side
– The client side
The protocol (I)
• Uses the Sun RPC mechanism and Sun eXternal
Data Representation (XDR) standard
• Defined as a set of remote procedures
• Protocol is stateless
– Each procedure call contains all the
information necessary to complete the call
– Server maintains no “between call” information
Advantages of statelessness
• Crash recovery is very easy:
– When a server crashes, client just resends
request until it gets an answer from the
rebooted server
– Client cannot tell difference between a server
that has crashed and recovered and a slow
server
• Client can always repeat any request
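The retry behavior described above can be sketched as follows. This is a minimal illustration, not the actual Sun RPC client: `FlakyServer`, `call_with_retry`, and the `max_attempts` bound are hypothetical names introduced here, and the bound exists only so the sketch always terminates.

```python
class FlakyServer:
    """Hypothetical stand-in for a server that is down for a few calls."""

    def __init__(self, failures_before_reboot):
        self.remaining_failures = failures_before_reboot
        self.files = {"fh1": b"hello"}

    def read(self, fh, offset, count):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TimeoutError("no reply")  # crashed or merely slow
        return self.files[fh][offset:offset + count]


def call_with_retry(request, max_attempts=10):
    """Resend the same self-contained request until a reply arrives."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request(), attempt
        except TimeoutError:
            continue  # a real client would wait before resending
    raise TimeoutError("server did not answer")
```

Because the request carries everything the server needs, resending it after a reboot is safe, and the client cannot distinguish the reboot from a slow reply.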
Consequences of statelessness
• Reads and writes must specify their start offset
– Server does not keep track of current position in
the file
– Users still use conventional UNIX reads and writes
• Open system call translates into several
lookup calls to server
• No NFS equivalent to UNIX close system call
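A small sketch of where the file position lives under these constraints. The class names `StatelessServer` and `ClientFile` are assumptions made for illustration; the point is only that every server call carries an explicit offset while the client keeps the conventional "current position".

```python
class StatelessServer:
    """Hypothetical server: stores file contents, no per-client position."""

    def __init__(self, files):
        self.files = files  # maps file handle -> bytes

    def read(self, fh, offset, count):
        # Everything needed to serve the call is in its arguments.
        return self.files[fh][offset:offset + count]


class ClientFile:
    """Client-side open file: the current offset is kept here."""

    def __init__(self, server, fh):
        self.server = server
        self.fh = fh
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.fh, self.offset, count)
        self.offset += len(data)  # advance locally, as UNIX read would
        return data
```

The user program still sees sequential reads; only the client translates them into offset-carrying requests.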
The lookup call (I)
• Returns a file handle instead of a file descriptor
– File handle specifies unique location of file
• lookup(dirfh, name) returns (fh, attr)
– Returns file handle fh and attributes of named
file in directory dirfh
– Fails if client has no right to access directory
dirfh
The lookup call (II)
– One single open call such as
fd = open(“/usr/joe/6360/list.txt”)
will result in several calls to lookup
lookup(rootfh, “usr”) returns (fh0, attr)
lookup(fh0, “joe”) returns (fh1, attr)
lookup(fh1, “6360”) returns (fh2, attr)
lookup(fh2, “list.txt”) returns (fh, attr)
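The component-by-component walk above can be sketched in a few lines. `ToyServer` and `resolve` are hypothetical names, and the in-memory directory table simply mirrors the handles used on this slide.

```python
class ToyServer:
    """Hypothetical in-memory server: directories map names to handles."""

    def __init__(self):
        self.dirs = {
            "rootfh": {"usr": "fh0"},
            "fh0": {"joe": "fh1"},
            "fh1": {"6360": "fh2"},
            "fh2": {"list.txt": "fh"},
        }
        self.calls = []  # records every lookup for inspection

    def lookup(self, dirfh, name):
        self.calls.append((dirfh, name))
        fh = self.dirs[dirfh][name]  # KeyError if the name is missing
        attr = {"is_dir": fh in self.dirs}
        return fh, attr


def resolve(server, rootfh, path):
    """Resolve a path with one lookup call per component."""
    fh, attr = rootfh, {"is_dir": True}
    for name in (c for c in path.split("/") if c):
        fh, attr = server.lookup(fh, name)
    return fh, attr
```

Resolving `/usr/joe/6360/list.txt` with this sketch issues exactly the four lookup calls listed above.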
The lookup call (III)
• Why all these steps?
– Any of the components of /usr/joe/6360/list.txt
could be a mount point
– Mount points are client dependent and mount
information is kept above the lookup() level
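One way to picture why the walk cannot be done in a single call: the client checks its own mount table after every component. The sketch below is an assumption-laden illustration; `TREES`, `MOUNTS`, and `resolve_with_mounts` are hypothetical, and a real client keeps this information in kernel mount structures.

```python
# Hypothetical directory trees, one per machine: handle -> {name: handle}
TREES = {
    "local": {"root": {"usr": "l_usr"}, "l_usr": {}},
    "serverA": {"a_root": {"joe": "a_joe"}, "a_joe": {"list.txt": "a_list"}},
}

# Client-side mount table: a covered directory -> root of the mounted tree
MOUNTS = {("local", "l_usr"): ("serverA", "a_root")}


def lookup(machine, dirfh, name):
    """Stand-in for a lookup call sent to one machine."""
    return TREES[machine][dirfh][name]


def resolve_with_mounts(root, path):
    """Walk the path, crossing mount points on the client side."""
    machine, fh = root
    for name in (c for c in path.split("/") if c):
        fh = lookup(machine, fh, name)
        # The server never sees this step: mounts are client dependent.
        machine, fh = MOUNTS.get((machine, fh), (machine, fh))
    return machine, fh
```

Here `/usr` is a mount point on this client only, so the remaining components are looked up on `serverA`.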
Server side (I)
• Server implements a write-through policy
– Required by statelessness
– Any blocks modified by a write request
(including i-nodes and indirect blocks) must
be written back to disk before the call
completes
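As a user-level analogy of the write-through policy, the sketch below returns from a write only after the data has been forced to stable storage. The function name `write_through` is hypothetical, and a real server does this inside the kernel at the block level rather than through `os.fsync`.

```python
import os


def write_through(path, offset, data):
    """Write data at offset; return only after it has been flushed to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
        os.fsync(fd)  # force the file's modified data to disk before replying
    finally:
        os.close(fd)
    return len(data)
```

Since the server holds no state about pending writes, a reply must mean the data would survive a crash.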
Server side (II)
• File handle consists of
– Filesystem id identifying disk partition
– I-node number identifying file within partition
– Generation number changed every time
i-node is reused to store a new file
• Server will store
– Filesystem id in filesystem superblock
– I-node generation number in i-node
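The role of the generation number can be sketched as follows. `FileHandle`, `HandleTable`, and `StaleHandle` are hypothetical names; the sketch only shows how a handle to a deleted file is rejected once its i-node has been reused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    fsid: int        # identifies the disk partition
    inode: int       # identifies the file within the partition
    generation: int  # changes each time the i-node is reused


class StaleHandle(Exception):
    """Raised when a handle refers to a file that no longer exists."""


class HandleTable:
    """Hypothetical server-side view of one filesystem."""

    def __init__(self, fsid):
        self.fsid = fsid
        self.generations = {}  # i-node number -> current generation

    def allocate(self, inode):
        """Store a new file in this i-node, bumping its generation."""
        self.generations[inode] = self.generations.get(inode, 0) + 1
        return FileHandle(self.fsid, inode, self.generations[inode])

    def check(self, fh):
        if fh.fsid != self.fsid or self.generations.get(fh.inode) != fh.generation:
            raise StaleHandle(fh)
        return True
```

A client still holding the old handle after the i-node is reused gets an error instead of silently reading someone else's file.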
Client side (I)
• Provides transparent interface to NFS
• Mapping between remote file names and remote
file addresses is done at boot time through
remote mount
– Extension of UNIX mounts
– Specified in a mount table
– Makes a remote subtree appear part of a local
subtree
Remote mount
[Figure: the client tree (/ with subdirectories usr and bin) next to a server subtree; rmount attaches the server subtree at usr]
After rmount, root of server subtree can be accessed as /usr
Client side (II)
• Provides transparent access to
– NFS
– Other file systems (including UNIX FFS)
• New virtual filesystem interface supports
– VFS calls, which operate on whole file system
– VNODE calls, which operate on individual files
• Treats all files in the same fashion
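The idea of one interface over several file system types can be sketched with an abstract class. All names here (`Vnode`, `LocalVnode`, `NfsVnode`, `sys_read`) are hypothetical, and the real interface has many more operations than the single `read` shown.

```python
from abc import ABC, abstractmethod


class Vnode(ABC):
    """Per-file operations, independent of the file system type."""

    @abstractmethod
    def read(self, offset, count):
        ...


class LocalVnode(Vnode):
    """File on a local disk (contents held in memory for the sketch)."""

    def __init__(self, data):
        self.data = data

    def read(self, offset, count):
        return self.data[offset:offset + count]


class NfsVnode(Vnode):
    """Remote file: each operation becomes a call to the server."""

    def __init__(self, rpc_read, fh):
        self.rpc_read = rpc_read  # hypothetical function (fh, offset, count)
        self.fh = fh

    def read(self, offset, count):
        return self.rpc_read(self.fh, offset, count)


def sys_read(vnode, offset, count):
    """Upper layer: does not know or care which file system it talks to."""
    return vnode.read(offset, count)
```

The caller of `sys_read` is the same for local and remote files, which is what makes access transparent.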
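Client side (III)
• User interface is unchanged
[Figure: layered client architecture. UNIX system calls sit on top of the common VNODE/VFS interface, which dispatches to Other FS, NFS, and UNIX FS; NFS goes through RPC/XDR to the LAN, and UNIX FS goes to the disk]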
File consistency issues
• Cannot build an efficient network file system
without client caching
– Cannot send each and every read or write to
the server
• Client caching introduces consistency issues
Example
• Consider a one-block file X that is concurrently
modified by two workstations
• If file is cached at both workstations
– A will not see changes made by B
– B will not see changes made by A
• We will have
– Inconsistent updates
– Violation of UNIX semantics
Example
[Figure: workstations A and B hold cached copies x’ and x’’ while the server still holds x]
Inconsistent updates
UNIX file access semantics (I)
• Conventional timeshared UNIX semantics
guarantee that
– All writes are executed in strict sequential
fashion
– Their effect is immediately visible to all other
processes accessing the file
• Interleaving of writes coming from different
processes is left to the kernel’s discretion
UNIX file access semantics (II)
• UNIX file access semantics result from the use
of a single I/O buffer containing all cached
blocks and i-nodes
• Server caching is not a problem
• Disabling client caching is not an option:
– Would be too slow
– Would overload the file server
NFS solution (I)
• Stateless server does not know how many users
are accessing a given file
– Clients do not know either
• Clients must
– Frequently send their modified blocks to the
server
– Frequently ask the server to revalidate the
blocks they have in their cache
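The two client obligations above can be sketched for a one-block file. This is a simplified model under stated assumptions: `BlockServer` and `CachingClient` are hypothetical, and a counter stands in for the modification time that a client would compare when revalidating.

```python
class BlockServer:
    """Hypothetical server holding a single one-block file."""

    def __init__(self):
        self.data = b""
        self.mtime = 0  # logical modification time (assumption)

    def getattr(self):
        return self.mtime

    def read(self):
        return self.data

    def write(self, data):
        self.data = data
        self.mtime += 1


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cached = None
        self.cached_mtime = None
        self.dirty = False

    def write(self, data):
        self.cached, self.dirty = data, True  # modified only in the cache

    def flush(self):
        """Send the modified block to the server."""
        if self.dirty:
            self.server.write(self.cached)
            self.cached_mtime = self.server.getattr()
            self.dirty = False

    def read(self):
        """Revalidate the cached block before using it."""
        mtime = self.server.getattr()
        if self.cached is None or (not self.dirty and mtime != self.cached_mtime):
            self.cached, self.cached_mtime = self.server.read(), mtime
        return self.cached
```

Between a write on one client and its flush, other clients still see the old block, which is why both steps have to happen frequently.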
NFS solution (II)
[Figure: workstations A and B, each marked “?”, and the server, with copies x’ and x]
Better to propagate my updates and refresh my cache
Implementation
• VNODE interface only made the kernel 2%
slower
• Few of the UNIX FS were modified
• MOUNT was first included in the NFS protocol
– Later broken into a separate user-level RPC
process
Hard issues (I)
• NFS root file systems cannot be shared:
– Too many problems
• Clients can mount any remote subtree any way they
want:
– Could have different names for same subtree by
mounting it in different places
– NFS uses a set of basic mounted filesystems on
each machine and lets users do the rest
Hard issues (II)
• NFS passes user id, group id and groups on
each call
– Requires same mapping from user id and
group id to user on all machines
– Achieved by Yellow Pages (YP) service
• NFS has no file locking
Hard issues (III)
• UNIX allows removal of open files
– File becomes nameless
– Processes that have the file open can continue
to access the file
– Other processes cannot
• NFS cannot do that and remain stateless
– NFS client detecting removal of an opened file
renames it and deletes renamed file at close time
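The rename-then-delete workaround can be sketched as follows. `NameServer`, `RenamingClient`, and the `.nfs0001`-style temporary name are assumptions made for illustration, not the names used by any particular implementation.

```python
class NameServer:
    """Hypothetical server: a flat directory mapping names to contents."""

    def __init__(self, names):
        self.names = dict(names)

    def rename(self, old, new):
        self.names[new] = self.names.pop(old)

    def remove(self, name):
        del self.names[name]


class RenamingClient:
    def __init__(self, server):
        self.server = server
        self.open_files = {}  # name used at open -> current name on server
        self.counter = 0

    def open(self, name):
        self.open_files[name] = name

    def remove(self, name):
        if name in self.open_files:
            # File is still open here: hide it instead of deleting it.
            self.counter += 1
            hidden = f".nfs{self.counter:04d}"  # hypothetical temporary name
            self.server.rename(name, hidden)
            self.open_files[name] = hidden
        else:
            self.server.remove(name)

    def read(self, name):
        return self.server.names[self.open_files[name]]

    def close(self, name):
        current = self.open_files.pop(name)
        if current != name:
            self.server.remove(current)  # deferred delete at close time
```

All of the bookkeeping lives on the client, so the server stays stateless; the cost is that the trick only works when the remover and the opener are the same client.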
Hard issues (IV)
• In general, NFS tries to preserve UNIX open
file semantics but does not always succeed
– If an opened file is removed by a process on
another client, file is immediately deleted
Tuning (I)
• First version of NFS was much slower than Sun
Network Disk (ND)
• First improvement
– Added client buffer cache
– Increased the size of UDP packets from 2048 to
9000 bytes
• Next improvement reduced the amount of buffer-to-buffer
copying in NFS and RPC (bcopy)
Tuning (II)
• Third improvement introduced a client-side
attribute cache
– Cache is updated every time new attributes
arrive from the server
– Cached attributes are discarded after
• 3 seconds for file attributes
• 30 seconds for directory attributes
• These three improvements cut benchmark run
time by 50%
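The attribute cache described above can be sketched with the two timeouts from this slide. `AttrCache` and the `is_dir` attribute key are hypothetical, and the injectable clock exists only so the expiry logic can be exercised without waiting.

```python
import time

FILE_TTL = 3.0   # seconds, for file attributes
DIR_TTL = 30.0   # seconds, for directory attributes


class AttrCache:
    """Client-side attribute cache with per-type expiry."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}  # file handle -> (attributes, expiry time)

    def update(self, fh, attr):
        """Called every time new attributes arrive from the server."""
        ttl = DIR_TTL if attr.get("is_dir") else FILE_TTL
        self.entries[fh] = (attr, self.clock() + ttl)

    def get(self, fh):
        """Return cached attributes, or None if absent or expired."""
        entry = self.entries.get(fh)
        if entry is None:
            return None
        attr, expiry = entry
        if self.clock() >= expiry:
            del self.entries[fh]  # discard; caller must ask the server
            return None
        return attr
```

A miss from `get` means the client has to fetch fresh attributes from the server and call `update` again.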
Tuning (III)
These three improvements had the biggest impact on
NFS performance
My conclusion
• NFS succeeded because it was
– Robust
– Reasonably efficient
– Tuned to the needs of diskless workstations

In addition, NFS was able to evolve and incorporate
concepts such as close-to-open consistency (see next paper)
