Distributed File Systems

Requirements of a Distributed File System

Transparency:

  • Access Transparency: Programs written to operate on local files can access remote files without modification
  • Location Transparency: Uniform name space; files may be relocated without changing their path names
  • Mobility Transparency: No changes to administration tables or client programs are needed when files are moved
  • Performance Transparency: Clients continue to perform satisfactorily with load variations
  • Scaling Transparency

Concurrent File Updates

File Replication

Hardware and Operating System Heterogeneity

Fault Tolerance:

  • at-most-once
  • at-least-once  with idempotent operations
  • Servers can be stateless, so no state needs to be recovered after a crash

Consistency: There may be some deviation from one-copy update semantics.

Security: clients need to be authenticated

Efficiency

Architecture

Three main components are identified:

  • flat file service and directory service: both export their interfaces through RPC, to be used by client modules
  • client module: provides the programming interface, adapting the file operations of different operating systems

Flat File Service:

  • Files are referred to by UFIDs (Unique File Identifiers)
  • Operates on the contents of files
  • Idempotent operations: reads and writes carry an explicit position (see the sketch after this list)
  • stateless
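
A minimal sketch, in Python with invented names, of the kind of interface such a flat file service might export; the explicit positions in read and write are what make the operations idempotent and allow the server to stay stateless:

    class FlatFileService:
        """Sketch of a flat file service; all state lives in the file store, none in the server."""

        def __init__(self):
            self.store = {}        # UFID -> bytearray holding the file contents
            self.next_ufid = 0

        def create(self):
            ufid = self.next_ufid
            self.next_ufid += 1
            self.store[ufid] = bytearray()
            return ufid

        def read(self, ufid, pos, n):
            # Explicit position: repeating the same call returns the same bytes (idempotent).
            return bytes(self.store[ufid][pos:pos + n])

        def write(self, ufid, pos, data):
            # Repeating the same write leaves the file in the same state (idempotent).
            self.store[ufid][pos:pos + len(data)] = data

        def delete(self, ufid):
            del self.store[ufid]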

Directory Service:

  • Maps text names to UFIDs (see the sketch after this list)
  • Client of the Flat File Service
  • Directories are themselves stored as files in the flat file service
  • stateless
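
A companion sketch of a directory service built as a client of the flat file service; the operation names are illustrative, and for brevity the name-to-UFID map is kept in memory here rather than serialized into the directory's own file:

    class DirectoryService:
        """Maps text names to UFIDs; each directory is itself a file held by the flat file service."""

        def __init__(self, flat_file_service):
            self.flat = flat_file_service
            self.entries = {}      # directory UFID -> {name: UFID}

        def make_dir(self):
            ufid = self.flat.create()   # the directory's contents belong to the flat file service
            self.entries[ufid] = {}
            return ufid

        def add_name(self, dir_ufid, name, file_ufid):
            self.entries[dir_ufid][name] = file_ufid

        def lookup(self, dir_ufid, name):
            return self.entries[dir_ufid][name]    # raises KeyError if the name is absent

        def get_names(self, dir_ufid):
            return list(self.entries[dir_ufid])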

Client Module:

  • Runs in each client computer
  • Provides the standard file operations for the target OS
  • Implements caches

Access Control:

  • One option: an access check is made when a file name is converted to a UFID, and the result is returned to the client as a capability
  • Another option: a user identity is submitted with every client request and checked on each operation (sketched after this list)
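
A minimal sketch of the second approach, a per-request identity check; the user names and rights table are invented for illustration:

    # Hypothetical access rights table: (user, UFID) -> operations that user may perform.
    acl = {("alice", 42): {"read", "write"},
           ("bob", 42): {"read"}}

    def checked_read(service, user, ufid, pos, n):
        # The user identity travels with every request and is checked before the operation runs.
        if "read" not in acl.get((user, ufid), set()):
            raise PermissionError("access denied")
        return service.read(ufid, pos, n)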

File Groups:

  • Several files may be moved together between servers
  • A file cannot change the group to which it belongs (see the sketch below)
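
One common way to achieve this is to embed a file group identifier in every UFID, so the server holding a file can be located from the identifier alone; a sketch under that assumption, with invented names:

    from collections import namedtuple

    # The group part of a UFID never changes, so a file cannot migrate between groups,
    # but a whole group (and every UFID in it) can be moved to another server together.
    UFID = namedtuple("UFID", ["group_id", "file_id"])

    group_locations = {7: "server-a.example.org"}   # group id -> current server (illustrative)

    def locate(ufid):
        return group_locations[ufid.group_id]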

Sun NFS File System

The RPC interface to the NFS server is open: any process can send requests to a server.

In UNIX, the NFS file handle is derived from the file's i-node number plus two extra fields (sketched after this list):

  • Filesystem Identifier: the UNIX mountable filesystem is the unit of grouping; its identifier is stored in the superblock.
  • i-node generation number: In UNIX it is incremented each time the i-node is reused.
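
A rough sketch of what such a file handle carries; the field names are illustrative, and clients treat the whole handle as opaque, simply passing it back on later requests:

    from collections import namedtuple

    NFSFileHandle = namedtuple("NFSFileHandle",
                               ["filesystem_id",       # identifies the mounted filesystem (from the superblock)
                                "inode_number",        # the file's i-node within that filesystem
                                "inode_generation"])   # incremented each time the i-node is reused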

The VFS layer relates a remote file system to the local directory on which it is mounted; a v-node for a remote file holds its file handle.

Client Integration:

In UNIX the client module is integrated with the kernel, so that:

  • programs access files via system calls.
  • a single client module serves all user-level processes.
  • the encryption key passed to the server can be retained in the kernel.

It shares the same cache as the local file system.

Several clients on different machines can access the same file, so cache inconsistencies can arise.

Access Control and Authentication:

  • Each request must include user credentials.
  • DES encryption and Kerberos authentication have been added to the protocol

Mount Service:

  • A separate service deals with mounting using the mount protocol
  • Takes a directory pathname and credentials and returns the corresponding file handle.
  • hard-mounted: the user process is suspended until the operation completes, and the client keeps retrying while the server is unavailable.
  • soft-mounted: the client module returns a failure to the user process after a number of retries.
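
For example, on a typical UNIX/Linux client the command "mount -o hard server:/export /users" requests a hard mount, while "-o soft" lets operations fail once the retries are exhausted (the server and path names here are only illustrative).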

Path Name Translation:

Performed iteratively by the client module.

Each part of a remote path name is translated separately using a lookup request.

Caching is used to reduce the cost of repeated lookups (see the sketch below).
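
A minimal sketch of the iterative translation, assuming a lookup RPC that maps a (directory handle, name) pair to a file handle and a client-side cache keyed on that pair; the names are invented:

    lookup_cache = {}   # (directory handle, name) -> handle, filled in as translations succeed

    def translate(server, root_handle, path):
        """Resolve a path such as /users/fred/notes one component at a time, as the NFS client does."""
        handle = root_handle
        for name in [part for part in path.split("/") if part]:
            key = (handle, name)
            if key not in lookup_cache:
                lookup_cache[key] = server.lookup(handle, name)   # one RPC per uncached step
            handle = lookup_cache[key]
        return handle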

Automounter:

  • Acts as a local NFS server at the client
  • Receives a lookup request, forwards it to the listed servers, mounts the chosen file system and creates a symbolic link so that future requests bypass the automounter
  • If the symbolic link is not referenced for some time, the file system is unmounted
  • Kernel-level implementations avoid the symbolic link

Server Caching:

  • NFS servers use the OS cache on the server machine
  • write-through caching: data received in a write is stored in the cache and written to disk before the reply is sent to the client
  • commit option: data is stored in the cache and written to disk only when the client issues a commit operation for the file (see the sketch after this list)
  • standard NFS clients send a commit when a file they have written is closed
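
A sketch contrasting the two server-side write policies; the flush_to_disk call stands in for the real disk I/O:

    class ServerCache:
        """Illustration of write-through caching versus the commit option."""

        def __init__(self, write_through=True):
            self.write_through = write_through
            self.cache = {}      # (file handle, position) -> data held in memory
            self.pending = []    # writes accepted but not yet on disk (commit mode only)

        def write(self, handle, pos, data):
            self.cache[(handle, pos)] = data
            if self.write_through:
                self.flush_to_disk(handle, pos, data)     # on disk before the reply is sent
            else:
                self.pending.append((handle, pos, data))  # reply immediately, write later
            return "reply to client"

        def commit(self, handle):
            # Commit mode: everything pending for this file goes to disk now.
            for h, pos, data in [w for w in self.pending if w[0] == handle]:
                self.flush_to_disk(h, pos, data)
            self.pending = [w for w in self.pending if w[0] != handle]

        def flush_to_disk(self, handle, pos, data):
            pass   # placeholder for the actual disk write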

Client Caching:

  • The client module caches the results of read, write, getattr, lookup and readdir operations
  • Timestamps are used to validate cached blocks:
  • Tc: the time the cache entry was last validated
  • Tm: the time the block was last modified at the server
  • t: a freshness interval, in the range of 30 – 60 secs
  • A client has no means of knowing whether a file is shared with other clients
  • Validation must therefore be done on every use of a cached entry
  • A cached block is valid at time T if (T – Tc) is less than the freshness interval, or if the Tm recorded by the client equals Tm at the server (see the sketch after this list)
  • Recent updates are not always visible to clients sharing the file
  • read-ahead and delayed-write are handled by bio-daemon processes.
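
A minimal sketch of that validity check; the getattr call stands for the RPC that returns the server's Tm for the file, and the entry dictionary is an invented representation of a cache entry:

    import time

    FRESHNESS_INTERVAL = 30.0   # seconds; the notes above give 30 – 60 s as the typical range

    def cache_entry_valid(entry, server, handle):
        """entry carries Tc (time of last validation) and Tm (last known modification time)."""
        now = time.time()
        if now - entry["Tc"] < FRESHNESS_INTERVAL:
            return True                        # validated recently: trust it without an RPC
        tm_server = server.getattr(handle)     # otherwise ask the server (one getattr RPC)
        entry["Tc"] = now
        if tm_server == entry["Tm"]:
            return True                        # not modified at the server since it was cached
        entry["Tm"] = tm_server
        return False                           # stale: the block must be fetched again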

NFS + Kerberos:

  • Kerberos authentication is used only at mount time
  • The server retains the authenticated user identity for each mount made by each client computer and checks it on subsequent requests.

Performance:

  • No performance penalty compared to local disks
  • Frequent use of getattr (to fetch timestamps for cache validation) is a drawback
  • Write-through caching at the server performs poorly

Design Goals:

  • Access Transparency:  Delivered through client module.
  • Location Transparency: Not uniform; each client establishes its own file name space via mounted directories
  • Mobility Transparency: Not supported; when files move between servers, clients must update their mount tables
  • Scalability: Achieved by adding disks or processors; problems arise when some files are very frequently accessed
  • File Replication: Not supported
  • Hardware and OS Heterogeneity: Implemented for lots of OSs and hardware platforms
  • Fault Tolerance: The service is stateless, so if the server crashes, clients continue from the point they had reached once the server restarts
  • Consistency: Only approximately achieved; close to one-copy semantics, adequate for the majority of applications but not for close coordination between computers
  • Security: Weak unless Kerberos or other more recent mechanisms are used
  • Efficiency: Good enough to be used in heavy-load situations