0% found this document useful (0 votes)
58 views

9 File Systems

a presentation on file systems

Uploaded by

Kevin Kagambi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
58 views

9 File Systems

a presentation on file systems

Uploaded by

Kevin Kagambi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 38

File Systems

Introduction
 All computer applications need to store and retrieve
information.
 A process can store a limited amount of information within its
own address space
 Storage capacity is restricted to the size of the virtual
address space
 For some applications this size is adequate, but for others,
such as airline reservations, banking, or corporate record
keeping, it is far too small.
 Applications require information to outlive the processes
 To enable sharing, information must be independent of the
processes
File Systems

 We have three essential requirements for long-term


information storage:
 It must be possible to store a very large amount of
information.
 The information must survive the termination of the
process using it.
 Multiple processes must be able to access the
information concurrently.
 Magnetic disks have been used for years for this
long-term storage. Tapes and optical disks are also
used, but they have much lower performance.
Files

 Some of the issues that arise when using information


include:
1. How do you find information?
2. How do you keep one user from reading another user's
data?
3. How do you know which blocks are free?
 The file abstraction is used by the operating system to deal
with some of the issues.
 Files are logical units of information created by
processes.
 Files are managed by the operating system. How they
are structured, named, accessed, used, protected,
implemented, and managed are major topics in
operating system design.
File Concept
 Contiguous logical address space
 Types:
 Data
 numeric
 character
 binary
 Program
 Contents defined by file’s creator
 Many types
 Consider text file,
 source file - a sequence of functions, each of
which is further organized as declarations followed by executable statements.
 , executable file - series of code sections that the loader can bring into memory and
execute.
File Attributes
 Name – only information kept in human-readable form
 Identifier – unique tag (number) identifies file within file system
 Type – needed for systems that support different types
 Location – pointer to file location on device
 Size – current file size
 Protection – controls who can do reading, writing, executing
 Time, date, and user identification – data for protection, security,
and usage monitoring
 Information about files are kept in the directory structure, which is
maintained on the disk
 Many variations, including extended file attributes such as file
checksum
 Information kept in the directory structure
File Operations
 File is an abstract data type
 Create
 Write – at write pointer location
 Read – at read pointer location
 Reposition within file - seek
 Delete
 Truncate
 Open(Fi) – search the directory structure on disk for entry
Fi, and move the content of entry to memory
 Close (Fi) – move the content of entry Fi in memory to
directory structure on disk
Open Files
 Several pieces of data are needed to manage open files:
 Open-file table: tracks open files
 File pointer: pointer to last read/write location, per process
that has the file open
 File-open count: counter of number of times a file is open – to
allow removal of data from open-file table when last processes
closes it
 Disk location of the file: cache of data access information
 Access rights: per-process access mode information
Open File Locking
 Provided by some operating systems and file systems
 Shared lock similar to reader lock – several processes can acquire
concurrently
 Exclusive lock similar to writer lock
 Further, operating systems may provide either mandatory or
advisory file-locking mechanisms.
 If a lock is mandatory, then once a process
acquires an exclusive lock, the operating system will prevent
any other process from accessing the locked file.

 Advisory – processes can find status of locks and decide what to


do. It is up to software developers to ensure that locks are
appropriately acquired and released
File Types – Name, Extension
File Structure
 Options
 Unstructured sequence of bytes - having the operating
system regard files as nothing more than byte sequences
provides the maximum flexibility
 All versions of UNIX, MS-DOS, and Win­dows use this file
model.
 Sequence of fixed-length records - read operation
returns one record and the write operation overwrites
or appends one record.
 Tree of records - not necessarily all the same length,
each con­taining a key field in a fixed position in the
record. The tree is sorted on the key field, to allow
rapid searching for a particular key
File Types
 Regular files are the ones that contain user
information. Can be ASCII or binary
 Directories are system files for maintaining the
structure of the file system.
 Character special files are related to input/output
and used to model serial I/0 devices, such as ter­
minals, printers, and networks.
 Block special files are used to model disks.
Sequential-access File
Access Methods
 Sequential Access - Information in the file is processed in order, one
record after the other.
 Direct Access – file is fixed length logical records that allow programs to
read and write records rapidly in no particular order.
 The direct-access method is based on a disk
model of a file, since disks allow random access to any file block.
 File is viewed as a numbered sequence of blocks or records.
 We may read block 14, then read block 53, and then write block 7.
 No restrictions on the order of reading or writing for a direct-access file.
Other Access Methods
 The index - like an index in the back of a book, contains
pointers to the various blocks.
 To find a record in the file, we first search the index and
then use the pointer to access the file directly and to find
the desired record
Example of Index and Relative Files
Directory Structure

 A collection of nodes containing information about all files

Directory

Files
F1 F2 F4
F3
Fn

Both the directory structure and the files reside on disk


A Typical File-system Organization
Types of File Systems
 We mostly talk of general-purpose file systems
 But systems frequently have may file systems, some general- and
some special- purpose
 Consider Solaris has
 tmpfs – memory-based volatile FS for fast, temporary I/O
 objfs – interface into kernel memory to get kernel symbols for
debugging
 ctfs – contract file system for managing daemons
 lofs – loopback file system allows one FS to be accessed in place of
another
 procfs – kernel interface to process structures
 ufs, zfs – general purpose file systems
Operations Performed on Directory
 Search for a file

 Create a file

 Delete a file

 List a directory

 Rename a file

 Traverse the file system


Directory Organization
The directory is organized logically to obtain

 Efficiency – locating a file quickly


 Naming – convenient to users
 Two users can have same name for different files
 The same file can have several different names
 Grouping – logical grouping of files by properties, (e.g.,
all Java programs, all games, …)
Single-Level Directory
 A single directory for all users

 Naming problem
 Grouping problem
Two-Level Directory
 Separate directory for each user

 Path name
 Can have the same file name for different user
 Efficient searching
 No grouping capability
Tree-Structured Directories
Tree-Structured Directories (Cont.)
 Efficient searching

 Grouping Capability

 Current directory (working directory)


 cd /spell/mail/prog
 type list
Tree-Structured Directories (Cont)
 Absolute or relative path name
 Creating a new file is done in current directory
 Delete a file
rm <file-name>
 Creating a new subdirectory is done in current directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count

Deleting “mail”  deleting the entire subtree rooted by “mail”


Sharing subdirectories and files

 A tree structure prohibits the sharing of files or directories.


 An acyclic graph —that is, a graph with no cycles—allows
directories to share subdirectories and files.
 People working as a team - all the files they want to share can be
put into one directory.
 The User File Directory of each team member will contain this
directory
of shared files as a subdirectory.
 Sharing of files can be implemented in several ways:
 Link: a pointer to another file or subdirectory. Link may be
implemented as an absolute or a relative path name.
 Duplicate all information about them in both sharing directories.
Inconsistency problem arises
Sharing subdirectories and files

Acyclic-Graph Directories
Acyclic-Graph Directories (Cont.)
 Two different names (aliasing)
 If dict deletes list  dangling pointer
 If we delete a file, we have to locate the links and remove them
 May be expensive if no list exists
 We can leave the links until an attempt is made to use them.
 Another approach to deletion is to preserve the file until all references to it
are deleted.
 To implement this approach, we must have some mechanism
for determining that the last reference to the file has been deleted.

 A serious problem with using an acyclic-graph structure is ensuring that


there are no cycles.
 Cycles in graphs result in poor performance during searches due to infinite
loops
General Graph Directory
General Graph Directory (Cont.)
 How do we guarantee no cycles?
 Every time a new link is added use a cycle detection algorithm to
determine whether it is OK
 Garbage collection – remove any files not referenced
Implementing Files
 Contiguous Allocation
 The simplest allocation scheme is to store each file as
a contiguous run of disk blocks. Thus on a disk with 1-
KB blocks, a 50-KB file would be allocated 50 con­
secutive blocks.
 Simple to implement
 Read performance is excellent because the entire
file can be read from the disk in a single operation
 Drawbacks:
 Disk fragmentation
 Keep track of holes and continuously search for hole with
appropriate size – have to be known in advance
Implementing Files
 Linked List Allocation
 Keeps files as a linked list of disk blocks
 No space is lost to disk fragmentation (except for internal
fragmentation in the last block)
 It is sufficient for the directory entry to merely store the disk
address of the first block.
 Random access is extremely slow.
 To get to block n, the operating system has to start at the beginning and
read the n - 1 blocks prior to it, one at a time.
 The amount of data storage in a block is no longer a power of
two be­cause the pointer takes up a few bytes - less efficient
because many programs read and write in blocks whose size is a
power of two.
Linked List Allocation Using a
Table in Memory
 Both disadvantages of the linked list allocation can
be eliminated by taking the pointer word from each
disk block and putting it in a table in memory.
 FAT (File Allocation Tables) are used. E.g. we have 2
files whose blocks as stored in different blocks.
 File A - 4, 7, 2, 10, and 12
 File B - 6, 3, 11, and 14
 A FAT is used to store the two chains. Both chains are
terminated with a special marker (e. g., -1)
Linked List Allocation Using a
Table in Memory
Linked List Allocation Using a
Table in Memory
 The main disadvantage of using this method is that the
FAT has to be in memory and therefore for large disks,
the memory requirements increase.
 With a 200-GB disk and a 1-KB block size, the table
needs 200 million entries.
Implementing Files

 I-nodes
 Another method for keeping track of which blocks belong to which
file is to associate with each file a data structure called an i-node
(index-node), which lists the attributes and disk addresses of the
file's blocks.
 The big advantage of this scheme over linked files using an in-
memory table is that the i-node need only be in memory when the
corresponding file is open.
 If each i-node occupies n bytes and a maximum of k files may be
open at once, the total memory occupied by the array holding the
i-nodes for the open files is only kn bytes
i-node

You might also like