OS-Chapter 5 - File Management

1. The document discusses different types of file systems including fundamental concepts like data, metadata, files and file operations. 2. It describes four methods of organizing files - sequential, random/direct, serial, and indexed-sequential. Sequential organization stores records in order while random allows direct access via a record key. 3. The document also covers buffering, which stores file data in memory during transfers to/from devices to improve performance, and differences between sequential and non-sequential files in terms of storage, reconstruction and use.

Uploaded by

Desalegn Asefa
Copyright
© © All Rights Reserved

Operating Systems

Chapter 5
File Systems
5.1. Fundamental Concepts:
Data: Facts and statistics collected together for reference or analysis.
The quantities, characters or symbols on which a computer performs operations are known
as data. Data may be stored and transmitted in the form of electrical signals and recorded on
magnetic, optical or mechanical recording media.
A file management system is a type of software that manages data files in a computer
system. It has limited capabilities and is designed to manage individual files or groups of files,
such as special office documents and records.
The data may be numbers, characters or binary information: things that are known or
assumed as facts and that form the basis of reasoning or calculation.
Metadata: Metadata means “data about data”.
Metadata is data that provides information about other data. Three distinct types of
metadata exist: descriptive metadata, structural metadata and administrative metadata.
Descriptive metadata describes a resource for purposes such as discovery and identification.
It can include elements such as title, abstract, author and keywords.
Structural metadata is metadata about containers of data and indicates how compound
objects are put together, for example how pages are ordered to form chapters. It describes the
types, versions, relationships and other characteristics of digital materials.
Administrative metadata provides information to help manage a resource, such as when
and how it was created, its file type and other technical information, and who can access it.
Metadata has traditionally been used in library card catalogs and museums, and is also used
for digital audio files, websites, traffic analysis, etc.
File: A collection of data or information that has a name called the file name. Almost all
information stored in a computer must be in a file. (or) A file is an object on a computer that
stores data, information, settings or commands used with a computer program.
All computer applications need to store and retrieve information. While a process is
running, it can store a limited amount of information within its own address space. However, the
storage capacity is restricted to the size of the virtual address space.
A second problem with keeping information within a process address space is that when
the process terminates, the information is lost. For many applications the information must be
retained for weeks, months or even forever.
A third problem is that it is frequently necessary for multiple processes to access the
information at the same time.
Thus we have three essential requirements for long-term information storage:
1. It must be possible to store a very large amount of information.
2. The information must survive the termination of the process using it.
3. Multiple processes must be able to access the information concurrently.
Magnetic disks have been used for years for this long-term storage. A disk supports two
basic operations:
• Read block k
• Write block k

Prepared by Ande, Lecturer, Dept of Computer Science, Gambella University, Gambella. Page 1
Files are logical units of information created by processes. A disk usually contains
thousands or even millions of them, each one independent of the others. Processes can read
existing files and create new ones if need be.
File Operations: Files exist to store information and allow it to be retrieved later. Different
systems provide different operations to allow storage and retrieval. Below is a discussion of the
most common system calls relating to files.
1. Create: The file is created with no data. The purpose of the call is to announce that the file
is coming and to set some of the attributes.
2. Delete: When the file is no longer needed, it has to be deleted to free up the disk space.
3. Open: Before using a file, a process must open it. The purpose of the open call is to allow the
system to fetch the attributes for rapid access on later calls.
4. Close: When all the accesses are finished, the attributes and disk addresses are no longer
needed, so the file should be closed to free up internal table space.
5. Read: Data are read from the file. Usually, the bytes come from the current position. The caller
must specify how much data is needed and must also provide a buffer to put it in.
6. Write: Data are written to the file, again usually at the current position. If the current position
is the end of the file, the file size increases.
7. Append: This call is a restricted form of write. It can only add data to the end of the file.
Systems that provide a minimal set of system calls do not generally have append, but many
systems provide multiple ways of doing the same thing, and these systems sometimes have
append.
8. Seek: For random access files, a method is needed to specify from where to take the data. One
common approach is a system call, seek, that repositions the file pointer to a specific place in the
file. After this call has completed, data can be read from, or written to, that position.
9. Rename: It frequently happens that a user needs to change the name of an existing file. This
system call makes that possible. It is not always strictly necessary, because the file can usually be
copied to a new file with the new name, and the old file then deleted.
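The operations listed above can be tried out with a short sketch. It uses Python's os module, whose functions are thin wrappers around the corresponding system calls on most systems. The file names notes.txt and renamed.txt are hypothetical and chosen only for illustration.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Exercise create, write, seek, read, append, rename and delete in turn."""
    path = os.path.join(directory, "notes.txt")        # hypothetical file name

    # Create + open: O_CREAT makes an empty file; open returns a descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    os.write(fd, b"hello world")                        # write at the current position
    os.lseek(fd, 6, os.SEEK_SET)                        # seek to byte offset 6
    tail = os.read(fd, 5)                               # read 5 bytes from there
    os.close(fd)                                        # close releases the descriptor

    # Append: O_APPEND forces every write to go to the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    os.write(fd, b"!")
    os.close(fd)

    new_path = os.path.join(directory, "renamed.txt")   # hypothetical new name
    os.rename(path, new_path)                           # rename
    size = os.path.getsize(new_path)                    # size after the append
    os.remove(new_path)                                 # delete
    return tail, size, os.path.exists(new_path)

with tempfile.TemporaryDirectory() as tmp:
    result = demo_file_operations(tmp)
```

Here result holds the bytes read after the seek, the file size after the append, and whether the file still exists after the delete.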
File Organization: File organization refers to the way data is stored in a file. File organization
is very important because it determines the methods of access, efficiency, flexibility and storage
devices to use.
There are four methods of organizing files on storage media. These are:
1. Sequential
2. Random or Direct
3. Serial
4. Indexed-Sequential
1. Sequential File Organization:
Records are stored and accessed in a particular order, sorted using a key field. Basic
retrieval requires searching sequentially through the file, record by record, until the record is
found or the end is reached. Because the records in the file are stored in a particular order,
better searching methods such as the binary search technique can be used to reduce the time
spent searching the file. For example, if the file has records with the key fields 20, 30, 40, 50 and
60 and the computer is searching for the record with key field 50, the search starts at the middle
record, 40. Since 50 is greater than 40, the first half of the set is ignored.
Advantages:
1. The sorting makes it easy to access records.
2. The binary chop (binary search) technique can be used to reduce record search time,
because each comparison discards half of the remaining records.
Disadvantages:
1. The sorting does not remove the need to access other records as the search looks for
particular records.
2. Sequential files cannot support modern technologies that require fast access to stored
records.
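The binary search example with the key fields 20 to 60 can be traced with a small sketch. It works on an in-memory list of keys rather than on a real file, which is a simplification; the function name binary_search and the probes list are illustrative only.

```python
def binary_search(keys, target):
    """Return (index, probes). index is -1 if target is absent;
    probes lists the keys inspected, in order."""
    lo, hi = 0, len(keys) - 1
    probes = []
    while lo <= hi:
        mid = (lo + hi) // 2          # middle of the remaining range
        probes.append(keys[mid])
        if keys[mid] == target:
            return mid, probes
        if keys[mid] < target:
            lo = mid + 1              # discard the lower half
        else:
            hi = mid - 1              # discard the upper half
    return -1, probes

index, probes = binary_search([20, 30, 40, 50, 60], 50)
```

For the key 50 the search inspects 40 first and then 50, so only two records are examined instead of four.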
2. Random or Direct File Organization:
Records are stored randomly but accessed directly. To access a record in a randomly
organized file, a record key is used to determine where the record is stored on the storage
media. Magnetic and optical disks allow data to be stored and accessed randomly.
Advantages:
1. Quick retrieval of records.
2. The records can be of different sizes.
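A minimal sketch of direct access follows. It assumes fixed-size record slots and a simple modulo function to turn the record key into a position; both are assumptions made only to keep the arithmetic short, and collisions and empty-slot detection are left out. An in-memory byte buffer stands in for the disk file.

```python
import io
import struct

RECORD = struct.Struct("<i16s")   # hypothetical layout: 4-byte key + 16-byte name
SLOTS = 8                         # hypothetical number of record slots in the file

def slot_offset(key):
    """Map a record key straight to a byte offset (simple modulo hashing)."""
    return (key % SLOTS) * RECORD.size

def put(f, key, name):
    f.seek(slot_offset(key))                  # jump directly to the slot
    f.write(RECORD.pack(key, name.encode()))  # struct pads the name with zero bytes

def get(f, key):
    f.seek(slot_offset(key))
    stored_key, raw = RECORD.unpack(f.read(RECORD.size))
    # A different stored key means this slot belongs to another record.
    return raw.rstrip(b"\x00").decode() if stored_key == key else None

f = io.BytesIO(bytes(RECORD.size * SLOTS))    # in-memory stand-in for a disk file
put(f, 50, "fifty")
put(f, 21, "twenty-one")
```

Calling get(f, 21) reads exactly one slot, no matter how many records the file holds.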
3. Serial File Organization:
Records in a file are stored and accessed one after another. The records are not sorted
in any particular order on the storage medium. This type of organization is mainly used on
magnetic tapes.
Advantages:
1. It is simple.
2. It is cheap.
Disadvantages:
1. It is cumbersome to access because you have to access all preceding records before
retrieving the one being searched for.
2. Space on the medium is wasted in the form of inter-record gaps.
3. It cannot support modern high speed requirements for quick record access.
4. Indexed-Sequential File Organization:
This is almost the same as the sequential method, except that an index is used to enable the
computer to locate individual records on the storage media. For example, on a magnetic drum,
records are stored sequentially on the tracks. However, each record is assigned an index that can
be used to access it directly.
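The idea of combining sequential storage with an index can be sketched as follows. The records are written in key order, and a dictionary remembers the byte offset at which each record starts. The record contents and the use of a dictionary as the index are assumptions for illustration; an in-memory buffer stands in for the storage medium.

```python
import io

def build_file(records):
    """Write records in key order; return (file, index of key -> byte offset)."""
    f = io.BytesIO()                      # in-memory stand-in for the storage medium
    index = {}
    for key in sorted(records):
        index[key] = f.tell()             # remember where this record starts
        f.write(f"{key},{records[key]}\n".encode())
    return f, index

def read_direct(f, index, key):
    """Jump straight to one record through the index."""
    f.seek(index[key])
    return f.readline().decode().rstrip("\n")

def read_sequential(f):
    """Scan every record in key order, ignoring the index."""
    f.seek(0)
    return [line.decode().rstrip("\n") for line in f]

# Hypothetical records: key -> name
f, index = build_file({40: "dawit", 20: "abel", 30: "chaltu"})
```

The same file therefore supports both a full ordered scan and a direct lookup of a single record.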

Buffering: Buffering means that the operating system stores (its own copy of) data in memory
while the data is being transferred to or from a device.
The following are the uses of buffering:
• To cope with a speed mismatch between the producer and the consumer of the data.
• To adapt between devices that transfer data in different sizes.
• To maintain copy semantics.
• To make repeated access to file data fast.
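The effect of holding data in a memory buffer can be observed with a small experiment. Note the assumption: the sketch uses the buffer that Python's own I/O library keeps inside the process, which is only an analogue of the buffers the operating system keeps for devices. The 4096-byte buffer size is an arbitrary choice.

```python
import os
import tempfile

def buffered_write_sizes(payload):
    """Return the on-disk file size before and after the buffer is flushed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=4096) as f:   # 4096-byte buffer (assumed size)
            f.write(payload)                           # lands in the memory buffer
            before = os.path.getsize(path)             # file on disk is still empty
            f.flush()                                  # hand the buffered bytes over
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)

before, after = buffered_write_sizes(b"abc")
```

The small write is held in memory until the flush, which is why many small writes can be turned into fewer, larger transfers.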
Sequential vs Non-sequential Files:
Function | Sequential | Non-Sequential
Storage space allocation and tracking | Volumes | Disk blocks
Aggregate reconstruction | Not available | Available
Available for use as copy storage pools or active data pools | Available | Not available
File location | Volume location is limited by the trigger prefix or by manual specification | File volumes use directories
Migration | Performed by volumes | Performed by node
Storage pool backup | Performed by volume | Performed by node and file

5.2. Content and Structure of Directories:


A directory is a collection of nodes containing information about all files. Both the
directory structure and the files reside on disk.
The following operations are performed on directories:
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
The directory is organized logically to obtain:
• Efficiency: locating a file quickly.
• Naming: convenient to users.
   o Two users can have the same name for different files.
   o The same file can have several different names.
• Grouping: logical grouping of files by properties (e.g. all Java programs,
all games, …)
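The directory operations listed above map onto a few calls in Python's os module. The sketch below runs them inside a temporary directory; the names programs, Main.java, notes.txt and todo.txt are hypothetical.

```python
import os
import tempfile

def demo_directory_operations(root):
    """Create, list, search, rename and delete entries under root."""
    sub = os.path.join(root, "programs")               # hypothetical directory name
    os.mkdir(sub)
    open(os.path.join(sub, "Main.java"), "w").close()  # create a file
    open(os.path.join(root, "notes.txt"), "w").close()

    listing = sorted(os.listdir(root))                  # list a directory

    # Traverse the tree and search for a file by name.
    found = [os.path.relpath(os.path.join(d, name), root)
             for d, _, names in os.walk(root) for name in names
             if name == "Main.java"]

    os.rename(os.path.join(root, "notes.txt"),
              os.path.join(root, "todo.txt"))           # rename a file
    os.remove(os.path.join(root, "todo.txt"))           # delete a file
    return listing, found, sorted(os.listdir(root))

with tempfile.TemporaryDirectory() as tmp:
    listing, found, remaining = demo_directory_operations(tmp)
```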
Single level directory:

• A single directory for all users
• Naming problem
• Grouping problem

Two level directory:

• Separate directory for each user
• Path name
• Can have the same file name for different users
• Efficient searching
• No grouping capability

Tree structured directories:

• Efficient searching
• Grouping capability
• Current directory (working directory), e.g. spell/mail/prog/obj.
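A tree structured directory and the resolution of a path name can be modelled with nested dictionaries. This is only a toy model: directories are dictionaries, files are strings holding their contents, and the user names alice and bob are hypothetical.

```python
def resolve(root, path):
    """Walk a '/'-separated path through nested dicts and return the entry found."""
    node = root
    for part in path.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]                 # descend one directory level
    return node

# Hypothetical tree: directories are dicts, files are strings.
tree = {
    "spell": {"mail": {"prog": {"obj": "object code"}}},
    "alice": {"report.txt": "alice's report"},
    "bob":   {"report.txt": "bob's report"},
}
```

Because every user has a separate directory, alice/report.txt and bob/report.txt are different files even though the file name is the same.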

5.3. File System Techniques:


Partitioning: A partition is a logical division of a hard disk that is treated as a separate unit by
operating systems and file systems. The operating systems and file systems can manage
information on each partition as if it were a distinct hard drive. This allows the drive to operate
as several smaller sections to improve efficiency, although it reduces usable space on the hard
disk because of additional overhead from multiple operating systems.
A disk partition manager allows system administrators to create, resize, delete
and manipulate partitions, while a partition table logs the location and size of each partition.
Each partition appears to the operating system as a distinct logical disk, and the operating
system reads the partition table before any other part of the disk.
Once a partition is created, it is formatted with a file system such as:
• NTFS on Windows drives
• BSD partition
• FAT32 and exFAT for removable drives
• Solaris x86
• HFS Plus on Mac computers
• DOS partition
• Ext4 on Linux etc.
Data and files are then written to the file system on the partition. When users boot the
operating system in a computer, a critical part of the process is to give control to the first sector
on the hard disk.
This sector includes the partition table, which defines how many partitions the hard disk is
divided into, the size of each partition and the address where each disk partition begins. The
sector also contains a program that reads the boot sector for the operating system and gives it
control so that the rest of the operating system can be loaded into RAM.
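Reading a partition table out of the first sector can be sketched as below. The notes do not name a specific table format, so the sketch assumes the classic MBR layout: a 512-byte sector with four 16-byte entries starting at byte 446 and the signature bytes 55 AA at the end. The sector is built synthetically here instead of being read from a real disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446                    # classic MBR: four 16-byte entries start here
# Entry fields: status, CHS start, type, CHS end, first LBA, sector count
ENTRY = struct.Struct("<B3sB3sII")

def parse_partition_table(sector):
    """Return (type, first_lba, sector_count) for each non-empty MBR entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + ENTRY.size * i
        _status, _chs0, ptype, _chs1, first_lba, count = ENTRY.unpack(
            sector[start:start + ENTRY.size])
        if ptype != 0:                # type 0 marks an unused entry
            partitions.append((ptype, first_lba, count))
    return partitions

# Synthetic sector with one bootable partition (type 0x83, commonly used for Linux).
sector = bytearray(SECTOR_SIZE)
sector[446:462] = ENTRY.pack(0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 204800)
sector[510:512] = b"\x55\xaa"
partitions = parse_partition_table(bytes(sector))
```

Each tuple gives the partition type, the sector address where the partition begins and its size in sectors, which matches the information the partition table is said to hold.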

Disk Partitioning

Mounting: Mounting is a process by which the operating system makes files and directories on
a storage device (such as a hard disk drive, CD-ROM, or network share) available for users to
access via the computer's file system.
In general, the process of mounting consists of the operating system acquiring access to the
storage medium, then recognizing, reading and processing the file system structure and metadata
on it, before registering them with the virtual file system (VFS) component.
The exact location in the VFS where the newly mounted medium is registered is called the
mount point. When the mounting process is completed, the user can access files and directories
on the medium from there.
Unmounting: The opposite process of mounting is called unmounting, in which the operating
system cuts off all user access to files and directories on the mount point, writes the remaining
queue of user data to the storage device, refreshes file system metadata, then relinquishes access
to the device, making the storage safe for removal.
Normally, when the computer is shutting down, every mounted storage device undergoes an
unmounting process to ensure that all queued data is written, and to preserve the integrity of the
file system structure on the media.
Virtual File System: A virtual file system is a software layer that forms an interface between an
operating system's kernel and a more concrete file system. The VFS serves as an abstraction layer
that gives applications access to different types of file systems and to local and network storage
devices.
VFS on UNIX provides an object oriented way of implementing file systems. VFS allows
the same system call interface (the API) to be used for different types of file systems. The API is
written to the VFS interface, rather than to any specific type of file system.
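Mounting and the VFS abstraction can be illustrated together with a toy model. Everything in it is hypothetical: the classes FileSystem, MemoryFS, UpperCaseFS and VFS are invented for this sketch and do not correspond to any real kernel interface. The point is only that callers use one read call while the mount table decides which file system serves the path.

```python
class FileSystem:
    """Interface every concrete file system must provide (hypothetical, minimal)."""
    def read(self, path):
        raise NotImplementedError

class MemoryFS(FileSystem):
    def __init__(self, files):
        self.files = files
    def read(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]

class UpperCaseFS(MemoryFS):
    """A second toy file system that behaves differently behind the same interface."""
    def read(self, path):
        return super().read(path).upper()

class VFS:
    def __init__(self):
        self.mounts = {}                         # mount point -> file system

    def mount(self, point, fs):
        self.mounts[point.rstrip("/") or "/"] = fs

    def unmount(self, point):
        del self.mounts[point.rstrip("/") or "/"]

    def read(self, path):
        # Choose the longest mount point that is a prefix of the path.
        for point in sorted(self.mounts, key=len, reverse=True):
            prefix = point if point.endswith("/") else point + "/"
            if path.startswith(prefix):
                return self.mounts[point].read("/" + path[len(prefix):])
        raise FileNotFoundError(path)

vfs = VFS()
vfs.mount("/", MemoryFS({"/etc/motd": "welcome"}))
vfs.mount("/mnt/usb", UpperCaseFS({"/readme.txt": "hello"}))
```

After vfs.unmount("/mnt/usb"), a read of /mnt/usb/readme.txt falls through to the root file system and fails, which mirrors how files on a device stop being reachable once it is unmounted.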

5.4. Memory-Mapped Files:


Memory-mapped files are a feature of all modern operating systems. They require
coordination between the memory manager and the I/O subsystem.
Basically, you can tell the OS that some file is the backing store for a certain portion of
the process's memory. Understanding this requires the concept of virtual memory.
Memory-mapped files offer a unique memory management feature that allows
applications to access files on disk in the same way they access dynamic memory, through
pointers. With this capability you can map a view of all or part of a file on disk to a specific
range of addresses within your process's address space.
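A short sketch with Python's mmap module shows a file being changed through ordinary memory operations. The file is a temporary file created only for this example, and the helper name uppercase_in_place is hypothetical.

```python
import mmap
import os
import tempfile

def uppercase_in_place(path):
    """Map the whole file into memory and modify it through slice assignment."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:   # length 0 maps the entire file
            view[:] = view[:].upper()            # ordinary memory write, same length
            view.flush()                         # push the changes back to the file

fd, path = tempfile.mkstemp()
os.write(fd, b"memory mapped")
os.close(fd)
uppercase_in_place(path)
with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
```

No explicit read or write call touches the data inside uppercase_in_place; the file contents change because the mapped memory changed.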

5.5. Special Purpose File Systems:


The most familiar file systems make use of an underlying data storage device that offers
access to an array of fixed size blocks, sometimes called sectors. The file system software is
responsible for organizing these sectors into files and directories, and keeping track of which
sectors belong to which file and which are not being used.
File systems typically have directories which associate file names with files, usually by
connecting the file name to an index into a file allocation table of some sort, such as the FAT in
an MS-DOS file system, or an inode in a UNIX-like file system.
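How a directory entry leads into a file allocation table can be sketched with three small lists. This is a toy model and not the real FAT on-disk format: the end-of-chain marker, the block contents and the file name report.txt are all assumptions.

```python
END = -1                                   # hypothetical end-of-chain marker

directory = {"report.txt": 2}              # file name -> first block number
fat = [END, END, 5, END, END, 3, END]      # fat[b] = block that follows block b
blocks = ["", "", "File ", "tems", "", "sys", ""]   # toy disk contents

def read_file(name):
    """Follow the FAT chain from the file's first block and join the blocks."""
    data = []
    block = directory[name]
    while block != END:
        data.append(blocks[block])
        block = fat[block]                 # look up the next block of the file
    return "".join(data)

contents = read_file("report.txt")
```

The file occupies blocks 2, 5 and 3 in that order, so its blocks do not need to be contiguous on the disk.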

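• Disk File System – FAT, NTFS, ext2, ISO 9660.
• Database File System – files are identified by characteristics such as file type, topic or
another structure, and can be queried in an SQL-like way, e.g. WinFS.
• Transactional File System – changes are handled as transactions, e.g. when sending and
receiving data.
• Special Purpose File System – files are arranged dynamically by software, e.g. procfs on
UNIX, which gives information about processes.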
5.6. Naming:
A filename (also written as two words, file name) is a name used to uniquely identify a
computer file stored in a file system. Different file systems impose different restrictions on
filename lengths and the allowed characters within filenames.
Filenames may be long and may contain foreign letters, commas, dots and space characters,
and software that displays filenames shows them as they are.

A filename may include one or more of these components:

• Host (or server) – network device that contains the file.
• Device (or drive) – hardware device or drive.
• Directory (or path) – directory tree.
• File – base name of the file.
• Type (or format) – the content type of the file.
• Version – revision or generation number of the file.

The components required to identify a file vary across operating systems, as do the
syntax and format for a valid filename.
Example: c:\directory\mufile.txt
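The example name can be split into its components with Python's pathlib module. PureWindowsPath is used because the example follows the Windows naming convention; the dictionary keys are simply labels chosen to match the component list.

```python
from pathlib import PureWindowsPath

def split_filename(text):
    """Break a Windows-style file name into device, directory, base name and type."""
    p = PureWindowsPath(text)
    return {
        "device": p.drive,            # drive letter with colon
        "directory": str(p.parent),   # directory part of the path
        "file": p.stem,               # base name without the extension
        "type": p.suffix,             # extension, including the leading dot
    }

parts = split_filename(r"c:\directory\mufile.txt")
```

The host and version components are absent from this particular name, which shows that a filename need not contain every component.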
5.7. Searching:
Searching is just trying to find the information you need. Searching for a file means finding
where the file is stored in the computer. Searching can be done in two ways:
• Linear search
• Binary search
Linear Search: This is the simplest method of searching. In this method, the element to be
found is searched for sequentially in the list. This method can be applied to a sorted or an
unsorted list.
Binary Search: The binary search method is very fast and efficient. This method requires that the
list of elements be in sorted order. In this method, to search for an element we compare it with the
element present at the center of the list. If it matches, the search is successful. Otherwise, the
search continues in the lower half if the element is smaller than the center element, or in the
upper half if it is larger, until the element is found or no elements remain.
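The two search methods can be placed side by side in a short sketch. The linear version is written out by hand; the sorted version relies on bisect_left from Python's standard bisect module, which performs the halving internally. The list contents are hypothetical file names.

```python
from bisect import bisect_left

def linear_search(items, target):
    """Check items one by one; works on sorted or unsorted lists."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

def sorted_search(items, target):
    """Binary search through bisect_left; items must already be sorted."""
    i = bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1

unsorted_names = ["d.txt", "a.txt", "c.txt", "b.txt"]   # hypothetical file names
sorted_names = sorted(unsorted_names)
```

linear_search works on either list, while sorted_search is only reliable on sorted_names.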

5.8. Access:
File access mechanism refers to the manner in which the records of a file may be
accessed. There are several ways to access files:
• Sequential access
• Direct/Random access
• Indexed Sequential access
5.9. Backup Strategies:
In information technology, a backup, or the process of backup, refers to the copying into
an archived file of computer data so it may be used to restore the original after a data loss event.
Backups have two distinct purposes:
1. The primary purpose is to recover data after its loss, be it by data deletion or corruption.
2. The secondary purpose of backups is to recover data from an earlier time, according to a
user defined data retention policy, typically configured within a backup application for
how long copies of data are required.
Data backup is an essential part of data center operations, but it is important to really
understand what makes a backup strategy successful. Most people say that it is necessary to have
a second copy of data in case the original copy fails.
A good backup strategy is obviously going to create that second copy, but it is more
crucial that, when file recovery is needed, the data can actually be found quickly.
Media, tools and techniques used for backups include:
• CDs or DVDs
• Flash memory
• Hard Disk Drive
• Backup software
• Cloud storage
• Compression
• Duplication
• Cache etc.
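The two purposes of a backup, keeping a second copy and restoring from it, can be sketched in a few lines. This is a minimal illustration and not a full backup tool: the label format, the file names and the use of a SHA-256 checksum to verify the copy are all assumptions.

```python
import hashlib
import os
import shutil
import tempfile

def sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def backup(source, backup_dir, label):
    """Copy source into backup_dir under a labelled name and verify the copy."""
    os.makedirs(backup_dir, exist_ok=True)
    target = os.path.join(backup_dir, f"{label}-{os.path.basename(source)}")
    shutil.copy2(source, target)                 # copy2 also keeps timestamps
    if sha256(source) != sha256(target):
        raise IOError("backup copy does not match the original")
    return target

def restore(backup_path, destination):
    shutil.copy2(backup_path, destination)       # overwrite with the saved copy

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data.txt")     # hypothetical data file
    with open(original, "w") as f:
        f.write("version 1")
    saved = backup(original, os.path.join(tmp, "backups"), "2024-01-01")
    with open(original, "w") as f:
        f.write("corrupted")                     # simulate a data loss event
    restore(saved, original)
    with open(original) as f:
        restored = f.read()
```

Putting a label such as a date in the backup name is one simple way to make the right copy easy to find when recovery is needed.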
