CON UNIT 5
UNIT -V
Introduction to Application Layer: Introduction, WWW and HTTP – FTP - E-mail - TELNET
- Secure Shell - Domain Name System - SNMP.
INTRODUCTION:
➢ The application layer provides services to the user. Communication is provided using a
logical connection, which means that the two application layers assume that there is an
imaginary direct connection through which they can send and receive messages.
Services:
➢ All communication networks that started before the Internet were designed to provide
services to network users.
➢ Most of these networks, however, were originally designed to provide one specific
service. For example, the telephone network was originally designed to provide voice
service: to allow people all over the world to talk to each other.
➢ This network, however, was later used for some other services, such as facsimile (fax),
enabled by users adding some extra hardware at both ends.
➢ The Internet was originally designed for the same purpose: to provide service to users
around the world.
➢ The layered architecture of the TCP/IP protocol suite, however, makes the Internet more
flexible than other communication networks such as postal or telephone networks.
➢ Each layer in the suite was originally made up of one or more protocols, but new
protocols can be added or some protocols can be removed or replaced by the Internet
authorities.
➢ However, if a protocol is added to a layer, it should be designed in such a way that it
uses the services provided by one of the protocols at the lower layer.
➢ If a protocol is removed from a layer, care should be taken to change the protocol at the
next higher layer that supposedly uses the services of the removed protocol.
➢ To provide smooth operation of the Internet, the protocols used in the first four layers of
the TCP/IP suite need to be standardized and documented.
➢ There are several application-layer protocols that have been standardized and
documented by the Internet authority, and we are using them in our daily interaction with
the Internet.
➢ Each standard protocol is a pair of computer programs that interact with the user and the
transport layer to provide a specific service to the user.
➢ A programmer can create a nonstandard application-layer program if she can write two
programs that provide service to the user by interacting with the transport layer.
Application-Layer Paradigms
➢ It should be clear that to use the Internet we need two application programs to interact
with each other: one running on a computer somewhere in the world, the other running on
another computer somewhere else in the world.
➢ The two programs need to send messages to each other through the Internet
infrastructure.
➢ However, we have not discussed what the relationship should be between these programs.
➢ The traditional paradigm is called the client-server paradigm. It was the most popular
paradigm until a few years ago.
➢ In this paradigm, the service provider is an application program, called the server
process; it runs continuously, waiting for another application program, called the client
process, to make a connection through the Internet and ask for service.
➢ There are normally some server processes that can provide a specific type of service, but
there are many clients that request service from any of these server processes.
➢ The server process must be running all the time; the client process is started when the
client needs to receive service.
➢ How can a client process communicate with a server process? A computer program is
normally written in a computer language with a predefined set of instructions that tells
the computer what to do.
➢ A computer language has a set of instructions for mathematical operations, a set of
instructions for string manipulation, a set of instructions for input/output access, and so
on.
➢ If we need a process to be able to communicate with another process, we need a new set
of instructions to tell the lowest four layers of the TCP/IP suite to open the connection,
send and receive data from the other end, and close the connection. A set of instructions
of this kind is normally referred to as an application programming interface (API).
➢ An interface in programming is a set of instructions between two entities. In this case,
one of the entities is the process at the application layer and the other is the operating
system that encapsulates the first four layers of the TCP/IP protocol suite.
➢ Several APIs have been designed for communication. One of the most common is the
socket interface. The socket interface is a set of instructions that provides communication
between the application layer and the operating system, as shown in Figure 5.1.
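As a minimal sketch of the four operations named above (open the connection, send, receive, close), the Python code below uses the standard socket module to run a one-shot echo exchange on the loopback interface. The function names run_echo_server and echo_once are hypothetical, chosen only for this illustration, and the server handles a single client instead of running continuously.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Accept one client, echo back whatever it sends, then close."""
    conn, _client_addr = listener.accept()   # wait for a client to connect
    with conn:
        data = conn.recv(1024)               # receive the request
        conn.sendall(data)                   # send the response


def echo_once(message: bytes) -> bytes:
    """Start a one-shot server on 127.0.0.1 and exchange one message with it."""
    # Server side: create a listening socket. Port 0 asks the operating
    # system to pick any free port (an assumption made to keep the demo safe).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server_addr = listener.getsockname()

    worker = threading.Thread(target=run_echo_server, args=(listener,))
    worker.start()

    # Client side: open the connection, send, receive, close.
    with socket.create_connection(server_addr) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    listener.close()
    return reply
```

Calling echo_once(b"hello") returns the same bytes that were sent. A real server process would loop over accept() so that it keeps waiting for new clients.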
Sockets:
➢ Although a socket is supposed to behave like a terminal or a file, it is not a physical entity
like them; it is an abstraction. It is an object that is created and used by the application
program.
Socket Addresses:
➢ How can a client or a server find a pair of socket addresses for communication? The
situation is different for each site.
Server Site:
➢ The server needs a local (server) and a remote (client) socket address for communication.
➢ The local (server) socket address is provided by the operating system. The operating
system knows the IP address of the computer on which the server process is running. The
port number of a server process, however, needs to be assigned.
➢ If the server process is a standard one defined by the Internet authority, a port number is
already assigned to it. For example, the port number assigned to a Hypertext Transfer
Protocol (HTTP) server process is the integer 80, which cannot be used by any other process.
➢ The remote socket address for a server is the socket address of the client that makes the
connection. Since the server can serve many clients, it does not know beforehand the
remote socket address for communication.
Client Site:
➢ The client also needs a local (client) and a remote (server) socket address for
communication.
➢ The local (client) socket address is also provided by the operating system. The operating
system knows the IP address of the computer on which the client is running.
➢ The port number, however, is a 16-bit temporary integer that is assigned to a client
process each time the process needs to start the communication.
➢ This port number is chosen from a set of integers defined by the Internet authority and
called the ephemeral (temporary) port numbers. The operating system must guarantee
that the new port number is not in use by any other running client process.
➢ Finding the remote (server) socket address for a client, however, needs more work. When
a client process starts, it should know the socket address of the server it wants to connect
to.
➢ A pair of processes provide services to the users of the Internet, human or programs. A
pair of processes, however, need to use the services provided by the transport layer for
communication because there is no physical communication at the application layer.
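The pairing of local and remote socket addresses can be observed directly. The sketch below is an illustration using Python's standard socket module; the function name socket_address_pairs is hypothetical, and the server binds to an OS-chosen port as a stand-in for a well-known port. Each end reports its local address with getsockname() and its remote address with getpeername().

```python
import socket


def socket_address_pairs() -> dict:
    """Connect a client to a local server and return both ends' address pairs."""
    # Server site: the local socket address is fixed before any client arrives.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # stand-in for a well-known port such as 80
    listener.listen(1)
    server_addr = listener.getsockname()

    # Client site: the remote (server) address must be known in advance;
    # the operating system assigns the local ephemeral port on connect.
    client = socket.create_connection(server_addr)
    conn, _ = listener.accept()

    pairs = {
        "client": (client.getsockname(), client.getpeername()),
        "server": (conn.getsockname(), conn.getpeername()),
    }
    for sock in (client, conn, listener):
        sock.close()
    return pairs
```

The client's local address equals the server's remote address, and the reverse also holds. The server learns the client's socket address only when the connection is made.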
WORLD WIDE WEB:
➢ The idea of the Web was first proposed by Tim Berners-Lee in 1989. The Web today is a
repository of information in which the documents, called web pages, are distributed all
over the world and related documents are linked together.
➢ The popularity and growth of the Web can be related to two terms in the above statement:
distributed and linked. Distribution allows the growth of the Web.
➢ Each web server in the world can add a new web page to the repository and announce it
to all Internet users without overloading a few servers.
➢ Linking allows one web page to refer to another web page stored in another server
somewhere else in the world.
➢ The linking of web pages was achieved using a concept called hypertext, which was
introduced many years before the advent of the Internet.
➢ The idea was to use a machine that automatically retrieved another document stored in
the system when a link to it appeared in the document.
➢ The Web implemented this idea electronically to allow the linked document to be
retrieved when the link was clicked by the user.
➢ Today, the term hypertext, coined to mean linked text documents, has been changed to
hypermedia, to show that a web page can be a text document, an image, an audio file, or
a video file.
Architecture:
➢ The WWW today is a distributed client-server service, in which a client using a browser
can access a service using a server.
➢ However, the service provided is distributed over many locations called sites. Each site
holds one or more web pages.
➢ Each web page, however, can contain some links to other web pages in the same or other
sites.
➢ A variety of vendors offer commercial browsers that interpret and display a web page,
and all of them use nearly the same architecture. Each browser usually consists of three
parts: a controller, client protocols, and interpreters.
➢ The controller receives input from the keyboard or the mouse and uses the client
programs to access the document.
➢ After the document has been accessed, the controller uses one of the interpreters to
display the document on the screen.
➢ The client protocol can be one of the protocols described later, such as HTTP or FTP.
The interpreter can be HTML, Java, or JavaScript, depending on the type of document.
Some commercial browsers include Internet Explorer, Netscape Navigator, and Firefox.
Web Server:
➢ The web page is stored at the server. Each time a request arrives, the corresponding
document is sent to the client.
➢ To improve efficiency, servers normally store requested files in a cache in memory;
memory is faster to access than a disk.
➢ A server can also become more efficient through multithreading or multiprocessing. In
this case, a server can answer more than one request at a time.
➢ Some popular web servers include Apache and Microsoft Internet Information Server.
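➢ A web page, as a file, needs to have a unique identifier to distinguish it from other web
pages. To define a web page, we need three identifiers: host, port, and path. Before
these, we also need to tell the browser which client-server application to use, called the
protocol, so four identifiers are needed in all.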
Protocol:
➢ The first identifier is the abbreviation for the client-server program that we need in order
to access the web page.
➢ Although most of the time the protocol is HTTP (HyperText Transfer Protocol), we can
also use other protocols such as FTP (File Transfer Protocol).
Host:
➢ The host identifier can be the IP address of the server or the unique name given to the
server. IP addresses can be defined in dotted decimal notation.
Port:
➢ The port, a 16-bit integer, is normally predefined for the client-server application.
Path:
➢ The path identifies the location and the name of the file in the underlying operating
system. The format of this identifier normally depends on the operating system.
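➢ To combine these four pieces, the uniform resource locator (URL) has been designed; it
uses three different separators between the four pieces, as shown below:
protocol://host/path (used most of the time)
protocol://host:port/path (used when the port number is needed)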
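As an illustration, Python's standard urllib.parse module can split a URL back into the four identifiers. The helper name url_parts and the example.com URLs are hypothetical and used only for this sketch.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the protocol, host, port, and path identifiers."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,   # None when the URL relies on the default port
        "path": parts.path,
    }
```

For the URL http://www.example.com:8080/docs/index.html the helper returns the protocol "http", the host "www.example.com", the port 8080, and the path "/docs/index.html". When the port is omitted, the port entry is None and the application's predefined port is assumed.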
Web Documents:
➢ The documents in the WWW can be grouped into three broad categories: static, dynamic,
and active.
➢ Static documents are fixed-content documents that are created and stored in a server.
The client can get a copy of the document only. In other words, the contents of the file
are determined when the file is created, not when it is used.
➢ Static documents are prepared using one of several languages: HyperText Markup
Language (HTML), Extensible Markup Language (XML), Extensible Style Language
(XSL), and Extensible Hypertext Markup Language (XHTML).
Dynamic Documents:
Active Documents:
➢ For many applications, we need a program or a script to be run at the client site. These
are called active documents.
➢ For example, suppose we want to run a program that creates animated graphics on the
screen or a program that interacts with the user.
HTTP Security:
➢ HTTP by itself does not provide security. HTTP can, however, be run over the Secure
Sockets Layer (SSL). In this case, HTTP is referred to as HTTPS. HTTPS provides
confidentiality, client and server authentication, and data integrity.
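A minimal sketch of the client side of this layering, assuming Python's standard ssl module (which implements TLS, the successor to SSL). The function names are hypothetical, the default port 443 is the standard HTTPS port, and only server authentication is configured here; client authentication would need a client certificate, which this sketch does not load.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side context that verifies the server's certificate."""
    # The default context requires a valid certificate and checks that the
    # certificate matches the host name the client asked for.
    return ssl.create_default_context()


def open_secure_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it so HTTP messages travel encrypted."""
    context = make_client_context()
    raw = socket.create_connection((host, port))
    # After wrapping, ordinary HTTP requests written to this socket are
    # protected for confidentiality and integrity.
    return context.wrap_socket(raw, server_hostname=host)
```

The HTTP messages themselves are unchanged; the security comes from the layer that sits between HTTP and TCP.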
FTP:
➢ File Transfer Protocol (FTP) is the standard protocol provided by TCP/IP for copying
a file from one host to another. Although transferring files from one system to another
seems simple and straightforward, some problems must be dealt with first.
➢ Although we can transfer files using HTTP, FTP is a better choice to transfer large files
or to transfer files using different formats. Figure 5.3 shows the basic model of FTP.
The client has three components: the user interface, the client control process, and the
client data transfer process. The server has two components: the server control process
and the server data transfer process.
➢ The control connection is made between the control processes. The data connection is
made between the data transfer processes.
➢ Separation of commands and data transfer makes FTP more efficient. The control
connection uses very simple rules of communication.
➢ We need to transfer only a line of command or a line of response at a time. The data
connection, on the other hand, needs more complex rules due to the variety of data types
transferred.
Two Connections:
➢ The two connections in FTP have different lifetimes. The control connection remains
connected during the entire interactive FTP session. The data connection is opened and
then closed for each file transfer activity.
➢ FTP uses two well-known TCP ports: port 21 is used for the control connection, and port
20 is used for the data connection.
Control Connection:
➢ During this control connection, commands are sent from the client to the server and
responses are sent from the server to the client.
➢ Commands, which are sent from the FTP client control process, are ASCII uppercase
strings, each of which may or may not be followed by an argument. Some of the most
common commands are shown in the table below:
➢ Every FTP command generates at least one response. A response has two parts: a
three-digit number followed by text.
➢ The numeric part defines the code; the text part defines needed parameters or further
explanations.
➢ The first digit defines the status of the command. The second digit defines the area in
which the status applies. The third digit provides additional information.
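The command and response formats described above can be sketched as follows. This is an illustration, not a full FTP client: the names format_command, parse_reply, and FtpReply are hypothetical, the digit meanings are those listed in RFC 959, and only single-line replies are handled.

```python
from typing import NamedTuple

# Meaning of the first digit (status of the command), per RFC 959.
STATUS = {
    "1": "positive preliminary reply",
    "2": "positive completion reply",
    "3": "positive intermediate reply",
    "4": "transient negative completion reply",
    "5": "permanent negative completion reply",
}

# Meaning of the second digit (area in which the status applies), per RFC 959.
AREA = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "authentication and accounting",
    "4": "unspecified",
    "5": "file system",
}


class FtpReply(NamedTuple):
    code: int
    status: str
    area: str
    text: str


def format_command(name: str, argument: str = "") -> bytes:
    """Build one command line: uppercase ASCII name, optional argument, CRLF."""
    line = name.upper() + (" " + argument if argument else "")
    return (line + "\r\n").encode("ascii")


def parse_reply(line: str) -> FtpReply:
    """Split a single-line reply into its three-digit code and its text."""
    code, _, text = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a single-line FTP reply: {line!r}")
    return FtpReply(
        code=int(code),
        status=STATUS.get(code[0], "unknown"),
        area=AREA.get(code[1], "unknown"),
        text=text,
    )
```

For example, parse_reply("226 Transfer complete") yields the code 226, a positive completion status, and the connections area.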
ELECTRONIC MAIL:
➢ Electronic mail (or e-mail) allows users to exchange messages. The nature of this
application, however, is different from other applications discussed so far.
➢ In an application such as HTTP or FTP, the server program is running all the time, waiting
for a request from a client.
➢ When the request arrives, the server provides the service. There is a request and there is a
response.
➢ In the case of electronic mail, the situation is different. First, e-mail is considered a
one-way transaction.
➢ When Alice sends an email to Bob, she may expect a response, but this is not a mandate.
Bob may or may not respond. If he does respond, it is another one-way transaction.
➢ In the common scenario, the sender and the receiver of the e-mail, Alice and Bob
respectively, are connected via a LAN or a WAN to two mail servers. The administrator
has created one mailbox for each user where the received messages are stored.
➢ A mailbox is part of a server hard drive, a special file with permission restrictions. Only
the owner of the mailbox has access to it. The administrator has also created a queue
(spool) to store messages waiting to be sent.
➢ A simple e-mail from Alice to Bob takes nine different steps. Alice and Bob use three
different agents: a user agent (UA), a message transfer agent (MTA), and a message
access agent (MAA).
DOMAIN NAME SYSTEM (DNS):
Name Space:
➢ A name space that maps each address to a unique name can be organized in two ways: flat
or hierarchical. In a flat name space, a name is assigned to an address.
➢ A name in this space is a sequence of characters without structure. The names may or may
not have a common section; if they do, it has no meaning.
➢ The main disadvantage of a flat name space is that it cannot be used in a large system such
as the Internet because it must be centrally controlled to avoid ambiguity and duplication.
➢ In a hierarchical name space, each name is made of several parts. The first part can define
the nature of the organization, the second part can define the name of an organization, the
third part can define departments in the organization, and so on.
➢ In this case, the authority to assign and control the name spaces can be decentralized.
➢ A central authority can assign the part of the name that defines the nature of the
organization and the name of the organization. The responsibility for the rest of the name
can be given to the organization itself.
➢ The organization can add suffixes (or prefixes) to the name to define its host or resources.
The management of the organization need not worry that the prefix chosen for a host is
taken by another organization because, even if part of an address is the same, the whole
address is different.
Label:
➢ Each node in the tree has a label, which is a string with a maximum of 63 characters. The
root label is a null string (empty string).
➢ DNS requires that children of a node (nodes that branch from the same node) have
different labels, which guarantees the uniqueness of the domain names.
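The two rules above (a label of at most 63 characters, and distinct labels among the children of one node) can be sketched as a small tree structure. The class name DomainNode and its methods are hypothetical, and labels are compared exactly as written, which is a simplification.

```python
class DomainNode:
    """One node of the domain name tree; the root has the empty label."""

    MAX_LABEL_LENGTH = 63

    def __init__(self, label: str = "", parent: "DomainNode | None" = None):
        if len(label) > self.MAX_LABEL_LENGTH:
            raise ValueError("a label may have at most 63 characters")
        self.label = label
        self.parent = parent
        self.children: dict[str, "DomainNode"] = {}

    def add_child(self, label: str) -> "DomainNode":
        """Attach a child node, refusing a label a sibling already uses."""
        if not label:
            raise ValueError("only the root may have the null label")
        if label in self.children:
            raise ValueError(f"label {label!r} is already used under this node")
        child = DomainNode(label, parent=self)
        self.children[label] = child
        return child
```

The same label may appear under different parents, because uniqueness is required only among the children of one node.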
Domain Name:
➢ Each node in the tree has a domain name. A full domain name is a sequence of labels
separated by dots (.). The domain names are always read from the node up to the root.
➢ The last label is the label of the root (null). This means that a full domain name always
ends in a null label, which means the last character is a dot because the null string is
nothing. Figure 5.7 shows some domain names.
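As a small illustration of why a full domain name ends in a dot, the sketch below joins labels read from a node up to the root, where the root label is the empty string. The function names and the example.com name are hypothetical.

```python
def full_domain_name(labels_from_node_to_root: list) -> str:
    """Join labels, read from the node up to the root, with dots."""
    labels = list(labels_from_node_to_root)
    if not labels or labels[-1] != "":
        raise ValueError("a full domain name must end with the null root label")
    if any(len(label) > 63 for label in labels):
        raise ValueError("a label may have at most 63 characters")
    if labels == [""]:
        return "."   # the root itself is written as a single dot
    # The empty root label contributes nothing after the final separator,
    # so the joined name ends in a dot.
    return ".".join(labels)


def split_labels(domain_name: str) -> list:
    """Split a full domain name back into its labels, root label last."""
    if not domain_name.endswith("."):
        raise ValueError("a full domain name ends in a dot")
    if domain_name == ".":
        return [""]
    return domain_name.split(".")
```

For the labels "www", "example", "com", and the null root label, the result is the name "www.example.com." with a trailing dot.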
SNMP:
➢ Several network management standards have been devised during the last few decades.
The most important one is Simple Network Management Protocol (SNMP), used by the
Internet.
➢ SNMP is a framework for managing devices in an internet using the TCP/IP protocol suite.
It provides a set of fundamental operations for monitoring and maintaining an internet.
SNMP uses the concept of manager and agent.
➢ That is, a manager, usually a host, controls and monitors a set of agents, usually routers or
servers (see Figure 5.9).
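The manager and agent roles can be pictured with a toy, in-process model. This is only a sketch of the relationship: it does not implement the SNMP message format or any network exchange, and the class names, method names, and variable names (such as "packets_forwarded") are hypothetical.

```python
class Agent:
    """A managed device (for example a router) that keeps named variables."""

    def __init__(self, name: str, variables: dict):
        self.name = name
        self.variables = dict(variables)

    def get(self, key: str):
        return self.variables[key]

    def set(self, key: str, value) -> None:
        self.variables[key] = value


class Manager:
    """A host that monitors and controls a set of agents."""

    def __init__(self, agents: list):
        self.agents = {agent.name: agent for agent in agents}

    def monitor(self, key: str) -> dict:
        """Read the same variable from every agent."""
        return {name: agent.get(key) for name, agent in self.agents.items()}

    def control(self, agent_name: str, key: str, value) -> None:
        """Change a variable on one agent."""
        self.agents[agent_name].set(key, value)
```

Monitoring corresponds to the manager reading values kept by the agents, and control corresponds to the manager changing a value on an agent.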