Distributed System

What is a distributed system?

The Internet is one example of a distributed system.

A distributed system is a collection of independent computers that work together as if they were one computer. It consists of autonomous components, meaning each one acts independently, yet the components still need to collaborate. The components do not have to be homogeneous to communicate with each other.
The main purpose of a distributed system is to share resources. The Internet, together with the World Wide Web built on it, is one of the most successful examples: a vast interconnected collection of computer networks of many different types.

A distributed system has several goals:
  1. Transparency
  2. Openness
  3. Reliability
  4. Performance
  5. Scalability

Transparency

An important goal is to hide the fact that processes and resources are physically distributed across multiple computers, so that to users the system looks like a single computer.
The figure shows examples of the kinds of transparency a system can offer; a system does not need to provide every kind listed.

Openness

An open distributed system offers services according to standard rules that describe the syntax and semantics of those services. Services are generally specified through interfaces, often written in an Interface Definition Language (IDL). An interface definition allows an arbitrary process that needs a certain interface to talk to another process that provides it. It also allows two independent parties to build completely different implementations of the same interfaces, leading to two separate distributed systems that operate in exactly the same way.
Another important goal is that it should be easy to configure the system out of different components, and to add a new component or replace an existing one without affecting the components that stay in place.
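
As a rough illustration of the interface idea, here is a minimal Python sketch. It is not a real IDL: Python's typing.Protocol only stands in for one, and the names KeyValueStore, DictStore, LogStore and client are hypothetical, chosen for this example. The point is that the client depends on the interface alone, so two independently written implementations are interchangeable.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class KeyValueStore(Protocol):
    """Hypothetical interface: the only thing a client may rely on."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class DictStore:
    """Implementation from one party: keeps everything in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class LogStore:
    """Implementation from another party: appends to a log, scans on read."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> Optional[str]:
        # The most recent entry for a key wins.
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None


def client(store: KeyValueStore) -> Optional[str]:
    """Client code written against the interface only."""
    store.put("page", "v1")
    store.put("page", "v2")
    return store.get("page")


print(client(DictStore()))  # prints "v2"
print(client(LogStore()))   # prints "v2"
```

Replacing DictStore with LogStore requires no change to client, which is the property the openness goal asks for.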

Reliability

A distributed system (DS) should be more reliable than a single system. It should have high availability, meaning the system is usable for a large fraction of the time, which is typically achieved through redundancy. It also needs to maintain consistency and to be secure. Moreover, it should be fault tolerant: it should mask failures and recover from errors automatically.
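
One simple way to mask a failure through redundancy is to try a second replica when the first does not answer. The sketch below assumes replicas can be modelled as plain Python callables; ReplicaDown, call_with_failover and the two example replicas are hypothetical names, not part of any real library.

```python
from typing import Callable, List


class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request (hypothetical)."""


def call_with_failover(replicas: List[Callable[[str], str]], request: str) -> str:
    """Try each replica in turn and return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as exc:
            errors.append(exc)  # remember the failure, try the next replica
    # Only when every replica has failed does the caller see an error.
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def broken_replica(request: str) -> str:
    raise ReplicaDown("replica is unreachable")


def healthy_replica(request: str) -> str:
    return f"reply to {request}"


# The caller never notices that the first replica was down.
print(call_with_failover([broken_replica, healthy_replica], "ping"))  # prints "reply to ping"
```

This covers only masking. Automatic recovery, such as restarting or re-synchronising the failed replica, is a separate concern not shown here.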

Performance

Performance is another important goal of a DS: without a gain in performance, why bother with a distributed system? Because the resources are physically distributed, possibly around the world, communication delay is the main performance issue. In fine-grained parallelism the components interact frequently and exchange small messages, while in coarse-grained parallelism they interact less often and each message sent is larger. Because each interaction pays the communication delay, coarse-grained parallelism usually suits a distributed system better.
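
To see why the number of interactions matters, here is a back-of-the-envelope model in Python. It rests on a simplifying assumption: every message pays one fixed latency plus the time to push its bytes through the link, and nothing overlaps. The function name and all the numbers are made up for illustration.

```python
import math


def transfer_time(total_bytes: int, message_bytes: int,
                  latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated seconds to send total_bytes in messages of message_bytes each.

    Assumed model: one fixed latency per message, plus raw transmission time.
    """
    messages = math.ceil(total_bytes / message_bytes)
    return messages * latency_s + total_bytes / bandwidth_bytes_per_s


# Same 1 MB of data over a link with 50 ms latency and 10 MB/s bandwidth.
fine = transfer_time(1_000_000, 1_000, 0.05, 10_000_000)      # 1000 small messages
coarse = transfer_time(1_000_000, 100_000, 0.05, 10_000_000)  # 10 large messages
print(f"fine-grained: {fine:.1f} s, coarse-grained: {coarse:.1f} s")
```

Under these assumed numbers the fine-grained exchange takes roughly 50.1 s and the coarse-grained one roughly 0.6 s, even though both move the same amount of data. The difference is entirely the per-message delay.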

Scalability

A DS is considered scalable if its performance is maintained while the number of users and resources increases significantly. One challenge is controlling the cost of physical resources: as demand grows, it should be possible to extend the system at reasonable cost to meet the requirement. The system should also keep performance loss under control and prevent software resources from running out.
Here are a few scalability bottlenecks:

  1. Centralized components: a single mail server
  2. Centralized tables/data: a single URL address book
  3. Centralized algorithms: routing based on complete information
An example of a centralized component: if a service is implemented on only a single server, that server can become a bottleneck as the number of users and applications grows.

Decentralized algorithms should be used instead. They have the following characteristics:
  1. No machine has complete information about the system state.
  2. Machines make decisions based on local information.
  3. Failure of one machine does not ruin the algorithm.
  4. There is no implicit assumption that a global clock exists.
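
These characteristics can be illustrated with pairwise gossip averaging, a technique swapped in here purely as an example. Each node holds one number, and whenever two nodes happen to talk they both adopt the average of their two values. No node ever sees the whole system state, yet all values drift towards the global mean. The sketch below simulates the nodes in a single process for readability; gossip_average and its parameters are hypothetical names.

```python
import random
from typing import List


def gossip_average(values: List[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate pairwise gossip: two random nodes average their values per step.

    Each exchange uses only the two nodes' own values (local information);
    there is no coordinator and no global clock ordering the exchanges.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    state = list(values)
    for _ in range(rounds):
        a, b = rng.sample(range(len(state)), 2)  # any two distinct nodes
        mean = (state[a] + state[b]) / 2
        state[a] = state[b] = mean
    return state


result = gossip_average([10.0, 20.0, 30.0, 40.0], rounds=200)
print(result)  # every entry ends up very close to the global mean, 25.0
```

In a real deployment each exchange would be a network message between two machines, and a crashed node would simply stop taking part while the others continue.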
There are three basic techniques for scaling:
  1. Hiding communication latency: avoid waiting for responses to remote service requests as much as possible. This is important for achieving geographical scalability.
  2. Distribution: take a component, split it into smaller parts, and spread those parts across the system.
  3. Replication: this increases availability and balances load between components, leading to better performance, but it may lead to consistency problems. How much inconsistency can be tolerated depends highly on how the resource is used.
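
As a small sketch of the distribution technique, the Python below splits a set of keys across several servers using a stable hash, so that no single server has to hold everything. The server names and the functions owner_of and partition are assumptions made for this example. It uses hashlib from the standard library because Python's built-in hash() for strings varies between runs.

```python
import hashlib
from typing import Dict, List


def owner_of(key: str, servers: List[str]) -> str:
    """Pick the server responsible for key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def partition(keys: List[str], servers: List[str]) -> Dict[str, List[str]]:
    """Group keys by the server that owns them."""
    shards: Dict[str, List[str]] = {server: [] for server in servers}
    for key in keys:
        shards[owner_of(key, servers)].append(key)
    return shards


servers = ["server-a", "server-b", "server-c"]  # hypothetical server names
keys = [f"user{i}" for i in range(12)]
for server, owned in partition(keys, servers).items():
    print(server, owned)
```

Any machine can compute owner_of locally, so lookups need no central table. One known limitation of this simple modulo scheme is that changing the number of servers moves most keys to a different owner.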
