Cateva Raspunsuri
Winter, 2004 CSS490 Fundamentals 1
CSS490 Fundamentals
Textbook Ch1
Instructor: Munehiro Fukuda
These slides were compiled from the course textbook and the reference books.
Parallel vs. Distributed Systems
  Memory
    Parallel systems: tightly coupled shared memory (UMA, NUMA)
    Distributed systems: distributed memory (message passing, RPC, and/or use of distributed shared memory)
  Control
    Parallel systems: global clock control (SIMD, MIMD)
    Distributed systems: no global clock control; synchronization algorithms needed
  Processor interconnection
    Parallel systems: order of Tbps (bus, mesh, tree, mesh of trees, and hypercube(-related) networks)
    Distributed systems: order of Gbps (Ethernet (bus), token ring and SCI (ring), Myrinet (switching network))
  Main focus
    Parallel systems: performance (scientific computing)
    Distributed systems: performance (cost and scalability), reliability/availability, information/resource sharing
Milestones in Distributed Computing Systems
  1945-1950s: Loading monitor
  1950s-1960s: Batch system
  1960s: Multiprogramming
  1960s-1970s: Time sharing systems (Multics, IBM360)
  1969-1973: WAN and LAN (ARPAnet, Ethernet)
  1960s-early 1980s: Minicomputers (PDP, VAX)
  Early 1980s: Workstations (Alto)
  1980s-present: Workstation/server models (Sprite, V-system)
  1990s: Clusters (Beowulf)
  Late 1990s: Grid computing (Globus, Legion)
System Models
  Minicomputer model
  Workstation model
  Workstation-server model
  Processor-pool model
  Cluster model
  Grid computing
Minicomputer Model
  Extension of the time sharing system
    A user must first log on to his/her home minicomputer.
    Thereafter, he/she can log on to a remote machine by telnet.
  Resource sharing
    Database
    High-performance devices
  [Figure: three minicomputers connected through ARPAnet]
Workstation Model
  Process migration
    Users first log on to their personal workstations.
    If there are idle remote workstations, a heavy job may migrate to one of them.
  Problems:
    How to find an idle workstation
    How to migrate a job
    What if a user logs on to the remote machine
  [Figure: workstations connected by a 100Gbps LAN]
Workstation-Server Model
  Client workstations
    Diskless
    Graphic/interactive applications are processed locally.
    All file, print, http, and even cycle computation requests are sent to servers.
  Server minicomputers
    Each minicomputer is dedicated to one or more different types of services.
  Client-server model of communication
    RPC (Remote Procedure Call)
    RMI (Remote Method Invocation)
    A client process calls a server process's function.
    No process migration invoked
    Example: NFS
  [Figure: workstations and minicomputers (file server, http server, cycle server) connected by a 100Gbps LAN]
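The RPC idea on this slide, a client process calling a server process's function, can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular RPC system is implemented: the procedure names (add, file_size), the JSON message layout, and the ClientStub class are all hypothetical, and the "network" is replaced by a direct function call that passes bytes.

```python
import json

# Server side: ordinary functions exposed under hypothetical names.
def add(a, b):
    return a + b

def file_size(name):
    # Stand-in for a file-server request; the table below is made-up data.
    sizes = {"notes.txt": 1200, "a.out": 48000}
    return sizes[name]

PROCEDURES = {"add": add, "file_size": file_size}

def server_dispatch(request_bytes):
    """Unmarshal a request, run the named procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    func = PROCEDURES.get(request["proc"])
    if func is None:
        reply = {"ok": False, "error": "unknown procedure " + request["proc"]}
    else:
        reply = {"ok": True, "result": func(*request["args"])}
    return json.dumps(reply).encode("utf-8")

class ClientStub:
    """Client-side stub: makes a remote call look like a local one."""

    def __init__(self, transport):
        # transport is any callable taking bytes and returning bytes; a real
        # system would send the bytes over a network connection instead.
        self._transport = transport

    def call(self, proc, *args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = json.loads(self._transport(request).decode("utf-8"))
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]

stub = ClientStub(server_dispatch)
print(stub.call("add", 2, 3))                # prints 5
print(stub.call("file_size", "notes.txt"))   # prints 1200
```

Only the marshalled request and reply cross the boundary between stub and dispatcher; the job itself never moves, which matches the slide's point that no process migration is invoked.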
Processor-Pool Model
  Clients:
    They log in at one of the terminals (diskless workstations or X terminals).
    All services are dispatched to servers.
  Servers:
    The necessary number of processors is allocated to each user from the pool.
  Better utilization but less interactivity
  [Figure: Server 1 through Server N connected by a 100Gbps LAN]
Cluster Model
  Client
    Takes a client-server model
  Server
    Consists of many PCs/workstations connected to a high-speed network.
    Puts more focus on performance: serves requests in parallel.
  [Figure: client workstations on a 100Gbps LAN; a master node with slaves 1 to N on a 1Gbps SAN; http servers 1 to N]
Grid Computing
  Goal
    Collect the computing power of supercomputers and clusters sparsely located over the nation and make it available as if it were the electric grid
  Distributed supercomputing
    Very large problems needing lots of CPU, memory, etc.
  High-throughput computing
    Harnessing many idle resources
  On-demand computing
    Remote resources integrated with local computation
  Data-intensive computing
    Using distributed data
  Collaborative computing
    Supporting communication among multiple parties
  [Figure: supercomputers, clusters, a minicomputer, and workstations connected by a high-speed information highway]
Reasons for Distributed Computing Systems
  Inherently distributed applications
    Distributed DB, worldwide airline reservation, banking system
  Information sharing among distributed users
    CSCW or groupware
  Resource sharing
    Sharing DB/expensive hardware and controlling remote lab devices
  Better cost-performance ratio / performance
    Emergence of Gbit networks and high-speed/cheap MPUs
    Effective for coarse-grained or embarrassingly parallel applications
  Reliability
    Non-stopping (availability) and voting features
  Scalability
    Loosely coupled connection and hot plug-in
  Flexibility
    Reconfigure the system to meet users' requirements
Network vs. Distributed Operating Systems
  SSI (Single System Image)
    Network OS: NO (ssh, sftp, no view of remote memory)
    Distributed OS: YES (process migration, NFS, DSM (Distributed Shared Memory))
  Autonomy
    Network OS: high (local OS at each computer, no global job coordination)
    Distributed OS: low (a single system-wide OS, global job coordination)
  Fault tolerance
    Network OS: unavailability grows as faulty machines increase.
    Distributed OS: unavailability remains small even if faulty machines increase.
Issues in Distributed Computing System
  Transparency (= SSI)
    Access transparency
      Memory access: DSM
      Function call: RPC and RMI
    Location transparency
      File naming: NFS
      Domain naming: DNS (still location-concerned)
    Migration transparency
      Automatic state capturing and migration
    Concurrency transparency
      Event ordering: message delivery and memory consistency
    Other transparency:
      Failure, replication, performance, and scaling
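Event ordering without a global clock can be illustrated with a logical clock. The sketch below uses Lamport-style logical timestamps, a well-known technique that these slides do not name, so it should be read as one possible illustration rather than the method the course prescribes. The LamportClock class and its method names are hypothetical.

```python
class LamportClock:
    """Per-process logical counter; no shared physical clock is assumed."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the counter."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; its timestamp travels with the message."""
        return self.tick()

    def receive(self, msg_time):
        """Merge the sender's timestamp, then count the receive event."""
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
a.tick()                      # a is at 1
a.tick()                      # a is at 2
stamp = a.send()              # a is at 3; the message carries 3
b.tick()                      # b is at 1
received = b.receive(stamp)   # b jumps to max(1, 3) + 1 = 4
print(stamp, received)        # prints 3 4
```

The receive event always gets a larger timestamp than the matching send, so every process that sorts events by timestamp agrees that the send came first.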
Issues in Distributed Computing System
  Reliability
    Faults
      Fail stop
      Byzantine failure
    Fault avoidance
      The more machines involved, the less avoidance capability
    Fault tolerance
      Redundancy techniques
        K-fault tolerance needs K + 1 replicas.
        K-Byzantine failures need 2K + 1 replicas.
      Distributed control
        Avoiding a complete fail stop
    Fault detection and recovery
      Atomic transaction
      Stateless servers
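The two replica counts on this slide can be checked with a small simulation. It is a sketch under simplifying assumptions: a fail-stop replica either answers correctly or gives no answer (None), a Byzantine replica may return any wrong value, and the client decides by strict majority vote. The function names are hypothetical.

```python
from collections import Counter

def replicas_needed(k, byzantine):
    """Replica count from the slide: K + 1 for fail-stop, 2K + 1 for Byzantine."""
    return 2 * k + 1 if byzantine else k + 1

def first_reply(replies):
    """Fail-stop case: a crashed replica gives no reply (None) but never lies."""
    for reply in replies:
        if reply is not None:
            return reply
    return None

def majority_vote(replies):
    """Byzantine case: accept a value only if a strict majority reports it."""
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replies) // 2 else None

k = 2

# Fail-stop: K crashed replicas out of K + 1 still leave one correct answer.
print(replicas_needed(k, byzantine=False))          # prints 3
print(first_reply([None, None, 42]))                # prints 42

# Byzantine: K lying replicas out of 2K + 1 are outvoted by K + 1 correct ones.
print(replicas_needed(k, byzantine=True))           # prints 5
print(majority_vote([42, 7, 42, 99, 42]))           # prints 42

# With only 2K replicas, K colluding liars can block any majority.
print(majority_vote([42, 7, 42, 7]))                # prints None
```

The last case shows why 2K replicas are not enough: the K faulty replicas tie with the K correct ones, and the voter cannot tell which group to trust.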
Flexibility
  Ease of modification
  Ease of enhancement
  [Figure: three networked nodes, each running user applications on a monolithic kernel (Unix), compared with three networked nodes, each running user applications and daemons (file, name, paging) on a microkernel (Mach)]
Performance/Scalability
  Unlike parallel systems, distributed systems involve OS intervention and a slow network medium for data transfer.
  Send messages in a batch
    Avoid OS intervention for every message transfer.
  Cache data
    Avoid repeating the same data transfer.
  Minimize data copying
    Avoid OS intervention (= zero-copy messaging).
  Avoid centralized entities and algorithms
    Avoid network saturation.
  Perform post operations on client sides
    Avoid heavy traffic between clients and servers.
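The first technique above, sending messages in a batch, can be sketched with a counter that stands in for OS interventions. Everything here is an assumption made for illustration: CountingChannel models a network endpoint where each send() call costs one OS intervention, and BatchingSender with its batch_size parameter is a hypothetical helper, not an API from the course.

```python
class CountingChannel:
    """Stand-in for a network endpoint; each send() models one OS intervention."""

    def __init__(self):
        self.sends = 0
        self.delivered = []

    def send(self, messages):
        self.sends += 1
        self.delivered.extend(messages)

class BatchingSender:
    """Buffers messages and hands them to the channel in groups."""

    def __init__(self, channel, batch_size):
        self.channel = channel
        self.batch_size = batch_size
        self.buffer = []

    def send(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Push out whatever is buffered, even a partial batch.
        if self.buffer:
            self.channel.send(self.buffer)
            self.buffer = []

messages = ["msg%d" % i for i in range(10)]

# One channel operation per message.
unbatched = CountingChannel()
for m in messages:
    unbatched.send([m])

# Up to four messages per channel operation.
batched = CountingChannel()
sender = BatchingSender(batched, batch_size=4)
for m in messages:
    sender.send(m)
sender.flush()

print(unbatched.sends, batched.sends)   # prints 10 3
```

Both channels deliver the same ten messages in the same order; only the number of channel operations differs. The cost is added latency for messages that wait in the buffer.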
Heterogeneity
  Data and instruction formats depend on each machine architecture.
  If a system consists of K different machine types, each machine type needs K-1 pieces of translation software, one for every other type.
  If we have an architecture-independent standard data/instruction format, each machine type needs only one piece of translation software, to and from that standard.
    Example: Java and the Java virtual machine
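The benefit of a standard format can be made concrete with integer byte order. The sketch below assumes big-endian (network byte order) 32-bit integers as the standard format; that choice, the function names, and the system-wide totals (K times K-1 for pairwise translation, which follows from the per-machine K-1 on the slide) are illustrative assumptions.

```python
import struct

def translators_pairwise(k):
    """Each of K machine types translates directly to the K - 1 others."""
    return k * (k - 1)

def translators_with_standard(k):
    """Each machine type only translates to and from one standard format."""
    return k

def to_standard(value):
    """Encode a 32-bit signed integer in big-endian (network) byte order."""
    return struct.pack(">i", value)

def from_standard(data):
    """Decode a 32-bit signed integer from the standard format."""
    return struct.unpack(">i", data)[0]

print(translators_pairwise(5), translators_with_standard(5))   # prints 20 5

# The integer 1 as a little-endian machine would lay it out natively.
little_endian_native = struct.pack("<i", 1)

# Reading those bytes as if they were big-endian gives the wrong value.
print(struct.unpack(">i", little_endian_native)[0])   # prints 16777216

# Going through the standard format round-trips correctly on any machine.
print(from_standard(to_standard(1)))                  # prints 1
```

The explicit ">" and "<" prefixes make the byte order independent of the machine running the code, which is the role the standard format plays on the slide.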
Security
  Lack of a single point of control
  Security concerns:
    Messages may be stolen by an intruder.
    Messages may be plagiarized by an intruder.
    Messages may be changed by an intruder.
  Cryptography is the only known practical method.
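One of the concerns above, a message changed by an intruder, can be detected with a keyed message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library and assumes that sender and receiver already share a secret key; the key, the messages, and the sign/verify helper names are made up for illustration. It does not hide the message contents, so protecting against stolen messages would additionally need encryption, which is outside this sketch.

```python
import hashlib
import hmac

def sign(key, message):
    """Compute an authentication tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"shared-secret-for-illustration-only"
message = b"transfer 10 to alice"
tag = sign(key, message)

# An intruder alters the message in transit but cannot recompute the tag.
tampered = b"transfer 99 to mallory"

print(verify(key, message, tag))    # prints True
print(verify(key, tampered, tag))   # prints False
```

The receiver rejects any message whose recomputed tag does not match, so a modification in transit is detected even though it cannot be prevented.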
Distributed Computing Environment
  [Figure: DCE architecture. Components shown: DCE applications, distributed file service, distributed time service, name service, security service, RPC, and threads, on top of various operating systems and networking]
Exercises (No turn-in)
  1. In what respect are distributed computing systems superior to parallel systems?
  2. In what respect are parallel systems superior to distributed computing systems?
  3. Discuss the difference between the workstation-server and the processor-pool model from the availability viewpoint.
  4. Discuss the difference between the processor-pool and the cluster model from the performance viewpoint.
  5. What is Byzantine failure? Why do we need 2K+1 replicas for this type of failure?
  6. Discuss the pros and cons of a microkernel.
  7. Why can we avoid OS intervention by zero copy?