Distributed Systems

Basics

List and explain the 4 goals of distributed systems according to Tanenbaum (memorize)
- Connecting users and resources
  - Provide access to remote resources
  - Simplifies collaboration and exchange of information
  - Security mechanisms
  - Billing and accounting mechanisms
- Transparency
  - Hide the fact that resources are physically distributed
  - Hide where a resource is located
  - Hide the failure and recovery of a resource
- Openness
  - System offers functionality according to standard rules that specify how to access this functionality
  - Protocols, API, DSL, description
- Scalability
  - Allow growth in number of users/resources, geographical extent, number of administrative domains
Draw the tree structure of the different architectures and explain the levels (memorize)
- System vs Software
- In System: Is it Centralized, Decentralized or Hybrid?
- In Decentralized: Vertical or Horizontal?
- In Horizontal: Is it Structured or Unstructured?
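The tree described by the bullets above can be written down as a nested structure and printed level by level. This is only a sketch for memorizing the levels: the root label "Architecture", the dict layout and the `render` helper are illustrative assumptions, not part of the lecture material.

```python
# Classification tree taken from the bullets above.
# The root label and the nested-dict layout are illustrative choices.
ARCHITECTURES = {
    "Architecture": {
        "Software": {},
        "System": {
            "Centralized": {},
            "Decentralized": {
                "Vertical": {},
                "Horizontal": {
                    "Structured": {},
                    "Unstructured": {},
                },
            },
            "Hybrid": {},
        },
    }
}


def render(tree, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(ARCHITECTURES)))
```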
What is the difference between Software Architecture and System Architecture?
- I would say that software architecture is a part of system architecture
- System architecture covers both software and hardware and is used to design such a composite system. → It is about the final working solution: the deployment and structure that guarantee the “qualities” → “Final instantiation of software architecture on real machines”
- Software architecture considers factors such as business strategy, human dynamics, quality attributes, design, and the IT environment. → It is about the “qualities” themselves. → “How various software components are organized and should interact”
What is the difference between Centralized, Decentralized and Hybrid?
- Centralized: Client-Server, where the layers (UI, application, DB) can be deployed to client and server in various ways (e.g. a fat client holds the UI and part of the application, a thin client holds only the UI). Trend: application parts are shifting from client to server. The server side can itself consist of several layers, e.g. separate servers for application and database, or even separately deployed services.
- Decentralized: There is no single central point of control; decisions are usually made by voting instead. Pure P2P, e.g. P2P file-sharing systems or blockchains.
- Hybrid: combination of centralized and decentralized solutions. P2P that relies on a centralized server, e.g. for maintaining the central list of users that have specific files. Security problem: the central instance knows the information about all peers. Example: BitTorrent
- Decision:
  - Centralized: all decision-making powers and processing capabilities are located within a single central unit or location
  - Decentralized: decision-making and processing capabilities are distributed across multiple nodes or units within the system
- Which P2P architectures are there?
  - Structured P2P: based on deterministic algorithms for structuring, joining and leaving. Examples:
    - Chord system, which is logically organized as a ring, where the id determines which data is stored on which node. Joining and leaving follow a strict policy.
    - Content Addressable Network (CAN): Each node has an associated region in a d-dimensional coordinate space. Every data item in CAN is assigned a unique point in this space, which also determines which node is responsible for that data. Similar to Chord, but with regions of a 2D plane instead of a ring. Joining and leaving also follow strict procedures.
  - Unstructured P2P: based on randomized algorithms for structuring.
    - Each node maintains a list of neighbors, where each of these neighbors represents a randomly chosen node from the current set of nodes
    - Can be easily constructed as a new peer that wants to join the network can copy existing links of another node and then form its own links over time
    - If a peer wants to find a desired piece of data in the network, the query has to be flooded through the network in order to reach as many peers as possible that share the data. “Do you have X? Do you know somebody who has X? Do you know somebody who knows somebody who has X?…”
    - Main disadvantage: queries may not always be resolved
    - Popular content is likely to be available at several peers, so any peer searching for it is likely to find it
    - If a peer is looking for rare or unpopular data shared by only a few other peers, the search is unlikely to succeed
    - Flooding causes a large amount of signaling traffic in the network → high overhead → poor search efficiency
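The flooding search in an unstructured overlay can be sketched as a breadth-first traversal. The hop limit `ttl` is an assumption added here to keep the flood bounded (the notes above do not mention it); the function name `flood_search` and the example overlay are hypothetical as well.

```python
from collections import deque


def flood_search(neighbors, holders, start, item, ttl):
    """Flood a query through the overlay, at most `ttl` hops from `start`.

    neighbors: dict node -> list of neighbor nodes (the random overlay links)
    holders:   dict node -> set of items stored at that node
    Returns (set of nodes that hold the item, number of query messages sent).
    """
    found = set()
    messages = 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if item in holders.get(node, set()):
            found.add(node)
        if hops == ttl:
            continue  # the query expires here and is not forwarded further
        for peer in neighbors.get(node, []):
            messages += 1  # every forwarded query costs one message
            if peer not in visited:
                visited.add(peer)
                queue.append((peer, hops + 1))
    return found, messages


if __name__ == "__main__":
    # Hypothetical overlay: two peers share a popular file, one a rare file.
    overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
               "D": ["B", "C", "E"], "E": ["D"]}
    files = {"B": {"popular"}, "C": {"popular"}, "E": {"rare"}}
    print(flood_search(overlay, files, "A", "popular", ttl=1))
    print(flood_search(overlay, files, "A", "rare", ttl=1))
    print(flood_search(overlay, files, "A", "rare", ttl=3))
```

In this small overlay the popular item is found one hop away, the rare item is missed with `ttl=1`, and finding it with `ttl=3` costs 9 messages for only 5 nodes. That is the unresolved-query and overhead problem from the bullets above in miniature.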
Is there a solution for the disadvantages of unstructured P2P?
- Superpeers: special nodes that maintain an index of data items and act as brokers for the other nodes
- Example: Content Delivery Networks (CDN), i.e. collaboration of nodes that offer resources to each other
- When a regular peer joins the network, it attaches to one of the superpeers and remains attached until it leaves the network
- Assumes that superpeers are long-lived processes with high availability
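A superpeer's broker role can be sketched as a small index that regular peers register with when they join. The class and method names (`SuperPeer`, `attach`, `detach`, `lookup`) are hypothetical, and the sketch leaves out communication between superpeers.

```python
class SuperPeer:
    """Broker that keeps an index: item -> set of attached peers holding it."""

    def __init__(self, name):
        self.name = name
        self.index = {}

    def attach(self, peer, items):
        """Register a joining regular peer and the items it shares."""
        for item in items:
            self.index.setdefault(item, set()).add(peer)

    def detach(self, peer):
        """Remove a leaving peer from every index entry."""
        for item in list(self.index):
            self.index[item].discard(peer)
            if not self.index[item]:
                del self.index[item]

    def lookup(self, item):
        """Answer a query from the local index instead of flooding."""
        return set(self.index.get(item, set()))


if __name__ == "__main__":
    sp = SuperPeer("sp1")
    sp.attach("peer1", ["song.mp3", "paper.pdf"])
    sp.attach("peer2", ["paper.pdf"])
    print(sp.lookup("paper.pdf"))
    sp.detach("peer1")
    print(sp.lookup("song.mp3"))
```

A lookup is answered with a single request to the superpeer, which is why the approach depends on superpeers staying available.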

Distribution

What is vertical distribution?
- Vertical distribution in decentralized networks refers to the allocation of different layers or levels of functionality across various nodes or components.
- Manage distributed systems by logically (and physically) splitting functions across machines, where each machine is tailored for a specific group of functions
- Term is related to the concept of vertical fragmentation as used in distributed relational databases (i.e. tables are split column wise)
- → Split by functionality
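The database analogy mentioned above (tables split column-wise) can be made concrete with a toy table. The sample rows and both helper functions are made up for illustration; the row-wise split is included only as a contrast.

```python
# Hypothetical sample table.
users = [
    {"id": 1, "name": "Ada", "email": "ada@example.org", "city": "London"},
    {"id": 2, "name": "Bob", "email": "bob@example.org", "city": "Berlin"},
]


def vertical_fragments(rows, key, column_groups):
    """Split column-wise: each fragment keeps the key plus one group of columns."""
    return [
        [{col: row[col] for col in (key, *group)} for row in rows]
        for group in column_groups
    ]


def horizontal_fragments(rows, key, parts):
    """Split row-wise: each fragment keeps all columns of some of the rows."""
    fragments = [[] for _ in range(parts)]
    for row in rows:
        fragments[row[key] % parts].append(row)
    return fragments


if __name__ == "__main__":
    for fragment in vertical_fragments(users, "id", [("name",), ("email", "city")]):
        print(fragment)
    for fragment in horizontal_fragments(users, "id", 2):
        print(fragment)
```

In the vertical split every fragment sees all users but only some of the columns, which mirrors machines that each handle one group of functions.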
What is horizontal distribution? (memorize)
- A client or server is physically split up into logically equivalent parts, and each part operates on its own share of the complete data set → load balancing
- Each component is equal
- Each part acts as a client and server at the same time
- Communication is done through an overlay network
- All participating peers act as network nodes; if two nodes know each other there is an edge between the nodes
- Based on how the nodes are linked with each other, a classification in structured and unstructured Peer to Peer (P2P) overlay networks is done
- → Each node can do everything, and operates with some subset of data
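How each node ends up with its own share of the data can be sketched with a Chord-style ring, the structured overlay mentioned earlier in these notes: a key is stored on the first node whose id is greater than or equal to the key's id, wrapping around at the end of the ring. The 8-bit identifier space, the use of SHA-1 and the function names are assumptions for this sketch; real Chord additionally uses finger tables for efficient routing, which are omitted here.

```python
import hashlib
from bisect import bisect_left

M = 8  # number of identifier bits (assumption): ring positions 0 .. 2**M - 1


def ring_id(name):
    """Map a node name or data key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(node_ids, key_id):
    """Return the first node id >= key_id, wrapping around the ring."""
    ordered = sorted(node_ids)
    pos = bisect_left(ordered, key_id)
    return ordered[pos % len(ordered)]


if __name__ == "__main__":
    nodes = [10, 80, 200]  # hypothetical node ids on the ring
    for key in ("report.pdf", "song.mp3", "photo.png"):
        kid = ring_id(key)
        print(key, "has id", kid, "and is stored on node", successor(nodes, kid))
```

Every node runs the same code and is responsible only for the keys between its predecessor and itself, so the data set is partitioned without any central directory.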
Vertical vs. horizontal distribution
- Vertical: each group is responsible for a specific functionality
- Horizontal: all nodes have the same functionality but hold only a subset of the data
Advantages of horizontal distribution (memorize)
- Load balancing
- No single point of failure
- Good scalability: it is easy to add new nodes
- Good fault tolerance, provided there is redundancy
Disadvantages of horizontal distribution
- Overhead for data lookup, possibly with no result for unpopular files
What are logically equivalent parts?
- Components that have identical rights and the same functions
What is the advantage of logically equivalent parts?
- Nodes can fail without bringing down the whole system → fault tolerance
- Parts of the data may be unavailable, or actions may take longer, but the system stays up
We have many equivalent nodes. What is needed to enable communication between them?
- Nodes are usually connected via the Internet. For communication, a virtual overlay network is built on top of it.
Overlay vs. underlay networks: What are the differences?
- An overlay network is a network that is built on top of another network and is supported by its infrastructure. An overlay network decouples network services from the underlying infrastructure by encapsulating one network packet inside of another packet. After the encapsulated packet has been forwarded to the endpoint, it is de-encapsulated.
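The encapsulation step can be sketched with a toy packet type: the overlay packet becomes the payload of an underlay packet addressed to the tunnel endpoint and is unwrapped there. The `Packet` class, the function names and the addresses are hypothetical and do not follow any concrete tunneling protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object


def encapsulate(inner, tunnel_src, tunnel_dst):
    """Wrap an overlay packet in an underlay packet addressed to the tunnel endpoint."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer):
    """Unwrap at the tunnel endpoint and return the original overlay packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("payload is not an encapsulated packet")
    return outer.payload


if __name__ == "__main__":
    # Overlay addresses are peer names; underlay addresses are IPs (documentation ranges).
    inner = Packet(src="peer-a", dst="peer-b", payload=b"hello")
    outer = encapsulate(inner, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.7")
    print(outer.dst)           # the underlay only looks at the outer header
    print(decapsulate(outer))  # the endpoint recovers the overlay packet
```

The underlay routes only on the outer header, so the overlay is free to use its own addressing, such as peer names or ring ids.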

P2P

What is P2P?
- A system structure where each component is equal and acts as a client and server at the same time
What do we need for a P2P system?