Course details
Code
CS-586
Name
Distributed Computing
Program
Postgraduate
Areas
Algorithms and systems analysis
Computer security and distributed systems
(Algorithms and Systems Analysis)
(Parallel and Distributed Systems)
Description
This course focuses on advanced and modern topics in distributed computing. During the course, students will study and present a collection of research papers on challenging, actively studied research problems in the area of parallel and distributed computing. Examples of research directions covered by the course are the following:
1. Computing with Non-Volatile Memories: Aspects of Distributed and Parallel Computing
Next-generation computer systems will rely on emerging memory technologies, such as Non-Volatile Memory (NVM), to address the high computation demands of modern applications and provide persistence. Major architecture manufacturers have already moved in the direction of designing and prototyping NVM technologies, and Intel has already launched Intel Optane DC persistent memory. NVM technologies are expected to play a crucial role in the design of a wide range of future architectures, including not only commodity computers, but also storage servers, mobile devices, cloud systems, and the internet-of-things. The computing world is thus rapidly moving into the NVM era.
The first part of the course aims at answering questions such as the following: “How does distributed and parallel computation change in the NVM era and what will be the impact of the persistent memory revolution on the way we compute?” To answer such questions, an extended collection of research papers on the topic will be studied. Moreover, through a series of presentations, the most important research achievements in this direction will be discussed and the most promising open problems for future research will be identified.
2. Distributed and Parallel Aspects of Big-Data Analysis
An increasing number of applications across many diverse domains continuously produce very large amounts of data series. (A data series, or data sequence, is an ordered sequence of data points.) Such applications arise in finance, environmental sciences, astrophysics, neuroscience, engineering, multimedia, and other domains. Thus, data series are one of the most common types of data. When these sequence collections are generated, users need to query and analyze them as soon as they become available, a process that is heavily dependent on data series similarity search (which, apart from being a useful query in itself, also lies at the core of several machine learning methods, such as clustering, classification, and motif and outlier detection).
The brute-force approach for evaluating similarity search queries is to perform a sequential pass over the complete dataset. However, as data series collections grow larger, scanning the complete dataset becomes a performance bottleneck, taking hours or more to complete. This is especially problematic in exploratory search scenarios, where every next query depends on the results of previous queries. Consequently, there is an increased interest in developing indexing techniques and algorithms for similarity search.
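The brute-force approach described above can be sketched as follows. This is only a minimal illustration, not material taken from the course: it assumes Euclidean distance as the similarity measure and equal-length series held in memory, and the names `euclidean`, `brute_force_nn`, `collection`, and `query` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length data series (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_nn(query, collection):
    """Return (index, distance) of the series in `collection` nearest to `query`.

    Performs one sequential pass over the complete collection, so the cost
    grows linearly with the number of series and with their length.
    """
    best_idx, best_dist = -1, math.inf
    for i, series in enumerate(collection):
        d = euclidean(query, series)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist


# Toy collection of three data series (hypothetical data).
collection = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 1.0, 1.0, 1.0],
    [3.0, 2.0, 1.0, 0.0],
]
query = [1.0, 1.0, 2.0, 1.0]

print(brute_force_nn(query, collection))
```

Every query touches every series, which is why this scan becomes the bottleneck on large collections and why indexing, parallel, and distributed techniques are of interest.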
The course studies existing research on efficient and scalable large-scale data series processing on top of distributed and parallel computing environments. It also examines challenging directions for future research.
LEARNING OUTCOMES
1. Knowledge: Having attended and succeeded in the course, the student is able to describe the main techniques governing the design of parallel and/or distributed algorithms to solve important research problems, which arise when computing on modern computing platforms and/or with emerging hardware technologies. S/he will also be able to identify important questions that remain open for further study, and will have the ability to combine techniques studied in the course to explore paths towards their solution. Finally, s/he will be able to explore the bibliography related to a research problem and figure out which of the techniques described in existing work can be suitable for solving the problem.
2. Understanding: Having attended and succeeded in the course, the student is able to distinguish the main difficulties when computing on modern computing platforms, as well as on systems that support new hardware technologies. S/he will also be able to explain the main principles of designing, implementing and analyzing parallel and distributed algorithms and applications for such environments. S/he will also be able to assess whether existing techniques can lead to the solution of a research problem, and to generalize or combine such techniques for solving the problem.
3. Application: Having attended and succeeded in the course, the student is able to discover the principles governing the research area covered by the course, as well as to explore solutions for relevant research problems based on the application of the taught techniques and the combination of the gained knowledge.
4. Analysis: Having attended and succeeded in the course, the student is able to understand complex solutions of the research problems under study and to divide such a problem into simpler subproblems (i.e., decompose it into its individual components).
5. Synthesis: Having attended and succeeded in the course, the student is able to combine the solutions of individual subproblems in order to arrive at the solution of a complex problem. S/he is also able to analyze the literature and synthesize existing solutions in order to organize the solution of a research problem from the solution of its individual components.
6. Evaluation: Having attended and succeeded in the course, the student is able to compare different techniques and draw conclusions both in terms of their performance and of their power to solve more difficult problems. Also, the student will be able to judge and evaluate papers s/he reads in the literature on research problems in the field of parallel and distributed computing.
STUDENT PERFORMANCE EVALUATION
Each week, material for study (e.g., a research paper) is assigned, and groups of students or the teacher undertake its presentation. Each presentation is followed by a discussion of questions and open problems, as well as ideas towards their solutions. Also, during the semester students are asked to work on projects, such as writing reports that describe the techniques studied in some part of the course. There is also a final assignment. For example, this can be a study on one of the topics studied in the course, including writing a report on the topic and presenting it to the class.
Each student's score is based on the following criteria:
• Presentations: 30%
• Participation in discussions during the class: 10%
• Assignments: 40%
• Final study: 30%
There is no final written exam for the course.
ECTS
6
Prerequisites
CS-380
Course website
Course email
hy586 AT csd DOT uoc DOT gr