Computer Science Department,
University of Crete
HY-590.45.
Modern Topics in Scalable Storage Systems
Course Staff
Name | Email | Office Hours
Instructor: Kostas Magoutis | hy590-45@csd | By appt./E-306
Teaching Assistant: Efthymios Papageorgiou | hy590-45@csd | By appt.
General Information
The course meets on Tue 12-2pm and Fri 12-2pm in E.313 (3rd floor of the CSD building). Exceptionally, in certain weeks we will meet on Wed 6-8pm in E.313 instead of Tue 12-2pm (on those occasions, you will be notified in advance).
Announcements
24.3.2025 10:00: Your project proposals are due April 1, 2025
20.2.2025 10:00: Note NEW course meeting times: Tue 12-2pm, Fri 12-2pm (with certain exceptions, see under General Information)
14.2.2024 10:00: To join the HY-590.45 mailing list, send an e-mail to majordomo@csd with body subscribe hy590-45-list
10.2.2024 10:00: We will be using the AWS Academy cloud platform for course assignments; you may find our course page here
12.1.2024 10:00: The course will start on Thursday 13/2
1.1.2024 10:00: You are welcome to get in touch with the instructor to discuss course-related issues
Course Description
The explosive growth of information processing
services in recent years has created an
unprecedented need for storage capacity.
Scalable access to storage resources requires
a class of distributed systems designed for
fast, reliable, and uninterrupted access to storage media
(e.g., magnetic disks and tapes) over high-speed
networks. This course offers an
introduction to scalable storage systems and examines existing design
techniques as well as current research problems in the
design and implementation of such systems, along with
possible solutions.
Some of the
advantages of the scalable storage model over direct-attached
storage include expandable capacity and performance,
as well as improved utilization
and sharing of distributed storage resources. The scalable
storage systems architect, however, faces a number of challenges.
First, the distributed nature of a scalable storage system makes
it more complex than direct-attached storage: administration,
capacity planning, configuration, backup, and disaster recovery
are all complicated at large scale. Second, transferring data over
the network requires stronger security and safety guarantees than
transferring it over the system I/O bus. In addition, it
sometimes requires new, storage-specific network transport
protocols. These and other challenges make scalable storage an
exciting research area that has seen significant advances in
recent years.
The core part of the course focuses on the study of
scalable storage systems with special emphasis on
architectures, design principles for scalable performance,
reliability, and availability, the management of data
during their lifecycle, application-specific design
concepts, ways to reduce implementation cost, storage
system capacity planning, and storage outsourcing services.
This course is intended for graduate students and advanced
undergraduates and requires completing a research
project. Project topics will be chosen with the help and
guidance of the course staff.
Coursework
- Reading and discussing classic and current papers
- Review and presentation of two research papers
- A research project of your choice
Prerequisites
- Introductory operating systems and database courses such as
HY-345 and
HY-360
- Solid understanding of the function and operation of the
file, disk, and network I/O subsystems in modern UNIX systems.
Grading
The final grade depends on class participation, presentation of two research papers, and a research project.
Readings
The assigned papers are available online. You are expected
to read them before the beginning of each class.
There is no required textbook for this class. The following textbooks,
however, are recommended readings:
Syllabus
Date | Topic | Readings, notes
Thu 13/2 | Course overview | -
Fri 14/2 | Background I | -
Thu 20/2 | Background II | -
Fri 21/2 | Background III | -
Wed 26/2 6pm | Background IV | -
Fri 28/2 | Class will be rescheduled | -
Wed 5/3 6pm | Background V | -
Fri 7/3 | Extending file systems over the network | csdp1178, Sandberg: Design and implementation of the Sun Network Filesystem
Wed 12/3 6pm | NFS (contd.) | Macklem: Not Quite NFS, Soft Cache Consistency for NFS
Fri 14/3 | Distributed coordination | Ongaro: In Search of an Understandable Consensus Algorithm
Tue 18/3 | Raft (contd.) | Visualization
Fri 21/3 | Distributed virtual disks | Petal: Distributed virtual disks
Wed 26/3 6pm | Petal (contd.) | -
Fri 28/3 | Presentations I | csdp1318, csdp1368, csdp1394
Tue 1/4 | Tutorial on AWS Academy Learner Lab (TA) | Project proposals due
Fri 4/4 | Presentations I | csdp1397, csdp1408, csdp1414
Wed 9/4 6pm | Presentations I | csd5880, csd5881, csdp1388
Fri 11/4 | Presentations I | csdp1178, csdp1411, csdp1418
Mon 14/4 - Fri 25/4 | Easter recess | -
Tue 29/4 | Distributed file systems I | Thekkath: Frangipani: A Scalable Distributed File System
Fri 2/5 | Distributed file systems II | Ghemawat: The Google File System
Tue 6/5 | Presentations II | csdp1318, csdp1368, csdp1394
Fri 9/5 | Presentations II | csdp1397, csdp1408, csdp1388
Tue 13/5 | Presentations II | csdp1414, csdp1418
Fri 16/5 | Presentations II | csd5880, csd5881
Tue 20/5 | Application-specific storage systems | Saito: Manageability, availability and performance in Porcupine
Projects HOWTO
Please note the following project guidelines:
- Discuss project ideas with course staff
- Prepare a short description (abstract) of your goals and deliverables by Tue 1/4 11:59pm
In preparing your abstract, describe
- The system you will be evaluating (artifacts, experiments)
- Expected outputs (e.g., experiments to focus on, type of deployments on AWS Academy, etc.)
- Project reports (formatted with the ACM paper templates) are due June 30
- Prepare a "rapid fire" (10 min, up to 6 slides) presentation of your key results
- As part of your submission, please include a video of your presentation
- You may use this presentation template
Other Resources / Useful links