CS527 Parallel Computer Architecture

Department of Computer Science

University of Crete

___________________________________________________________________________________________________________________________________

Course Type: 

Link to online course catalogue.

Instructors:

Angelos Bilas

___________________________________________________________________________________________________________________________________



Syllabus

___________________________________________________________________________________________________________________________________

 

Week

Date (2024-25) 

Topic

Tasks

 

Mon 23-Sep

 

Postponed due to instructor trip

 

Wed 25-Sep

 

Postponed due to instructor trip

1a

Mon 30-Sep

Logistics & Introduction

1.     Course topics and logistics 

2.     Assignment 1 

3.     Lecture: Historic technology trends (Chapters: 1) - slides in ~hy527/slides/Ch1-PC-new.ppt

4.     Optional reading: G. MOORE, "Cramming More Components onto Integrated Circuits", Electronics, p114-117, April 1965. 

1b

Wed, 02-Oct

Technology Trends

1.     Assignment 1 intro 

o   What are threads? 

o   User vs. kernel threads 

2.     Lecture: Technology trends today (HiPEAC Vision) - slides in ~hy527/slides/HiPEAC_ORAP_150402-final+modified.pdf 

3.     Lecture: Technology trends today - Implications of data growth - slides in ~hy527/slides/coping_with_data_growth_sep15.pdf 

2a

Fri, 04-Oct

(virtual Mon 23-Sep)

Understanding Concurrency

1.     Assignment 1 data structures, ADTs, and mechanisms 

2.     Paper discussion: 

o   Why threads are a bad idea (for most purposes). John Ousterhout. Invited talk, USENIX 1996 Annual Technical Conference. 

o   Why events are a bad idea (for high-concurrency servers). Rob von Behren, Jeremy Condit, and Eric Brewer. 9th Workshop on Hot Topics in Operating Systems (HotOS IX). 

3.     Optional reading: Evolution of the x86 context switch in Linux 

2b

Mon, 07-Oct

Convergence of Parallel Architectures

1.     Assignment 1 questions 

2.     Lecture: Convergence of parallel architectures (Chapters: 1) - slides in ~hy527/slides/Ch1-PC-new.ppt

3.     Optional reading: G. M. AMDAHL, "Validity of the Single-Processor Approach to Achieving Large Scale Computing Capabilities", AFIPS Conference Proceedings, (April 1967), 483-485. 

3a

Wed, 09-Oct

Convergence of Parallel Architectures

1.     Lecture: Parallel Programming Examples (Chapters: 2,3) - slides in ~hy527/slides/Ch2-PC-new.ppt

3b

Fri, 11-Oct

Parallel Programming

1.     Paper discussion: J. B. Dennis and D. P. Misunas, "A Preliminary Architecture for a Basic Data-Flow Processor," Proc. 2nd Annual Symposium on Computer Architecture, Computer Architecture News, 3, 4 (December 1974), 126-132, ACM.

2.     If interested, recent further reading: Tony Nowatzki, Vinay Gangadhar, and Karthikeyan Sankaralingam, "Heterogeneous Von Neumann/Dataflow Microprocessors." Communications of the ACM, June 2019, Vol. 62 No. 6, Pages 83-91, 10.1145/3323923

4a

Mon, 14-Oct

Parallel Programming

1.     Lecture: Parallel Programming Examples (Chapters: 2,3) - slides in ~hy527/slides/Ch2-PC-new.ppt

4b

Wed, 16-Oct

Workload-driven Performance Evaluation

1.     Paper discussion: The SPLASH-2 Programs: Characterization and Methodological Considerations. Steven Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh and Anoop Gupta. In Proceedings of the 22nd International Symposium on Computer Architecture, June 1995. 

5a

Mon, 21-Oct

Parallel Programming,

Workload-driven Performance Evaluation

1.      Assignment 1 due 

2.      Assignment 2 intro 

3.      Lecture: Workload-driven performance evaluation: Parameter space, metrics, interactions between application and architecture (Chapters: 4) - slides in ~hy527/slides/Ch4-PC-new.ppt

5b

Wed, 23-Oct

Time, Ordering, and Memory Consistency

1.     Paper discussion (cont'd): The SPLASH-2 Programs: Characterization and Methodological Considerations. Steven Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh and Anoop Gupta. In Proceedings of the 22nd International Symposium on Computer Architecture, June 1995.

2.     Lecture: Ordering issues, program order, sequential order, coherency vs. consistency, memory consistency models, ensuring coherency, ensuring consistency. (Chapters: 5,9)

 

Mon, 28-Oct

National Holiday

 

6a

Wed, 30-Oct

Time, Ordering, and Memory Consistency

1.     Paper discussion: Time, Clocks, and the Ordering of Events in a Distributed System. Leslie Lamport. Communications of the ACM, 21(7), pp. 558-565, July 1978.

2.     Optional reading: Why logical clocks are easy. Carlos Baquero and Nuno Preguiça. Communications of the ACM, Volume 59, Issue 4, April 2016, pp. 43-47. 

6b

Fri 01-Nov 

(virtual Mon, 28-Oct)

Memory consistency models

1.     Lecture: Memory consistency models: sequential consistency, weak consistency, release consistency, entry consistency. (Chapters: 5,9) 

2.     Optional reading: How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs. Leslie Lamport. IEEE Trans. on Computers, Vol. C-28, Number 9, pp. 690-691, September 1979.

7a

Mon, 04-Nov

Memory consistency models

1.      Assignment 2 due 

2.      Assignment 3 intro

3.      Paper discussion: Memory Consistency Models. D. Mosberger. ACM Operating Systems Review, 27(1), pp. 18-26, January 1993. 

4.      Further reading: (book) A Primer on Memory Consistency and Cache Coherence. Daniel J. Sorin, Mark D. Hill, and David A. Wood.   1st Edition. Morgan & Claypool Publishers ©2011. ISBN:1608455645 9781608455645. [online pdf for preliminary draft]

7b

Wed, 06-Nov

Snoop-Based Shared Memory Multiprocessors

1.     Start thinking about your project proposal

2.     Lecture: Providing consistency in bus-based shared memory multiprocessors. Cache coherency/memory consistency protocols, cache misses categorization. (Chapters: 6) 

3.     Optional reading:

·       P. Sweazey and A. J. Smith. 1986. A class of compatible cache consistency protocols and their support by the IEEE futurebus. In Proceedings of the 13th annual international symposium on Computer architecture (ISCA '86). IEEE Computer Society Press, Washington, DC, USA, 414–423.

·       Correct Memory Operation of Cache-Based Multiprocessors. C. Scheurich and M. Dubois. Proceedings of the 14th International Symposium on Computer Architecture, June 1987, pp. 234-243.

·       J. E. THORNTON, "Parallel Operation in the Control Data 6600," Fall Joint Computer Conference, vol. 26, pp. 33-40, 1964.

 

Mon, 11-Nov

Local Holiday

 

8a

Wed, 13-Nov

Distributed Shared Memory Multiprocessors

1.     Lecture: Scalable shared memory machines, distributed shared memory, directory protocols, CC-NUMA, COMA, synchronization issues. (Chapters: 7, 8, 9)

8b

Fri, 15-Nov

(virtual Mon, 11-Nov)

Distributed Shared Memory Multiprocessors

1.     Lecture: Scalable shared memory machines, distributed shared memory, directory protocols, CC-NUMA, COMA, synchronization issues. (Chapters: 7, 8, 9) 

9a 

Mon, 18-Nov

Distributed Shared Memory Multiprocessors

1.      Assignment 3 due 

2.      Project proposals due. After ack from instructor, start working on your projects.

3.      Paper Discussion: An Evaluation of Directory Schemes for Cache Coherence. A. Agarwal, R. Simoni, J. Hennessy, M. Horowitz. ISCA'88.

9b

Wed, 20-Nov

Software Distributed Shared Memory

1.     Project proposals due – start working on your projects

2.     Lecture: Software shared memory. Shared Virtual Memory, Instrumentation-based software DSM, Implementation issues. (Chapters: 9) 

3.     Paper discussion: Fine-grain Access Control for Distributed Shared Memory, Ioannis Schoinas, Babak Falsafi, Alvin R. Lebeck, Steven K. Reinhardt, James R. Larus, David A. Wood (The Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), Oct. 1994).

10a

Mon, 25-Nov 

Message Passing Multiprocessors

1.     Lecture: Node-to-network interfaces in message passing systems, programming considerations. Increasing message throughput. (Chapters: 10, 7.7). Slides: ~hy527/slides/2.nic_throughput.pptx

10b

Wed, 27-Nov

Message Passing Multiprocessors 

1.     Lecture: Node-to-network interfaces in message passing systems, programming considerations. Increasing message throughput. (Chapters: 10, 7.7). Slides: ~hy527/slides/2.nic_throughput.pptx

2.     Optional Reading: Introduction to the Cell multiprocessor. J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy. IBM Journal of Research and Development. Volume 49, Number 4/5, 2005.

11a

Mon, 02-Dec

Message Passing Multiprocessors 

1.     Lecture: Node-to-network interfaces in message passing systems, programming considerations. Reducing CPU overhead. (Chapters: 10, 7.7). Slides: ~hy527/slides/3.nic_overhead.pptx 

11b

Wed, 04-Dec

Message Passing Multiprocessors 

1.     Paper discussion: Software Overhead in Message Passing Layers: Where Does the Time Go? Vijay Karamcheti and Andrew Chien. International Conference on Architectural Support for Programming Languages and Operating Systems. October 1994. 

12a

Mon, 09-Dec

Message Passing Multiprocessors 

1.     Lecture: Node-to-network interfaces in message passing systems, programming considerations. Reducing message latency. (Chapters: 10, 7.7). Slides: ~hy527/slides/1.nic_latency.pptx 

12b

Wed, 11-Dec

Communication in Anton

1.     Paper discussion: Dror, R.O.; Grossman, J.P.; Mackenzie, K.M.; Towles, B.; Chow, E.; Salmon, J.K.; Young, C.; Bank, J.A.; Batson, B.; Deneroff, M.M.; Kuskin, J.S.; Larson, R.H.; Moraes, M.A.; Shaw, D.E. Overcoming Communication Latency Barriers in Massively Parallel Scientific Computation. IEEE Micro, vol. 31, no. 3, pp. 8-19, May-June 2011.

13a

Mon, 16-Dec

Parallel I/O (in Anton)

1.     Lecture: Why is parallel I/O important - main problems.

2.     Paper Discussion: Accelerating parallel analysis of scientific simulation data via Zazen. Tiankai Tu, Charles A. Rendleman, Patrick J. Miller, Federico Sacerdoti, Ron O. Dror, and David E. Shaw. Usenix FAST'10.

13b

Wed, 18-Dec

Project Presentations

1.     Projects due: Presentations, 15-20 mins/project. 

2.     End of HY527

___________________________________________________________________________________________________________________________________

(c) Copyright University of Crete, Greece, Last Modified: 22-Sep-2024