Starting to Learn Distributed File Systems
Sunday, January 29, 2012
So here we go! I've been reading like crazy about distributed systems and distributed file systems for the past few days. Like all graduate students, my first instinct was to find the major conferences and read the big papers in the field. The obvious place to start was to see whether any of the big web companies had published papers on their infrastructure. Lo and behold, they have! So here's a coredump of my findings:
- The Datacenter as a Computer - Luiz Barroso and Urs Hölzle
- Bigtable: A Distributed Storage System for Structured Data - Google
- MapReduce: Simplified Data Processing on Large Clusters - Jeffrey Dean and Sanjay Ghemawat (Google)
- The Google File System - Google
- GPFS: A Shared-Disk File System for Large Computing Clusters - Frank Schmuck and Roger Haskin (IBM)
- The Chubby Lock Service for Loosely-Coupled Distributed Systems - Mike Burrows (Google)
- Eventually Consistent - Werner Vogels (Amazon)
- Dynamo: Amazon's Highly Available Key-value Store - Amazon
- Windows Azure Storage: A Highly Available Cloud Storage System with Strong Consistency - Microsoft
- The Hadoop Distributed File System: Architecture and Design - Hadoop
Ars Technica has a gentle introduction to Distributed File Systems (DFS) here. If you know of any more papers, please let me know!
The major conferences seem to be:
- Operating Systems Design and Implementation (OSDI)
- Symposium on Operating Systems Principles (SOSP)
- International Conference on Distributed Computing Systems (ICDCS)
I'll start digesting and blogging about these papers in another post, which means it's time to start a Better Know a Distributed Systems tag!
