Tuesday, April 2, 2013

The Google File System


by S. Ghemawat et al., SOSP 2003.

Abstract:
We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.
While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points.
The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.
In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

Link to the full paper:
http://www.cse.buffalo.edu/faculty/tkosar/cse710_spring13/papers/gfs.pdf

5 comments:

  1. Can this file system be used as a general file system?

    ReplyDelete
    Replies
    1. As a general purpose commercial product, GFS suffers some serious shortcomings.

      Performance on small reads and writes, which it wasn’t designed for, isn’t good enough for general data center workloads.

      The record append file operation and the “relaxed” consistency model, while excellent for Google, wouldn’t fit many enterprise workloads. Since appending is key to GFS write performance in a multi-writer environment, it might be that GFS would give up much of its performance advantage even in large serial writes in the enterprise.
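The "at least once" semantics of record append are worth spelling out: a retried append can leave duplicate (or padding) records in the chunk, and GFS pushes deduplication to the reader. Here is a toy Python sketch of that idea; the names (`append_record`, `read_records`, the record-id scheme) are hypothetical illustrations, not GFS APIs.

```python
# Hypothetical sketch (not GFS code): record append is "at least once",
# so a retried append may leave duplicates in the file. Writers tag each
# record with a unique id; readers filter duplicates out.

def append_record(log, payload, record_id, fail_first=False):
    """Simulate a client append that may be retried after a lost ack."""
    log.append((record_id, payload))      # first attempt lands in the file
    if fail_first:                        # ...but the ack is lost,
        log.append((record_id, payload))  # so the client retries: duplicate

def read_records(log):
    """Reader-side filtering: skip records whose id was already seen."""
    seen, out = set(), []
    for record_id, payload in log:
        if record_id not in seen:
            seen.add(record_id)
            out.append(payload)
    return out

log = []
append_record(log, b"event-1", "id-1")
append_record(log, b"event-2", "id-2", fail_first=True)  # duplicated on disk

physical = len(log)          # 3 physical records
logical = read_records(log)  # 2 logical records after dedup
```

The point for enterprise workloads is exactly this: the filesystem does not give you exactly-once, byte-addressed writes, so every reader must be prepared to tolerate duplicates and padding.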

      Delete
  2. How does lazy space allocation avoid wasting space due to internal fragmentation?

    ReplyDelete
    Replies
    1. Lazy space allocation means the filesystem doesn't allocate physical disk space for a file until data is actually written. Such files are commonly referred to as sparse files. For example, if only the first 2 MB of a 64 MB chunk file is used, only about 2 MB is actually consumed on disk.

      Delete
  3. In the case of a file smaller than the 64 MB chunk size, the chunkserver holding its single chunk might be overloaded by many concurrent requests. Do they reduce this hotspot by creating additional replicas on multiple chunkservers at run time?
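The paper does describe this hotspot for small files, and its fix is to raise the replication factor for such chunks (plus staggering application start times). A back-of-the-envelope sketch of why more replicas help, with clients picking a replica uniformly at random (the simulation names and numbers are illustrative, not from the paper):

```python
import random
from collections import Counter

# Illustrative sketch (not GFS code): peak read load on the busiest
# chunkserver drops as the replication factor for a hot chunk grows,
# assuming clients pick replicas uniformly at random.

def peak_load(num_replicas, num_reads, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    servers = [f"chunkserver-{i}" for i in range(num_replicas)]
    load = Counter(rng.choice(servers) for _ in range(num_reads))
    return max(load.values())

peak_3 = peak_load(3, 9000)  # peak load with the default 3 replicas
peak_9 = peak_load(9, 9000)  # peak load with a higher replication factor
```

With 9000 reads, the busiest of 3 replicas serves roughly 3000 requests, while the busiest of 9 serves roughly 1000, so the per-server hotspot shrinks about linearly in the replication factor.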

    ReplyDelete