Tuesday, April 9, 2013

Ceph: A Scalable, High-Performance Distributed File System

by S. Weil et al., OSDI 2006.

Abstract:
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.

Link to the full paper:
http://www.cse.buffalo.edu/faculty/tkosar/cse710_spring13/papers/ceph.pdf
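
Before the questions, here is a toy sketch of the CRUSH-style placement the abstract refers to (and that comment 1 below asks about). It is not the real CRUSH algorithm, which walks a hierarchical cluster map with configurable bucket types; the PG count, OSD list, replica count, and rendezvous-style hashing below are illustrative assumptions only. The point it shows is that any party holding the same compact cluster map computes the same object location, so no per-file allocation table or lookup service is needed.

    # Toy CRUSH-style placement sketch (assumptions, not Ceph's real code):
    # an object name hashes to a placement group (PG), and the PG maps to an
    # ordered list of OSDs through a deterministic pseudo-random function.
    import hashlib

    PG_COUNT = 1024          # assumed number of placement groups
    OSDS = list(range(40))   # assumed flat cluster of 40 OSD ids
    REPLICAS = 3             # assumed replication factor

    def object_to_pg(name: str) -> int:
        """Hash an object name into a placement group id."""
        digest = hashlib.sha1(name.encode()).digest()
        return int.from_bytes(digest[:4], "big") % PG_COUNT

    def pg_to_osds(pgid: int, map_epoch: int) -> list:
        """Stand-in for CRUSH: deterministically rank OSDs for a PG."""
        def score(osd: int) -> int:
            h = hashlib.sha1(f"{pgid}:{map_epoch}:{osd}".encode()).digest()
            return int.from_bytes(h[:8], "big")
        ranked = sorted(OSDS, key=score, reverse=True)
        return ranked[:REPLICAS]   # first entry acts as the primary

    # A client can locate an object without asking a server where it lives:
    pg = object_to_pg("inode-123.00000004")   # hypothetical object name
    print(pg, pg_to_osds(pg, map_epoch=7))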

9 comments:

  1. With the CRUSH function (and the hashing it uses) on the client side, can the client directly determine the location of the placement groups, thereby reducing the dependency on the metadata servers, whose primary service would otherwise be to report the location of files on the OSDs?

  2. Heavily read directories are replicated across multiple nodes to distribute load. Also, clients accessing popular metadata are told that the metadata resides on different or multiple MDS nodes to reduce hot spots. How is a globally consistent state of metadata attained if clients make changes to different metadata replicas?

  3. How is the CRUSH function better than consistent hashing? Both provide distribution without file allocation tables, and consistent hashing is less complex.

  4. How does Ceph ensure data integrity across replicas?

  5. When directories become hot spots, they are hashed across multiple nodes. In that case, how is locality preserved?

  6. Are the replicas utilised for load balancing? If so, how is the right OSD identified from the PG?

  7. What is the significance of the global switch feature that Ceph supports?

  8. In Figure 4, does the client release the object lock after receipt of the "ack" message or the "commit" message? After the "ack", the system might accept other reads/writes, right? (A sketch of the ack/commit phases appears after the comments.)

  9. How does the client know the address of the metadata server where the information for the desired file is stored?

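Regarding the ack/commit question in comment 8, here is a minimal sketch, with invented class and method names, of the two-phase acknowledgement the paper describes: an "ack" once an update sits in the in-memory caches of all replicas, and a "commit" once it is safely on every replica's disk. It does not answer when the object lock is released; it only illustrates the two notifications and why the client buffers the write until commit.

    # Minimal sketch of the paper's two-phase write acknowledgement.
    # Class and method names are invented for illustration; this is not
    # Ceph's actual code path.

    class ReplicaOSD:
        def __init__(self):
            self.memory = {}   # in-memory buffer cache
            self.disk = {}     # stable storage

        def apply(self, oid, data):
            self.memory[oid] = data             # basis for the "ack"

        def flush(self, oid):
            self.disk[oid] = self.memory[oid]   # basis for the "commit"

    class PrimaryOSD:
        def __init__(self, replicas):
            self.replicas = replicas

        def write(self, oid, data, client):
            for r in self.replicas:   # forward the update to every replica
                r.apply(oid, data)
            client.on_ack(oid)        # all replicas hold it in memory
            for r in self.replicas:
                r.flush(oid)
            client.on_commit(oid)     # all replicas hold it on disk

    class Client:
        def __init__(self):
            self.buffered = set()     # kept for replay if OSDs lose power

        def write(self, primary, oid, data):
            self.buffered.add(oid)
            primary.write(oid, data, self)

        def on_ack(self, oid):
            pass                      # update visible, but not yet durable

        def on_commit(self, oid):
            self.buffered.discard(oid)   # safe to drop the local copy

    client = Client()
    client.write(PrimaryOSD([ReplicaOSD(), ReplicaOSD()]), "obj-1", b"hello")
    print(client.buffered)   # empty set: the write is committed everywhere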