Monday, February 18, 2013

Panache: A Parallel File System Cache for Global File Access


by M. Eshel et al., FAST 2010.

Abstract:
Cloud computing promises large-scale and seamless access to vast quantities of data across the globe. Applications will demand the reliability, consistency, and performance of a traditional cluster file system regardless of the physical distance between data centers.
Panache is a scalable, high-performance, clustered file system cache for parallel data-intensive applications that require wide area file access. Panache is the first file system cache to exploit parallelism in every aspect of its design—parallel applications can access and update the cache from multiple nodes while data and metadata is pulled into and pushed out of the cache in parallel. Data is cached and updated using pNFS, which performs parallel I/O between clients and servers, eliminating the single-server bottleneck of vanilla client-server file access protocols. Furthermore, Panache shields applications from fluctuating WAN latencies and outages and is easy to deploy as it relies on open standards for high-performance file serving and does not require any proprietary hardware or software to be installed at the remote cluster.
In this paper, we present the overall design and implementation of Panache and evaluate its key features with multiple workloads across local and wide area networks.

Link to the full paper:
http://www.cse.buffalo.edu/faculty/tkosar/cse710_spring13/papers/panache.pdf

6 comments:

  1. The gateway node doesn't update the remote cluster immediately after it receives a write request from an application node. Isn't there a possibility of a conflict?

    Replies
    1. Panache specifically uses an asynchronous data writeback strategy, which allows data and metadata updates to proceed at local speeds and also masks remote cluster failures and network outages. The paper mentions one such optimization: grouping write requests to match the optimal GPFS and NFS buffer sizes. Besides, before execution the queued requests are evaluated to eliminate temporary data updates.
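
      The asynchronous writeback with request grouping described above can be sketched roughly as follows. This is a simplified illustration, not Panache's actual implementation: the class name, the coalescing rule, and the 1 MB target request size are all assumptions for the example.

      ```python
      import threading
      from collections import deque

      class WritebackQueue:
          """Sketch: writes complete locally at once; a background flush
          later pushes coalesced requests to the remote cluster."""

          def __init__(self, flush_fn, max_request=1 << 20):
              self.flush_fn = flush_fn        # sends one (offset, data) to the remote cluster
              self.max_request = max_request  # target remote request size (assumed 1 MB)
              self.pending = deque()
              self.lock = threading.Lock()

          def write(self, offset, data):
              """Called on the local write path; returns immediately."""
              with self.lock:
                  # Coalesce with the previous request if the byte ranges are contiguous,
                  # so many small writes become one remote request of buffer-friendly size.
                  if self.pending:
                      prev_off, prev_data = self.pending[-1]
                      if prev_off + len(prev_data) == offset and \
                         len(prev_data) + len(data) <= self.max_request:
                          self.pending[-1] = (prev_off, prev_data + data)
                          return
                  self.pending.append((offset, data))

          def flush(self):
              """Background drain: send the queued, coalesced requests remotely."""
              with self.lock:
                  batch, self.pending = list(self.pending), deque()
              for offset, data in batch:
                  self.flush_fn(offset, data)
      ```

      The key point the reply makes is visible here: `write()` never touches the network, so applications see local latency, while conflicts are deferred to the background flush.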

  2. How is the cache updated? The file handle and inode remain the same for a file, so how does Panache handle outdated file handles in the cache for files that have been deleted?

    Replies
    1. I think Panache will update the metadata of the deleted file from the remote cluster. The file will appear empty in the local cache cluster, and on the next read request the application node will ask the gateway node to fetch the data.

  3. Panache scales I/O performance by using multiple gateway nodes to read chunks of a single file in parallel from multiple remote nodes over NFS/pNFS.
    How is the order of the chunks ensured?
    And if the chunks are changed during transmission, how is that detected and dealt with?

  4. The paper says one of the gateway nodes becomes the coordinator for a file being read in parallel. I suppose this coordinator node keeps information about all the other gateway nodes reading chunks of the same file, i.e., the size and offset of each chunk, etc. It should be responsible for maintaining the file's consistency after the parallel read by multiple gateway nodes.
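
    A coordinator of that kind could answer the ordering question from comment 3 quite simply: each gateway is assigned a chunk by byte offset, and the offset itself determines where the chunk lands, so arrival order doesn't matter. A minimal sketch, with an assumed 4 MB chunk size and round-robin assignment (neither taken from the paper):

    ```python
    def plan_chunks(file_size, gateways, chunk_size=4 << 20):
        """Assign fixed-size byte ranges of one file to gateway nodes
        round-robin; returns a list of (gateway, offset, length)."""
        plan, offset, i = [], 0, 0
        while offset < file_size:
            length = min(chunk_size, file_size - offset)
            plan.append((gateways[i % len(gateways)], offset, length))
            offset += length
            i += 1
        return plan

    def reassemble(results):
        """Chunks may arrive in any order; sorting by the recorded
        offset restores the original byte order."""
        return b"".join(data for _, data in sorted(results, key=lambda r: r[0]))
    ```

    In other words, the chunks carry their own position, so "assuring the order" reduces to bookkeeping at the coordinator rather than ordered delivery over the network.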

    Not sure about changes during the file transmission. But the inode in the cache cluster stores the local modification time. If the file is accessed after the revalidation timeout expires, the gateway node will fetch the remote file's attributes and compare them with the stored values.
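
    That timeout-based revalidation can be sketched as below. The 30-second timeout, the class, and the use of mtime as the compared attribute are assumptions for illustration, not values from the paper.

    ```python
    import time

    REVALIDATION_TIMEOUT = 30.0  # seconds; an assumed value

    class CachedInode:
        """Sketch: a cached inode remembering the remote mtime seen
        at the last attribute check."""
        def __init__(self, remote_mtime, now=None):
            self.remote_mtime = remote_mtime
            self.checked_at = time.monotonic() if now is None else now
            self.valid = True

    def revalidate(inode, fetch_remote_mtime, now=None):
        """Return True if the cached data may still be used."""
        now = time.monotonic() if now is None else now
        if now - inode.checked_at < REVALIDATION_TIMEOUT:
            return inode.valid             # within the timeout: trust the cache
        mtime = fetch_remote_mtime()       # one attribute fetch over the WAN
        inode.checked_at = now
        if mtime != inode.remote_mtime:    # the file changed remotely
            inode.remote_mtime = mtime
            inode.valid = False            # force a re-fetch of the data
        return inode.valid
    ```

    The design choice this illustrates is the trade-off the comment hints at: within the timeout the cache answers with no WAN round trip at all, at the cost of possibly serving data that is stale by up to one timeout interval.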
