by M. Vrable et al., FAST 2012.
Abstract: We present BlueSky, a network file system backed by cloud storage. BlueSky stores data persistently in a cloud storage provider such as Amazon S3 or Windows Azure, allowing users to take advantage of the reliability and large storage capacity of cloud providers and avoid the need for dedicated server hardware. Clients access the storage through a proxy running on-site, which caches data to provide lower-latency responses and additional opportunities for optimization. We describe some of the optimizations which are necessary to achieve good performance and low cost, including a log-structured design and a secure in-cloud log cleaner. BlueSky supports multiple protocols, both NFS and CIFS, and is portable to different providers.
Link to the full paper:
http://www.cse.buffalo.edu/faculty/tkosar/cse710_spring13/papers/bluesky.pdf
If multiple proxies are implemented, with stronger consistency provided by serializing concurrent file accesses, how would this affect the performance of the system? Since this feature has not been implemented yet, I would like to know whether achieving consistency this way is really worth the performance cost.
Log-structured file systems generally gain performance by reducing seek times. So can we say there is an advantage only if network delays << disk seek times? Maybe by the time we reach that point, SSDs will have replaced HDDs. Thoughts?
The cloud storage interface doesn't support partial updates of a stored object, so an append model is used for storing data. How are deletions handled in the append model? Does the cleaner (garbage collector) handle this entirely?
How does the provider prove its identity?