
$\lambda$FS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions (2306.11877v1)

Published 20 Jun 2023 in cs.DC

Abstract: The metadata service (MDS) sits on the critical path for distributed file system (DFS) operations, and therefore it is key to the overall performance of a large-scale DFS. Common "serverful" MDS architectures, such as a single server or cluster of servers, have a significant shortcoming: either they are not scalable, or they make it difficult to achieve an optimal balance of performance, resource utilization, and cost. A modern MDS requires a novel architecture that addresses this shortcoming. To this end, we design and implement $\lambda$FS, an elastic, high-performance metadata service for large-scale DFSes. $\lambda$FS scales a DFS metadata cache elastically on a FaaS (Function-as-a-Service) platform and synthesizes a series of techniques to overcome the obstacles that are encountered when building large, stateful, and performance-sensitive applications on FaaS platforms. $\lambda$FS takes full advantage of the unique benefits offered by FaaS (elastic scaling and massive parallelism) to realize a highly-optimized metadata service capable of sustaining up to 4.13$\times$ higher throughput, 90.40% lower latency, 85.99% lower cost, 3.33$\times$ better performance-per-cost, and better resource utilization and efficiency than a state-of-the-art DFS for an industrial workload.

Authors (5)
  1. Benjamin Carver (5 papers)
  2. Runzhou Han (4 papers)
  3. Jingyuan Zhang (1 paper)
  4. Mai Zheng (8 papers)
  5. Yue Cheng (32 papers)