In this episode, Martin and Chris talk to Andy Klein from Backblaze. Since 2014, Backblaze has been collecting and publishing hard drive reliability information for their storage pods. The pods are custom-built and used to hold backup data from customers and to support the B2 cloud storage offering. Rather than buy an expensive storage platform or use the public cloud, Backblaze initially decided to build their own solution at a fraction of the cost of other options. Once at scale, it seemed logical to use the platform for B2.
Over the years, the Backblaze team has collected statistics on thousands of hard drives and dozens of models and capacities. The data shows some interesting statistics, but why collect the information and how is it used?
Andy explains the rationale behind collecting the data and some of the benefits the information provides. Data for 2018 can be found here – https://www.backblaze.com/blog/hard-drive-stats-for-2018/. You can see the generations of pods in this recent blog post – https://www.backblaze.com/blog/backblaze-storage-pod-museum/
Andy also references the source data, which is freely downloadable (although quite sizeable now). You can use this shortlink – https://brk.su/bb-hdd-stats or the full URL – https://www.backblaze.com/b2/hard-drive-test-data.html.
Elapsed Time: 00:43:35
- 00:00:00 – Intros
- 00:01:10 – Who is or are Backblaze?
- 00:02:20 – DIY Storage Pods
- 00:05:30 – What software did Backblaze use?
- 00:07:00 – How does drive supply affect builds?
- 00:11:00 – Enterprise vendors use consumer drives
- 00:11:55 – Backblaze stats are collected from SMART
- 00:15:00 – Data is computed using drive days and AFR
- 00:21:00 – What interesting information did the data show?
- 00:25:50 – Do drives correlate in failures or fail randomly?
- 00:28:08 – The bathtub failure curve is true!
- 00:30:00 – How did Backblaze use the data to improve the service?
- 00:33:30 – What relationship does Backblaze have with drive vendors?
- 00:36:00 – Will SSD find its way into Storage Pods?
- 00:39:00 – What about using dual actuator drives?
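
For listeners who want to try the drive days and AFR calculation discussed around 00:15:00, here is a minimal Python sketch. It assumes the commonly quoted form of the annualized failure rate (failures divided by drive years, expressed as a percentage, where drive years are drive days divided by 365). The function name and the sample figures are hypothetical and are not taken from the Backblaze data set.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the annualized failure rate (AFR) as a percentage.

    Assumed formula: AFR = failures / (drive_days / 365) * 100.
    A drive day is one drive in service for one day.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    sample_failures = 10
    sample_drive_days = 365_000
    afr = annualized_failure_rate(sample_failures, sample_drive_days)
    print(f"AFR: {afr:.2f}%")
```

Counting drive days rather than drives means a model that was only in service for part of the period is weighted by the time it actually ran, which keeps comparisons between drive models fair.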
Related Podcasts & Blogs
- #85 – Storage for Home and Homelabs
- #55 – Storage for Hyperscalers
- HDD Capacity Threshold Reaches 15TB
- Backblaze HDD Statistics for 2017