How to choose a hard disk, just knowing the so-called "shingled disk", you reall

Don't tune out; this really is a different league. Honestly, do you think a few minutes of video from a digital blogger can replace the several weeks of training courses designed for professional IT engineers? In the era of short-video dominance, a single phrase like "don't buy SMR" can attract a lot of traffic. Many digital novices then treat whether a drive uses SMR (Shingled Magnetic Recording) as the sole basis for buying a hard disk. In fact, this is a rather one-sided view.

Today, let's expand and write a long article to talk about the parameters of hard drives.

When we see a hard drive:

The label on the drive usually carries the model number. On this Exos 10 drive, for example, the model number "ST10000NM0096" is on the third line of the label. Nothing else on the label needs attention; with the model number we can look up the data sheet.

In addition, some hard drives carry similar model numbers such as ST10000NM0256. These are derived models of the ST10000NM0096, and we will come back to them a little later.

When you open the data sheet you searched for, you will see a lot of data. The point to make here is that all of it is useful. Today, let's go through what each item is for, point by point.

First, you will find that ST10000NM0096 is just one of many models in this table. It is a sub-model within the larger Exos enterprise storage product line, which is divided into three model groups by market segment: "Hyperscale SATA," "SATA 6Gb/s Standard," and "12Gb/s SAS Standard."

To start with, the Standard model is the most common type of hard disk, while the Hyperscale model is optimized for large-scale OLTP (online transaction processing), Hadoop data-intensive distributed storage, Ceph storage, and high-performance computing applications.

The first formal parameter is Capacity. Both the Hyperscale and Standard models come in two capacities, 8T and 10T. We can skip this one, as it is simply the factory capacity of the disk.

Is the Hyperscale model the better hard disk? To answer that, we need to look at the next set of parameters.

Columns: Standard Model (512e), Hyperscale Model (512e), Standard Model (4Kn), SED Model (512e), SED Model (4Kn), SED-FIPS Model (512e), and SED-FIPS Model (4Kn).

Seagate has made this row a bit tricky. It combines the disk's sector format with its storage standard, so it has to be read in two parts: what is inside the parentheses and what is outside.

Start with the inside of the parentheses. There are only two values, 512e and 4Kn. This is the sector format the disk leaves the factory with.

In the earliest disk format, the 512-byte sector was the smallest storage unit of the hard disk. However, a sector cannot be used entirely for data, because codes for ECC, address marking and similar purposes also occupy disk space (about 65 bytes per sector). The share of a 512-byte sector actually usable for data will not exceed 1 - 65/512 = 87.3%. This is why the 10T hard disk you buy ends up looking smaller than 10T; the shortfall is not entirely due to the conversion between 1000 and 1024.

Later on, manufacturers proposed a standard called AF (Advanced Format), which combines eight 512-byte sectors into a new storage unit. The eight sectors share one set of address codes, so the space utilization becomes 1 - 65/(8*512) = 98.4%, a large improvement. This is the "512e" mentioned above, and it is very much a compromise: 8*512 = 4K, so the sector really is a 4K sector that is artificially split into eight 512-byte sectors purely for compatibility with old systems. If the system supports 4K sectors directly, this step is not needed.

There is a term in the digital circle called "4K alignment", which means lining partitions up with the 4K sector boundaries. It is mostly a non-issue now, though: since Windows 7, the system performs 4K alignment automatically by default whenever a disk with 4K sectors is formatted.
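
The two utilization figures above are easy to reproduce. Here is a minimal sketch that uses the same rough estimate as the text (1 minus overhead divided by sector size) with the approximate 65-byte overhead; the function name is made up for illustration, and real drives do not use exactly this overhead.

```python
def format_efficiency(sector_bytes: int, overhead_bytes: int = 65) -> float:
    """Share of a sector left for user data, using the article's
    rough estimate of 1 - overhead / sector size."""
    return 1 - overhead_bytes / sector_bytes


legacy = format_efficiency(512)          # one 512-byte sector
advanced = format_efficiency(8 * 512)    # eight sectors sharing one overhead

print(f"512-byte sectors: {legacy:.1%}")
print(f"4K sectors:       {advanced:.1%}")
```

The only thing that changes between the two calls is the sector size; the fixed overhead is simply spread over eight times as much data.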

If compatibility with old systems is not considered, then the disk can directly provide native 4K sectors, which is the parameter 4Kn.

As for the information outside the parentheses: Standard Model is the standard disk data transfer mode; Hyperscale Model is the hyperscale mode; SED Model is a self-encrypting drive, which encrypts the data on the disk using a key held in hardware. SED-FIPS Model is also an encrypted model: plain SED is Seagate's own encryption implementation, while SED-FIPS is a form that complies with the requirements of the United States' Federal Information Processing Standard (FIPS) 140-2 and 140-3, providing further protection against unauthorized reading and tampering of data.

So these parameters form a matrix, and each cell corresponds to its own derived model.

Next come the features specific to this series of hard drives.

Helium Sealed-Drive Design With Wide Weld — This is a helium disk; to be exact, a sealed disk that is filled with helium and welded shut in a helium atmosphere. Helium is an inert gas, so it protects the internal components of the disk from oxidation. In addition, only hydrogen is lighter than helium. At the same pressure, the lighter the gas, the less resistance it offers to moving parts, which makes the hard drive more energy-efficient. In large-scale enterprise deployments, helium disks therefore have an energy advantage.

Digital Environmental Sensors — The hard drive carries a set of digital environmental sensors. They monitor the temperature at different positions on the drive, and they also measure the deviation in motor speed and the vibration the drive itself is subjected to. Combined with algorithms in the firmware, these readings are used to adjust the mechanical parts of the drive and keep the data safe.

Protection Information (T10 DIF) — This is a SAS (SCSI) feature specifically for protecting the integrity of disk data. It works by expanding the 512-byte sector mentioned earlier to 520 bytes. The additional 8 bytes let data consistency be checked end to end during transmission. However, the feature is defined as part of the SCSI command set, so SATA drives do not have it.
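
To make the 512-to-520-byte expansion concrete, here is a rough sketch of how the extra 8 bytes can be built and checked. The field layout (a 2-byte guard CRC, a 2-byte application tag, a 4-byte reference tag holding the low bits of the LBA) and the CRC polynomial 0x8BB7 are my understanding of the T10 DIF format, not something printed on the data sheet, and the function names are invented for this example. Treat it as an illustration, not as a reference implementation.

```python
import struct


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0 (assumed T10 DIF guard)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Append 8 bytes of protection information to a 512-byte sector."""
    if len(sector) != 512:
        raise ValueError("expected a 512-byte sector")
    guard = crc16_t10dif(sector)
    return sector + struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)


def verify(block: bytes, lba: int) -> bool:
    """Check the guard CRC and reference tag of a 520-byte block."""
    data, info = block[:512], block[512:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", info)
    return guard == crc16_t10dif(data) and ref_tag == (lba & 0xFFFFFFFF)


block = protect(bytes(range(256)) * 2, lba=42)
print(len(block), verify(block, lba=42), verify(block, lba=43))
```

The reference tag is what catches data that is intact but landed at the wrong address: the same block fails verification when checked against a different LBA.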

SuperParity — This is Seagate's patented technology. It combines the parity information of multiple sectors, which gives some performance improvement when reading data from the disk.

PowerChoice™/PowerBalance™ Technology — PowerChoice™ is an energy-saving technology for Seagate's enterprise hard drives, and it also pays off considerably in a home NAS. In a high-load enterprise environment it can reduce disk power consumption by about 54%. At home, where the NAS drives spend most of their time neither reading nor writing, the saving is even greater. PowerBalance™, on the other hand, uses algorithms to trade off IOPS against power consumption. If your hard drive handles a large number of random read and write tasks, PowerBalance can save further energy.

Low Halogen/Hot-Plug Support — This is the hot-plug feature. Many people think that as long as the disk rack supports hot plugging, any hard drive can be hot-plugged. This is wrong, and it is very likely to cause hard drive failure.

A typical hot-plug hard drive not only has connector pins of different lengths; when its detection pin senses that the drive is being pulled out, its internal circuit also parks the head arm within 0.5 seconds. If you swap drives often, you will notice that pressing the release handle is frequently followed by a "click", which is the sound of the heads parking quickly.

This is also why disk racks have a handle: it lets you lock the drive in place quickly, and lifting it before pulling gives the drive a relatively long time to park its heads. Many people hot-plug a drive without checking whether it supports hot plugging, a point that digital newbies rarely notice.

Cache, Multisegmented (MB) — This is the cache, specifically a multi-segmented cache of up to 256MB. Many people see a cache far larger than 64MB and ask whether this must be a shingled magnetic recording (SMR) disk. Not necessarily. Cache is often a remedial measure for reaching performance targets. This drive spins at only 7200 rpm, which is a performance bottleneck compared with 10,000 rpm or 15,000 rpm drives; since it is designed around energy efficiency, a larger cache is added to make up the performance.

Organic Solderability Preservative — This is an environmental consideration that avoids many toxic and harmful substances in fluxes and preservatives. There is now a strong call for environmental protection, so hard drive manufacturers list this as a selling point. The coating actually applied to the solder surfaces is organic and does not contain the heavy metals and toxic materials used before. However, the technology is still immature and not as durable as the older toxic protective agents, which affects lifespan to a certain extent.

Mean Time Between Failures (MTBF, hours) — This is the average time between failures. All hard drives in this series have an MTBF of 2.5 million hours. It seems very high, but MTBF is a U.S. military standard that is derived from statistical formulas rather than observed on a single drive. 2.5 million hours is about 285 years of continuous operation. It is a parameter that looks reliable on the surface but has no practical significance. If you really think one hard drive will keep running for almost three centuries, that is a bit naive.

Reliability Rating @ Full 24x7 Operation (AFR) — This is the reliability rating: the annualized failure rate under 24/7 operation. Seagate itself says that MTBF is not reliable and uses AFR to rate its hard drives. In fact, this value is still not very reliable either. According to Seagate, it assumes that the temperature inside the case does not exceed 40 degrees Celsius, that the drive runs 8760 hours a year with no more than 250 start-stop cycles, that the voltage is stable, and so on. It is like the official fuel consumption figures from the Ministry of Industry and Information Technology: a general reference, with real results that are often more pessimistic. So the 0.35% on the data sheet needs to be scaled up. In our practice of configuring hard drives for customer projects, we provision 5-15% spare drives, depending on the workload, for a 3-year system life cycle. In our experience, at least 80% of those spares are used up within about 3 years.
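
The 2.5 million hour MTBF and the 0.35% AFR are really the same number in two forms. A minimal sketch of the conversion follows, assuming a constant failure rate (the usual exponential model, which is my assumption and not stated on the data sheet); the function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def mtbf_in_years(mtbf_hours: float) -> float:
    """Express an MTBF figure as years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def mtbf_to_afr(mtbf_hours: float, hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Annualized failure rate under a constant-failure-rate assumption."""
    return 1 - math.exp(-hours_per_year / mtbf_hours)


def expected_failures(drive_count: int, afr: float, years: float) -> float:
    """Expected number of drives that fail at least once over the period."""
    return drive_count * (1 - (1 - afr) ** years)


afr = mtbf_to_afr(2_500_000)
print(f"MTBF in years: {mtbf_in_years(2_500_000):.0f}")
print(f"AFR:           {afr:.2%}")
print(f"Failures per 100 drives over 3 years: {expected_failures(100, afr, 3):.2f}")
```

On paper this predicts roughly one failure per hundred drives over three years. Compare that with the spare pool described above: 80% of a 5-15% pool means 4 to 12 drives per hundred actually replaced, which is the gap between data sheet conditions and real ones.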

Nonrecoverable Read Errors per Bits Read — This is what we call URE, the per-bit probability of an unrecoverable read error quoted by Seagate. Enterprise disks do very well here, with an error rate of 1 in 10^15 bits. However, let's work through a math problem:

Take a 10-disk RAID5 array built from the highest-capacity 10T drives in this series.

1TB is 10^12 bytes, which is 8x10^12 bits, so a 10T drive holds 8x10^13 bits and a 10-disk array 8x10^14 bits. Rebuilding after one disk fails means reading the nine surviving disks in full, 7.2x10^14 bits, so on average about 0.7 unrecoverable errors will turn up during the rebuild, roughly a coin-flip chance of hitting at least one. In theory, a RAID5 of this size cannot be relied on to recover successfully.

In practice, this value tells you the upper limit on the size of a disk array.
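
For anyone who wants to redo this arithmetic for their own array, here is a small sketch. It assumes errors are independent and occur at exactly the data sheet rate of 1 per 10^15 bits, and that a RAID5 rebuild reads every surviving disk in full; both are simplifications, and the function names are made up for the example.

```python
import math


def rebuild_read_bits(disks: int, capacity_tb: int) -> int:
    """Bits read from the surviving disks when one disk of a RAID5 is rebuilt."""
    return (disks - 1) * capacity_tb * 10**12 * 8


def ure_probability(bits: int, ure_rate: float = 1e-15) -> float:
    """Chance of at least one unrecoverable read error while reading `bits` bits."""
    # 1 - (1 - rate)^bits, written with log1p/expm1 to stay accurate for tiny rates
    return -math.expm1(bits * math.log1p(-ure_rate))


bits = rebuild_read_bits(disks=10, capacity_tb=10)
print(f"Bits read during rebuild: {bits:.1e}")
print(f"Expected UREs:            {bits * 1e-15:.2f}")
print(f"Chance of at least one:   {ure_probability(bits):.0%}")
```

Plug in a consumer-class rate of 1 per 10^14 and the same array becomes all but certain to hit an error during a rebuild, which is why the URE figure effectively caps how large an array can sensibly be.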

Power-On Hours per Year (24x7) — The rated annual power-on time is 8760 hours. That is 24 hours a day for 365 days, meaning the drive can run all year round.

512e Sector Size (Bytes per Sector) — This is the size of the 512e sector mentioned earlier. When the disk's Protection Information (T10) feature is not used, the sectors are emulated as 512 bytes. When it is used, the size varies between 520 and 528 bytes depending on the level. Of course, these alternative sector sizes apply only to SAS drives; SATA drives do not have this issue.

4K Native Sector Size (Bytes per Sector) — The size of the 4K native sector is 4096 bytes. Note that in the table the Hyperscale Model is marked with a "—" here, as the column list above already shows: the Hyperscale model is aimed at dense scale-out storage, so it has no 4Kn version. On the SAS interface, the sector is again expanded with additional bytes for protection information.

Limited Warranty (years) — The warranty period is 5 years across the board. Hardware warranty is of some use to individual users, but enterprise users rarely claim it. After all, the system downtime spent waiting for a replacement costs far more than the hard drive itself, so most of them simply swap in a spare. Moreover, nobody dares to put a drive that has been "repaired" under warranty back into a production project. That is precisely why enterprise hard drives come with such long warranty periods: they look good on paper, but no one sends drives in for repair.

Spindle Speed (RPM) — The rotation speed is 7200 revolutions per minute (RPM), which is a very important factor for a hard drive! For enterprise users, however, it is sometimes not the priority. Spindle speed affects the average response time after the drive receives a read or write command, which has two stages. The first is moving the head to the right cylinder (the seek), and the second is waiting for the target sector to rotate under the head (the rotational latency). The faster the disk spins, the more time is saved in the second stage.

In scenarios that demand high read and write speeds, such as a web server, we would choose a 15000 RPM hard drive or go straight to SSD or NVMe. This type of hard drive is mainly used to store transaction files, where being a bit faster or slower does not really matter.

For example, the average latency of this series is 4.16 milliseconds, while that of a 15000 RPM hard drive is about 2 milliseconds, which is indeed twice as fast. The sustained read speed of a 15000 RPM drive can also exceed 300 MB/s, out of reach for a 7200 RPM drive. We will leave SSD and NVMe out of it, because talking about performance without the application scenario is simply disingenuous.
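
The 4.16 ms and 2 ms figures follow directly from the spindle speed: on average the platter has to turn half a revolution before the wanted sector arrives under the head. A quick sketch of that calculation, with an illustrative function name:

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector: the time of half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


for rpm in (7200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Note that this covers only the second stage described above. The head movement in the first stage comes on top and does not depend on spindle speed.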

Interface Access Speed (Gb/s) — This is a very subtle one! The SATA 6G interface runs at 6Gb/s and the SAS 12G interface at 12Gb/s. These drives are backward compatible: SATA also supports 3G and 1.5G, and SAS also supports 6G and 3G. But this figure is only for reference. It is an electrical standard, and the interface speed neither determines nor improves the speed of the hard drive.

Max. Sustained Transfer Rate OD (MB/s, MiB/s) — This is the maximum sustained transfer rate, another particularly misleading figure. It is the fastest transfer speed a single hard drive can theoretically reach, and the data sheet lists only 249MB/s. Even the SAS drives manage only 254MB/s, far below the interface rate.

So why is the interface rate so high if the drive cannot reach it? Because it leaves headroom for faster devices later on. For SATA that is almost hopeless, but SAS may be able to run at the full interface rate when several drives are chained behind one link, which is where the RAID card comes in. If there is a chance, I will tell everyone how to play with arrays, and that will be another very long article.
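
To see how far 249MB/s and 254MB/s fall short of the interface, convert the line rate to usable megabytes per second. The sketch below assumes 8b/10b encoding on both SATA 6Gb/s and SAS 12Gb/s, meaning 80% of the raw bits carry data; that figure is my addition and not from the data sheet, and the function name is illustrative.

```python
def usable_mb_per_s(line_rate_gbps: float, encoding_efficiency: float = 0.8) -> float:
    """Usable payload bandwidth of a serial link in MB/s (decimal megabytes)."""
    payload_bits_per_s = line_rate_gbps * 1e9 * encoding_efficiency
    return payload_bits_per_s / 8 / 1e6


links = {
    "SATA 6Gb/s": (6, 249),   # (line rate in Gb/s, sustained MB/s from the text)
    "SAS 12Gb/s": (12, 254),
}
for name, (gbps, sustained) in links.items():
    ceiling = usable_mb_per_s(gbps)
    print(f"{name}: {sustained} of {ceiling:.0f} MB/s ({sustained / ceiling:.0%} of the link)")
```

A single drive fills well under half of a SATA link and roughly a fifth of a SAS link, which is why only several drives sharing one link have any chance of saturating it.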

Random Read/Write 4K QD16 WCD (IOPS) — This is the random read and write performance with 4K data blocks, expressed in IOPS, that is, how many I/O operations are completed per second. It is the single most important performance indicator of a hard drive! It is a composite measure, and the larger the better: a higher value means the drive responds faster in actual use.
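
One way to see why this figure matters so much is to turn IOPS into throughput. The sketch below uses a purely hypothetical value of 170 IOPS, chosen only for illustration and not taken from the data sheet, and compares it with the 249MB/s sustained rate quoted in this article; look up the real IOPS number for your own model.

```python
def random_throughput_mb_s(iops: float, block_bytes: int = 4096) -> float:
    """Data moved per second when every I/O is one block of `block_bytes`."""
    return iops * block_bytes / 1_000_000


HYPOTHETICAL_IOPS = 170   # illustrative value only, not a data sheet figure
SUSTAINED_MB_S = 249      # sequential figure quoted in the article

random_mb_s = random_throughput_mb_s(HYPOTHETICAL_IOPS)
print(f"Random 4K:  {random_mb_s:.2f} MB/s")
print(f"Sequential: {SUSTAINED_MB_S} MB/s")
print(f"Ratio:      about {SUSTAINED_MB_S / random_mb_s:.0f}x")
```

With numbers of this order, a mechanical drive doing small random reads moves less than one megabyte per second, so the headline transfer rate says little about how it feels under a real mixed workload.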

Average Latency (ms)---Average latency, that is, the average rotational delay. It was already covered under spindle speed, so I will not repeat it.

Interface Ports---The number of interface ports, which may look strange to some. In fact, most SAS hard drives have a dual-port design, so a SAS drive can be connected to two HBA cards for failover: when one HBA card fails, traffic switches to the other. Single-port SATA hard drives do not have this function. For ordinary personal users, though, this item matters little.

Rotational Vibration @ 1500Hz (rad/s²)---Rotational vibration tolerance, an inherent property of a hard drive. The larger the better, as it directly indicates how stable the drive is in operation.

Idle A (W) Average---Idle power consumption, the power the hard drive draws when powered on but not reading or writing. If you have a NAS, the drives sit idle most of the day, so the lower this is the better. However, very low idle power slows the drive's response, and you may notice a stutter when reading data after a period of inactivity.

Max Operating Power, Random Write (WCD) 4K/4Q RR50% / RW50%---Maximum operating power, measured as the average power during random reads and writes of 4K data blocks. It can be taken as the maximum power consumption of your hard drive.

Power Supply Requirements---Which power rails to connect. Most hard drives today use 12V+5V, and anything else is rare.

Temperature, Operating (°C)---Operating temperature, a very important figure that determines the failure rate and life of your hard drive. Generally speaking, enterprise hard drives need to be kept below 60 degrees Celsius, and exceeding that will seriously affect the drive. It is also a key metric to watch in daily monitoring. If there are too many hard drives to watch individually, we generally just lower the temperature of the server room, which solves the problem at the source.

Vibration, Nonoperating: 10Hz to 500Hz (Grms)---Vibration. Grms is root-mean-square acceleration, a measure of vibration intensity at a given position. A value of 2.27 is roughly the vibration of a tractor working in a field, which you will generally not reach at home.

Shock, Operating 2ms (Read/Write) (Gs)---The shock the drive can withstand while operating: an acceleration of 40G over 2 milliseconds, which is not easy to reach at home. As a reference, think of car airbags, which are likewise triggered by an impact of about 40G over 2ms.

Shock, Nonoperating, 1ms/2ms (Gs) — Shock while not operating. For the intensity of 250Gs, refer to the comparison above.

Finally, there are figures such as length, width, height, and weight. Drives of the same form factor have almost the same length and width, while thickness depends on the number of platters inside: the more platters, the thicker the drive.

These are the parameters you can look up for enterprise hard drives. For a consumer hard drive, the manufacturer will, intentionally or not, leave some of them out. For example, this disk:

You can see there are far fewer parameters, and that the rated annual usage of 2400 hours is a long way from the enterprise disk's 8760 hours. The rated write volume of 55T/year is also an amusingly small number from our point of view. But to be honest, it is still enough for ordinary users.
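
To put the two consumer figures in everyday terms, here is a quick conversion of the 2400 hours and 55T per year quoted above. It is plain arithmetic, assuming decimal units (1T = 1000GB) and a 365-day year; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760
DAYS_PER_YEAR = 365


def duty_cycle(power_on_hours: float) -> float:
    """Fraction of the year the drive is rated to be powered on."""
    return power_on_hours / HOURS_PER_YEAR


def daily_volume_gb(tb_per_year: float) -> float:
    """Yearly rated data volume spread evenly over every day, in GB."""
    return tb_per_year * 1000 / DAYS_PER_YEAR


print(f"Rated duty cycle:  {duty_cycle(2400):.1%}")
print(f"Hours per day:     {2400 / DAYS_PER_YEAR:.1f}")
print(f"Data per day:      {daily_volume_gb(55):.0f} GB")
```

That works out to around six and a half hours of use and about 150GB a day, which is small by data center standards but, as said, plenty for an ordinary user.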

Of course, we can talk this casually only because, whatever hard drive we use, we are not spending our own money, so we have never worried much about drive life, performance and the like. Our habit is to run a hard drive for two or three years and replace it at the first sign of trouble. It is a bit like eating and drinking on public funds, and some waste is likely.

Still, there is no need to squeeze every last drop of performance out of a drive. When you pay for a meal in a restaurant, you don't have to lick the plates clean, right? That would hardly be dignified.
