John Curl's Blowtorch preamplifier part II

Slightly OT: I have three PCs at 2 locations and use Dropbox on all three. That means that each PC as well as the Dropbox servers all have a copy of all my files, documents as well as music. Cost: € 8 per month for up to a terabyte (I am currently using 200G).



Up/download is pretty much instantaneous. So I have 4 distributed copies available, which should be enough for all contingencies save world-wide nuclear war. And no hassle with hardware.



Jan



Jan, I'm not discussing data loss in that sense (outright failure).

I'm discussing silent corruption, which you will receive no notification of and which will not result in a click, a blip or anything else you'd associate with a scratched CD. It's a completely silent (to your knowledge) occurrence: a soft error.

There are a few types of silent data corruption: phantom writes, misdirected reads and writes and bit rot.

Now, I'm not even going to start on memory corruption, which is probably where the audio file is sitting while it's in use. Let's just stick with silent corruption on disk:

Your operating system will interpret the data and make a guess as to what was there. It will be an extremely small difference, but that difference depends on what small component was missing / corrupt.

So, all of your backups may just be duplicate copies of those errors, rather than a high integrity original.

Let me add a bit to everyone's audio paranoia:

This happens every day, all the time, everywhere.

It happens so much that large database corporations have people on payroll whose sole job is to deal with it.

A new breed of file system aims to protect against this: ZFS (acquired by Oracle as part of Sun Microsystems), OpenZFS (an open-source fork of ZFS), Btrfs, and to a lesser extent Apple's brand-new APFS, which was recently rolled out to replace the archaic HFS, originally designed in 1983.

Let me say that again: up until a few months ago, if you were using a Mac, the file system you were storing your music on was rolled out in 1983. Yes, there were some marginal improvements, but not the kind that do anything whatsoever to prevent what I described above.

You can read more here:

http://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf
 
You really need to put down the Kool-Aid. Petabyte-level issues are not a worry in the home.



This happens to everyone: photographers, graphic designers, mastering engineers, etc.

Again, not about total size at all.

If you read the information I have provided or do your own research, you will see that this is a real concern for all people with audio libraries, especially those that are older rather than larger. Scale does increase the frequency, of course.
 
My point is not that this is something everyone needs to concern themselves with.

My point is that this is something that someone who is comparing 16 to 24-bit audio file quality or who has a large high resolution audio collection and state of the art playback system should be concerning themselves with, or at least considering and looking into.

If one is taking such an approach to their music listening or analysis of music equipment, and this is a scientifically verified fact that has audio implications and can be protected against with free and simple methods.... why would it be strange to suggest this as a topic of discussion?
 
In my 40 years as an IT specialist, compiler and operating-system designer, I have never seen [undetected] bit rot (except on those old floppies and first-generation Winchesters). If anything, the system will detect the corruption and you need to go to one of your backups for salvation. I'm not saying it is impossible: [extreme] multi-bit errors could go undetected. In the case of a database it will be detected as data corruption; in [some] cases of other files it may go undetected, but this must be extremely rare.
 
Member
Joined 2014
Paid Member
This happens to everyone: photographers, graphic designers, mastering engineers, etc.

Again, not about total size at all.

If you read the information I have provided or do your own research, you will see that this is a real concern for all people with audio libraries, especially those that are older rather than larger. Scale does increase the frequency, of course.

Nope. As long as you back up regularly by whatever method works for you and you can keep an offsite copy, you really don't need to worry. In the home you don't even have to worry about parity bits in your RAM. Disks degrade, but filesystems monitor that and most will move blocks around if they detect an issue.

The problem comes when people don't test restores until it all fails. I have had a backup disk go bad, but I had an off-site copy to restore from.

And I remember both the original launch of ZFS and the first Sun data appliances that came out using it. I've also been involved in projects with some pretty hairy hardware requirements that DO need all this stuff.
 
AX tech editor
Joined 2002
Paid Member
Now, I'm not even going to start on memory corruption, which is probably where the audio file is sitting while it's in use.

So these are the combinations of errors that escape the error correction/parity/ERC etc hurdles?

Let's just stick with silent corruption on disk:

Your operating system will interpret the data and make a guess as to what was there.

I don't think it works that way. The OS will read stuff from a stream which can originate from a physical medium, memory or an external incoming stream etc. Can even be a keyboard. As far as I know, all of this is heavily redundant in such a way that in theory all errors are corrected. I am aware that, also in theory, it is possible that a specific combination of errors may slip through but that should be extremely rare.
For example, in a simple single-bit parity-checked system, a combination of one bit dropping and one bit coming up will not trigger correction. But I haven't seen such primitive systems since kindergarten (and I am really old ;-).

I am somewhat sceptical about these things that bring in vast sums for the 'problem' solvers, while I haven't seen any of it in 30+ years of intensive IT use, both as a programmer and as a user. But maybe I have just been extremely lucky.

Jan
 
Last edited:
Nope. As long as you back up regularly by whatever method works for you and you can keep an offsite copy, you really don't need to worry. In the home you don't even have to worry about parity bits in your RAM. Disks degrade, but filesystems monitor that and most will move blocks around if they detect an issue.

The problem comes when people don't test restores until it all fails. I have had a backup disk go bad, but I had an off-site copy to restore from.

And I remember both the original launch of ZFS and the first Sun data appliances that came out using it. I've also been involved in projects with some pretty hairy hardware requirements that DO need all this stuff.

Unfortunately, good housekeeping and regular backups do not protect you from silent data corruption. Fortunately, if your backup compares files by names, timestamps and sizes only (no checksumming) the corruption is not propagated to backups.

I have experienced data corruption at home myself. The guilty party was faulty capacitors on the PC motherboard.

Redundant storage systems, among the simplest being "mirror" or RAID-1, usually DO NOT read both disks when retrieving data. (Most hardware RAIDs and, for example, the Linux mirror code work like this. Solaris may do things differently.) They simply read from whichever disk has been fastest lately, or some such. The hard drive does some level of checksumming, but that is not accessible or visible to the operating system. (Mirror or RAID-1 stores data on two (or more) disks; the disks contain exactly the same data.) Filesystems usually do not keep checksums for data.

This is where btrfs helps: data is checksummed and the checksums are stored on disk. So a "mirror" btrfs contains two copies of the data and two copies of the checksums. It has a utility, btrfs scrub, that reads the data from all disks and compares it to the checksums. Even a single disk, like a USB disk, can benefit from btrfs, as you may create such a filesystem with two copies of the metadata, so data integrity can be verified (but not corrected). With two disks, btrfs data corruption can even be corrected. With big money one can buy an EMC Centera system that offers these features.

For more details, google is your friend.

ZFS has some steep hardware requirements depending on what features are used. Deduplication is a heavy task.

A word about backups: multiple copies over the internet do not equal a backup! Best is to have snapshots: that way a deletion (accidental or on purpose) does not delete anything for good immediately; data only disappears when the snapshot cycle ages that change out. There is a good reason why any self-respecting corporation takes backups daily and retains some months' worth of backups.

PS: I have been a silent reader of many forums on this site for years. I have become quite familiar with the way discussions here go, for good and not so good. :rolleyes:
 
AX tech editor
Joined 2002
Paid Member
Fortunately, if your backup compares files by names, timestamps and sizes only (no checksumming) the corruption is not propagated to backups.

You cannot 'switch off' the complex redundancy-based error correction involved in reading and writing files. Whenever you compare anything about files, the error correction is involved. You cannot compare time stamps or file names without invoking error correction, because the compare involves reading parts of the file and thus is subject to the error checking and correction.

Jan
 
Last edited:
You cannot 'switch off' the complex redundancy-based error correction involved in reading and writing files. Whenever you compare anything about files, the error correction is involved. You cannot compare time stamps or file names without invoking error correction, because the compare involves reading parts of the file and thus is subject to the error checking and correction.

Jan

Jan,
Actually, file names and time stamps are in the directory structure part of the disk, not where file data is stored. So, retrieving directory information does not check file contents in any way. It's also possible for file size to remain unchanged even in the presence of some types of file corruption. Therefore, what is referred to as a checksum would be necessary to verify file contents.

Perhaps it would be of interest to mention that what is commonly called a checksum is probably not really a checksum algorithm, but rather something more like a hash-type algorithm (which is more reliable at detecting file changes).

As an aside, for my own backups, I like to have at least 3 copies in case a backup disk happens to go bad. Also, I think it's advisable to check backup drives periodically, and to copy data to new media at least every several years to make sure media formats don't slip into obsolescence.

Regarding risks such as Bit Rot, I would worry more about applications like databases that potentially can slowly and incrementally become increasingly corrupted without it necessarily being apparent. This type of risk is more of an issue in business and similar applications, especially when a database is written to regularly. In the case of music files, once written they are only read, so the risks are much, much less than for a frequently-written-to large database.
 
This happens to everyone- photographers, graphic designers, mastering engineers etc..

Unheard of in our business. An IC mask set is a huge database of polygons and locations (much more than a full DVD of data in some cases); unlike one pixel on a JPEG being a little off, the circuit does not work if there is a single error. There is no interpolation or guessing allowed.

Regarding risks such as Bit Rot, I would worry more about applications like databases that potentially can slowly and incrementally become increasingly corrupted without it necessarily being apparent.

We have collaborative projects on huge chips with the same database accessed, read and rewritten constantly from sites around the globe. It just does not happen.
 
Last edited:
Nope. As long as you back up regularly by whatever method works for you and you can keep an offsite copy, you really don't need to worry. In the home you don't even have to worry about parity bits in your RAM. Disks degrade, but filesystems monitor that and most will move blocks around if they detect an issue.

The problem comes when people don't test restores until it all fails. I have had a backup disk go bad, but I had an off-site copy to restore from.

And I remember both the original launch of ZFS and the first Sun data appliances that came out using it. I've also been involved in projects with some pretty hairy hardware requirements that DO need all this stuff.

You cannot 'switch off' the complex redundancy-based error correction involved in reading and writing files. Whenever you compare anything about files, the error correction is involved. You cannot compare time stamps or file names without invoking error correction, because the compare involves reading parts of the file and thus is subject to the error checking and correction.

Jan

Hmm. I should have been clearer about what I meant.

Scenario:

1. A file is saved to the filesystem. The operating system writes the data to disk and records the filename, timestamps and file size in the metadata.

2. A backup is made. As the file from step 1 does not yet exist on the backup, it is copied to the backup system.

3. Silent data corruption happens on the local system. It is silent in the sense that the storage (hard drive) returns faulty data and the operating system has no way to detect it, as filesystems usually do not checksum data. Should the faulty data occur in metadata, like a corrupted filename or filesystem metadata structure, it might be detected either by the operating system or by the user.

4. The next backup runs. As the file has the same name, size and timestamps (access time is not considered here) as the copy on the backup system, no file is transferred to the backup. This is how backups work: it is far too time-consuming and resource-intensive to read all files through and compare checksums between the local system and the backup system. Even the utility "rsync" by default does not go through both copies of a file byte by byte. Think about 2 TB of storage: assuming a reading speed of 100 MB/s (very optimistic for small files), it would take some 5.5 hours to do a backup, even if nothing had changed! That is why it relies on filenames, sizes and timestamps.

So a file that has become corrupted locally after the first backup and is not modified by the user remains faulty locally and pristine in the backup. Unfortunately the user does not know any of this.

Filesystems keep filenames, sizes and timestamps in metadata; when the operating system returns a directory listing to the user, it does not involve reading the files (unless we are talking about a graphical file manager that reads files to show thumbnails and so on).


Silent data corruption is not silent with btrfs:
https://en.wikipedia.org/wiki/Btrfs
"CRC-32C checksums are computed for both data and metadata and stored as checksum items in a checksum tree.
..
If the file system detects a checksum mismatch while reading a block, it first tries to obtain (or create) a good copy of this block from another device – if internal mirroring or RAID techniques are in use.

Btrfs can initiate an online check of the entire file system by triggering a file system scrub job that is performed in the background. The scrub job scans the entire file system for integrity and automatically attempts to report and repair any bad blocks it finds along the way."

 
Last edited:
We have collaborative projects on huge chips with the same database accessed, read and rewritten constantly from sites around the globe. It just does not happen.

Well, I used to work sometimes with some of the high-level consultants at Accenture, and they and others have seen it happen. They consider it a real risk, one that can be difficult to protect against. In some cases databases have been successfully repaired. In other cases, it's a total loss and they have to build a new database from scratch.

EDIT: I'm not saying it's something that happens frequently, just that it has happened. When it does happen, it's usually due to previously unknown bugs caused by application programmers, which result in bad data being written to the database. It might be discovered, for example, when somebody tries to run a rarely used report on the database and it fails with errors. Investigation of the problem may show bad data has been accumulating for a long time.
 
Last edited:
Well, I used to work sometimes with some of the high-level consultants at Accenture, and they and others have seen it happen. They consider it a real risk, one that can be difficult to protect against. In some cases databases have been successfully repaired. In other cases, it's a total loss and they have to build a new database from scratch.

EDIT: I'm not saying it's something that happens frequently, just that it has happened. When it does happen, it's usually due to previously unknown bugs caused by application programmers, which result in bad data being written to the database.

Multiple redundant databases and data-storage methods prevent these kinds of errors (duplication faults and data rot [actually the effects of data rot]). Besides this, redundant file systems are used. I'm not convinced these errors become a real problem [other than once in 100001 blue moons], apart from configuration errors: errors that should not happen because protocol and independent checks will prevent them. But if you are building your corporate database on the cheap, who knows what can happen...
 
Last edited:
Multiple redundant databases and data-storage methods prevent these kinds of errors. But if you are building your corporate database on the cheap, who knows what can happen...

Redundancy doesn't help at all. As far as the database and all copies of it are concerned, data is being properly written. It's in the right format. It's just wrong data. And it isn't that a corporate database is being built on the cheap either; some databases are extremely complex and expensive. It's the complexity required for some applications that allows bugs to slip through. Saying that should never be able to happen is kind of like saying it should be possible to build a bug-free sophisticated operating system. There are too many programmers working on too many different subsystems that interact with each other in complex ways for it to be possible to be sure nothing can go wrong.
 
Last edited:
Unheard of in our business. An IC mask set is a huge database of polygons and locations (much more than a full DVD of data in some cases); unlike one pixel on a JPEG being a little off, the circuit does not work if there is a single error. There is no interpolation or guessing allowed.



We have collaborative projects on huge chips with the same database accessed, read and rewritten constantly from sites around the globe. It just does not happen.

I understood we were discussing home/private-level data storage and backup solutions.

It would be truly foolish for a big corporation to use single-disk storage for mission-critical duties. I am confident that a big corporation like Analog Devices has proper storage back ends in use for any use case where an unplanned outage or data corruption is simply not an option.

Enterprise-level storage systems do their utmost to guarantee data integrity. At the very least, data is saved to a multi-disk RAID array with distributed parity. Such a system does scheduled data-integrity checks at certain intervals. Should there be a checksum error, the data gets reconstructed, the error is logged and the disk gets marked suspect; most likely that disk will be removed from the RAID array and reconstruction onto a hot online spare disk starts. All that happens automatically, and the operator gets notified of the disk fault. Serious systems have redundancy on multiple levels. Such systems use ECC RAM, redundant power supplies with UPSs and generators, and the data is copied to multiple locations.

Most likely an enterprise-level database system has internal data checksumming as well, not even relying on what the storage subsystem and operating system give it.

It would indeed be a day to fill in lottery coupons when an enterprise storage system corrupts your data. But for that to happen on a single-disk system is not unheard of. It should not happen (a hard drive is supposed not to corrupt even a single bit), yet it can happen.
 
First of all, a filesystem is a database! It has metadata and indexes and blocks of data. Saying the directories and other metadata are stored in a different part of the disk is just wrong; inodes reside in disk blocks and are distributed over the disk. Now, many large databases forgo a regular filesystem and write data directly to raw disk partitions.

Where I work we have (tens of) thousands of databases on multiple platforms hosted in data centers all over the world. They hold sensitive financial data, and are read and written constantly. If the data went bad and that change went undetected it would cost billions of dollars.

Of course an application can write bad data, but that is different. A sufficiently bad app could result in a corrupted database, but most db tables have constraints that prevent things like duplicate keys or incorrect data types; we're talking more about a Bobby Tables type of situation. When databases get slow or wonky, usually a dump/restore is sufficient to fix it.

As for bit rot, the most "likely" candidate files would be executable files, which are written once and then only read. Try flipping a bit in your assembler opcodes and watch it go off into the weeds until it hits a null-pointer dereference.
 
An IC mask set is a huge database of polygons and locations (much more than a full DVD of data in some cases); unlike one pixel on a JPEG being a little off, the circuit does not work if there is a single error.

...over a certain limit (read: size). In our foundry spec, a 10:1 projection reticle defect under 1/3 the minimum feature size (that is, slightly under 1 micrometer) is acceptable, as it won't create a catastrophic repetitive defect on the wafers, and the overall yield impact is negligible. Others may have tougher constraints, but the ground rule is the same: defects have to be over a certain limit to have an impact on the circuits.

Considering the DeepUV light wavelength as the ultimate noise source in the projection process, such defects could be considered almost buried in the noise floor. But I would still consider the mask/reticle data as allowing room for errors. Unlike for a stream of binary data entering a Turing machine, which HAS to be bit perfect if you want to predict the halting.
 