= Block driver correctness testing with blkverify =

== Introduction ==

This document describes how to use the blkverify protocol to test that a block
driver is operating correctly.

It is difficult to test and debug block drivers against real guests.  Often
processes inside the guest will crash because corrupt sectors were read as part
of the executable.  Other times obscure errors are raised by a program inside
the guest.  These issues are extremely hard to trace back to bugs in the block
driver.

Blkverify solves this problem by catching data corruption inside QEMU the first
time bad data is read and reporting the disk sector that is corrupted.
						
== How it works ==

The blkverify protocol has two child block devices, the "test" device and the
"raw" device.  Read/write operations are mirrored to both devices so their
state should always be in sync.

The "raw" device is a raw image, a flat file, that has identical starting
contents to the "test" image.  The idea is that the "raw" device will handle
read/write operations correctly and not corrupt data.  It can be used as a
reference for comparison against the "test" device.

After a mirrored read operation completes, blkverify will compare the data and
raise an error if it is not identical.  This makes it possible to catch the
first instance where corrupt data is read.
						
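The comparison step described above can be imitated outside QEMU with ordinary
tools.  The following is a minimal sketch, not blkverify's implementation: two
flat files stand in for the "raw" and "test" devices, cmp stands in for the
read comparison, and the helper name first_mismatch_sector is made up for this
illustration.  It assumes 512-byte sectors and files of equal size.

```shell
#!/bin/sh
# Illustration only: NOT how blkverify is implemented inside QEMU.
# Assumes 512-byte sectors and two files of the same length.
SECTOR_SIZE=512

# first_mismatch_sector RAW TEST  (hypothetical helper name)
first_mismatch_sector() {
    # "cmp -l" lists every differing byte as "<1-based offset> <octal> <octal>";
    # only the first such line is needed.
    byte=$(cmp -l "$1" "$2" 2>/dev/null | head -n 1 | awk '{print $1}')
    if [ -z "$byte" ]; then
        echo "in sync"
    else
        # Convert the 1-based byte offset into a 0-based sector number.
        echo "contents mismatch in sector $(( (byte - 1) / SECTOR_SIZE ))"
    fi
}
```

For example, `first_mismatch_sector raw.img test.img` prints either "in sync"
or the first sector whose contents differ.  Unlike blkverify, this checks the
files after the fact rather than at the moment bad data is first read.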
					
						
== Example ==

Imagine raw.img has 0xcd repeated throughout its first sector:

    $ ./qemu-io -c 'read -v 0 512' raw.img
    00000000:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    00000010:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    [...]
    000001e0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    000001f0:  cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec)

But test.img is corrupt; its first sector is zeroed when it shouldn't be:

    $ ./qemu-io -c 'read -v 0 512' test.img
    00000000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [...]
    000001e0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    000001f0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    read 512/512 bytes at offset 0
    512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec)

This error is caught by blkverify, which takes the "raw" image first and the
"test" image second:

    $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0
					
						
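To try the scenario above without a buggy block driver, the two images can be
faked with standard tools.  This is only a sketch under stated assumptions: the
helper name make_example_images is hypothetical, the images are a single
512-byte sector each, and test.img is a plain zeroed file rather than the
output of a real driver under test.

```shell
#!/bin/sh
# make_example_images DIR  (hypothetical helper, illustration only)
make_example_images() {
    dir=$1
    # raw.img: one 512-byte sector filled with 0xcd (octal 315).
    dd if=/dev/zero bs=512 count=1 2>/dev/null \
        | LC_ALL=C tr '\000' '\315' > "$dir/raw.img"
    # test.img: same size, but zeroed to simulate the corruption.
    dd if=/dev/zero of="$dir/test.img" bs=512 count=1 2>/dev/null
}
```

After `make_example_images .`, the blkverify read shown in the example should
report a mismatch in sector 0.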
					
						
A more realistic scenario is verifying the installation of a guest OS:

    $ ./qemu-img create raw.img 16G
    $ ./qemu-img create -f qcow2 test.qcow2 16G
    $ ./qemu-system-x86_64 -cdrom debian.iso \
          -drive file=blkverify:raw.img:test.qcow2

If the installation is aborted when blkverify detects corruption, use qemu-io
to explore the contents of the disk image at the sector in question.
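
As a sketch of that last step, the sector number from the blkverify message
can be turned into a byte offset for qemu-io's read command.  The helper name
dump_sector_cmds is hypothetical, and the 512-byte sector size is an assumption
taken from the example above.  The helper only prints the commands so they can
be reviewed before being run.

```shell
#!/bin/sh
# Assumption: blkverify reports 512-byte sectors, as in the example above.
SECTOR_SIZE=512

# dump_sector_cmds SECTOR RAW_IMAGE TEST_IMAGE  (hypothetical helper)
# Prints the qemu-io commands that dump one sector from each image.
dump_sector_cmds() {
    sector=$1
    raw_image=$2
    test_image=$3
    # qemu-io's read command takes a byte offset and a byte count.
    offset=$(( sector * SECTOR_SIZE ))
    echo "./qemu-io -c 'read -v $offset $SECTOR_SIZE' $raw_image"
    echo "./qemu-io -c 'read -v $offset $SECTOR_SIZE' $test_image"
}
```

For instance, `dump_sector_cmds 8 raw.img test.qcow2` prints two read commands
at byte offset 4096, one per image, whose hex dumps can then be compared by
eye.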