Wednesday, December 7, 2011

Troubleshooting 0x0000007A KERNEL_DATA_INPAGE_ERROR

The Debugging Tools for Windows are required to analyze crash dump files. If you do not have the Debugging Tools for Windows installed or dump files are not being generated on system crash, see this post for installation/configuration instructions:
http://mikemstech.blogspot.com/2011/11/windows-crash-dump-analysis.html

0x7A KERNEL_DATA_INPAGE_ERROR belongs to a class of errors that are generally considered hardware errors. These errors are rarely caused by software or drivers and typically indicate that a component is failing or has already failed.

The KERNEL_DATA_INPAGE_ERROR bug check occurs when the system cannot read a memory page from disk back into physical memory. The error typically indicates a problem in the RAM, hard drive(s), or storage controller(s). Since it is typically a hardware problem, the value of the information reported by the debugger is usually limited to identifying corruption in the system files in memory and on disk. The following two debugging examples show how to use the !chkimg debugger command to identify corruption in system drivers.

To start debugging, open the crash dump according to the instructions in the post referenced above and execute the !analyze -v debugger command. If your version of the Debugging Tools for Windows is new enough, it may automatically execute the !chkimg extension on the modules referenced in the stack trace and report any system corruption evident from the memory dump.

3: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KERNEL_DATA_INPAGE_ERROR (7a)
The requested page of kernel data could not be read in.  Typically caused by
a bad block in the paging file or disk controller error. Also see
KERNEL_STACK_INPAGE_ERROR.
If the error status is 0xC000000E, 0xC000009C, 0xC000009D or 0xC0000185,
it means the disk subsystem has experienced a failure.
If the error status is 0xC000009A, then it means the request failed because
a filesystem failed to make forward progress.
Arguments:
Arg1: fffff6fc40008070, lock type that was held (value 1,2,3, or PTE address)
Arg2: ffffffffc000000e, error status (normally i/o status code)
Arg3: 00000000a18d7860, current process (virtual address for lock type 3, or PTE)
Arg4: fffff8800100e000, virtual address that could not be in-paged (or PTE contents if arg1 is a PTE address)

Debugging Details:
------------------


ERROR_CODE: (NTSTATUS) 0xc000000e - A device which does not exist was specified.

DISK_HARDWARE_ERROR: There was error with disk hardware

BUGCHECK_STR:  0x7a_c000000e

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  CODE_CORRUPTION

PROCESS_NAME:  System

CURRENT_IRQL:  0

TRAP_FRAME:  fffff880062e18b0 -- (.trap 0xfffff880062e18b0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffff8a018d232e0 rbx=0000000000000000 rcx=fffff880062e1a18
rdx=0000000000000001 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8800100e000 rsp=fffff880062e1a40 rbp=0000000000000002
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=fffff880062e19a8 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei ng nz ac pe nc
partmgr!PmQueryDepends+0x140:
fffff880`0100e000 0000            add     byte ptr [rax],al ds:35b8:fffff8a0`18d232e0=??
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff80003137b52 to fffff800030c6c40

STACK_TEXT:  
fffff880`062e1598 fffff800`03137b52 : 00000000`0000007a fffff6fc`40008070 ffffffff`c000000e 00000000`a18d7860 : nt!KeBugCheckEx
fffff880`062e15a0 fffff800`030ee6cf : fffffa80`06305a90 fffff880`062e1710 fffff800`032fc500 fffffa80`06305a90 : nt! ?? ::FNODOBFM::`string'+0x37bba
fffff880`062e1680 fffff800`030d4f59 : 00000000`00000000 00000000`00000008 ffffffff`ffffffff fffff8a0`12679c00 : nt!MiIssueHardFault+0x28b
fffff880`062e1750 fffff800`030c4d6e : 00000000`00000008 fffff880`0100e000 00000000`00000000 00000000`00000001 : nt!MmAccessFault+0x1399
fffff880`062e18b0 fffff880`0100e000 : fffffa80`04ab34a0 fffff880`062e1ae0 fffff880`062e1ae0 fffff8a0`0b7ce400 : nt!KiPageFault+0x16e
fffff880`062e1a40 fffff880`01001a51 : fffffa80`04d40ce0 00000000`00000000 fffffa80`00000000 fffff8a0`0b7ce400 : partmgr!PmQueryDepends+0x140
fffff880`062e1ab0 fffff800`033bd9e3 : fffffa80`04d40b90 fffff800`03267260 fffffa80`0458c690 fffffa80`0458c690 : partmgr! ?? ::FNODOBFM::`string'+0x47e
fffff880`062e1b40 fffff800`030d1001 : fffffa80`03d40700 fffff800`033bd901 fffff800`032c8800 00000000`00000003 : nt!IopProcessWorkItem+0x23
fffff880`062e1b70 fffff800`03361fee : 00000000`00000000 fffffa80`0458c690 00000000`00000080 fffffa80`03cb9040 : nt!ExpWorkerThread+0x111
fffff880`062e1c00 fffff800`030b85e6 : fffff880`009ea180 fffffa80`0458c690 fffffa80`03ce1b60 00000000`00000246 : nt!PspSystemThreadStartup+0x5a
fffff880`062e1c40 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16


STACK_COMMAND:  kb

CHKIMG_EXTENSION: !chkimg -lo 50 -d !partmgr
    fffff8800100e000-fffff8800100e011  18 bytes - partmgr!PmQueryDepends+140
 [ 89 5c 24 78 4c 3b ee 0f:00 00 00 00 00 00 00 00 ]
    fffff8800100e015-fffff8800100e019  5 bytes - partmgr!PmQueryDepends+155 (+0x15)
 [ 44 8b b4 24 80:00 00 00 00 00 ]
    fffff8800100e01d-fffff8800100e025  9 bytes - partmgr!PmQueryDepends+15d (+0x08)
 [ 44 89 27 45 85 f6 0f 88:00 00 00 00 00 00 00 00 ]
    fffff8800100e029-fffff8800100e02c  4 bytes - partmgr!PmQueryDepends+169 (+0x0c)
 [ 8d 04 dd 08:00 00 00 00 ]
    fffff8800100e030-fffff8800100e031  2 bytes - partmgr!PmQueryDepends+170 (+0x07)
 [ b9 01:00 00 ]
    fffff8800100e035-fffff8800100e04b  23 bytes - partmgr!PmQueryDepends+175 (+0x05)
 [ 41 b8 50 6d 52 52 48 63:00 00 00 00 00 00 00 00 ]
    fffff8800100e04e-fffff8800100e053  6 bytes - partmgr!PmQueryDepends+18e (+0x19)
 [ 49 3b c6 0f 84 9c:00 00 00 00 00 00 ]
    fffff8800100e057-fffff8800100e07e  40 bytes - partmgr!PmQueryDepends+197 (+0x09)
 [ 44 89 30 45 8b e6 44 39:00 00 00 00 00 00 00 00 ]
    fffff8800100e081-fffff8800100e08e  14 bytes - partmgr!PmQueryDepends+1c1 (+0x2a)
 [ 45 8b ce 4d 8b c6 44 39:00 00 00 00 00 00 00 00 ]
    fffff8800100e090-fffff8800100e0aa  27 bytes - partmgr!PmQueryDepends+1d0 (+0x0f)
 [ 48 8d 51 08 4c 39 12 74:00 00 00 00 00 00 00 00 ]
    fffff8800100e0ac-fffff8800100e0b5  10 bytes - partmgr!PmQueryDepends+1ec (+0x1c)
 [ 4a 89 44 c1 08 49 8b 87:00 00 00 00 00 00 00 00 ]
    fffff8800100e0b8 - partmgr!PmQueryDepends+1f8 (+0x0c)
 [ ff:00 ]
    fffff8800100e0ba-fffff8800100e183  202 bytes - partmgr!PmQueryDepends+1fa (+0x02)
 [ 41 ff c5 48 83 c5 08 44:00 00 00 00 00 00 00 00 ]
    fffff8800100e185-fffff8800100e1a4  32 bytes - partmgr!PmTakeDisk+15 (+0xcb)
 [ 48 8b 05 e4 9f ff ff 48:00 00 00 00 00 00 00 00 ]
    fffff8800100e1a7-fffff8800100e1a8  2 bytes - partmgr!PmTakeDisk+37 (+0x22)
 [ ba bb:00 00 ]
    fffff8800100e1ab-fffff8800100e1bb  17 bytes - partmgr!PmTakeDisk+3b (+0x04)
 [ c0 48 89 44 24 38 ff 15:00 00 00 00 00 00 00 00 ]
    fffff8800100e1be-fffff8800100e1c5  8 bytes - partmgr!PmTakeDisk+4e (+0x13)
 [ 41 fe 4b 43 49 83 83 b8:00 00 00 00 00 00 00 00 ]
    fffff8800100e1c9-fffff8800100e1ce  6 bytes - partmgr!PmTakeDisk+59 (+0x0b)
 [ b8 48 8b 96 a8 01:00 00 00 00 00 00 ]
    fffff8800100e1d1-fffff8800100e1d8  8 bytes - partmgr!PmTakeDisk+61 (+0x08)
 [ 48 8b 4f 28 48 8b 82 b8:00 00 00 00 00 00 00 00 ]
    fffff8800100e1dc-fffff8800100e1e1  6 bytes - partmgr!PmTakeDisk+6c (+0x0b)
 [ 4c 8d 44 24 38 c6:00 00 00 00 00 00 ]
    fffff8800100e1e3-fffff8800100e1ea  8 bytes - partmgr!PmTakeDisk+73 (+0x07)
 [ 0f 4c 89 42 18 83 60 08:00 00 00 00 00 00 00 00 ]
    fffff8800100e1ec-fffff8800100e1ef  4 bytes - partmgr!PmTakeDisk+7c (+0x09)
 [ c7 40 10 08:00 00 00 00 ]
    fffff8800100e1f3-fffff8800100e1f6  4 bytes - partmgr!PmTakeDisk+83 (+0x07)
 [ c7 40 18 08:00 00 00 00 ]
    fffff8800100e1f8 - partmgr!PmTakeDisk+88 (+0x05)
 [ 76:00 ]
    fffff8800100e1fa-fffff8800100e233  58 bytes - partmgr!PmTakeDisk+8a (+0x02)
 [ ff 15 18 8e ff ff 48 8b:00 00 00 00 00 00 00 00 ]
    fffff8800100e235-fffff8800100e247  19 bytes - partmgr!PmTakePartition+15 (+0x3b)
 [ 33 c0 48 8b da 48 8b f9:00 00 00 00 00 00 00 00 ]
    fffff8800100e24b-fffff8800100e25b  17 bytes - partmgr!PmTakePartition+2b (+0x16)
 [ 48 8b 42 10 49 89 43 e8:00 00 00 00 00 00 00 00 ]
    fffff8800100e25d-fffff8800100e26c  16 bytes - partmgr!PmTakePartition+3d (+0x12)
 [ eb 09 48 8b 41 20 48 89:00 00 00 00 00 00 00 00 ]
    fffff8800100e26f-fffff8800100e270  2 bytes - partmgr!PmTakePartition+4f (+0x12)
 [ ba bb:00 00 ]
    fffff8800100e273-fffff8800100e27e  12 bytes - partmgr!PmTakePartition+53 (+0x04)
 [ c0 ff 15 ae 8f ff ff 4c:00 00 00 00 00 00 00 00 ]
    fffff8800100e281-fffff8800100e289  9 bytes - partmgr!PmTakePartition+61 (+0x0e)
 [ 4c 8d 44 24 20 49 83 83:00 00 00 00 00 00 00 00 ]
    fffff8800100e28d-fffff8800100e291  5 bytes - partmgr!PmTakePartition+6d (+0x0c)
 [ b8 83 cd ff 41:00 00 00 00 00 ]
    fffff8800100e293-fffff8800100e29d  11 bytes - partmgr!PmTakePartition+73 (+0x06)
 [ 6b 43 48 8b 43 30 48 8b:00 00 00 00 00 00 00 00 ]
    fffff8800100e2a0-fffff8800100e2a7  8 bytes - partmgr!PmTakePartition+80 (+0x0d)
 [ 48 8b 48 28 48 8b 82 b8:00 00 00 00 00 00 00 00 ]
    fffff8800100e2ab - partmgr!PmTakePartition+8b (+0x0b)
 [ c6:00 ]
    fffff8800100e2ad-fffff8800100e2b4  8 bytes - partmgr!PmTakePartition+8d (+0x02)
 [ 0f 4c 89 42 18 83 60 08:00 00 00 00 00 00 00 00 ]
    fffff8800100e2b6-fffff8800100e2b9  4 bytes - partmgr!PmTakePartition+96 (+0x09)
 [ c7 40 10 10:00 00 00 00 ]
    fffff8800100e2bd-fffff8800100e2c0  4 bytes - partmgr!PmTakePartition+9d (+0x07)
 [ c7 40 18 04:00 00 00 00 ]
    fffff8800100e2c2 - partmgr!PmTakePartition+a2 (+0x05)
 [ 76:00 ]
    fffff8800100e2c4-fffff8800100e2d5  18 bytes - partmgr!PmTakePartition+a4 (+0x02)
 [ ff 15 4e 8d ff ff 4c 8b:00 00 00 00 00 00 00 00 ]
    fffff8800100e2d7-fffff8800100e2d9  3 bytes - partmgr!PmTakePartition+b7 (+0x13)
 [ 83 7b 2c:00 00 00 ]
    fffff8800100e2db-fffff8800100e2e0  6 bytes - partmgr!PmTakePartition+bb (+0x04)
 [ 74 13 83 87 0c 01:00 00 00 00 00 00 ]
    fffff8800100e2e3-fffff8800100e2ec  10 bytes - partmgr!PmTakePartition+c3 (+0x08)
 [ ff 8d 45 01 0f 44 c5 83:00 00 00 00 00 00 00 00 ]
    fffff8800100e2ee-fffff8800100e2fa  13 bytes - partmgr!PmTakePartition+ce (+0x0b)
 [ eb 05 b8 fe ff ff ff f0:00 00 00 00 00 00 00 00 ]
    fffff8800100e2fd-fffff8800100e343  71 bytes - partmgr!PmTakePartition+dd (+0x0f)
 [ f6 43 28 04 74 08 48 8b:00 00 00 00 00 00 00 00 ]
    fffff8800100e347-fffff8800100e355  15 bytes - partmgr!PmIsRedundantPath+17 (+0x4a)
 [ 48 8b 05 b2 9d ff ff 48:00 00 00 00 00 00 00 00 ]
    fffff8800100e359-fffff8800100e35c  4 bytes - partmgr!PmIsRedundantPath+29 (+0x12)
 [ 83 64 24 70:00 00 00 00 ]
    fffff8800100e35e-fffff8800100e37a  29 bytes - partmgr!PmIsRedundantPath+2e (+0x05)
 [ 48 8b ea 33 d2 4d 8b f0:00 00 00 00 00 00 00 00 ]
    fffff8800100e37c-fffff8800100e398  29 bytes - partmgr!PmIsRedundantPath+4c (+0x1e)
 [ 33 c0 f3 0f 6f 05 4a 8f:00 00 00 00 00 00 00 00 ]
    fffff8800100e39c-fffff8800100e3b1  22 bytes - partmgr!PmIsRedundantPath+6c (+0x20)
 [ 89 44 24 44 48 89 44 24:00 00 00 00 00 00 00 00 ]
WARNING: !chkimg output was truncated to 50 lines. Invoke !chkimg without '-lo [num_lines]' to view  entire output.
3809 errors : !partmgr (fffff8800100e000-fffff8800100efff)

MODULE_NAME: memory_corruption

IMAGE_NAME:  memory_corruption

FOLLOWUP_NAME:  memory_corruption

DEBUG_FLR_IMAGE_TIMESTAMP:  0

MEMORY_CORRUPTOR:  LARGE_4096

FAILURE_BUCKET_ID:  X64_MEMORY_CORRUPTION_LARGE_4096

BUCKET_ID:  X64_MEMORY_CORRUPTION_LARGE_4096

Followup: memory_corruption
---------

From the basic description presented by the debugger, there are four main status codes that indicate a disk subsystem failure:

# for hex 0xc000000e / decimal -1073741810 :
  STATUS_NO_SUCH_DEVICE                                         ntstatus.h
# A device which does not exist was specified.
# 1 matches found for "c000000e"

# for hex 0xc000009c / decimal -1073741668 :
  STATUS_DEVICE_DATA_ERROR                                      ntstatus.h
# 1 matches found for "c000009c"

# for hex 0xc000009d / decimal -1073741667 :
  STATUS_DEVICE_NOT_CONNECTED                                   ntstatus.h
# 1 matches found for "c000009d"

# for hex 0xc0000185 / decimal -1073741435 :
  STATUS_IO_DEVICE_ERROR                                        ntstatus.h
# The I/O device reported an I/O error.
# 1 matches found for "c0000185" 
 
There is also a status code indicating that a file system failed to make forward progress, which can point to serious file system corruption:

# for hex 0xc000009a / decimal -1073741670 :
  STATUS_INSUFFICIENT_RESOURCES                                 ntstatus.h
# Insufficient system resources exist to complete the API.
# 1 matches found for "c000009a" 
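
The status codes listed above can be collected into a small lookup table. The Python sketch below is illustrative only: the function names and category labels are assumptions made for this post and are not part of the Debugging Tools for Windows. It masks Arg2 to 32 bits because the debugger prints the status sign-extended to 64 bits (ffffffffc000000e), and it reproduces the signed decimal form shown in the lookups above.

```python
# Illustrative helper, not part of the Debugging Tools for Windows.
# Function names and category labels are hypothetical; the codes and
# names come from the !analyze -v text and the lookups above.

DISK_SUBSYSTEM_FAILURE = {
    0xC000000E: "STATUS_NO_SUCH_DEVICE",
    0xC000009C: "STATUS_DEVICE_DATA_ERROR",
    0xC000009D: "STATUS_DEVICE_NOT_CONNECTED",
    0xC0000185: "STATUS_IO_DEVICE_ERROR",
}

NO_FORWARD_PROGRESS = {
    0xC000009A: "STATUS_INSUFFICIENT_RESOURCES",
}


def classify_inpage_status(arg2):
    """Return (status name, category) for bug check 0x7A Arg2."""
    status = arg2 & 0xFFFFFFFF  # drop the 64-bit sign extension
    if status in DISK_SUBSYSTEM_FAILURE:
        return DISK_SUBSYSTEM_FAILURE[status], "disk subsystem failure"
    if status in NO_FORWARD_PROGRESS:
        return (NO_FORWARD_PROGRESS[status],
                "file system failed to make forward progress")
    return "0x%08X" % status, "not listed in the bug check description"


def as_signed_decimal(status):
    """Interpret a 32-bit NTSTATUS value as a signed decimal integer."""
    status &= 0xFFFFFFFF
    return status - 0x100000000 if status & 0x80000000 else status


if __name__ == "__main__":
    # Arg2 from the dump in this post.
    print(classify_inpage_status(0xFFFFFFFFC000000E))
    print(as_signed_decimal(0xC000000E))
```

Any status outside these five codes falls through to the last branch, so the sketch does not guess at a cause it has no basis for.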
 

From this dump, system corruption is evident in the fact that the partition manager driver (part of the I/O stack) has more than 3800 errors:

3: kd> !chkimg !partmgr
3809 errors : !partmgr (fffff8800100e000-fffff8800100efff)
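
The summary line can also be read arithmetically. The reported range spans exactly one 4 KB page starting at the address given in Arg4 (fffff8800100e000), which is consistent with the MEMORY_CORRUPTOR: LARGE_4096 value reported by the debugger. The Python sketch below checks this; the function name and the regular expression are assumptions written for this one summary-line format and are not an official parser.

```python
import re

# Hypothetical parser for the one-line !chkimg summary shown above.
SUMMARY_RE = re.compile(
    r"^(?P<errors>\d+) errors : !(?P<module>\S+) "
    r"\((?P<start>[0-9a-fA-F]+)-(?P<end>[0-9a-fA-F]+)\)$"
)

PAGE_SIZE = 0x1000  # 4 KB page


def parse_chkimg_summary(line):
    """Extract the module, error count, and address range from the summary."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized !chkimg summary line: %r" % line)
    start = int(match.group("start"), 16)
    end = int(match.group("end"), 16)
    return {
        "module": match.group("module"),
        "errors": int(match.group("errors")),
        "start": start,
        "span_bytes": end - start + 1,  # the range is inclusive
    }


if __name__ == "__main__":
    summary = parse_chkimg_summary(
        "3809 errors : !partmgr (fffff8800100e000-fffff8800100efff)"
    )
    print(summary["span_bytes"] == PAGE_SIZE)       # range covers one page
    print(summary["start"] % PAGE_SIZE == 0)        # range is page aligned
    print(summary["start"] == 0xFFFFF8800100E000)   # same address as Arg4
```

If each reported error corresponds to one mismatched byte, 3809 of the 4096 bytes in that page differ from the on-disk image, which fits a page that was never read in rather than scattered bit flips.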
 
We know from this dump that the drive is likely failing, but for a moment let's pretend that the error returned is 0xc000009a (STATUS_INSUFFICIENT_RESOURCES). This error indicates a failure to read ahead in the file system. Depending on the extent of the corruption, some recovery may be possible using an offline chkdsk.

Going back to reality, this error is really 0xc000000e (STATUS_NO_SUCH_DEVICE). This indicates that the drive is failing badly enough that the system has lost the ability to communicate with the drive during the IO operation. At this point a few troubleshooting steps are possible:


If these steps are unsuccessful, then the drive is likely unrecoverable without the support of a trained data recovery specialist. This operation can become fairly difficult if a failed RAID 0 (stripe, no parity) or RAID 5 (stripe with parity) array needs to be recovered. Professional data recovery services can often recover these types of RAID arrays after failure, but it is often better to ensure that good data redundancy and backup practices are in place.

See Also:
Windows Crash Dump Analysis
Troubleshooting Memory Errors
How To Detect a Failing Hard Drive

