The Hunt For Blue Screen

February 11, 2008 – 2:48 pm

This story began about 15 months ago, in November 2006. That was when Microsoft was getting very close to releasing Windows Vista, and it was time for me to get serious about making sure my applications were compatible with it.

At that time I was using two computers for development and testing: one with two single-core Intel processors, which I used to test the 32-bit versions of my programs, and another with one single-core AMD x64 processor, which I used to test the 64-bit editions. Since many people reported that Vista was more hardware hungry than XP, I thought it was a good occasion for me to also get a more powerful computer that would run Vista reasonably well. So I bought a new Core 2 Duo (dual core) processor, a motherboard to support it (P5L-MX from ASUS), and a new video card to support the Aero user interface of Vista, put them together in a spare computer case I had, loaded up Vista Release Candidate on it, and started working on porting my applications to Vista.

It all went well for a while, except every couple of days or so my new powerful computer would all of a sudden “blue screen” and reboot.

After it happened a few times, I fired up WinDbg and loaded the few latest minidumps into it. They indicated that the crashes were happening in the FASTFAT.SYS driver, and the bug check they had in common was IRQL_NOT_LESS_OR_EQUAL, which typically points to a sloppily written device driver. It seemed like a bug in the FASTFAT.SYS driver shipped with the pre-release version of Vista. I decided there was not much I could do but hope that the bug would be fixed in the final (RTM) Vista release.
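With more than a handful of minidumps, it can help to tally the faulting module across all of them. Below is a minimal Python sketch, assuming the `!analyze -v` output for each dump has been saved as text and contains an `IMAGE_NAME:` line. The function name and the sample strings are made up for illustration; they are not the actual dumps from this story.

```python
import re
from collections import Counter

# Assumption: each saved analysis contains a line such as "IMAGE_NAME:  fastfat.sys".
IMAGE_RE = re.compile(r"^IMAGE_NAME:\s*(\S+)", re.MULTILINE)


def tally_faulting_images(analysis_texts):
    """Count how often each driver image is named as the faulting module."""
    counts = Counter()
    for text in analysis_texts:
        match = IMAGE_RE.search(text)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical excerpts standing in for three saved analyses.
    samples = [
        "BUGCHECK: IRQL_NOT_LESS_OR_EQUAL\nIMAGE_NAME:  fastfat.sys\n",
        "BUGCHECK: IRQL_NOT_LESS_OR_EQUAL\nIMAGE_NAME:  FASTFAT.SYS\n",
        "BUGCHECK: SOMETHING_ELSE\nIMAGE_NAME:  other.sys\n",
    ]
    for image, count in tally_faulting_images(samples).most_common():
        print(image, count)
```

If one driver dominates the tally, as FASTFAT.SYS did here, the next question is which volumes that driver is actually serving.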

A couple of months later the RTM release of Vista became available, so I reformatted the hard drive to get rid of the release candidate of Vista, installed a fresh copy of Vista RTM on it, and started using it.

In a day or two the same crashes started to happen again.

Figuring Microsoft would not release a new version of Vista with a buggy version of such an important driver as FASTFAT.SYS, I started looking for another reason. What made it difficult was that the blue screens did not appear very often: sometimes a week would go by and I would start to hope I had finally found the reason, but inevitably it would crash again, no matter what I tried. And I tried plenty:

I vacuumed the inside of the case and reseated the processor and the RAM modules.

The blue screens kept happening.

I ran the memtest program for a few hours to check the RAM for errors; it did not find any problems with the RAM.

The blue screens kept happening.

I installed the SpeedFan program to monitor the temperature of the hardware components. Although it did not show any overheating, I added another fan to the case.

The blue screens kept happening.

I replaced the video card with another one.

The blue screens kept happening.

I bought a new SATA hard drive (previously I was using an old IDE drive), and moved the Vista installation to it.

The blue screens kept happening.

I thought that maybe I got a faulty motherboard, so I bought a new one, this time a P5LD2, again from ASUS. I also picked up another Core 2 Duo processor and a new set of RAM modules to go with it. I reinstalled Vista RTM from scratch, set up my development environment, and started working as usual.

The blue screens kept happening.

As you can see, at that point I already had two computers that were giving me blue screens every couple of days or so. I ran out of new theories about the reason for the crashes and returned to the one I started with: the bug was probably in the FASTFAT driver of Vista after all, and maybe I should have waited until Vista SP1 was out before switching to Vista as my main development platform. I started thinking about switching back to Windows XP. It so happened that at that time (in September 2007) I was locked out of both of my Vista computers by the buggy Genuine Advantage code of Windows Vista (I plan to share that experience in a separate post later on, stay tuned). That made the decision to switch back to XP real easy.

I had been using Windows XP for several years and never had a problem like that before, so imagine my surprise when, after I reinstalled Windows XP on each of my new computers, the blue screens started to happen almost from day one. As before, they were occurring in the FASTFAT.SYS driver. That made it clear to me that I had been blaming Vista in vain: it did not introduce a new bug, or at least, if the bug was there, it was not Vista-specific.

I started analyzing the similarities between the two new computers, hoping that would give me a clue. They had different motherboards (although from the same manufacturer), slightly different processors, different RAM modules, different video cards, different hard drives (one was using a WD SATA drive, another one a Maxtor IDE drive). I came up with the idea that maybe I got very unlucky and I got two faulty motherboards. Luckily, at that time the built-in network adapter on one of the motherboards died, and I took this opportunity to RMA the motherboard back to ASUS. I got the replacement back in a few days, and installed it.

The blue screens kept happening.

Thinking that getting three faulty motherboards in a row was very unlikely, I started to try other things. Even though my two old computers were plugged into the same UPS device as the new ones and were working just fine, I thought maybe the new computers were more sensitive to the quality of the power they were getting.

I replaced a cheapo generic power supply in one of the new computers with a considerably more expensive and supposedly better one from Antec.

The blue screens kept happening.

I bought a new, more powerful UPS, specifically for use by the new computers.

The blue screens kept happening.

Out of desperation, I started all over and repeated every troubleshooting step I did before, with each of the crashing systems: reseated the modules, replaced the cables, ran the memtest.

The blue screens kept happening.

At that point, about a month ago, I ran out of ideas. I was ready to surrender and just live with it, or maybe throw out both of the new computers I had built and buy a completely new one. I was seriously contemplating that when, on January 15, it hit me: what was that FASTFAT.SYS driver doing there anyway? All of my hard drives had been formatted with the NTFS file system, and I didn’t remember formatting a drive with the FAT or FAT32 system recently. Why would Windows load the FASTFAT driver?

I reviewed the properties of the drives listed in My Computer, and sure enough, there was one of them formatted with the FAT32 system. It was a virtual encrypted drive I created a while back with the TrueCrypt software. I used the drive as a backup place for sensitive files of mine. Periodically, I would burn the image to a DVD-R disc, to make a backup of it. And yes, there was a copy of this image on each of the new computers experiencing the crashes.

I reformatted the encrypted volumes with the NTFS file system.

The blue screens stopped.

It’s been almost a month since I made the last change, and I have not had a single blue screen. Previously, they were happening every couple of days. I’m very confident now that I’ve found the culprit that caused me so much grief. I believe the following list describes the common conditions for the blue screens to occur:

  1. The computer should have a multi-core processor, such as Intel Core 2 Duo.
  2. The computer should have TrueCrypt 4.3a installed, and there should be an encrypted FAT32 volume mounted.
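To check a machine against these two conditions, something like the following Python sketch could be used. It is only an illustration under stated assumptions: the helper names are hypothetical, the drive enumeration works on Windows only (through the Win32 `GetLogicalDrives` and `GetVolumeInformationW` calls via `ctypes`), and it cannot tell whether a FAT volume is a TrueCrypt volume or an ordinary one, so that part still has to be checked by hand.

```python
import ctypes
import os
import string
import sys


def fat_volumes(volumes):
    """Return the roots of volumes whose file system is FAT or FAT32 (not exFAT)."""
    return [root for root, fs in volumes if fs.upper().startswith("FAT")]


def matches_conditions(cpu_count, volumes):
    """True if there is more than one logical CPU and at least one FAT volume."""
    return cpu_count > 1 and bool(fat_volumes(volumes))


def list_windows_volumes():
    """Windows only: return (root, file_system_name) for each mounted drive letter."""
    kernel32 = ctypes.windll.kernel32
    mask = kernel32.GetLogicalDrives()  # bit 0 = A:, bit 1 = B:, ...
    found = []
    for bit, letter in enumerate(string.ascii_uppercase):
        if not mask & (1 << bit):
            continue
        root = letter + ":\\"
        fs_name = ctypes.create_unicode_buffer(261)
        ok = kernel32.GetVolumeInformationW(
            ctypes.c_wchar_p(root), None, 0, None, None, None,
            fs_name, len(fs_name),
        )
        if ok:  # fails for drives without media, which are skipped
            found.append((root, fs_name.value))
    return found


if __name__ == "__main__" and sys.platform == "win32":
    vols = list_windows_volumes()
    print("FAT volumes:", fat_volumes(vols))
    print("Both conditions met:", matches_conditions(os.cpu_count() or 1, vols))
```

The two pure functions are kept separate from the Windows-specific enumeration so that the logic can be tried on any platform with a hand-written volume list.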

Why do I think the first condition is important? Because previously I had been using TrueCrypt with FAT32 virtual drives for several years on computers that had single-core processors, and never experienced such crashes with them. Only when I switched to the Core 2 Duo processors did the crashes start to occur.

I’ve looked through the source code of TrueCrypt 4.3a and noticed that its driver was compiled with the NT_UP switch in its Makefile. This is definitely wrong. It means that the driver was targeted at uniprocessor systems. Since multi-core processors are essentially multiprocessors, defining NT_UP means asking for trouble.
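Why a uniprocessor build is dangerous on two cores can be shown with a toy model. This is plain Python, not driver code, and it rests on an assumption: that a uniprocessor build lets some locking compile down to a form that does not keep a second processor out of a critical section. The sketch models two CPUs incrementing a shared counter, with an explicit `yield` at every point where the other CPU may run; all names are hypothetical.

```python
def increment(counter, lock):
    """One simulated CPU doing a read-modify-write on a shared counter."""
    if lock is not None:
        while lock["held"]:
            yield                    # spin: let the other CPU run
        lock["held"] = True          # check-and-set is atomic in this model
    value = counter["n"]             # read
    yield                            # the other CPU may run between read and write
    counter["n"] = value + 1         # write
    if lock is not None:
        lock["held"] = False
    yield


def run_interleaved(use_lock):
    """Run two simulated CPUs in strict alternation and return the final count."""
    counter = {"n": 0}
    lock = {"held": False} if use_lock else None
    cpus = [increment(counter, lock), increment(counter, lock)]
    while cpus:
        for cpu in list(cpus):
            try:
                next(cpu)
            except StopIteration:
                cpus.remove(cpu)
    return counter["n"]


if __name__ == "__main__":
    print("with lock:   ", run_interleaved(True))
    print("without lock:", run_interleaved(False))
```

With the lock, both increments land; without it, one update is lost because both CPUs read the old value before either writes. On a single processor this interleaving cannot happen in the same way, which would be consistent with the crashes showing up only after the move to multi-core hardware.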

Why did the crashes stop after I reformatted the encrypted drives with the NTFS file system? I don’t know. Apparently the NTFS file system driver is more robust and can tolerate imperfect drivers such as the ones compiled with the NT_UP switch. Why didn’t I get crashes with my old two-processor computer? Again, I don’t know. Maybe the old computer was not fast enough for the error conditions to occur so frequently, and when it did crash once in a blue moon, I just dismissed that as something insignificant and did not pay attention to it.

Now, I noticed that a few days ago a new version of TrueCrypt, 5.0, was released. It uses a new driver build procedure that does not seem to have the NT_UP flag anymore. This is good. However, looking through their support forums, it seems like the new version introduced quite a few new bugs. I guess I’ll postpone upgrading to it until version 5.1 comes out. I want to get some rest from the blue screens for a while :-)

Update: April 15, 2008

A few days ago I decided to try the latest release of TrueCrypt, 5.1a. I reformatted the NTFS encrypted volume back to FAT32 and the blue screens started to occur almost immediately. After two days of bluescreening, I reformatted the volume back to NTFS, and the blue screens stopped. It looks like TrueCrypt 5.1a still causes this problem. HTH.

Update: May 13, 2008

A week ago I started another experiment: I connected a spare hard drive about the same size as the TrueCrypt volume I use, formatted the hard drive with the FAT32 system (just like the TrueCrypt volume that was giving me the blue screens), and copied everything from the encrypted volume to that hard drive. Then I dismounted the TrueCrypt volume and assigned its drive letter to the hard drive I had just attached. I restarted the computer and kept using it as before; the only difference was that instead of a FAT32-formatted encrypted volume I was now using a regular FAT32-formatted unencrypted hard drive. A week passed with no blue screens. Today I copied everything back from the hard drive to the FAT32-formatted TrueCrypt volume and disconnected the extra hard drive. About an hour later a blue screen occurred. I think that proves conclusively that TrueCrypt is the real culprit behind these blue screens. HTH.
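How much weight a quiet week carries can be estimated roughly. The sketch below assumes the crashes arrive at random at a steady average rate (a Poisson process) and reads "every couple of days" as a mean of two days between crashes; both are assumptions made for this estimate, not measurements, and the function name is hypothetical.

```python
import math


def prob_no_crash(days, mean_days_between_crashes):
    """Chance of seeing zero crashes in `days` if crashes follow a Poisson process."""
    return math.exp(-days / mean_days_between_crashes)


if __name__ == "__main__":
    # One crash-free week, given one crash every two days on average.
    print(f"{prob_no_crash(7, 2):.3f}")
```

Under these assumptions, a crash-free week would happen by chance only about 3 percent of the time if the plain FAT32 drive were just as crash-prone as the encrypted one.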

Update: July 16, 2008

A week ago I installed the new version 6.0a of TrueCrypt. One of the new things in it was an updated device driver with improved support for multi-core processors. That gave me hope that this version might have finally fixed this bug. For a week it ran smoothly, with no BSoDs, even though I had switched to using a FAT-formatted encrypted volume. I was thinking about reporting success here, but today: boom, a blue screen with IRQL_NOT_LESS_OR_EQUAL status in fastfat.sys. So I’m switching back to the NTFS volume and reporting for now that version 6.0a of TrueCrypt still has not fixed this problem. HTH.

  1. 32 Responses to “The Hunt For Blue Screen”

  2. Hey thanks for this writeup! I found your site after suspecting that a BSOD I just got was related to TrueCrypt being open. I suppose this confirms it!

    I actually did just install TC 5.0, so we’ll see how that improves things. I tried to use the fat-ntfs convert utility on the mounted drive, but it seems unable to do it! Something about an inconsistency, but chkdsk finds no such issue.

    By Ryan Dlugosz on Feb 17, 2008

  3. Ryan,

    thank you for your comment. I did not use the fat-to-ntfs conversion utility, I just created a fresh ntfs volume and copied files to it, then deleted the old one. Of course, for that to work one must have enough free space on the hard disk, or a spare hard drive.

    HTH

    By AB on Feb 17, 2008

  4. Andrei -

    No prob. Just to follow-up with you on this (perhaps to others who will follow!), using TC v5a did *not* resolve the BSOD with a FAT32 volume. I had another bomb last night while running firefox (I previously did not have my profile data in the TC volume. I started having the BSODs immediately after moving it there…)

    After that I created a new NTFS volume and copied the files over to it. I ran all morning that way and haven’t had a blue screen yet! As a side benefit the new volume uses the faster new style of access in v5.

    FYI I haven’t tried out the “encrypt the whole system disk” feature of v5… Looking at some of the troubles in the forums I may wait a bit on this until they get the bugs out. Looks like a really great feature though.

    Thanks again for posting this helpful info.

    By Ryan Dlugosz on Feb 18, 2008

  5. Awesome! Thanks for this write up. I was about to give up. I started having the BSOD in FASTFAT.SYS on my T60 after upgrading to Vista. I bought a new T61 last week and today I started getting the same BSOD again. Really the same as on my T60. Unbelievable.

    I already had a support ticket open with Microsoft without a resolution. They only pointed to faulty hardware, but their memory test ran fine.

    I’ll rebuild my TrueCrypt volume and let you know.

    By Gunnar Wagenknecht on Feb 20, 2008

  6. Ryan: thanks for the information about TC5a, please keep posting here if you discover anything new.

    Gunnar: glad my post was of help. Please write here if you can confirm my findings (or not).

    Thanks!

    By AB on Feb 20, 2008

  7. Andrei, thanks a lot for the article. The Blue Screen of Death caused by fastfat.sys was driving me nuts on my old build. After the rebuild I had not seen it for a month, and then the fastfat blue screen came back yesterday: 3 BSODs in the last 2 days (and none in the month before). Extensive googling brought me to your article.

    I do use TrueCrypt 5.1a, and after reading your article I thought that you had made a good point. So I tried to use
    CONVERT letter /FS:NTFS (shell command) on my TrueCrypt container, and sure enough it could not be done! The CONVERT command found some discrepancies.

    I was forced to run CHKDSK letter /F first, and only then was I able to convert my TrueCrypt container from FAT32 to NTFS (no hidden partitions). This converter error message makes me think that something was wrong with the FAT32 file system inside the TC container.

    I think you nailed it! Since the only drive on my machine with FAT filesystem was TrueCrypt container - and fastfat.sys is used only for FAT file systems, the connection of this problem to the TrueCrypt container/partition seems very logical (after reading your article)

    If no blue screens appear tomorrow I will know for sure that the problem was indeed in the FAT32 TrueCrypt container.

    I appreciate that you took the time to write this article; it gave me some hope of getting rid of the blue screens of death.

    P.S. I would just add that just running CHKDSK /F on the FAT32 container (without converting it to NTFS as I did) from time to time (or when fastfat.sys BSOD happens a lot) might help too.

    Thanks
    Sergey

    Andrei, thank you very much for the article. I don’t know for sure yet whether the error will recur, but something tells me that your guess about TrueCrypt is very close to the truth. Thanks again.

    By Sergey Reznik on Apr 15, 2008

  8. Sergey,

    thank you for your kind words, I’m glad my article was of some help. If you discover anything new relevant to this problem, please let me know.

    Spasibo :-)

    By AB on Apr 15, 2008

  9. Hi,
    Thanks a lot for this in-depth study.
    I also have BSODs from time to time in FastFat.sys and was suspecting TC.
    I had been using TC for years on my old PC (Athlon 64) on Win XP Pro SP2 and never had a BSOD.
    Recently, I changed my system to a new one with an Intel Core 2 Duo (new motherboard, memory, 8800GT graphics card, and new hard disk) and began to have this BSOD.
    I did not make a fresh installation of Win XP Pro SP2 and only made a clone of my old HD.

    My TC disk (v5.1a) is also formatted in FAT32 and is the only FAT disk in my system, and that is why I also suspected the Core 2 Duo/TC mix.

    I’ll also try to format my TC disk in NTFS and wait…

    Thanks to all here, and I hope this will solve our problems.

    By Michel on Apr 29, 2008

  10. Hi,
    I had been using truecrypt on a number of machines for years without problems. When I upgraded to a Dell Latitude D430 (dual core) I got random BSODs and couldn’t figure out what caused it. After reading this article here I’ve stopped using truecrypt on that machine and don’t get any crashes at all. Seems definitely to be a truecrypt - FAT32 issue.

    I’ve filed a bug on truecrypt.org - hopefully they get this resolved in one of their next versions.

    By Henrik on May 16, 2008

  11. Thank you so much for this blog! I also found it through extensive Google searches after suspecting TC51a was causing the BSOD. I am also using a ThinkPad T60 as one other commenter mentioned. My TC encrypted volume is a single file on a USB flash drive. The drive’s and encrypted volume’s filesystems are FAT32. I did this mainly to prevent the “pork” of NTFS on a limited media, but also to maintain compatibility with Linux since I use the encrypted volume with XP SP2 and Linux. I guess I’ll have to maintain two TC volumes (bah!) or hope the latest reverse-engineered NTFS driver for Linux can read/write without corruption… :o(

    By Jacob on Jun 9, 2008

  12. Today I installed TC51a on 1) a Thinkpad T60 - Centrino Core Duo (32 bit CPU) with 32bit XP Pro and 2) a white box with a Intel D972XBX2 m/b, Q6600 quad core (64 bit CPU), Radeon HD2900XT, with 32 bit XP Pro. Then I made FAT32 encrypted volumes. The T60 hasn’t blue screened yet, but to be safe and reduce the frustration I’ll replace the FAT32 with NTFS. The Q6600 blue screens every time I attempt to mount the FAT32 volume. As soon as I click “Mount”. Before I can put anything in it. Instantly.

    By Andrew on Jun 26, 2008

  13. Do you have Ex2 IFS installed? That turned out to be the problem for me… hope this helps someone.

    By Earth2marsh on Jun 27, 2008

  14. Andrei,
    I’m so glad I came upon this blog ! It gives me hope :)
    I have been going through this for a while, but I am using Cryptainer PE V.7.03 with a FAT file system. I’m going to try changing to NTFS and see if my results concur with the result people have been seeing with Truecrypt.

    I’ll post a follow up when I determine an outcome.
    Thanks again…

    By Steve S on Jul 1, 2008

  15. Thanks for this blog! I’m having the same problem but my PC has a rather “old” Pentium 4HT.
    Seems like hyper-threading is also causing a problem. I use NTFS TC volumes but they are stored on a FAT32-formatted USB stick (Corsair Voyager 8G).
    I will try to reformat the stick to NTFS and then try again. Will post a message with the results.

    By Eggy on Jul 1, 2008

  16. Yep, reformatted the stick with NTFS volume type.
    Used to get a BSOD as soon as I tried to mount a second TC volume. But I already mounted six now without any problems! Tried it several times (mount/unmount, various sequences, no problems!)

    By Eggy on Jul 1, 2008

  17. Eggy, thanks for reporting back, glad it worked!

    By AB on Jul 1, 2008

  18. This blog post was dead on (TC 5.1a). Just as a suggestion to others: if you’re able to mount the volume, you can run chkdsk /f on it and then convert with /FS:NTFS, and that will convert it in place to NTFS, with no need to reformat or create a new volume. I’ve had no problems going on a week now.

    Thanks again!

    By Chris on Jul 14, 2008

  19. Yesterday evening I created a FAT-formatted TrueCrypt file container on my UBUNTU machine (Athlon X2). This worked like a dream.
    Today I tried to open the container on a Vista notebook with a Core 2 Duo processor. Instantly I got a nice bluescreen!
    TrueCrypt 6.0a on both machines.
    With the UBUNTU version of TrueCrypt the only format option is FAT, so this is nasty …

    By franco on Jul 17, 2008

  20. Hey, thanks a lot for the blog! Wanted to say that I think you got it completely right: the problem is !TrueCrypt-FAT32! AND !Dual Core! I’ve been using a 2 GB FAT32 container for 2 years, since version 4.0, and never had an issue on my four single-core desktops and laptops.
    For a week now I have had a new Dell Latitude D530 with an Intel Core 2 Duo, and I get the FastFat.sys BSOD once a day, even with TC version 6.0a. I will go and convert the FAT32 volume to NTFS now…

    By MarioD on Jul 18, 2008

  21. Excellent post. I have just built myself a new quad-core dev machine running Vista, and I find that TrueCrypt 6.0a causes an instant BSOD as soon as a mount is attempted on both FAT32- and NTFS-formatted TrueCrypt volumes. Interestingly, the Linux version is fine. I run it routinely with multiple mounted FAT32 volumes on a dual-core x64 AMD running Fedora 8 with a 64-bit, MP-enabled kernel and have no problems, so whatever the issue is, it’s specific to Windows and multi-core systems. Evidently NTFS is better but still not immune to this, and more cores make it worse.

    By Ed Johnson on Aug 8, 2008

  22. Thanx, dude :) This trouble kept me searching for a long time. It’s not a big deal, since I didn’t use TC very often, but now I know how it happens. Did you write something about this bug to the TC development team?

    By Ternia on Aug 12, 2008

  23. Removing ext2-fs solved my problems.

    By Derek on Sep 2, 2008

  24. I’ve had the same problem for about a year now, starting when I bought a new computer. I also tried everything… to no avail. Nothing helped.

    A few days ago I got a new computer at work. It started to crash the same way as my computer at home. Hm… Strange.

    Since I’ve been using an encrypted USB-stick with some portable programs almost all the time, both at home and at work, I started to suspect that it could have something to do with it.

    And now I found this blog and recognize everything you write!

    And now I think it’s a scandal that TrueCrypt doesn’t list this problem as a “Known issue”!

    By Ola on Sep 8, 2008

  25. Man, you saved my life!!! (And my job, too!) I use a pendrive for all my personal data, especially e-mails, and I’m running Portable Apps inside a TrueCrypt partition. Since I wanted it as portable as possible, to run on many systems, I decided to format the TrueCrypt volume as FAT32. I was already connecting the bluescreen problem with the pendrive, but I was thinking the problem was with the hardware, not the filesystem itself.

    Reading your post, and the fact that TrueCrypt FAT32 volumes were the problem, I can only assume that’s my problem too.

    I’ll re-format my encrypted volume now, and every day without a blue screen I’ll remember you and your help….. ;)

    Thank you!

    By Ricardo on Sep 10, 2008

  26. Thank you for your post. Turns out the problem totally vanished after un-installing ext2 ifs.

    My home-videos (and my girlfriend) appreciate your effort :))

    By sussa on Sep 27, 2008

  27. Thx to this site I found my problem… just uninstalled that damned ext2 ifs and truecrypt came back to life!!!! (vista x64)

    thx again

    By MetalKnight on Oct 1, 2008

  28. EXT2FS - got to be it!

    TrueCrypt causes pc to bsod on mounting.

    Vista 64-bit TrueCrypt (all versions tried) Lenovo ThinkPad T61 Intel dual 4GB RAM

    EXT2FS and VISTA 64 That’s IT!!

    Get rid of EXT2FS. End of Blue Screen

    By Charley S on Oct 2, 2008

  29. I’ve had the same BSOD on my office PC a few days ago, same message, same symptoms, when trying to mount a regular Fat32 penDrive. The PC is single core, running WinXP. Yet I have other penDrives in Fat32 that do not crash, so I decided to forget about it. Then I made a file-volume with TrueCrypt today, tried to mount it and blue! So searching on google I found this excellent article, but since the bug happened with a regular volume also I was suspicious of something else.
    Then I saw this comment:
    #
    Removing ext2-fs solved my problems.
    By Derek on Sep 2, 2008

    I’ve removed the ExtFS2 driver and it’s all running now.

    By Gabriel AGM on Oct 6, 2008

  30. How about new TC 6.1?
    I also had BSOD with my FAT32 crypted drive and it is gone with converting to NTFS. Can anybody say something about this new version?

    By Serj on Nov 6, 2008

  31. I got a BSOD when using system encryption on a FAT32 OS install, and it died, and I came here from Google :)

    By xuyibo on Nov 25, 2008

  32. Hi all,
    first I have to send many thanks to Andrei for this post. He has saved me a lot of hours.

    Secondly, I have bad news for all. I am using TrueCrypt 6.1a on an AMD Athlon 64 X2 Dual Core, and I have a TC FAT32 partition mounted. I have had a BSOD (fastfat.sys) nearly 2 times per day.

    Right now I’m going to reformat the TC partition to NTFS. I really believe it will help, because the symptoms are the same.

    By Leon on Dec 6, 2008

  1. 1 Trackback(s)

  2. Aug 21, 2008: Medienkontor KK » Blog Archive » BSOD/ Absturz mit Truecrypt auf Mehrkernprozessoren unter Fat32 - fastfat.sys
