Compressed Thoughts

A blog by Matthew Rease

The Lenovo Yoga's rather friendly looking firmware screen. Visual and mouse based, with optional keyboard controls.

A Journey Through Some Rough Lenovo Firmware

How I diagnosed a bad NVMe drive on a laptop over several hours.

I recently had the displeasure of working on a Lenovo Yoga C740 which had stopped booting into its operating system (Linux Mint). What should have taken 30 minutes ended up taking me around 8 hours.

I decided to write about my experience, both for fun and perhaps to showcase my knowledge and abilities for the IT workforce (even if it is somewhat embarrassing).

# The Subject

A lightly used Lenovo Yoga C740-15IML, which the owner had had for 2 years. It had recently (within the last 6 months) been switched to Linux Mint after Windows started experiencing stability issues. At this time it is not known whether those were related to the hardware issues discussed here, but considering how long the laptop worked on Mint, I consider it unlikely.

When I say lightly used, understand that I do mean light. This is "your mother's" laptop, if that makes sense. After 2 years there was not a sign of dust in the fans; this was the cleanest laptop I'd ever worked on that wasn't brand new. All that is to say, statistically, this laptop should not have had any issues.

# The Assumption, or, How it Started

Before talking with the owner (only having heard that it no longer loaded the OS), I assumed it was a bad update that messed with the GRUB/EFI data. A video call showed that nothing appeared in the machine's EFI boot list, which seemed to confirm what I thought. Even at that time, though, I noted how slow the machine was to respond when powered on or rebooted. In my ignorance, I thought this would be a simple 5-minute fix, where I boot into a live OS on a USB drive and run some commands to rebuild GRUB and possibly mark the partition as bootable again. I nearly walked the owner through doing it themselves over the phone, but decided it would be best if I did it myself, so I had them deliver it to me. This was the only thing I did right, it would seem, because I would soon discover it was not so simple.
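For reference, on a machine that does boot into a live Linux session, the same EFI boot list can be inspected with the `efibootmgr` tool. The following is only a rough sketch of how its text output could be parsed, not something I ran on this laptop. The function names `parse_boot_entries` and `read_efibootmgr` are made up for illustration, and the parsing assumes the usual `Boot0000* label` line layout.

```python
import re
import subprocess

# Matches lines such as "Boot0000* ubuntu" or "Boot0001  USB HDD".
# Header lines like "BootCurrent:" or "BootOrder:" do not match, because
# "Boot" must be followed by exactly four hexadecimal digits.
BOOT_ENTRY = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_boot_entries(text):
    """Return a list of boot entries found in efibootmgr-style output."""
    entries = []
    for line in text.splitlines():
        match = BOOT_ENTRY.match(line)
        if match is None:
            continue
        number, active, rest = match.groups()
        # Some versions append the device path after a tab; keep only the label.
        label = rest.split("\t")[0].strip()
        entries.append({"number": number, "active": active == "*", "label": label})
    return entries


def read_efibootmgr():
    """Run efibootmgr and return its output (assumes the tool is installed)."""
    result = subprocess.run(["efibootmgr"], capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `parse_boot_entries(read_efibootmgr())` on a healthy install would normally return at least one entry for the installed OS. An empty list would match what the firmware's boot menu showed on this machine.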

# How It's Going

# No Idea What's Going On

I don't recall the exact sequence of events, as I did not initially take detailed notes on my steps; however, this is my best attempt to recapture what I did for the audience's sake.

The next day, when I decided it was a good time to begin work on the laptop, I grabbed my trusty 32 GB SanDisk loaded with Ventoy, one of my favorite bootable tools, which allows me to put multiple ISO files on a USB drive and pick which one to use when booting. I inserted it into the laptop and waited a suspicious amount of time for it to POST (or whatever it was doing), only for it to complain about Secure Boot. Okay, no big deal. I spent several minutes figuring out the "BIOS" key (F2 if you're curious), with each reboot (warm or cold) taking 30 seconds or more, eventually got in, and disabled Secure Boot. Okay, here we go, I probably thought. I once again went to boot, this time manually bringing up the boot menu and selecting the VTOYEFI entry, and then... nothing. The screen stayed black.

Well, no matter. I had another flash drive sitting around with Zorin OS on it. I tried it, and this time (unlike with Ventoy) the initial bootloader appeared, giving me different options (such as default, safe graphics, and nvidia graphics). So I selected one, and behold! It was the same as Ventoy. Black screen. I tried the other options in the GRUB menu, no dice. Okay, so maybe it's a problem with Zorin too, because it's a bit of an obscure distro, though I was certainly getting more suspicious.

Fine. I opened Balena Etcher and began flashing whatever modern distro I had sitting around, waited 20+ minutes for it to flash and verify, stuck the USB drive in, and... same thing. It was likely around this time that I noticed that, whenever I selected an entry on one of these GRUB screens, the flash drive's activity LED would flicker for a while, indicating data was likely being read into memory, only to eventually stop. I further discovered at some point that, while the data was loading, I could turn caps lock on and off or reboot the machine with Ctrl+Alt+Delete, but as soon as that LED stopped, the machine became unresponsive to anything but the power button.

At this point, I was becoming more and more certain that this was not a strange Linux or GRUB issue, but that there was something more deeply wrong with this machine. Another thing I noticed: the firmware was out of date. By itself this doesn't mean much, but it got me thinking more about the firmware, and I started to theorize that the laptop's firmware might be corrupted. After all, the boot/POST times were atrocious, and even the most stable and trusted operating systems weren't booting. I wanted to update the firmware, but the only file Lenovo provides is a .exe file. This isn't one of those firmwares where you can just stick the update on a FAT32 USB drive and apply it without an OS; this one required Windows.

Windows, to my knowledge, does not have a concept of live OS installers like the Linux world does. I found a supposed live Windows 11 ISO (made by the same people that do tiny11), but it was BIOS only, no EFI. This machine also has no internal SATA plugs, only a single M.2 slot, and I didn't readily have a Windows NVMe drive lying around. Still, after I sent this information to a friend, they also suggested putting another NVMe drive in there. At first, I didn't want to do this due to how cumbersome it would be to take a drive out of my PC (I have no spares).

So I first took the drive out of the laptop and inserted it into my PC. I turned the PC back on, opened the KDE Partition Manager, and didn't see anything. I checked lsblk just to be sure, and saw nothing there either. I rebooted into my firmware settings, and even there, the drive did not show up. It was as if it wasn't even connected to my motherboard. Interestingly, when connected to the Lenovo, it showed the SSD's model name in the firmware settings. It came up as an "INTEL HBRPEKNX0202AC0". Interesting, I didn't know Intel made (or at least rebranded) storage drives.
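As an aside, the lsblk check above is easy to script if you ever need to repeat it. This is a minimal sketch, assuming a Linux system with a reasonably recent `lsblk` that supports `--json`. The helper names `nvme_disks` and `read_lsblk` are my own for illustration, and filtering on the `nvme` name prefix is an assumption about how the kernel names these devices.

```python
import json
import subprocess


def nvme_disks(lsblk_json):
    """Return the whole-disk NVMe entries from `lsblk --json` output."""
    devices = json.loads(lsblk_json).get("blockdevices", [])
    return [
        dev for dev in devices
        # NVMe namespaces normally appear as nvme0n1, nvme1n1, and so on.
        if dev.get("type") == "disk" and dev.get("name", "").startswith("nvme")
    ]


def read_lsblk():
    """Run lsblk for top-level devices only and return its JSON text."""
    result = subprocess.run(
        ["lsblk", "--json", "--nodeps", "--output", "NAME,TYPE,MODEL,SIZE"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

With the dead drive installed, `nvme_disks(read_lsblk())` would have listed only my PC's own SSDs, which is the scripted equivalent of "saw nothing there either".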

Still, all this had shown was that the SSD probably didn't work anymore, which I had already considered as a possibility. It was disappointing to be sure, but it still didn't explain why I had been having so many problems. So, after some thought, I turned off my very heavy full tower PC and began taking the GPU out. I did this because I have 2 M.2 SSDs, one of which has Windows 11 installed on it (I used it for benchmarking tools when I first built the PC). It sits under my annoyingly large GPU, which was not fun to remove gently. To add insult to injury, once I got it out and tried to put it in the laptop, it didn't fit! It's a Samsung 980 Pro with heatsink. Turns out the heatsink adds enough width that it collides with one of the components on the laptop motherboard (there's plenty of room, they could have moved it, it's just a poor decision on Lenovo's part). So I put that drive back in my PC, and then took out my main SSD, which was under the heatsink ASRock provides on the motherboard (with a sticky thermal pad).

Close up of the motherboard's M.2 slot, with a white/gray rectangular circuit part right up against the long edge of the NVMe.

# Finally Starting to Figure it Out

I forgot to mention it, but at some point I had tried booting Memtest86, and it actually loaded properly and ran, which showed the machine was still somewhat able to operate. I accidentally let the battery run out while it did this, so it only completed 50% of its tests, but I was satisfied that the RAM was not the issue. It is still interesting that this ran, while all other bootloaders ultimately failed at some point.

Back to where we left off though: with that other SSD in, this is where things began to change. Despite my system running Linux, and this drive only having Linux on it, apparently its EFI partition still contained the Windows Bootloader. So when I turned on the machine, I actually saw a Windows recovery screen of sorts (it also loaded quite quickly). Initially this was extremely disorienting, as I didn't know I had the Windows Bootloader on this drive, and I thought maybe this was built into the machine's ROM or something, but after a few minutes of thinking I realized what I was looking at (it pays to know your own tools I guess).

Windows Recovery screen on external monitors. Laptop is lying on its screen, with the bottom cover removed exposing the internals.

This was the closest I'd gotten, in my mind, to booting an OS since receiving this machine. I couldn't get past this screen, since none of the options did anything (which makes sense when you remember there's no Windows partition on the drive), but it was something. While I was happy to see something, at this point I had been at this for over 5 hours, and just wanted to take a break.

It was late, and I decided to go to bed. Upon waking up the next morning, I decided to try upgrading the firmware again. I remembered that in the Windows installer it is possible to get a command prompt (Shift+F10), and reasoned I could use that to run the firmware upgrader. Problem is, once I finally had a working installer drive, it too would fail to boot on the laptop (even though I had just seen something Windows related work). I don't know if it was the sleep, or if it was just a coincidence, but on the second day, over 6 hours into this project, I had the idea to try booting something with the bad NVMe drive removed. Immediately the Windows installer was able to load. It was like I'd just had a Jimmy Neutron brain blast.

The source of all the laptop's problems appeared to be this stupid little SSD from Intel. After trying to upgrade the firmware (more on that later), I tried my other flash drives, including the one with Ventoy, and they all worked as expected. Not only that, they loaded at a regular speed. So this SSD had not only prevented most bootloaders from working properly, it had also been causing the extremely long POST times. None of this happened when the drive was connected to my PC, so I can only conclude that the quality of Lenovo's firmware is rather poor.

The lesson of the day (for me at least) is that if you can't rule out a piece of hardware, remove it. There would have been no harm in trying to boot this thing without the SSD connected, and I could have saved myself a few hours at least. I also learned that Intel might make low quality SSDs, since this laptop had only 2 years of very light usage under its belt. It was probably just bad luck in the silicon lottery, but since this is the only Intel branded storage I've ever encountered, it's quite a sad introduction.

Remove all non-essential pieces of hardware!

# How Things Could Have Gone

Had Lenovo's firmware acted more like ASRock's (or most any other manufacturer's, I assume), then I would have booted into a live Linux OS on my first try, seen that the NVMe drive was nowhere to be found, and concluded it was faulty in some way. Instead, I got sent on a wild goose chase (admittedly of my own design, somewhat) trying to determine which parts of this machine, if any, worked.

# Miscellaneous Additional Info

I did try some other steps during my tests. I disconnected the battery and held the power button for a while. I also removed the firmware battery to reset everything. None of these steps made any change in the behavior or provided any significant information (aside from ruling out non-issues), hence their not being included in the "story". They weren't important for the flow, but I wanted to make clear that I didn't skip said basic steps.

My attempt to upgrade the firmware from the Windows installer didn't work. Apparently it needs a proper Windows environment to run. Took me a while to even find that out because I first tried connecting a second flash drive while in the installer, but despite going through all drive letters from A: to Z:, it never appeared. So I ended up having to shut down, mount the installer drive in Linux, copy the firmware exe over, and boot the installer again - only to find out it was a waste of time.

So, technically, I cannot be 100% sure of my findings. The firmware could still have issues, and I'm unable to update it at present. However, I will soon have a brand new Western Digital drive which I will attempt to install Windows to first, in order to upgrade the firmware, and then reinstall Linux Mint. If anything noteworthy happens, I'll update this post.