Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "file system errors"
-
I've found and fixed any kind of "bad bug" I can think of over my career from allowing negative financial transfers to weird platform specific behaviour, here are a few of the more interesting ones that come to mind...
#1 - Most expensive lesson learned
Almost 10 years ago (while learning to code) I wrote a loyalty card system that ended up going national. Fast forward 2 years and by some miracle the system still worked and had services running on 500+ POS servers in large retail stores uploading thousands of transactions each second - due to this increased traffic to stay ahead of any trouble we decided to add a loadbalancer to our backend.
This was simply a matter of re-assigning the IP and would cause 10-15 minutes of downtime (for the first time ever), we made the switch and everything seemed perfect. Too perfect...
After 10 minutes every phone in the office started going beserk - calls where coming in about store servers irreparably crashing all over the country taking all the tills offline and forcing them to close doors midday. It was bad and we couldn't conceive how it could possibly be us or our software to blame.
Turns out we made the local service write any web service errors to a log file upon failure for debugging purposes before retrying - a perfectly sensible thing to do if I hadn't forgotten to check the size of or clear the log file. In about 15 minutes of downtime each stores error log proceeded to grow and consume every available byte of HD space before crashing windows.
#2 - Hardest to find
This was a true "Nessie" bug.. We had a single codebase powering a few hundred sites. Every now and then at some point the web server would spontaneously die and vommit a bunch of sql statements and sensitive data back to the user causing huge concern but I could never remotely replicate the behaviour - until 4 years later it happened to one of our support staff and I could pull out their network & session info.
Turns out years back when the server was first setup each domain was added as an individual "Site" on IIS but shared the same root directory and hence the same session path. It would have remained unnoticed if we had not grown but as our traffic increased ever so often 2 users of different sites would end up sharing a session id causing the server to promptly implode on itself.
#3 - Most elegant fix
Same bastard IIS server as #2. Codebase was the most unsecure unstable travesty I've ever worked with - sql injection vuns in EVERY URL, sql statements stored in COOKIES... this thing was irreparably fucked up but had to stay online until it could be replaced. Basically every other day it got hit by bots ended up sending bluepill spam or mining shitcoin and I would simply delete the instance and recreate it in a semi un-compromised state which was an acceptable solution for the business for uptime... until we we're DDOS'ed for 5 days straight.
My hands were tied and there was no way to mitigate it except for stopping individual sites as they came under attack and starting them after it subsided... (for some reason they seemed to be targeting by domain instead of ip). After 3 days of doing this manually I was given the go ahead to use any resources necessary to make it stop and especially since it was IIS6 I had no fucking clue where to start.
So I stuck to what I knew and deployed a $5 vm running an Nginx reverse proxy with heavy caching and rate limiting linked to a custom fail2ban plugin in in front of the insecure server. The attacks died instantly, the server sped up 10x and was never compromised by bots again (presumably since they got back a linux user agent). To this day I marvel at this miracle $5 fix.1 -
TL;DR; I unfucked a micro sd used by a nintendo switch with one command: fsck
I had noticed that the nintendo switch displayed way more storage usage then it should. I didn't mind at first, but at some point I couldn't download any games. When I checked I saw some ridiculous storage usage.
According to the system, all games summed up ~20Gb, but >100Gb was in use? Sounds retarded, so I did the following:
* Plugged it into laptop
* Spend one our searching for a way to to access this seemingly unknown filesystem
* Find out this filesystem is actually exFAT
* Find out that 2/3 sd adapters suck
* check filesystem with dust (A visually more pleasing version of du)
* Find 20Gb of files, nothing hidden or whatever
* run fsck
* "File system contains some errors want me to fix then?"
* "Sure"
* check usage
* 17%
As for the reason why this happened in the first place, my guess is that the switch labels the whole segment of the card as used before downloading a game and it something goes wrong, it shits itself.
Anyways, fsck is a pretty useful command.1 -
tl;dr:
The Debian 10 live disc and installer say: Heavens me, just look at the time! I’m late for my <segmentation fault
—————
tl:
The Debian 10 live cd and its new “calamares” installer are both complete crap. I’ve never had any issues with installing Debian prior to this, save with getting WiFi to work (as expected). But this version? Ugh. Here are the things I’ve run into:
Unknown root password; easy enough to get around as there is no user password; still annoying after the 10th time.
Also, the login screen doesn’t work off-disc because it won’t accept a blank password, so don’t idle or you’ll get locked out.
The lock screen is overzealous and hard-locks the computer after awhile; not even the magic kernel keys work!
The live disc doesn’t have many standard utilities, or a graphical partition editor. Thankfully I’m comfortable with fdisk.
The graphical installer (calamares) randomly segfaults, even from innocuous things like clicking [change partition] when you don’t have a partition selected. Derp.
It also randomly segfaults while writing partitions to disk — usually on the second partition.
It strangely seems less likely to segfault if the partitions are already there, even if it needs to “reformat” (recreate) them.
It also defaults to using MBR instead of GPT for the partition table, despite the tooltip telling you that MBR is deprecated and limited, and that GPT is recommended for new systems. You cannot change this without doing the partitions manually.
If you do the partitions manually and it can’t figure out where to install things, it just crashes. This is great because you can’t tell it where to install things, and specifying mount points like /boot, /, and /home don’t seem to be enough.
It also tries installing 32bit grub instead of 64bit, causing the grub installer to fail.
If you tell it to install grub on /boot, it complains when that partition isn’t encrypted — fair — but if you tell it to encrypt /boot like it wants you to, it then tries installing grub on the encrypted partition it just created, apparently without decrypting it, so that obviously fails — specific error: cannot read file system.
On the rare chance that everything else goes correctly, the install process can still segfault.
The log does include entries for errors, but doesn’t include an error message. Literally: “ERROR: Installation failed:” and the log ends. Helpful!
If the installer doesn’t segfault and the install process manages to complete, the resulting install might not even boot, even when installed without any drive encryption. Why? My guess is it never bothered to install Grub, or put it in the wrong place, or didn’t mark it as bootable, or who knows what.
Even when using the live disc that includes non-free firmware (including Ath9k) it still cannot detect my wlan card (that uses Ath9k).
I’ve attempted to install thirty plus times now, and only managed to get a working install once — where I neglected to include the Ath9k firmware.
I’m now trying the cli-only installer option instead of the live session; it seems to behave at least. I’m just terrified that the resulting install will be just as unstable as the live session.
All of this to copy the contents of my encrypted disks over so I can use them on a different system. =/
I haven’t decided which I’m going with next, but likely Arch, Void, or Gentoo. I’d go with Qubes if I had more time to experiment.
But in all seriousness, the Debian devs need some serious help. I would be embarrassed if I released this quality of hot garbage.
(This same system ran both Debian 8 and 9 flawlessly for years)15 -
I once agreed to maintain and develop an application used in a different section of the school to keep inventory and make sure everything is where it is supposed to be.
At first there was enthusiasm, together with 2 of my classmates we agreed and git clone-d the .NET application that now graduated students built and maintained for the past few years. What could go wrong right?!
It became clear that the original students that worked on it followed an older curriculum, meaning they still got taught .NET instead of the core variant that we get now, not only that but it also seemed that they either did not fully grasp the Clean/Onion architecture or didn't get it in class since there were infrastructure components in the 'Domain' project of the solution. Think of 2 DBContexts in the domain model, yep.
One of us bailed in the first week, the other one and I felt bad for the people using the app so we went on and tried to work on the first bugs that were described in a document. One of these bugs was 'whenever I filter on something in the list, everybody gets to see that filter on their screen instead of only me'. Woah that's weird! Let's see how they put that together!
Oh god, they are using a _static_ variable to store filters, no wonder that it doesn't work properly. Ever heard of sessions?!
Second bug: Sometimes people can't create an account when we sign them up from the admin panel. Alright that is weird, let's figure that one out! Wait a second it seems to work in development? What's this about.
Oh wait I can't create an account on production either? Oh that's weird, wait a second... Why do I have to put my e-mail in a form that was sent to me through e-mail? Why is my address not filled in already? OOH, if someone types in the wrong e-mail address (which is easy since our school has 4 variants of the same f*cking e-mail address) it won't work since it can't recognize the user! Brilliant! Remove e-mail input box and make a token/queryparam determine the user account.
Ah that seems good, it's a mess but it seems a tiny bit better now, great! We're making progress and some sweet buck.
Next bug, trillions of 50x errors on random pages, that's a weird one.
Hm everything works in development, that's odd. Is the production data corrupted?
DID I MENTION that in order to get into the system in development we have to load in a f*cking production database backup ON OUR DEVELOPMENT MACHINE and then ask one of the users' password to login to it and create an account for ourselves? Seeding? What's that, right?!
Anyway, back to bug fixing. I e-mail the the people responsible for the app and get a production admin account, oh I also can't ssh into it because of policies so I have to do everything over e-mail and figure out what's causing the errors. I somehow also wonder if they have any kind of virtualization in place, giving students a VM to do that stuff in doesn't seem so weird does it ? Even with school policies?
Oh btw, 'deploying' means sending a .zip file to a guy in another building and telling him how to configure it, apparently this resulted in a missing folder that the application needed to work and couldn't make on its own. This after 2 weeks of e-mailing back and forth.
After 3 months i quit out of despair and sadness, and due to the fact that I just couldn't do it anymore. I separated everything into logical subprojects and let the last guy handle it, he was OK with that and understood why I left.
Luckily, around that time I already had an actual job at a software development company :)3 -
(Dev)Life in the past 12 hours
Oh boy have the last 12 hours been a roller coaster ride for me. Noob me decided to "compile" AoSP for my device to get a taste of how custom ROMs are built from source. Overall it was fun but the errors were a very good excercise for googling, SO. Couple stuff I learnt ( possibly useful for anyone who comes here )
* The shebang line ( #!/usr/bin/env python ) on my system translated to Python 3.7 environment instead of the expected Python 2.7. Best solution I think to avoid confusion is to create a python 2.7 environment and source it.
* Get your trees right. A jar file called WfdCommon.jar ( apparently known as wifi-display common ) was the cause of several hours of hunting the fault. My vendor tree somehow didn't have this file so dex2oat was borking out like mad. I'm still amazed how I figured this one out almost by myself. ( Basically I had to check every file included in the boot class path, and find the odd one )
* I wasted a lot of time in finding the right files to change version numbers and all. Maybe I didn't search XDA properly for a guide ?
Overall it was a fun experience. Also if anyone's experienced in this area could you share resources to learn more about custom ROM development? Specifically on the tweaking part where you mix features from different ROMs to make a great ROM ( like AoSP extended or Pixel Experience ). All I could find were on the zips and not on sources.7 -
Frustrated that my build system wasn't recognizing a file change I added to my code. It kept telling me that a function didn't exist in the linked object (linker error). I checked everything and stared at this shit for about 15 minutes or more. The signature matched, the function existed, the relevant source files existed. I was starting to imagine impossible scenarios. I cleaned the project and recompiled. No errors, everything linked just fine. Fuck you? I guess...
So I decided to needed to walk around so I went into my bosses office.
me: I don't want to program anymore.
boss: What do you want to do?
me: Shovel shit.
boss: They are the same thing.
me: True...
TLDR: Tool and possibly skill issue results in frustration and humor.6 -
Dependency hell is the largest problem in Linux.
On Windows, I just download an executeable (.exe) file, and it just works like a charm! But Linux sometimes needs me to install dependencies.
At one point, I nearly broke my operating system while trying to solve dependencies. I noticed that some existing applications refused to start due to some GLIBC error gore. I thought to myself "that thing ain't gonna boot the next time", so I had to restore the /usr/lib/x86_64-linux-gnu/ folder from a backup.
And then there is a new level of lunacy called "conflicting dependencies". I never had such an error on Windows. But when I wanted to try out both vsftpd and proFTPd on Linux, I get this error, whereas on Windows, I simply download an .exe file and it WORKS! Even on Android OS, I simply install an APK file of Amaze File Manager or Primitive FTPd or both and it WORKS! Both in under a minute. But on Linux, I get this crap. Sure, Linux has many benefits, but if one can't simply install a program without encountering cryptic errors that take half a day to troubleshoot and could cause new whack-a-mole-style errors, Linux's poor market share is no surprise.
Someone asked "Why not create portable applications" on Unix/Linux StackExchange. Portable applications can not just be copied on flash drives and to other computers, but allow easily installing multiple versions on a system. A web developer might do so to test compatibility with older browsers. Here is an answer to that question:
> The major argument [for shared libraries] is security, that if there is a vulnerability in a commonly-used library, then only that library has to be updated […] you don't have to have 4 different versions of a library installed
I just want my software to work! Period. I don't mind having multiple versions of libraries, I simply want it to WORK! To hell with "good reasons" for why it doesn't, and then being surprised why Linux has a poor market share. Want to boost Linux market share? SOLVE THIS DAMN ISSUE!.
Understand that the average computer user wants stuff to work out of the box, like it does in Windows.52 -
VSCode is doing really strange things to my language server, in such variety that I'm starting to suspect that it's simply incorrect because it's very unlikely that I'd misunderstand so many distinct things at once.
- The trace level is verbose, yet VSCode absolutely spams the LS with trace: off requests
- the capability update request I used to set file watchers never gets a response even though the standard clearly states that all requests must get responses or progress reports quickly, and I'm not getting file updates even after vscode responds to a file system change. By the way, if file watching is a capability, why can't I set it in the protocol handshake with all the other capabilities?
- my semantic token provider (used for syntax highlighting) is simply ignored, no requests, no errors
- the debug console is spamming editor internal errors2 -
Newbie Linux User - Story about not working GUI
I am a proud Opensuse user for about a year, still struggling with some basic stuff, terminal, etc.
The story begins when a few days ago I try to login to the system. To my trusty Gnome. I get stuck on login loop;
successful login - > black screen for a second - > back to login screen.
Zero feedback, not a single error message
Stress level increases taking in count that I am at a climax at my university with tons of projects on my computer.
I assemble the Team A:
Me, Google, Stackoverflow, and for desperate times Russian Stackoverflow
Over 4 hours, found out that my user is affected by this, tried restoring default Gnome configuration, went through bunch of logs only to find out that every user gets the same errors, still only my not working. Even KDE denied to cooperate with the same result.
So what went wrong you may be thinking.
One line in file replaced by miniconda, that changed the PATH.
Linux is the best detective game that I've ever played.
Is it something that I should get used to?2 -
Been using a jump drive compiled as NTFS for my sneaker net. I copied some file to it from Ubuntu 18.04. I take to Windows machine and it says there are errors (does this a lot). Usually it works fine. The files I copied are not there. They were downloaded web pages as Windows machine is not on net. I noticed the file names have characters like #, ? and ` in them. So I reformat the jump drive to exFat. I copy the files from Ubuntu plus a bunch of other files. It errors like crazy on stuff that copied fine before with NTFS. Not a solution. So I find an alternate downloader for web pages I want to copy (does not have funky characters in filenames). I reformat back to NTFS on jump drive.
So basically if I want to copy files from my Ubuntu system I am stuck with NTFS and always repairing the filesystem. Yes, all my libraries for exFAT are up to date in Ubuntu.
Is there ever going to be a better way?
When is Windows going to grow up and support ext4?
Why?
Its 2019 and we still have incompatible networks and filesystem formats.7 -
Please help me before I get mad,
First day with Linux Mint.
Objective: Make a 3Tb Hdd Read and Write, Right now I can use it only to Read.
Finally Installed Linux after some bumps (bad ISO).
I have 2 HDDs, the SSD with Linux and a 3Tb HDD
Right now the 3T has 4 partitions, one for windows, 3 for personal use with lots of personal stuff I can't lose.
I've been looking for videos, tutorials and the maximum I got was to had one partition mounted as a folder
<code>
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=f0a65631-ccec-4aec-bbf5-393f83e230db / ext4 errors=remount-ro 0 1
/swapfile none swap sw 0 0
UUID=F8F07052F07018D8 /mnt/3T_Rodrigo ntfs-3g rw,auto,users,uid=1000,gid=100,dmask=027,fmask=137,utf8 0 0
</code>
What am I missing?
PS.: Next: Make fingerprint work in Linux14 -
So when windows cannot open a png with its standard image viewer without a file system error and while running sfc/scannow a bsod appears, maybe this old tablet shouldn't be used anymore, which already runs slow as hell...1
-
ENOSPC = random things go wrong.
There are many synonyms for ENOSPC, like "disk full", "space storage full", "space storage exhausted", "no more space left on device", and those other repulsive errors. For the sake of simplicity, I am going to refer to it as ENOSPC.
If you are in this condition on the operating system partition, get out of it quickly or random things will go wrong. Text editors which write directly to a text file rather than creating a temporary file and then replacing the text file could end up blanking the text file, softwares' configuration files might fail saving which causes a reset, and web browsers might spontaneously reset cookies and lose history.
For example, Firefox has created a gap in the web browsing history, as shown here. The history that is now memory-holed initially appeared to have been recorded successfully. Apparently, a failed write to the places.sqlite database when closing the browser created this gap.4