For me it was picking the wrong option in the RAID setup after a disk failure. A funny weekend indeed.

In my defense, it's my first junior job: +20 experience points. Just curious about yours, so maybe I can learn from mistakes made by other people too.

So what about you?

    This should be Community Wiki
    http://serverfault.com/questions/7902/best-system-administrator-accident http://serverfault.com/questions/5066/biggest-command-line-mistake http://serverfault.com/questions/6844/common-mistakes-made-by-system-administrators-and-how-can-we-avoid-them
  • Zordache, our human card catalog, everybody! Give him a nice round of applause. Tip your waitresses. – Wesley Sep 27 '10 at 22:41
  • @Zoredache, I'm disappointed. Being the rollerdex is usually *my* job! Be stealin my thunder why don'tcha! – Mark Henderson Sep 27 '10 at 23:15

11 Answers


Nothing of note from a technical perspective, but it took me at least a decade of my career to realise that IT wasn't an end in itself but a simple requirement to better serve my company and customers. This change in attitude stopped me from being that typical know-it-all IT dick that we've all come across and let me really start to help those around me. This was the start of what has been a far more rewarding (for all parties) period of my career that has made me genuinely happy, liked and fulfilled.

So that would be my biggest mistake - wasting the 90's by thinking the world revolved around me.

My biggest mistake was probably going into technology thinking that other people liked technology, not realizing that people view computers with as much reverence as they view their VCR (remember those?). I was naive and stupid.

I continually struggle with the idea that people have zero initiative to learn how to use their tools to a basic level of skill. My semi-Aspergian mind can't quite grasp why people will profess that they feel stupid or displace their anger at themselves towards hating the machine when they refuse to learn even how to control-alt-del, and get more angry when I ask them for details to clarify the situation as I'm trying to help them (did you do XYZ? What do you mean by you did ABC?) Part of me is always questioning that if they feel stupid having to come to me for help, why didn't they learn how to add a printer (which I've reviewed with them half a dozen times) or -insert basic task here-. I'm not asking them to replace memory or troubleshoot a bad drive sector, just know how to actually use the tool for their jobs.

I have to stand back each time and take stock of the situation and reframe it to keep from getting more frustrated with people.

Essentially I was naive to think that technology that enables us to create music, movies, and learn and explore the world in ways unheard of ten, fifteen, twenty years ago would be understood by people. Instead it's a glorified cornucopia of porn and stupid flash games and memes-of-the-week. Of the content that is being produced rather than consumed, probably ninety percent of it consists of teenagers shooting "LOOK AT ME! I SING GOOD!" webcam videos and "LOOK AT ME! DUCKFACE!" pictures from cell cams.

I came into the field with naive hopes and a complete misunderstanding of human nature. Now I struggle to change my perspective and see things in a different way. I need to remember my rules-of-thumb maxims.

  • The user isn't me.
  • The user doesn't care.
  • The user just wants to get the immediate task done and go home.
  • People don't see the potential value in technology.
  • The user lies. They don't want to know why it crashed. Or why it does XYZ.
  • Technology isn't an end unto itself. It's a tool. As far as non-technologists are concerned it's about as fascinating as a tractor or a hammer is to me.
  • The user may unintentionally lie, and when I discover that something else was done or altered, they'll without fail say something like, "Oh, yeah,..."
  • It doesn't matter if they're lying or not.
  • I still hide a slight inner twitch when people talk about how stupid the computer is, how stupid this or that is, knowing that I spent more than a few bucks and more than a couple years studying these "stupid" things to get a comp-sci degree in college. But that's not the user's fault either.

After I realized these things, I had a kind of slump in morale. Still do sometimes. But I think it's a combination of factors that contributes to this. But this frame of mind, the mindset that technology was a great thing that people would and could use to create wonderful content and express themselves, was my greatest technology related blunder, as it completely colored my view of people.

In hindsight I see how naive it all was and how foolish it was to think this way, so I don't need comments telling me what an idiot I am. I fill a position as a cog in a greater machine, and for the users (seeing as I don't work for a technology-based company) I fill a role about as exciting as a custodian. And they don't understand that my role can be as important as a custodian's role either (in some circles I guess they're more commonly called janitors).

I have found that I need to look at things with a different perspective. The job I once loved...using computers...is just a job. I no longer try to subconsciously play the martyr of extra hours at work without compensation and get bitter over it (I end up putting in extra hours and losing my lunch times, but I don't get bitter and if something else comes up involving home life or personal life...work takes a back burner.)

The users just want to get their work done, and they don't care how it gets done. If they could most of them wouldn't touch the computer to get their work done; it's a necessary evil.

Work is not my life. I should not let it define my life. Most of us become "the computer guy." I don't find it fulfilling anymore.

I take time to create things with my limited skills. I no longer dwell on disappointment in other people for not using technology the way I think it should be used. Instead, I use it the way I think it should be used.

I got a hobby. It may involve computers, but it's not related to fixing, building, or configuring or anything else in my day job.

I'm no longer defining my life by my job. I'm putting down a line to separate them so I have something of a life.

So that was my biggest blunder, and how I'm trying to remedy it. Maybe others will find something in this philosophy to criticize, or can't relate to it. I'd be interested in finding out if there are others who can relate to something like this though.

I know you were probably looking for something along the lines of, "I rm -fr'd from the root directory!" or "I erased active directory!" and I'm sorry I couldn't really contribute a ha-ha moment of admin stupidity; I know I had more than a couple (at the ISP I used to work at we weren't officially "in" until you made your first stupid whoops moment; I just don't remember mine.) But in terms of biggest blunder, I think the naive mindset really was the blunder that has had the biggest lasting influence on me in my career.

Making changes outside of the change control procedure.

Even if any bad thing that happens is unrelated to the change you made, your ass will still get kicked just as badly by those higher up the pecking order.

Of course, practically speaking, some changes have to be made faster than the change control procedure allows, but in each case the risk to your career is higher. Sometimes it is better to work to rule and let the system fail than to go against the system...

  • Change control procedure? Is this like resume/CV, similar but not quite related to following documented procedures? – Bart Silverstrim Sep 27 '10 at 11:28
    A bit more than that - a change control procedure means that any time you make a change it follows a procedure that ensures you took into account the effects of the change, consulted those people who would be affected by the change, documented the change, tested it, made a roll back plan, and scheduled it in with consideration to all the above. Usually there are other people as gatekeepers at critical points in the process. It means you don't tread on people's toes, and always have a record of who did what when. And it makes it almost impossible to do anything fast. – dunxd Sep 27 '10 at 11:33

I had a RAID 10 array which had a disk fail around 4:50 PM on Friday afternoon. Instead of heading over to replace the failed disk I said "eh, it's fine... I'm going to Happy Hour." Halfway through happy hour I got some strange alert messages and realized that the other disk in the pair had failed and my raid array was trashed.

Lesson learned, whatever can go wrong will go wrong. I spent the next few days answering complaints and doing restores.

  • I've seen the same on a RAID 5. I asked the DC guys to change the drive but they never made the drive in that night and left it until morning. Of course, a second drive went :). – Jim Sep 27 '10 at 12:27
    To be fair, it sounds like it would have failed while trying to rebuild in the first place, unless you mean that second failure wouldn't have affected the rebuild. – Bart Silverstrim Sep 27 '10 at 13:15
  • I've had the second/critical drive fail during the rebuild... Sigh. – Brian Knoblauch Sep 27 '10 at 15:29
    @Brian-best reason to stagger your drive manufacture batches in RAID arrays. Increase the odds they won't all go kaput close together. – Bart Silverstrim Sep 27 '10 at 19:20

My biggest mistake was listening to an advisor in college who, when I did poorly in a particular math class, calmly informed me, "well you'll never be a programmer." Changed the course of my whole life.

Never listen to people who tell you what you are and aren't capable of.

  • Not knowing the particular class, he might have meant "you'll never be a *good* programmer" (and considering the apparent lack thereof, might have been saving us all some headaches). (No personal offense intended, I know nothing of your skills or abilities) – Chris S Sep 27 '10 at 16:53
  • Engineering calc, the course they looked at to determine whether you got into the CS program there. And she was right - I wasn't going to be a programmer, at least not in the sense that she meant for that particular school, i.e., writing device drivers, OSes, and embedded systems. This was well before .NET and Java, before business applications became the real mainstay that they are today, and there were very few well-known degree paths for such a focus at the time. – Matt DiTrolio Sep 27 '10 at 18:42
    College isn't about learning to program, at least not in a good CS type degree. It's about learning concepts behind how the things work. Today there are plenty of crap courses focused on "this is how to program in Java" or some other lang of choice, but a good CS program teaches you how to think and why it works. "You have to learn WHY things on a starship work..." -James Kirk to Saavik - but programming itself can be self-taught, and actually it must largely be, since languages evolve as quickly as the spoken word. Others will probably flame me for this opinion, though...I'm not a programmer. – Bart Silverstrim Sep 27 '10 at 19:24
  • I've definitely learned that. The best education gives you both a platform from which to absorb new knowledge as it comes out, *and* a context in which to apply it (I went the business apps route). – Matt DiTrolio Sep 27 '10 at 19:29

I once thought "hey, what's this huge executable file in the root directory? I'll just run strip on it..." and then immediately went "uh, oh, that's the kernel, I better do a tape restore Right Now."

Early in my career I allowed myself to be seduced by a flash in the pan proprietary technology that was in high demand for several years, used by large companies with big IT budgets. I felt on top of the world, having my pick of locations and rates but in the back of my mind I knew this was risky.

I kept going from project to project becoming an expert in this niche field. Then the market swung, the company overreached and their contracts and stock plummeted. I found myself suddenly obsolete, thrown into the real world dominated by emerging open source technologies and I was woefully ill-equipped. I was starting over and it took years to get back on track.

It's hard to know if the road you're on is a dead end but I'll never blindly throw all my eggs into one basket again.

I worked in a small-ish division at a multinational technology corporation. There was a database program called ASI, which ran under a terminal emulator on Windows connecting to IBM AS/400. It had a query function, not SQL, but there were individual fields to specify the tables you wanted and how they should be joined. I was querying a database of over a million electronics parts, joining the header part file to the detail/warehouse part file. I somehow managed to leave out the join condition between these two tables on my query definition, which created an effective cross join. Not realizing this, after submitting the query in interactive mode I went to lunch.
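To illustrate the kind of blow-up a missing join condition causes, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the ASI query tool or the AS/400 from the story; the table names `part_header` and `part_detail`, their columns, and the row counts are all made up for the example.

```python
import sqlite3

def count_rows(conn, sql):
    """Count the rows a query would return without fetching them all."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the header part file and the detail/warehouse part file.
conn.execute("CREATE TABLE part_header (part_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE part_detail (part_id INTEGER, warehouse TEXT)")

# 100 parts, each stocked in 3 warehouses -> 300 detail rows.
conn.executemany(
    "INSERT INTO part_header VALUES (?, ?)",
    [(i, f"part-{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO part_detail VALUES (?, ?)",
    [(i, w) for i in range(1, 101) for w in ("A", "B", "C")],
)

# With the join condition: one row per (part, warehouse) pair.
joined = count_rows(
    conn,
    "SELECT h.part_id, d.warehouse "
    "FROM part_header h JOIN part_detail d ON d.part_id = h.part_id",
)

# Without the join condition: every header row paired with every detail row.
crossed = count_rows(
    conn,
    "SELECT h.part_id, d.warehouse FROM part_header h, part_detail d",
)

print(joined, crossed)
```

With these toy sizes the proper join yields 300 rows and the accidental cross join yields 100 × 300 = 30,000. Scale the header table to over a million parts and the same mistake produces a result set on the order of the product of the two table sizes, which is roughly why the whole box ground to a halt.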

When I returned an hour later, people were sitting in the aisles outside their cubes talking to each other. They said the system was so slow it was basically unusable. After about five minutes I had this nasty sinking feeling start growing in the pit of my stomach. It had begun to dawn on me that it could possibly be my query? I checked and found to my horror that yes, indeed, the query was still running an hour or so after I'd submitted it. At the time I couldn't stop interactive queries (they wouldn't stop even if you ended your terminal session). When I checked, the process was taking 99% of all resources and I had to call the server admins and get them to terminate it for me. Surprise, surprise, the system started being responsive for everyone again!

The division using these servers employed hundreds of people and had facilities in at least 3 states spanning the United States. I found out later that people could do literally nothing for over an hour: the scan guns at the warehouse didn't work, the manufacturing plant folks were dead in the water, everyone. I shudder to think of what my little mistake cost in dollars. Chalk one up to experience, I guess.

Side note: the reason I was running my queries in interactive mode instead of submitting them in batch mode was that 1) they ran faster, and 2) I could more easily tell when they were done—I didn't have to keep listing the processes. For quite a while I couldn't "break into" interactive queries to stop them, though, because in the terminal emulation software, even though the SysRq key was bound to the System Request function, pressing the key did nothing. I eventually solved that problem by mapping the SysRq key to a VB macro which used SendKeys to send the SysRq key. Laugh.

Now I'd like to address any future prospective employers who manage to find this: Having had this experience adds value to me as an employee, since I will never make this kind of mistake again! It's much better to hire a seasoned professional than some green kid who has yet to make his big mistake. I have not taken down a system in the 12 years since then, and I work with over 70 SQL Servers, many of them on a daily basis, one of 800GB in size. No runaway cross-joins for me. I am fanatically careful with the systems I use.

  • Argh, been there and done that on an AS/400 before - I was after F3, but somehow managed to hit F5. Lucky I have *SECOFR rights on the box, so I did get to kill it - but not before everybody noticed it grinding to a halt. Bad times. – Ben Pilbrow Sep 27 '10 at 22:36

I was trying to find the time of the last reboot, and instead of entering just that or last | grep reboot, I did a last | reboot on a very important system.

Had to laugh really.
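For what it's worth, one way to get the last boot time without typing anything that contains the word "reboot" is to read it from the kernel. This is only a sketch and assumes a Linux system, where `/proc/stat` has a `btime` line holding the boot time in seconds since the epoch; the function name and the sample text are made up for the example.

```python
from datetime import datetime, timezone

def boot_time_from_proc_stat(stat_text):
    """Parse the 'btime <epoch-seconds>' line from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "btime":
            return datetime.fromtimestamp(int(fields[1]), tz=timezone.utc)
    raise ValueError("no btime line found")

# Made-up sample content so the sketch runs anywhere.
sample = "cpu  1 2 3 4\nbtime 1285600000\nprocesses 42\n"
boot = boot_time_from_proc_stat(sample)
print(boot.isoformat())

# On a real Linux box (assumption), something like:
#   with open("/proc/stat") as f:
#       print(boot_time_from_proc_stat(f.read()))
```

Nothing here runs an external command, so there is no pipeline to mistype.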

    And you were logged into a very important system as root why? – Chris S Sep 27 '10 at 16:55
    Ah Chris S don't ruin the fun. The only reason people routinely log in as root is to have these hilarious experiences. It is like those camcorder shows where people film their loved ones getting hit in the nuts by a golf ball. They want the misery in the name of humour, THEY DON'T EVEN PLAY GOLF. If people used the root account correctly you wouldn't be able to break things in these crazy ways and we would all miss out on the fun. :) . PsychoSid: +1 – Richard Holloway Sep 27 '10 at 22:39

During my teenage years I spent a few summers working for a local Health Authority in the UK doing helpdesk work. One day, when I was changing backup tapes in the server room, I accidentally kicked the UPS under the desk. Unfortunately my foot connected with the power button on the front which switched it off and cut power to the connected equipment. In this case, the equipment being the server acting as the PDC. If I recall correctly, the BDC was offline at the time.

The users weren't very happy but for some reason my boss was rather more forgiving. Fortunately (for me) I wasn't the first person to have done exactly the same thing.

    Good time to look at servers with two power supplies that split their power sources, eh? – Bart Silverstrim Sep 27 '10 at 15:21
  • ...or move/better protect the UPS. :-) – Brian Knoblauch Sep 27 '10 at 15:30
  • Perhaps they should have changed something the first time it happened... or not given a clumsy 15-year-old access to the server room. – Iain Sep 27 '10 at 17:25
    Maybe the boss was more forgiving because at some level he knew they should have done all the above by now. Of course, the reason always falls into something along the lines of budgets and any of a number of other excuses why doing the right thing is impossible... – Bart Silverstrim Sep 27 '10 at 19:21

nice question indeed..

I have tons of experience from other mistakes because of which I even got fired.

I'm a PHP web developer with 6 years of experience. This was the first job where I used an MVC framework. I really liked patterns a lot, and I got through a contest against 55 other programmers!!! The framework was written by the company itself. It was really well documented and fairly easy to learn, and the company had automated its whole workflow well. On the first day, of course, I set up my computer, and tons of user logins were created for mail, Redmine, the dev wiki, etc. The lead programmer talked with me about their system for about an hour every day, explaining the basics and the rules they use. This was really useful; we had the same outlook on most IT stuff, so it was easy for me to integrate.

I wrote my first module in 2 weeks, and the next one in a few days. WHY? BECAUSE they had very strict rules on the backend (administration) side, while the frontend was really ugly and not so strict at all. I spent 2 weeks on frontend AJAX bullshit and 1 day on the backend. As a result, I wrote the next 5 models in just ONE WEEK! The week after that, the main developer even allowed me to work on the framework core and fix some old problems there, but the director was still disappointed in me, because it turned out he had been watching me during those 2 weeks and everything was reported to him.

If the main developer had not stupidly held back the BACKEND rules and had given them to me FIRST, I would have done 10x more work than all 3 of the other programmers!! When I tried to explain this to the director, he said that he couldn't take a risk more than once. He wrote me a recommendation about my "nice code", paid me for the second month, and I got fired.

Why should new programmers suffer for company mistakes?

