It's not often I talk about work, as my clients' and their clients' identities need to be respected and confidentiality maintained, but I can't really help discussing the problems of the models we have come to know and love.
Many companies now follow either the broad ISO/IEC 20000/ITIL model or something similar for their IT support. What's really critical, however, is not your model but that what you use and pay for covers your needs and your customers' needs, and has sufficient and appropriate support, whether you provide it through your own staff or buy the services elsewhere. Training and education are also very important, and I cannot stress enough the importance of managing your estate in real time, with full, up-to-date support contracts and proper cover for critical applications and services, and of keeping your staff, contractors and external providers informed about these and any changes.
It's not difficult to implement a proper change control process. Many years ago I worked for one of the biggest engineering firms in the world. For our UK IT division's internal IT support we had a full team doing change control. And despite our size, we didn't use fancy tools or software: we used a simple spreadsheet but implemented controls through a careful network of trust, meetings, communications and monitoring. It worked. From an audit perspective, of course, such a system would not be considered ideal nowadays. But there are many reasonably simple control and record-keeping systems for managing systems and users. SharePoint is an excellent way to implement a simple system, as it can manage versions and get sign-offs from users. Remedy is a very popular tool, but there are now many others which are not difficult to implement or use.
Either way, there is no excuse for slipshod controls and decaying services. What seems like an easy out can turn into a very expensive mistake. Take the example of a former client who milked their employees by under-staffing to the point that all of their staff left, leaving little meaningful documentation and a degraded, broken and decaying infrastructure. That is now going to cost huge amounts, because the only option was to outsource entirely and hope they could get a reasonable deal. That client ended up paying 100 euro plus VAT an hour just to get an onsite engineer for "emergency" support. And naturally, even the smartest of hands-on tech experts couldn't have covered the boundless scope of what their old staff had been expected to do. They ended up paying a pro rata rate of nearly 200k a year, which would have paid full salaries for 4 full-time permanent staff. Their "savings" programme of driving out their brightest and best ended up costing them enormous amounts just to tick over.
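The arithmetic behind that comparison can be sketched roughly as follows. The 100 euro hourly rate, the 200k annual bill and the headcount of 4 come from the example above; the assumption that the whole bill was charged at the hourly rate, and the figure of 40 hours a week for 48 weeks as a full-time year, are mine and purely illustrative.

```python
# Rough sketch of the outsourcing arithmetic. Figures marked "assumption"
# are illustrative and not taken from the client's actual contract.

HOURLY_RATE_EUR = 100           # from the text: 100 euro an hour, before VAT
ANNUAL_BILL_EUR = 200_000       # from the text: nearly 200k a year
PERMANENT_STAFF = 4             # from the text: 4 full-time permanent staff
FULL_TIME_HOURS_PER_YEAR = 40 * 48  # assumption: 40 h/week, 48 working weeks


def hours_bought(annual_bill: float, hourly_rate: float) -> float:
    """Engineer hours the annual bill buys, assuming it is all hourly work."""
    return annual_bill / hourly_rate


def engineer_years(hours: float,
                   full_time_hours: float = FULL_TIME_HOURS_PER_YEAR) -> float:
    """Convert a number of hours into full-time engineer-years."""
    return hours / full_time_hours


def salary_budget_per_head(annual_bill: float, headcount: int) -> float:
    """Salary each permanent member of staff could have been paid instead."""
    return annual_bill / headcount


if __name__ == "__main__":
    hours = hours_bought(ANNUAL_BILL_EUR, HOURLY_RATE_EUR)
    print(f"Hours of onsite cover bought: {hours:.0f}")
    print(f"Equivalent full-time engineers: {engineer_years(hours):.2f}")
    print(f"Salary budget per permanent head: "
          f"{salary_budget_per_head(ANNUAL_BILL_EUR, PERMANENT_STAFF):.0f} euro")
```

Under those assumptions the same money buys about one engineer's worth of hours from the outsourcer, against four permanent people in-house.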
How (not) to make savings that end up costing you a fortune
1. Not replacing obsolete software and hardware
Hardware outfits give you three years' warranty for a simple reason: after this time, the odds of hardware failure start to rise. Not only that, the cost of trying to source old parts rises exponentially after about 5 years from product launch, so many of the more savvy hardware houses no longer sell warranty once a product reaches 5 years. While many servers and PCs are actually quite good, there is a real risk that critical components may be very difficult to find once a product goes end of life. The best solution is to refresh your hardware on a 5-year cycle, and keep warranty in place for the full 5 years.
2. Allowing hardware to run out of warranty
This seems like a saving, but in fact it is not. HP, for example, has a policy whereby if you bring a device back into warranty in order to get a repair, they bill the cycle from the date the old warranty expired. So if you wait a year after the warranty expires and then renew for 3 years, it's backdated by 1 year, so you are basically penalised: you pay for three years and get two. If your hardware is out of warranty, don't use it for production.
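To make the backdating penalty concrete, here is a minimal sketch of the calculation as described above. It models only the rule as I have stated it (the new term runs from the old expiry date) and is not a vendor's actual billing logic; the function names and dates are made up for illustration.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def backdated_cover_end(expired_on: date, term_years: int) -> date:
    """End of cover when the new term is billed from the old expiry date."""
    return add_years(expired_on, term_years)


def usable_cover_days(expired_on: date, renewed_on: date, term_years: int) -> int:
    """Days of cover actually received after a late, backdated renewal."""
    end = backdated_cover_end(expired_on, term_years)
    return max((end - renewed_on).days, 0)


if __name__ == "__main__":
    expired = date(2020, 1, 1)   # hypothetical expiry date
    renewed = date(2021, 1, 1)   # renewed one year late
    paid_days = (backdated_cover_end(expired, 3) - expired).days
    print("Cover ends:", backdated_cover_end(expired, 3))
    print("Days paid for:", paid_days)
    print("Days of cover received:", usable_cover_days(expired, renewed, 3))
```

With these example dates, a three-year renewal bought a year late ends on 1 January 2023, so only two of the three paid years are usable.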
3. Using OEM versions of software
A lot of hardware comes with OEM, cut-down versions of good software, and sometimes hardware houses sell "deals" that roll software in with big hardware sales. Don't be fooled by this: if it's being given away cheaply, question where the cuts are being made to facilitate this. You also don't get proper support, and the licence may have a very limited timespan. Some versions are only cut-back editions of the full packages and don't fully cover your needs.
4. Over-relying on key staff
Staff who are overworked and undervalued will get pissed off and leave, simple as that. Worse still, they may cut corners, or put workarounds in place to avoid getting caught up in major issues. My last boss had a great description of one of these guys on a client site, where they had one guy looking after a huge estate by himself for several months. My boss said that what he was actually doing was holding the wardrobe doors shut when all the crap was inside waiting to burst out. Which it did, when the guy left before they'd even signed a support deal with an outsourcer, who was left with a total nightmare.
5. Poor management
There are two obvious ones here. Don't use promotion as a tool to get somebody out of the way or as a pedestal to put somebody on. IT can be complex and expensive, and so needs great managers with fine skills in negotiation, cost control and people management. They need to be great communicators and have a real commitment to service provision. Putting somebody into these positions to shove them out of another role is a major recipe for disaster. And do not, at the peril of disaster, hire local GAA players, no matter how much you want to play up your "local team player" image. IT needs sophistication and sharp modern business skills, not the small-team mindset of local amateur sports. Failed captains of once-great counties are obvious no-nos and should be avoided at all costs: if a county took 5 years to recover after one captain ruined its might, you certainly don't want the same person wrecking your IT services.
6. Training and documentation
You can't really have the former without the latter. The bottom line is this: know what you've got and what skills you need to support it, buy in what you cannot do to the level you require, keep records of gaps, and try to upskill existing staff who at least know your customers. Update all documentation at least annually and get it signed off by senior staff members. Most importantly, incorporate your change records into your main body of documentation so troubleshooters can fix problems that might have occurred as a result of a prior change.
7. Grant minimum security levels
People usually fuss about this when granting permissions to third parties, but third parties generally tend to forget what access they have and what they can do with it. The ones you need to watch are your own staff. This goes for all levels - from access to data suites to admin access on servers. Most guidelines recommend using security groups to control individual rights and never using local accounts to grant administrative permissions. Always remove leavers and movers from their old groups when roles change - you don't want them meddling with other people's work.
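Cleaning up after leavers and movers is easy to check once you have two lists: the groups each person should be in for their current role, and the groups they are actually in. The sketch below assumes you can export both as plain mappings; the user and group names are invented, and pulling the real data out of your directory service is deliberately left out.

```python
def stale_memberships(expected: dict[str, set[str]],
                      actual: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per user, the groups they hold but should no longer have.

    `expected` maps current staff to the groups their present role needs.
    `actual` maps every account to the groups it is really a member of.
    Accounts missing from `expected` are treated as leavers, so every
    group they still hold is reported.
    """
    stale = {}
    for user, groups in actual.items():
        extra = groups - expected.get(user, set())
        if extra:
            stale[user] = extra
    return stale


if __name__ == "__main__":
    # Hypothetical data: alice moved from finance to HR, bob has left.
    expected = {"alice": {"hr-users"}, "carol": {"server-admins"}}
    actual = {
        "alice": {"hr-users", "finance-users"},
        "bob": {"server-admins"},
        "carol": {"server-admins"},
    }
    for user, groups in sorted(stale_memberships(expected, actual).items()):
        print(user, "should be removed from", sorted(groups))
```

Run regularly, a report like this catches the mover who kept their old access as well as the leaver whose account was never tidied up.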
8. Force users to "own" their own data
This especially applies to mass file shares. Don't just keep increasing limits to accommodate lazy users: you'll quickly run out of space. Archive as much as possible.
9. Always keep a good backup
There is just no excuse for a poor backup policy. Use a managed solution if you have more than 20 servers, and check that your policies on retention and frequency match your business needs. Label and store media carefully and use a reputable offsite provider if you can. Don't rely on trust alone either: a former client of mine had their entire end-of-year set lost by a careless offsite provider.
10. Retire old stuff properly
Have a proper decommissioning process. Don't just power stuff off and leave it there. Destroy or securely wipe old disks, de-rack and remove old hardware, label it if it is available for reuse, and remove it from DNS, AD and your backup set. Take a good full backup before you go and manually expire unneeded backups (Commvault in particular sets automated retention policies that will keep the last 10 days of backups on retired servers unless you manually remove them). Note what licenses are in place, as some can be reused. Recover good hardware such as HBAs and disk arrays. Reuse of HBAs can actually be a cost saver.
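One simple way to stop steps being skipped is to keep the decommissioning list as data and check each retired server against it. This is only a sketch: the step names are my own wording of the points above, and in practice the "done" set would come from your change records rather than being typed in by hand.

```python
# Decommissioning steps, in the order they would normally be carried out.
DECOMMISSION_STEPS = (
    "take final full backup",
    "expire unneeded backups",
    "record reusable licences",
    "remove from DNS",
    "remove from AD",
    "remove from backup set",
    "wipe or destroy disks",
    "recover reusable parts (HBAs, disk arrays)",
    "de-rack and remove hardware",
    "label hardware available for reuse",
)


def outstanding_steps(done: set[str]) -> list[str]:
    """Return the steps not yet completed, preserving checklist order."""
    unknown = done - set(DECOMMISSION_STEPS)
    if unknown:
        # A misspelt step would otherwise be silently ignored.
        raise ValueError(f"Unknown steps: {sorted(unknown)}")
    return [step for step in DECOMMISSION_STEPS if step not in done]


if __name__ == "__main__":
    # Hypothetical server that was powered off and forgotten.
    done = {"take final full backup", "remove from DNS"}
    for step in outstanding_steps(done):
        print("TODO:", step)
```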
The main things I would focus on are keeping a proper register of hardware, including location, serials, warranty status, support information, ILO/DRAC details and software license information. Know who "owns" the service provided and keep this up to date. Minimise root and admin access. Monitor and document all changes. Don't leave broken stuff in a degraded state. Finally, make sure your staff are valued and rewarded because they are the ones who will get the work done.
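For anyone starting from nothing, a register along those lines could look roughly like this. The field names, the sample server and the 5-year refresh default are illustrative assumptions on my part; a spreadsheet or CMDB with the same columns does the same job.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Asset:
    """One row of a hardware register (field names are illustrative)."""
    name: str
    location: str
    serial: str
    purchased: date
    warranty_ends: date
    support_contact: str
    oob_address: str          # ILO/DRAC management address
    service_owner: str        # who "owns" the service this box provides
    licences: list[str] = field(default_factory=list)

    def in_warranty(self, today: date) -> bool:
        return today <= self.warranty_ends

    def refresh_due(self, today: date, cycle_years: int = 5) -> bool:
        """True once the asset has reached the refresh age in whole years."""
        age = today.year - self.purchased.year
        if (today.month, today.day) < (self.purchased.month, self.purchased.day):
            age -= 1
        return age >= cycle_years


def needs_attention(register: list[Asset], today: date) -> list[str]:
    """Names of assets that are out of warranty or due for refresh."""
    return [a.name for a in register
            if not a.in_warranty(today) or a.refresh_due(today)]


if __name__ == "__main__":
    # Hypothetical entry for demonstration only.
    register = [
        Asset("db01", "Rack 3, U12", "SN-EXAMPLE-1", date(2018, 1, 1),
              date(2023, 1, 1), "vendor helpdesk", "10.0.0.5", "Finance",
              ["database licence"]),
    ]
    print(needs_attention(register, date.today()))
```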