State of IT

A series of articles focusing on the current state of IT and my thoughts on living up to expectations

State of IT Part 7: IT in Video Game Studios

Why helpdesk fundamentals are not enough in an industry where the entire technology stack can shift twice in a decade.

In the previous installments of this series, we have discussed efficiency during lean periods, understanding colleague workflows, security posture, operational resilience, AI guardrails, and the balance between innovation and stability. These topics apply broadly across industries. But there is one sector where every single one of those themes converges with an intensity that few other environments can match.

Video game development.

This is an industry that builds some of the most technically demanding products in the world, products that ship to millions of consumers simultaneously, yet often treats IT as an afterthought. The expectation in many studios is that IT exists to set up machines, reset passwords, and keep the Wi-Fi running. That expectation is not only outdated. It is actively harmful to production.

Not Just Another Tech Company

From the outside, video game studios look like any other technology company. Developers write code. Artists use workstations. Designers collaborate in shared tools. There are servers, networks, and cloud subscriptions.

The resemblance is surface-level.

Underneath, a game studio operates more like a film production crossed with a software engineering firm, running on timelines dictated by hardware manufacturers, platform holders, and a consumer market that has no patience for technical excuses. The production pipeline in a game studio is not a simple sequence of inputs and outputs. It is a dense, interconnected web of proprietary tools, middleware, engine builds, asset management systems, version control at massive scale, render farms, build distribution platforms, QA infrastructure, and live service backends. All of it moving in parallel. All of it interdependent.

An IT team that does not understand this pipeline is not supporting the studio. It is merely occupying space within it.

The Production Pipeline Is the Product

In most industries, IT supports the business process. In game development, IT is embedded within the product pipeline itself. Consider the chain of dependencies in a single day of production at a mid-to-large studio:

  1. Artists check in high-resolution assets through version control systems like Perforce, often pushing hundreds of gigabytes per day across distributed teams.
  2. Those assets are ingested by the game engine's build pipeline, which compiles, cooks, and packages them for target platforms.
  3. Build servers run continuous integration, producing testable builds for QA, design, and leadership review.
  4. QA teams deploy those builds to dev kits, test hardware, and cloud-streaming environments.
  5. Multiplayer engineers rely on backend services, databases, and matchmaking infrastructure that must mirror production environments.
  6. Live operations teams monitor telemetry, player data, and service health in real time once the game ships.

If any single link in this chain breaks, the downstream effect is not a minor inconvenience. It is a production stoppage. A failed build server can halt an entire studio's daily progress. A misconfigured Perforce proxy can turn a ten-second file sync into a twenty-minute ordeal, multiplied across hundreds of users. A network bottleneck during asset ingestion can delay milestone submissions by days.
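
The arithmetic behind that proxy example is worth making concrete. The numbers below are hypothetical, but they show how a per-sync slowdown compounds into studio-wide lost time:

```python
# Back-of-the-envelope cost of a misconfigured proxy (hypothetical numbers).

HEALTHY_SYNC_S = 10          # a healthy file sync, in seconds
DEGRADED_SYNC_S = 20 * 60    # the same sync through a misconfigured proxy
SYNCS_PER_DAY = 8            # syncs per person per day (assumed)
AFFECTED_USERS = 200         # users behind the broken proxy (assumed)

def lost_hours_per_day(healthy_s, degraded_s, syncs, users):
    """Extra waiting time across the studio, in person-hours per day."""
    extra_per_sync = degraded_s - healthy_s
    return extra_per_sync * syncs * users / 3600

hours = lost_hours_per_day(HEALTHY_SYNC_S, DEGRADED_SYNC_S, SYNCS_PER_DAY, AFFECTED_USERS)
print(f"{hours:.0f} person-hours lost per day")
```

Even with conservative assumptions, a single misconfigured proxy costs hundreds of person-hours for every day it goes unnoticed.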

IT teams that view their role as separate from this pipeline will consistently be blindsided by the urgency and complexity of the problems they are asked to solve.

The Tectonic Shifts: How the Ground Moves Twice a Decade

Most industries experience technological change gradually. New software versions roll out. Cloud migrations happen over quarters or years. Hardware refreshes follow predictable depreciation cycles.

Video games do not operate on that cadence.

The games industry is tethered to hardware generations and platform evolution in a way that few other sectors are. When a new console generation launches, when a new graphics API becomes standard, when a major engine overhauls its rendering pipeline, the ripple effects are seismic. These shifts tend to arrive roughly every five to seven years, meaning that within a single decade, the foundational technology that a studio's entire workflow depends upon can change fundamentally. Twice.

This applies to mobile studios just as acutely, though the shifts take a different shape. Mobile game development is governed by the release cycles of Apple and Google. A single iOS update can deprecate rendering frameworks, change how push notifications behave, or alter memory management rules overnight. Android fragmentation introduces its own layer of complexity, where a game must perform acceptably across thousands of device configurations with wildly different chipsets, screen resolutions, and OS versions. When Apple transitioned from OpenGL ES to Metal, or when Google began enforcing 64-bit requirements and target API level mandates, studios that were not prepared lost weeks of production time scrambling to comply.

Consider what has shifted in the last decade alone:

  • Console generations transitioned from the PS4/Xbox One era to the PS5/Xbox Series generation, requiring entirely new dev kit infrastructure, updated SDKs, and new build configurations.
  • Mobile platforms moved through multiple seismic shifts: the deprecation of OpenGL ES in favour of Metal and Vulkan, mandatory 64-bit support, App Tracking Transparency upending analytics and monetisation pipelines, and increasingly aggressive background process restrictions that changed how live games maintain persistent connections.
  • Game engines have moved from largely offline, packaged-build models to live-service, always-connected architectures requiring persistent backend infrastructure. For mobile studios, this shift was not optional. The free-to-play model that dominates mobile demands live operations from day one.
  • Asset fidelity has increased exponentially, with photogrammetry, volumetric capture, and procedural generation placing massive new demands on storage, networking, and compute. Even mobile titles now ship with gigabytes of downloadable assets and require robust CDN strategies for over-the-air content delivery.
  • Remote and distributed development, accelerated by the pandemic, has become a permanent fixture, requiring studios to rethink VPN architecture, remote workstation access, and globally distributed build systems.
  • AI-assisted workflows for content generation, testing, and localisation have begun entering production pipelines, and studios are still determining what infrastructure, governance, and access controls these tools require.

Each of these shifts does not merely add to the existing workload. It restructures it. The IT team that was expertly managing on-premise Perforce servers in 2018 may now need to architect hybrid cloud-edge solutions for globally distributed teams. The mobile studio IT team that once maintained a handful of Mac Minis for iOS builds may now be managing a fleet of Apple Silicon build agents, Android signing infrastructure across multiple keystores, and automated submission pipelines to both app stores simultaneously.

This is not incremental change. It is periodic reinvention.

The Knowledge Gap: Helpdesk Fundamentals Are Not Enough

There is nothing wrong with strong helpdesk skills. Provisioning accounts, imaging machines, managing device inventories, and handling break-fix tickets are all necessary functions. They are the foundation. But in a game studio, they are only the foundation.

The challenge is that many studios, particularly smaller or mid-sized ones, hire IT staff with generalist backgrounds and expect them to operate in an environment that demands specialist knowledge. This is especially common in mobile studios, where the early-stage team is small enough that IT responsibilities are shared informally or handled by a single person wearing multiple hats. The result is a persistent knowledge gap that only becomes visible when it is already causing damage.

An IT administrator in a game studio needs to understand, at minimum:

  1. Version control at scale. Perforce is the industry standard for large binary assets in console and PC development. Mobile studios often start with Git or Git LFS, which works adequately for a single small project but begins to strain under the weight of multiple concurrent titles with large asset repositories. Understanding when and how to migrate, or how to manage branching strategies across several live projects sharing common frameworks, is critical knowledge that a generalist background does not provide.
  2. Build infrastructure. Whether it is Jenkins, TeamCity, Unreal's BuildGraph, Fastlane for mobile, or a custom system, IT must understand how builds are compiled, distributed, and validated. In mobile studios, build infrastructure carries additional complexity: iOS builds require macOS hardware, Android builds require managing SDK versions and NDK configurations, and both platforms demand code signing workflows that are fragile and poorly documented. A build engineer and an IT administrator in this industry share a significant overlap in responsibilities.
  3. Workstation specifications and GPU workflows. Artists, programmers, and technical artists have workstation requirements that are fundamentally different from a standard corporate environment. Mobile studios may underestimate this, assuming that because the target device is a phone, the development hardware can be modest. This is a misconception. Authoring content for mobile still demands capable workstations, and the testing matrix of physical devices that IT must procure, manage, charge, update, and distribute across QA teams is a logistical challenge unto itself.
  4. Network architecture for high-throughput environments. The volume of data moving through a studio's network, including asset syncs, build distribution, render output, and telemetry streams, dwarfs typical enterprise traffic. Network design must account for this or production suffers.
  5. Platform-specific compliance and security. Console development requires adherence to strict NDAs and security requirements from platform holders like Sony, Microsoft, and Nintendo. Mobile development carries its own compliance burden: App Store review guidelines that change without warning, Google Play policy updates that can pull a live game from the store, privacy regulations that affect SDK integration, and the constant management of provisioning profiles, certificates, and entitlements that silently expire and break builds at the worst possible moment. IT must understand these requirements at a level that goes well beyond standard corporate policy.
  6. Confidentiality beyond the platform holders. Studios also manage NDA and confidentiality obligations with middleware providers, outsourcing partners, and service vendors. An IT team must understand which tools and environments are subject to these agreements, and ensure that access provisioning, data handling, and network segmentation reflect those contractual boundaries. A vendor NDA breach caused by misconfigured access is not a hypothetical. It is a career-ending event for the people responsible.

None of this is exotic knowledge. But it is specialised, and it is rarely part of a traditional IT training path. The expectation that a generalist helpdesk background prepares someone for this environment is one of the most common and most costly misconceptions in the industry.
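
The certificate and profile expiry problem in point 5 is one of the easiest to automate away. The sketch below assumes a hardcoded inventory of names and expiry dates; a real script would pull these from Apple's developer portal or the studio's keystore records rather than a dict:

```python
# Sketch: flag signing certificates and provisioning profiles that are about
# to expire. Names and dates are illustrative assumptions.
from datetime import date, timedelta

def expiring_soon(assets, today, warn_days=30):
    """Return asset names whose expiry falls within the warning window."""
    cutoff = today + timedelta(days=warn_days)
    return sorted(name for name, expiry in assets.items() if expiry <= cutoff)

signing_assets = {
    "ios-distribution-cert": date(2025, 7, 1),
    "ios-adhoc-profile": date(2025, 6, 10),
    "android-upload-keystore": date(2027, 1, 15),
}

print(expiring_soon(signing_assets, today=date(2025, 6, 5)))
# → ['ios-adhoc-profile', 'ios-distribution-cert']
```

A job like this, run nightly and wired to the team's chat channel, turns the "silently expire and break builds" failure mode into a routine renewal ticket.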

The Scaling Problem: From One Project to Many

Perhaps nowhere is the gap between generalist IT and production-aware IT more painfully exposed than in mobile studios that experience rapid growth.

The pattern is familiar. A studio launches with a single game. The team is small. Infrastructure is lean, often held together with a combination of cloud services, manual processes, and institutional knowledge stored in a few people's heads. IT, if it exists as a distinct function at all, is reactive and informal. Tickets are Slack messages. Documentation is sparse. It works because the scale is manageable.

Then the game succeeds.

Revenue comes in. The studio greenlights a second project. Then a third. Hiring accelerates. Suddenly there are multiple teams, each with different engine versions, different backend stacks, different build requirements, and different release cadences. The infrastructure that comfortably supported thirty people working on one game cannot support a hundred and fifty people working on four.

This is where the cracks appear.

  1. Identity and access management becomes tangled. What started as a flat permission structure with everyone having access to everything must now be segmented by project, by discipline, by seniority. Platform holder NDAs may require that only specific employees can access certain repositories or dev kits. Onboarding a new hire used to take an afternoon. Now it takes days because nobody has documented which groups, tools, licences, and environments each role requires.
  2. Access drift goes unmonitored. Equally important is what happens after access is granted. Without continuous access monitoring, permissions accumulate and drift. An artist who moved from Project A to Project B six months ago may still have write access to both repositories. A contractor whose engagement ended may still have active credentials. Access reviews in a fast-moving studio feel like overhead until the audit, or the breach, arrives. Automated access monitoring and periodic entitlement reviews are not bureaucratic exercises. They are the minimum standard for a studio handling multiple projects under separate NDAs and compliance requirements.
  3. Build infrastructure does not scale linearly. A single build pipeline for one project is straightforward. Four concurrent pipelines, each with their own platform targets, signing configurations, and release branches, competing for the same build agents and artefact storage, is an entirely different problem. Build queues back up. Developers wait. Production slows.
  4. Tooling sprawl accelerates. Each new project team brings preferences. One team uses Jira, another prefers Linear. One team deploys backends on AWS, another inherited a GCP setup. Without intentional governance, the tool landscape fragments, and IT is left supporting an ever-expanding matrix of platforms with no standardisation and no leverage.
  5. Live operations multiply the surface area. A single live game requires monitoring, incident response, content deployment, and player-facing service management. Multiple live games multiply all of this. Each game has its own release calendar, its own event schedule, its own critical revenue periods. An outage during a limited-time event in one game is a revenue loss measured in real currency. IT must ensure that the infrastructure supporting these services is resilient, observable, and independently manageable.
  6. Technical debt compounds invisibly. The shortcuts that were acceptable at a smaller scale, such as hardcoded configurations, manual deployment steps, and undocumented server setups, become liabilities. But there is rarely a mandate to address them because leadership is focused on shipping the next game. IT inherits this debt whether or not it was involved in creating it.
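
The permission-drift problem described above lends itself to simple automation. The sketch below assumes access grants and project assignments are available as plain mappings; in practice they would come from the identity provider and the version control server's protection tables:

```python
# Sketch of an access-drift review: compare what each person can currently
# touch against what their current assignment says they should touch.
# The data shapes are assumptions for illustration.

def find_drift(granted, assignments):
    """Return (user, resource) pairs where access outlives the assignment."""
    stale = []
    for user, resources in granted.items():
        allowed = assignments.get(user, set())   # empty set: offboarded user
        for resource in sorted(resources):
            if resource not in allowed:
                stale.append((user, resource))
    return stale

granted = {
    "artist_a": {"project_a_depot", "project_b_depot"},  # moved teams, kept old access
    "contractor_x": {"project_a_depot"},                 # engagement ended
}
assignments = {"artist_a": {"project_b_depot"}}

print(find_drift(granted, assignments))
```

Even a crude comparison like this, run weekly, surfaces the stale artist access and the lingering contractor credentials long before an audit does.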

The studios that navigate this transition successfully are the ones where IT is involved early in the scaling conversation. Not after the third project has been greenlit and the infrastructure is already straining, but at the point where growth is being planned. IT needs a seat in that room, not to slow things down, but to ensure that the foundation can support what is being built on top of it.

The Cultural Disconnect

There is often a cultural gap between IT departments and production teams in game studios. Developers, artists, and designers are accustomed to working with cutting-edge technology. They push hardware to its limits. They customise their tools extensively. They expect rapid iteration and minimal friction.

In mobile studios, this culture runs particularly hot. The pace of live operations means that production teams are accustomed to shipping updates weekly, sometimes more frequently. They expect environments to be available, builds to be green, and deployments to be seamless. When IT introduces process — change windows, approval gates, access reviews — it can feel like friction being imposed by people who do not understand the urgency.

IT teams that approach this environment with a rigid, policy-first mindset will encounter resistance. Not because production teams are undisciplined, but because the nature of creative production demands flexibility that traditional IT governance models do not always accommodate.

This does not mean security and process should be abandoned. Far from it. As we discussed in Part 3 and Part 5 of this series, security posture and AI governance are non-negotiable. But the approach must be adapted to the context. Lockdown policies that work in a financial services firm will strangle a game studio. Approval workflows designed for quarterly software deployments will be incompatible with a production environment that deploys internal builds multiple times per day and pushes live content updates to millions of players on a weekly cadence.

The most effective IT teams in game studios are those that earn their seat at the production table. They attend sprint reviews. They understand milestone deliverables. They know what "alpha", "beta", and "gold master" mean in console development and what "soft launch", "global launch", and "LiveOps calendar" mean in mobile. They understand that a store submission deadline is not a suggestion. They are not waiting for tickets to arrive. They are anticipating the needs before they become blockers.

Building for the Next Shift

Given that the technology landscape in games will continue to shift, the question is not whether the next disruption is coming. It is whether IT is prepared to absorb it without falling behind.

For mobile studios, the next shifts are already visible on the horizon. Platform holders are tightening privacy controls further. Cross-play and cross-progression between mobile and other platforms are becoming player expectations. Cloud gaming is blurring the line between mobile and console entirely. AI-driven content pipelines are promising to accelerate production but introducing new infrastructure requirements and governance questions that most studios have not yet answered.

As AI tools enter the production pipeline, studios need clear policy frameworks governing their use: what data can be fed into third-party models, how generated assets are reviewed for IP compliance, and who approves the integration of new AI services into production workflows. IT is uniquely positioned to enforce these frameworks at the infrastructure level, controlling which services are accessible, how data flows between internal systems and external APIs, and ensuring that usage is logged and auditable. Without this, AI adoption becomes another vector for shadow IT, as discussed in Part 5 of this series.
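
At the infrastructure level, that enforcement can start very simply: an egress allow-list with an audit trail. The approved domains and the log format below are illustrative assumptions, not a definitive implementation:

```python
# Sketch: decide per destination whether an AI service is approved, and log
# every decision so usage stays auditable. Domains are assumptions.
import json

APPROVED_AI_DOMAINS = {"api.openai.com", "api.anthropic.com"}

def check_egress(user, domain, audit_log):
    """Return whether the connection is allowed, recording the decision."""
    allowed = domain in APPROVED_AI_DOMAINS
    audit_log.append(json.dumps(
        {"user": user, "domain": domain, "allowed": allowed}))
    return allowed

log = []
check_egress("dev01", "api.openai.com", log)      # approved service
check_egress("dev02", "unvetted-ai.example", log)  # blocked service
print(log[-1])
```

In a real deployment the decision would live in a forward proxy or DNS filter, but the principle is the same: every AI-bound request is either on the list or in the log as a denial.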

Preparation means investing in several areas:

  1. Modular infrastructure. Design systems that can be reconfigured without being rebuilt from scratch. Containerised build environments, infrastructure-as-code, and abstracted storage layers all contribute to adaptability. For studios running multiple live games, modular infrastructure also means shared services, such as centralised authentication, common monitoring stacks, and unified artefact repositories, that reduce duplication without creating dangerous single points of failure.
  2. Continuous learning. IT staff in game studios must be given time and resources to stay current with engine updates, platform SDK changes, and emerging tools. In mobile, this includes staying ahead of Apple's WWDC announcements, Google Play policy updates, and the evolving landscape of ad mediation, analytics, and attribution SDKs that live games depend upon. This is not a luxury. It is an operational necessity.
  3. Cross-functional relationships. IT should have direct lines of communication with technical directors, pipeline engineers, and production managers. When IT understands what production is building toward, it can provision proactively rather than reactively. In a multi-project studio, this means IT should have visibility into each project's roadmap, not just its current ticket queue.
  4. Documentation and knowledge transfer. The institutional knowledge of how a studio's pipeline works is often held by a handful of senior engineers. IT should actively participate in documenting these systems so that support continuity does not depend on individual availability. This is doubly important in fast-growing studios where the people who built the original infrastructure are increasingly consumed by the demands of the newest project and unavailable to support the systems they created.

A Note on Recognition

There is an uncomfortable truth worth stating plainly. IT in video game studios is frequently under-resourced, under-recognised, and under-represented in production decisions. Studios will spend millions on user acquisition campaigns and proprietary engine features while running their IT operations on minimal staff and constrained budgets. Mobile studios are especially prone to this because the perceived simplicity of the platform, captured in phrases such as "it is just a mobile phone game" or "these are just casual games", masks the genuine complexity of the infrastructure required to develop, deploy, and operate live games at scale.

This is a structural problem, not an individual one. And it will not change until IT teams demonstrate, consistently, that they understand the production pipeline deeply enough to be considered part of it. This is not about seeking validation. It is about earning the influence needed to make infrastructure decisions that serve the studio's long-term health rather than merely reacting to its short-term emergencies.

Understanding the pipeline is not a bonus qualification. It is the baseline.

The State of IT in Games

The video game industry is similar to other technology sectors in its reliance on infrastructure, security, and operational discipline. It is fundamentally different in its pace of change, the density of its production pipelines, and the degree to which its supporting technology can be reshaped by external forces outside the studio's control.

Mobile game development amplifies these characteristics. The release cycles are faster. The platform shifts are more frequent and less predictable. The scaling challenges are more abrupt. And the expectation that IT can simply "keep things running" without deeply understanding what "things" are and how they connect is more dangerous.

IT teams in this space cannot afford to be generalists who happen to work in games. They need to be technologists who understand game production. The distinction matters because when the next platform shift arrives, when the next engine overhaul lands, when the next wave of tooling transforms how content is created, when the studio's third or fourth live game goes into production and the infrastructure must absorb it without collapsing, it will be the IT teams that understood the pipeline who adapt. Everyone else will be scrambling.

That is the reality of IT in video game studios. The ground moves. The question is whether you are building on bedrock or sand.

State of IT Part 6: Balancing Innovation and Operational Stability

A quiet tension is building within most IT teams.

On one side, there is demand to innovate. Automate more. Integrate AI into workflows. Reduce headcount dependency. Move faster. Deliver more with less. On the other side, there is the unglamorous reality of keeping systems stable. Patch cycles. Identity hygiene. Backup validation. Endpoint drift. License audits. Incident response. The daily grind that nobody celebrates until it fails.

Innovation gets applause. Stability gets silence.

Yet stability is the foundation that makes innovation survivable.

The Illusion of Acceleration

We are in a time when leadership conversations are dominated by speed.

  1. How quickly can we deploy?
  2. How fast can we automate?
  3. How much AI can we embed?

The assumption is that acceleration equals progress. But acceleration without structural maturity creates fragility. If your identity architecture is inconsistent, automating access provisioning will compound those inconsistencies. If your asset inventory is incomplete, AI-driven analytics amplify blind spots. If your governance model is unclear, automation only accelerates chaos.

Innovation in this manner does not compensate for weak foundations. It exposes them.

Stability Is Not Resistance to Change

There is a misconception that teams focused on operational discipline are resistant to innovation.

This is rarely true.

The best operations teams understand a fundamental truth. Stability is not the opposite of innovation. It is the prerequisite for it.

Resilient systems allow experimentation. Documented processes allow safe iteration. Clear ownership allows confident delegation. When fundamentals are strong, innovation becomes additive. When fundamentals are weak, innovation becomes disruptive.

We need to establish system maturity and stability before we can iterate on or innovate beyond existing tools and structures.

The Cost of Ignoring the Base Layer

When innovation initiatives outpace functional stability, the symptoms appear gradually.

Small outages become recurring patterns. Security exceptions multiply. Access reviews become performative. Shadow IT grows quietly. Eventually, the organization does not suffer from a lack of innovation. It suffers from cumulative operational debt. IT then becomes reactive instead of strategic. Teams spend their time firefighting instead of designing. The irony is that the more an organization pushes for innovation without discipline, the less innovative it actually becomes.

A Practical Balance

Juggling innovation and stability does not require complex frameworks. It needs intentional sequencing.

First, define non-negotiables.

  1. Backup integrity.
  2. Identity hygiene.
  3. Patch compliance.
  4. Monitoring coverage.

These act as foundational controls.

Second, assess operational health before accelerating growth and experimentation. If your incident resolution time is unstable, automation should focus there first.

Third, introduce innovation in limited domains.

  1. Pilot AI in reporting before applying it to access control.
  2. Test automation in non-critical workflows before applying it to production pipelines.

Fourth, preserve human oversight. Automation decreases manual effort. It does not remove accountability. Innovation should feel like reinforcement, not replacement. This is where, in my humble opinion, most organizations fail.

Leadership Expectations and Reality

Many IT leaders are navigating expectations shaped by headlines rather than infrastructure realities. There is a belief that AI can replace inefficiency, that these tools can compensate for process gaps, and that digital transformation is primarily about platform adoption.

In practice, transformation is about discipline. It is about clarity in roles. It is about visibility in systems. It is about governance that scales. Technology accelerates what already exists.

If structure exists, it accelerates efficiency. If disorder exists, it accelerates instability.

The Human Element

There is another dimension that is often overlooked. Operational dependability is not purely technical.

It is cultural. Teams that value documentation. Teams that respect change control. Teams that escalate early rather than conceal mistakes. These are the teams that innovate sustainably.

When people feel pressured to deliver visible innovation at the expense of quiet stability work, corners are cut. Over time, trust erodes. The strongest IT environments are not the most automated. They are the most accountable.

Accountability > Automation.

Redefining Success

Perhaps the biggest shift required is revising how success is measured.

Not only by how many AI initiatives were launched.
Not only by how many systems were modernized.
But by how many incidents were prevented.
How many risks were mitigated before they happened.
How stable the environment remained during the transformation.

Innovation that destabilizes is not progress. It is a deferred cost.

The State of IT Today

We are not short of tools. We are not short of ambition. What many organizations lack is calibrated pacing.

Balancing innovation and stability is not about slowing down. It is about strengthening the base before increasing velocity. In these uncertain times, the temptation to move fast is understandable. The discipline to move deliberately is what will separate resilient IT teams from reactive ones.

Innovation should expand capability. Stability assures that expansion does not collapse under its own weight.

State of IT Part 5: Guardrails for the Generative Era

Building a Safety Net Against Unchecked AI Tool Usage in the Workplace

AI-powered tools are becoming part of everyday work. Companies now face a challenging balance. Productivity gains and creative leaps are appealing. But risks are real. Data could leak. Compliance could become a problem. To thrive, organizations must build strong protections around AI tool use. This is not just wise. It is necessary for business trust and continuity.

The Growing Attack Surface

AI adoption does not always start with leaders. Employees want to work faster and solve new problems. They may try generative AI tools before IT teams know about them. These tools include chatbots, code helpers, and quick image creators. The number and speed of new AI tools can quickly overwhelm old security methods.

Discovery: Shedding Light on Shadow AI

The first step is to see what is happening. You cannot protect what you cannot see. Some ways to find AI use include:

  • Watch network activity for connections to well-known AI services such as OpenAI, Midjourney, or Anthropic.
  • Scan devices to list browser extensions and desktop apps that use AI.
  • Ask employees through surveys or interviews. Sometimes, a simple question reveals hidden use cases.

There are also less common but important options:

  • Study internal messages for language patterns that suggest AI-generated content. Be sure to respect privacy.
  • Audit API keys. Track which keys are created and used for outside AI services.
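
The network-watching approach above can be sketched as a simple scan of resolver logs. The log shape (device, queried domain) and the domain list are assumptions for illustration; real data would come from the DNS resolver or firewall:

```python
# Sketch: count lookups of known AI service domains per device, from a
# DNS query log. Shapes and domains are illustrative assumptions.
from collections import Counter

KNOWN_AI_DOMAINS = {"api.openai.com", "www.midjourney.com", "api.anthropic.com"}

def shadow_ai_hits(dns_log):
    """Return a per-device count of lookups that match known AI services."""
    hits = Counter()
    for device, domain in dns_log:
        if domain in KNOWN_AI_DOMAINS:
            hits[device] += 1
    return hits

log = [
    ("workstation-17", "api.openai.com"),
    ("workstation-17", "api.openai.com"),
    ("laptop-03", "www.midjourney.com"),
    ("laptop-03", "intranet.local"),
]
print(shadow_ai_hits(log))
```

Even this crude tally answers the first question discovery needs answered: which machines are already talking to AI services, and how often.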

Monitoring and Control: Keeping AI Usage in Check

Discovery is just the beginning. The next step is to set up real oversight:

  • Use data loss prevention tools to flag or block uploads to AI services.
  • Limit who can use approved AI tools based on their job, project, or the type of data involved.
  • Create alerts for strange usage. For example, large data uploads or unusual access times.

More advanced controls include:

  • Make lists of allowed or blocked apps. Update these lists as new tools appear.
  • Use special firewalls or gateways that inspect AI traffic and enforce rules.
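
The alerting idea above, flagging large or oddly timed uploads, can be sketched as a pair of threshold rules. The thresholds and the event shape are assumptions:

```python
# Sketch: flag uploads to AI services that are unusually large or that
# happen outside working hours. Thresholds are illustrative assumptions.

MAX_UPLOAD_MB = 50
WORK_HOURS = range(8, 19)  # 08:00 to 18:59 counts as working hours

def flag_upload(size_mb, hour_of_day):
    """Return a list of alert reasons for one upload event (empty if normal)."""
    reasons = []
    if size_mb > MAX_UPLOAD_MB:
        reasons.append("large_upload")
    if hour_of_day not in WORK_HOURS:
        reasons.append("off_hours")
    return reasons

print(flag_upload(size_mb=300, hour_of_day=2))   # both rules trip
print(flag_upload(size_mb=4, hour_of_day=11))    # normal event, no alerts
```

Real DLP products apply far richer rules, but the value is the same: strange usage becomes a signal rather than something discovered after the fact.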

Blocking: When to Draw a Firm Line

Not all AI tools are safe. Some carry too much risk. To block these, try:

  • Blacklist specific websites or IP addresses to prevent devices from accessing risky AI services.
  • Blacklist certain domains and services entirely if you are unsure about the service provider’s business practices with regard to training their models.
  • Enforce browser rules that stop people from installing unapproved extensions.
  • Use mobile device management to limit AI access on both company and personal devices.

AI rules and laws change fast. Companies need to take several steps to ensure compliance and protection:

  • Map how data moves when AI tools are used. Make sure this meets privacy laws such as GDPR or CCPA.
  • Define clear request and approval procedures for the use of new AI tools.
  • Specify who can submit requests, how requests are submitted, and what information must be included in each request.
  • Check all AI vendors for strong security, privacy, and ethics.
  • Set approval criteria for new AI tools, including vendor security, cost, data handling, and ability to meet regulatory standards.
  • Identify automatic rejection criteria. For example, reject tools that cannot guarantee data residency, that do not grant proper intellectual property ownership, or that do not integrate with your identity provider.
  • Keep records of AI use and any exceptions to the rules.
  • Require enterprise-level review for significant decisions, such as tools that impact budgets, require integration with sensitive systems, or could create legal exposure.
  • Consider risks related to budget overruns, unclear ownership of created content, and the difference between code generation and art generation.
  • Ensure use cases align with the company’s strategy and legal requirements.
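A minimal sketch of the intake side of this process: validate that a request carries the required information, then apply the automatic rejection triggers before anything reaches a human reviewer. The field names and rejection criteria here are hypothetical examples of what a policy might encode.

```python
# Hypothetical sketch of triage for new AI tool requests: check required
# fields, then apply auto-reject triggers from the policy. Field names and
# criteria are illustrative assumptions.

REQUIRED_FIELDS = {"requester", "tool_name", "vendor", "data_classes", "use_case"}

def triage_request(request):
    """Return (status, details) for a new AI tool request dict."""
    missing = REQUIRED_FIELDS - request.keys()
    if missing:
        return "incomplete", sorted(missing)

    reasons = []
    if not request.get("data_residency_guaranteed"):
        reasons.append("no data residency guarantee")
    if not request.get("ip_ownership_granted"):
        reasons.append("no IP ownership of generated content")
    if not request.get("sso_integration"):
        reasons.append("does not integrate with the identity provider")

    if reasons:
        return "auto-reject", reasons
    return "needs-review", []  # passes triage; goes on to human review
```

Automating only the triage step keeps humans in the loop for judgment calls while eliminating the back-and-forth over incomplete or obviously disqualified requests.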

The Foundation: A Strong AI Usage Policy

Before encouraging AI-driven creativity, set clear rules. A good AI policy should cover practical steps for control and decision-making.

  • The types of AI tool use that are allowed or banned.
  • How data is handled at every stage, especially when it is sensitive or regulated.
  • Training for employees on risks and safe habits.
  • Steps for reporting problems or responding to AI misuse.
  • Who reviews and approves requests for new AI tools, with documented criteria covering compliance, costs, and potential risks.
  • Automatic rejection triggers, such as lack of data protection or IP ownership.
  • Periodic policy and tool reviews at the enterprise level.
  • Budget risks, ownership of generated intellectual property, and the distinction between code and creative content.

Creative Environments: Fostering Innovation with Boundaries

Creative teams need room to try new things. But they also need limits.

Consider:

  • Setting up sandboxes so AI experiments do not touch real business data.
  • Introducing a new pipeline, e.g., Sandbox, Build, Dev-Test, Pre-Production, Production, and Live, keeping the Sandbox stage similar to the Development pipeline but fully contained.
  • Giving trusted users more access while keeping checks in place.
  • Regularly reviewing both AI tools and the policy as technology changes.

Conclusion

AI tools can change organizations for the better. Without solid safeguards, they can also cause harm. The winners in the generative era will be the IT teams that combine smart discovery, careful monitoring, strong controls, and a clear policy. This approach allows for creativity while keeping risks low.

State of IT Part 4: Operational Resilience in an Era of AI, Automation & Expectations

As we move into 2026, the digital landscape is rapidly evolving, driven by the widespread adoption of artificial intelligence. If recent years have taught us anything, it’s that adapting to transformative technology is no longer optional but vital. In Part 1, we discussed efficiency in lean times; in Part 2, we explored understanding our colleagues’ workflows; and in Part 3, we examined elevating our security posture. Now, we must focus on how to evolve our operations by embracing AI and automation while ensuring business continuity amid constant disruption.

The New Operational Mandate

Today’s IT teams are more than support functions – they are the foundation upon which business outcomes depend. With the integration of AI and automation into daily operations, expectations have shifted: uptime is table stakes, efficiency is assumed, and innovation is demanded, often without increased resources. The ability to harness AI for predictive analytics, intelligent automation, and adaptive decision-making is now critical to resilience and business continuity.

Even small to mid-size organizations now face complexity that rivals that of large enterprises a decade ago. The question is no longer whether you are resilient – it’s how resilient you are, by design.

What Business Continuity Actually Means

Resilience goes beyond backups and firewalls. It’s about:

  1. System Continuity: Can critical services continue (or fail gracefully) during outages?
  2. Predictable Recovery: Do you know how long it will take to restore core functions when something goes wrong?
  3. Adaptive Capacity: Can your team learn from disruption and fortify weak points upstream?
  4. User Experience Stability: Are end users (internal and external) insulated from volatility as much as possible?

Resilience lives in architecture, process, and culture. In other words, it is our job to make sure the wheel keeps spinning.

Four Pillars of 2026 Business Continuity

1) AI-Assisted Observability and Response

AI and ML-based monitoring tools are no longer “nice to have”. They are essential for business continuity.

They enable us to:

  • Deliver results with a smaller, tighter team.
  • Spot abnormal patterns before they become outages.
  • Predict degradation based on trend data.
  • Automate initial diagnostics so human responders can concentrate on resolution.

However, automation without context can create noise. Ensure your AI tools are configured to minimize false positives and that human experts validate critical alerts. The synergy between AI-driven insights and human judgment strengthens operational reliability.
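One way to reduce the noise described above is to alert only on sustained deviations rather than single spikes. The sketch below flags a metric only after it stays beyond three standard deviations of a healthy baseline for several consecutive samples; the window size, threshold, and run length are illustrative assumptions, not tuned values from any real system.

```python
# Hypothetical sketch of noise-aware anomaly detection: alert only when a
# metric deviates from a healthy baseline for min_run consecutive samples,
# so a single spike does not page a human. Parameters are illustrative.
from statistics import mean, stdev

def sustained_anomalies(samples, window=10, threshold=3.0, min_run=3):
    """Return indices where the metric is anomalous min_run samples in a row."""
    baseline = samples[:window]          # assume the first window is healthy
    mu, sigma = mean(baseline), stdev(baseline)
    alerts, run = [], 0
    for i in range(window, len(samples)):
        is_anomaly = sigma > 0 and abs(samples[i] - mu) > threshold * sigma
        run = run + 1 if is_anomaly else 0
        if run >= min_run:
            alerts.append(i)
    return alerts
```

A real deployment would refresh the baseline over time, but even this fixed-baseline version illustrates the core idea: the run-length requirement is what filters transient spikes before a human ever sees an alert.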

2) Fail-Safe Automation

Many organizations have begun to discover that careless automation can lead to fragility. AI-powered automation should not just execute tasks; it should detect and respond to its own failures.

Build checks such as:

  • End-to-end validation after each automated workflow
  • Rollback triggers when thresholds are crossed
  • Simulation/testing environments that mirror production

In 2026, automation without error containment is a recipe for compound outages (self-inflicted disasters, in effect).
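The checks above boil down to one pattern: execute, validate end to end, and roll back on failure. A minimal sketch, assuming the step, validation, and rollback actions are supplied as callables (the function name and shape are illustrative, not a library API):

```python
# Hypothetical sketch of fail-safe automation: run a step, validate the
# result, and roll back automatically when validation fails or the step
# raises. The callables are supplied by the caller.

def run_with_rollback(step, validate, rollback):
    """Execute step(); if validate() fails or step() raises, undo via rollback()."""
    try:
        step()
        if validate():
            return "committed"
    except Exception:
        pass  # treat a raised error the same as a failed validation
    rollback()
    return "rolled-back"
```

Wrapping every automated workflow in this shape is what turns a bad change into a contained event instead of a compound outage.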

3) Decentralized Redundancy

Gone are the days when a single cloud region, a single identity provider, or a single primary data store was sufficient.

BC / Resilience planning in 2026 includes:

  • Multi-region deployments and backups
  • Identity and access alternatives (gateways, multi-auth setups)
  • Cross-provider failover plans

These strategies don’t always require sophisticated AI. Sometimes, even basic, well-documented failover playbooks paired with intelligent automation can significantly increase resilience.
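A basic failover playbook of the kind described above can be as simple as probing an ordered list of redundant endpoints and promoting the first healthy one. The endpoint names below are invented, and the health check is injected so the sketch stays self-contained:

```python
# Hypothetical sketch of a cross-provider failover playbook: try endpoints
# in priority order and return the first healthy one. Names are illustrative.

def select_endpoint(endpoints, is_healthy):
    """Return the first healthy endpoint, preferring earlier (primary) entries."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    raise RuntimeError("no healthy endpoint available; escalate to incident command")

PLAN = ["primary.region-a.example", "replica.region-b.example", "dr.other-cloud.example"]
```

The value here is not the code but the fact that the failover order is written down and executable, rather than living in one engineer's head.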

4) Human-Centric Resilience Training

While technology is crucial, people remain at the heart of resilience, even in an AI-driven environment. I wish more organizations would realize this sooner rather than later. Regular resilience drills, such as tabletop exercises where teams simulate incidents, help identify weaknesses that automated systems might miss.

Training should include:

  • Incident command roles
  • Communication rules
  • After-action reviews

A culture that normalizes incident simulation is far better prepared than one that treats outages as rare catastrophes. In other words, be ready to act on constant, smaller incidents rather than waiting for a rare, large-scale attack to surface.

The Invisible Imperative: Expectations Management

Resilience isn’t simply technical; it’s communicative. Too often, IT teams build great systems but fail to meet stakeholder expectations. This leads to eleventh-hour crisis mode when SLAs aren’t met, even if a system is technically sound.

You don’t need to promise perfection, but you do need to promise clarity on:

  • What IT can guarantee
  • What IT can reasonably aim for
  • What happens when assumptions break

Clarity creates calm, for the most part.

A Simple Starting Point

If you’re unsure where to begin in your organization, start with a single checklist:

  1. What are the top three services we cannot afford to lose?
    Document dependencies and failure modes for each.
  2. Do we have automated monitoring with useful alerts?
    If not, enable it. Even basic uptime and threshold alerts are a start.
  3. Have we practiced a recovery scenario in the last 90 – 120 days?
    If not, schedule one.
  4. Can non-technical teammates explain how to get help during an outage?
    If not, craft and distribute a simple internal guide.

These steps don’t require budget authorizations, only intention.
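Checklist item 2 does not require a monitoring product to get started. A minimal sketch of basic threshold alerts, with illustrative metric names and limits:

```python
# Hypothetical sketch of checklist item 2: basic threshold alerts with no
# dedicated monitoring product. Metric names and limits are illustrative.

THRESHOLDS = {"disk_used_pct": 90, "response_ms": 2000}

def check_thresholds(metrics, thresholds=THRESHOLDS):
    """Return alert messages for any metric exceeding its threshold."""
    return [
        f"{name}={value} exceeds limit {thresholds[name]}"
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    ]
```

Run on a schedule and wired to email or chat, even this much gives a small team early warning it previously lacked.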

Conclusion

In 2026, the State of IT is about more than keeping systems running; it’s about building adaptive, AI-enabled systems that continue to deliver business value even during disruption. Resilience is not the absence of risk; it’s the presence of preparation, adaptability, and the ability to leverage emerging technologies.

As we continue this series, I’ll be exploring specific architectural patterns, real infrastructure setup projects, and stories from teams that built IT infrastructure the hard way: with tight timelines, stringent budgets, and a lean team.

Stay curious, stay adaptive, and use any tool, including AI, to build resilience and stability. Proper automation helps small IT teams be more efficient despite the challenges. Just remember to be diligent with your configurations and human oversight.

State of IT Part 3: Navigating the Threat Landscape

Practical Security Steps for the Modern IT Administrator

As our digital ecosystem evolves, so do the tactics of malicious actors. Cybersecurity is now a fundamental part of every IT administrator’s role, not just a specialized concern for security teams. In this third installment of the State of IT series, we delve into the growing threats targeting both large enterprises and smaller environments, providing effective steps that even novice IT administrators can implement to enhance their security posture.

The Expanding Threat Landscape

Today’s threats are increasingly sophisticated, ranging from ransomware-as-a-service to phishing kits, supply chain attacks, and deepfake-driven social engineering. High-profile breaches may make the news, but many attacks succeed due to a lack of basic security practices.

The misconception that only large organizations are at risk is fading. Small businesses, remote work configurations, and poorly managed environments are increasingly vulnerable to attacks. In this landscape, even the simplest IT practices can offer substantial protection.

Five Simple but Effective Steps Every IT Admin Should Take

  1. Establish a Baseline Security Policy
    A basic security policy, even if it’s just one page, can define acceptable practices for your organization. Include requirements such as:
  • Mandatory use of strong, unique passwords.
  • Locking the screen after a period of inactivity.
  • Prohibiting certain software or plug-ins.
    Tools like Microsoft Intune, Google Workspace Admin Console, or open-source alternatives like Wazuh can help enforce these policies.
  2. Use Multi-Factor Authentication (MFA) Everywhere
    Credentials remain the primary target for attackers. Enabling MFA on all critical accounts and systems, such as email and admin dashboards, adds a vital layer of security. For smaller teams, services like Authy, Microsoft Authenticator, or Google Authenticator are straightforward to implement and train for.
  3. Harden the Network Perimeter
    Even without a dedicated security appliance, you can:
  • Disable unused ports.
  • Change default router credentials.
  • Segregate guest Wi-Fi from internal networks.
  • Use DNS filtering (e.g., Quad9, NextDNS, Cloudflare for Teams) to block known malicious domains.
    If your network includes a firewall such as FortiGate, pfSense, or OPNsense, ensure logging is enabled and alerts are configured for suspicious activity.
  4. Secure Endpoint Devices
    While EDR tools may not be feasible for all organizations, you still have options:
  • Uninstall unnecessary software.
  • Set devices to auto-lock after inactivity.
  • Disable USB autorun.
  • Use free or open-source tools like Malwarebytes, ClamAV, or osquery for regular endpoint scans and monitoring.
    Encourage regular updates for systems and software. Automating updates with tools such as Patch My PC can alleviate some of the burden.
  5. Prepare for the Worst – Backups and Incident Response
    Security isn’t solely about prevention; it’s also about recovery. Be sure that:
  • At least one automated, offline backup exists.
  • Admins know who to contact during an attack.
  • A simple “what to do if compromised” flowchart is available (even in print form).
    Open-source solutions like Duplicati or Restic, as well as platforms like Backblaze or Wasabi, can provide cost-effective and reliable backup options.
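For the backup point above, even a tiny freshness check beats silent rot: confirm that the newest backup is more recent than a maximum allowed age. The 24-hour limit and function names below are illustrative assumptions:

```python
# Hypothetical sketch of a backup freshness check: alert if the newest backup
# file is older than a maximum age. Limits and names are illustrative.
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # alert if the newest backup is over a day old

def newest_backup_age(mtimes, now=None):
    """Given backup file modification times, return the age of the newest one."""
    if not mtimes:
        raise RuntimeError("no backups found; treat as an incident")
    now = time.time() if now is None else now
    return now - max(mtimes)

def backup_is_fresh(mtimes, now=None, max_age=MAX_AGE_SECONDS):
    return newest_backup_age(mtimes, now) <= max_age
```

In practice you would feed this the modification times of your backup target directory and raise an alert, not just a boolean, but the principle stands: a backup nobody verifies is a backup that may not exist.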

Bonus Tip: Create a Culture, Not Just Controls
Regardless of how advanced your tools are, human error remains a significant vulnerability. Foster a culture of security awareness by:

  • Sharing quick security tips in team communications.
  • Explaining the rationale behind specific security measures.
  • Recognizing and rewarding secure behavior, particularly among non-technical staff.

Make security awareness engaging and relevant. Gamify the learning experience with “phishing simulations” using tools like GoPhish, and discuss actual incidents during team meetings.

Conclusion
As an IT administrator, you are tasked not only with resolving issues as they arise but also with preventing them from occurring in the first place. While security can seem overwhelming, it doesn’t have to be. By taking small, consistent steps towards fortifying your environment, you lay the foundation for long-term resilience.

Although we may not control external threats, we can manage our preparedness from within. Whether you are beginning your IT journey or leading a small team with limited resources, consistency, awareness, and a proactive mindset are crucial.

Stay vigilant, stay resilient, and continue to build systems that are worth protecting.

State of IT Part 2: Understanding Organizational Workflows

Alleviating common workflow blockers for colleagues and teams to increase overall process efficiency.

As we venture deeper into our series on enhancing operational efficiency within IT and system administration teams, it’s crucial to extend our focus beyond mere technology. To effectively support our colleagues in different departments, we must develop a thorough understanding of their day-to-day operations and the common challenges they encounter. This level of insight allows us to identify minor yet significant blockers, those small frustrations that can hinder productivity and morale. In addressing these challenges, we foster a collaborative environment that strengthens our organization as a whole.

Understanding the Workflow

Each department has its unique workflows and requirements, and by engaging with our peers, IT can gain valuable insight into their processes. This involves not only listening to their concerns but also observing how they interact with the tools and resources available to them. What often becomes clear is that it’s the minor issues, rather than major technical failures, that frequently impede progress.

Common Blockers and Solutions

  1. Formatting Requirements for Documents: Many teams regularly create reports or presentations that can be time-consuming to format. By creating and distributing standardized templates for common documents, IT can save teams valuable hours. For instance, designing a template for weekly status reports can help streamline the process and ensure consistency.
  2. Translation Needs for Recruitment Documents: In an increasingly diverse workforce, the need for translated materials can be a real challenge, especially in recruitment. IT can assist HR by facilitating the use of translation tools or integrating platforms that allow for quick and easy translations of key documents without disrupting workflow.
  3. Data Tabulation and Visualization in Excel/Word: Teams often find themselves spending excessive time organizing and visualizing data, which can detract from their core responsibilities. Providing training on Excel’s more advanced features or offering automated tools for data analysis can significantly enhance efficiency. Additionally, creating a library of pre-built macros for common tabulations can empower teams to handle data more effectively.
  4. Simplifying Approval Processes: Many departments encounter delays due to cumbersome approval workflows. IT can work collaboratively with these teams to streamline approval processes by leveraging digital signatures, automating notifications, or implementing workflow management tools that keep everyone informed and accountable.
  5. Improving Communication Channels: Oftentimes, miscommunication or lack of clarity around requests can lead to delays. By standardizing communication protocols or investing in project management platforms like Trello or Asana, IT can help ensure that tasks are clearly assigned and tracked, alleviating confusion and enhancing accountability.
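Blocker 3 in the list above is a good candidate for a small, reusable script in place of a hand-built Excel summary. A minimal sketch that tabulates rows by a chosen column, with illustrative column names:

```python
# Hypothetical sketch of a reusable "pre-built macro" equivalent: count rows
# grouped by a chosen column, the kind of repetitive summary teams often
# rebuild by hand. Column names are illustrative assumptions.
from collections import Counter

def tabulate_by(rows, key):
    """Count rows grouped by the given column, sorted by count descending."""
    counts = Counter(row[key] for row in rows)
    return counts.most_common()

tickets = [
    {"team": "HR", "status": "open"},
    {"team": "Finance", "status": "closed"},
    {"team": "HR", "status": "closed"},
]
```

Shared once, a helper like this saves every team that builds the same weekly tally by hand.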

Fostering Collaboration and Continuous Improvement

While addressing minor blockers may feel like a patchwork approach, these small improvements can have a substantial cumulative effect on overall productivity. When IT teams take the initiative to identify and resolve minor challenges, they demonstrate their commitment to facilitating smoother operations across the organization.

Moreover, by actively involving ourselves in the daily activities of other teams, we begin to cultivate a culture of collaboration. Regularly scheduled check-ins or workshops with different departments can open lines of communication and be instrumental in continuous improvement efforts.

Conclusion

In this second installment, we have explored the importance of fully understanding the daily workings of other teams within our organizations. IT can significantly enhance departmental productivity and collaboration by identifying common minor blockers and implementing targeted solutions. Our goal is not just to empower the IT department but to create a robust support system that augments the work of all teams. Small improvements earn major appreciation from the teams we support, and robust support structures are the key to effectively increasing the value of the IT team in any organization.