Vijay

State of IT Part 7: IT in Video Game Studios

Why helpdesk fundamentals are not enough in an industry where the entire technology stack can shift twice in a decade.

In the previous installments of this series, we have discussed efficiency during lean periods, understanding colleague workflows, security posture, operational resilience, AI guardrails, and the balance between innovation and stability. These topics apply broadly across industries. But there is one sector where every single one of those themes converges with an intensity that few other environments can match.

Video game development.

This is an industry that builds some of the most technically demanding products in the world, products that ship to millions of consumers simultaneously, yet often treats IT as an afterthought. The expectation in many studios is that IT exists to set up machines, reset passwords, and keep the Wi-Fi running. That expectation is not only outdated. It is actively harmful to production.

Not Just Another Tech Company

From the outside, video game studios look like any other technology company. Developers write code. Artists use workstations. Designers collaborate in shared tools. There are servers, networks, and cloud subscriptions.

The resemblance is surface-level.

Underneath, a game studio operates more like a film production crossed with a software engineering firm, running on timelines dictated by hardware manufacturers, platform holders, and a consumer market that has no patience for technical excuses. The production pipeline in a game studio is not a simple sequence of inputs and outputs. It is a dense, interconnected web of proprietary tools, middleware, engine builds, asset management systems, version control at massive scale, render farms, build distribution platforms, QA infrastructure, and live service backends. All of it moving in parallel. All of it interdependent.

An IT team that does not understand this pipeline is not supporting the studio. It is merely occupying space within it.

The Production Pipeline Is the Product

In most industries, IT supports the business process. In game development, IT is embedded within the product pipeline itself. Consider the chain of dependencies in a single day of production at a mid-to-large studio:

  1. Artists check in high-resolution assets through version control systems like Perforce, often pushing hundreds of gigabytes per day across distributed teams.
  2. Those assets are ingested by the game engine's build pipeline, which compiles, cooks, and packages them for target platforms.
  3. Build servers run continuous integration, producing testable builds for QA, design, and leadership review.
  4. QA teams deploy those builds to dev kits, test hardware, and cloud-streaming environments.
  5. Multiplayer engineers rely on backend services, databases, and matchmaking infrastructure that must mirror production environments.
  6. Live operations teams monitor telemetry, player data, and service health in real time once the game ships.
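The chain above can be sketched as a staged pipeline in which a failure anywhere halts everything downstream. This is a toy model for illustration, not any studio's actual tooling; the stage names are invented:

```python
# Toy model of a studio build pipeline: each stage depends on the
# previous one, so a single failure blocks all downstream work.

STAGES = [
    "asset_checkin",      # artists push assets to version control
    "cook_and_package",   # engine build pipeline processes assets
    "ci_build",           # build servers produce testable builds
    "qa_deploy",          # builds land on dev kits / test environments
    "backend_sync",       # multiplayer services mirror production
    "live_telemetry",     # live ops monitoring once the game ships
]

def run_pipeline(failing_stage=None):
    """Run stages in order; return (completed, blocked) stage lists."""
    completed, blocked = [], []
    failed = False
    for stage in STAGES:
        if failed:
            blocked.append(stage)        # never runs: upstream broke
        elif stage == failing_stage:
            failed = True
            blocked.append(stage)        # the stage that broke
        else:
            completed.append(stage)
    return completed, blocked

completed, blocked = run_pipeline(failing_stage="ci_build")
print(blocked)  # a broken build server blocks QA, backend and live ops
```

The point of the model is the asymmetry it makes visible: two stages complete, four are blocked. In a real studio those four blocked stages are whole departments waiting.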

If any single link in this chain breaks, the downstream effect is not a minor inconvenience. It is a production stoppage. A failed build server can halt an entire studio's daily progress. A misconfigured Perforce proxy can turn a ten-second file sync into a twenty-minute ordeal, multiplied across hundreds of users. A network bottleneck during asset ingestion can delay milestone submissions by days.

IT teams that view their role as separate from this pipeline will consistently be blindsided by the urgency and complexity of the problems they are asked to solve.

The Tectonic Shifts: How the Ground Moves Twice a Decade

Most industries experience technological change gradually. New software versions roll out. Cloud migrations happen over quarters or years. Hardware refreshes follow predictable depreciation cycles.

Video games do not operate on that cadence.

The games industry is tethered to hardware generations and platform evolution in a way that few other sectors are. When a new console generation launches, when a new graphics API becomes standard, when a major engine overhauls its rendering pipeline, the ripple effects are seismic. These shifts tend to arrive roughly every five to seven years, meaning that within a single decade, the foundational technology that a studio's entire workflow depends upon can change fundamentally. Twice.

This applies to mobile studios just as acutely, though the shifts take a different shape. Mobile game development is governed by the release cycles of Apple and Google. A single iOS update can deprecate rendering frameworks, change how push notifications behave, or alter memory management rules overnight. Android fragmentation introduces its own layer of complexity, where a game must perform acceptably across thousands of device configurations with wildly different chipsets, screen resolutions, and OS versions. When Apple transitioned from OpenGL ES to Metal, or when Google began enforcing 64-bit requirements and target API level mandates, studios that were not prepared lost weeks of production time scrambling to comply.

Consider what has shifted in the last decade alone:

  • Console generations transitioned from the PS4/Xbox One era to the PS5/Xbox Series generation, requiring entirely new dev kit infrastructure, updated SDKs, and new build configurations.
  • Mobile platforms moved through multiple seismic shifts: the deprecation of OpenGL ES in favour of Metal and Vulkan, mandatory 64-bit support, App Tracking Transparency upending analytics and monetisation pipelines, and increasingly aggressive background process restrictions that changed how live games maintain persistent connections.
  • Game engines have moved from largely offline, packaged-build models to live-service, always-connected architectures requiring persistent backend infrastructure. For mobile studios, this shift was not optional. The free-to-play model that dominates mobile demands live operations from day one.
  • Asset fidelity has increased exponentially, with photogrammetry, volumetric capture, and procedural generation placing massive new demands on storage, networking, and compute. Even mobile titles now ship with gigabytes of downloadable assets and require robust CDN strategies for over-the-air content delivery.
  • Remote and distributed development, accelerated by the pandemic, has become a permanent fixture, requiring studios to rethink VPN architecture, remote workstation access, and globally distributed build systems.
  • AI-assisted workflows for content generation, testing, and localisation have begun entering production pipelines, and studios are still determining what infrastructure, governance, and access controls these tools require.

Each of these shifts does not merely add to the existing workload. It restructures it. The IT team that was expertly managing on-premise Perforce servers in 2018 may now need to architect hybrid cloud-edge solutions for globally distributed teams. The mobile studio IT team that once maintained a handful of Mac Minis for iOS builds may now be managing a fleet of Apple Silicon build agents, Android signing infrastructure across multiple keystores, and automated submission pipelines to both app stores simultaneously.

This is not incremental change. It is periodic reinvention.

The Knowledge Gap: Helpdesk Fundamentals Are Not Enough

There is nothing wrong with strong helpdesk skills. Provisioning accounts, imaging machines, managing device inventories, and handling break-fix tickets are all necessary functions. They are the foundation. But in a game studio, they are only the foundation.

The challenge is that many studios, particularly smaller or mid-sized ones, hire IT staff with generalist backgrounds and expect them to operate in an environment that demands specialist knowledge. This is especially common in mobile studios, where the early-stage team is small enough that IT responsibilities are shared informally or handled by a single person wearing multiple hats. The result is a persistent knowledge gap that only becomes visible when it is already causing damage.

An IT administrator in a game studio needs to understand, at minimum:

  1. Version control at scale. Perforce is the industry standard for large binary assets in console and PC development. Mobile studios often start with Git or Git LFS, which works adequately for a single small project but begins to strain under the weight of multiple concurrent titles with large asset repositories. Understanding when and how to migrate, or how to manage branching strategies across several live projects sharing common frameworks, is critical knowledge that a generalist background does not provide.
  2. Build infrastructure. Whether it is Jenkins, TeamCity, Unreal's BuildGraph, Fastlane for mobile, or a custom system, IT must understand how builds are compiled, distributed, and validated. In mobile studios, build infrastructure carries additional complexity: iOS builds require macOS hardware, Android builds require managing SDK versions and NDK configurations, and both platforms demand code signing workflows that are fragile and poorly documented. A build engineer and an IT administrator in this industry share a significant overlap in responsibilities.
  3. Workstation specifications and GPU workflows. Artists, programmers, and technical artists have workstation requirements that are fundamentally different from a standard corporate environment. Mobile studios may underestimate this, assuming that because the target device is a phone, the development hardware can be modest. This is a misconception. Authoring content for mobile still demands capable workstations, and the testing matrix of physical devices that IT must procure, manage, charge, update, and distribute across QA teams is a logistical challenge unto itself.
  4. Network architecture for high-throughput environments. The volume of data moving through a studio's network, including asset syncs, build distribution, render output, and telemetry streams, dwarfs typical enterprise traffic. Network design must account for this or production suffers.
  5. Platform-specific compliance and security. Console development requires adherence to strict NDAs and security requirements from platform holders like Sony, Microsoft, and Nintendo. Mobile development carries its own compliance burden: App Store review guidelines that change without warning, Google Play policy updates that can pull a live game from the store, privacy regulations that affect SDK integration, and the constant management of provisioning profiles, certificates, and entitlements that silently expire and break builds at the worst possible moment. IT must understand these requirements at a level that goes well beyond standard corporate policy.
  6. Confidentiality obligations beyond platform holders. Studios also manage NDA and confidentiality agreements with middleware providers, outsourcing partners, and service vendors. An IT team must understand which tools and environments are subject to these agreements, and ensure that access provisioning, data handling, and network segmentation reflect those contractual boundaries. A vendor NDA breach caused by misconfigured access is not a hypothetical. It is a career-ending event for the people responsible.
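As one concrete example of item 5, silently expiring signing assets are cheap to guard against once someone owns the problem. A minimal sketch of an expiry sweep, with invented asset names and dates; in practice the inventory would come from the keychain, keystore metadata, or the developer portal rather than a hardcoded dictionary:

```python
from datetime import date, timedelta

# Hypothetical inventory of signing assets and their expiry dates.
SIGNING_ASSETS = {
    "ios_distribution_cert": date(2025, 3, 1),
    "appstore_provisioning_profile": date(2025, 1, 20),
    "android_upload_keystore": date(2030, 6, 15),
}

def expiring_soon(assets, today, warn_days=30):
    """Return asset names that expire within warn_days of today."""
    horizon = today + timedelta(days=warn_days)
    return sorted(name for name, expiry in assets.items()
                  if expiry <= horizon)

print(expiring_soon(SIGNING_ASSETS, today=date(2025, 1, 5)))
# flags the provisioning profile weeks before it breaks a build
```

Run nightly from any scheduler, a check like this turns "the build broke at the worst possible moment" into a routine ticket raised a month in advance.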

None of this is exotic knowledge. But it is specialised, and it is rarely part of a traditional IT training path. The expectation that a generalist helpdesk background prepares someone for this environment is one of the most common and most costly misconceptions in the industry.

The Scaling Problem: From One Project to Many

Perhaps nowhere is the gap between generalist IT and production-aware IT more painfully exposed than in mobile studios that experience rapid growth.

The pattern is familiar. A studio launches with a single game. The team is small. Infrastructure is lean, often held together with a combination of cloud services, manual processes, and institutional knowledge stored in a few people's heads. IT, if it exists as a distinct function at all, is reactive and informal. Tickets are Slack messages. Documentation is sparse. It works because the scale is manageable.

Then the game succeeds.

Revenue comes in. The studio greenlights a second project. Then a third. Hiring accelerates. Suddenly there are multiple teams, each with different engine versions, different backend stacks, different build requirements, and different release cadences. The infrastructure that comfortably supported thirty people working on one game cannot support a hundred and fifty people working on four.

This is where the cracks appear.

  1. Identity and access management becomes tangled. What started as a flat permission structure with everyone having access to everything must now be segmented by project, by discipline, by seniority. Platform holder NDAs may require that only specific employees can access certain repositories or dev kits. Onboarding a new hire used to take an afternoon. Now it takes days because nobody has documented which groups, tools, licences, and environments each role requires.
  2. Continuous access monitoring lapses. Equally important is what happens after access is granted. Without continuous monitoring, permissions accumulate and drift. An artist who moved from Project A to Project B six months ago may still have write access to both repositories. A contractor whose engagement ended may still have active credentials. Access reviews in a fast-moving studio feel like overhead until the audit, or the breach, arrives. Automated access monitoring and periodic entitlement reviews are not bureaucratic exercises. They are the minimum standard for a studio handling multiple projects under separate NDAs and compliance requirements.
  3. Build infrastructure does not scale linearly. A single build pipeline for one project is straightforward. Four concurrent pipelines, each with their own platform targets, signing configurations, and release branches, competing for the same build agents and artefact storage, is an entirely different problem. Build queues back up. Developers wait. Production slows.
  4. Tooling sprawl accelerates. Each new project team brings preferences. One team uses Jira, another prefers Linear. One team deploys backends on AWS, another inherited a GCP setup. Without intentional governance, the tool landscape fragments, and IT is left supporting an ever-expanding matrix of platforms with no standardisation and no leverage.
  5. Live operations multiply the surface area. A single live game requires monitoring, incident response, content deployment, and player-facing service management. Multiple live games multiply all of this. Each game has its own release calendar, its own event schedule, its own critical revenue periods. An outage during a limited-time event in one game is a revenue loss measured in real currency. IT must ensure that the infrastructure supporting these services is resilient, observable, and independently manageable.
  6. Technical debt compounds invisibly. The shortcuts that were acceptable at a smaller scale, such as hardcoded configurations, manual deployment steps, and undocumented server setups, become liabilities. But there is rarely a mandate to address them because leadership is focused on shipping the next game. IT inherits this debt whether or not it was involved in creating it.
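The access-drift problem in items 1 and 2 is mechanical enough to automate. A minimal sketch of an entitlement review, assuming grant records and project assignments can be exported from the identity provider; the users, depots, and dates here are invented:

```python
from datetime import date

# Hypothetical access grants: (user, resource, project, last_used).
GRANTS = [
    ("artist_a",     "projA_depot", "projA", date(2024, 6, 1)),
    ("artist_a",     "projB_depot", "projB", date(2025, 1, 2)),
    ("contractor_x", "projA_depot", "projA", date(2024, 3, 15)),
]

# Current project assignments; contractor_x's engagement has ended,
# and artist_a moved from Project A to Project B months ago.
ASSIGNMENTS = {"artist_a": "projB"}

def review_access(grants, assignments, today, stale_days=90):
    """Flag grants held by unassigned users or unused for stale_days."""
    flagged = []
    for user, resource, project, last_used in grants:
        if assignments.get(user) != project:
            flagged.append((user, resource, "not assigned to project"))
        elif (today - last_used).days > stale_days:
            flagged.append((user, resource, "unused"))
    return flagged

for finding in review_access(GRANTS, ASSIGNMENTS, today=date(2025, 1, 10)):
    print(finding)
```

The output is exactly the list a quarterly entitlement review would produce by hand: the artist's leftover Project A access and the departed contractor's live credentials, surfaced before an audit or a breach does it for you.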

The studios that navigate this transition successfully are the ones where IT is involved early in the scaling conversation. Not after the third project has been greenlit and the infrastructure is already straining, but at the point where growth is being planned. IT needs a seat in that room, not to slow things down, but to ensure that the foundation can support what is being built on top of it.

The Cultural Disconnect

There is often a cultural gap between IT departments and production teams in game studios. Developers, artists, and designers are accustomed to working with cutting-edge technology. They push hardware to its limits. They customise their tools extensively. They expect rapid iteration and minimal friction.

In mobile studios, this culture runs particularly hot. The pace of live operations means that production teams are accustomed to shipping updates weekly, sometimes more frequently. They expect environments to be available, builds to be green, and deployments to be seamless. When IT introduces process — change windows, approval gates, access reviews — it can feel like friction being imposed by people who do not understand the urgency.

IT teams that approach this environment with a rigid, policy-first mindset will encounter resistance. Not because production teams are undisciplined, but because the nature of creative production demands flexibility that traditional IT governance models do not always accommodate.

This does not mean security and process should be abandoned. Far from it. As we discussed in Part 3 and Part 5 of this series, security posture and AI governance are non-negotiable. But the approach must be adapted to the context. Lockdown policies that work in a financial services firm will strangle a game studio. Approval workflows designed for quarterly software deployments will be incompatible with a production environment that deploys internal builds multiple times per day and pushes live content updates to millions of players on a weekly cadence.

The most effective IT teams in game studios are those that earn their seat at the production table. They attend sprint reviews. They understand milestone deliverables. They know what "alpha", "beta", and "gold master" mean in console development and what "soft launch", "global launch", and "LiveOps calendar" mean in mobile. They understand that a store submission deadline is not a suggestion. They are not waiting for tickets to arrive. They are anticipating the needs before they become blockers.

Building for the Next Shift

Given that the technology landscape in games will continue to shift, the question is not whether the next disruption is coming. It is whether IT is prepared to absorb it without falling behind.

For mobile studios, the next shifts are already visible on the horizon. Platform holders are tightening privacy controls further. Cross-play and cross-progression between mobile and other platforms are becoming player expectations. Cloud gaming is blurring the line between mobile and console entirely. AI-driven content pipelines are promising to accelerate production but introducing new infrastructure requirements and governance questions that most studios have not yet answered.

As AI tools enter the production pipeline, studios need clear policy frameworks governing their use: what data can be fed into third-party models, how generated assets are reviewed for IP compliance, and who approves the integration of new AI services into production workflows. IT is uniquely positioned to enforce these frameworks at the infrastructure level, controlling which services are accessible, how data flows between internal systems and external APIs, and ensuring that usage is logged and auditable. Without this, AI adoption becomes another vector for shadow IT, as discussed in Part 5 of this series.
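The enforcement idea reduces to an allowlist with logging, whatever the actual proxy or firewall doing the work. A minimal sketch of the decision logic, with invented hostnames; a real deployment would sit at the network egress layer rather than in application code:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

# Hypothetical allowlist of vetted external AI endpoints; anything
# else is blocked and logged so adoption stays visible, not shadow IT.
APPROVED_AI_HOSTS = {"api.approved-vendor.example"}

def check_ai_egress(host, user):
    """Return True if the call is allowed; log the decision either way."""
    allowed = host in APPROVED_AI_HOSTS
    if allowed:
        logging.info("AI egress allowed: %s -> %s", user, host)
    else:
        logging.warning("AI egress blocked: %s -> %s", user, host)
    return allowed

check_ai_egress("api.approved-vendor.example", "tech_artist_1")  # allowed
check_ai_egress("api.unvetted-model.example", "tech_artist_1")   # blocked
```

The log line matters as much as the block: an auditable record of who tried to reach which service is what turns an AI policy document into something the studio can actually demonstrate compliance with.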

Preparation means investing in several areas:

  1. Modular infrastructure. Design systems that can be reconfigured without being rebuilt from scratch. Containerised build environments, infrastructure-as-code, and abstracted storage layers all contribute to adaptability. For studios running multiple live games, modular infrastructure also means shared services, such as centralised authentication, common monitoring stacks, and unified artefact repositories, that reduce duplication without creating dangerous single points of failure.
  2. Continuous learning. IT staff in game studios must be given time and resources to stay current with engine updates, platform SDK changes, and emerging tools. In mobile, this includes staying ahead of Apple's WWDC announcements, Google Play policy updates, and the evolving landscape of ad mediation, analytics, and attribution SDKs that live games depend upon. This is not a luxury. It is an operational necessity.
  3. Cross-functional relationships. IT should have direct lines of communication with technical directors, pipeline engineers, and production managers. When IT understands what production is building toward, it can provision proactively rather than reactively. In a multi-project studio, this means IT should have visibility into each project's roadmap, not just its current ticket queue.
  4. Documentation and knowledge transfer. The institutional knowledge of how a studio's pipeline works is often held by a handful of senior engineers. IT should actively participate in documenting these systems so that support continuity does not depend on individual availability. This is doubly important in fast-growing studios where the people who built the original infrastructure are increasingly consumed by the demands of the newest project and unavailable to support the systems they created.
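The single-point-of-failure caveat in item 1 can be made concrete with a trivial configuration check. The service names and replica counts below are illustrative; the same rule could run against a real infrastructure-as-code inventory:

```python
# Hypothetical shared-service inventory for a multi-project studio.
# A shared service with a single instance is a single point of failure
# for every project that depends on it.
SHARED_SERVICES = {
    "central_auth":   {"replicas": 2, "used_by": ["projA", "projB", "projC"]},
    "artifact_store": {"replicas": 1, "used_by": ["projA", "projB"]},
    "monitoring":     {"replicas": 3, "used_by": ["projA", "projB", "projC"]},
}

def single_points_of_failure(services, min_replicas=2):
    """Return shared services (used by >1 project) below the replica floor."""
    return sorted(
        name for name, cfg in services.items()
        if len(cfg["used_by"]) > 1 and cfg["replicas"] < min_replicas
    )

print(single_points_of_failure(SHARED_SERVICES))  # ['artifact_store']
```

A check like this belongs in the same CI that builds the games: consolidation decisions get reviewed as code, and the studio learns about the under-replicated artefact store before an outage takes two projects down at once.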

A Note on Recognition

There is an uncomfortable truth worth stating plainly. IT in video game studios is frequently under-resourced, under-recognised, and under-represented in production decisions. Studios will spend millions on user acquisition campaigns and proprietary engine features while running their IT operations on minimal staff and constrained budgets. Mobile studios are especially prone to this because the perceived simplicity of the platform ("it is just a mobile phone game", "these are just casual games") masks the genuine complexity of the infrastructure required to develop, deploy, and operate live games at scale.

This is a structural problem, not an individual one. And it will not change until IT teams demonstrate, consistently, that they understand the production pipeline deeply enough to be considered part of it. This is not about seeking validation. It is about earning the influence needed to make infrastructure decisions that serve the studio's long-term health rather than merely reacting to its short-term emergencies.

Understanding the pipeline is not a bonus qualification. It is the baseline.

The State of IT in Games

The video game industry is similar to other technology sectors in its reliance on infrastructure, security, and operational discipline. It is fundamentally different in its pace of change, the density of its production pipelines, and the degree to which its supporting technology can be reshaped by external forces outside the studio's control.

Mobile game development amplifies these characteristics. The release cycles are faster. The platform shifts are more frequent and less predictable. The scaling challenges are more abrupt. And the expectation that IT can simply "keep things running" without deeply understanding what "things" are and how they connect is more dangerous.

IT teams in this space cannot afford to be generalists who happen to work in games. They need to be technologists who understand game production. The distinction matters because when the next platform shift arrives, when the next engine overhaul lands, when the next wave of tooling transforms how content is created, when the studio's third or fourth live game goes into production and the infrastructure must absorb it without collapsing, it will be the IT teams that understood the pipeline who adapt. Everyone else will be scrambling.

That is the reality of IT in video game studios. The ground moves. The question is whether you are building on bedrock or sand.

To Serve a God

In the hallowed Halls of Anorra, everyone must follow the Clear Doctrine, the holy words of guidance that the Queen of Wails herself had put forth for her followers. These words were sculpted out of Ghise, an extremely valuable form of eternal ice. The massive, polished, twenty-foot-tall tablet stood proud at the entrance of the black ice structure dominating the Northern glacier of the world.

"Every visitor must be asked, politely if they are esteemed guests, and ordered if they are not, to read the words of the queen before they are allowed to enter."

Rillegh's calm yet clear voice gently echoed through the empty entrance of the palace. The old Elf was using a form of self-created magic to clean and shine the Ghise tablet. As usual, he was lecturing the younger members of the servant force on the proper etiquette to be followed inside the Halls. Most of the people standing around the old Elf and nodding already knew his entire speech, and some could even recite it in reverse. But none would dare skip the daily speech given by old Rillegh.

Despite his overly fussy attitude when it came to the Doctrine and staff manners, he was considered the most knowledgeable of the servant force in matters pertaining to the queen, everyday magic and conduct.

A few ancient mages and wizards among the Halls' staff knew him by an old title, the Western Whirlwind of the Black Feather. Rillegh would laugh gently if someone mentioned it to him these days. He had completely surrendered himself to the service of the queen once he had witnessed her power in person.

If one could observe the movements of the staff force around the massive structure throughout a day, they would see the orderly manner in which each and every member moved. Their movement patterns seemed predefined, their tasks seemed preordained, and every single person seemed to know where to move next.


A young boy of fifteen was sweating profusely as he lay on his small feather bed, inside his allocated servant room. He could not believe his rotten luck, and the unexpectedly strange path his life had taken in the last year. His seemingly great stroke of luck in passing the Flight of Crows exam and being selected as a trainee Crow had ended abruptly with the complete and utter destruction of the training castle, Terand.

The vicious and explosive death of one of the most feared wizards of the continent, Blood Crow, had happened just outside his room. And since a strange fire had started in his room, he was suspected of colluding with the blood mage and was interrogated for weeks. His former mentor vouched for his innocence, and for his barely average abilities in magic craft, and he was let off with only five fire lashes.

The boy had wanted to resign, move back to his family, and take up hawking like they did. Instead, he was assigned to the Lost Crows Legion, a legion that he had never heard of. The journey took him six gruelling months, from Terand in the County of Kull all the way across the Sea of Saffron and the Northern Pearls ocean. He had lost count of how many weeks he had spent seasick on the ships. The guards kept handing him over to other guards along the way. Their coats of arms changed colours so often that he lost track after the fourth set.

He now knew where he had reached. The Halls of Anorra.

The crazy myth about the Queen of Wails was true. The Queen of Wails was alive! She lives! She walks these halls! And I am here to serve that bitch!

His body shook, and sweat drenched his back. He had heard stories of how the evil queen razed the entire Western Empire to dust. The Empire had been expanding its influence everywhere. They had massive battalions of Humans, Elves and Iguli. The lizardfolk were the best craftsmen in the world. But all of them, every single one of the soldiers who stood against her, were wiped out in a single attack. The millions of ice sculptures that remained standing after she had waved her arm had melted by the following noon, leaving the entire Empire only a remnant of memories.

Travis shook again. He had heard stories and myths about the Queen of Wails, and now seeing her in person was causing his entire body to freeze up in fear. He wanted to talk to old Rillegh alone. He had heard that Rillegh was once a Crow, so maybe he would know a way out of here. Just anywhere else was fine. Travis had wanted a simple life. He wanted to work in research, with books, maybe in a library somewhere warm. Being at the epicenter of the Northern Ice Storm was not his idea of calm.


Today was a special day.

Rillegh was inspecting the staff force working diligently around the black ice outer palace, the ice gardens and the Halls. Everyone had proper tasks assigned to them. The old Elf was walking in a proud, and noble manner with a young boy holding a small notebook following him.

"Are the hedges trimmed yet?"
"Yes, sir Mister Rillegh. I checked it last night and again this morning."
"And the gates polished?"
"Yes, sir Mister Rillegh. I checked the shine on them an hour ago."
"Good. Everything seems to be in order. Travis, go stand with the welcoming team. The Queen must be arriving very soon."
"Yes, sir."

Travis bowed once and quickly walked over to the entrance of the palace, where three dozen servants and staff members stood evenly on both sides of the massive black ice gate, ready to welcome their queen. He joined the side with the odd member count.

A few minutes in, a loud 'Woo... woo...' noise could be heard, and everyone looked up at the sky. A beautiful, regal chariot pulled by Griffins was slowly landing in front of the castle. The doors, bearing the blue crossed-wings crest, opened, and Travis saw a long shapely leg step outside. Wearing a white and cyan diamond-studded dress that seemed to flow like the surface of clear water, a woman stepped out.

Her face could not be described in words. She was fair, as if she were the very definition of the word. She had red, luscious lips the colour of blood. Her long, flowing, silken hair was white, as if the famed White Silk of the East had come to life. And her tall and curvy figure was attractive, if attraction also meant complete surrender.

The woman looked regal and royal. But she could not be compared to the Counts and High nobles of the Elven lineage that Travis had seen visiting the Terand. No, she was regal as if she deserved to lord over every known and unknown race of mortals. With her first step, all the servants bowed in unison. Travis stood there stunned and too shocked to move for a second. The next second, he felt an invisible power force his upper body to bend. The boy came to his senses and immediately bowed as he was supposed to.

The queen walked facing forward and gently greeted the staff with a smile and a wave as she passed them. Travis felt like he was in a lot of trouble, and his intuition soon proved right. As soon as the queen had entered the castle and the inner doors had closed, he felt his body lifted up by a strong magical force and turned around to face an angry old Elf. A vein throbbed in Rillegh's forehead as he spat out, "What... in her name, was that?"

Travis stuttered his response, "I-I... did not know sir! My body refused to move! I was frozen in place, sir!"

The angry Elf raised his hand as if he was going to slap the boy. But with a flick of his wrist, he dropped him to the ground instead. There were a lot of eyes watching. He quickly turned around and motioned to the other servants and staff.

"Back to work! All of you! Break is over!"

Travis stood up, massaging his now bruised elbow. He looked at the old Elf in fear. But to his surprise, the Elf had a different expression on his face. "The queen is very attractive to look at the first time." He paused, looking at the confused state the boy seemed to be in. "Get up. And walk with me."

Travis brushed the dirt off his robes and followed the old Elf, who was now walking towards the western side of the castle outskirts. When Rillegh confirmed that he and Travis were the only two people still outside in the vicinity, he resumed talking at a lower volume. "I have cast a tiny sound barrier around us."

He looked at the boy once and waited until he nodded. Then continued, "The queen is not of any known race in the world. I should have warned you earlier, she carries an aura around her that would make anyone with a weak will want to prostrate themselves under her feet. The stronger her intention, the stronger that aura gets."

The old Elf was looking ahead while talking and the boy was slowly nodding and listening.

"What you mentioned last week, that you were shipped here as punishment, was a lie." Travis looked shocked. He did not lie about his situation. He wanted to argue. But the old Elf raised his hand and continued talking. "They lied to you. You were sent here because of the ancient agreement between the Queen of Wails and the Eastern Bismuth."

Travis' eyes shook wildly. He bit his lips to prevent himself from crying. He was betrayed by the people he trusted. His master, his teachers, the Count, the Crow Perch, they had all lied to him. He felt small, and insignificant. He felt angry, and wanted to scream.

The old Elf continued as if he did not even notice the changes in the boy, "She needs to feed on energy to maintain her looks. The life force and will of the servants and staff here is what she feeds on. Every three months, the Eastern Bismuth and the Western Empire send her a low-quality mage, a high-quality soldier, and two civilians as per the agreement. You know which category you fall under." His pale blue eyes looked at the fifteen-year-old Human boy with pity. Travis knew: a low-quality mage. That was all he was.

Travis then asked a question. "If that is the case, how come you are here, sir? Aren't you from the Western continent? And you must have been a powerful Crow. You don't fall into any of the three categories."

Rillegh sighed slowly.

"The current Western Empire is but a shadow of the old one. The old Emperor was a beggar by birth. He clawed his way to the top of the food chain, becoming the most powerful commander known in the history of the continent. But, alas! Everything he had built crumbled to dust... no, to ice."

Travis nodded in agreement. He had heard the story before. The old Emperor was said to be one step closer to godhood. Travis was too inexperienced to understand what that meant or how powerful gods were. Because until last month, he did not believe gods existed. Now, he worked for one.

"One small mistake..."

Travis looked up at the old Elf.

"He made one small mistake in calculating the direction of the flow." Rillegh had a wistful look on his face. Then his eyes turned sharp and angry.

"S-sir? Are you alright? What was this mistake you mentioned?"

The Elf sighed again. "He touched a different river of power by accident." Travis was confused. River of power? Is he talking about the source of magic?

"Are there?... Is-Is there more than one river of power, sir?"

The Elf suddenly looked back at the boy incredulously. "What are they even teaching kids these days?" He sighed once more before continuing. "There are hundreds of thousands of rivers, boy! Each with a strong root. Old gods, new gods, known gods, unknown gods, forgotten gods that are nothing but puddles of dead power. The rivers of magic each lead to one. You just have to be powerful enough to know where you are drawing from."

Travis felt like he was slapped across the face. His entire education on magic sources felt meaningless. Is this old Elf telling the truth? It does make more sense than everyone pulling on the same river but drawing varying amounts of magic. But, but then?

"Can we choose the river, or rather, the god that we draw from?"
"The trained wizards, no, Crows and Hawks, can."

The old Elf continued walking around the garden with the boy tailing him closely. "You need to understand the pathways well, and be able to draw an overwhelming amount of power at once to be able to see your river's source. And you need a great amount of control of the drawn magic to be able to switch the source. Like I said, there are hundreds of thousands. The old Emperor was an accomplished Crow. But unfortunately, in his greed, he made a slight mistake and pulled on the wrong river of power."

He stopped and shivered, as if he recalled something grim. "He pulled on her river." Both Travis and Rillegh looked in unison at the black ice building. The Halls of Anorra were buzzing with activity, judging by the sounds they could hear from the outer garden. "The old Emperor woke her from her slumber. He was supposed to pull on the river of a dead Dragon god. But instead he touched her with his greed, and she pulled back." The Elf now hugged himself, as if he needed to keep his body from shaking.

"She awoke that day and laid waste to the greatest army the continent had ever seen." And from that day, both the East and the West agreed to send her willing sacrifices to keep her satisfied and out of their lands. The Queen of Wailing and Screams, Anorra, had awoken two thousand and seventy-three years ago.

His voice quickly lowered to barely a whisper. Travis barely managed to hear what he said next. "The bitch needs to die tonight!" Rillegh had a cruel look on his face. Travis felt like he had stepped into something even more dangerous than serving a god. He had an ominous feeling that he only had a few hours left in this world.

That is when he heard the rest of what old Rillegh said, "Have you ever thought if a god could be torn by a whirlwind? How about one drawing on a river so ancient that even those Ancient bastards have forgotten about it?" Travis shivered like a paper caught in a tornado. He shook his head slowly looking at the old Elf's manic smile.

Maybe... maybe, he could learn how to become stronger if he learnt from the old Elf? He was already going to die. Either by getting his will sucked from his soul, or in fear of working for this ruthless god that he could not even look at. Travis decided that he was not going to die without trying anything.

"I want to learn, sir."

The old Elf suddenly looked at the visible green aura surrounding the boy as he said that. 'Interesting,' he thought to himself. 'He still has so much willpower that it is even reaching out to the rivers on its own.'

"Then, follow me. We have a god to kill." Rillegh waved his hand lightly to create a very thin tear in the air in front of him. Travis had heard about this. It was hailed as the highest form of magic a Crow could perform. "A... pocket dimension...!" he gasped.

"Nothing of the sort. Just a small room where I am going to whip you until you bleed. Oh, and also tap into a powerful river in the process." The old Elf smiled as he waved for Travis to enter.

Travis was scared. But he still parted the tear like he would the curtain at an entrance and walked into the void that was waiting to swallow him whole. Once the boy had walked in, the old Elf followed with the same noble posture he normally held. The tear quickly faded, leaving nothing in the air once he passed through.

The cold winds that blew through the black ice palace of the Northern Ice Storm glacier were chipper that morning. But they would sing a different song that night.

Directing the Move: Learning by Shifting a Game Production Studio

Moving is a challenging task on its own, but moving a production studio without major disruption is another beast altogether. I have planned and executed a full studio migration a couple of times now.

This is the story of when I moved an active mobile production studio from one end of Abu Dhabi to the other. I designed the network, coordinated the contractors, and kept production running through all of it.

The Setup

In late 2021, an active mobile production studio in Abu Dhabi began planning a move from Park Rotana Complex in Khalifa Park to the newly constructed Yas Creative Hub on Yas Island.

By the time IT was brought into the planning process, the broad decisions had already been made. The timeline existed. The building was signed. What remained was the execution, and the question of how to move a live production environment across the city with, at most, a single weekend of downtime.

I had one person on my team. That person was me.

The planning phase ran from November 2021 to March 2022. Physical execution began in March and concluded with the studio officially opening in October 2022. In between those two dates lived approximately eleven months of contractor negotiations, municipal certifications, regulatory checks, ISP transitions, and the specific kind of creative problem-solving that only emerges when something has gone genuinely wrong.

What follows are the lessons I took out of that process. Each one is the product of something not going according to plan.

Lesson 1: Contractors Will Interpret Your Plans Creatively

The Townhall area was one of the centrepiece spaces in the new studio. Staircase-style tiered seating, designed to hold the full studio for all-hands presentations and company-wide events. Significant square footage. Significant investment.

At some point during construction, the contractor expanded the footprint of the Townhall seating into the adjacent office space.

This was not in the drawings. This was not discussed. This was a unilateral creative decision made by someone who either misread the plans or chose not to read them at all.

The options at that point were: tear down the built staircase structure and rebuild within the correct boundaries, or reduce the Townhall seating to fit the original allocation and accept the smaller configuration.

Tearing down the staircase would have added weeks to the timeline and reopened a cost conversation nobody wanted to have. I made the call to reduce the seating and move on.

Key Lesson: Treat every contractor deliverable as a draft until you have physically walked it. Do not assume that architectural drawings translate into accurate builds without supervision. The gap between what is planned and what is built is the gap you do not check on.

Lesson 2: "That's Normal" Is Not a Technical Answer

The Yas Creative Hub building was new. The HVAC systems were new. The air conditioning units for the studio floor were new.

When the AC units were tested, they leaked water.

All of them.

The contractor's position was that this was normal behaviour during initial testing. Condensation. Expected. Nothing to worry about.

I did not accept this.

Water leaking from ceiling-mounted AC units directly above workstations, server infrastructure, and production hardware is not a commissioning quirk. It is a liability. I pushed back, escalated, and held the sign-off until the units were inspected, the drainage systems corrected, and a dry test was completed.

The contractor was not happy about this. The contractor was also wrong.

Key Lesson: When a contractor tells you a failure mode is normal, ask them to put it in writing. The willingness to document tends to clarify the situation quickly. You are not an expert in HVAC engineering. You are an expert in what happens to your infrastructure when water falls on it.

Lesson 3: Certification Will Slip. Plan for It to Fail Three More Times.

The datacenter required fireproofing certification from the Abu Dhabi municipality before it could be approved for occupation. This is not optional. This is not a formality. Without it, the datacenter does not open.

The first inspection failed. The contractor had fireproofed the upper walls but not the lower portion where the raised floor began. The inspector identified the gap immediately.

The contractor returned, made corrections, and scheduled the second inspection.

The second inspection failed. Same issue, different section of wall.

By the third scheduled inspection, I attended in person. I walked the inspector through the space before the formal review, having already identified and flagged the remaining gaps to the contractor the day prior. We tore the wall down completely, sealed everything from the ceiling to the gaps in the raised floor that had been flagged in the previous inspections, and rebuilt it in a day.

Eleven months of planning and three fireproofing inspections later, the datacenter was finally approved.

Key Lesson: Regulatory certification does not run on your project timeline. Build buffer into every sign-off that involves a third-party authority. Assume one failure at minimum. Assume two if the contractor has already demonstrated they are reading the requirements selectively. The municipality inspector is not your adversary; they are the only person in the room who has no reason to cut corners.

Lesson 4: Someone Has to Be in the Room

Network cabling for floor boxes sounds like a straightforward task. Route the cables, terminate the connections, test the links. Standard process.

The studio floor had clearly defined user desk areas. It also had collaboration zones, separate open spaces designed for informal meetings, breakout sessions, and team clusters away from the main desk rows.

The cabling team forgot about the collaboration zones entirely.

This was not discovered during the cabling phase. It was discovered when the floor was nearly complete and someone asked why the collaboration areas had no connectivity.

The answer was that nobody had been watching.

Separately, the meeting rooms, conference room, and Townhall all arrived with furniture below the specified quality. Table microphones and ceiling speakers were installed without acoustic testing. The first time a meeting was held with both active simultaneously, the room fed back on itself.

The soundproofing was inadequate. I had to require the contractor to return and correct the acoustic treatment before the rooms were signed off.

Neither of these failures was inevitable. Both would have been caught earlier with consistent on-site supervision.

Key Lesson: Delegation to a contractor is not the same as oversight. Also, expecting the contractor to read the plans carefully is a mistake on its own. For any build task that involves multiple phases or multiple teams, assign a human being, ideally yourself, to physically verify completion at each stage. The cabling team did not maliciously skip the collaboration zones. They were not reminded those zones existed.

Lesson 5: Build for Airflow Before You Build for Aesthetics

The build machines, the systems used for compiling and packaging the game builds, were racked in a dedicated enclosure. The rack was purpose-built for the space. Solid construction. Clean cable management. Doors on all sides.

The machines were fully sealed.

When I tested the systems with the doors closed, temperatures climbed immediately. The enclosure had no rear ventilation. Hot air had nowhere to go. Left unaddressed, the build machines would have begun throttling and eventually failing within weeks of production use.

The fix was not elegant. I instructed the team to cut the back panel off the enclosure to create airflow. A purpose-built rack, freshly installed in a new studio, was modified with a cutting tool before it ever went into use.

It worked. The temperatures normalized. Production was never impacted.

Key Lesson: Thermal management is not an afterthought. When specifying any enclosure, rack, or cabinet that will house active compute hardware, airflow path is a primary requirement, not a secondary consideration. A rack that looks correct but traps heat is worse than no rack at all. Ask where the hot air goes before you approve the build.

Lesson 6: WiFi is a People Problem, Not a Space Problem

The wireless network for the new studio was designed based on the floor plan. Access points were positioned to achieve even signal coverage across the studio area. The planning looked correct on paper.

When the studio opened, one team reported consistently poor wireless performance. The QA team.

QA teams in mobile game production operate with a high density of devices. Each tester runs multiple handsets simultaneously, sometimes four to six devices per person. A team of five QA testers represents potentially twenty-five to thirty active wireless devices in a single area.

The QA team had been seated in a corner of the studio. The AP coverage in that corner was designed for standard user density. It was not designed for the device density of a QA floor.

The fix required repositioning access points and adjusting the wireless design to reflect actual usage patterns. It was not a complicated fix. It was a fix that would not have been necessary had the seating plan been completed earlier.

Key Lesson: Plan your wireless network after you have confirmed where every team is sitting, not before. Coverage maps measure signal strength across physical space. They do not account for the number of devices a given team will connect. Seat your highest-density teams first. Design the wireless network around them.
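
The arithmetic behind this lesson is simple enough to turn into a planning check. The sketch below is illustrative only; the per-AP device budget, team sizes, and device counts are assumptions, not figures from the actual survey.

```python
# Plan access points by device density, not floor area.
# All numbers here are illustrative assumptions.

def aps_needed(people: int, devices_per_person: int, budget_per_ap: int = 25) -> int:
    """Access points required for a team, given a per-AP
    concurrent-device budget (vendor guidance varies widely)."""
    total_devices = people * devices_per_person
    # Ceiling division: a partially loaded AP is still a required AP.
    return -(-total_devices // budget_per_ap)

# A QA pod of five testers with six handsets each needs denser
# coverage per head than a much larger engineering floor.
teams = {
    "QA": (5, 6),
    "Engineering": (40, 2),
    "Art": (20, 2),
}

for name, (people, per_person) in teams.items():
    print(f"{name}: {people * per_person} devices, {aps_needed(people, per_person)} AP(s)")
```

Seating the highest-density team first, then running a check like this per zone, is the order of operations the lesson argues for.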

Lesson 7: When the Plan Fails, Build a Fallback

Building contract dates are hard to manage when a large-scale migration like a studio move is underway. These are exactly the scenarios that demand buffers and fallback plans.

The contract with the Park Rotana building ended in August 2022. The original agreement with the contractor was that the new studio would be ready by then. It was not.

Electrical approvals had not completed. The site safety inspection had failed. The physical move approval was delayed until a full cleanup could be completed, estimated at end of September at the earliest. The old building contract was expiring. The new building could not legally be occupied.

There was a gap. A real one.

I had an internal discussion with the Yas Creative Hub facilities team and confirmed one thing: moving equipment in for installation and testing purposes was permitted. The building did not need an occupancy permit for machines. Only for people.

That was enough.

I formulated a plan around that single fact. The ISP line was installed and activated in the new building. I made the decision to break the firewall high-availability pair and move one unit to Yas Creative Hub, making it the primary. The network was built out and tested. Then on a single weekend, I shut down the datacenter at Park Rotana, moved the servers and build machines to the new site, and brought the full environment back up, accessible over VPN.

The HA pair was reconstructed. The build pipeline came online. In the last week of August, the studio went fully remote.

For six weeks, an active mobile game production studio operated entirely over VPN from a datacenter that sat inside a building nobody was allowed to enter yet. Production continued without disruption. No milestones missed. No pipeline failures.

When the second safety inspection passed and the occupancy permit was issued in mid-October, the studio opened. The physical move for employees was the final step, a natural transition from working remotely to coming to a brand new studio.

Key Lesson: When the original plan becomes impossible, the question is not how to restore it, but what new plan the new constraints allow. The building facilities conversation was not a workaround; it was a requirements discovery session. Understanding the exact boundary of what was permitted revealed a path that the original plan had never considered.

The initial plan failed. The migration and the timeline did not.

Commonalities and Uncommon Paths

Reading these back, there is a common thread across the first six lessons that I did not fully understand until I was well past the project.

Every one of these six challenges was a visibility failure.

The contractor expanded into the wrong space because nobody caught it early. The AC units leaked because I was expected to take someone's word for it. The fireproofing failed twice because I was not in the room. The collaboration zones had no cabling because nobody was watching. The rack had no airflow because aesthetics were evaluated and thermals were not. The WiFi was weak in the QA corner because people density was never part of the wireless planning conversation.

The lessons here were earned, not studied.

Handling situations like the seventh, however, requires an understanding of the situation that goes beyond the defined boundaries of IT. Had I worked only within what IT infrastructure teams are supposed to work within, I would never have considered a Hail Mary like moving a portion of the equipment in before anyone was permitted to enter the new building.

A studio migration is not an infrastructure project with a construction component. It is a coordination problem that happens to involve infrastructure. Contractors, inspectors, furniture vendors, ISPs, and municipality authorities are all operating on their own timelines, with their own definitions of done.

Your definition of done is the one that matters. The only way to enforce it is to be present, to verify directly, and to treat every sign-off as provisional until you have seen it yourself.

Checklists are great to have. Structured plans are amazing to create. Without anticipating failures and building buffers into the project, however, you invite chaos.

I had no predecessor to learn from. No internal playbook. No one who had done this before and left notes.

This is me leaving the notes.

AI in IT: Raising the Bar

AI’s True Impact on IT: Raising the Bar, Not Replacing the Workforce

Artificial Intelligence is accelerating across industries. Creative roles are already being re-evaluated. Operational workflows are being automated. Entire teams are being restructured around automation and efficient, lean pipelines.

Within IT, the reaction often swings between two extremes.

One side believes IT is next.
The other dismisses the shift entirely, believing IT is immune.

Both responses miss the point.

AI is unlikely to replace IT departments in the near term. But it will redefine which IT teams remain strategic and which become overhead.

What Will Shrink?

The parts of IT built on repetition are vulnerable.

These are the first areas that AI will streamline.

  • Level 1 Support
  • Basic ticket triage
  • Routine systems provisioning
  • Template policy drafting
  • First-pass vulnerability reviews
  • Log filtering and alert classification

These tasks follow patterns. AI thrives on patterns.

If a portion of your daily work can be scripted, it should be. Those are the portions of the role that are at risk of vanishing first. If an IT team defines its value by the volume of tickets closed or manual tasks performed, that team is operating on a layer that will vanish entirely.

This is not to say the team failed. This is structural evolution with the available technology.

What Must Grow?

The next generation of IT teams must shift toward control-plane thinking. Do more than execute your tasks; think a level higher.

Take one step back and understand why something must be done.
Ask what management actually wants from the task.
Think about how you can make it more efficient.

Spend less time operating systems by hand. Focus instead on designing systems that operate themselves.

The durable layers of IT will be:

  • Architecture design
  • Automation strategy
  • Governance modeling
  • Risk orchestration
  • Vendor integration oversight
  • Identity and Access strategy
  • Resilience engineering
  • Multi-cloud decision framing

Notice how most of these layers sit above the execution level. AI can execute tasks, as long as there is a strong, capable team providing direction, correcting flow errors, monitoring at level 2, and controlling the boundaries. AI can assist with your tasks; it cannot own them. Accountability, trade-offs, and contextual judgement remain human responsibilities.

Practical Steps for L1 / L2 Teams

Start with the steps below and make them your own over time. These are the skills I look for when I hire for my team.

1 - Document and Script Repetition

If a task is performed more than three times, it should not remain manual.

  • Create PowerShell / Bash scripts for recurring fixes.
  • Build standard provisioning templates.
  • Maintain shared script repositories.

An L1 engineer who writes automation becomes harder to replace than one who executes tasks manually.
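
The post recommends PowerShell or Bash for this; the same idea is sketched below in Python, using a hypothetical recurring task (clearing stale build-cache files) as the example. The path and age threshold are assumptions for illustration.

```python
# A recurring manual fix turned into a script that can be
# scheduled and shared. Path and threshold are illustrative.
import time
from pathlib import Path

def clear_stale_files(root: Path, max_age_days: float) -> list[Path]:
    """Delete files under root older than max_age_days; return what was removed."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    if root.exists():
        for path in root.rglob("*"):
            if path.is_file() and path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
    return removed

if __name__ == "__main__":
    removed = clear_stale_files(Path("/tmp/build-cache"), max_age_days=7)
    print(f"Removed {len(removed)} stale file(s)")
```

Once a fix lives in version control instead of someone's muscle memory, it can be reviewed, scheduled, and handed off to automation.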

2 - Convert Tickets Into Patterns

Instead of resolving tickets individually, do the following. It improves your team's efficiency by a greater margin than you might expect.

  • Identify the top 10 recurring issues.
  • Map root causes.
  • Propose structural fixes.
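
The first two steps can begin with a few lines over a raw ticket export. The CSV shape below is a hypothetical example; the point is the counting, not the schema.

```python
# Surface the most frequent ticket categories from a helpdesk
# export. Column names are illustrative; map them to your export.
import csv
from collections import Counter
from io import StringIO

SAMPLE_EXPORT = """ticket_id,category,summary
1001,password_reset,User locked out
1002,vpn,VPN drops on studio wifi
1003,password_reset,Forgot password after leave
1004,printer,Print queue stuck
1005,password_reset,Account lockout
"""

def top_issues(csv_text: str, n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent ticket categories with counts."""
    rows = csv.DictReader(StringIO(csv_text))
    return Counter(row["category"] for row in rows).most_common(n)

for category, count in top_issues(SAMPLE_EXPORT):
    print(f"{count:>3}  {category}")
```

Each category at the top of that list is a candidate for a root-cause fix or a self-service tool, rather than faster manual resolution.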

Reducing ticket volume through systemic correction is higher leverage than resolving tickets faster. This, in my experience, has always received the most pushback from IT teams, because of the inherent fear that team sizes are proportional to the volume of tickets received and resolved every quarter.

In organizations where management measures team efficiency this way, communication is key. Clearly explaining the long-term cost difference between a high volume of L1 tickets and a team focused on automation is essential.

Emphasize AI adoption by the IT team as a way to better support operational teams without additional subject-matter-expert hires.

3 - Build Self-Service Layers

The following will immediately reduce noise in the ticket queue.

  • Introduce password self-service tools.
  • Automate onboarding templates.
  • Create internal knowledge portals.

The goal is not to protect ticket volume. It is to eliminate avoidable load, freeing up the team to focus on better initiatives.

4 - Learn AI-Assisted Operations

AI tools can already:

  • Draft scripts.
  • Summarize logs.
  • Suggest remediation paths.
  • Parse audit outputs.

Teams that learn to use AI to accelerate analysis will outperform those that resist it.
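
One practical habit is to pre-digest logs before handing them to an AI assistant, so the prompt stays small and the model sees patterns instead of noise. A minimal sketch of that first pass (the log format and normalization rules here are assumptions):

```python
# Collapse a noisy log into unique messages with counts before
# asking an AI tool to summarize it.
import re
from collections import Counter

def digest_log(lines: list[str]) -> list[str]:
    """Strip timestamps and numeric ids, then return
    'count x message' entries, most frequent first."""
    normalized = []
    for line in lines:
        line = re.sub(r"\d{4}-\d{2}-\d{2}[ T][\d:.]+", "<ts>", line)
        line = re.sub(r"\b\d+\b", "<n>", line)
        normalized.append(line.strip())
    return [f"{count} x {msg}" for msg, count in Counter(normalized).most_common()]

sample = [
    "2024-05-01 10:00:01 ERROR disk 3 offline",
    "2024-05-01 10:00:02 ERROR disk 7 offline",
    "2024-05-01 10:05:09 WARN retry 1 of 5",
]
for entry in digest_log(sample):
    print(entry)
```

The AI tool then summarizes a few dozen deduplicated lines instead of megabytes of raw output, and the engineer keeps control of what leaves the environment.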

5 - Shift from Task Completion to Risk Awareness

L1/L2 engineers should begin asking themselves:

  • What is the business impact of this failure?
  • Is this symptom masking a systemic issue?
  • Is there an automation opportunity here?

This mindset transition is the bridge toward architectural relevance and organizational importance.

The Complacency Risk

Becoming overly worried about AI is unproductive. Becoming complacent, however, is more dangerous.

Complacency in IT sounds like this:

  • "I am doing my job, and that is enough."
  • "Infrastructure cannot be automated."
  • "Cloud is just someone else's server."
  • "AI is just another tool."
  • "AI usage makes the IT team relaxed."

The teams that shrink fast will be those that treat AI as noise rather than signal. The ones that evolve will be those that deliberately redesign their operating model to adapt with the industry.

The Required Mindset Shift

IT teams must stop equating being busy with value. You might feel that you do a lot for your organization and are therefore invulnerable to restructuring efforts. That is a dangerous assumption.

Closing more tickets is not strategic leverage. Firefighting faster is not operational maturity. Future-ready IT teams will start working on the following today.

  • Reduce manual dependency.
  • Design automation intentionally.
  • Accept smaller, higher-leverage structures and systems.
  • Measure success by system stability and risk reduction.
  • Treat AI as an operational accelerator, not a threat.

The shift is from executor to architect.
From operator to orchestrator.
From cost center to continuity guarantor.

Closing Thoughts

AI will not eliminate IT departments overnight. That fear is overstated. But it will expose which teams are tactical operators and which are strategic designers.

The future of IT provides strong growth paths to teams that embrace automation at their core. This future belongs to the teams who design the systems that automation runs on.

Those who design systems, not just operate them, will remain indispensable.

Data Security Posture Management in Practice

Data Security Posture Management is often discussed in abstract terms: Discovery. Classification. Governance. Remediation.

In reality, posture failures surface during high-pressure events: Migrations. Audits. Incidents.

This story from my experience illustrates how incomplete visibility can translate into operational disruption.

The Scenario

During a large-scale Microsoft tenant-to-tenant cloud migration, the IT team executed a structured migration plan:

  • Exchange mailboxes migrated
  • SharePoint sites migrated
  • OneDrive data migrated
  • Teams environments migrated
  • Permissions mapped and validated

From an infrastructure perspective, the migration was comprehensive. What was missing was discovery. The production team had been using Microsoft Loop as their primary planning environment. Critical project-planning data lived entirely within Loop workspaces. IT had no inventory of this usage. No classification. No tracking. No migration mapping.

When the production team accessed the new tenant, their planning data was incomplete.

The migration had technically succeeded. Operationally, it had not.

What Went Wrong

This was not a tooling failure. It was a visibility failure.

There was:

  • No centralized inventory of SaaS workloads in use
  • No monitoring of newly adopted Microsoft 365 services
  • No sensitivity tagging tied to workload discovery
  • No structured data ownership validation before migration

Loop usage had never been formally onboarded into governance oversight. It existed within the tenant, but not in IT's operational awareness or the business-critical software inventory.

This is a classic posture management gap.

The Consequence

Once the gap was discovered, the organization faced a time-critical recovery scenario.

The only viable path was manual intervention:

  • Identifying affected Loop workspaces
  • Exporting data from the source tenant
  • Recreating workspaces in the destination tenant
  • Copying content manually
  • Validating completeness with production stakeholders

The remediation effort took six full days.

Six days of cross-team coordination, late hours, manual verification, and elevated stress. The migration timeline was disrupted. Trust was strained. Risk exposure increased. The damage to reputation and team trust was far harder to repair than the actual missing data.

All because discovery had not preceded execution.

Where Data Security Posture Management Would Have Helped

A mature posture management capability would have reduced or eliminated this disruption.

1. Continuous Discovery

Automated workload inventory would have revealed:

  • Active Microsoft Loop workspaces
  • Volume of content stored
  • User adoption patterns

Loop would have been visible as a production-critical workload rather than an unnoticed collaboration tool.
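
Continuous discovery does not have to start with a dedicated platform. Even a scheduled pass over an admin-center usage export can surface workloads that adoption has quietly made critical. The report format and thresholds below are hypothetical examples:

```python
# Flag services whose adoption crosses a threshold so they are
# pulled into the governance and migration inventory.
# Report columns and numbers are illustrative.
import csv
from io import StringIO

USAGE_REPORT = """service,active_users,items_stored
Exchange,240,0
SharePoint,180,52000
Loop,35,1400
Forms,6,90
"""

def flag_workloads(report: str, min_users: int = 10, min_items: int = 500) -> list[str]:
    """Services with enough adoption that they must be inventoried."""
    rows = csv.DictReader(StringIO(report))
    return [
        r["service"]
        for r in rows
        if int(r["active_users"]) >= min_users or int(r["items_stored"]) >= min_items
    ]

print(flag_workloads(USAGE_REPORT))
```

In the scenario above, a check like this would have flagged Loop months before the migration plan was frozen.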

2. Data Classification and Sensitivity Mapping

If planning artefacts had been labelled according to sensitivity or business criticality:

  • High-value workspaces would have been flagged
  • Migration planning could have prioritized them
  • Data validation checklists would have included them

Classification provides a signal. Without it, all data appears equal.

3. Pre-Migration Posture Assessment

A structured posture review before migration would have asked:

  • Which workloads are actively used
  • Which contain business-critical data
  • Which services fall outside standard migration tooling

That assessment would likely have surfaced Loop usage early, while remediation was still simple.

4. Ownership and Accountability Mapping

Posture management also clarifies data ownership. If each collaboration workspace had a defined business owner:

  • Owners would have been engaged during migration validation
  • Confirmation of completeness would have occurred before cutover

Instead, ownership discovery happened after the disruption.

The Operational Lesson

Data Security Posture Management is not only about compliance and regulatory alignment. It is about operational continuity. When IT lacks visibility into:

  • Emerging SaaS workloads
  • Shadow adoption of collaboration tools
  • Data criticality distribution

Strategic initiatives such as tenant migrations become risk multipliers. Infrastructure execution without data awareness creates blind spots.

From Discovery to Remediation

In this case, remediation was manual and reactive. It consumed six painful days because the discovery occurred after the impact. A mature posture management lifecycle would follow a different sequence:

  1. Discover workloads and data locations
  2. Assess sensitivity and criticality
  3. Validate ownership
  4. Incorporate findings into migration design
  5. Execute with verified scope

Remediation then becomes exception handling, not crisis response.
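The lifecycle above can be enforced as a gate: execution does not proceed until every earlier stage has evidence behind it. The state keys and gate function here are a hypothetical sketch, not a real tool's API.

```python
# Hypothetical gate checks; each stage must pass before execution proceeds.
def migration_ready(state: dict) -> tuple[bool, list[str]]:
    """Return readiness plus any lifecycle stages still outstanding."""
    stages = [
        ("discover", state.get("workloads_inventoried", False)),
        ("assess", state.get("sensitivity_mapped", False)),
        ("validate", state.get("owners_confirmed", False)),
        ("design", state.get("scope_incorporated", False)),
    ]
    missing = [name for name, done in stages if not done]
    return not missing, missing

ready, missing = migration_ready({"workloads_inventoried": True})
print(ready, missing)  # False ['assess', 'validate', 'design']
```

A gate like this makes "verified scope" a precondition rather than an aspiration.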

Conclusion

The tenant migration did not fail technically. It failed from a posture perspective. The absence of continuous discovery and workload awareness turned a standard cloud migration into a six-day recovery exercise.

There is an additional lesson that is often overlooked. In a fast-moving or rapidly growing environment, it is common for teams to adopt new collaboration tools outside formal governance workflows. Without structured discovery and verification mechanisms, these adoptions remain invisible to migration planning.

In this case, there was an implicit assumption that all production-critical planning data was known and accounted for. That assumption proved incorrect.

Verbal confirmation is not validation. IT leadership must independently verify workload usage, data locations, and service dependencies before executing high-impact changes. This means conducting technical discovery scans, usage analysis, access reviews, and controlled testing rather than relying solely on stakeholder declarations.

Data Security Posture Management formalizes that discipline. It replaces assumption with evidence. It ensures that the business teams' beliefs are technically validated before transformation begins.

Infrastructure planning without independent verification is highly risky. Continuous posture management closes that gap and converts uncertainty into measurable control.

State of IT Part 6: Balancing Innovation and Operational Stability

A quiet tension is building within most IT teams.

On one side, there is demand to innovate. Automate more. Integrate AI into workflows. Reduce headcount dependency. Move faster. Deliver more with less. On the other side, there is the unglamorous reality of keeping systems stable. Patch cycles. Identity hygiene. Backup validation. Endpoint drift. License audits. Incident response. The daily grind that nobody celebrates until it fails.

Innovation gets applause. Stability gets silence.

Yet stability is the foundation that makes innovation survivable.

The Illusion of Acceleration

We are in a time when leadership conversations are dominated by speed.

  1. How quickly can we deploy?
  2. How fast can we automate?
  3. How much AI can we embed?

The assumption is that acceleration equals progress. But acceleration without structural maturity creates fragility. If your identity architecture is inconsistent, automating access provisioning will compound those inconsistencies. If your asset inventory is incomplete, AI-driven analytics amplify blind spots. If your governance model is unclear, automation only accelerates chaos.

Innovation pursued this way does not compensate for weak foundations. It exposes them.

Stability Is Not Resistance to Change

There is a misconception that teams focused on operational discipline are resistant to innovation.

This is rarely true.

The best operations teams understand a fundamental truth. Stability is not the opposite of innovation. It is the prerequisite for it.

Resilient systems allow experimentation. Documented processes allow safe iteration. Clear ownership allows confident delegation. When fundamentals are strong, innovation becomes additive. When fundamentals are weak, innovation becomes disruptive.

We need to establish system maturity and stability before we iterate on or extend existing tools and structures.

The Cost of Ignoring the Base Layer

When innovation initiatives outpace functional stability, the symptoms appear gradually.

Small outages become recurring patterns. Security exceptions multiply. Access reviews become performative. Shadow IT grows quietly. Eventually, the organization does not suffer from a lack of innovation. It suffers from cumulative operational debt. IT then becomes reactive instead of strategic. Teams spend their time firefighting instead of designing. The irony is that the more an organization pushes for innovation without discipline, the less innovative it actually becomes.

A Practical Balancing Act

Juggling innovation and stability does not require complex frameworks. It needs intentional sequencing.

First, define non-negotiables.

  1. Backup integrity.
  2. Identity hygiene.
  3. Patch compliance.
  4. Monitoring coverage.

These act as foundational controls.

Second, assess operational health before accelerating growth and experimentation. If your incident resolution time is unstable, automation should focus there first.

Third, introduce innovation in limited domains.

  1. Pilot AI in reporting before applying it to access control.
  2. Test automation in non-critical workflows before applying it to production pipelines.

Fourth, preserve human monitoring. Automation decreases manual effort. It does not remove accountability. Innovation should feel like reinforcement, not replacement. This is where, in my humble opinion, most organizations fail.

Leadership Expectations and Reality

Many IT leaders are navigating expectations shaped by headlines rather than infrastructure realities. There is a belief that AI can replace inefficiency. That these tools can compensate for process gaps. That digital transformation is primarily about platform adoption.

In practice, transformation is about discipline. It is about clarity in roles. It is about visibility in systems. It is about governance that scales. Technology accelerates what already exists.

If structure exists, it accelerates efficiency. If any disorder exists in your structure, it accelerates instability.

The Human Element

There is another dimension that is often overlooked. Operational dependability is not purely technical.

It is cultural. Teams that value documentation. Teams that respect change control. Teams that escalate early rather than conceal mistakes. These are the teams that innovate sustainably.

When people feel pressured to deliver visible innovation at the expense of quiet stability work, corners are cut. Over time, trust erodes. The strongest IT environments are not the most automated. They are the most accountable.

Accountability > Automation.

Redefining Success

Perhaps the biggest shift required is revising how success is measured.

Not only by how many AI initiatives were launched.
Not only by how many systems were modernized.
But by how many incidents were prevented.
How many risks were mitigated before they happened.
How stable the environment remained during the transformation.

Innovation that destabilizes is not progress. It is a deferred cost.

The State of IT Today

We are not short of tools. We are not short of ambition. What many organizations lack is calibrated pacing.

Balancing innovation and stability is not about slowing down. It is about strengthening the base before increasing velocity. In these uncertain times, the temptation to move fast is understandable. The discipline to move deliberately is what will separate resilient IT teams from reactive ones.

Innovation should expand capability. Stability ensures that expansion does not collapse under its own weight.