Leading a Technology Team Through Platform Migration
Platform migrations are where good technology leaders earn their reputation or lose it. I have led five major platform migrations over the course of my career, and each one taught me something new about how these projects go wrong. The technical challenges are real, but they are rarely what kills a migration. What kills migrations is scope creep, parallel running costs that erode executive patience, and team fatigue from maintaining two systems at once. Understanding these failure modes is the prerequisite for avoiding them.
Why Migrations Fail
The most common failure pattern starts with optimism. The team scopes the migration, estimates the timeline, and adds a buffer for unknowns. Leadership approves. Work begins. Then, around month three, someone realizes that the new platform does not handle a specific edge case that the old platform manages through a workaround nobody documented. Fixing this requires extending the timeline. While that is happening, the business asks for a new feature, and now the team has to decide: do we build it on the old platform (knowing we are migrating away) or the new platform (which is not ready yet)? Both answers are bad, and this decision point is where scope creep enters.
Parallel running is the other killer. For the duration of the migration, you are operating two platforms. That means double the infrastructure costs, double the monitoring, and a team that is context-switching between two codebases. Every week of parallel running drains budget and morale. I have seen migrations that were technically on track but got canceled because the organization could not stomach another quarter of dual-platform costs.
Team fatigue is the quietest risk and the hardest to manage. Migrations are not exciting work for most engineers. You are rebuilding something that already exists. The dopamine hit of creating something new is absent. Combine that with the pressure of maintaining the old system while building the new one, and you get a team that is stretched thin and increasingly disengaged. I have watched senior engineers leave mid-migration because they did not sign up to spend a year porting business logic from one framework to another.
The Cutover Planning Process
Cutover planning should start on day one of the migration, not in the final weeks. The cutover plan is not just a checklist of steps to execute on go-live day. It is a comprehensive document that covers data migration sequencing, rollback triggers and procedures, communication plans for internal teams and external customers, performance baselines and monitoring thresholds, and a go/no-go decision framework.
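Those sections lend themselves to a structured checklist rather than prose alone, so incomplete areas are visible at a glance. A minimal sketch in Python; the section names and fields are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field

# Illustrative sketch: the cutover plan's sections as structured data,
# so each area can be tracked and reviewed rather than buried in prose.
@dataclass
class CutoverPlan:
    data_migration_steps: list[str] = field(default_factory=list)   # ordered sequencing
    rollback_triggers: list[str] = field(default_factory=list)      # conditions forcing rollback
    comms_plan: dict[str, str] = field(default_factory=dict)        # audience -> channel/message
    performance_baselines: dict[str, float] = field(default_factory=dict)  # metric -> threshold
    go_no_go_criteria: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        # A plan with any empty section is not ready for rehearsal.
        return all([
            self.data_migration_steps,
            self.rollback_triggers,
            self.comms_plan,
            self.performance_baselines,
            self.go_no_go_criteria,
        ])

plan = CutoverPlan(data_migration_steps=["freeze writes", "copy data", "verify counts"])
print(plan.is_complete())  # False: several sections are still empty
```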
I structure cutover planning around what I call “migration rehearsals.” Before the actual cutover, we run the entire process in a staging environment at least three times. The first rehearsal is slow and messy; its purpose is to identify gaps in the plan. The second is about timing: understanding how long each step actually takes versus our estimates. The third is the dress rehearsal, executed at production pace with the actual on-call team.
Each rehearsal produces a detailed log of what happened, how long it took, and what went wrong. These logs are invaluable. They reveal dependencies you did not know existed, steps that take ten times longer than estimated, and failure modes that only appear under realistic conditions. I have never done a migration rehearsal that did not change the cutover plan in some material way.
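The estimate-versus-actual comparison from those logs is worth making mechanical, so large overruns cannot be glossed over. A minimal sketch, with step names and durations invented for illustration:

```python
# Illustrative sketch: flag rehearsal steps that ran far over their
# estimates -- exactly the gaps rehearsals exist to surface.
# All step names and durations are made up.
estimates_min = {"snapshot db": 30, "replay events": 45, "smoke tests": 20}
actuals_min   = {"snapshot db": 35, "replay events": 310, "smoke tests": 25}

overruns = {
    step: actuals_min[step] / estimates_min[step]
    for step in estimates_min
    if actuals_min[step] > 2 * estimates_min[step]  # flag anything more than 2x estimate
}
for step, ratio in overruns.items():
    print(f"{step}: {ratio:.1f}x over estimate -- update the cutover plan")
```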
The rollback plan deserves special attention. Every migration needs a clearly defined point of no return: the moment after which rolling back is no longer feasible because data has diverged too far between the old and new systems. Before that point, you need tested, documented rollback procedures. After that point, you need a forward-fix strategy. The worst possible outcome is being halfway through a cutover, hitting a problem, and not knowing whether to push forward or pull back.
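That decision framework can be stated almost mechanically. A sketch under the assumption that cutover steps are numbered and the point of no return is identified in advance; both are hypothetical here:

```python
# Illustrative sketch of the rollback decision framework: before the
# point of no return a failure triggers rollback, after it the only
# option is forward-fix. The step numbering is hypothetical.
POINT_OF_NO_RETURN = 4  # e.g. the step where writes land only in the new system

def on_failure(current_step: int) -> str:
    if current_step < POINT_OF_NO_RETURN:
        return "rollback: execute the tested rollback procedure"
    return "forward-fix: data has diverged, rollback is no longer feasible"

print(on_failure(2))
print(on_failure(6))
```

The value of writing it down this crudely is that nobody debates the question mid-cutover; the answer was decided in advance.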
Feature Parity vs. Feature Improvement
This is the single most important strategic decision in any migration, and getting it wrong is the primary driver of scope creep. The question is simple: does the new platform need to do everything the old platform does, or is the migration an opportunity to improve and rationalize features?
My strong default is to target feature parity for the initial migration and defer improvements to a post-migration phase. The reasoning is pragmatic. Every feature improvement you add to the migration scope increases the timeline, increases the testing surface, and introduces uncertainty about whether a post-migration bug is a migration issue or a new-feature issue. Feature parity gives you a clean baseline. Once you are running on the new platform, you can iterate on improvements with a stable foundation underneath you.
The exception is when the migration is specifically motivated by a capability gap. If you are migrating to a new platform because the old one cannot support a critical business requirement, then that specific capability is in scope for the migration. Everything else is not.
In practice, enforcing this boundary requires discipline and executive support. Business stakeholders will see the migration as an opportunity to get their pet features built. Engineers will want to “do it right this time” and redesign systems that work fine but offend their sensibilities. Both impulses are understandable and both will sink your timeline. I maintain a public backlog of “post-migration improvements” and add every request to it visibly. This acknowledges the request without letting it into the migration scope.
Managing Stakeholder Expectations
Communication during a migration needs to be more frequent and more honest than most leaders are comfortable with. I send weekly migration status updates to all stakeholders, and the format is deliberately simple: what we completed this week, what we plan to complete next week, what risks have emerged, and the current projected go-live date.
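The format is simple enough to generate mechanically, which keeps updates consistent from week to week. A minimal sketch with placeholder content:

```python
# Illustrative sketch of the weekly status format described above;
# all field contents are placeholders.
def weekly_update(done, planned, risks, go_live_range):
    return (
        f"Completed this week: {'; '.join(done)}\n"
        f"Planned next week: {'; '.join(planned)}\n"
        f"New risks: {'; '.join(risks) or 'none'}\n"
        f"Projected go-live: {go_live_range}"
    )

print(weekly_update(
    done=["order service ported"],
    planned=["billing cutover rehearsal"],
    risks=[],
    go_live_range="a two-week range, updated weekly",
))
```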
The projected go-live date is the most sensitive element. My approach is to provide a date range rather than a single date, and to update it honestly as the migration progresses. If the range is shifting right, I say so immediately along with the reason. Surprises kill trust, and trust is the resource you need most during a migration. When things go wrong, and they will, you need stakeholders who believe you when you say the issue is manageable. That belief comes from a track record of honest communication, not from optimistic status reports.
I also hold monthly migration demos where we show the new platform in its current state to business stakeholders. These serve two purposes. First, they build confidence that progress is real and tangible, not just a status report claiming thirty percent completion. Second, they surface problems early. A stakeholder watching a demo will catch user experience issues, missing workflows, and data display problems that would otherwise not emerge until user acceptance testing, which is far too late.
Hard Deadlines as Forcing Functions
I am a strong believer in setting hard deadlines for migrations, even when the team pushes back on them. The reason is that migrations without deadlines never finish. There is always one more edge case to handle, one more data quality issue to clean up, one more integration to validate. Without a forcing function, the migration enters a limbo state where it is perpetually “almost done.”
The hard deadline should be tied to something real: a contract expiration for the old platform, a compliance requirement, a business initiative that depends on the new platform’s capabilities. If no natural deadline exists, create one. I have committed to external partners that we would be on the new platform by a specific date, explicitly burning the boats to create urgency.
The deadline also forces priority decisions that would otherwise be avoided. When you have unlimited time, everything is important. When you have twelve weeks, the team quickly identifies what actually matters for a successful cutover versus what would be nice to have. Those priority conversations are uncomfortable but essential, and they only happen under time pressure.
The risk of hard deadlines is shipping something that is not ready. I mitigate this with clear quality gates: specific, measurable criteria that must be met before cutover. The deadline is firm, but if quality gates are not met, the cutover does not happen. This creates productive tension between speed and quality, which is better than the alternative of optimizing for one at the expense of the other.
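The interaction between the firm deadline and the quality gates reduces to a simple rule: the date does not move, but the cutover does not happen unless every gate passes. A sketch with hypothetical gate names and results:

```python
# Illustrative sketch: go/no-go is the conjunction of all quality gates.
# Gate names and results are hypothetical.
quality_gates = {
    "data_reconciliation": True,    # row counts and checksums match across platforms
    "p95_latency_baseline": True,   # new platform within the old platform's baseline
    "rollback_rehearsal": False,    # tested rollback not yet demonstrated
}

failed = [name for name, passed in quality_gates.items() if not passed]
if failed:
    print("NO-GO: failed gates:", ", ".join(failed))
else:
    print("GO: all quality gates met")
```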
Post-Migration Retrospectives That Actually Work
Most retrospectives are forgettable. The team gathers, shares what went well and what did not, creates a list of action items, and then nobody follows up on them. Migration retrospectives need to be better than this, because the lessons are expensive and the next migration will happen sooner than you think.
I run migration retrospectives in two phases. The first phase happens one week after cutover, while details are fresh. This session focuses on the tactical: what went wrong during cutover, what was missing from the plan, which estimates were off and by how much. I document these with specifics, not “communication could have been better” but “the database migration took four hours instead of the estimated ninety minutes because the index rebuild on the transactions table was not accounted for.”
The second phase happens six weeks after cutover, once the team has lived with the new platform long enough to evaluate the strategic decisions. This session asks harder questions. Was the migration worth it? Did we achieve the business outcomes that justified the project? What capabilities did we lose that we did not expect to lose? What would we do differently if we were starting over?
The output of both sessions goes into a migration playbook that I maintain across projects. This playbook is one of the most valuable documents I own. It contains specific, context-rich lessons from real migrations, not generic best practices from a blog post. When the next migration starts, and it always does, the playbook is the first thing the team reads. It does not prevent every mistake, but it prevents the same mistakes from happening twice, and in platform migrations, that is worth its weight in gold.