Data Migration: Understanding the Process and Avoiding Common Mistakes
Introduction
Data migration is the process of moving critical data from one system to another. It requires careful planning, execution, and verification to ensure that the data arrives correctly and without loss. Even so, data migration projects have a high failure rate, with nearly 40% of projects running over time, running over budget, or failing entirely. In this article, we define data migration, look at the types and phases involved, and explore the common mistakes that lead to project failure. We also discuss the responsibility gap and the role it plays in most failing data migration projects.
Definition of Data Migration
Data migration is the selection, preparation, extraction, transformation, and permanent movement of appropriate data that is of the right quality to the right place at the right time, and the decommissioning of legacy data stores. This definition covers both the technical stages and the project activities involved: choosing which data sources are in scope, preparing the data so that it meets the required quality, extracting and transforming it, moving it permanently to the target, and finally retiring the legacy stores it came from.
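The stages named in the definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the legacy rows, the field names, and the rule that inactive accounts are out of scope are all hypothetical, and a real migration would read from and write to actual data stores rather than in-memory structures.

```python
# Hypothetical legacy rows; the field names are assumptions for illustration.
LEGACY_CUSTOMERS = [
    {"id": 1, "name": " Ada Lovelace ", "email": "ADA@EXAMPLE.COM", "active": True},
    {"id": 2, "name": "Old Test Account", "email": "", "active": False},
    {"id": 3, "name": "Grace Hopper", "email": "grace@example.com", "active": True},
]


def select(rows):
    """Selection: keep only the records judged appropriate to migrate."""
    return [r for r in rows if r["active"]]


def transform(row):
    """Preparation and transformation: normalise values to the target's rules."""
    return {
        "customer_id": row["id"],
        "full_name": row["name"].strip(),
        "email": row["email"].strip().lower(),
    }


def migrate(rows, target):
    """Extract the selected rows, transform them, and load them into the target."""
    for row in select(rows):
        record = transform(row)
        target[record["customer_id"]] = record
    return target


target_store = migrate(LEGACY_CUSTOMERS, {})
print(sorted(target_store))  # [1, 3]: the inactive test account was not selected
```

Note that the selection rule is a business decision expressed in code: the script cannot know on its own which records are "appropriate".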
Types and Phases of Data Migration
Data migrations are commonly grouped into five types, each with its own requirements: database migration, storage migration, business process migration, application migration, and cloud migration.
Whatever the type, the process has three phases: plan, execute, and verify. The planning phase is the most critical. It involves assessing and cleaning the source data, analyzing business requirements and dependencies, developing and testing migration scenarios, and codifying a formal data migration plan. The execution phase carries out the migration according to that plan. The verification phase validates that the data arrived as intended, and only then are the old systems decommissioned.
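As an illustration of the verification phase, the sketch below compares each source row with its counterpart in the target by hashing the row's contents. It is a minimal example, assuming both sides can be read as lists of dictionaries that share an "id" key and that the source rows have already been put into the target's expected shape; the function names are hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's values in a stable field order so rows can be compared."""
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_migration(source_rows, target_rows, key="id"):
    """Return the keys that are missing from, or differ in, the target."""
    source = {r[key]: row_fingerprint(r) for r in source_rows}
    target = {r[key]: row_fingerprint(r) for r in target_rows}
    missing = sorted(k for k in source if k not in target)
    changed = sorted(k for k in source if k in target and source[k] != target[k])
    return {"missing": missing, "changed": changed}


source = [{"id": 1, "total": 10}, {"id": 2, "total": 20}, {"id": 3, "total": 30}]
target = [{"id": 1, "total": 10}, {"id": 2, "total": 25}]

report = verify_migration(source, target)
print(report)  # {'missing': [3], 'changed': [2]}
```

A non-empty report is a signal to keep the legacy system available; decommissioning belongs after the report comes back clean.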
Common Mistakes in Data Migration
Organizations tend to make the same mistakes when embarking on a data migration project, and these mistakes account for much of the high failure rate. They include:
Techno-Centricity: Seeing data migration as a purely technical problem and not involving the business in the process.
Lack of Specialist Skills: Staffing the project without data migration analysts who combine business-facing skills, technical understanding, project leadership skills, and an understanding of formal processes.
Underestimating: Committing to a schedule and budget without knowing the scale of the work involved, especially the amount of data preparation required.
Uncontrolled Recursion: Letting problems get batted back and forth across an unnecessary boundary between the project and the business, so that they are never resolved.
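One way to reduce the risk of underestimating is to profile a sample of the legacy data before committing to an estimate. The sketch below counts missing and duplicated values per field. It assumes the sample is available as a list of dictionaries, and the field names are hypothetical; the counts are a rough indicator of preparation work, not a full data quality assessment.

```python
from collections import Counter


def profile(rows, fields):
    """Count missing and duplicated values per field in a sample of legacy rows."""
    summary = {}
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        summary[field] = {
            "missing": len(values) - len(present),
            # Every repeat beyond the first occurrence counts as one duplicate.
            "duplicates": sum(n - 1 for n in counts.values() if n > 1),
        }
    return summary


sample = [
    {"id": 1, "email": "a@example.com", "postcode": "AB1 2CD"},
    {"id": 2, "email": "", "postcode": None},
    {"id": 3, "email": "a@example.com", "postcode": "AB1 2CD"},
    {"id": 4, "email": "b@example.com"},
]

result = profile(sample, ["email", "postcode"])
for field, stats in result.items():
    print(field, stats)
```

Whether a duplicate or a blank is actually a problem depends on the field, which is again a question for the business rather than for the script.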
The Responsibility Gap
The responsibility gap is a common issue in data migration projects. It arises when there is a disconnect between the business and the technical teams responsible for the migration, so that decisions about the data are owned by neither side. The naive view treats data migration as moving data from legacy systems to the target system, with extraction, transformation, and loading in between. In practice, legacy databases often hold data of questionable value, and semantic issues arise: genuine disagreements about the definition of a business term or about how a field in a corporate system is used. Resolving these disagreements is a business decision, not a technical one. Technologists can only implement a definition once the business has agreed on it.
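A semantic issue can be made visible in code by refusing to guess. In the sketch below, a mapping holds only the definitions the business has agreed on, and any row whose value is disputed or undocumented is set aside for a business decision instead of being silently converted. The status codes, their meanings, and the department names are hypothetical.

```python
# Hypothetical mapping agreed with the business.
# None marks a code whose meaning is still disputed.
AGREED_STATUS_MAP = {
    "A": "active",
    "C": "closed",
    "P": None,  # "pending" to Sales, "prospect" to Marketing: not yet agreed
}


def translate_status(rows, mapping):
    """Apply agreed definitions and collect rows that need a business decision."""
    migrated, unresolved = [], []
    for row in rows:
        meaning = mapping.get(row["status"])
        if meaning is None:
            unresolved.append(row)
        else:
            migrated.append({**row, "status": meaning})
    return migrated, unresolved


legacy = [
    {"id": 1, "status": "A"},
    {"id": 2, "status": "P"},
    {"id": 3, "status": "X"},  # a code nobody has documented
]

migrated, unresolved = translate_status(legacy, AGREED_STATUS_MAP)
print([r["id"] for r in migrated], [r["id"] for r in unresolved])  # [1] [2, 3]
```

The unresolved list is the work that has to cross to the business side. The technical team can produce it, but cannot empty it alone.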
How to Avoid the Responsibility Gap
To avoid the responsibility gap, involve the business in the data migration process from the start and make sure there is a shared understanding of the data being moved. In practice, this means the business takes part in assessing and cleaning the source data, analyzing requirements and dependencies, and agreeing the formal data migration plan, rather than receiving these as finished technical deliverables. It also means staffing the project with data migration analysts who combine business-facing skills, technical understanding, project leadership skills, and an understanding of formal processes. Finally, estimate the full scale of the work, including data preparation, and make sure semantic issues are identified and resolved by the people who own the definitions.
Conclusion
Data migration requires careful planning, execution, and verification, and a large share of projects still run over time, run over budget, or fail entirely. To avoid the common mistakes, involve the business in the process, staff the project with data migration analysts who have a mix of business-facing and technical skills, estimate the scale of the activities realistically, and make sure the semantic issues are clearly understood and resolved. By following these practices, organizations can increase their chances of success in data migration projects.
References:
Morris, J. Practical Data Migration. BCS Learning & Development Limited, 2012.
Qlik. Data Migration. 2023. Available at https://www.qlik.com/us/data-migration.