How to Verify Data Integrity During Data Migration

Source: Bilibili  Time: 2023-07-10 09:52:50

Introduction简介

Due to the dynamic nature of technological change and data integrity requirements, companies are faced with the challenge of frequently migrating data from legacy systems.

基于不断变化的技术新变革和数据完整性新需求,公司常常面对着如何从旧有系统系统中迁移数据的挑战。



Data migration can be very challenging and can create data integrity issues if not executed with an adequate strategy.

如果没有一个合适的指导策略,数据迁移可能会存在风险,并且引起数据完整性问题。

One of the biggest challenges is that companies fail to understand the magnitude and impact of migrating data from legacy systems while maintaining data integrity.

其中一个最大最普遍的问题就是在评估数据迁移行动中的数据完整性风险时,受影响公司常常没有正确评估从旧有系统迁移数据的体量和其造成的影响。

Prior to the execution of projects that require data migration, companies need to understand the risk impact of the legacy data that requires migration.

在实施数据迁移项目前,公司应该充分理解评估从旧有系统迁移数据引起的关联影响。

Based on the risk impact of the legacy data, companies need to define a data migration strategy that enables compliance with data integrity regulatory requirements.

基于旧有数据迁移的风险评估,公司需要定义数据迁移策略来确保行动全程的数据完整性合规性。

This article discusses data integrity strategies during the data migration of legacy systems.

本文将主要讨论旧有系统数据迁移过程中数据完整性策略:

Data Migration Challenges

数据迁移过程中的风险挑战

The drivers for migrating data from legacy systems may differ from one company to another, but they typically include the following:

驱使公司进行旧有系统数据迁移的原因可能各不相同,但典型理由如下:

Obsolete technology

技术更新换代

New regulatory requirements that legacy systems are unable to meet

新法律法规导致旧有系统无法满足合规性需求

Vendor no longer supporting the technology

Note: for example, Microsoft stopped supporting Windows XP after midnight Eastern Time on April 8.

Vendor goes out of business

供应商破产

Vendor is taken over by another business

供应商业务被第三方收购

Business decision to replace legacy system

高层商业决定,决定旧有系统淘汰

There are several challenges that need to be well understood and managed to reduce or eliminate any data integrity compliance risk.

以下有关数据迁移的事项,需要特别注意以减少或者消除数据完整性合规风险

Some of these challenges include the following:

包括如下:

Clear understanding of which data must be migrated

Note: for example, master files, metadata, audit trail records, etc.

Clear understanding of the risk impact of the data

Note: for example, the impact on business, function, and process.

Ensuring data integrity during the migration process

Note: for example, test first in a test environment with live data, then run in parallel, then cover the go-live period.

Defining an adequate data migration strategy

定义一个合适的数据迁移策略

Defining an adequate data migration verification strategy

Note: for example, a data availability test and a data integrity test.

Another challenge is developing a clear understanding of the type of data that will be migrated, including the following attributes:

另一方面,我们需要从以下方面对所迁移数据的数据类型进行评估:

Legacy data format

Note: is the data read-only or editable? Is it open or encrypted?

New system data format

新系统要求数据的格式

Legacy data size

旧有数据的大小容量

Legacy data regulatory impact

Note: for example GxP, privacy information, customs, SOX for US business, and GDPR for EU business.

Legacy data retention period  

Note: for example, the shelf life of the drug product and the definition given in GDP.

One of the biggest issues with data migration is that companies normally do not understand the data that needs to be migrated. As a result, companies very often end up migrating data that does not need to be migrated to the new environment. This issue is driven by a lack of understanding of the record retention requirements of the legacy data. Data migration should only include data that is currently under record retention requirements and must be migrated to the new system. Migrating only relevant data enables efficiency and cost control during data migration projects.

Note: companies often finish the project by simply migrating all data from the old system. Not all data needs to enter the new system, and migrating too much useless data (1) increases the workload of post-migration data integrity testing and (2) slows down the new system and lengthens backup times.
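The retention-based scoping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LegacyRecord` type, its `retention_years` field, and the rule that a record is in scope until the anniversary of its creation date are all hypothetical, and real retention rules come from company policy and the applicable regulations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class LegacyRecord:
    """Hypothetical legacy record; field names are illustrative only."""
    record_id: str
    created_on: date
    retention_years: int  # assumed to come from the retention policy


def retention_end(record: LegacyRecord) -> date:
    """Return the date on which the record's retention period ends."""
    end_year = record.created_on.year + record.retention_years
    try:
        return record.created_on.replace(year=end_year)
    except ValueError:
        # created_on is Feb 29 and end_year is not a leap year
        return record.created_on.replace(year=end_year, day=28)


def select_for_migration(records: Iterable[LegacyRecord], as_of: date) -> List[LegacyRecord]:
    """Keep only records that are still under retention on the given date."""
    return [r for r in records if retention_end(r) >= as_of]


if __name__ == "__main__":
    records = [
        LegacyRecord("BATCH-001", date(2015, 1, 1), 5),
        LegacyRecord("BATCH-002", date(2020, 6, 1), 10),
    ]
    for rec in select_for_migration(records, as_of=date(2023, 7, 10)):
        print(rec.record_id)
```

Records that fall outside the retention window would then be candidates for disposal or archiving under the company's own procedures rather than for migration.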

The risk impact of the legacy data needs to be very well understood prior to data migration. Data that has a direct impact on Critical Process Parameters (CPP) and Critical Quality Attributes (CQA) is considered critical data with a high risk impact. The risk impact of the legacy data is a critical input into the data migration strategy and planning.

Note: for example, before preparing the Data Migration Plan, complete the blueprints of the process flow and a high-risk field assessment according to CPP and CQA.

Maintaining data integrity during data migration projects is critical from a compliance and business perspective. Data integrity is a critical regulatory requirement that needs to be part of the data migration strategy. The data migration strategy needs to be documented along with the deliverables required to execute the data migration. The data migration strategy needs to ensure that the data will not be altered or lost during the data migration activities. The data migration strategy needs to include alerts and notifications of errors and failures during the data migration process.

Note: for example, to guard against potential data loss, the legacy system can keep running in parallel during the first three months of trial operation after the new system goes live, and can later be frozen rather than deleted outright.

To ensure that the goals of the data migration strategy are properly executed, companies need to develop a solid data migration verification process. The data migration verification activities are intended to demonstrate and provide documented evidence that data integrity was maintained during the data migration process.

Note: for example, a complete chain of documented evidence includes the Data Migration Plan, the Impact and Risk Assessment, the Re-Qualification Test Protocol and Report, etc.

Data formats are a critical attribute that needs to be very well understood prior to data migration and included as an input in the migration strategy. Compatibility issues related to data format are a common problem often overlooked by companies prior to data migration.

Note: for example, in pure GMP manufacturing, QA, Production, and QC can initiate a data migration as process owners, but the feasibility assessment of the new system should preferably involve IT as system owner, Engineering, and a technical specialist from the vendor.

The size of the legacy data is another critical attribute that needs to be well understood prior to data migration. The size of the legacy data can be affected by any data conversion that needs to occur during the migration process. When migrating data to a cloud environment, companies need to clearly understand the legacy data size requirements and the future disk and memory space needed to manage the data during its lifecycle.

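Note: put simply, the design of the new system should not consider only current needs; it should account for data growth over the next 5 to 10 years. In addition, because of commercial data sensitivity and customer data privacy requirements, cloud services are not very widespread among pharmaceutical companies, which tend not to entrust data storage to third parties and run software such as ERP themselves. Banking and insurance, as early adopters of IT, have gradually adopted cloud services, and even SaaS (Software as a Service, sometimes described as "on-demand software") in place of ERP. One benefit is that the service provider completes each upgrade once, centrally, which spares customers the re-validation work that frequent system upgrades would otherwise bring.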
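A rough capacity estimate along these lines can be written as a small Python helper. It is only a sketch: the compound-growth model, the `conversion_factor` parameter (how much the data grows or shrinks when converted to the new format), and every number in the usage example are assumptions for illustration, not figures from the article.

```python
def projected_size_gb(
    current_gb: float,
    annual_growth_rate: float,
    years: int,
    conversion_factor: float = 1.0,
) -> float:
    """Estimate future storage need under a simple compound-growth assumption.

    current_gb         -- size of the legacy data today
    annual_growth_rate -- assumed yearly growth, e.g. 0.15 for 15 %
    years              -- planning horizon in years
    conversion_factor  -- assumed size change caused by format conversion
    """
    if current_gb < 0 or years < 0 or conversion_factor <= 0:
        raise ValueError("sizes and horizon must be non-negative, factor positive")
    return current_gb * conversion_factor * (1.0 + annual_growth_rate) ** years


if __name__ == "__main__":
    # Hypothetical inputs: 500 GB today, 15 % growth per year, 10-year horizon,
    # and a 20 % size increase from converting to the new format.
    estimate = projected_size_gb(500.0, 0.15, 10, conversion_factor=1.2)
    print(f"Estimated storage need: {estimate:.0f} GB")
```

The growth rate and conversion factor would normally be measured from the legacy system's history and from a trial conversion rather than guessed.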

In summary, one of the biggest challenges with data migration projects is the lack of adequate understanding of the task, including planning, strategy, and the critical data attributes that need to be well managed and understood.

总之,数据迁移项目面临的最大挑战之一是缺乏对此任务的充分理解,包括需要妥善管理和理解的关键数据属性的规划,策略和理解。

The next section of this article provides strategies and ideas for managing and controlling the data integrity risk associated with data migration projects.

本文的下一部分将提供有关管理和控制与数据迁移项目相关的数据完整性风险的策略和理念。

Data Migration Strategies & Solutions

数据迁移策略与方案

The migration strategy and planning can be initiated once companies develop an understanding of the data that needs to be migrated and its risk and regulatory impact.

当公司充分了解需要迁移的数据,风险和监管影响,就可以启动迁移策略和规划的制定。

During strategy and planning, companies need to evaluate their options prior to data migration, which include the following:

在策略和规划期间,公司需要在数据迁移实施之前评估他们的选择,其中包括以下内容

Archiving归档

Keeping legacy data read-only将旧有数据转为“只读性”

Hybrid approach纸质和电子并行

Full migration整体迁移

Data archiving is the process of moving data that is no longer actively used to a separate storage location for long-term retention. Before considering archiving, companies need to consider the following factors:

数据归档是将不再主动使用的数据移动到单独的存储位置进行长期保留的过程。在考虑归档之前,公司需要考虑以下因素:

How often the company needs to access the data

Note: archived data must be re-installed before it can be accessed, so archiving is not recommended for data that is accessed very frequently.

Criticality of the data

数据的重要性

High amount of data manipulation required to move to the new solution

Note: the more data is archived, the less has to be migrated, and the easier the migration is to execute and verify.

Project budget constraints

项目预算限制

Regulatory impact

Note: refer to the data retention period in company policy and in regulatory requirements.

Data must be retrievable and accessible during retention period

Note: archived data must be re-installed before it can be accessed, so both the re-install step and the integrity of the archived data must be verified.

Keeping the data read-only also requires consideration of the following factors:

保持(迁移)数据只读性主要基于以下考虑:

Cost associated with maintenance and licenses

Note: once the legacy system is frozen and its data kept read-only, maintenance and license costs drop considerably.

Limiting access to data 

限制数据的访问

No data changes

Note: when the legacy system is frozen and retired, read-only status protects the integrity of that data.

Data record retention requirements

数据保留的要求

Archiving and making data read-only are options that should be considered as part of the strategy and planning phase of data migration projects.

在数据迁移项目中,将数据归档和数据转换为只读应作为战略和计划阶段的一部分。

Part of the data migration strategy and planning requires documented deliverables that are intended to define and document the data migration strategy and related verification.

数据迁移策略和规划的一部分需要记录的可交付成果,旨在定义和记录数据迁移策略和相关验证。

For data migration projects, companies need to create and execute the following deliverables:

可交付成果的举例如下:

Data risk and impact assessment数据风险及影响评估

Data Migration Plan数据迁移计划

Data Migration Protocol数据迁移方案

Data Migration Summary Report数据迁移总结报告

The data risk impact assessment is a critical activity that needs to be documented very early, prior to any data migration activities. The data risk impact assessment is intended to provide the risk impact and level of the data that needs to be migrated. The data risk impact assessment is a key input to the overall migration strategy and plan.

数据风险及影响评估是一项重要的活动,需要在任何数据迁移活动之前尽早实施。数据风险影响评估旨在提供需要迁移的数据的风险影响和级别。数据风险影响评估是整体迁移策略和计划的重要内容。

Once the data risk and impact assessment is completed, the Data Migration Plan can be created. The Data Migration Plan is intended to define and document the overall migration strategy and deliverables.

一旦数据风险和影响评估完成,即可制定数据迁移计划。数据迁移计划旨在定义和记录整个迁移策略和可交付成果。

The Data Migration Plan should include the following information:

数据迁移计划应包含以下信息:

Purpose目的

Scope范围

Roles and Responsibilities角色与责任

Migration Strategy迁移策略

Deliverables可交付文件

Data Migration Verification Strategy数据迁移验证策略

Data Risk Impact Assessment数据风险及影响评估

Identify Data to be Migrated将被迁移数据

Migration Tools迁移工具(Technical Part by IT or Vendor)

Sampling Strategy取样策略(Technical Part by IT or Vendor)

Acceptance Criteria验收标准

Deviation Handling偏差处理

The Data Migration Plan is a key deliverable of the project that must be approved prior to initiating any migration activities. The Data Migration Plan requires Quality review and approval. The plan needs to define the sampling strategy, which should be driven by the risk impact of the data. 100% sampling is generally not feasible, so standards such as ANSI/AQL should be used to define the risk-based sampling strategy. The data migration verification strategy needs to provide evidence that the migration was successfully completed and data integrity was maintained. The data migration verification strategy should define any tools that will be used for the migration activities. Tools or techniques that can be used for data migration verification include source vs. target data integrity checks, which can be performed using the following tools:

Note: ANSI is the American National Standards Institute, an organization that approves standards across many industries. AQL here refers to the Acceptance Quality Limit used in acceptance sampling, not to SQL (Structured Query Language). The tools listed below are software tools for verifying data integrity during and after migration.

Cryptographic hash function加密哈希函数

CheckSum总和检查码
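The risk-based sampling strategy that the Data Migration Plan must define can be sketched as a stratified random sample. The percentages below are hypothetical placeholders, not values taken from an ANSI/AQL table; in a real project the sample sizes would be derived from the chosen sampling standard and approved in the plan. The function and variable names are likewise illustrative.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical sampling percentages per risk level (NOT from an ANSI/AQL table).
SAMPLE_PERCENT: Dict[str, int] = {"high": 100, "medium": 20, "low": 5}


def sample_size(population: int, percent: int) -> int:
    """Round the sample size up, and verify at least one record per non-empty group."""
    if population == 0:
        return 0
    return max(1, -(-population * percent // 100))  # integer ceiling division


def select_sample(
    record_ids_by_risk: Dict[str, Sequence[int]], seed: int = 0
) -> Dict[str, List[int]]:
    """Pick a reproducible random sample of record IDs for each risk level."""
    rng = random.Random(seed)  # fixed seed so the selection can be re-created
    sample: Dict[str, List[int]] = {}
    for risk, ids in record_ids_by_risk.items():
        count = sample_size(len(ids), SAMPLE_PERCENT[risk])
        sample[risk] = sorted(rng.sample(list(ids), count))
    return sample


if __name__ == "__main__":
    groups = {
        "high": list(range(10)),
        "medium": list(range(100, 200)),
        "low": list(range(200, 300)),
    }
    for risk, ids in select_sample(groups, seed=42).items():
        print(risk, len(ids))
```

Recording the seed alongside the sample makes the selection repeatable, which helps when the verification evidence is reviewed later.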

A cryptographic hash function is a mathematical function that takes an input of any size and returns as output an alphanumeric digest of a fixed size. Any alteration to the input will drastically change the digest. Cryptographic hash functions can be used to compute a digest of a data set at the source and then compute a digest at the target; matching digests indicate that the data set arrived unaltered.

Note: the digest is used to confirm that no data was lost or altered during the migration. A checksum serves the same purpose through a different technique.
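A source-versus-target comparison of this kind might look as follows in Python, using SHA-256 from the standard `hashlib` module and CRC-32 from `zlib` as the checksum. This is a minimal sketch under assumptions: rows are modeled as tuples of strings, the digest is made independent of row order by sorting, and the function names are hypothetical. A real migration would also need to agree on how values are normalized (encodings, date formats, trailing spaces) before hashing.

```python
import hashlib
import zlib
from typing import Iterable, Tuple

Row = Tuple[str, ...]


def dataset_digest(rows: Iterable[Row]) -> str:
    """Return a SHA-256 hex digest of a data set, independent of row order."""
    hasher = hashlib.sha256()
    for row in sorted(rows):
        # Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        hasher.update(len(row).to_bytes(4, "big"))
        for field in row:
            encoded = field.encode("utf-8")
            hasher.update(len(encoded).to_bytes(8, "big"))
            hasher.update(encoded)
    return hasher.hexdigest()


def crc32_checksum(data: bytes) -> int:
    """Return a CRC-32 checksum; detects accidental corruption, not tampering."""
    return zlib.crc32(data) & 0xFFFFFFFF


def migration_verified(source_rows: Iterable[Row], target_rows: Iterable[Row]) -> bool:
    """True when source and target data sets produce the same digest."""
    return dataset_digest(source_rows) == dataset_digest(target_rows)


if __name__ == "__main__":
    source = [("B-001", "pH", "7.2"), ("B-002", "pH", "7.4")]
    target = [("B-002", "pH", "7.4"), ("B-001", "pH", "7.2")]
    print("digests match:", migration_verified(source, target))
```

A cryptographic digest is the stronger check of the two; a CRC-style checksum is cheaper to compute but is only meant to catch accidental transfer errors.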

The Data Migration Protocol is a critical deliverable that is intended to provide documented evidence that migration activities were successfully executed and that they meet the established acceptance criteria. The Data Migration Protocol is an executable document that needs to include how deviations and failures will be managed during execution.

数据迁移方案是一项重要的交付项目,旨在提供书面证据证明迁移活动已成功执行并且符合既定的接受标准。数据迁移方案是一个可执行文件,需要包括在执行过程中如何管理偏差和失败。

The Data Migration Protocol should include the following information:

数据迁移方案应包含以下信息:

Purpose目的

Scope范围

Roles & Responsibilities角色与职责

Data Sampling Approach Strategy 数据抽样策略

Migration Verification Tools迁移验证工具

Migration Verification Strategy 迁移验证策略

Deviations偏差

Acceptance Criteria验收标准

Once the Data Migration Protocol is executed, a Data Migration Summary Report needs to be created. The Data Migration Summary Report is intended to summarize the results and deviations found during the execution of the protocol.

当执行数据迁移方案被执行完成,就需要创建数据迁移总结报告。数据迁移总结报告旨在总结执行方案期间发现的结果和偏差。

The Data Migration Summary Report should include the following information:

数据迁移总结报告应包含以下信息:

Purpose目的

Scope范围

Roles & Responsibilities角色与职责

Migration Verification Results迁移验证结果

Migration Verification Summary迁移验证总结

Deviation Summary偏差总结

Acceptance Criteria Summary验收总结
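The results, deviation, and acceptance sections of such a report could be assembled from the individual verification checks, for instance as below. This is only a sketch: the `CheckResult` type, the dictionary keys, and the acceptance rule used here (every executed check must pass) are hypothetical, and the real acceptance criteria are whatever the approved protocol defines.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one verification check from the protocol (illustrative)."""
    check_id: str
    passed: bool
    deviation: str = ""  # description of the deviation when the check failed


def summarize(results: List[CheckResult]) -> Dict[str, object]:
    """Build the result, deviation, and acceptance summaries for the report."""
    failed = [r for r in results if not r.passed]
    return {
        "total_checks": len(results),
        "passed": len(results) - len(failed),
        "failed": len(failed),
        "deviations": [(r.check_id, r.deviation) for r in failed],
        # Assumed acceptance rule: at least one check ran and none failed.
        "acceptance_met": bool(results) and not failed,
    }


if __name__ == "__main__":
    executed = [
        CheckResult("DIGEST-BATCH-TABLE", True),
        CheckResult("ROWCOUNT-AUDIT-TRAIL", False, "target has 3 fewer rows"),
    ]
    for key, value in summarize(executed).items():
        print(f"{key}: {value}")
```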

These deliverables and related activities provide a structured and consistent approach to managing and controlling data migration projects. These deliverables are also intended to provide objective documented evidence that data integrity is maintained during data migration projects.

这些可交付成果和相关活动为管理和控制数据迁移项目提供了结构化且一致的方法。这些可交付成果还旨在提供客观的书面证据,用来证明数据迁移项目期间数据完整性得到维护。

Summary总结

Data migrations require adequate strategic planning to reduce business and compliance risk. The Data Migration Plan is a critical deliverable that needs to be created to document the migration strategy.

数据迁移需要进行充分的战略规划以降低业务和合规风险。数据迁移计划 (DMP)是需要创建的关键交付物,以记录迁移策略。

The data risk impact must be assessed and documented, and it is an input into the migration strategy and plan.

(迁移)数据风险及影响必需经过评估和文件记录,它将被用于数据迁移策略和计划之中。

In summary, data migration is a critical activity that must be well managed and controlled to maintain the integrity of the data that needs to be migrated.

总而言之,数据迁移是一个极为重要的行为,需要妥善管理和控制来保证迁移数据的数据完整性
