
Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT


Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that gives you the flexibility to use provisioned or serverless compute for your analytical workloads. With Amazon Redshift Serverless and Query Editor v2, you can load and query large datasets in just a few clicks and pay only for what you use. The decoupled compute and storage architecture of Amazon Redshift enables you to build highly scalable, resilient, and cost-effective workloads. Many customers migrate their data warehousing workloads to Amazon Redshift and benefit from the rich capabilities it offers, such as the following:

  • Amazon Redshift seamlessly integrates with the broader data, analytics, and AI or machine learning (ML) services on AWS, enabling you to choose the right tool for the right job. Modern analytics is much wider than SQL-based data warehousing. With Amazon Redshift, you can build lake house architectures and perform any kind of analytics, such as interactive analytics, operational analytics, big data processing, visual data preparation, predictive analytics, machine learning, and more.
  • You don’t need to worry about workloads such as ETL (extract, transform, and load), dashboards, ad hoc queries, and so on interfering with one another. You can isolate workloads using data sharing, while using the same underlying datasets.
  • When users run many queries at peak times, compute seamlessly scales within seconds to provide consistent performance at high concurrency. You get 1 hour of free concurrency scaling capacity for 24 hours of usage. This free credit meets the concurrency demand of 97% of the Amazon Redshift customer base.
  • Amazon Redshift is easy to use with self-tuning and self-optimizing capabilities. You can get faster insights without spending valuable time managing your data warehouse.
  • Fault tolerance is built in. All data written to Amazon Redshift is automatically and continuously replicated to Amazon Simple Storage Service (Amazon S3). Any failed hardware is automatically replaced.
  • Amazon Redshift is easy to interact with. You can access data with traditional, cloud-native, containerized, serverless web services or event-driven applications. You can also use your favorite business intelligence (BI) and SQL tools to access, analyze, and visualize data in Amazon Redshift.
  • Amazon Redshift ML makes it easy for data scientists to create, train, and deploy ML models using familiar SQL. You can also run predictions using SQL.
  • Amazon Redshift provides comprehensive data security at no extra cost. You can set up end-to-end data encryption, configure firewall rules, define granular row-level and column-level security controls on sensitive data, and more.

In this post, we show how to migrate a data warehouse from Microsoft Azure Synapse to Redshift Serverless using the AWS Schema Conversion Tool (AWS SCT) and AWS SCT data extraction agents. AWS SCT makes heterogeneous database migrations predictable by automatically converting the source database code and storage objects to a format compatible with the target database. Any objects that can’t be automatically converted are clearly marked so that they can be manually converted to complete the migration. AWS SCT can also scan your application code for embedded SQL statements and convert them.

Solution overview

AWS SCT uses a service account to connect to your Azure Synapse Analytics instance. First, we create a Redshift database into which the Azure Synapse data will be migrated. Next, we create an S3 bucket. Then, we use AWS SCT to convert the Azure Synapse schemas and apply them to Amazon Redshift. Finally, to migrate the data, we use AWS SCT data extraction agents, which extract data from Azure Synapse, upload it to an S3 bucket, and copy it to Amazon Redshift.

The following diagram illustrates our solution architecture.

This walkthrough covers the following steps:

  1. Create a Redshift Serverless data warehouse.
  2. Create the S3 bucket and folder.
  3. Convert and apply the Azure Synapse schema to Amazon Redshift using AWS SCT:
    1. Connect to the Azure Synapse source.
    2. Connect to the Amazon Redshift target.
    3. Convert the Azure Synapse schema to a Redshift database.
    4. Analyze the assessment report and address the action items.
    5. Apply the converted schema to the target Redshift database.
  4. Migrate data from Azure Synapse to Amazon Redshift using AWS SCT data extraction agents:
    1. Generate trust and key stores (this step is optional).
    2. Install and configure the data extraction agent.
    3. Start the data extraction agent.
    4. Register the data extraction agent.
    5. Add virtual partitions for large tables (this step is optional).
    6. Create a local data migration task.
    7. Start the local data migration task.
  5. View data in Amazon Redshift.

Prerequisites

Before starting this walkthrough, you must have the following prerequisites:

  • A workstation with AWS SCT, Amazon Corretto 11, and the Redshift drivers installed.
  • A database user account that AWS SCT can use to connect to your source Azure Synapse Analytics database.
    • Grant VIEW DEFINITION and VIEW DATABASE STATE on each schema you want to convert to the database user used for the migration (a hedged scripted example follows this list).
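
The following is a minimal sketch of how you might run those grants against the Azure Synapse dedicated SQL pool with Python and pyodbc. The server, database, login, and schema names are placeholders, and the sketch assumes the Microsoft ODBC Driver 17 for SQL Server is installed on the workstation; adapt it to however your team normally administers Synapse.

```python
import pyodbc

# Placeholder connection details for the Azure Synapse dedicated SQL pool.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=your-synapse-workspace.sql.azuresynapse.net;"
    "DATABASE=your_sql_pool;"
    "UID=sct_migration_user;PWD=your_password"
)
conn.autocommit = True
cursor = conn.cursor()

# VIEW DATABASE STATE is granted at the database level;
# VIEW DEFINITION is granted per schema that AWS SCT needs to convert.
cursor.execute("GRANT VIEW DATABASE STATE TO sct_migration_user;")
for schema in ["dbo", "sales"]:  # schemas you plan to convert
    cursor.execute(f"GRANT VIEW DEFINITION ON SCHEMA::{schema} TO sct_migration_user;")

conn.close()
```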

Create a Redshift Serverless data warehouse

In this step, we create a Redshift Serverless data warehouse with a workgroup and namespace. A workgroup is a collection of compute resources, and a namespace is a collection of database objects and users. To isolate workloads and manage different resources in Redshift Serverless, you can create namespaces and workgroups and manage storage and compute resources separately.

Follow these steps to create a Redshift Serverless data warehouse with a workgroup and namespace (a scripted alternative using boto3 follows the console steps):

  1. On the Amazon Redshift console, choose the AWS Region that you want to use.
  2. In the navigation pane, choose Redshift Serverless.
  3. Choose Create workgroup.
  4. For Workgroup name, enter a name that describes the compute resources.
  5. Verify that the VPC is the same as the VPC of the EC2 instance with AWS SCT.
  6. Choose Next.
  7. For Namespace, enter a name that describes your dataset.
  8. In the Database name and password section, select Customize admin user credentials.
  9. For Admin user name, enter a user name of your choice (for example, awsuser).
  10. For Admin user password, enter a password of your choice (for example, MyRedShiftPW2022).
  11. Choose Next.

Note that data in the Redshift Serverless namespace is encrypted by default.

  12. In the Review and Create section, choose Create.

Now you create an AWS Identity and Access Management (IAM) role and set it as the default for your namespace. Note that there can be only one default IAM role.

  1. On the Redshift Serverless Dashboard, in the Namespaces / Workgroups section, choose the namespace you just created.
  2. On the Security and encryption tab, in the Permissions section, choose Manage IAM roles.
  3. Choose Manage IAM roles, then choose Create IAM role.
  4. In the Specify an Amazon S3 bucket for the IAM role to access section, choose one of the following methods:
    1. Choose No additional Amazon S3 bucket to allow the created IAM role to access only the S3 buckets with names containing the word redshift.
    2. Choose Any Amazon S3 bucket to allow the created IAM role to access all S3 buckets.
    3. Choose Specific Amazon S3 buckets to specify one or more S3 buckets for the created IAM role to access. Then choose one or more S3 buckets from the table.
  5. Choose Create IAM role as default.
  6. Capture the endpoint for the Redshift Serverless workgroup you just created.
  7. On the Redshift Serverless Dashboard, in the Namespaces / Workgroups section, choose the workgroup you just created.
  8. In the General information section, copy the endpoint.
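
If you prefer to script this setup instead of using the console, the following is a minimal boto3 sketch. The namespace, workgroup, admin credentials, capacity, subnet IDs, and security group IDs are placeholders you would replace with your own values; it is not the exact configuration used in this post.

```python
import boto3

# Placeholder names, credentials, and network settings; replace with your own.
rss = boto3.client("redshift-serverless", region_name="us-east-1")

rss.create_namespace(
    namespaceName="synapse-migration-ns",
    dbName="dev",
    adminUsername="awsuser",
    adminUserPassword="MyRedShiftPW2022",
)

rss.create_workgroup(
    workgroupName="synapse-migration-wg",
    namespaceName="synapse-migration-ns",
    baseCapacity=32,  # Redshift Processing Units (RPUs)
    # Subnets in at least three AZs, in the same VPC as the AWS SCT EC2 instance.
    subnetIds=["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
    securityGroupIds=["sg-0123456789abcdef0"],
)

# The workgroup takes a few minutes to become available; once it does,
# capture the endpoint that AWS SCT will connect to on port 5439.
wg = rss.get_workgroup(workgroupName="synapse-migration-wg")["workgroup"]
print(wg["endpoint"]["address"], wg["endpoint"]["port"])
```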

Create the S3 bucket and folder

During the data migration process, AWS SCT uses Amazon S3 as a staging area for the extracted data. Follow these steps to create an S3 bucket (a boto3 alternative follows the steps):

  1. On the Amazon S3 console, choose Buckets in the navigation pane.
  2. Choose Create bucket.
  3. For Bucket name, enter a unique DNS-compliant name for your bucket (for example, uniquename-as-rs).

For more information about bucket names, refer to Bucket naming rules.

  4. For AWS Region, choose the Region in which you created the Redshift Serverless workgroup.
  5. Choose Create bucket.
  6. Choose Buckets in the navigation pane and navigate to the S3 bucket you just created (uniquename-as-rs).
  7. Choose Create folder.
  8. For Folder name, enter incoming.
  9. Choose Create folder.
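
As a scripted alternative, here is a minimal boto3 sketch that creates the staging bucket and the incoming folder; the bucket name and Region are placeholders.

```python
import boto3

region = "us-east-1"         # Region where the Redshift Serverless workgroup lives
bucket = "uniquename-as-rs"  # must be globally unique and DNS-compliant

s3 = boto3.client("s3", region_name=region)

# us-east-1 must not pass a LocationConstraint; other Regions must.
if region == "us-east-1":
    s3.create_bucket(Bucket=bucket)
else:
    s3.create_bucket(
        Bucket=bucket,
        CreateBucketConfiguration={"LocationConstraint": region},
    )

# S3 has no real folders; a zero-byte object key with a trailing slash acts as one.
s3.put_object(Bucket=bucket, Key="incoming/")
```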

Convert and apply the Azure Synapse schema to Amazon Redshift using AWS SCT

To convert the Azure Synapse schema to Amazon Redshift format, we use AWS SCT. Start by logging in to the EC2 instance that you created previously and launch AWS SCT.

Connect to the Azure Synapse source

Complete the following steps to connect to the Azure Synapse source:

  1. On the File menu, choose Create New Project.
  2. Choose a location to store your project files and data.
  3. Provide a meaningful but memorable name for your project (for example, Azure Synapse to Amazon Redshift).
  4. To connect to the Azure Synapse source data warehouse, choose Add source.
  5. Choose Azure Synapse and choose Next.
  6. For Connection name, enter a name (for example, olap-azure-synapse).

AWS SCT displays this name in the object tree in the left pane.

  7. For Server name, enter your Azure Synapse server name.
  8. For SQL pool, enter your Azure Synapse pool name.
  9. Enter a user name and password.
  10. Choose Test connection to verify that AWS SCT can connect to your source Azure Synapse project.
  11. When the connection is successfully validated, choose OK and Connect.

Connect to the Amazon Redshift target

Follow these steps to connect to Amazon Redshift:

  1. In AWS SCT, choose Add target.
  2. Choose Amazon Redshift, then choose Next.
  3. For Connection name, enter a name to describe the Amazon Redshift connection.

AWS SCT displays this name in the object tree in the right pane.

  4. For Server name, enter the Redshift Serverless workgroup endpoint you captured earlier.
  5. For Server port, enter 5439.
  6. For Database, enter dev.
  7. For User name, enter the user name you chose when creating the Redshift Serverless workgroup.
  8. For Password, enter the password you chose when creating the Redshift Serverless workgroup.
  9. Deselect Use AWS Glue.
  10. Choose Test connection to verify that AWS SCT can connect to your target Redshift workgroup.
  11. When the test is successful, choose OK.
  12. Choose Connect to connect to the Amazon Redshift target.

Alternatively, you can use connection values that are stored in AWS Secrets Manager.

Convert the Azure Synapse schema to a Redshift data warehouse

After you create the source and target connections, you will see the source Azure Synapse object tree in the left pane and the target Amazon Redshift object tree in the right pane. We then create mapping rules to describe the source-target pair for the Azure Synapse to Amazon Redshift migration.

Follow these steps to convert the Azure Synapse dataset to Amazon Redshift format:

  1. In the left pane, choose (right-click) the schema you want to convert.
  2. Choose Convert schema.
  3. In the dialog box, choose Yes.

When the conversion is complete, you will see a new schema created in the Amazon Redshift pane (right pane) with the same name as your Azure Synapse schema.

The sample schema we used has three tables; you can see these objects in Amazon Redshift format in the right pane. AWS SCT converts all the Azure Synapse code and data objects to Amazon Redshift format. You can also use AWS SCT to convert external SQL scripts, application code, or additional files with embedded SQL.

Analyze the assessment report and address the action items

AWS SCT creates an assessment report to assess the migration complexity. AWS SCT can convert the majority of code and database objects, but some objects may require manual conversion. AWS SCT highlights these objects in blue in the conversion statistics diagram and creates action items with a complexity attached to them.

To view the assessment report, switch from Main view to Assessment Report view as shown in the following screenshot.

The Summary tab shows objects that were converted automatically and objects that weren’t converted automatically. Green represents automatically converted objects or objects with simple action items. Blue represents medium and complex action items that require manual intervention.

The Action items tab shows the recommended actions for each conversion issue. If you choose an action item from the list, AWS SCT highlights the object that the action item applies to.

The report also contains recommendations for how to manually convert the schema item. For example, after the assessment runs, detailed reports for the database and schema show you the effort required to design and implement the recommendations for converting action items. For more information about deciding how to handle manual conversions, see Handling manual conversions in AWS SCT. AWS SCT completes some actions automatically while converting the schema to Amazon Redshift; objects with such actions are marked with a red warning sign.

You can evaluate and inspect the individual object DDL by selecting it in the right pane, and you can also edit it as needed. In the following example, AWS SCT modifies the ID column data type from decimal(3,0) in Azure Synapse to the smallint data type in Amazon Redshift.

Apply the converted schema to the target Redshift data warehouse

To apply the converted schema to Amazon Redshift, select the converted schema in the right pane, right-click, and choose Apply to database.

Migrate data from Azure Synapse to Amazon Redshift using AWS SCT data extraction agents

AWS SCT extraction agents extract data from your source database and migrate it to the AWS Cloud. In this section, we configure AWS SCT extraction agents to extract data from Azure Synapse and migrate it to Amazon Redshift. For this post, we install the AWS SCT extraction agent on the same Windows instance that has AWS SCT installed. For better performance, we recommend that you use a separate Linux instance to install extraction agents if possible. For very large datasets, AWS SCT supports the use of multiple data extraction agents running on several instances to maximize throughput and increase the speed of the data migration.

Generate trust and key stores (optional)

You can use Secure Sockets Layer (SSL) encrypted communication with the AWS SCT data extractors. When you use SSL, all data passed between the applications remains private and intact. To use SSL communication, you need to generate trust and key stores using AWS SCT. You can skip this step if you don’t want to use SSL. We recommend using SSL for production workloads.

Follow these steps to generate trust and key stores:

  1. In AWS SCT, choose Settings, Global settings, and Security.
  2. Choose Generate trust and key store.
  3. Enter a name and password for the trust and key stores.
  4. Enter a location to store them.
  5. Choose Generate, then choose OK.

Install and configure the data extraction agent

In the installation package for AWS SCT, you can find a subfolder called agents (aws-schema-conversion-tool-1.0.latest.zip\agents). Locate and install the executable file with a name like aws-schema-conversion-tool-extractor-xxxxxxxx.msi.

During the installation process, follow these steps to configure the AWS SCT Data Extractor:

  1. For Service port, enter the port number the agent listens on. The default is 8192.
  2. For Working folder, enter the path where the AWS SCT data extraction agent will store the extracted data.

The working folder can be on a different computer from the agent, and a single working folder can be shared by multiple agents on different computers.

  3. For Enter Redshift JDBC driver file or files, enter the location where you downloaded the Redshift JDBC drivers.
  4. For Add the Amazon Redshift driver, enter YES.
  5. For Enable SSL communication, enter yes. Enter no here if you don’t want to use SSL.
  6. Choose Next.
  7. For Trust store path, enter the storage location you specified when creating the trust and key store.
  8. For Trust store password, enter the password for the trust store.
  9. For Enable client SSL authentication, enter yes.
  10. For Key store path, enter the storage location you specified when creating the trust and key store.
  11. For Key store password, enter the password for the key store.
  12. Choose Next.

Start the data extraction agent

Use the following procedure to start extraction agents. Repeat this procedure on each computer that has an extraction agent installed.

Extraction agents act as listeners. When you start an agent with this procedure, the agent starts listening for instructions. You send the agents instructions to extract data from your data warehouse in a later section.

To start the extraction agent, navigate to the AWS SCT Data Extractor Agent directory. For example, in Microsoft Windows, use C:\Program Files\AWS SCT Data Extractor Agent\StartAgent.bat.

On the computer that has the extraction agent installed, from a command prompt or terminal window, run the command listed for your operating system. To stop an agent, run the same command but replace start with stop. To restart an agent, run the same RestartAgent.bat file.

Note that you must have administrator access to run these commands.

Register the data extraction agent

Follow these steps to register the data extraction agent:

  1. In AWS SCT, change the view to Data Migration view and choose Register.
  2. Select Redshift data agent, then choose OK.
  3. For Description, enter a name to identify the agent.
  4. For Host name, if you installed the extraction agent on the same workstation as AWS SCT, enter 0.0.0.0 to indicate localhost. Otherwise, enter the host name of the machine on which the AWS SCT extraction agent is installed. It is recommended to install extraction agents on Linux for better performance.
  5. For Port, enter the number you used for the listening port (default 8192) when installing the AWS SCT extraction agent.
  6. Select Use SSL to encrypt the AWS SCT connection to the data extraction agent.
  7. If you’re using SSL, navigate to the SSL tab.
  8. For Trust store, choose the trust store you created earlier.
  9. For Key store, choose the key store you created earlier.
  10. Choose Test connection.
  11. After the connection is validated successfully, choose OK and Register.

Create a local data migration task

To migrate data from Azure Synapse Analytics to Amazon Redshift, you create, run, and monitor the local migration task from AWS SCT. This step uses the data extraction agent to migrate data by creating a task.

Follow these steps to create a local data migration task:

  1. In AWS SCT, under the schema name in the left pane, choose (right-click) the table you want to migrate (for this post, we use the table tbl_currency).
  2. Choose Create Local task.
  3. Choose from the following migration modes:
    1. Extract the source data and store it on a local PC or virtual machine where the agent runs.
    2. Extract the data and upload it to an S3 bucket.
    3. Extract the data, upload it to Amazon S3, and copy it into Amazon Redshift. (We choose this option for this post.)
  4. On the Advanced tab, provide the extraction and copy settings.
  5. On the Source server tab, make sure you are using the current connection properties.
  6. On the Amazon S3 settings tab, for Amazon S3 bucket folder, provide the bucket and folder names of the S3 bucket you created earlier.

The AWS SCT data extraction agent uploads the data to this S3 bucket and folder before copying it to Amazon Redshift.

  7. Choose Test Task.
  8. When the task is successfully validated, choose OK, then choose Create.

Start the local data migration task

To start the task, choose Start or Restart on the Tasks tab.

First, the data extraction agent extracts data from Azure Synapse. Then the agent uploads the data to Amazon S3 and launches a COPY command to move the data to Amazon Redshift.
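
Conceptually, the COPY that the agent runs on your behalf resembles the following sketch, shown here through the Redshift Data API. The schema, table, bucket, prefix, workgroup, and IAM role ARN are placeholders, and the exact file layout and options AWS SCT generates may differ; this is only to illustrate the load path, not the agent's actual command.

```python
import boto3

# Placeholder identifiers; the agent generates its own file prefixes and options.
redshift_data = boto3.client("redshift-data", region_name="us-east-1")

copy_sql = """
    COPY myschema.tbl_currency
    FROM 's3://uniquename-as-rs/incoming/tbl_currency/'
    IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-serverless-default-role'
    FORMAT AS CSV
    GZIP;
"""

resp = redshift_data.execute_statement(
    WorkgroupName="synapse-migration-wg",  # Redshift Serverless workgroup
    Database="dev",
    Sql=copy_sql,
)
print(resp["Id"])  # statement ID you can poll with describe_statement
```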

At this point, AWS SCT has successfully migrated the data from the source Azure Synapse table to the Redshift table.

View data in Amazon Redshift

After the data migration task is complete, you can connect to Amazon Redshift and validate the data. Complete the following steps:

  1. On the Amazon Redshift console, navigate to Query Editor v2.
  2. Open the Redshift Serverless workgroup you created.
  3. Choose Query data.
  4. For Database, enter a name for your database.
  5. For Authentication, select Federated user.
  6. Choose Create connection.
  7. Open a new editor by choosing the plus sign.
  8. In the editor, write a query to select from the schema name and table or view name you want to verify.

You can explore the data, run ad hoc queries, and create visualizations, charts, and views.
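
If you would rather validate the migrated row count programmatically, the following is a minimal sketch using the Redshift Data API; the workgroup, schema, and table names are placeholders matching the earlier examples, and you would compare the count with the same query against the source Azure Synapse table.

```python
import time
import boto3

redshift_data = boto3.client("redshift-data", region_name="us-east-1")

# Count rows in the migrated table on the Redshift Serverless workgroup.
resp = redshift_data.execute_statement(
    WorkgroupName="synapse-migration-wg",
    Database="dev",
    Sql="SELECT COUNT(*) FROM myschema.tbl_currency;",
)

# Poll until the statement finishes, then fetch the single-cell result.
while True:
    desc = redshift_data.describe_statement(Id=resp["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    result = redshift_data.get_statement_result(Id=resp["Id"])
    print("Row count:", result["Records"][0][0]["longValue"])
else:
    print("Statement did not finish:", desc.get("Error"))
```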

The following screenshot shows the view of the source Azure Synapse dataset we used in this post.

Clean up

Follow the steps in this section to clean up any AWS resources you created as part of this post.

Stop the EC2 instance

Follow these steps to stop the EC2 instance:

  1. On the Amazon EC2 console, in the navigation pane, choose Instances.
  2. Select the instance you created.
  3. Choose Instance state, then choose Terminate instance.
  4. Choose Terminate when prompted for confirmation.

Delete the Redshift Serverless workgroup and namespace

Follow these steps to delete the Redshift Serverless workgroup and namespace:

  1. On the Redshift Serverless Dashboard, in the Namespaces / Workgroups section, choose the workgroup you created.
  2. On the Actions menu, choose Delete workgroup.
  3. Select Delete the associated namespace.
  4. Deselect Create final snapshot.
  5. Enter delete in the confirmation text box and choose Delete.

Delete the S3 bucket

Follow these steps to delete the S3 bucket:

  1. On the Amazon S3 console, choose Buckets in the navigation pane.
  2. Choose the bucket you created.
  3. Choose Delete.
  4. To confirm deletion, enter the name of the bucket.
  5. Choose Delete bucket.

Conclusion

Migrating a data warehouse can be a challenging, complex, and yet rewarding project. AWS SCT reduces the complexity of data warehouse migrations. This post discussed how a data migration task extracts, downloads, and migrates data from Azure Synapse to Amazon Redshift. The solution we presented performs a one-time migration of database objects and data. Data changes made in Azure Synapse while the migration is in progress won’t be reflected in Amazon Redshift. While the data migration is in progress, put your ETL jobs to Azure Synapse on hold, or rerun the ETL jobs pointing to Amazon Redshift after the migration. Consider using the best practices for AWS SCT.

To get started, download and install AWS SCT, sign in to the AWS Management Console, check out Redshift Serverless, and start migrating!


About the Authors

Ahmed Shehata is a Senior Analytics Specialist Solutions Architect at AWS based in Toronto. He has more than two decades of experience helping customers modernize their data platforms. Ahmed is passionate about helping customers build efficient, performant, and scalable analytic solutions.

Jagadish Kumar is a Senior Analytics Specialist Solutions Architect at AWS focused on Amazon Redshift. He is deeply passionate about data architecture and helps customers build analytics solutions at scale on AWS.

Anusha Challa is a Senior Analytics Specialist Solutions Architect at AWS focused on Amazon Redshift. She has helped many customers build large-scale data warehouse solutions in the cloud and on premises. Anusha is passionate about data analytics and data science and enabling customers to achieve success with their large-scale data projects.
