{"id":65202,"date":"2024-09-15T10:56:00","date_gmt":"2024-09-15T05:26:00","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=65202"},"modified":"2024-09-16T14:51:29","modified_gmt":"2024-09-16T09:21:29","slug":"unlocking-seamless-data-integration-in-the-cloud-with-azure-data-factory","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/unlocking-seamless-data-integration-in-the-cloud-with-azure-data-factory\/","title":{"rendered":"Unlocking Seamless Data Integration in the Cloud with Azure Data Factory"},"content":{"rendered":"<h3 style=\"text-align: justify;\">Introduction<\/h3>\n<p style=\"text-align: justify;\">In today&#8217;s data-driven world, managing and transforming data from many different sources is a cumbersome task for organizations. Azure Data Factory (ADF) is a cloud-based ETL and data integration service that helps businesses streamline complex data-driven workflows quickly and easily. Azure Data Factory provides a scalable and flexible platform for your data integration needs, whether you\u2019re orchestrating ETL processes, automating data movement between on-premises systems and the cloud, or transforming raw data into meaningful insights.<\/p>\n<p style=\"text-align: justify;\">Let\u2019s go over the core ADF concepts:<\/p>\n<div id=\"attachment_65285\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65285\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65285\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/relationship-between-data-factory-entities.png\" alt=\"Azure Data Factory Components\" width=\"741\" height=\"234\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/relationship-between-data-factory-entities.png 781w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/relationship-between-data-factory-entities-300x95.png 300w, 
\/blog\/wp-ttn-blog\/uploads\/2024\/09\/relationship-between-data-factory-entities-768x243.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/relationship-between-data-factory-entities-624x197.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65285\" class=\"wp-caption-text\">Azure Data Factory Components<\/p><\/div>\n<p>&nbsp;<\/p>\n<ol style=\"text-align: justify;\">\n<li><strong><span style=\"color: #4d4040;\">Pipelines<\/span><\/strong>: A pipeline is a logical grouping of activities.<\/li>\n<li><strong><span style=\"color: #4d4040;\">Activities:<\/span><\/strong> Each activity performs a single task, such as ingesting data, pulling it into the cloud, or transforming it. There are different types of activities:\n<ol>\n<li><strong><em>Data Movement:<\/em><\/strong> For example, the Copy activity, which moves data between sources and sinks so that we can perform operations on it.<\/li>\n<li><em><strong>Data Transformation<\/strong>:<\/em> These are external activities where the computation is done outside ADF, on an external data store or compute service.<br \/>\nFor example, a stored procedure executed on a database, Azure Functions, Spark, etc.<\/li>\n<li><strong><em>Control:<\/em> <\/strong>These are native to ADF. For example, web calls, validation, etc.<\/li>\n<\/ol>\n<\/li>\n<li><strong><span style=\"color: #4d4040;\">Data Sets<\/span><\/strong> represent data from any data store. 
Data sets are consumed by activities in a pipeline, as shown in the figure above.<\/li>\n<li><span style=\"color: #4d4040;\"><strong>Linked Services:<\/strong><\/span> These work much like connection strings: they hold the connection information ADF needs to reach the data stores that the data sets point to.<\/li>\n<li><span style=\"color: #4d4040;\"><strong>Integration Runtime<\/strong><\/span>: It provides the compute that allows ADF to run the pipelines.<\/li>\n<\/ol>\n<h3 style=\"text-align: justify;\">Use Case Scenario<\/h3>\n<p style=\"text-align: justify;\">Now, let\u2019s understand how ADF works with the help of a use case. Suppose we have a storage account in which users upload all sorts of data, such as CSVs, images, and HTML or text files. We want only the images from this source storage account, so we\u2019ll copy them to a destination storage account. To select only the images, we have to use wildcards in the pipeline. This is simple when a single wildcard is enough: you can use the Copy activity directly. But here we have to move all images, whether JPEG or PNG, which means we\u2019re dealing with multiple wildcards.<\/p>\n<h3 style=\"text-align: justify;\">Prerequisites<\/h3>\n<ol style=\"text-align: justify;\">\n<li>A blank Azure Data Factory<\/li>\n<li>Two storage accounts (source, destination). 
In our example, <em>sourceadfaccount<\/em> is the source storage account and <em>destinationadfaccount <\/em>is the destination storage account.\n<div id=\"attachment_65196\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65196\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65196\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05.png\" alt=\"Source Storage Account\" width=\"741\" height=\"244\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05.png 1913w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05-300x99.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05-1024x337.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05-768x253.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05-1536x505.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-43-05-624x205.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65196\" class=\"wp-caption-text\">Source Storage Account<\/p><\/div>\n<p>&nbsp;<\/p>\n<div id=\"attachment_65198\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65198\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65198\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07.png\" alt=\"Destination Storage account contents before running the pipeline:\" width=\"741\" height=\"244\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07.png 1913w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07-300x99.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07-1024x337.png 1024w, 
\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07-768x253.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07-1536x505.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-45-07-624x205.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65198\" class=\"wp-caption-text\">Destination Storage Account contents before running the pipeline<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Contributor access over the subscription.<\/li>\n<\/ol>\n<h3>Solution<\/h3>\n<ol style=\"text-align: justify;\">\n<li>Go to your ADF Studio by clicking on the Launch Studio button.\n<div id=\"attachment_65199\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65199\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65199\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10.png\" alt=\"Azure Data Factory Launch Studio\" width=\"741\" height=\"244\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10.png 1913w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10-300x99.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10-1024x337.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10-768x253.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10-1536x505.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-14-47-10-624x205.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65199\" class=\"wp-caption-text\">Azure Data Factory Launch Studio<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Now we need to connect to our source and destination storage accounts, which means we need to create two linked services. For this, click on Manage in the left-hand side menu and select Linked Services. 
Click on + and then select \u201cAzure Blob Storage\u201d as the Data store.\n<div id=\"attachment_65286\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65286\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65286\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57.png\" alt=\"Add a new Linked Service\" width=\"741\" height=\"320\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57.png 1859w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57-300x129.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57-1024x442.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57-768x331.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57-1536x663.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-24-57-624x269.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65286\" class=\"wp-caption-text\">Add a new Linked Service<\/p><\/div>\n<p>Fill in all the information for the source storage account and test the connection. 
Once successful, repeat the steps for the destination storage account.<\/p>\n<div id=\"attachment_65287\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65287\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65287\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24.png\" alt=\"Choose Azure Blob Storage Data Store as the Linked Service\" width=\"741\" height=\"396\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24.png 1889w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24-300x160.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24-1024x547.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24-768x410.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24-1536x820.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-26-24-624x333.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65287\" class=\"wp-caption-text\">Choose Azure Blob Storage Data Store as the Linked Service<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Next, we have to create the two datasets, <em>sourceimg<\/em> and <em>destimg<\/em>. For this, select the Author menu item on the left side. Click on the + button and select Dataset. 
Now, again select the data storage as Blob Storage and because our image files are of type binary, we\u2019ll be selecting binary format.\n<div id=\"attachment_65288\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65288\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65288\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38.png\" alt=\"Add a new Data Set\" width=\"741\" height=\"429\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38.png 1056w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38-300x174.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38-1024x592.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38-768x444.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-29-38-624x361.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65288\" class=\"wp-caption-text\">Add a new Data Set<\/p><\/div>\n<p>Select source linked service, browse the file path, and click on OK.<\/p>\n<div id=\"attachment_65289\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65289\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65289\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24.png\" alt=\"Add Source Data Set\" width=\"741\" height=\"399\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24.png 1888w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24-300x162.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24-1024x552.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24-768x414.png 768w, 
\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24-1536x827.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-31-24-624x336.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65289\" class=\"wp-caption-text\">Add Source Data Set<\/p><\/div>\n<p>Repeat the same steps for the destination dataset.<\/li>\n<li>Now, we\u2019ll start with creating our pipeline. Click on the + button again in the Author Tab and this time select Pipeline.\n<p>&nbsp;<\/p>\n<div id=\"attachment_65266\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65266\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65266\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06.png\" alt=\"Create a new pipeline\" width=\"741\" height=\"307\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06.png 1076w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06-300x124.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06-1024x424.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06-768x318.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-38-06-624x259.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65266\" class=\"wp-caption-text\">Create a new pipeline<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>The first task for us is to traverse our source storage account and get all the files that need to be scanned. 
For this we\u2019ll be using the \u201cGet Metadata\u201d activity.\n<p>&nbsp;<\/p>\n<div id=\"attachment_65268\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65268\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65268\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17.png\" alt=\"GetMetadata\" width=\"741\" height=\"416\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17.png 1436w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17-300x168.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17-1024x575.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17-768x431.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-12-53-17-624x350.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65268\" class=\"wp-caption-text\">Get Metadata Activity<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>Go to settings and add \u201cChild Items\u201d as an argument. This returns a list of sub-folders and files in the given folder. 
The returned value is a list containing the name and type of each child item.<\/p>\n<div id=\"attachment_65269\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65269\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65269\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-29-44.png\" alt=\"Modify Activity Settings\" width=\"741\" height=\"260\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-29-44.png 1017w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-29-44-300x105.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-29-44-768x270.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-29-44-624x219.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65269\" class=\"wp-caption-text\">Modify Activity Settings<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Now that we have all the files from the source, we need to filter this list down to the image files (JPEG and PNG). 
For this task, we\u2019ll be using the ADF \u201cFilter\u201d activity.&nbsp;\n<div id=\"attachment_65270\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65270\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65270\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29.png\" alt=\"Filter Activity\" width=\"741\" height=\"465\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29.png 1363w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29-300x188.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29-1024x642.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29-768x482.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-17-39-29-624x391.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65270\" class=\"wp-caption-text\">Filter Activity<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>Use the following expressions for Items and Condition respectively:<br \/>\n<strong><span style=\"color: #615656;\">Items:\u00a0<\/span><\/strong> This takes the list of child items returned by the previous activity, which is named <em>getimagesfromsource<\/em> in this example.<\/p>\n<pre>@activity('getimagesfromsource').output.childItems<\/pre>\n<p><strong><span style=\"color: #615656;\">Condition:<\/span><\/strong> This keeps only the JPEG and PNG files from the output of the previous activity. As written, it matches names containing <em>.jpeg<\/em> or <em>.png<\/em>, so files with a <em>.jpg<\/em> extension would need one more <em>contains<\/em> check.<\/p>\n<pre>@or(contains(item().name,'.jpeg'),contains(item().name,'.png'))<\/pre>\n<\/li>\n<li>The Filter activity gives us the list of image files. Next, we need to iterate over its output to copy these files to the destination storage account. 
For this task, we\u2019ll be using the \u201cFor Each\u201d activity. Under Settings, click on \u201cAdd dynamic content\u201d for Items and select the Filter activity&#8217;s output value.\n<p>&nbsp;<\/p>\n<div id=\"attachment_65271\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65271\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65271\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46.png\" alt=\"For Each Activity\" width=\"741\" height=\"388\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46.png 1910w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46-300x157.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46-1024x536.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46-768x402.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46-1536x803.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-18-58-46-624x326.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65271\" class=\"wp-caption-text\">For Each Activity<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>Now go to the \u201cActivities\u201d tab of the \u201cFor Each\u201d activity and add the \u201cCopy\u201d activity.<\/li>\n<li>In the \u201cCopy\u201d activity, choose the <em>sourceimg<\/em> dataset as the source. 
In the File path type, select &#8220;Wildcard file path&#8221; and for the filename add \u201c@item().name\u201d.&nbsp;\n<div id=\"attachment_65275\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65275\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65275\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52.png\" alt=\"Copy image source\" width=\"741\" height=\"408\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52.png 1435w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52-300x165.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52-1024x564.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52-768x423.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-09-06-13-06-52-624x344.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65275\" class=\"wp-caption-text\">Copy image source<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>For the sink, add the <em>destimg<\/em> dataset.<\/p>\n<div id=\"attachment_65276\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65276\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65276\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48.png\" alt=\"Copy image destination\" width=\"741\" height=\"329\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48.png 1445w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48-300x133.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48-1024x455.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48-768x341.png 768w, 
\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-03-48-624x277.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65276\" class=\"wp-caption-text\">Copy image destination<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Go back to the main pipeline. Make sure you have connected all the activities. It will look something like this:&nbsp;\n<div id=\"attachment_65277\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65277\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65277\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09.png\" alt=\"Final Data Pipeline\" width=\"741\" height=\"350\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09.png 1439w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09-300x142.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09-1024x483.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09-768x362.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-05-09-624x294.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65277\" class=\"wp-caption-text\">Final Data Pipeline<\/p><\/div>\n<p>&nbsp;<\/li>\n<li>Next, we have to\u00a0 \u201cTrigger\u201d the pipeline. For that, you need to validate and publish your pipeline. 
You can also choose to trigger it now or anytime manually.&nbsp;\n<div id=\"attachment_65278\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65278\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65278\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-32-16.png\" alt=\"Trigger Now\" width=\"741\" height=\"172\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-32-16.png 819w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-32-16-300x70.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-32-16-768x178.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-32-16-624x145.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65278\" class=\"wp-caption-text\">Trigger Now<\/p><\/div>\n<p>Click on \u201cNew\/Edit\u201d. Here you can choose to run the pipeline based on a schedule or various events:<\/p>\n<div id=\"attachment_65279\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65279\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65279\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-33-51.png\" alt=\"Create a New Trigger\" width=\"741\" height=\"793\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-33-51.png 949w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-33-51-280x300.png 280w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-33-51-768x822.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-33-51-624x668.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65279\" class=\"wp-caption-text\">Create a New Trigger<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>We\u2019ll be choosing to trigger the 
pipeline now, as we already have the data. You can see the logs and output from the activity runs window at the bottom of the screen.<\/p>\n<div id=\"attachment_65280\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65280\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65280\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23.png\" alt=\"Activity Runs\" width=\"741\" height=\"523\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23.png 1458w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23-300x212.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23-1024x723.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23-768x542.png 768w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-06-23-624x440.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65280\" class=\"wp-caption-text\">Activity Runs<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>Destination Storage account contents after running the pipeline:<\/p>\n<div id=\"attachment_65281\" style=\"width: 751px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-65281\" decoding=\"async\" loading=\"lazy\" class=\"wp-image-65281\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49.png\" alt=\"Destination Storage Account\" width=\"741\" height=\"340\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49.png 1911w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49-300x138.png 300w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49-1024x469.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49-768x352.png 768w, 
\/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49-1536x704.png 1536w, \/blog\/wp-ttn-blog\/uploads\/2024\/09\/Screenshot-from-2024-08-26-19-08-49-624x286.png 624w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><p id=\"caption-attachment-65281\" class=\"wp-caption-text\">Destination Storage Account<\/p><\/div>\n<p>&nbsp;<\/li>\n<\/ol>\n<p style=\"text-align: justify;\">As you can see above, the ADF pipeline has successfully copied the image files from the source to the destination storage account. You can also add more activities in between if you want to transform the data before storing it in the destination. For example, if you want the files to be zipped, you can select the compression type while setting up the dataset.<\/p>\n<h3 style=\"text-align: justify;\">Summary<\/h3>\n<p style=\"text-align: justify;\">In this blog, we looked at what Azure Data Factory (ADF) is and how it solves our use case. You can achieve a lot with ADF, whether you\u2019re modernizing legacy systems, building new data workflows, or scaling your analytics capabilities, and it can adapt to your evolving needs. It makes managing and organizing data a lot easier so that you can focus more on decision-making. If you\u2019re looking to streamline your data processes, Azure Data Factory could be just the tool you need.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction In today&#8217;s data-driven world, managing and transforming data from various sources is a very cumbersome task for organizations. Azure Data Factory (ADF) stands out as an extensive and robust ETL and cloud-based data integration service that helps enable businesses to streamline their complex data-driven workflows timely and with ease. 
\u00a0Azure Data Factory provides a [&hellip;]<\/p>\n","protected":false},"author":1465,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":55},"categories":[2348],"tags":[3457,1197,6445,1892],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/65202"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/1465"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=65202"}],"version-history":[{"count":22,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/65202\/revisions"}],"predecessor-version":[{"id":66129,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/65202\/revisions\/66129"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=65202"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=65202"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=65202"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}