Create Data Pipelines – Create and Manage Batch Processing and Pipelines

  1. Log in to the Azure portal at https://portal.azure.com ➢ navigate to the Azure Synapse Analytics workspace you created in Exercise 3.3 ➢ click the Open link in the Open Synapse Studio tile on the Overview blade ➢ select the Integrate hub ➢ select the TransformSessionFrequencyToMedian pipeline from the Pipelines list ➢ with no activity selected, select the Parameters tab ➢ add the parameters as provided in Table 6.3 (refer to Figure 6.23) ➢ and then replace the default value with your unique values.

TABLE 6.3 Exercise 6.6 pipeline parameters

Name                          Type    Default value
storageAccountName            String  <storageAccountName>
storageAccountContainerName   String  <ADLS containerName>
inputLocation                 String  <Path to files for processing>
outputLocation                String  <Path to place files after processing>
accountKey                    String  <storageAccountKey>

  2. Select the Calculate Frequency Median activity ➢ select the Settings tab ➢ click the multiline text box that currently contains run.bat ➢ click the Add Dynamic Content link that is rendered ➢ place the following syntax (available in the Chapter06/Ch06Ex06 directory on GitHub) into the dynamic content editor ➢ and then click OK.
  3. With no activity selected, open the Variables tab ➢ add variables named containerAccount and datetimePath of type String with no default value ➢ drag and drop two Set Variable activities from the General group ➢ and then place both activities between the Calculate Frequency Median and To Avro activities, as shown in Figure 6.29.

FIGURE 6.29 Azure Synapse Analytics pipeline variables

  4. On the Variables tab of the Set containerAccount activity, select containerAccount from the drop‐down list box ➢ set the focus to the Value multiline text box ➢ click the Add Dynamic Content link ➢ enter the following syntax into the dynamic editor ➢ and then click OK.
  5. Add the following syntax to the Set datetimePath activity using the same procedure as described in step 4:
    @concat(utcnow('yyyy'), '/', utcnow('MM'), '/', utcnow('dd'), '/', utcnow('HH'))
  6. On the Settings tab for the To Avro activity, click the third argument, which contains the @ ➢ click the Add Dynamic Content link ➢ enter the following syntax into the dynamic editor ➢ and then click OK.
    @variables('containerAccount')
  7. Select the second argument, which was hard‐coded to, for example, EMEA/brainjammer/in/2022/06/15/08 ➢ and then replace it with the following syntax using the same procedure described in step 6.
    @concat('EMEA/brainjammer/in/', variables('datetimePath'))
  8. Click the Commit button ➢ select the Publish menu item ➢ click the Debug button ➢ review the parameters ➢ click the OK button ➢ when complete, click the Input icon on the Output tab, as shown in Figure 6.30 ➢ and then review the dynamically created input.

FIGURE 6.30 Azure Synapse Analytics pipeline dynamic arguments
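
To make the result of the Set datetimePath expression and the rewritten To Avro argument easier to picture, the following Python sketch rebuilds the same hourly path outside the pipeline. It is only an illustration: the names INPUT_PREFIX, datetime_path, and build_input_path are hypothetical, and the pipeline itself evaluates the expression syntax shown in the exercise, not Python.

```python
from datetime import datetime, timezone

# Hypothetical constant mirroring the literal prefix used in the exercise.
INPUT_PREFIX = "EMEA/brainjammer/in/"


def datetime_path(now: datetime) -> str:
    """Format a timezone-aware timestamp as yyyy/MM/dd/HH in UTC."""
    utc = now.astimezone(timezone.utc)
    # %Y = 4-digit year, %m = 2-digit month, %d = 2-digit day, %H = 24-hour hour
    return utc.strftime("%Y/%m/%d/%H")


def build_input_path(now: datetime) -> str:
    """Prepend the fixed prefix, as the concat() in the To Avro argument does."""
    return INPUT_PREFIX + datetime_path(now)


if __name__ == "__main__":
    # Print the path for the current UTC hour.
    print(build_input_path(datetime.now(timezone.utc)))
```

For a run at 08:30 UTC on June 15, 2022, build_input_path returns EMEA/brainjammer/in/2022/06/15/08, the same value that was previously hard-coded. The sketch reads the clock once and formats it in a single call, so all four path components come from the same instant.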

You might agree that being able to set parameters and variables dynamically is a very useful feature. It is important to avoid hard‐coding values that your batch, pipeline, and activities use to process data. Those values tend to change, and when they do, there are exceptions and failures. In a lot of cases, it is not intuitively obvious that the issue lies with a parameter being used that is no longer valid. It is best to make your solutions as flexible as possible.
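
One way to make an invalid parameter easier to spot is to check the values before a run starts. The sketch below is a minimal, hypothetical pre-flight check in Python. The function name find_invalid_parameters and the rule that an angle-bracket placeholder counts as unset are assumptions made for illustration; only the parameter names are taken from Table 6.3.

```python
# Parameter names from Table 6.3.
REQUIRED_PARAMETERS = (
    "storageAccountName",
    "storageAccountContainerName",
    "inputLocation",
    "outputLocation",
    "accountKey",
)


def find_invalid_parameters(parameters: dict) -> list:
    """Return required parameter names that are missing, empty, or still placeholders."""
    invalid = []
    for name in REQUIRED_PARAMETERS:
        raw = parameters.get(name)
        value = "" if raw is None else str(raw).strip()
        # Assumption: a value such as "<storageAccountName>" was never replaced.
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            invalid.append(name)
    return invalid
```

Calling find_invalid_parameters with the values you intend to pass to the pipeline returns an empty list when every parameter is set, and otherwise names the ones to fix before debugging further.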

Ileana Pecos
