Azure Stream Analytics is not a direct consumer of the information provided by AML or EDA. It is, however, the primary indirect recipient of the findings and the reason this analysis was done. The objective has been to find numbers that are unique to each brain wave scenario. You can take those unique values, or perhaps a range of values that uniquely represents a scenario, and compare them to a stream of brain wave values in real time. You can then attempt to determine the scenario or activity an individual is engaged in based on those brain wave values. Table 5.2 represents those findings, which will be used later to attempt, and hopefully achieve, this objective.
Consider the first two scenarios in the preceding table, which are ALPHA frequency brain wave values from all five electrodes. Compare the smaller FlipChart.ALPHA reading value with the larger TikTok.ALPHA value. Adding those two values together and dividing the sum by two yields the Midpoint between those readings.
2.9904 = (2.9018 + 3.0789) / 2
The Midpoint is the end of the range for the FlipChart.ALPHA reading and the beginning of the range for the TikTok.ALPHA reading. Subtracting the Midpoint from the FlipChart.ALPHA value returns a negative number, whose magnitude is the distance between the reading and the Midpoint. Adding that negative number to the FlipChart.ALPHA value gives you the starting point of the range for identifying a FlipChart scenario based on the ALPHA frequency.
2.8132 = (2.9018 – 2.9904) + 2.9018
0.0886 = 2.9904 – 2.9018
0.0886 = 2.9018 – 2.8132
The result is that the FlipChart.ALPHA reading sits at the center of its range: the distance from the reading to the start of the range is the same as the distance from the reading to the end of the range.
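The arithmetic above can be reproduced with a short Python sketch. This is only an illustration, not code from the chapter: the function name `scenario_range` and the variable names are hypothetical, and the sketch assumes the Midpoint is rounded half up to four decimal places before the starting point is computed, which matches the figures shown above.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: values are kept to four decimal places, as in the figures above.
FOUR_PLACES = Decimal("0.0001")


def scenario_range(reading: Decimal, larger_neighbor: Decimal) -> tuple:
    """Return (start, end) of a range centered on `reading`.

    `end` is the Midpoint between `reading` and the next larger reading;
    `start` mirrors that same distance below `reading`.
    """
    midpoint = ((reading + larger_neighbor) / 2).quantize(
        FOUR_PLACES, rounding=ROUND_HALF_UP
    )
    # (reading - midpoint) is negative; adding it to the reading
    # moves the same distance below the reading.
    start = (reading - midpoint) + reading
    return start, midpoint


flipchart_alpha = Decimal("2.9018")
tiktok_alpha = Decimal("3.0789")

start, end = scenario_range(flipchart_alpha, tiktok_alpha)
print(start, end)  # 2.8132 2.9904
```

`Decimal` is used instead of `float` so that the rounding of 2.99035 to 2.9904 is exact rather than dependent on binary floating-point representation.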
Summary
In this chapter you learned about data transformation. Exactly what is considered data transformation is relative and can be any activity that changes the shape or format of the data. Pulling some data from an Azure SQL database and saving it to a Parquet file is a form of data transformation, as is converting numeric foreign keys to more readable data using dimension table references. Each iteration of transformation takes the data closer to the end objective of rendering business data from which valuable insights can be gathered. Those insights are then used to make decisions about gaining further data insights, driving business operations, or predicting future performance.
You can use numerous tools on Azure to perform data transformation. Azure Synapse Analytics is the recommended tool for those moving new or existing midsized data analytics work streams to the cloud. Azure Databricks and Azure HDInsight also provide data transformation capabilities for those who already have those products on‐premises and want to move the workloads to the cloud. Transformation activities like cleansing and splitting data, shredding JSON, encoding and decoding, and normalizing data are all possible on the Azure platform.
Once you have data transformed into a state that is ready for analysis, you perform exploratory data analysis (EDA), which requires creativity, at least one hypothesis, and a good understanding of your data. Use Azure Machine Learning (AML) predictions to estimate a numeric value based on a group of input variables. The type of model in AML that is helpful for this is regression, but AML offers other modeling capabilities as well. Once you find what you are looking for, you can use charting tools in Azure Databricks or Power BI to show the results in a way that is easy to consume visually.