!!! Understanding customer load
Understanding the customer load process, version 4.6.6.6.
We will analyze the loading process of the In-Customer dimension. Unlike the product dimension, where brand, category, and product were in separate tables, here everything is concentrated in a single table. Therefore, we only need to perform one transformation process to adapt the data to the same layout as the destination table.
The data sources are varied. We have an Excel spreadsheet and a CSV file. In this context, we will work with these two sources aiming, at the end of the process, to unify them into a single data source before writing to the final table through an automatic box.
Next, we will detail the steps necessary to proceed with this process, treating each of the sources separately until the moment of their union. Starting with the Excel source. The customer's code and descriptor will be extracted directly from the ID Customer and Customer columns, respectively.
The Segment column indicates the descriptor of the corresponding segment to the Desk Segment column of the final table. However, it will be necessary to apply the lookup to obtain the code corresponding to the segment. From the Androids column, we will employ the Split function to separate the city descriptor and the state code, which will be represented by its acronym.
Next, we will use the lookup with the city description to find its corresponding code. Temporarily leaving the Excel spreadsheet data source, we move on to the CSV source. From this, we can directly extract the CodeState and DeskState fields from the State Code and State columns.
In addition, the Region column provides us with DeskRegion, which through the lookup, will allow us to obtain the CodeRegion. With both sources properly transformed, we can unify them using the common field CodeState, which is common in both. Thus, we will compile all the necessary data to write to the InCustomer table.
Now, let's start the process of creating the transformations, starting with the Drivers and Containers of these two sources. I've returned to the PIS. Let's observe the text file driver we created.
In this driver, we have the name "List of Brands and Categories." But note that the directory of this driver is the directory where all the files are located, so maybe this name is not very accurate. Let's modify the name to "List of CSV Files."
Let's give it an update. By making this modification, if we go now to DataContainer, TextFile, note that when selecting the Dimension Brands and Categories container, we have here the name of the driver different from what we just modified. I'll drag the window to have them side by side, and we can see that the driver's name is one and the reference to this driver in the container is another.
Changing some property in the driver does not automatically reflect in the container. However, is this harmful? The answer is, it depends. In the case of the relationship between the text file driver and container, no.
This change does not affect the container because the identification is made by its internal code that is not modified when we modify the name. Note that here I have this code that starts with 614, in my case, and it's the same identifier as the driver. However, of course, to maintain consistency, let's also modify the reference in the container.
To do this, I just need to click on Edit and then Save. Note that now the names are the same. In the container, I'll give Update and save the change.
I'll drag back the text container tab so I can see it full screen. And let's take advantage of this same container and let's create now the one for states and regions. The reference to the driver is the same, what will change is only the name of the file.
If we observe here, the name of the file is "Regions and States." I'll give it a New. Now I have here the container that reads from the CSV of states and regions.
We can give a Preview and observe what is obtained from this file. Let's do the same now for the Excel driver and container. So click on Driver, Excel File, note that the name is Excel Source List of Products.
Let's keep the same naming convention, "List of xlsx Files." Let's give it an Update, and I'll access DataContainer, Excel, and for the Dimension Product container, we'll come here where I make reference to the driver, click on Edit, click on Save, click on Update. Let's then take advantage of this same container, and I'll modify it to also in the descriptor I'll change it to Clip.
The file name is not Product, so checking here it was Customers.xlsx, right-click, I'll copy the name, I'll paste it here in Appointment Files, and I'll create a new container. We can give a Preview and note that I have an error. What's this error? It's because the Worksheet of the spreadsheet may have a different name, and also the mapping has to be different.
For that, I'll close it, I'll come here in Mapping Table, and I'll click on Preview. So I'll select from position A1 to position F1 to select the header of the spreadsheet, and I'll click on the first Range Cells button. Let's repeat the process to select from position A2 to F2 and click on the second Range Cells.
I'll give it an Update. Now yes, I can give a Preview, and I have the content of the customer spreadsheet. So at this point, if I come here in the Text File Data Container, click on the Dimension States and Regions, and give a Preview, we have both from the first tab and the second tab the two sources.
Now let's program the transformations on each of these sources, and at a certain point, we will merge them before writing to the Automatic Box. Thank you, and until the next video.