In this article we are going to learn how to create a scatter plot matrix for a chosen dataset. A scatter plot matrix is a great way to roughly determine whether there is a linear correlation between multiple variables. Although Tableau does not offer the scatter plot matrix as a one-click visualization under Show Me, it can be created quite easily.

Data

A scatter plot needs two measures, so for this exercise we must choose a dataset with at least 3 measures, otherwise we will not be able to create a matrix of scatter plots. We will use the Auto MPG Data Set from the University of California, Irvine website, which hosts many publicly available datasets for machine learning purposes. The data for our exercise is available here (free of unknown values). Because the headers are missing from the dataset, it has to be converted into a CSV or Excel file manually. The headers for the data can be sourced from here.

Let us look at the dimensions and measures that need to be understood in order to create a scatter plot matrix from this dataset. Although Origin and Cylinders appear to be numeric, a close look at the actual data records shows that they are categorical. Cylinders takes values from 3 to 8, whereas Origin takes values from 1 to 3. Origin is the place where the car was manufactured (Europe, Asia or North America), but it has been converted into numeric form, maybe for regression purposes. Hence we will make sure to convert Origin and Cylinders into dimensions after loading them into Tableau.

I have my data stored in an Excel file named auto-mpg. There should be 398 records in the dataset. Tableau's Data Interpreter indicates that the data doesn't look good, but there don't seem to be any issues with it, so you can choose to ignore the warning.

Step 2 – Go to Sheet 1 and analyse/review the loaded data.

Check the dimensions and measures that Tableau detects when Sheet 1 loads.

Step 3 – Convert Origin and Cylinders to Dimension

Right-click on Cylinders and convert it into a dimension. Similarly, convert Origin into a dimension as well. Once both are converted, Cylinders and Origin should appear under Dimensions rather than Measures.

Start double-clicking on the measures one after the other. After you have double-clicked on the first two measures you should see a single scatter plot. On double-clicking the third measure you should see a small scatter plot matrix. Likewise, once you have double-clicked on all 5 measures you should see the full scatter plot matrix. Though the basic skeleton of our scatter plot matrix is now created, we have to perform a few more steps to turn it into a really useful visualization.

Step 5 – Change aggregation of measures from SUM to AVG

The reason for changing the aggregation of the measures from SUM to AVG is that there can be multiple records for the same car, because the model year can differ, so summing the measures will not make sense. Right-click on each measure in the Rows/Columns shelf and choose Avg under the Measures option. Once you have changed the aggregation method for all measures from SUM to AVG, every measure on the Columns and Rows shelves should show AVG.

Notice that we still don't have the data plotted in the individual scatter plots of the matrix. Remember, to create a scatter plot you must choose the granularity of the data by putting a dimension onto the Detail shelf. In our context we are analyzing the characteristics of different cars (cylinders, acceleration, miles per gallon, etc.), so we will put car name onto the Detail card to create the scatter plots and analyze the correlation between the attributes present in our dataset.

Notice that we have now moved very close to our final target. We can start seeing the correlation between any pair of measures in the matrix. If you observe closely, the scatter plots are symmetrical across the diagonal running from top-left to bottom-right, and the scatter plots on the diagonal itself are not meaningful, because plotting a measure against itself always produces a perfect linear correlation. We can therefore pay attention to either the right-angled triangle above the diagonal or the one below it. Since we have 5 measures, the matrix has 25 cells. Dropping the 5 diagonal cells leaves 20, and because each pair appears twice, 10 scatter plots contribute to meaningful analysis.
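As a rough cross-check outside Tableau, the two ideas the matrix relies on can be sketched in a few lines of Python: every unordered pair of measures gets one meaningful plot, and each measure is averaged per car rather than summed. This is only a minimal sketch under stated assumptions. The five measure names, the `car_name` key and the helper functions are hypothetical choices made for the example, and the rows are made-up values, not records from the Auto MPG data.

```python
from itertools import combinations
from statistics import mean

# Assumed measure names; which five measures end up in the matrix is an assumption.
MEASURES = ["mpg", "displacement", "horsepower", "weight", "acceleration"]


def measure_pairs(measures):
    """Return every unordered pair of distinct measures (one triangle of the matrix)."""
    return list(combinations(measures, 2))


def average_by_car(rows, measures, key="car_name"):
    """Average each measure per car, similar to AVG with car name on Detail."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key], []).append(row)
    return {
        car: {m: mean(r[m] for r in records) for m in measures}
        for car, records in grouped.items()
    }


# Made-up rows for illustration only; "car a" appears twice, as if for two model years.
rows = [
    {"car_name": "car a", "mpg": 20.0, "displacement": 200.0,
     "horsepower": 100.0, "weight": 3000.0, "acceleration": 15.0},
    {"car_name": "car a", "mpg": 22.0, "displacement": 200.0,
     "horsepower": 110.0, "weight": 3100.0, "acceleration": 14.0},
    {"car_name": "car b", "mpg": 30.0, "displacement": 100.0,
     "horsepower": 70.0, "weight": 2000.0, "acceleration": 18.0},
]

pairs = measure_pairs(MEASURES)
averages = average_by_car(rows, MEASURES)

print(len(pairs))                 # number of meaningful scatter plots
print(averages["car a"]["mpg"])   # one averaged point per car, not a sum
```

Summing the two "car a" rows would give an mpg of 42.0, which describes no real car, whereas averaging keeps the value on the same scale as the individual records.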