Delivery data visualization via PowerBI


I mentioned previously that on each module we complete in Refocus, there's a final task that will be checked by the mentor/s. PowerBI is already under the advanced module and I will be revisiting one of the assignments under the PowerBI module. This seems to be the final task few months ago but they've already made changes to the modules so this one became one of our assignments that also needs to be submitted for mentor review.


We have a sample dataset from a delivery company operating in India. The delivered items include sides, drinks, deserts, etc., and are usually delivered in boxes or bags. In general, the delivery will be done using a car but in bigger cities where homes and restaurants are closer together, the company may opt into using bikes or motorized scooters.


The requirements are the following:

- Import and clean the data then build a dashboard for it. The dashboard should show the vehicle type, type of order, weather conditions, and road traffic. Should also show how the road traffic affects the average time taken for delivery.

- There are two sets of geographical coordinates on the dataset. Choose what's best for the dashboard with fewer mistakes and outliers. Add a map on the dashboard too.

- Provide median time for deliveries, average time for deliveries made during sunny and stormy weather, and the difference between average delivery time on ordinary scooters and motorcycle.

- Provide suggestion/s to the delivery company for better data collection and more precise results.


The dataset is in a CSV format with the following headers:




I imported the CSV file on PowerBI then started with the data cleaning process. First order of business is fix the header names and promote the first row under the data transformation option.

NaN values from Time Ordered column are removed. When filtering only those results, most of the other fields are either also Nan or the delivery location is too close to the restaurant location. No entry from that column means that no order and it's not a useful part of the data.

After a few steps/actions for cleaning, here's what it looks like now. Note that I did not include every column on the sample screenshots.




The cleaned table from the CSV will be our Fact table. I will not explain this further but you should know this already if you're into this software/tool. I made few different Dimensions table (Dim for short) which are derived from the Fact table. 


I made four dimension tables for data modelling. The dimension tables are for the following: Vehicle type, order type, weather conditions, road traffic

The dimension tables were all connected to the fact table via the ID during the modelling. I used a simple schema here.

After the schema is set, I then started making the dashboard by putting bar charts and a map. It's fully interactive which is one of the differences of PowerBI from Excel. 

Here's a short video preview.



Let's return to the requirements earlier.


We're done with importing and cleaning the data. We have already made the dashboard which shows the vehicle type, the types of order, weather conditions, and road traffic. The average times are also available when we hover on the bars.

For the geographical coordinates, I used the latitude and longitude for the delivery location. When I checked the Restaurant Location, some of them have pin on Indian Ocean so that's not reliable. I removed the entries with latitude and longitude of less than 1 from the delivery location because coordinates with values of zero also pins to somewhere in African waters.

The median time is automatically changing depending on what information we choose. Anyone who will look into the dashboard will get the information they are after, depending on what they want to see.

As for the suggestion/s for the delivery company regarding data collection, they should pay attention with the coordinates and time ordered. We have lots of rows with location that points out to nothing and rows with no recorded time for when the customers have ordered. We can't be sure if those are actually garbage data or there's something wrong with the data collection process and those might actually be helpful if filled with factual data.


My assignment got approved in one go. The feedback I received from the mentor is to utilize all of the available spaces on the dashboard and to also provide explicit, actionable steps/suggestions with the recipient in mind. This is something that I'm continuously working on. I know that analysis is much like storytelling and the audience should end up with deciding what their next steps would be based on the data.


Thanks for reading and I'll see you on the next one.


- Marc








Comments