A food delivery company is trying to understand delays in order delivery. You've been asked to analyze the dataset and help uncover what might be contributing to the delays.
Use this file to document your reasoning and code. Dot is available in the chat to support you—feel free to talk through your ideas or ask questions as you go.
Use this section to do initial exploration and to outline your approach to the problem.
You might include:
# Code goes here, feel free to add cells
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('FastDrop2.csv')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45584 entries, 0 to 45583
Data columns (total 20 columns):
 #   Column                       Non-Null Count  Dtype
---  ------                       --------------  -----
 0   ID                           45584 non-null  object
 1   Delivery_person_ID           45584 non-null  object
 2   Delivery_person_age          43730 non-null  float64
 3   Delivery_person_ratings      43676 non-null  float64
 4   Restaurant_latitude          45584 non-null  float64
 5   Restaurant_longitude         45584 non-null  float64
 6   Delivery_location_latitude   45584 non-null  float64
 7   Delivery_location_longitude  45584 non-null  float64
 8   Order_date                   45584 non-null  object
 9   Time_ordered                 43853 non-null  object
 10  Time_order_picked            45584 non-null  object
 11  Weather_conditions           44968 non-null  object
 12  Road_traffic_density         44983 non-null  object
 13  Vehicle_condition            45584 non-null  int64
 14  Type_of_order                45584 non-null  object
 15  Type_of_vehicle              45584 non-null  object
 16  Multiple_deliveries          44591 non-null  float64
 17  Festival                     45356 non-null  object
 18  City                         44384 non-null  object
 19  Time_taken (min)             45584 non-null  int64
dtypes: float64(7), int64(2), object(11)
memory usage: 7.0+ MB
# drop ID columns since they don't mean anything
df = df.drop(labels=['ID', 'Delivery_person_ID'], axis=1)
# see some quick data examples
df.head()
[HTML output]
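The `df.info()` output above shows fewer than 45584 non-null values in eight columns (Delivery_person_age, Delivery_person_ratings, Time_ordered, Weather_conditions, Road_traffic_density, Multiple_deliveries, Festival, and City), so a natural next exploration step is to quantify the gaps before deciding how to handle them. The following is a minimal sketch of one way to do that. The helper name `summarize_missing` and the small `toy` frame are hypothetical, introduced only for illustration; the toy values are made up and are not taken from FastDrop2.csv.

```python
import numpy as np
import pandas as pd


def summarize_missing(frame: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages for columns that have any nulls.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(2),
    })
    # Keep only columns with at least one missing value, largest gaps first
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Made-up stand-in data that borrows three column names from the dataset
toy = pd.DataFrame({
    "Delivery_person_age": [29.0, np.nan, 35.0, np.nan],
    "City": ["A", "B", None, "A"],
    "Time_taken (min)": [24, 33, 26, 21],
})

missing_summary = summarize_missing(toy)
print(missing_summary)
```

In the notebook itself, the same call would be `summarize_missing(df)` on the frame loaded above. Whether to drop or impute the affected rows is a separate decision that depends on how the missing values relate to delivery time.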