Toronto bike theft analysis
Python
Packages: pandas, NumPy, matplotlib, seaborn, scikit-learn, etc.
Source code

01.

About project


Bicycle theft in different regions of Toronto is a persistent problem for both the Toronto Police and residents.
This project is conducted to support public safety and raise awareness of local bicycle theft.
It helps people assess whether a stolen bicycle is likely to be returned.

With the results of this analysis, the Toronto Police can strengthen theft prevention in specific areas, and residents can take extra care and adopt preventive measures such as anti-theft locks.
Over time, this should help reduce the number of bicycle theft cases.

This analysis is based on open data provided by the Toronto government and police.

02.

Activities


Data Exploration


1.1 Load the 'Bicycle_Thefts.csv' file into a DataFrame and describe the data elements (columns): provide descriptions, types, ranges, and values as appropriate.
1.2 Statistical assessments, including means, averages, and correlations.
1.3 Missing data evaluations – use pandas, NumPy, and any other Python packages.
1.4 Graphs and visualizations – use pandas, matplotlib, seaborn, NumPy, and any other Python packages.
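
The exploration steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the toy frame stands in for 'Bicycle_Thefts.csv', and the column names (Cost_of_Bike, Bike_Speed, Status) and the summarize helper are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# In the project the frame would come from the CSV file, e.g.
#   df = pd.read_csv("Bicycle_Thefts.csv")
# The toy frame below stands in for it; its column names are hypothetical.
df = pd.DataFrame({
    "Cost_of_Bike": [500.0, 1200.0, np.nan, 300.0, 800.0],
    "Bike_Speed": [21, 18, 7, np.nan, 10],
    "Status": ["STOLEN", "RECOVERED", "STOLEN", "STOLEN", "RECOVERED"],
})


def summarize(frame, target="Status"):
    """Collect the basic exploration outputs for one DataFrame."""
    numeric = frame.select_dtypes(include="number")
    return {
        "dtypes": frame.dtypes,                         # column types (1.1)
        "stats": numeric.describe(),                    # count, mean, min, max, ... (1.2)
        "correlations": numeric.corr(),                 # pairwise Pearson correlations (1.2)
        "missing": frame.isna().sum(),                  # missing values per column (1.3)
        "target_counts": frame[target].value_counts(),  # class balance, input for a pie chart (1.4)
    }


summary = summarize(df)
print(summary["missing"])
print(summary["target_counts"])
```

The target_counts series could be passed to a plotting call such as summary["target_counts"].plot.pie() once matplotlib is available.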


Data Modelling


2.1 Data transformations – handle missing data, manage categorical data, and normalize or standardize data as needed.
2.2 Feature selection – use pandas and scikit-learn.
2.3 Train/test data splitting – use NumPy and scikit-learn.
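
One way these modelling steps might fit together with scikit-learn is sketched below. The data is synthetic, the column names (Cost_of_Bike, Bike_Speed, Premises_Type) are hypothetical, and the choices of median imputation, SelectKBest, and k=3 are assumptions for illustration rather than the project's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the theft data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Cost_of_Bike": rng.normal(800, 300, n),
    "Bike_Speed": rng.integers(1, 30, n).astype(float),
    "Premises_Type": rng.choice(["House", "Apartment", "Outside"], n),
})
df.loc[df.index % 10 == 0, "Cost_of_Bike"] = np.nan  # inject missing values
y = (rng.random(n) < 0.3).astype(int)                # 1 = recovered (made-up label)

numeric_cols = ["Cost_of_Bike", "Bike_Speed"]
categorical_cols = ["Premises_Type"]

# 2.1: impute and standardize numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# 2.3: stratified split so both sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=42
)

# 2.2: keep the k features with the highest ANOVA F-score against the label.
prepare = Pipeline([
    ("preprocess", preprocess),
    ("select", SelectKBest(f_classif, k=3)),
])
X_train_prepared = prepare.fit_transform(X_train, y_train)
X_test_prepared = prepare.transform(X_test)
print(X_train_prepared.shape, X_test_prepared.shape)
```

The sketch splits the data before fitting the transformers so that imputation values, scaling statistics, and feature scores are learned from the training rows only.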


Predictive model building


3.1 Use logistic regression and decision trees as a minimum – use scikit-learn.
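
A minimal sketch of fitting both model types is shown below. It uses synthetic data from make_classification in place of the prepared theft features, and the hyperparameters (max_iter, max_depth, class_weight) are illustrative assumptions, not the project's tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for the prepared features.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights the classes, which can help when
# one outcome is much rarer than the other.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "decision_tree": DecisionTreeClassifier(
        max_depth=5, class_weight="balanced", random_state=0
    ),
}

for name, model in models.items():
    model.fit(X_train, y_train)

predictions = {name: model.predict(X_test) for name, model in models.items()}
```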


Model scoring and evaluation


4.1 Present results as scores, confusion matrices, and ROC curves – use scikit-learn.
4.2 Select and recommend the best performing model.
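
The scoring and selection steps could look roughly like this. The data is synthetic, the evaluate helper is hypothetical, and picking the winner by ROC AUC is an assumption made for the example; the project may have weighed other scores.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the prepared theft features.
X, y = make_classification(n_samples=500, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)


def evaluate(model, X_eval, y_eval):
    """Return accuracy, confusion matrix, and ROC AUC for a fitted classifier."""
    predicted = model.predict(X_eval)
    positive_proba = model.predict_proba(X_eval)[:, 1]
    return {
        "accuracy": accuracy_score(y_eval, predicted),
        "confusion_matrix": confusion_matrix(y_eval, predicted),
        "roc_auc": roc_auc_score(y_eval, positive_proba),
    }


models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=1),
}

results = {
    name: evaluate(model.fit(X_train, y_train), X_test, y_test)
    for name, model in models.items()
}

# 4.2: recommend the model with the highest ROC AUC on the held-out set.
best_name = max(results, key=lambda name: results[name]["roc_auc"])
print(best_name, results[best_name]["roc_auc"])
```

To draw the ROC curve itself, sklearn.metrics.roc_curve returns the false-positive and true-positive rates that can be passed to a matplotlib line plot.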


Deploying the model


5.1 Use the Flask framework to turn the selected machine-learning model into an API.
5.2 Use the pickle module to serialize and deserialize the model.
5.3 Build a client to test the model API service. Use test data that was not previously used to train the model.
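
The deployment steps can be sketched in a single script as follows. This is an illustration under assumptions: the /predict route, the "features" JSON key, and the synthetic model are all hypothetical, and Flask's built-in test client stands in for a separate client program.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a small stand-in model; the last 50 rows are held out for the client.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])

# 5.2: serialize the model to bytes, then deserialize it again.
# In a real deployment the bytes would be written to and read from a file.
model_bytes = pickle.dumps(model)
loaded_model = pickle.loads(model_bytes)

# 5.1: expose the deserialized model through a Flask endpoint.
app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Expected body: {"features": [[f1, f2, f3, f4], ...]} (hypothetical schema)
    payload = request.get_json()
    predicted = loaded_model.predict(payload["features"]).tolist()
    return jsonify({"predictions": predicted})


# 5.3: call the API with rows that were not used for training.
client = app.test_client()
response = client.post("/predict", json={"features": X[150:155].tolist()})
result = response.get_json()
print(response.status_code, result)
```

Outside of a test, the service would be started with app.run() and called over HTTP by a separate client. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.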



03.

Data Insights & Visualizations


Pie Chart




Map




Location




Time



Bike