Would you survive the Titanic?

Apr 9, 2022

Introduction
This report analyzes what kind of people were likely to survive the Titanic disaster, using machine learning tools to predict which passengers survived. The data provides several features and comes as two datasets, one for training and one for testing. I investigate the question with logistic regression, whose algorithm and Python code were fully described in the machine learning classes.

Training Model

At this point, we train the model and use the loss function to look for the best learning rate and number of iterations.
The model was trained on the training dataset for 100,000 iterations. Learning rates on the order of 1e-3 gave better results than rates on the order of 1e-4, so I ran the search again over the range 1e-3 to 4e-3. With a learning rate of 0.004, the loss converged after approximately 50,000 iterations (second figure).
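The learning-rate search described above can be sketched as follows. This is a minimal illustration, not the code from the original report: the function names (sigmoid, log_loss, train) are hypothetical, the data is a small synthetic stand-in rather than the Titanic training set, and the iteration count is reduced so the example runs quickly.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy, clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def train(X, y, learning_rate, n_iterations):
    """Batch gradient descent for logistic regression."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    losses = []
    for _ in range(n_iterations):
        error = sigmoid(X @ w + b) - y           # gradient of the loss w.r.t. z
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
        losses.append(log_loss(y, sigmoid(X @ w + b)))  # loss after the update
    return w, b, losses

# Synthetic stand-in data (assumption): two features, noisy linear boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(float)

# Compare the final loss for each candidate learning rate.
final_losses = {}
for lr in (1e-3, 2e-3, 3e-3, 4e-3):
    w, b, losses = train(X, y, lr, n_iterations=2000)
    final_losses[lr] = losses[-1]
    print(f"learning rate {lr:g}: final loss {losses[-1]:.4f}")
```

Plotting each losses list against the iteration number gives the kind of convergence curve the figures refer to.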

Analyze our model
After training, we obtain a weight for each feature and a bias:
Class = -1.1468791
Sex = 2.80207082
Age = -0.03980783
Number Sibling = -0.32565276
Number Children = -0.09992111
Fare = 0.00564653
Bias = 2.242518787043841
By observing the weights, we can conclude that the higher the class number, the lower the chance of survival, because the weight for class is negative (first-class passengers had a better chance of survival than second- or third-class passengers). The weight for sex is positive, so females had a better chance than males (male = 0 and female = 1 in the dataset). The weight for age is negative, so younger people had a better chance of surviving. The weights for the number of siblings and the number of children are negative, so passengers with fewer siblings and children had, statistically, a better chance of surviving. The weight for fare is positive, which shows that passengers who bought a more expensive ticket had a better chance of living.
Another thing we can read from this model is that the two largest weights in magnitude belong to class and sex, so we can guess that most of the people who survived were female and travelled in first class. The weight of sex is much larger than the weight of age, but we must be careful about the range of these features: sex takes only the values 0 and 1, while age spans a much wider range. The scatter plot shows the distribution of the two outcomes (survived and died) in the plane defined by the two most influential features.
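To make the reading of the weights concrete, here is a small sketch that plugs the reported weights and bias into the logistic function. Several things here are assumptions: the dictionary keys, the two example passengers, the encoding of class as 1, 2 or 3, the use of raw (unscaled) age and fare, and the 80-year age span used for the range comparison.

```python
import math

# Weights and bias as reported above.
weights = {
    "class": -1.1468791,
    "sex": 2.80207082,        # male = 0, female = 1
    "age": -0.03980783,
    "siblings": -0.32565276,
    "children": -0.09992111,
    "fare": 0.00564653,
}
bias = 2.242518787043841

def survival_probability(passenger):
    """Logistic regression prediction for one passenger (dict of features)."""
    z = bias + sum(weights[name] * passenger[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical passengers, invented for illustration only.
woman_first = {"class": 1, "sex": 1, "age": 30, "siblings": 0, "children": 0, "fare": 80.0}
man_third = {"class": 3, "sex": 0, "age": 30, "siblings": 0, "children": 0, "fare": 8.0}

p_woman = survival_probability(woman_first)
p_man = survival_probability(man_third)
print(f"woman, first class: {p_woman:.2f}")
print(f"man, third class:   {p_man:.2f}")

# exp(weight) is the factor by which the odds of survival change
# when the feature increases by one unit.
odds_ratios = {name: math.exp(w) for name, w in weights.items()}
print(f"odds ratio for sex: {odds_ratios['sex']:.1f}")

# Range matters: the largest possible shift in the log-odds is
# weight * (feature range). Sex spans 1 unit; age is assumed to span ~80 years.
max_shift_sex = abs(weights["sex"] * 1)
max_shift_age = abs(weights["age"] * 80)
print(f"max log-odds shift: sex {max_shift_sex:.2f}, age {max_shift_age:.2f}")
```

Under these assumptions, the total effect of age across its range is comparable to the effect of sex, even though the per-unit weight of age is far smaller.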

Yellow marks the passengers who survived. We can tell that most of the women who survived were from class 1 and most of the men who died were from class 3. In the chart below, almost no one older than 65 survived.

Finally, we check the accuracy of the model on the training data, and the result is almost 80%.

After interpreting the model, we can use it to make predictions on the test dataset.

Evaluate the model
Loading the test dataset makes it possible to evaluate the model. With the threshold kept at 0.5, the test accuracy reached 78.5%. This is only slightly lower than the training accuracy, so the model does not seem to have an overfitting or underfitting problem. To try to improve the model, I removed the features with the smallest weights and fit the model again. After recomputing the weights and bias on the reduced training dataset and applying them to the test dataset, the accuracy increased by 0.5 percentage points to 79%.
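The accuracy check at a 0.5 threshold can be sketched like this. The function names and the toy arrays are hypothetical and only illustrate the mechanics; they are not the Titanic test data or the trained model.

```python
import numpy as np

def predict(X, w, b, threshold=0.5):
    """Return 1 where the predicted probability reaches the threshold, else 0."""
    probabilities = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return (probabilities >= threshold).astype(int)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))

# Toy example (invented values): one feature, weight 1.0, bias 0.0.
X_toy = np.array([[2.0], [-1.0], [0.5], [-3.0]])
y_toy = np.array([1, 0, 0, 0])
w_toy = np.array([1.0])
b_toy = 0.0

y_pred = predict(X_toy, w_toy, b_toy)
toy_accuracy = accuracy(y_toy, y_pred)
print(y_pred, toy_accuracy)
```

Running the same two functions on the training set and on the test set gives the two numbers being compared; a large gap between them would be the usual sign of overfitting.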

Written by Shayan Kamalzadeh

.Net Developer www.linkedin.com/in/shayan-kamalzadeh
