You are a machine learning engineer, and you are given a dataset related to a rare but fatal disease. You are asked to design an ML model that can classify whether a person suffers from the disease or not. After performing EDA and trying different algorithms, you fit the data to your highest-performing model. Eureka! The accuracy of the model is 98.37%. Satisfied with the high accuracy score, you test the model with real-time data. Investigations reveal that the model only predicts 1.63% as “diagnosed positive”. This results in False Positive classification, costing…
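To see why a high accuracy score can mislead on a rare-disease dataset, here is a minimal sketch. The teaser does not say how the model behaves, so the setup is an assumption: a made-up dataset of 10,000 patients, 1.63% of whom are truly positive, and a degenerate model that labels everyone as negative. Under those assumptions the accuracy comes out at 98.37% even though every sick patient is missed.

```python
# Hypothetical dataset (assumption): 10,000 patients, 1.63% truly positive.
n_patients = 10_000
n_positive = 163

y_true = [1] * n_positive + [0] * (n_patients - n_positive)

# Assumed degenerate model: it predicts "negative" for every patient.
y_pred = [0] * n_patients

# Count the four confusion-matrix cells by hand.
pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / n_patients
# Recall: the share of truly sick patients that the model catches.
recall = tp / (tp + fn)

print(f"accuracy = {accuracy:.2%}")
print(f"recall   = {recall:.2%}")
print(f"missed patients (false negatives) = {fn}")
```

In this sketch the accuracy is driven entirely by the majority class, which is why metrics such as recall are worth checking alongside it on imbalanced data.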


“You see but you do not observe…” says Sherlock Holmes in one of his adventures. The quote captures the foundation of Holmes’ deductions: gather all the available information about a case before making conjectures about it.

Data analysis is a refined way of talking to a dataset and uncovering what it has to offer. Extracting trends and patterns from the data requires a well-defined, methodical path. Before trying to predict future trends, one must check whether the collected sample is good enough for analysis. A lot of time is invested in exploring, cleaning and preprocessing data…


In data science, one has to extract what the data actually holds, beyond the numbers in a table. To do so, a data scientist employs numerous tools to explore, visualize and derive insights from the given dataset. Statistics allows us to establish the concrete structure and makeup of the data. Descriptive statistics help answer questions such as:

  • How biased is the dataset?
  • What is the center around which the data is clustered?
  • What are the unusual values for certain features in the dataset?
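
The three questions above can be sketched with Python's standard `statistics` module. This is an illustration under assumptions, not the article's own method: the feature values are made up, "bias" is read here as skewness (measured with Pearson's second skewness coefficient), and unusual values are flagged with the common 1.5 × IQR rule.

```python
import statistics

# Hypothetical feature values (made up for illustration).
values = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]

# Center: mean and median. A large gap between them hints at skew.
mean = statistics.mean(values)
median = statistics.median(values)

# Skew (assumed reading of "bias"): Pearson's second skewness coefficient.
stdev = statistics.stdev(values)
skewness = 3 * (mean - median) / stdev

# Unusual values: anything outside 1.5 * IQR from the quartiles.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < lower or v > upper]

print(f"mean = {mean}, median = {median}")
print(f"skewness = {skewness:.2f}")
print(f"outliers = {outliers}")
```

A positive skewness together with a mean well above the median suggests that a few large values are pulling the distribution to the right, which the outlier check then makes explicit.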

Let’s see the different statistical techniques used…

Unsplash image by Spandan-Pattanayak

Are you aware that about 1 in 8 women in the United States will be diagnosed with breast cancer? Breast cancer is a disease in which cells in the breast tissue mutate and multiply in large numbers to form a malignant tumor. It can occur at almost any age.

A few risk factors that contribute to breast cancer:

  • It occurs about 100 times more often in women than in men.
  • Genetic factors, such as a mother, father, sister or brother who has been diagnosed with ovarian or breast cancer.
  • Early puberty, late menopause or childbirth…

Ranjani Rajamani

Machine Learning Enthusiast
