# Exploratory Data Analysis

In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
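As a small illustration of "summarizing main characteristics" numerically, here is a minimal sketch of a five-number summary (minimum, quartiles, maximum) using only Python's standard library. The sample values and the helper name `five_number_summary` are made up for this example, and the "inclusive" quartile convention is just one of several in common use.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample, invented purely for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = five_number_summary(sample)
print(summary)  # (2, 4.0, 7.0, 10.0, 15)
```

These five numbers are the same quantities a box plot draws, so this is a quick way to get a feel for spread and skew before plotting anything.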

EDA may or may not involve a statistical model; its primary purpose is to see what the data can tell us beyond formal modeling or hypothesis testing.

Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.

EDA differs from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed.

EDA encompasses IDA.
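The IDA steps mentioned above (handling missing values and transforming variables) might look roughly like the sketch below. The data are hypothetical, `None` is assumed to mark a missing observation, and dropping missing entries followed by a log transform is only one possible strategy, not a recommendation for every data set.

```python
import math
import statistics

# Hypothetical raw column; None marks a missing observation.
raw = [1.2, None, 3.4, 150.0, 2.8, None, 4.1, 90.5]

# Step 1: quantify missingness before deciding how to handle it.
missing_count = sum(1 for v in raw if v is None)
missing_share = missing_count / len(raw)

# Step 2: one simple handling strategy is to drop the missing entries.
observed = [v for v in raw if v is not None]

# Step 3: a log transform compresses a long right tail (requires values > 0).
logged = [math.log(v) for v in observed]

# Gap between mean and median is a rough indicator of skew.
gap_before = abs(statistics.mean(observed) - statistics.median(observed))
gap_after = abs(statistics.mean(logged) - statistics.median(logged))

print(f"missing: {missing_count} of {len(raw)} ({missing_share:.0%})")
print(f"mean-median gap before transform: {gap_before:.2f}")
print(f"mean-median gap after transform:  {gap_after:.2f}")
```

A shrinking gap between mean and median after the transform suggests the variable is less skewed, which is often what model assumptions call for.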

# Statistical Graphics

Statistical graphics, also known as graphical techniques, are graphics in the field of statistics used to visualize quantitative data.

Whereas statistics and data analysis procedures generally yield their output in numeric or tabular form, graphical techniques allow such results to be displayed in pictorial form.

They include plots such as scatter plots, histograms, probability plots, spaghetti plots, residual plots, box plots, block plots and biplots.
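To make one of these plots concrete, here is a minimal sketch of how a histogram is built: values are counted into equal-width bins, and the counts are drawn as bars. It prints a text version so that it runs without any plotting library; the function name `histogram_counts` and the data are assumptions made for this example.

```python
def histogram_counts(values, bin_count):
    """Count how many values fall into each of bin_count equal-width bins."""
    low, high = min(values), max(values)
    # Fall back to a width of 1 when all values are identical.
    width = (high - low) / bin_count or 1
    counts = [0] * bin_count
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    return low, width, counts

# Hypothetical measurements, invented for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
low, width, counts = histogram_counts(data, bin_count=4)

# Each row shows a bin's lower and upper edge; the last bin also includes its upper edge.
for i, count in enumerate(counts):
    left = low + i * width
    print(f"{left:4.1f} to {left + width:4.1f} | {'#' * count}")
```

Even in this crude form the shape is visible: most values cluster at the low end and a single value sits far to the right. The same counts could be handed to any plotting library to draw the bars properly.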

Exploratory data analysis (EDA) relies heavily on such techniques.

They can also provide insight into a data set to help with testing assumptions, model selection and regression model validation, estimator selection, relationship identification, factor effect determination, and outlier detection.
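Outlier detection is a good example of how a graphic encodes a rule. A box plot conventionally flags points lying more than 1.5 times the interquartile range beyond the quartiles (often called Tukey's fences). The sketch below applies that rule directly; the readings are hypothetical, and both the multiplier `k=1.5` and the "inclusive" quartile method are conventions that can be changed.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical readings with one suspicious value.
readings = [10, 12, 12, 13, 14, 15, 15, 16, 48]
print(tukey_outliers(readings))  # [48]
```

A flagged point is a prompt to investigate, not proof of an error; whether to keep, correct, or drop it depends on what the data represent.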

In addition, choosing appropriate statistical graphics can provide a convincing means of communicating to others the underlying message present in the data.

Graphical statistical methods have four objectives:

• Exploring the content of a data set
• Finding structure in data
• Checking assumptions in statistical models
• Communicating the results of an analysis

If one is not using statistical graphics, then one is forfeiting insight into one or more aspects of the underlying structure of the data.
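For the assumption-checking objective, a residual plot is the classic tool: fit a model, then look at what is left over. Here is a minimal sketch, assuming a straight-line least-squares fit to deliberately curved, made-up data. The helper `fit_line` is written out by hand for illustration and is not taken from any particular library.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data that follow y = x**2, so a straight line is the wrong model.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 4, 9, 16, 25, 36]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Print each residual with its sign; a plot would show the same pattern.
for x, r in zip(xs, residuals):
    print(f"x={x}  residual={r:+.2f}")
```

The residuals come out positive at both ends and negative in the middle. That U shape is the kind of structure a residual plot makes obvious at a glance, and it signals that the linearity assumption does not hold here.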

Welcome to my Data Science blog. Please visit my career portfolio at https://mruanova.com 🚀🌎
