
Scatter plot

A scatter plot, also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram,[3] is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data. If the points are coded (color/shape/size), one additional variable can be displayed. The data are displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.[4]

Scatter plot
One of the Seven Basic Tools of Quality
First described by: John Herschel[1]
Purpose: To identify the type of relationship (if any) between two quantitative variables
Waiting time between eruptions and the duration of the eruption for the Old Faithful Geyser in Yellowstone National Park, Wyoming, USA. This chart suggests there are generally two types of eruptions: short-wait-short-duration, and long-wait-long-duration.
A 3D scatter plot allows the visualization of multivariate data. This scatter plot takes multiple scalar variables and uses them for different axes in phase space. The different variables are combined to form coordinates in the phase space and they are displayed using glyphs and coloured using another scalar variable.[2]

Overview

A scatter plot can be used either when one continuous variable is under the control of the experimenter and the other depends on it, or when both continuous variables are independent. If a parameter is systematically incremented or decremented by the experimenter, it is called the control parameter or independent variable and is customarily plotted along the horizontal axis. The measured or dependent variable is customarily plotted along the vertical axis. If no dependent variable exists, either variable can be plotted on either axis, and the scatter plot illustrates only the degree of correlation (not causation) between the two variables.[citation needed]

A scatter plot can suggest various kinds of correlation between variables, with varying degrees of confidence. For example, for weight and height, weight would be on the y-axis and height would be on the x-axis. Correlations may be positive (rising), negative (falling), or null (uncorrelated). If the pattern of dots slopes from lower left to upper right, it indicates a positive correlation between the variables being studied. If the pattern of dots slopes from upper left to lower right, it indicates a negative correlation. A line of best fit (also called a trendline) can be drawn to study the relationship between the variables. An equation for the relationship between the variables can be determined by established best-fit procedures. For a linear correlation, the best-fit procedure is known as linear regression and is guaranteed to generate a correct solution in a finite time. No universal best-fit procedure is guaranteed to generate a correct solution for arbitrary relationships. A scatter plot is also useful for seeing how well two comparable data sets agree and for revealing nonlinear relationships between variables. The ability to do this can be enhanced by adding a smooth line such as LOESS.[5] Furthermore, if the data are represented by a mixture model of simple relationships, these relationships will be visually evident as superimposed patterns.[citation needed]
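
As a minimal sketch of the linear case described above, the following Python computes the Pearson correlation coefficient and the ordinary least squares line in closed form. The function names and the height and weight values are assumptions invented for illustration; they do not come from any cited source.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (mean_x, mean_y, sxx, syy, sxy) for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson correlation: positive if rising, negative if falling."""
    _, _, sxx, syy, sxy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights (cm) and weights (kg), invented for illustration only.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 67, 75, 83]

r = pearson_r(heights, weights)
slope, intercept = best_fit_line(heights, weights)
print(f"r = {r:.3f}")
print(f"weight ~ {slope:.2f} * height + {intercept:.1f}")
```

Because the least squares line has a closed-form solution, the fit takes a fixed number of passes over the data. Nonlinear relationships generally need iterative or smoothing methods instead.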

The scatter diagram is one of the seven basic tools of quality control.[6]

Scatter charts can be built in the form of bubble, marker, and/or line charts.[7]

Example

For example, to display a link between a person's lung capacity and how long that person could hold their breath, a researcher would choose a group of people to study, then measure each person's lung capacity (first variable) and how long that person could hold their breath (second variable). The researcher would then plot the data in a scatter plot, assigning "lung capacity" to the horizontal axis and "time holding breath" to the vertical axis.[citation needed]

A person with a lung capacity of 400 cl who held their breath for 21.7 s would be represented by a single dot on the scatter plot at the point (400, 21.7) in Cartesian coordinates. The scatter plot of all the people in the study would let the researcher compare the two variables visually and would help determine what kind of relationship there might be between them.[citation needed]
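
The mapping from one measured pair to one dot can be sketched with a small text renderer. This is only an illustration: the function name `ascii_scatter` and the grid size are assumptions, and every data pair except (400, 21.7) is a hypothetical value invented for the sketch.

```python
def ascii_scatter(points, width=40, height=10):
    """Render (x, y) pairs as a text grid; each data point becomes one '*'."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1  # avoid division by zero if all x are equal
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each value linearly onto the available columns and rows.
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)


# (lung capacity in cl, breath-hold time in s). Only (400, 21.7) comes from
# the text above; the other pairs are hypothetical, invented for illustration.
people = [(400, 21.7), (350, 17.2), (450, 25.9), (500, 30.1), (550, 33.0)]
print(ascii_scatter(people))
```

The horizontal position encodes lung capacity and the vertical position encodes time, so dots that rise from lower left to upper right suggest a positive relationship.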

Scatter plot matrices

For a set of data variables (dimensions) X1, X2, ... , Xk, the scatter plot matrix shows all the pairwise scatter plots of the variables in a single view, arranged in a matrix format. For k variables, the scatter plot matrix contains k rows and k columns. The plot located at the intersection of the ith row and jth column is a plot of variable Xi versus Xj.[8] This means that each row and each column corresponds to one dimension, and each cell is a scatter plot of two dimensions.[citation needed]
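
The row and column indexing can be sketched as follows. The function name `scatter_matrix_layout` and the three variable names are assumptions made for illustration; the sketch only enumerates which pair of variables each cell would plot and does not draw anything.

```python
def scatter_matrix_layout(names):
    """Return a k-by-k grid of (row variable, column variable) pairs.

    Cell (i, j) describes the plot of names[i] versus names[j].
    """
    return [[(row_var, col_var) for col_var in names] for row_var in names]


variables = ["X1", "X2", "X3"]  # hypothetical variable names
layout = scatter_matrix_layout(variables)
for row in layout:
    print("  ".join(f"{a} vs {b}" for a, b in row))
```

The diagonal cells pair each variable with itself, and cells (i, j) and (j, i) show the same pair of variables with the axes swapped.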

A generalized scatter plot matrix[9] offers a range of displays of paired combinations of categorical and quantitative variables. A mosaic plot, fluctuation diagram, or faceted bar chart may be used to display two categorical variables. Other plots are used for one categorical and one quantitative variable.

 
Visualization of 3D data along with the corresponding scatter plot matrix

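See also

  • Data and information visualization
  • Rug plot
  • Bar graph
  • Line chart
  • Scagnostics
  • Dot plot (statistics)
  • Parity plot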

References

  1. ^ Friendly, Michael; Denis, Dan (2005). "The early origins and development of the scatterplot". Journal of the History of the Behavioral Sciences. 41 (2): 103–130. doi:10.1002/jhbs.20078. PMID 15812820.
  2. ^ Visualizations that have been created with VisIt at wci.llnl.gov. Last updated: November 8, 2007.
  3. ^ Jarrell, Stephen B. (1994). Basic Statistics (Special pre-publication ed.). Dubuque, Iowa: Wm. C. Brown Pub. p. 492. ISBN 978-0-697-21595-6. When we search for a relationship between two quantitative variables, a standard graph of the available data pairs (X,Y), called a scatter diagram, frequently helps...
  4. ^ Utts, Jessica M. (2005). Seeing Through Statistics (3rd ed.). Thomson Brooks/Cole. pp. 166–167. ISBN 0-534-39402-7.
  5. ^ Cleveland, William (1993). Visualizing Data. Murray Hill, N.J.; Summit, N.J.: AT&T Bell Laboratories; published by Hobart Press. ISBN 978-0963488404.
  6. ^ Tague, Nancy R. (2004). "Seven Basic Quality Tools". The Quality Toolbox. Milwaukee, Wisconsin: American Society for Quality. p. 15. Retrieved 2010-02-05.
  7. ^ "Scatter Chart - AnyChart JavaScript Chart Documentation". AnyChart. Archived from the original on 1 February 2016. Retrieved 3 February 2016.
  8. ^ Scatter Plot Matrix at itl.nist.gov.
  9. ^ Emerson, John W.; Green, Walton A.; Schoerke, Barret; Crowley, Jason (2013). "The Generalized Pairs Plot". Journal of Computational and Graphical Statistics. 22 (1): 79–91. doi:10.1080/10618600.2012.694762. S2CID 28344569.

Further reading

  • Cattaneo, Matias D.; Crump, Richard K.; Farrell, Max H.; Feng, Yingjie (2024). "On Binscatter". American Economic Review. 114 (5): 1488–1514.

External links

  • Media related to Scatterplots at Wikimedia Commons
  • What is a scatterplot? Archived 2020-08-07 at the Wayback Machine
  • Correlation scatter-plot matrix for ordered-categorical data – Explanation and R code
  • Density scatterplot for large datasets (hundreds of millions of points)
