This article is part of the Stata for Students series. If you are new to Stata we strongly recommend reading all the articles in the Stata Basics section.

A scatterplot is an excellent tool for examining the relationship between two quantitative variables. One variable is designated as the Y variable and one as the X variable, and a point is placed on the graph for each observation at the location corresponding to its values of those variables. If you believe there is a causal relationship between the two variables, convention suggests you make the cause X and the effect Y, but a scatterplot is useful even if there is no such relationship.

This section will teach you how to make scatterplots. Using Graphs discusses what you can do with a graph once you've made it, such as printing it, adding it to a Word document, etc.

If you plan to carry out the examples in this article, make sure you've downloaded the GSS sample to your U:\SFS folder as described in Managing Stata Files. Then create a do file called scatter.do in that folder that loads the GSS sample as described in Doing Your Work Using Do Files. If you plan on applying what you learn directly to your homework, create a similar do file but have it load the data set used for your assignment.

To create a scatterplot, use the scatter command, then list the variables you want to plot. The first variable you list will be the Y variable and the second will be the X variable. For example, in a scatterplot of weight against height, the distribution of the points suggests a positive relationship between height and weight.

Regression attempts to find the line that best fits these points. You can plot a regression line or "linear fit" with the lfit command followed, as with scatter, by the variables involved. To add a linear fit plot to a scatterplot, first specify the scatterplot, then put two "pipe" characters (what you get when you press Shift-Backslash) to tell Stata you're now going to add another plot, and then specify the linear fit:

scatter weight height || lfit weight height

You can use similar code to plot subsamples in different colors:

scatter weight height if sex==1 || scatter weight height if sex==2

Unfortunately, the default legend at the bottom is now completely useless, so you'll need to specify what it should say. You can do so with the legend option, which then contains the order option. Within that you give a list of plot numbers and associated labels, much like a list of value labels. The first plot you specify is plot number 1, the second number 2, etc.

scatter weight height if sex==1 || scatter weight height if sex==2, ///

An alternative way to create this plot is to start with the separate command. This creates two variables: weight1, which only exists for males (i.e. it's missing for females and thus won't be plotted), and weight2, which only exists for females. You can then plot both of these variables in a single scatterplot by listing them before height in the scatter command. The default legend for this version is more informative, but you'd still probably want to replace it (and add a title for the Y axis).

You can use similar syntax to plot multiple variables in the same scatterplot. Just list them after the scatter command. The last variable will always be the X variable and any other variables you list will be Y variables. Plotting weight and age against height suggests that while weight is positively related to height, age and height have a very weak relationship if any.

If you run tab height weight (and sift through the rather large amount of output it creates) you'll find a weakness of these plots: sometimes two people have the same height and weight. Stata dutifully plots two points, but the second one completely covers up the first so that you can only see one. In the subsample graphs, a male (blue) point will be covered up by a female (red) point just because the graph for females was the second one specified. This can distort the understanding you get of the distribution of the two variables. In this case it probably doesn't make much difference, but it would be a major problem if you tried to make a scatterplot of two categorical variables. (The underlying problem here is that many respondents seem to have rounded their weight to a multiple of five, making weight act somewhat like a categorical variable.)

There are many, many more options you can set for scatterplots, such as titles and colors.
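The commands described in this article can be collected into one do file section. What follows is a minimal sketch, not the article's original code: it assumes the GSS sample is already loaded and contains variables named weight, height, age, and sex, with sex coded 1 for males and 2 for females (as the description of the subsample plots implies). The legend labels "Male" and "Female" and the Y axis title "Weight" are illustrative choices.

```stata
* Sketch only: assumes the GSS sample is loaded and that sex is coded
* 1 = male, 2 = female. Labels and titles below are illustrative.

* Basic scatterplot: the first variable is Y, the last is X
scatter weight height

* Overlay a linear fit on the scatterplot
scatter weight height || lfit weight height

* Subsamples in different colors. Note the double equals sign:
* Stata uses == to test equality, so "if sex=1" is an error.
scatter weight height if sex==1 || scatter weight height if sex==2, ///
    legend(order(1 "Male" 2 "Female"))

* Alternative: split weight into weight1 (males) and weight2 (females),
* then plot both against height with a replacement legend and Y title
separate weight, by(sex)
scatter weight1 weight2 height, ///
    legend(order(1 "Male" 2 "Female")) ytitle("Weight")

* Several Y variables against a single X variable (height)
scatter weight age height
```

Because separate creates new variables, running this section twice in the same session will fail at the separate line unless weight1 and weight2 are dropped first or the data set is reloaded.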