
Stata Commands: Map Generation



help spmap

Title

Syntax

spmap [attribute] [if] [in] using basemap [,

basemap_options

polygon(polygon_suboptions)

line(line_suboptions)

point(point_suboptions)

diagram(diagram_suboptions)

arrow(arrow_suboptions)

label(label_suboptions)

scalebar(scalebar_suboptions)

graph_options]

Cartogram

area(areavar) draw base map polygons with area proportional to varia

split split multipart base map polygons

map(backgroundmap) draw background map defined in Stata dataset background

mfcolor(colorstyle) fill color of the background map

mocolor(colorstyle) outline color of the background map

mosize(linewidthstyle) outline thickness of the background map

mopattern(linepatternstyle) outline pattern of the background map

Choropleth map

clmethod(method) attribute classification method, where method is one o

clnumber(#) number of classes

clbreaks(numlist) custom class breaks

eirange(min max) attribute range for eqint classification method

kmiter(#) number of iterations for kmeans classification method

ndfcolor(colorstyle) fill color of empty (no data) base map polygons

ndocolor(colorstyle) outline color of empty (no data) base map polygons

ndsize(linewidthstyle) outline thickness of empty (no data) base map polygons

ndlabel(string) legend label of empty (no data) base map polygons

Format

fcolor(colorlist) fill color of base map polygons

ocolor(colorlist) outline color of base map polygons

osize(linewidthstyle_list) outline thickness of base map polygons

Format

fcolor(colorlist) fill color of supplementary polygons

ocolor(colorlist) outline color of supplementary polygons

osize(linewidthstyle_list) outline thickness of supplementary polygons

Format

color(colorlist) polyline color

size(linewidthstyle_list) polyline thickness

pattern(linepatternstyle_list) polyline pattern

Proportional size

proportional(propvar_pn) draw point markers with size proportional to variable

prange(min max) normalization range of variable propvar_pn

psize(relative|absolute) reference system for drawing point markers

Deviation

deviation(devvar_pn) draw point markers as deviations from given reference

refval(mean|median|#) reference value of variable devvar_pn

refweight(weightvar_pn) compute reference value of variable devvar_pn weightin

dmax(#) absolute value of maximum deviation

Format

size(markersizestyle_list) size of point markers

shape(symbolstyle_list) shape of point markers

fcolor(colorlist) fill color of point markers

ocolor(colorlist) outline color of point markers

osize(linewidthstyle_list) outline thickness of point markers

Proportional size

proportional(propvar_dg) draw diagrams with area proportional to variable propva

prange(min max) reference range of variable propvar_dg

Framed-rectangle chart

range(min max) reference range of variable diagvar_dg

refval(mean|median|#) reference value of variable diagvar_dg

refweight(weightvar_dg) compute the reference value of variable diagvar_dg wei

refcolor(colorstyle) color of the line representing the reference value of

refsize(linewidthstyle) thickness of the line representing the reference value

Format

size(#) diagram size

fcolor(colorlist) fill color of the diagrams

ocolor(colorlist) outline color of the diagrams

osize(linewidthstyle_list) outline thickness of the diagrams

Format

direction(directionstyle_list) arrow direction, where directionstyle is one of the fo

hsize(markersizestyle_list) arrowhead size

hangle(anglestyle_list) arrowhead angle

hbarbsize(markersizestyle_list) size of filled portion of arrowhead

hfcolor(colorlist) arrowhead fill color

hocolor(colorlist) arrowhead outline color

hosize(linewidthstyle_list) arrowhead outline thickness

lcolor(colorlist) arrow shaft line color

lsize(linewidthstyle_list) arrow shaft line thickness

lpattern(linepatternstyle_list) arrow shaft line pattern

Description

spmap is aimed at visualizing several kinds of spatial data, and is particularly suited for

spmap functioning rests on three basic principles:

o First, a base map representing a given study region R made up of N polygons is draw

o Second, at the user's choice, one or more types of additional spatial objects can be superimposed onto the base map: polygons (via option polygon()), polylines (via option line()), points (via option point()), diagrams (via option diagram()), arrows (via option arrow()), and labels (via option label()).

o Third, at the user's choice, one or more additional map elements may be added, such title_options).

Proper specification of spmap options and suboptions, combined with the availability of pro choropleth maps, proportional symbol maps, pin maps, pie chart maps, and noncontiguous area While providing sensible defaults for most options and suboptions, spmap gives the user ful highly customized maps.

Spatial data format

spmap requires that the spatial data to be visualized be arranged into properly formatted S backgroundmap, polygon, line, point, diagram, arrow, label.

The master dataset is the dataset that resides in memory when spmap is invoked. At the mini or polygons making up the base map. If a choropleth map is to be drawn, then the master dat feature to be represented. Additionally, if a noncontiguous area cartogram is to be drawn - values of a given numeric variable areavar - then the master dataset should contain also va

A basemap dataset is a Stata dataset that contains the definition of the polygon or polygon

_ID is required and is a numeric variable that uniquely identifies the polygons making up t nodes of the base map polygons. _Y is required and is a numeric variable that contains the indicator variable taking value 1 if the corresponding polygon is completely enclosed in an

o Both simple and multipart polygons are allowed. In the example above, polygons 1 an consists of two distinct areas).

o The first record of each simple polygon or of each part of a multipart polygon must

o The non-missing coordinates of each simple polygon or of each part of a multipart p

o Each simple polygon or each part of a multipart polygon must be "closed", i.e., the

o A basemap dataset is always required to be sorted by variable _ID.
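
The structural rules listed above can be checked mechanically. The following Python sketch is illustrative only and is not part of spmap. It assumes the basemap dataset has been exported as a list of (_ID, _X, _Y) rows with None standing in for Stata missing values, that every polygon part starts with a record whose coordinates are missing, and that "closed" means the first and last non-missing coordinate pairs coincide. The names split_parts and check_basemap are hypothetical.

```python
from itertools import groupby


def split_parts(rows):
    """Split the (x, y) records of one polygon into its parts.

    Assumption: a record with missing (None) coordinates starts a new part.
    """
    parts = []
    for x, y in rows:
        if x is None or y is None:
            parts.append([])  # separator record: begin a new part
        elif not parts:
            raise ValueError("first record must have missing coordinates")
        else:
            parts[-1].append((x, y))
    return parts


def check_basemap(records):
    """Return a list of problems found in (_ID, _X, _Y) records."""
    problems = []
    ids = [r[0] for r in records]
    if ids != sorted(ids):
        problems.append("dataset is not sorted by _ID")
    by_id = sorted(records, key=lambda r: r[0])  # stable sort keeps node order
    for poly_id, group in groupby(by_id, key=lambda r: r[0]):
        try:
            parts = split_parts([(x, y) for _, x, y in group])
        except ValueError as err:
            problems.append(f"polygon {poly_id}: {err}")
            continue
        for n, part in enumerate(parts, start=1):
            # Assumption: closed means first node equals last node.
            if not part or part[0] != part[-1]:
                problems.append(f"polygon {poly_id}, part {n}: not closed")
    return problems


square = [(1, None, None), (1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)]
open_triangle = [(2, None, None), (2, 5, 5), (2, 6, 5), (2, 6, 6)]
print(check_basemap(square + open_triangle))
```

A multipart polygon would simply contain more than one missing-coordinate record under the same _ID, and each part is checked separately.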

A backgroundmap dataset is a Stata dataset that contains the definition of the polygon or p noncontiguous area cartogram. A backgroundmap dataset has exactly the same structure as a b

A polygon dataset is a Stata dataset that contains the definition of one or more supplement following structure:

Variables _ID, _X, and _Y are defined exactly in the same way as in a basemap dataset, with placeholder denoting an optional variable that can be specified to distinguish different ki

A line dataset is a Stata dataset that contains the definition of one or more polylines to structure:

_ID is required and is a numeric variable that uniquely identifies the polylines. _X is req polylines. _Y is required and is a numeric variable that contains the y-coordinate of the n can be specified to distinguish different kinds of polylines. The following should be notic

o The first record of each polyline must contain missing x- and y-coordinates.

o The non-missing coordinates of each polyline must be ordered so as to correspond to

A point dataset is a Stata dataset that contains the definition of one or more points to be structure:

xvar_pn is a placeholder denoting a required numeric variable that contains the x-coordinat the y-coordinate of the points. byvar_pn is a placeholder denoting an optional variable tha denoting an optional variable that, when specified, requests that the point markers be draw variable that, when specified, requests that the point markers be drawn as deviations from optional variable that, when specified, requests that the reference value of devvar_pn be c required and optional variables making up a point dataset can either reside in an external

A diagram dataset is a Stata dataset that contains the definition of one or more diagrams t to have the following structure:

xvar_dg is a placeholder denoting a required numeric variable that contains the x-coordinat variable that contains the y-coordinate of the diagram reference points. byvar_dg is a plac of diagrams. diagvar_dg is a placeholder denoting one or more variables to be represented b specified, requests that the diagrams be drawn with area proportional to propvar_dg. Finall requests that the reference value of the diagrams be computed weighting observations by var that the required and optional variables making up a diagram dataset can either reside in a

An arrow dataset is a Stata dataset that contains the definition of one or more arrows to b structure:

_ID is required and is a numeric variable that uniquely identifies the arrows. _X1 is requi arrows. _Y1 is required and is a numeric variable that contains the y-coordinate of the sta x-coordinate of the ending point of the arrows. _Y2 is required and is a numeric variable t placeholder denoting an optional variable that can be specified to distinguish different ki

A label dataset is a Stata dataset that contains the definition of one or more labels to be have the following structure:

xvar_lb is a placeholder denoting a required numeric variable that contains the x-coordinat variable that contains the y-coordinate of the label reference points. byvar_lb is a placeh labels. Finally, labvar_lb is a placeholder denoting the variable that contains the labels. can either reside in an external dataset or be part of the master dataset.

Color lists

Some spmap options and suboptions request the user to specify a list of one or more colors.

colorstyle. On the other hand, when the list includes two or more colors, the user can eith

The following table lists the predefined color schemes available in the current version of its type, and its source.

Following Brewer (1999), sequential schemes are typically used to represent ordered data, s used when there is a meaningful midpoint in the data, to emphasize progressive divergence f used to represent unordered, categorical data.

The color schemes whose source is indicated as "Brewer" were designed by Dr. Cynthia A. Bre Pennsylvania, USA (Brewer et al. 2003). These color schemes are used with Dr. Brewer's perm

Choropleth maps

A choropleth map can be defined as a map in which each subarea (e.g., each census tract) of the value taken on by a given quantitative variable in that subarea (Slocum et al. 2005). S distribution of quantitative variables, it is worth noting the way spmap can be used to dra

In spmap, a choropleth map is a base map whose constituent polygons are colored according t dataset and specified immediately after the main command (see syntax diagram above). To dra into k classes defined by a given set of class breaks, and then assigns a different color t

o Quantiles: class breaks correspond to quantiles of the distribution of variable att

o Boxplot: the distribution of variable attribute is divided into 6 classes defined a (p75, p75 + 1.5*iqr] and (p75 + 1.5*iqr, max], where iqr = interquartile range.

o Equal intervals: class breaks correspond to values that divide the distribution of

o Standard deviates: the distribution of variable attribute is divided into k classes Following the suggestions of Evans (1977), this proportion p varies with k as follo

Class intervals are centered on the arithmetic mean m, which is a class midpoint if (Evans 1977).

o k-means: the distribution of variable attribute is divided into k classes using k-m variable attribute, and the solution that maximizes the goodness-of-variance fit (A

o Custom: class breaks are specified by the user.

Alternatively, spmap allows the user to leave the values of variable attribute ungrouped. I assigned to each of its values.
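
As an illustration of the boxplot method described in the list above, the following Python sketch computes the five interior breaks that separate the six classes and assigns each value to a class. It is not spmap's implementation. The two upper classes follow the text; the lower classes are assumed to mirror them, that is [min, p25 - 1.5*iqr], (p25 - 1.5*iqr, p25], (p25, p50] and (p50, p75]. Percentile conventions differ between packages, so the quartiles computed here may not match Stata's exactly. The function names are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles


def boxplot_breaks(values):
    """Interior class breaks of a six-class boxplot classification."""
    p25, p50, p75 = quantiles(values, n=4, method="inclusive")
    iqr = p75 - p25  # interquartile range
    return [p25 - 1.5 * iqr, p25, p50, p75, p75 + 1.5 * iqr]


def classify(value, breaks):
    """Return the 1-based class of value, using upper-closed intervals (lo, hi]."""
    return bisect_left(breaks, value) + 1


data = [1, 2, 3, 4, 5, 6, 7, 8, 30]
breaks = boxplot_breaks(data)
print(breaks)
print([classify(v, breaks) for v in data])
```

In this example the value 30 lies above p75 + 1.5*iqr and is the only member of class 6, while class 1 is empty because no value falls below the lower fence.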

Options for drawing the base map

id(idvar) specifies the name of a numeric variable that uniquely identifies the polygon or values must correspond to the values taken on by variable _ID contained in the basemap

area(areavar) requests that the polygons making up the base map be drawn with area proporti cartogram (Olson 1976) is obtained. areavar must be contained in the master dataset.

split requests that, before drawing a noncontiguous area cartogram, all multipart base map distinct simple polygon.

map(backgroundmap) requests that, when drawing a noncontiguous area cartogram, the polygons backgroundmap.

mfcolor(colorstyle) specifies the fill color of the background map. The default is mfcolor(

mocolor(colorstyle) specifies the outline color of the background map. The default is mocol

mosize(linewidthstyle) specifies the outline thickness of the background map. The default i

mopattern(linepatternstyle) specifies the outline pattern of the background map. The defaul

clmethod(method) specifies the method to be used for classifying variable attribute and rep

clmethod(quantile) is the default and requests that the quantiles method be used.

clmethod(boxplot) requests that the boxplot method be used.

clmethod(eqint) requests that the equal intervals method be used.

clmethod(stdev) requests that the standard deviates method be used.

clmethod(kmeans) requests that the k-means method be used.

clmethod(custom) requests that class breaks be specified by the user with option clbrea

clmethod(unique) requests that each value of variable attribute be treated as a distinc

clnumber(#) specifies the number of classes k in which variable attribute is to be divided. chosen, the default is clnumber(4). When the boxplot classification method is chosen, t is inactive and k equals the number of elements of numlist specified in option clbreaks and k equals the number of different values taken on by variable attribute.

clbreaks(numlist) is required when option clmethod(custom) is specified. It defines the cus so that the first element is the minimum value of variable attribute to be considered; of variable attribute to be considered. For example, suppose we want to group the value (25,50]; for this we must specify clbreaks(10 15 20 25 50).
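
The clbreaks(10 15 20 25 50) example can be traced with a short Python sketch. It is illustrative only and assumes that the first class is closed at both ends, [10,15], that the remaining classes are upper-closed, (15,20], (20,25] and (25,50], and that values outside the first and last break are left unclassified. The function name classify_custom is hypothetical.

```python
from bisect import bisect_left


def classify_custom(value, clbreaks):
    """Return the 1-based class of value, or None if it is out of range."""
    lo, hi = clbreaks[0], clbreaks[-1]
    if value < lo or value > hi:
        return None  # outside the range covered by the custom breaks
    if value == lo:
        return 1  # assumption: the minimum belongs to the first class
    return bisect_left(clbreaks, value)


clbreaks = [10, 15, 20, 25, 50]
for v in (10, 15, 15.5, 25, 50, 9):
    print(v, classify_custom(v, clbreaks))
```

Five breaks define four classes, which is why k equals the number of elements of numlist minus one.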

eirange(min max) specifies the range of values (minimum and maximum) to be considered in th overrides the default range [min(attribute), max(attribute)].
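
The effect of eirange() on the equal intervals method can be sketched in Python as follows. This is an illustration, not spmap's code. It assumes that the k classes have equal width over the chosen range, which defaults to [min(attribute), max(attribute)]. The function name eqint_breaks is hypothetical.

```python
def eqint_breaks(values, k, eirange=None):
    """Return the k + 1 breaks of k equal-width classes."""
    if eirange is not None:
        lo, hi = eirange  # user-supplied range overrides the data range
    else:
        lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]


attribute = [2, 7, 11, 18]
print(eqint_breaks(attribute, 4))
print(eqint_breaks(attribute, 4, eirange=(0, 20)))
```

Fixing the range in this way keeps class limits identical across maps drawn from different subsets of the data.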

kmiter(#) specifies the number of times the clustering procedure is applied when option clm

ndfcolor(colorstyle) specifies the fill color of the empty (no data) polygons of the chorop

ndocolor(colorstyle) specifies the outline color of the empty (no data) polygons of the cho

ndsize(linewidthstyle) specifies the outline thickness of the empty (no data) polygons of t

ndlabel(string) specifies the legend label to be attached to the empty (no data) polygons o

fcolor(colorlist) specifies the list of fill colors of the base map polygons. When no choro choropleth map is drawn, the list should be either composed of k elements, or represent choropleth map is drawn, the default argument is a color scheme that depends on the cla

ocolor(colorlist) specifies the list of outline colors of the base map polygons. When no ch choropleth map is drawn, the list should be either composed of k elements, or represent specification is ocolor(black ...).

osize(linewidthstyle_list) specifies the list of outline thicknesses of the base map polygo hand, when a choropleth map is drawn, the list should be composed of k elements. The de

legenda(on|off) specifies whether the base map legend should be displayed or hidden.

legenda(on) requests that the base map legend be displayed. This is the default when a

legenda(off) requests that the base map legend be hidden. This is the default when no c

legtitle(string) specifies the title of the base map legend. When a choropleth map is drawn legend title.

leglabel(string) specifies the label to be attached to the single key of the base map legen specified and no choropleth map is drawn.

legorder(hilo|lohi) specifies the display order of the keys of the base map legend when a c

legorder(hilo) is the default and requests that the keys of the base map legend be disp

legorder(lohi) requests that the keys of the base map legend be displayed in ascending

legstyle(0|1|2|3) specifies the way the keys of the base map legend are labelled when a cho

legstyle(0) requests that the keys of the base map legend not be labelled.

legstyle(1) is the default and requests that the keys of the base map legend be labelle

legstyle(2) requests that the keys of the base map legend be labelled using the notatio of the class interval, and & denotes a string that separates the two values. For ex

legstyle(3) requests that only the first and last keys of the base map legend be labell last key is labelled with the upper limit of the corresponding class interval.

legjunction(string) specifies the string to be used as separator when option legstyle(2) is

legcount requests that, when a choropleth map is drawn, the number of base map polygons bel

Option polygon() suboptions

data(polygon) requests that one or more supplementary polygons defined in Stata dataset pol

select(command) requests that a given subset of records of dataset polygon be selected usin

by(byvar_pl) indicates that the supplementary polygons defined in dataset polygon belong to

fcolor(colorlist) specifies the list of fill colors of the supplementary polygons. When sub hand, when suboption by(byvar_pl) is specified, the list should be either composed of k is none, the default specification is fcolor(none ...).

ocolor(colorlist) specifies the list of outline colors of the supplementary polygons. When other hand, when suboption by(byvar_pl) is specified, the list should be either compose outline color is black, the default specification is ocolor(black ...).

osize(linewidthstyle_list) specifies the list of outline thicknesses of the supplementary p element. On the other hand, when suboption by(byvar_pl) is specified, the list should b is osize(thin ...).

legenda(on|off) specifies whether the supplementary-polygon legend should be displayed or h

legenda(on) requests that the supplementary-polygon legend be displayed.

legenda(off) is the default and requests that the supplementary-polygon legend be hidde

legtitle(string) specifies the title of the supplementary-polygon legend. When suboption by byvar_pl be used as the legend title.

leglabel(string) specifies the label to be attached to the single key of the supplementary- suboption legenda(on) is specified and suboption by(byvar_pl) is not specified.

legshow(numlist) requests that, when suboption by(byvar_pl) is specified, only the keys inc

legcount requests that the number of supplementary polygons be displayed in the legend.

Option line() suboptions

data(line) requests that one or more polylines defined in Stata dataset line be superimpose

select(command) requests that a given subset of records of dataset line be selected using S

by(byvar_ln) indicates that the polylines defined in dataset line belong to kln different g

color(colorlist) specifies the list of polyline colors. When suboption by(byvar_ln) is not by(byvar_ln) is specified, the list should be either composed of kln elements, or repre specification is color(black ...).

size(linewidthstyle_list) specifies the list of polyline thicknesses. When suboption by(byv suboption by(byvar_ln) is specified, the list should be composed of kln elements. The d

pattern(linepatternstyle_list) specifies the list of polyline patterns. When suboption by(b suboption by(byvar_ln) is specified, the list should be composed of kln elements. The d

legenda(on|off) specifies whether the polyline legend should be displayed or hidden.

legenda(on) requests that the polyline legend be displayed.

legenda(off) is the default and requests that the polyline legend be hidden.

legtitle(string) specifies the title of the polyline legend. When suboption by(byvar_ln) is as the legend title.

leglabel(string) specifies the label to be attached to the single key of the polyline legen legenda(on) is specified and suboption by(byvar_ln) is not specified.

legshow(numlist) requests that, when suboption by(byvar_ln) is specified, only the keys inc

legcount requests that the number of polylines be displayed in the legend.

Option point() suboptions

data(point) requests that one or more points defined in Stata dataset point be superimposed

select(command) requests that a given subset of records of dataset point be selected using

by(byvar_pn) indicates that the points defined in dataset point belong to kpn different gro

xcoord(xvar_pn) specifies the name of the variable containing the x-coordinate of each poin

ycoord(yvar_pn) specifies the name of the variable containing the y-coordinate of each poin

proportional(propvar_pn) requests that the point markers be drawn with size proportional to

prange(min max) requests that variable propvar_pn specified in suboption proportional(propva normalization based on range [0, max(propvar_pn)].
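
The role of the normalization range can be illustrated with a short Python sketch. It assumes a simple linear rescaling in which the lower bound maps to 0 and the upper bound to 1; how spmap converts the normalized values into marker sizes is not shown here and is not implied. The function name normalize is hypothetical.

```python
def normalize(values, prange=None):
    """Rescale values linearly so that lo maps to 0 and hi maps to 1."""
    if prange is not None:
        lo, hi = prange  # user-supplied range
    else:
        lo, hi = 0, max(values)  # default range [0, max]
    return [(v - lo) / (hi - lo) for v in values]


population = [10, 20, 40]
print(normalize(population))           # default range
print(normalize(population, (0, 80)))  # range shared with other maps
```

Supplying the same range to several maps keeps a given value at the same normalized size in each of them.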

psize(relative|absolute) specifies the reference system for drawing the point markers.

psize(relative) is the default and requests that the point markers be drawn using relat compare the map at hand with other maps of the same kind.

psize(absolute) requests that the point markers be drawn using absolute minimum and max other maps of the same kind.

deviation(devvar_pn) requests that the point markers be drawn as deviations from a referenc specified, in the first place the values of variable devvar_pn are re-expressed as devi represented by solid markers, whereas points associated with negative deviations are re proportional to the absolute value of the deviation. This suboption is incompatible wit

refval(mean|median|#) specifies the reference value of variable devvar_pn for computing dev

refval(mean) is the default and requests that the arithmetic mean of variable devvar_pn

refval(median) requests that the median of variable devvar_pn be taken as the reference

refval(#) requests that an arbitrary real value # be taken as the reference value.

refweight(weightvar_pn) requests that the reference value of variable devvar_pn be computed

dmax(#) requests that the point markers be drawn using value # as the maximum absolute devi
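
The deviation logic described by deviation(), refval(), and dmax() can be sketched in Python. This is an illustration under stated assumptions, not spmap's implementation: negative deviations are assumed to be drawn with hollow markers, a zero deviation is arbitrarily treated as solid, the relative size is taken as the absolute deviation divided by the maximum absolute deviation, and weighting via refweight() is left out. The function name deviation_markers is hypothetical.

```python
from statistics import mean, median


def deviation_markers(values, refval="mean", dmax=None):
    """Return (marker kind, relative size) for each value."""
    if refval == "mean":
        ref = mean(values)
    elif refval == "median":
        ref = median(values)
    else:
        ref = refval  # arbitrary numeric reference value
    devs = [v - ref for v in values]
    if dmax is None:
        # Assumption: default to the largest observed absolute deviation;
        # fall back to 1 to avoid dividing by zero when all deviations are 0.
        dmax = max(abs(d) for d in devs) or 1
    return [("solid" if d >= 0 else "hollow", abs(d) / dmax) for d in devs]


rates = [4, 6, 8, 14]
print(deviation_markers(rates))
print(deviation_markers(rates, refval="median", dmax=14))
```

Passing the same dmax to several maps puts their marker sizes on a common scale.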

size(markersizestyle_list) specifies the list of point marker sizes. When suboption by(byva suboption by(byvar_pn) is specified, the list should be composed of kpn elements. The d

shape(symbolstyle_list) specifies the list of point marker shapes. When suboption by(byvar_p suboption by(byvar_pn) is specified, the list should be composed of kpn elements. The d deviation(devvar_pn) is specified, this suboption accepts only solid symbolstyles writt

fcolor(colorlist) specifies the list of fill colors of the point markers. When suboption by when suboption by(byvar_pn) is specified, the list should be either composed of kpn ele black, the default specification is fcolor(black ...).

ocolor(colorlist) specifies the list of outline colors of the point markers. When suboption when suboption by(byvar_pn) is specified, the list should be either composed of kpn ele none, the default specification is ocolor(none ...).

osize(linewidthstyle_list) specifies the list of outline thicknesses of the point markers. the other hand, when suboption by(byvar_pn) is specified, the list should be composed o osize(thin ...).

legenda(on|off) specifies whether the point legend should be displayed or hidden.

legenda(on) requests that the point legend be displayed.

legenda(off) is the default and requests that the point legend be hidden.

legtitle(string) specifies the title of the point legend. When suboption by(byvar_pn) is sp the legend title.

leglabel(string) specifies the label to be attached to the single key of the point legend w legenda(on) is specified and suboption by(byvar_pn) is not specified.

legshow(numlist) requests that, when suboption by(byvar_pn) is specified, only the keys inc

legcount requests that the number of points be displayed in the legend.

Option diagram() suboptions

data(diagram) requests that one or more diagrams defined in Stata dataset diagram be superi

select(command) requests that a given subset of records of dataset diagram be selected usin

by(byvar_dg) indicates that the diagrams defined in dataset diagram belong to kdg different specified in suboption variables(diagvar_dg).

xcoord(xvar_dg) specifies the name of the variable containing the x-coordinate of each diag

ycoord(yvar_dg) specifies the name of the variable containing the y-coordinate of each diag

variables(diagvar_dg) specifies the list of variables to be represented by the diagrams.

type(frect|pie) specifies the type of diagram to be used.

type(frect) is the default when only one variable is specified in suboption variables(d 1994) be used.

type(pie) is the default (and the only possibility) when two or more variables are spec type(pie) is specified, the variables specified in suboption variables(diagvar_dg)

proportional(propvar_dg) requests that the diagrams be drawn with size proportional to the

prange(min max) requests that variable propvar_dg specified in suboption proportional(propva normalization based on range [0, max(propvar_dg)].

range(min max) requests that variable diagvar_dg specified in suboption variables(diagvar_dg normalization based on range [0, max(diagvar_dg)].

refval(mean|median|#) specifies the reference value of variable diagvar_dg for drawing the

refval(mean) is the default and requests that the arithmetic mean of variable diagvar_dg

refval(median) requests that the median of variable diagvar_dg be taken as the referenc

refval(#) requests that an arbitrary real value # be taken as the reference value.

refweight(weightvar_dg) requests that the reference value of variable diagvar_dg be compute

refcolor(colorstyle) specifies the color of the reference line. The default is refcolor(bla

refsize(linewidthstyle) specifies the thickness of the reference line. The default is refsi

size(#) specifies a multiplier that affects the size of the diagrams. For example, size(1.5 size(1).

fcolor(colorlist) specifies the list of fill colors of the diagrams. When just one variable specified, the list should include only one element. When just one variable is specifie should be either composed of kdg elements, or represented by the name of a predefined c the list should be either composed of J elements, or represented by the name of a prede fcolor(black ...), and the default specification when J>1 is fcolor(red blue orange gre

ocolor(colorlist) specifies the list of outline colors of the diagrams. When just one varia specified, the list should include only one element. When just one variable is specifie should be either composed of kdg elements, or represented by the name of a predefined c the list should be either composed of J elements, or represented by the name of a prede ocolor(black ...).

osize(linewidthstyle_list) specifies the list of outline thicknesses of the diagrams. When is not specified, the list should include only one element. When just one variable is s list should be composed of kdg elements. Finally, when J>1 variables are specified in s outline thickness is thin, the default specification is osize(thin ...).

legenda(on|off) specifies whether the diagram legend should be displayed or hidden.

legenda(on) requests that the diagram legend be displayed.

legenda(off) is the default and requests that the diagram legend be hidden.

legtitle(string) specifies the title of the diagram legend. When just one variable is speci of variable diagvar_dg be used as the legend title.

legshow(numlist) requests that only the keys included in numlist be displayed in the diagra

legcount requests that the number of diagrams be displayed in the legend.

Option arrow() suboptions

data(arrow) requests that one or more arrows defined in Stata dataset arrow be superimposed

select(command) requests that a given subset of records of dataset arrow be selected using

by(byvar_ar) indicates that the arrows defined in dataset arrow belong to kar different gro

direction(directionstyle_list) specifies the list of arrow directions, where directionstyle suboption by(byvar_ar) is not specified, the list should include only one element. On t elements. The default direction is 1, the default specification is direction(1 ...).

hsize(markersizestyle_list) specifies the list of arrowhead sizes. When suboption by(byvar_a suboption by(byvar_ar) is specified, the list should be composed of kar elements. The d

hangle(anglestyle_list) specifies the list of arrowhead angles. When suboption by(byvar_ar) suboption by(byvar_ar) is specified, the list should be composed of kar elements. The d

hbarbsize(markersizestyle_list) specifies the list of sizes of the filled portion of arrowh element. On the other hand, when suboption by(byvar_ar) is specified, the list should b hbarbsize(1.5 ...).

hfcolor(colorlist) specifies the list of arrowhead fill colors. When suboption by(byvar_ar) suboption by(byvar_ar) is specified, the list should be either composed of kar elements the default specification is hfcolor(black ...).

hocolor(colorlist) specifies the list of arrowhead outline colors. When suboption by(byvar_a suboption by(byvar_ar) is specified, the list should be either composed of kar elements black, the default specification is hocolor(black ...).

hosize(linewidthstyle_list) specifies the list of arrowhead outline thicknesses. When subop hand, when suboption by(byvar_ar) is specified, the list should be composed of kar elem

lcolor(colorlist) specifies the list of arrow shaft line colors. When suboption by(byvar_ar suboption by(byvar_ar) is specified, the list should be either composed of kar elements default specification is lcolor(black ...).

lsize(linewidthstyle_list) specifies the list of arrow shaft line thicknesses. When subopti hand, when suboption by(byvar_ar) is specified, the list should be composed of kar elem

lpattern(linepatternstyle_list) specifies the list of arrow shaft line patterns. When subop hand, when suboption by(byvar_ar) is specified, the list should be composed of kar elem

legenda(on|off) specifies whether the arrow legend should be displayed or hidden.

legenda(on) requests that the arrow legend be displayed.

legenda(off) is the default and requests that the arrow legend be hidden.

legtitle(string) specifies the title of the arrow legend. When suboption by(byvar_ar) is sp the legend title.

leglabel(string) specifies the label to be attached to the single key of the arrow legend w legenda(on) is specified and suboption by(byvar_ar) is not specified.

legshow(numlist) requests that, when suboption by(byvar_ar) is specified, only the keys inc

legcount requests that the number of arrows be displayed in the legend.

Option label() suboptions

data(label) requests that one or more labels defined in Stata dataset label be superimposed

select(command) requests that a given subset of records of dataset label be selected using

by(byvar_lb) indicates that the labels defined in dataset label belong to klb different gro

xcoord(xvar_lb) specifies the name of the variable containing the x-coordinate of each labe

ycoord(yvar_lb) specifies the name of the variable containing the y-coordinate of each labe

label(labvar_lb) specifies the name of the variable containing the labels.

length(lengthstyle_list) specifies the list of label lengths, where lengthstyle is any inte by(byvar_lb) is not specified, the list should include only one element. On the other h The default label length is 12, the default specification is length(12 ...).

size(textsizestyle_list) specifies the list of label sizes. When suboption by(byvar_lb) is by(byvar_lb) is specified, the list should be composed of klb elements. The default lab

color(colorlist) specifies the list of label colors. When suboption by(byvar_lb) is not spe by(byvar_lb) is specified, the list should be either composed of klb elements, or repre default specification is color(black ...).

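position(clockpos_list) specifies the list of label positions relative to their reference point. When suboption by(byvar_lb) is not specified, the list should include only one element. On the other hand, when suboption by(byvar_lb) is specified, the list should be composed of klb elements. The default label position is 0; the default specification is position(0 ...).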

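gap(relativesize_list) specifies the list of gaps between labels and their reference point. When suboption by(byvar_lb) is not specified, the list should include only one element. On the other hand, when suboption by(byvar_lb) is specified, the list should be composed of klb elements.

angle(anglestyle_list) specifies the list of label angles. When suboption by(byvar_lb) is not specified, the list should include only one element. On the other hand, when suboption by(byvar_lb) is specified, the list should be composed of klb elements. The default lab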
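The label() suboptions above can be combined as in the following minimal sketch. It assumes that the spmap ancillary datasets are in the current directory and that variable zone is stored in "Italy-RegionsData.dta"; the examples in this help file use zone only in an if condition, so treating it as a label grouping variable is an assumption. The lists simply repeat the documented default specifications, so they should not change the look of the labels; replace them with one element per group to style each group separately.

```stata
* Sketch only: label each region with relig1, grouping labels by zone
* (zone as a by() variable is an assumption, see the text above)
use "Italy-OutlineData.dta", clear
spmap using "Italy-OutlineCoordinates.dta", id(id) fcolor(white) ///
        label(data("Italy-RegionsData.dta") xcoord(xcoord) ycoord(ycoord) ///
        label(relig1) by(zone) ///
        length(12 ...) color(black ...) position(0 ...))
```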

Option scalebar() suboptions

units(#) specifies the length of the scale bar expressed in arbitrary units.

scale(#) specifies the ratio of scale bar units to map units. For example, suppose map coor then the ratio of scale bar units to map units will be 1; if, on the other hand, the sc units will be 1/1000. The default is scale(1).

xpos(#) specifies the distance of the scale bar from the center of the plot region on the h values request that the distance be computed from the center to the right, whereas nega is xpos(0).

ypos(#) specifies the distance of the scale bar from the center of the plot region on the v values request that the distance be computed from the center to the top, whereas negati is ypos(-110).

size(#) specifies a multiplier that affects the height of the scale bar. For example, size( size(1).

fcolor(colorstyle) specifies the fill color of the scale bar. The default is fcolor(black).

ocolor(colorstyle) specifies the outline color of the scale bar. The default is ocolor(black).

osize(linewidthstyle) specifies the outline thickness of the scale bar. The default is osiz

label(string) specifies the descriptive label of the scale bar. The default is label(Units).

tcolor(colorstyle) specifies the color of the scale bar text. The default is tcolor(black).

tsize(textsizestyle) specifies the size of the scale bar text. The default is tsize(*1).

Graph options

polyfirst requests that the supplementary polygons specified in option polygon() be drawn b

gsize(#) specifies the length (in inches) of the shortest side of the graph available area

space around the map). The default ranges from 1 to 4, depending on the aspect ratio of the standard xsize() and ysize() options.

freestyle requests that, when drawing the graph, all the formatting presets and restriction restricts the use of some others, so as to produce a "nice" graph automatically. By spe the graph formatting options.

twoway_options include all the options documented in [G] twoway_options, except for aspect_o added_text_options, axis_options, title_options, legend_option, and region_options, as possible to control also aspect_option and scheme_option.

Examples 1: Choropleth maps

NOTE: All the examples illustrated in the present and in the following sections can be run spmap ancillary datasets are located.
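As a minimal sketch of the setup this note refers to, the commands below install spmap from SSC and check that one of the ancillary datasets is available in the current directory. This assumes internet access and that the SSC package ships the Italy-*.dta files as ancillary files; if it does not, copy the datasets into the current directory by hand.

```stata
* Sketch only: install spmap and fetch its ancillary files
pwd                                    // show the current working directory
ssc install spmap, all replace         // "all" also downloads ancillary files here
confirm file "Italy-RegionsData.dta"   // stops with an error if the file is missing
```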

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id)

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(2) legend(region(lcolor(black)))

. use "Italy-RegionsData.dta", clear
. spmap relig1m using "Italy-RegionsCoordinates.dta", id(id) ///
        ndfcolor(red) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(2) legend(region(lcolor(black)))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clmethod(eqint) clnumber(5) eirange(20 70) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(2) legend(region(lcolor(black)))
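In the eqint example above, clnumber(5) and eirange(20 70) imply a class width of (70 - 20)/5 = 10, that is, class breaks at 20, 30, 40, 50, 60, and 70. The sketch below computes these breaks by hand and passes them to clmethod(custom) clbreaks(). It should yield the same classes, assuming that eqint divides the eirange() into equal-width classes and that all values of relig1 lie between 20 and 70. The local macro names are illustrative only.

```stata
* Sketch only: equal-interval class breaks computed by hand
local min = 20
local max = 70
local k   = 5
local width = (`max' - `min')/`k'      // class width
local breaks                           // start from an empty list
forvalues i = 0/`k' {
    local b = `min' + `i'*`width'
    local breaks `breaks' `b'
}
display "`breaks'"

use "Italy-RegionsData.dta", clear
spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clmethod(custom) clbreaks(`breaks')
```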

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Reds2) ocolor(none ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3)

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Reds2) ocolor(none ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Reds2) ocolor(none ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3)) ///
        plotregion(margin(vlarge))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Reds2) ocolor(none ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3)) ///
        plotregion(icolor(stone)) graphregion(icolor(stone))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Greens2) ocolor(white ..) osize(medthin ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3)) ///
        plotregion(icolor(stone)) graphregion(icolor(stone))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Greens2) ocolor(white ..) osize(thin ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3)) ///
        plotregion(icolor(stone)) graphregion(icolor(stone)) ///
        polygon(data("Italy-Highlights.dta") ocolor(white) ///
        osize(medthick))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clnumber(20) fcolor(Greens2) ocolor(white ..) osize(medthin ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        legstyle(3) legend(ring(1) position(3)) ///
        plotregion(icolor(stone)) graphregion(icolor(stone)) ///
        scalebar(units(500) scale(1/1000) xpos(-100) label(Kilometers))
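The scalebar() specification in the last example can be checked with a little arithmetic. Assuming that the map coordinates are expressed in meters (an assumption suggested by label(Kilometers), not stated in the example), scale(1/1000) converts map units to kilometers, so a bar of units(500) should span 500/(1/1000) = 500,000 map units. The sketch below reproduces that calculation; the local macro names are illustrative only.

```stata
* Sketch only: length of the scale bar expressed in map units
* Assumption: map coordinates are in meters
local units = 500       // units(500): bar length in scale bar units (kilometers)
local scale = 1/1000    // scale(1/1000): scale bar units per map unit
display `units'/`scale'
```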

Examples 2: Proportional symbol maps

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig1) fcolor(red) size(*1.5))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig1) fcolor(red) size(*1.5) ///
        shape(s))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig1) fcolor(red) ///
        ocolor(white) size(*3)) ///
        label(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) label(relig1) color(white) size(*0.7))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) deviation(relig1) fcolor(red) dmax(30) ///
        legenda(on) leglabel(Deviation from the mean))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) fcolor(white) ///
        title("Catholics without reservations", size(*0.9) box bexpand ///
        span margin(medsmall) fcolor(sand)) subtitle(" ") ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig1) prange(0 70) ///
        psize(absolute) fcolor(red) ocolor(white) size(*0.6)) ///
        plotregion(margin(medium) color(stone)) ///
        graphregion(fcolor(stone) lcolor(black)) ///
        name(g1, replace) nodraw
. spmap using "Italy-OutlineCoordinates.dta", id(id) fcolor(white) ///
        title("Catholics with reservations", size(*0.9) box bexpand ///
        span margin(medsmall) fcolor(sand)) subtitle(" ") ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig2) prange(0 70) ///
        psize(absolute) fcolor(green) ocolor(white) size(*0.6)) ///
        plotregion(margin(medium) color(stone)) ///
        graphregion(fcolor(stone) lcolor(black)) ///
        name(g2, replace) nodraw
. spmap using "Italy-OutlineCoordinates.dta", id(id) fcolor(white) ///
        title("Other", size(*0.9) box bexpand ///
        span margin(medsmall) fcolor(sand)) subtitle(" ") ///
        point(data("Italy-RegionsData.dta") xcoord(xcoord) ///
        ycoord(ycoord) proportional(relig3) prange(0 70) ///
        psize(absolute) fcolor(blue) ocolor(white) size(*0.6)) ///
        plotregion(margin(medium) color(stone)) ///
        graphregion(fcolor(stone) lcolor(black)) ///
        name(g3, replace) nodraw
. graph combine g1 g2 g3, rows(1) title("Religious orientation") ///
        subtitle("Italy, 1994-98" " ") xsize(5) ysize(2.6) ///
        plotregion(margin(medsmall) style(none)) ///
        graphregion(margin(zero) style(none)) ///
        scheme(s1mono)

Examples 3: Other maps

. use "Italy-RegionsData.dta", clear
. spmap using "Italy-RegionsCoordinates.dta", id(id) fcolor(stone) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) ///
        diagram(variable(relig1) range(0 100) refweight(pop98) ///
        xcoord(xcoord) ycoord(ycoord) fcolor(red))

. use "Italy-RegionsData.dta", clear
. spmap using "Italy-RegionsCoordinates.dta", id(id) fcolor(stone) ///
        diagram(variable(relig1 relig2 relig3) proportional(fortell) ///
        xcoord(xcoord) ycoord(ycoord) legenda(on)) ///
        legend(title("Religious orientation", size(*0.5) bexpand ///
        justification(left))) ///
        note(" " ///
        "NOTE: Chart size proportional to number of fortune tellers per million population", ///
        size(*0.75))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clmethod(stdev) clnumber(5) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) area(pop98) ///
        note(" " ///
        "NOTE: Region size proportional to population", size(*0.75))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta", id(id) ///
        clmethod(stdev) clnumber(5) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Italy, 1994-98" " ", size(*0.8)) area(pop98) ///
        map("Italy-OutlineCoordinates.dta") mfcolor(stone) ///
        note(" " ///
        "NOTE: Region size proportional to population", size(*0.75))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) fc(bluishgray) ///
        ocolor(none) ///
        title("Provincial capitals" " ", size(*0.9) color(white)) ///
        point(data("Italy-Capitals.dta") xcoord(xcoord) ///
        ycoord(ycoord) fcolor(emerald)) ///
        plotregion(margin(medium) icolor(dknavy) color(dknavy)) ///
        graphregion(icolor(dknavy) color(dknavy))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) fc(bluishgray) ///
        ocolor(none) ///
        title("Provincial capitals" " ", size(*0.9) color(white)) ///
        point(data("Italy-Capitals.dta") xcoord(xcoord) ///
        ycoord(ycoord) by(size) fcolor(orange red maroon) shape(s ..) ///
        legenda(on)) ///
        legend(title("Population 1998", size(*0.5) bexpand ///
        justification(left)) region(lcolor(black) fcolor(white)) ///
        position(2)) ///
        plotregion(margin(medium) icolor(dknavy) color(dknavy)) ///
        graphregion(icolor(dknavy) color(dknavy))

. use "Italy-OutlineData.dta", clear
. spmap using "Italy-OutlineCoordinates.dta", id(id) fc(sand) ///
        title("Main lakes and rivers" " ", size(*0.9)) ///
        polygon(data("Italy-Lakes.dta") fcolor(blue) ocolor(blue)) ///
        line(data("Italy-Rivers.dta") color(blue))

. use "Italy-RegionsData.dta", clear
. spmap relig1 using "Italy-RegionsCoordinates.dta" if zone==1, ///
        id(id) fcolor(Blues2) ocolor(white ..) osize(medthin ..) ///
        title("Pct. Catholics without reservations", size(*0.8)) ///
        subtitle("Northern Italy, 1994-98" " ", size(*0.8)) ///
        polygon(data("Italy-OutlineCoordinates.dta") fcolor(gs12) ///
        ocolor(white) osize(medthin)) polyfirst
