




                       U.S. Bureau of the Census Monograph       D R A F T
                              Economic Census Staff
                             Washington, D.C.  20233

 USING EXTRACT WITH STF 1A  (EXTutor 4)                            July 16, 1992

 ======================================================================
 CONTACT:        Paul Zeisset or Bob Marske                       (301) 763-1792


 ABSTRACT:       The following was designed as a self-directed tutorial in
                 EXTRACT usage.  It may also be used as a training presentation.

                 The tutorial draws all of its examples from STF 1A, but once
                 the skills covered are learned, the user should have immediate
                 facility with any of the other major STFs, including STF 1C and
                 STF 3A.

                 This tutorial is in five parts:

                 4A--Basic skills:  simple item and record selection, display
                 options, printing results

                 4B--Increasing your productivity in item and record selection,
                 creating "user-defined items", extracting results to a comma-
                 delimited file, and importing that file into Lotus 1-2-3.

                 4C--Adding labels, more about record selection and user-
                 defined items, extracting data to a DBF file and reusing it in
                 EXTRACT.

                 4D--Advanced topics:  setting up an EXTRACT menu, selecting all
                 components of a county subdivision, tract, or Indian
                 reservation.

                 4E--Advanced topics:  linking or merging files beyond what is
                 possible by adding labels.

                 Some of the material duplicates skills taught in EXTutor1, 2,
                 and 3.

 TIME REQUIRED:  about 30 minutes per part.

 =====================================================================

 Why Use EXTRACT?  

 Each 1990 Census CD-ROM comes with a basic data display program run with the
 command "GO".  This software does a nice job of displaying any STF table, one
 area at a time.  There is also a feature on presenting a "general profile". 
 For STF 1, the general profile includes 89 key items selected or derived from
 the 982 population and housing data fields carried in the 10 STF 1A databases. 


                                                  1

 Any table, or the general profile, may be printed to your printer or to a file,
 one area at a time.

 Because there is a separate GO program for each of the STFs, the software
 insulates the user from much of the complexity of the STF files--data stored in
 multiple files, with cryptic variable names, and with complex geographic
 hierarchies.  As such, GO is highly appropriate for library patrons or other
 new users.

 EXTRACT, on the other hand, requires you to deal with more of the complexity of
 the data set, but gives you much more power and flexibility in working with the
 data, including the following capabilities:

     -   To deal effectively with classes of geographic areas, such as all
         places within a county or all tracts with 1 or more person of Hispanic
         origin.
     -   To work with data items from more than one table at a time.
     -   To view narrative concept definitions so you understand what you are
         working with.
     -   To create "user-defined items", such as population per square mile,
         percent white, or a count of persons under 15.
     -   To merge these data with data from other sources, such as the economic
         or agriculture censuses.
     -   To copy to a file whatever subset of data you select, so that you can
         import the data into other applications, such as Lotus 1-2-3(tm) or
         Harvard Graphics(tm).
     -   To work these data using a tool you have learned how to use with other
         Census Bureau data sets.

 Before you begin

 This tutorial assumes that you have installed EXTRACT 1.3 or 1.4, that you have
 created a separate subdirectory for temporary files, e.g. \EXTRACT\WORK, and
 that you have made the directory containing EXTRACT the default.  

     An optional part of tutorial 4B assumes that you have Lotus 1-2-3
 available.

     Instructions for keyboard entry are italicized in printed versions of this
 tutorial, or are shown enclosed in {braces} when distributed as an ASCII text
 file.  User keyboard entries are shown in all capital letters, and special
 characters are enclosed in <>, for example, <esc> for the escape key.















                                                  2

 Tutorial 4A--

 STARTING THE PROGRAM

 1.  {Type:} EXTRACT <enter>, {then press} <enter> {again after viewing the
     opening screen.}

 2.  DRIVE SELECTION

     The system prompts you for the drive designation for your CD-ROM, and for a
     location where it can store temporary files.

     {Enter the appropriate drive letter for your CD-ROM, e.g., type:} L:
     <enter>, {or whatever is appropriate for your system.}

     EXTRACT then asks for two locations on your hard disk--a directory for
     workspace and a directory for auxiliary files.  

     {Type:} C:\EXTRACT\WORK <enter> {for work space--assuming c:\extract\work
     already exists.}

     {Type:} C:\EXTRACT\1990AUX <enter> {for auxiliary files.}

     We have the option to save these parameters for use next time.  {Type:} S.

     {[If at this point the program presents you with a choice of catalog files,
     leave the cursor on MASTER or MSTRST1A and press} <enter>.  {Multiple
     master catalogs are discussed later under tutorial 4D.]}

     The system asks you to specify a name for these parameters.  When we work
     with CD-ROMs other than STF 1A we will use different parameters, and save
     them under different names.

     {Type:} EXSTF1A <enter>.























                                                  3

 3.  SELECT A CATALOG

     The system then presents you with a menu of all of the types of files--or
     "catalogs"--you have to choose from on this disc.

     There are ten types of files within STF 1A, each covering a different group
     of tables, or, in the case of age data, parts of tables.  You may select
     only one as a starting point.  [In tutorial 4C and 4D you will learn how to
     link data from two or three catalogs into a single display.]

     File    Matrixes        Subjects covered 
  
     STF1A0  P1 - P10        Geographic Identifiers, Sex, Race, Hispanic Origin 
     STF1A1  P11 - P12(p2)   Age, Total and White 
     STF1A2  P12(p3-p5)      Age, Black & American Indian, Eskimo, Aleut (male) 
     STF1A3  P12(p6-p8)      Age, American Indian .(female) and Asian or Pacific
     STF1A4  P12(p9)-P13(p1) Age, Other Race and Hispanic Origin (male) 
     STF1A5  P13(p2) - P19   Age, Hispanic Origin(female), Household Rel. & Type
     STF1A6  P20 - P35       Household Relationship & Type (cont.), Imputations 
     STF1A7  P36, H1 - H20   Housing: Vacancy, Tenure, Age of Hhr, Rooms,Persons
     STF1A8  H21 - H40       Persons Per Room, Value, Rent, Duration of Vacancy 
     STF1A9  H41 - H55       Units in Structure, Housing Imputations

     {With the cursor (highlighted bar) on STF1A0__, press} <enter>.

 4.  HELP SCREEN

     The help screen associated with this catalog of files appears.  It has two
     parts:

     a.  A general description of the file, including some notes on the sequence
         of the data.

     b.  A brief description of how three of the options on the main menu that
         follows apply to this particular file.

     You may pull up this help screen at any time from the main menu by pressing
     <H>elp or <F1>.

     {Press} <enter> {to continue.}

 5.  SELECT A FILE

     Now the system presents a list of all of the files within the STF1A0
     catalog--which correspond to all of the states on this disc.  {Move the
     highlighted bar (press the down arrow) down to the state of your choice,
     and press} <enter>.

     At this point, the program takes a few moments to read in the file's data
     dictionary and get itself set up, and (after about 25 seconds) displays the
     main menu.






                                                  4

 5.  MAIN MENU

     The main menu is the control center from which you will operate the rest of
     the program.

     At the lower right is the name of the file you are working with.

     The main menu lists a series of options.  Most sessions involve selecting
     items and records (options 1 and 2), but let's jump ahead to Display the
     file to the screen, as is.

     {Type:} 6 <enter>

 6.  DISPLAY TO SCREEN

     You are now looking at a large "spreadsheet".  You can scroll down and to
     the right to see more data.

     Features:

         At the top are cryptic variable names.  In STF 1 files most of these
         variable names are not really self-explanatory.

         At the bottom is a more complete description of the one item that the
         cursor is highlighting.  

         You can scroll the cursor from side to side to see different
         descriptions.  {Press right arrow key repeatedly.}

     This is the columnar display, where every line represents a separate
     geographic area.  

 7.  TOGGLE TO ROWWISE DISPLAY
  
     {Press:} T {to <T>oggle to "rowwise" display.}

     There is another display mode that shows us only one area at a time.  Until
     we become more familiar with the contents of the file, this is a friendlier
     way to browse through the data.  Data values are shown at the left, and a
     complete description of the item is shown at the right.  The descriptive
     text comes from a separate file called the "data dictionary".  

     {Press} <enter> {twice.}  There are so many geographic codes in this file,
     that it is not until the third screen that we see the name for the
     geographic area, and then the beginning of the population counts we were
     expecting to see.  {Press} <enter> {to see more data values, then again,
     and again to see the last of the record.  Pressing} <enter> {again brings
     you to the beginning of the next geographic area.}

     Even with the rowwise display, this is more information than you want to
     deal with all at once, so let's create a more selective display.

     {Press:} <esc> {to return to the main menu.}




                                                  5

 8.  SELECT ITEMS

     {Type:} 1 <enter>

     In order to select items, the program lists variable names, the ones we saw
     as column headings in the first "columnar" display screen, along with the
     more complete descriptors we saw in the second "rowwise" display.  

     We can select the variables we want -- by marking each with an X.

     For reasons that will be evident later, SUMLEV is very important.

     {Mark an} X {by SUMLEV and GEOCOMP (third and fourth from the top), then
     press} <PgDn> {four times and place an} X {by ANPSADPI,} <PgDn> {again to
     mark P0010001, again to mark P0060002, and three more times to mark
     P0080001.}

     {Press} <esc> {to return to the main menu.}

 9.  DISPLAY TO SCREEN

     {Type:}  6 <enter>

     Now we have all of the data items we want on one screen, and, in fact, we
     find that the program can now fit two full areas on a single screen.  Let's
     say that we are interested in comparing these statistics across a number of
     areas at one time, so for that purpose the original columnar display is
     more efficient.

     {Press:}  T {to <T>oggle back to columnar display.}

     Initially, we see only the columns for SUMLEV, GEOCOMP, and AREALAND.  This
     is because the area name field, ANPSADPI, is so wide (66 characters) that
     little else can share the screen width it.  {Move the cursor right to see
     the area name and other items.}

 10. ADJUSTING COLUMN WIDTH 
  
     In columnar mode, we can adjust the display width of each column.

     {Move the cursor to the column containing ANPSADPI.}

     {Type:} W {for <W>idth}

     {Enter} 15 {as the new width.  Press} <enter>

     Just because you have narrowed a text column doesn't mean you can't look at
     the hidden text if you need to.  

     {Move the cursor to highlight an area name that appears truncated, and
     type} S {for <S>how.}  A box appears that displays the entire text,
     temporarily covering over the text of adjacent lines.  {To return to normal
     display, move the cursor off of this cell, in any direction.}




                                                  6

 11. DISPLAYING DEFINITIONS 

     {[If you are using a version of EXTRACT dated prior to October 1991, skip
     this section--STF 1 definitions are not accessible.]}

     We can see that the first several records are summaries for the state, with
     different populations, and different values for the code GEOCOMP--00 then
     codes in the 40's.  Here is a case where even the description at the bottom
     of the screen doesn't tell us everything we need to know.  Fortunately,
     most concepts in the STF 1 files have full definitions available, which we
     can refer to by simply pressing D.

     {Press} D.

     This definition is in the form of a code list.  By paging down we find
     explanations for categories 40 to 44.  These summaries are associated with
     various types of lands of American Indians and Alaska Natives.

     {Press} <esc> {to return to data display screen.}

     By the same token, we can look at definitions of subject concepts.

     {Cursor to the right to highlight the P0080001 column, and press} D.

     This definition is the same as is presented in the back of printed reports,
     or in the CD-ROM technical documentation.  EXTRACT encourages you to
     confirm that you know what the census concepts mean by making it easy to
     consult the definitions while you are looking at the data.  [The
     <D>efinition function also works from within the Select Items screen and
     from the main menu.]

     {Press} <esc> {to return to data display screen.}

 12. Before leaving this screen, note the geographic sequence of the file. 
     After the state component records comes the first county, followed by
     summaries for its first county subdivision, then the first place within
     that, the first tract or block numbering area within that place, and
     finally a record for each block group fitting within that hierarchy.  If
     one of these latter areas crosses the boundaries of an area above it, the
     summary given is only for that part within the given hierarchy, and (pt.)
     is added to each area name.

     Since the names associated with each area do not always make clear what
     geographic level you are looking at, you can keep track of where you are
     with the SUMLEV variable.  If you need definitions of SUMLEV codes, moving
     the cursor to the SUMLEV column and typing D will give you SUMLEV's
     <D>efinition.

     Let's see if we can display data for counties statewide but not for any of
     the smaller areas. 

     {Press} <esc> {to return to the main menu.}





                                                  7

 13. At the main menu, nothing comes right out and tells us how to select data
     for all counties.  If we are not sure how to proceed, we can use the <H>elp
     function.

     {Press the <F1> key or type} H <enter> {for <H>elp.}

     This is the same screen we saw after selecting a catalog.  The bottom of
     the <H>elp screen features comments about how three EXTRACT options relate
     to this particular type of file.  It tells us that while Select Items lets
     us specify particular data variables, Select Records allows us to specify
     particular geographic areas.  [It also mentions adding labels, which we
     will refer to in tutorial 4C.]

     In terms of the data display we used a moment ago, selecting items is a
     matter of selecting columns, and selecting records is a matter of selecting
     rows.  The main menu does not use columns and rows terminology, though,
     because we have already seen that there is a display mode that turns
     columns into rows.

     Since we want to select on a type of geographic area, we want to Select
     Records.

     {Press} <esc> {to return to the main menu.}

 14. SELECT RECORDS

     {Type:} 2 <enter>

     The first record selection screen looks a lot like the item selection
     screen we used before.  That is because we can use any variable in the file
     to govern the record selection process.  Some selection variables work
     better than others, though, and we most often try to use variables marked
     with an asterisk (*).

     We want to select records for all counties.  If you <PgDn> once, you see
     that there is a county code (CNTY), but that is not what we want because
     that code merely helps us pick one or more particular counties.  To get
     counties in general, we need to use the SUMLEV variable.  

     {Press} <PgUp>, {cursor to SUMLEV, and type:}  S.  Note that we use S to
     select variables at this screen rather than the X called for when we were
     selecting items and could mark lots of variables.

     {Press} <esc> {to continue.}

     Now a menu appears listing all possible values of SUMLEV.  The second entry
     is 050--State-County, and those are the summaries we want.  [If code 050 is
     not visible, cursor up to it.]

     {Cursor to 050, and type:}  X.  {Press} <esc> {to continue.}

     The system now spends a little time searching for the first qualifying
     record, then returns you to the main menu.




                                                  8

 15. {Type} 6 <enter> {to display to screen.}

     Because the system has to look for just certain classes of records, it
     requires more time to fill the screen.  But in a few moments, we have the
     statistics we selected for all counties in the state.

     A new subheading on the second line of the screen advises us that these
     data are sorted by SUMLEV+PLACEFP, and that we can press N for natural
     order.   Since we have the display we want, there is no point in
     experimenting with natural order.  So we can ignore that message for the
     moment.

     {After examining the data, press} <esc> {to return to the main menu.}

 16. Let's try out other ways of selecting records.  Suppose we want to display
     all census tracts within one county.

     {Type:}  2 <enter> {to select records.}

     {Mark an} S {next to SUMLEV and another} S {next to CNTY, and press} <esc>.

     {Mark an} X {by SUMLEV code 140, and press} <esc>

     {Mark an} X {by a particular county, and press} <esc>.
  
     {Type:} 6 <enter> {to display to screen.}

     {After examining the data, press} <esc> {to return to the main menu.}

 17. Let's select all block groups within [a particular place].  

     {Type:}  2 <enter> {to select records.}

     {Put an S next to SUMLEV,} <PgDn> {twice and cursor to PLACEFP, and type
     another} S.

     Note that there are two place codes to choose from.  We select the FIPS
     place code PLACEFP because it has an asterisk.  The index that will help
     the system find [place] fast uses the FIPS code and not the Census code.

     {Press} <esc>.

     You notice that there are two summary levels featuring block groups, 091
     and 150.  What is the difference?  SUMLEV 150 observes the simplest
     hierarchy--State, county, tract/BNA, and block group.  SUMLEV 091 inserts
     county subdivisions and places in the hierarchy, which is relevant at the
     moment because we want to select block groups within a place.  

     {Mark an} X {by SUMLEV code 091, and press} <esc>.

     {Mark an} X {by [a particular place], and press} <esc>.

     {Type:}  6 <enter> {to display to screen.}




                                                  9

     It is obvious that the area name alone is not enough to uniquely identify
     each item in this query.  You also need to display tract codes to identify
     block groups.  
  
     {Press} <esc> {to return to the main menu.}

 18. {Type:}  1 <enter> {to select items.}

     To add another variable to our display, it is simply a matter of adding
     another X.

     <PgDn> {once and mark TRACTBNA with} X.

     {Press} <esc> {to return to the main menu.}

     {Type:} 6 <enter> {to display to screen.}

     {Press} <esc> {to return to the main menu.}

 19. FORMAT OPTIONS
 
     Our next step is to prepare these data for printing.  The program gives us
     a few format options.

     {Type:} 5 <enter>

     One of the options is to specify your own heading.  The program defaults
     start us off with a heading taken from the data dictionary, but by the time
     we have gone through item and record selection, a much more specific
     heading may be appropriate.

     {Type:} 3 {(heading) then enter an appropriate heading, for example,}  
     Blacks and Hispanics by Block Group in [place]...10/1/91 PTZ  <enter>

     It is frequently handy to add today's date or your initials to the heading,
     particularly if you are going to save your output for future use.

     {Press:} <esc> {to return to the main menu.}


 SENDING OUTPUT TO A PRINTER 

     {[Optional; skip if you are not connected to a printer.]}
  
     You may want to print the results of your work.  EXTRACT's defaults will
     print out an 80-character display to any printer.  You can print out in
     either columnar or rowwise format, depending on which way you last toggled
     the display. 

 1.  {Type:} 7 {to print.}
  
     If you have specified too many items to fit on one line in a columnar
     display, the program will "wrap" the overflow items onto the next line(s). 
     If that is not what you want, press <esc> to stop the printout, and go to
     the Select Items menu to reduce the number of items selected or to change
     the width of certain columns.

                                                 10


 2.  If your printer can accommodate more columns, you can tell EXTRACT how wide
     to make the page through the Format Options.

     {Type:}  5 <enter>

     {Type:}  5 <enter>

     {Type the number of characters per line (e.g., 136) and press} <enter>

     You also can use format options to specify a different left-hand margin or
     page length. 

     EXTRACT cannot send codes to your printer for compressed print or to change
     fonts or styles.  You will need to set that up before you start EXTRACT, or
     use hardware switches if your printer has them. 
  
     {Press} <esc> {to return to the main menu.}

 3.  {Type:} 7 <enter> {to print your output with the new options.}

     When your printout is finished, you will notice that at the bottom is a
     list of all data items along with their complete descriptions. 

 4.  If is also possible to create a printout in rowwise format (set by
     <T>oggling in the data display screen).  The result is a profile format
     especially useful for displaying data for one area at a time.

 5.  {When the system returns you to the main menu, you type:} Q {to <Q>uit.}




























                                                 11

 Tutorial 4B

     Having saved our opening parameters into the file EXSTF1A.bat, we can skip
     past the opening drive selection screen.

     {Type:}  EXSTF1A <enter> <enter>

     Before selecting a catalog, notice the prompt at the bottom of the screen
     that you may <R>estrict the entire session to a single State.  This option
     is of little value for many Census CD-ROMs, but is useful on STF 1A.

     {Press} R {to <R>estrict the session and enter the State abbreviation of
     your choice, e.g.,} VA <enter>.

     {Press} <enter> {to select the STF1A0__ catalog.}


 SELECT ITEMS--SPECIAL FEATURES

 1.  {Type} 1 <enter>

     {Put an} X {next to SUMLEV.} 

     In our previous item selections, we used <PgUp> and <PgDn> to move through
     the long list of data items.  That may be fine for getting to know the
     file, but there are more efficient ways of getting through the list if we
     know what we are looking for.  One way we can find out about these
     shortcuts is to press the help key <F1> or <H>.

     {Press} <F1>.

 2.  HELP SCREEN

     EXTRACT's help system is context sensitive, that is, it gives you help
     specific to where you are in the program.

     This screen gives us three suggestions for how to move quickly around the
     list:

     J   will let you <J>ump to a specific item name, if you already know it,
         and the printout from exercise 4A gives you a key to some of the item
         names.

     L   is for <L>ocate, and it finds a particular character string at the
         beginning of the item description.  

     W   is for <W>ord search, which looks anywhere in the description for the
         character string you enter.

     {Press} <esc> {to return to the select-items menu.}

 3.  Last time we skipped through a number of screens to find the area name
     field ANPSADPI.  This time let's <J>ump to it directly.  {Press} J, {and
     type:}  ANPSADPI <enter> {in the box at the bottom of the screen.  Enter}
     X, {then cursor down to P0010001 and put an} X {next to it.}


                                                 12

     We also want to include the count of black persons.  Let's say we don't
     remember its field name, but we can try to <L>ocate it based on its text
     description.  {Press} L {and type:}  black <enter>, {then put an} X {next
     to it.}

     Note that <L>ocate was not sensitive to upper and lower case.  

     The difference between <J>ump and <L>ocate is that <J>ump lets you search
     in the field name column, and <L>ocate lets you search the description
     column.  Both look for a match only at the beginning of the appropriate
     column.  In contrast, the <W>ord search option looks anywhere in the
     description field.

     Let's say that we are looking for the variable identifying land area, but
     are not sure how it is worded.  {Press} W {and type:}  area <enter>.  This
     is obviously the end of the list, but we can <PgUp> to see a number of
     descriptions with the word area in them.  {Cursor to AREALAND, and put an}
     X {next to it.}  

     Notice that area is presented in thousandths of a square kilometer.  Let's
     see if the definition can help us figure out how to convert those to square
     miles.  {Press} D.  Note the last sentence of the first paragraph, telling
     us to divide the figures presented by 2590 to get square miles.  {Press}
     <esc>.

     At the moment the screen still is filtered to only show items with the
     letters a-r-e-a in the description.  To return to the full list, do a blank
     <W>ord search.  {Press} W {and press} <enter> {without entering any search
     string.}  

 4.  USER-DEFINED ITEMS 
  
     One of the really interesting ways EXTRACT augments your displays is by
     computing "User-Defined Items".  This in effect gives you some limited
     computational capabilities. 

     {Type:}  U {for <U>ser-defined item.}
  
     You can create two types of user-defined items--Ratios and Free-form
     expressions. 
  
     Ratios, such as percents, are used most often.  You will need to enter
     exact variable names as they appear in the database for the numerator and
     denominator.  If you do not know the exact names, you may return to the
     regular Select Items mode {(Press <esc> twice) and cursor down until you
     see the item you want, then Type:  U {again to return to the <U>ser-defined
     items mode, and type} 1 {to select the first item}.
  
     To compute population per square mile-- 

         {Type:}  P0010001 <enter> {for the numerator;}

         {Type:}  AREALAND <enter> {for the denominator;}




                                                 13

         {Type:}  2590 <enter> {for the scaling factor.}  The definition for
         land area told us that a thousandth of a square kilometer is 1/2590 of
         a square mile, so the inverse of that compensates for having the
         denominator in thousandths of a square mile.

         {Press} <enter> {to get past the Free-form Expression field.}

     Be sure to leave the "freeform expression" blank if you are computing a
     ratio, and leave the numerator and denominator fields blank if you are
     entering a freeform expression.

     Once past the basic specification, the system asks how to present the new
     item.

         {Type:}  7 <enter> {for the field length;}

         {Type:}  1 <enter> {for the number of decimals;}

         {Type:}  PopSqMi <enter> {for the field name; and} 

         {Type:}  Population per square mile {for the title.}

     In this screen, the <enter> key moved you from one field to the next.  If
     you need to back up, use the <up arrow> key.  To skip ahead, use the <down
     arrow> key.  If <up arrow> doesn't let you get to the fields you want, for
     example, if you need to respecify the numerator, try <PgUp>.

     {Press} <enter> {after completing the title.}
  
 5.  EXTRACT will let you enter up to 10 user-defined items, either now or
     later.  Let's say we want to percent Black in the population as well. 
     {Press} 2 {to define a second item.} 

         {Type:}  P0060002 <enter> {for the numerator;}

         {Type:}  P0010001 <enter> {for the denominator;}

         {Type:}  100 <enter> {for the scaling factor, since you normally
         multiply a proportion by 100 to get a percent;}

         {Press} <enter> {to get past the Free-form Expression field;}

         {Type:}  5 <enter> {for the field length;}

         {Type:}  1 <enter> {for the number of decimals;}

         {Type:}  BLACK <enter> {for the field name; and}

         {Type:}  Percent Black of total population {for the title.}

     {Press} <enter> {after completing the title, and} <esc> {to return to the
     regular select items menu.}
  




                                                 14

 6.  PREVIEW MODE 

     At this point you may wonder whether all of these new items are going to
     fit on the screen.  We could go back to the main menu and display a full
     screen of data, but EXTRACT gives us a shortcut from within the Select
     Items menu.
  
     {Type:}  P {for <P>review mode.}

     In the preview mode you can cursor to the right to see more fields, just as
     in the full screen display mode.  

     You can even cursor down to see the next record(s) in the database.  

     If you do not see all of the items you want on the screen, cursor to the
     right, and adjust the width of any item that looks like it could get by
     with less space.  {Cursor right to ANPSADPI, press} W {and type: } 15
     <enter>.  Now it takes only a moment to adjust the width of any or all
     columns with the W option.  The same <W>idth option exercised from the
     regular display screen may take considerably longer, as the system repaints
     the full 17-line data screen with every width change.

 7.  You may find that you also want to alter the display by adding another item
     or taking one from the display. 
  
     {Type:}  R {to <R>eselect data items.}

     Mark X by any additional items you want to include in the display, or press
     <space> to un-select items.  [The preview displayed at the top of the
     screen will not be updated until you press <P> again.] 

     {Press}  <esc> {to return to the main menu.}

 SELECTING RECORDS FOR PLACES WITH 10,000 OR MORE POPULATION
 
 8.  {Type:}  2 <enter> {to Select Records.}

     It is possible to select records using not only the various geographic code
     variables but also any data value in the current file.

     {Cursor to SUMLEV, type} S, {then} <PgDn> {several times until you see
     P0010001, and place an} S {next to it as well.  Press} <esc> {when
     finished.}

     {In the next screen, put an} X {next to 160--State-Place, and press} <esc>.

     When we use as a selection criterion an item that does not have a "*" or
     "-" next to it on the first record selection screen (e.g., P0010001), the
     system cannot give us a menu of all possible values, so instead it prompts
     us to enter the minimum and maximum values for a range.  The minimum
     population value we want to include is 10,000, which we enter with no
     commas.  We will not enter a maximum, to leave the interval open ended.

     {Type:}  10000 <enter> {for the minimum, and press} <enter> again to skip
     the maximum.}


                                                 15

 9.  {Type:}  6 <enter> {to display the data to the screen.}

     This same technique allows you to select records based on any data value in
     the current file, so that you could, for example, select all block groups
     with 1 or more Black persons.

     {Press} <esc>

 10. EXTRACTING THE DATA TO A FILE

     At this point, we could again print out our results with option 7. 
     Instead, let's prepare for some real work.  Let's say we want to extract
     the data to a file that Lotus 1-2-3 can read in, so that we can arrange the
     counties in rank order of population, and prepare a graph.

     {Type:} 8 <enter>

     EXTRACT allows us to copy our extracted data set into any of four formats:

     1 - dbf We can create another dBASE file, just smaller than the original.

     2 - prn A comma-delimited file is the kind you want for importing to Lotus
             1-2-3.

     3 - sdf A fixed-format file looks more like a columnar report.

     4 - txt A print file, with the same formatting options as printed output. 
             [Option 4 is available starting with EXTRACT 1.3e, issued February
             1992 on Economic Censuses CD-ROM 2b and elsewhere.]

     In addition, there is a Dry Run option that counts up all of the records
     selected and projects the size of the output file without actually doing
     the extraction.  

     {Type:}  5 <enter> {to execute a dry run.}  Doing a dry run doesn't save
     any time, unless you find out that the file you were about to create would
     have been too large for your hard disk or floppy, leading you to reduce the
     number of items or make the selection of records more narrow.  A dry run
     does give you a count of records, and in this case the system gives us a
     count of places with 10,000 or more inhabitants.  Similarly, you can answer
     such questions as "How many block groups are there in my county?" or "How
     many places in the State have more than 100 Hispanic residents.

     Now, let's go ahead and extract the data for the county already selected
     into a file for import into Lotus 1-2-3.

     {Type:} 2 <enter>

     Let's let the system add headings for us.

     {Type:} Y <enter>

     The system prompts us to specify a name for our output file and a drive and
     directory location on our hard disk or floppy.



                                                 16

     {If you have Lotus 1-2-3, specify a filename in the subdirectory where that
     program normally looks for data files.  For example, type:} c:\123\demo1
     <enter>.  If you do not have 1-2-3, specify any valid filename.

     The system then examines the records in the data base for ones that meet
     the selection criteria we specified, then copies off the data fields that
     we selected.  This selection was very simple, but some data extractions can
     take quite a long time.

     {When the "completed" message appears, press} <enter> {to return to the
     main menu.}

     At this point, we could look at the file again, perhaps changing the items
     displayed, selecting a different set of records, or modifying formats. 
     Instead, let's exit the program and see what we accomplished with the file
     we just created.

     {Type:} Q {to quit.}

 LOOKING AT AN EXTRACTED FILE

 {[This step is optional if you have Lotus 1-2-3 and can view the file in Lotus. 
 If you have available the shareware program LIST or a text editor, you may
 prefer to use that program rather than TYPE in the following command.]}

 {Type:} TYPE C:\123\DEMO1.PRN <enter>

 The data have scrolled by quickly, but what is left on the screen shows us what
 we mean by "comma-delimited format":

     Alphabetic variables and codes are enclosed in quotes.

     Numbers have no leading zeros or spaces.

     Commas have been inserted between each field.

 Because we told EXTRACT to give us headers and footers, the file includes a
 complete list and description of each of the variables (columns) at the bottom
 of the file.

 We can see how useful that format is by loading the data into Lotus 1-2-3.


 LOAD EXTRACTED DATA INTO LOTUS 1-2-3

 1.  Start Lotus 1-2-3 in the way you normally do to bring up a blank
     spreadsheet, for example, type:  123 <enter><enter>

 2.  To import a comma-delimited file, type: /F(ile) I(mport) N(umbers).  Then
     we select our file from the menu presented.

     {Cursor to highlight DEMO1, and press} <enter>.  {If DEMO1 is not one of
     the options on the menu, change the default directory with <esc> /F(ile)
     D(irectory) to the one where DEMO1 is located, then try /I(mport)
     /N(umbers) again.}


                                                 17

     This easy procedure loads our extracted data into spreadsheet form.

     Note that this procedure has preserved not only column headings, but also
     the master heading we specified in EXTRACT.

 3.  You may want to adjust the column width for the area name by cursoring to
     that column and entering: /W(orksheet) W(idth) B [for column B] 15.

 4.  At the bottom of the display ({press} <end><down arrow>), more complete
     descriptions are associated with the cryptic column headings.

 5.  Let's say that we want to rank order these data on total population.

     {Press} <home> {to return to the top, then cursor down to cell A4.}

     {Type:} /D(ata) S(ort) D(ata-Range) .(period) <end><right arrow>
     <end><down>

     {Type:} P(rimary sort key) F1 (or whatever cell is at the top of the
     population column) D(escending sort)

     {Type:} G(o)

     Now we have sorted all areas in decreasing order of population.

 6.  Of course, you will have lots of other things you want to do inside a
     spreadsheet program, like--

     -   calculate percentages
     -   add up subtotals
     -   rank areas based on population size
     -   do bar charts or graphs

 7.  {To leave the program, type:} /Q(uit) Y(es)























                                                 18

 Tutorial 4C

     Having saved our opening parameters into the file EXSTF1A.bat, we can skip
     past the opening drive selection screen.

     {Type:}  EXSTF1A <enter> <enter>

     {Press} R {to <R>estrict the session and enter the State abbreviation of
     your choice, e.g.,} VA <enter>.

     This time let's look at data on people living alone.  Household type and
     household relationship are found in files 5 and 6.  For starters, let's
     look at STF1A5.

     {Cursor down to STF1A5__ and press} <enter> <enter>.  You have a few
     moments to read the help screen while the system is reading in the data
     dictionary.

 1.  To find data for persons living alone we may have to page through quite a
     few screens full of items on the Select Items menu.  It would help if we
     knew which table to look for.  It is a good idea to keep the STF 1A paper
     documentation handy for such purposes, but for right now, we can make use
     of the fact that there is a general table locator for STF 1A built into the
     definitions system.

     {Type:}  D <enter> {bring up the index of definitions from the main menu.}

     {With the cursor highlighting -LOCATOR, press} <enter>, {then <PgDn>
     through the alphabetic entries.}

     While there are no entries for "persons living alone", there are entries
     for Household Size and Household Type (including P016 in STF1A5) and other
     entries under Persons in Unit (in STF1A7).

     {Press} <esc> {to return to the main menu.}

 2.  {Type:}  1 <enter> {to the Select Items.}

     This file obviously does not have as many geographic identification fields
     as STF1A0.  Later we will have to address the lack of an area name field.

     {Put an} X {next to SUMLEV, CNTY, COUSUBFP, PLACEFP, and LOGRECNU.}

     {Type}:  J {and} P016 <enter> {to <J>ump to the entries beginning with the
     characters P016.  Put an X next to P0160001 and P0160002.}

 3.  There are two separate cells that make up one-person households.  If we
     need that expressed as a single total, we need to compute a user-defined
     item.

     {Type:}  U {for <U>ser-defined item.}
  
     To total two items, we will enter a freeform expression.

     {Press} <enter> <enter> <enter> {to move to the freeform expression field,
     then type:}  P0160001 + P0160002 <enter>.

                                                 19


     Make sure that you type 0's rather than o's in these item names.  If you do
     not enter the items correctly, the system will keep cycling you back
     through these cells until it gets an expression it finds meaningful.

     {Type:}  7 <enter> {for the field length;}

     {Press}  <enter> {to skip any entry for the number of decimals;}

     {Type:}  Person1 <enter> {for the field name; and} 

     {Type:}  One-person households <enter> {for the title.}

 4.  We can compute one-person households as a percent if we can supply the
     denominator of all households.  That number is shown in a separate file,
     but we can also compute it from table P016 since the universe of the table
     is households, and all of the cells in the table should add up to that
     total.

     {Press}  2 {to define a second item.}

     {Type:}  P0160001 + P0160002 <enter> {for the numerator.}  (Users of
     EXTRACT 1.3e or later could type in Person1, representing the sum of these
     two items computed above.)

     To enter ten field names in the denominator field would be laborious. 
     Fortunately, EXTRACT gives us a shortcut.  You may enter the first and last
     fields separated by a colon, and the system will compute the sum of all of
     the fields in between.

     {Type:}  P0160001:P0160010 <enter> {for the denominator.}

     {Type:}  100 <enter> {as the scaling factor appropriate for a percent.} 
     {Press} <enter> {again to get past the freeform expression field.}

     {Type:}  5 <enter> {for the field length;}

     {Press}  1 <enter> {to specify one decimal place for the percent.}

     Note that 5 characters was assigned for a percent to allow for a figure as
     large as 100.0 percent.

     {Type:}  Pct1P <enter> {for the field name.} 

     You may be tempted to use the percent sign (%) in such a field name, but
     special characters such as %, $, /, *, +, or - are not allowed.  It is also
     preferable to make the field name no longer than the field length--EXTRACT
     will make the data display column wide enough to accommodate both the field
     and its name.

     {Type:}  One-person households as a percent of all households <enter> {for
     the title.}

     {Press} <esc> {to return to the Select Items menu, and} <esc> {again to
     return to the main menu.}


                                                 20

 5.  {Type:}  6 <enter> {to display the data.}

 ADDING LABELS

     Obviously, the display is limited in its usefulness until we add a label
     for the area name.

     {Press} <esc> {to return to the main menu.}

 6.  {Type:}  3 <enter> {to Add Labels.}

     We may add labels associated with just about any of the coded variables on
     the file, except tract and block group for which there are no names.  The
     most flexible type of label, however, is the general area name (ANPSADPI)
     found in the STF1A0 file, which gives a title appropriate to any summary
     level.  That choice also has the advantage of giving us access to other
     data items in the STF1A0 file.

     {Cursor down to --STF1A0 and press} <enter>. 

     The area name is highlighted, and pressing <enter> would select it and
     return us to the main menu.  But let's consider other items from the file
     as well, taking advantage of the options show a the bottom of the screen.

     {Press} M {to allow selecting multiple items.}

     {Press} <enter> {to select ANPSADPI.  A ">" confirms that it is selected.}

     {Now press} A {to show all fields.  Type:} J {and} P001 {to <J>ump to the
     first of the data tables.  Cursor to highlight P0010001, and press} <enter>
     {to include P0010001 as a label, then cursor down to P0030001 and press}
     <enter> {to include it as well.}

     {Press} <esc> to return to the main menu.

 7.  {Type:}  6 <enter> to display the data.

     {Press} W {and} 13 {to reduce the width of the ANPSADPI field to 13
     characters.

 OPTIONAL:  COMPUTING A USER-DEFINED ITEM WITH DATA FROM ANOTHER FILE

     In the percentage displayed at the right, one-person households was shown
     as a percent of all households.  Another way of looking at one-person
     households would be as a percent of persons.

     It is possible to use the population count added as a "label" in computing
     a user-defined item, but the field name must be recorded precisely.  Notice
     on the display that the heading for total persons is STF1A0->P0010001, with
     the two characters "-" and ">" making up something that looks like an
     arrow.  This indicates that the item has been pulled in from the STF1A0
     file; we will use this notation shortly in a user-defined item.

     {Press} <esc> {to return to the main menu.}



                                                 21

 8.  {Type} 1 <enter> {to Select Items}

     {Press} U {to bring up the list of user-defined items, and press} 2 {to
     review how we defined the second item.}

     The numerator we want will stay the same--P0160001 + P0160002.

     {Press} <esc> {to return to the list, then enter} 3 {to specify a new
     item.}

     {Type:}  P0160001 + P0160002 <enter> {as the numerator}

     {Type:}  STF1A0->P0010001 <enter> {as the denominator}

     {Type:} 100 <enter> {as the scaling factor, and another <enter> to skip
     past the freeform expression field.

     {Type:} 5 <enter> {for the field length;} 1 <enter> {for decimal places;}
     Pct1P <enter> {for the field name; and} Persons living alone as a percent
     of all persons <enter> {for the title.}

     {Press} <esc> {to return to the Select Items menu.}

 9.  {Press} P {to <P>review column widths}

     To get all of the fields on the screen, we need to adjust the widths of
     some of the columns.

     {Cursor to the SUMLEV column, press} W {and since the 3 appearing in the
     prompt window is the minimum this code requires, simply press} <enter>. 
     {Repeat the process narrowing the COUSUBFP and PLACEFP column to 5
     characters each.}

     {Press} <esc> {to return to the main menu.}

 SELECTING ALL RECORDS WITHIN A COUNTY

 10. {Type:}  2 <enter> {to Select Records}

     In the examples in tutorials 4a and 4b, we selected one SUMLEV at a time,
     something that can be accomplished quite efficiently.  But sometimes you
     want to show the whole geographic hierarchy within a particular county or
     county subdivision.

     Let's pick data for one county, using the CNTY variable.

     {Cursor down to CNTY, type} S, {and press} <esc> {to go to the next
     screen.}

     In Select Records Screen 2, put an X next to the particular county of
     interest.






                                                 22

 11. SPEED-UP SCREEN

     Here is something you may not have expected.  The system tells us that it
     can speed up retrieval if we pick one of the categories of SUMLEV.
   
     {Cursor to 050--State-County, and type:}  X.

         You may be puzzled as to why the system asked that extra question,
         particularly when the answer seems obvious.  In fact, county codes are
         attached to county subdivision, tract, and block group and some place
         records as well as to county records, so your answer does make a
         difference.

         EXTRACT makes use of existing "indexes" on the CD-ROM whenever it can
         to speed the process of finding specific data.  In this case it found
         that in order to use a particular index, it had to know both the county
         and the summary level, so it prompted us for the missing piece of
         information.

         If you happen to select a SUMLEV that is not valid in combination with
         the rest of your selection criteria, then the program will not find any
         of the data you want.  For example, if you specified a particular
         county code and a SUMLEV of 160 (State-Place--a summary level with no
         county codes), EXTRACT will give you a message that it can find no
         records meeting those conditions.

         EXTRACT uses the information supplied "to speed up retrieval" only for
         the purpose of finding the first eligible record.  Thus, the index is
         "turned off" as soon as the first eligible record is found.

     {Press} <esc>.

 12. {At the main menu, type} 6 <enter> {to redisplay the data.}

     This shows us the entire geographic hierarchy for the county selected. 
     Note that its name has been introduced as a subheading.

 OPTIONAL:  ADDING A SECOND SET OF LABELS

 13. You may recall from scanning the table locator that some of the tables
     related to household composition were in file STF1A6.  We have already seen
     how we can merge in data from STF1A0 using the Add Labels option.  EXTRACT
     can actually handle two sets of labels at once, so we have a chance to
     investigate what additional information STF1A6 can bring to bear on persons
     living alone.

     {Type:} 3 <enter> {to Add Labels}

     This time, a blinking prompt at the top advises us to press S first if we
     want to <S>ave the labels we have already designated.

     {Press} S, {then cursor down to --STF1A6, and press} <enter>.

     {Cursor down to P023, press} M {to allow selection of multiple items,
     cursor to P0230005, press} <enter>, {move down to P0230007, and press}
     <enter> {again, then} <esc> {when we are finished selecting labels.}

                                                 23


 14. {Type:} 6 <enter> {to display.}

     At this point we may see that the data displayed are not particularly in
     the sequence we might have first chosen.  Labels appear first, in the
     sequence chosen, then the data items from the current file, then user-
     defined items.  EXTRACT 1.3 does not give us any further control over the
     sequence of items.

 EXTRACTING DATA TO A .DBF FILE

 This set of records might just be a group we want to refer back to, so let's
 save that to a file that EXTRACT can continue to use.

     [Note:  Before proceeding, use Select Items to make sure that you have all
     of the items you need.  For example, if you eliminated LOGRECNU or CNTY to
     save space in the screen display, add them back in for your .DBF output.]

 1.  {Type:}  8 <enter> {to extract the data to a file.}

 2.  {Type:}  1 {to select the DBF file option.}

     {Enter a file name, without any extension, e.g.,} alone.  If a drive and
     directory have not been specified, this file is automatically save to the
     work directory, and becomes c:\work\alone.dbf.  If you want to save this
     file to a floppy disk or other directory, enter the drive and directory
     explicitly, e.g, a:\alone.

     The system prompts us for a description for our "My_Files" catalog.  You
     can take the one listed, you can edit it by cursoring to something you want
     to change, or you can retype the line.  

     {Type:} Persons living alone in [county] [date] <enter>.

     The system will work for a while, first extracting the appropriate records,
     and then in a second pass adding the appropriate labels.  If you have added
     two sets of labels, this step can take quite a while.

     When the system is finished, it will give you the opportunity to go
     directly to the newly created file.  {Press} 2.

 5.  {Type:} 6 <enter> {to display to the screen.}

     At this point the display has the same data items we had selected before,
     except that the data items previously added as labels take less width
     because they no longer need as long a name.  We also have to reset the
     width of the ANPSADPI column to fit our display since the system copied the
     full contents from the original file.  This time there was no waiting while
     EXTRACT built up the screen--the system is not having to slow down as it
     filters out records that do not qualify, and it is working off a fast hard
     disk rather than a slow CD-ROM.

     {Press} <esc>.




                                                 24

 6.  {Type:} 1 <enter> {to bring up the Select Items screen.}

     The system has created a customized data dictionary for our new database
     file.  Here we can see that there are fewer items to choose from.

     {Press} <esc> to leave the item list unchanged and return to the main menu.

 7.  If we want to add another label to the file, we could do so with option 3. 

     If we wanted to cut out some of the records, for instance including records
     for selected townships (county subdivisions), we could do so with option 2.

     But if we decided at this point that we wanted more data rather than less,
     such as more items from the original STF1A5 file, we would have to reselect
     the original file, and repeat most of our original steps.  

 8.  {If you are finished, type:} Q {to quit.}


 PRACTICE PROBLEMS

 1.  Display the name, population, number of persons under 15, and percent under
     15 of all persons for geographic areas of your choice.

     Hints:  You will want to use STF1A1 file as your main file for age data,
     and add the area name and population as labels from the STF1A0 file.  You
     must add together nine separate items to create persons under 15 as a user-
     defined item--entering P0110001:P0110009 as a freeform expression.  In
     computing the percent, you will enter that same expression as the
     numerator, STF1A0->P0010001 as the denominator, and 100 as the scaling
     factor.  

     If you are wondering where the STF1A0->P0010001 came from, that is the
     heading EXTRACT shows for the total population column in the regular data
     display screen when you have added it as a label.  You need to copy those
     characters exactly, including the - and >.





















                                                 25

 Tutorial 4D--

 BUILDING MENUS FOR MULTIPLE CD-ROMS

 1.  {Type:}  EXMENU <enter>.

     If you have used EXTRACT 1.3 only with the STF 1A tutorials, the EXTRACT
     menu printed out may have only one entry.  If you have used more than one
     type of CD-ROM with EXTRACT 1.3, your menu might look more like this:

                     EXTRACT MENU
     .
     EX
     EXSTF1A
     EXPL94
     EXSTF1B
     EXCBP
     EXFTD
     EXSTF1A

     Every time you <S>ave parameters from EXTRACT's Drive Selection screen, the
     program adds the name you give that set of parameters to the file
     EXMENU.bat.  The sequence of entries is simply the order in which you saved
     parameters with various discs.  Obviously, this menu could be more useful
     to you if it were annotated, rearranged, and unduplicated.  Since
     EXMENU.bat is an ASCII text file, you may do anything you like with it in
     your text editor or word processor--as long as you save it as an ASCII text
     file, and as long as you leave the word "echo" at the beginning of each
     line.  Here is a sample of how the menu might look after editing.

                     EXTRACT MENU
     .
     EX          EXTRACT with Economic or Agriculture Census discs
     EXCBP       EXTRACT with County Business Patterns
     EXFTD       EXTRACT with U.S. Exports or Imports of Merchandise
     EXPL94      EXTRACT with 1990 Census P.L. 94-171
     EXSTF1A     EXTRACT with 1990 Census STF 1A
     EXSTF1B     EXTRACT with 1990 Census STF 1B

     Some users prefer to set up menus where the above options would be numbered
     1, 2, ... 6 rather than given the cryptic names.  If you do so, make sure
     you rename the corresponding batch files.  For example, EX.bat would be
     renamed 1.bat, EXCBP.bat would be renamed 2.bat, etc.

     Note:  Some users may prefer to have this menu reappear after the
     conclusion of every EXTRACT session.  This may be accomplished by adding
     two lines, one saying PAUSE and another saying EXMENU, to each of the
     EXTRACT batch files.  Users of computers with extended memory (more than
     640K) may find that adding "SET CLIPPER=E000" at the beginning helps avoid
     unwanted conflicts.  For example, the file EXSTF1A.bat could be edited to
     read:

     SET CLIPPER=E000
     EXTRACT c l: c:\auxil\ c:\work\ master.ctg
     PAUSE
     EXMENU

                                                 26


     The PAUSE line yields the prompt "Press any key to continue" before
     returning you to EXMENU.  If EXTRACT terminates abnormally, it displays an
     error message on the top line (usually beginning with "Proc") that may be
     the key to figuring out what you (or the program) did incorrectly.  Press
     <shift><Print screen> or copy down the message before pressing any other
     key, least you be returned to the EXTRACT menu without that important clue.

 2.  {Type:}  EXSTF1A <enter> <enter>

     {Press} R {to <R>estrict the session and enter the State abbreviation of
     your choice, e.g.,} VA <enter>.

     This time let's look at some housing data.

     {Cursor down to STF1A7__ and press} <enter> <enter>.  You have a few
     moments to read the help screen while the system is reading in the data
     dictionary.

 3.  {Type:} 1 <enter> {to select items}

     {Put an X next to SUMLEV, COUSUBFP, PLACEFP, TRACTBNA, BLCKGR, LOGRECNU,
     and (after <PgDn>) H0010001.}

     {Press} <esc>.

 SELECTING ALL RECORDS WITHIN A COUNTY SUBDIVISION OR TRACT/BNA

 4.  Selecting records within a county subdivision or census tract/BNA presents
     some new wrinkles to learn, because their names may not be unique within
     the State, and the code lists are sorted within county.  Place and county
     names are always unique within state, so their code lists are presented in
     straight alphabetic sequence, making them easier to select from a list.

     There are two ways to deal with these identifiers nested within county, as
     shown below.

     a.  {Type:} 2 <enter> {to select records.}

         {Put an S next to CNTY and another S next to COUSUBFP, then press}
         <esc>.

         {Move to the county of your choice and put an X next to it.  While you
         are there, note the county code for future reference.  Press} <esc>.

         Contrary to your expectation, the county subdivision code list does not
         start with the county you have just specified.  Instead it lists the
         county subdivisions for the county that is first in alphabetic
         sequence.  We could press L to <L>ocate the code based on the county
         subdivision name, but both the <L>ocate and <W>ord search options are
         rather time consuming on county subdivision, tract, and place lists on
         1990 census files.  On the other hand <J>ump is quite fast, once you
         know how.

         {Press} J.


                                                 27

         The prompt "Jump to what value for SUMLEV+CNTY+COUSUBFP+PLACEFP
         +TRACTBNA+BLCKBR?" is not particularly inviting.  Fortunately, all you
         need to specify in order to get to the right part of the list is the
         SUMLEV (060 for county subdivisions) and the county code (which you
         should have made note of while selecting the county in the previous
         step).  Thus, if the county code were 025, you would enter the string
         060025, followed by <enter>.  {Put an X next to the desired county
         subdivision and press} <esc>.

         Now comes one of those pesky "speed up retrieval" prompts, requesting a
         particular value for SUMLEV. {Put an X next to 060 and press} <esc>.

         {Type:} 6 <enter> {at the main menu to display the data, then} <esc>
         {back to the main menu when you are ready.}

     b.  There is another way to select a particular county subdivision or
         census tract, a bit more time consuming, but without the complexity of
         entering in codes at the <J>ump prompt.  This involves two separate
         record selection operations, first for the county, to bring up the
         desired part of the code list, second for the individual county
         subdivision(s) or census tract(s).

         {Type:} 2 <enter> {to select records.}

         {Put an S next to CNTY and press} <esc>.

         {Move to the county of your choice, put an X next to it, and press}
         <esc>.

         {In response to the "speed up" screen, put an X next to 060--State-
         County-County Subdivision, since that is where we want to be next.}

         {From the main menu, type:} 2 <enter> {to select records a second
         time.}

         {Put an S next to COUSUBFP (or TRACT/BNA) and press} <esc>.

         {Move to the county subdivision of your choice, put an X next to it,
         and press} <esc>.

         This two-pass approach brings you to the right part of the county
         subdivision (or census tract/BNA) list.  EXTRACT always attempts to
         bring you into each code list at a point corresponding to the data
         record that was last displayed or selected.

         {In response to the "speed up" screen, put an X next to 060 for SUMLEV,
         and press} <esc>.  This is followed by another "speed up" screen,
         asking you to confirm the county.  {Press} X, {then} <esc>.

         {Display the data to the screen at this point if you wish.}

     Note that, while you could have selected more than one county subdivision
     or census tract within a single county, EXTRACT does not give you the
     capacity to select individual tracts or county subdivisions in multiple
     counties in a single retrieval.  As will be discussed later, however, two


                                                 28

     separate retrievals, if extracted to .DBF files, can later be merged
     "vertically" into a single data base.

 5.  {Type:} 6 <enter> {to display the data to the screen.}

     The data displayed reflect the full geographic hierarchy within the
     selected county subdivision:  county subdivision, place, tract, and block
     group.

     a.  Starting on the second line of the screen is the prompt
           --Press I to Index (resequence) by SUMLEV+CNTY+COUSUBFP+PLACEFP
         +TRACTBNA+BLCKGR--
         Since the display is sequenced the way we want it, we do not want to
         press I.

         This message appears whenever the record selection process takes us
         through the "speed up retrieval" prompt.  The system has used an index
         to find the first record, then disengaged the index so that the data
         display in the original sequence.  In this case, turning the index back
         on would group the records by summary level.  For example, to view only
         the block group records, cursor to the first block group record and
         press I.  If you then want to return to the original display, you may
         press N for natural order, which in this case was hierarchic.  The
         system lets you go back and forth between indexed and natural order
         only once, after that you will need to repeat your original record
         selection process.

     b.  You may wish to add the area name label (from STF1A0, the ANPSADPI
         field).  For the tract and block group records this may give you only
         information that is self evident from the tract/BNA or block group
         number.  On the other hand, the name may include a part indicator (pt.)
         that alerts you that the county subdivision or place boundary
         subdivides the area you are looking at.

 KEEPING TRACK OF WHAT SUMMARY LEVELS HAVE WHAT CODES

 Understanding summary levels and the sequencing of records is so important that
 you will want to keep handy the summary level sequence chart on page 6-1 of the
 STF 1 documentation (or print out the first part of \DOCUMENT\SUM_LEV.ASC on
 the CD-ROM).  The same information is contained, somewhat abbreviated, in the
 definition of SUMLEV, accessible by pressing D for <D>efinition in either of
 two places:  (1) at the data display screen when the cursor is on the SUMLEV
 column, or (2) at the item selection screen when the cursor is on the SUMLEV
 row.

 You also need to keep straight which codes are available on which summary
 levels.  For example, if you want to analyze tracts/BNAs by place, you need to
 specify a SUMLEV of "080" instead of "140", because SUMLEV 140 tract/BNA
 records do not include place codes, while SUMLEV 080 tract/BNA records do. 
 Similarly, if you want to list places within a county, you need to work with
 SUMLEV 155 or 070, not SUMLEV 160, becauses the latter does not have county
 codes.  To determine which codes are available on which summary levels, see
 "File identification by summary level" chart in STF 1 documentation pages 2-3
 to 2-15 or print out \DOCUMENT\HOWTOUSE.ASC on the CD-ROM.



                                                 29

 The chart is somewhat awkward to use because it spans a number of pages, the
 first 4 pages dealing with SUMLEVs 010 to 201, the next four with SUMLEVs 202
 to 321, and the last four with SUMLEVs 329 and up, with pairs of pages
 alternating between codes from A to M, and M to U.  As an exercise, (1) use the
 chart to confirm that place codes are present on tract records at summary level
 080 but not 140; and (2) determine what summary levels are accessible using
 Congressional district codes.

 DISPLAYING DATA FOR AN INDIAN RESERVATION

 Selecting data for an Indian reservation introduces some interesting
 challenges, because there is no menu available for Indian reservations, and
 since no indexes are available to speed searches for the components of a
 reservation.

 1.  {Load a file in the STF1A0__ group.  If you are have just completed the
     preceeding exercise, return to the file selection menu (option 9) to do
     so.}

 2.  {At the main menu, type:} 2 <enter> {to select records, then put an S next
     to AIANAFP and press} <esc>.  What next appears is not an easy-to-use menu
     of all of the possible values for AIANAFP, but rather a prompt to specify a
     minimum value for the code, something you are not yet prepared to do. 
     {Press} <esc> {to skip back to the main menu.}

 3.  In the absence of an Indian reservation code menu, you will have to
     construct your own code list.

     a.  {Type:} 2 <enter> {to select records, and place an S next to SUMLEV and
         press} <esc>.  {Put an X next to each of the codes having to do with
         Indian reservations or trust lands, from 210 to 221, and press} <esc>. 
         [If you get the message "No records found meeting these conditions",
         then this state has no Indian reservations, and you will need to repeat
         the above steps with another state.]

     b.  {Type:} 1 <enter> {to select items.  Put an X next to SUMLEV, AIANAFP,
         AIANACC, ARTLI, and CNTY; <PgDn> three times; and put Xs next to
         ANPSADPI and P0010001.  Press} <esc>.

     c.  {Type:} 6 <enter> {to display.  Cursor to the right to highlight the
         ANPSADPI column, press} W {and type:} 24 <enter> {to reduce the column
         width so that everything fits on the screen.}  At this point we can
         simply make a note of the AIANAFP code associated with a particular
         reservation of interest.

     d.  If you want to look at the block groups that comprise the Indian
         reservation, it will be quite useful if the search can be narrowed down
         to a single county.  Summary levels 220 and 221 specify county
         components.  If there is more than one reservation in the state,
         SUMLEVs 220 and 221 will not be together with the associated
         reservation title.  The second line of the screen tells us that the
         display is sorted by SUMLEV (the PLACEFP in the expression has no
         effect on these records).  {Press} N {to get the Natural or original
         sequence which is hierarchical.}



                                                 30

     e.  The codes for AIANACC and ARTLI are not self explanatory, so it may be
         worthwhile to see what they mean.  {Cursor to AIANACC and press} D {to
         display its definition.  Repeat the process with ARTLI.}

     f.  This display should make a handy code list.  {Press} <esc> {to return
         to the main menu, the type:} 7 <enter> {to print out the list, assuming
         you are connected to a printer.  If you are not connected to a printer,
         make a note of the county in which the particular reservation is
         located.}

 4.  {Type:} 2 {to select records.}

     Before specifying the record selection, we need to review the fact that
     there is no asterisk next to AIANAFP, meaning that record selections based
     on it do not benefit from any indexes.  Since searches using unasterisked
     items can take several minutes per megabyte to accomplish, we are much
     better off if we can help the program find a appropriate starting county.  

     {Put an S next to AIANAFP and and another S next to CNTY.  When the prompt
     for a minimum value for AIANAFP appears, enter the 5-digit code you
     recorded earlier, press <enter>, then enter the same code as the maximum. 
     On the next screen, put an X next to the appropriate county, as determined
     from the code list, and press <esc>.  Finally, in response to the "speed up
     retrieval" screen, put an X next to 091 for block groups.}  While not
     obvious from this screen, we can confirm with the chart discussed in the
     preceeding section that American Indian reservation codes appear on block
     group records.  {Press} <esc>.

 5.  Before displaying the data, confirm that you have the items you want. 
     {Type:} 1 <enter> {to reselect items, delete the X next to AIANACC code,
     and add an X next to TRACTBNA several lines below.}  You will want a tract
     or BNA code to help you identify the block groups in the display.  You may
     also want to add a count of American Indians, and the easiest way to find
     that item is to do a <W>ord search on the word Indian.  {Press} W, {type:}
     Indian <enter>, {then put an X next to P0060003.  To confirm that all of
     the data will fit on the screen, and press} P {to <P>review.  Adjust the
     width of the ANPSADPI column if necessary.  When satisfied, press} <esc>
     {to return to the main menu}. 

 6.  {Type:} 6 <enter> {to initiate the display screen.}  The system will now
     search through all of block groups starting with the county specified.  If
     there are many block groups in the remainder of the state, this could take
     a while.  Do something else for a while, and come back when the CD-ROM
     light is no longer blinking to see your results.













                                                 31

 Tutorial 4E--

 MERGING FILES HORIZONTALLY

 With the Add Labels feature, practiced in Tutorial 4c, you can merge data from
 up to three STF1A files at a time (the original file selected plus two more
 specified with "Add Labels"), as long as the total number of "label" items
 taken is no more than 10 (e.g., 2 from one file and 8 from a second).

 Sometimes you need more than the add labels function will allow, either because
 you want to add more than 10 items from another file or because you want to add
 items from a fourth or fifth file.

 {[Note:  The following example assumes that you have installed EXTRACT 1.3e
 (Feb 92) or later.  Earlier editions of EXTRACT do not successfully link files
 using record number, although linking by geographic codes is still possible. 
 If you have EXTRACT 1.3d or earlier, skip ahead to the "Merging Files to Make
 Inter-Area Comparisons" below.]}

 1.  The first step in this process is to create an extract file with the
     appropriate records and as much of the data from other files as can be
     accommodated by adding labels.

     You must include in the extract enough information to support a unique
     match.  The easiest way to assure that is to include the variable LOGRECNU. 
     (It is also possible to make a horizontal match with a string of other
     variables if they match the contents of one of the STF1A indexes (e.g.,
     sumlev, county, tractbna, block group).

 2.  The file ALONE.dbf created in the tutorial 4c is a good starting point.
     -   {Start EXTRACT, or, if you are already in the program, type:} 9 <enter>
         to return to the file selection menu.
     -   {At the catalog selection menu, <PgDn> to highlight the MY_FILES
         catalog, and press} <enter>.
     -   {Move the cursor if necessary to highlight ALONE and press} <enter>.
     -   {Type:} 1 <enter> {to select items, and put an X next to items you wish
         to display.  You should also confirm that the variable LOGRECNU is
         present.  Press} <esc> {to return to the main menu.  You may also wish
         to display the data to the screen before proceeding and to reduce the
         width of ANPSADPI.}

 3.  Merging files is one of the functions provided within the "Manipulate
     Files" option at the main menu.  {Type:} 4 <enter>

 4.  At the file manipulation menu, select the third option, since we want to
     add more items to existing records.  (We will discuss vertical mergers
     later)  {Type:} 3 <enter>.

 5.  We want to add information about the tenure of persons living alone from
     file STF1A7.  {Cursor to STF1A7 and press} <enter>.  {At the next screen,
     select the appropriate state and press} <enter>.

 6.  The prompt asks "How do records in STF1A7.. relate to records in ALONE?" 
     Since the LOGRECNU variable gives us the record number in the original
     STF1A data bases, we can relate the two data bases by record number. 
     {Type:} 1 <enter>

                                                 32


     The system follows with another question:  "What expression in ALONE gives
     the record number in STF1A7..?".  {Type:} LOGRECNU <enter>

     (If we were working from one of the original STF1A files on the CD-ROM,
     this second question would not have appeared, since the system would have
     detected the same number of records in the two files.)

 7.  The system takes us directly to a Select Items for the file being merged
     in, STF1A7..  {Cursor or <J>ump to H0180001, and put an X next to it and
     H0180008, the numbers of owners and renters among one-person households.

 8.  Display the results.

 9.  If you want to merge in data from yet another source, you will need to
     first save what you have to an extract .DBF file.

     a.  But first go back to the select items menu to make sure you aren't
         leaving out any variables you will need.  It is particularly easy to
         forget about LOGRECNU since it adds nothing of analytic interest to
         your data display; take LOGRECNU along since it is critical to further
         merging.

     b.  Extract the data to a file using option 8.  When the data extraction
         process is complete, the system will allow you to go directly to the
         new file, which you may use as the base of operations for repeating the
         steps outlined above.

 MERGING FILES TO MAKE INTER-AREA COMPARISONS

 We have seen a number of ways that EXTRACT allows you to work with data within
 a single record, such as the creation of user-defined items.  If you want to
 make comparisons among different records from the same source, you will usually
 have the most flexibility if you import data into a spreadsheet program, such
 as Lotus 1-2-3 or Quattro Pro.

 Still, the horizontal merge feature gives some flexibility for linking among
 different levels of the hierarchy.  The most common application is in doing a
 percent distribution across the components of an area.

 The following assumes that you are currently using ALONE.dbf (created in
 Tutorial 4c) or a file you have created from it.  Our objective is to create a
 percent distribution of persons living alone within the county.

 1.  {At the main menu, type:} 4 <enter> {to manipulate files}

 2.  {Type:} 3 <enter> {to merge files horizontally.}

     If you previously made another horizontal merger into this file, it will
     ask you whether you are simply changing items within the previously merged
     file.  By selecting items from a different file (option b) you lose the
     results of the previous merger.  If you don't want that, press <esc> twice
     to return to the main menu, and save what you have as a .dbf file (option 8
     from the main menu), before coming back to this menu.



                                                 33

 3.  {Cursor to STF1A5, and press} <enter>.  {Select the appropriate state on
     the next screen and press} <enter>.

 4.  We want to link each record to the corresponding county total, and there is
     an index that will let us do that.  In response to the prompt "How do
     records in STF1A5.. relate to records in ALONE?"  type: 3 <enter>.  Cursor
     now to the STF1AM.. index, because its description starts with sumlev by
     county.  {Press} <enter>.

 5.  The system asks us to enter a specific key for relating the two files, and
     it prompts us with the contents of the key expression for STF1AM..: 
     SUMLEV + CNTY + COUSUBFP + PLACEFP.  Since we want to constrain the summary
     level to give us only county data, we type "050" (including quotation
     marks) over SUMLEV and press <del> to get rid of the last character in
     SUMLEV.  Cursor right past CNTY then press <ctl>Y to delete to the end of
     the line.  {Press} <enter>.

 6.  {When the system brings up the item selection screen, put an X next to
     p0160001 and p0160002, then press} <esc>.

 7.  {At the main menu type:} 6 <enter> {to display the results.}

     On the line for the county total (sumlev=050), number of males and females
     living alone should be repeated, first from our extract file ALONE, second
     from the merged file.

     On subsequent records the data from the merged file will stay constant as
     long as the county does not change, while the data from the original
     extract file will vary from record to record.

 8.  Note the headings for new columns, STF1A5..->P0160001 and
     STF1A5..->P0160002 (where the abbreviation for the state you are working
     with shows in place of the ..).  If you want to create a user defined item
     that incorporates these statistics, you must be prepared to enter the
     references exactly that way.  For example, to obtain a percent distribution
     of females living alone within the county, with the county total as 100%,
     enter a user defined item with P0160002 as the numerator,
     STF1A5..->P0160002 as the denominator, and 100 as the scaling factor.

 9.  When you extract the results to a .dbf file, the system will prompt you to
     change the duplicate field names and descriptions, since you already have a
     P0160001 and a P0160002 in the data base.  Enter MALE1CO and "County total
     for males living alone" in the two prompt areas for P0160001, and FEMALE1CO
     and "County total for females living alone" for P0160002.

 OTHER APPLICATIONS FOR HORIZONTAL FILE MERGING

 Merging data from STF 1 and STF 3

 When STF 3A and 3C data become available, with much of the same geographic
 structure as STF 1A and 1C, users may want to display data from both sources
 together.  While many of the key STF 1 data are repeated on STF 3, other items
 can be obtained only from STF 1.  Further, the STF 1 numbers, being free of
 sampling variability, are more accurate than counterparts from STF 3 (although
 the difference is negligible for large areas).


                                                 34

 Unless you have a system that can address two CD-ROMs at once (not just a
 "jukebox"), you will need to extract data from one source to a dBASE file in
 your MY_FILES catalog before loading the CD-ROM for the other source.

 You will not be able to use LOGRECNU in specifying the relationship between the
 two data sets since the record numbering will not match exactly.  Therefore,
 your extract file will need to include as many codes as are necessary to
 satisfy the index keyed to the second source, e.g., SUMLEV, CNTY, COUSUBFP,
 PLACEFP, TRACTBNA, and BLCKGR.  

 The steps for specifying the merger are similar to those specified in the above
 example for inter-area comparisons, except that you would specify the matching
 relationship as SUMLEV + CNTY + COUSUBFP + PLACEFP + TRACTBNA + BLCKGR.  (The
 default prompt includes only the first four of those terms.)

 If you are working with data for Indian reservations or other areas not
 featured in key indexes, you will need to create a new index for linking
 purposes.  Since reindexing a whole STF 1A file can consume considerable time
 and hard disk space, it is best to extract data from both sources to your
 MY_FILES catalog, then merge based on an index you create at the time of file
 merger (option 2 at the relationship specifying screen).

 Merging data from the 1988 County and City Data Book CD-ROM

 One of the great frustrations of users of 1990 census CD-ROMs is that the discs
 include no comparative numbers from previous censuses.  At this writing, the
 only Census Bureau CD-ROM featuring any 1980 census data is the 1988 County and
 City Data Book CD, which includes a number of summary measures corresponding to
 STF 1 (population by race, Hispanic origin, households by type, housing counts,
 percent owner occupied, median value), and a few corresponding to STF 3
 (percent high school and college graduates, per capita income, median household
 income, percent below poverty).  There are no detailed distributions from the
 1980 census (the 9-category age distribution is based on 1984 estimates, not
 the 1980 census.)  These data are shown for States, counties, and cities of
 25,000 or more inhabitants, with the items spread across a number of small
 files.  One file shows 1980 population and 1979 per capita income for all
 places with 2,500 or more inhabitants.  

 County files are indexed on a 5-digit STCO code that combines the state and
 county code.  In specifying the relationship from an STF 1 extract, use STATEFP
 + CNTY.  City files are indexed on a 6-digit STPL code combining the state and
 Census place code (not the FIPS code).  In specifying the relationship from STF
 1, use STATEFP + PLACECE.  Unfortunately, the file with all places does not
 have an index usable for linking.

 In using 1980 census data from any source, you must take into account the
 possibility of changes in boundaries or coding between 1980 and 1990.  While
 county changes were very few (La Paz, AZ and Cibola, NM), many places changed
 boundaries through annexation or detachment, and codes may have changed as a
 result of mergers and new incorporations.

 Merging STF 1 data with economic data

 Data from STF 1 may also be merged with data from the 1987 economic or
 agriculture censuses or from the annual County Business Patterns CD-ROM.  The
 most common merger involves combining measures of retail sales with measures of

                                                 35

 population or income.  The notes above on comparability over time apply here as
 well, although the time difference is only 3 years, and no county changes
 occurred during that period.  The economic censuses recognize only incorporated
 places with more than 2,500 inhabitants, and only selected county subdivisions
 in New England, New Jersey, and Pennsylvania.

 State, county and place data will need to be merged in separate operations,
 since most of the key indexes involve the 2-digit RECTYPE codes (02 for states,
 04 for MSAs, 06 for counties, 07 for places), which serve the same function as
 but are not the same as SUMLEV codes.  To merge data with the RC87A3 retail
 trade file, you would make the merge using the index RC87C3 (by rectype, state,
 place, and county) and specify the relationship as follows:
     (for State data)    "02" + STATEFP
     (for county data)   "06" + STATEFP + "0000" + CNTY. 
     (for place data)    "07" + STATEFP + PLACECE
 Note that, while both the 4-digit census place codes and new 5-digit FIPS place
 codes are included in both the economic census and 1990 census files, the
 economic censuses employed the 4-digit code (corresponding to PLACECE) to
 control indexes, while the decennial census employed the 5-digit code (PLACEFP)
 for indexing.  For the merge to work PLACECE must have been included in the
 current file.

 CHOOSING WHERE TO START WITH LABELS AND FILE MERGERS

 In tutorial 4c, we found that one could add up to two sets of labels at a time,
 in effect linking 3 files at once.  In the exercises above, we found that we
 could use the horizontal merge feature to add even more in successive
 iterations.  While EXTRACT generally gives you considerable flexibility in the
 sequence with which these operations are performed, you may want to consider
 the following factors.

 -   All record selection variables need to be on the base file, that is, the
     file you load at startup or through Main Menu option 9.  If you want to
     select records on a code or data item not in every file (for example,
     congressional district code), use STF1A0 as your base file.  Where this is
     not immediately possible, as in selection based on a user-defined item
     (e.g., tracts more than 50 percent Black), you can always create an extract
     .DBF file.  Once you use it, the extract file becomes your base file,
     complete with any previously added labels or user-defined items.

 -   In specifying a User-Defined Item, referring to variables from files other
     than the base file can be awkward.  You have to use nomenclature like
     STF1A0MD->H0010001 and you can't use the colon as a shortcut in specifying
     the sum of a series of items.

 -   Items you add through Add Labels will appear first in your output, and in
     the sequence chosen.  If you want the area name to be at the left, make it
     the first label you select.  Items you add through horizontal file merger
     will appear after items in the base file, though before user-defined items
     being created now.

 -   There are fewer steps required to add labels than to merge files, and fewer
     things to go wrong, as long as you are working off the CD-ROM.  Once you
     extract data to a .DBF file and use that file, the linkage to label files,
     as specified in the data dictionary, may be broken.  (For example, the data
     dictionary assumes that the base file and the label files each incorporate

                                                 36

     the state abbreviation as characters 7 and 8 of the file name, e.g.,
     STF1A0VA.)  Therefore, in a multi-file extract, it is generally best to add
     what you can as labels (up to the two-file, 10-item limit) before trying a
     horizontal file merger. 

 -   If you need more than 10 items from a file other than the base, merging
     files horizontally can incorporate all of the items in a single pass, while
     adding labels cannot.  No more than three files can be linked at a time. 
     If you want to add items through file merger, you can add labels from only
     one source on the same pass.

 -   No dBASE file can store more than 128 items.

 PRACTICE PROBLEMS

 1.  Merge State, county or place data from STF1 with data from Economic
     Censuses CD-ROM 1D or 1E.  

     Hints:  You will need to start by extracting desired parts of one of the
     CD-ROMs to your MY_FILES catalog, then restart EXTRACT with the other CD-
     ROM in place.  

     Once you have selected an index on which to relate a particular economic
     census file, you will need to specify a relationship.  Here you must bridge
     across different nomenclature used between the two sources.  The following
     codes are directly comparable:  STATEFP = ST, CNTY = COUNTY, PLACEFP =
     PLACEFIP, PLACECE = PLACE, MSACMSA = MSA.  RECTYPE fills the same role for
     the economic censuses as SUMLEV plays 1990 census files, but the codes are
     different:  01 = U.S., 02 = State, 04 = MSA/CMSA/PMSA, 06 = county, 07 =
     place.  The relationship between files must be expressed in terms
     meaningful to the base file, not to the file being merged in.  Thus, if
     your base file contains summary data from the 1990 census, and you select a
     retail trade file indexed on RECTYPE + ST + COUNTY, you must specify the
     relationship as "06" + STATEFP + CNTY.

 End of April 1992 draft





















                                                 37

 =========================================

 Material to be added to future revisions:

     Merging files vertically, to combine data from multiple states or prior
     retrievals

     Resequencing records

 ===========================================

 Comments on this series of tutorials is welcome.  Call Paul Zeisset or Bob
 Marske at 301/763-1792, or write the Economic Census Staff, Bureau of the
 Census, Washington, D.C.  20233.











































                                                 38
