





                       U.S. Bureau of the Census Monograph       
                              Economic Census Staff
                             Washington, D.C.  20233

 USING EXTRACT WITH 1990 CENSUS EEO FILES  (EXTutor 5)            March 29, 1993

 ======================================================================
 CONTACT:        Paul Zeisset or Bob Marske                       (301) 763-1792


 ABSTRACT:       The following was designed as a self-directed tutorial in
                 EXTRACT usage. 

                 The tutorial draws all of its examples from the EEO CD-ROM, but
                 once the skills covered are learned, the user should be able to
                 apply them with other files.

                 This tutorial is in four parts:

                 5A--Basic skills:  simple item and record selection, adding
                 labels, display options, extracting data to a file

                 Parts 5B, 5C, and 5D deal with a problem of special relevance
                 to the EEO files:  the need to generate totals and subtotals
                 across a number of detailed records (for individual
                 occupations).  These exercises are designed for the user
                 already comfortable with basic data retrieval in EXTRACT.  The
                 presentation style is abbreviated, without specifying every
                 keystroke, unlike tutorial 5A.

                     5B--Generating a single total for each area
                     5C--Creating totals for each major occupation group in each
                     area
                     5D--Creating subtotals for user-defined groupings of
                     occupations in each area.

                 For more grounding in basic EXTRACT skills, see EXTutor 4,
                 which focuses its examples on STF 1A CD-ROMs.

 TIME REQUIRED:  about 30 minutes per part.

 =====================================================================

 Before you begin

 This tutorial assumes that you have installed EXTRACT 1.4b (issued March 1993)
 or later in the \EXTRACT subdirectory, that you have installed EEO auxiliary
 files (from both EEOAUXIL.exe and STFAUX.exe) in \EXTRACT\1990AUX, that you
 have created a separate subdirectory for temporary files, \EXTRACT\WORK, and
 that you have made the \EXTRACT directory the default.  

     Instructions for keyboard entry are italicized in printed versions of this
 tutorial, or are shown enclosed in {braces} when distributed as an ASCII text
 file.  User keyboard entries are shown in all capital letters, and special
 characters are enclosed in <>, for example, <esc> for the escape key.



 Tutorial 5A--

 1.  {Type:} EXTRACT <enter>, {then press} <enter> {again after viewing the
     opening screen.}

 2.  DRIVE SELECTION

     The system prompts you for the drive designation for your CD-ROM, and for a
     location where it can store temporary files.

     {Enter the appropriate drive letter for your CD-ROM, e.g., type:} L:
     <enter>, {or whatever is appropriate for your system.}

     EXTRACT then asks for two locations on your hard disk--a directory for
     workspace and a directory for auxiliary files.  

     {Type:} C:\EXTRACT\WORK <enter> {for work space--assuming c:\extract\work
     already exists.}

     {Type:} C:\EXTRACT\1990AUX <enter> {for auxiliary files.}

     We have the option to save these parameters for use next time.  {Type:} S.

     At this point the program presents you with a choice of catalog files. 
     {Place the cursor on MSTREEO and press} <enter>. 

     The system asks you to specify a name for these parameters.  When we work
     with CD-ROMs other than EEO we use different parameters, and save them
     under different names.

     {When prompted with EXEEO, press:} <enter> {to accept, or first change the
     batch file name to whatever you want.}

 3.  SELECT A CATALOG

     The system then presents you with a menu of all of the types of files--or
     "catalogs"--you have to choose from on this disc.

     There are five types of files on EEO CDs, each covering a different table
     or group of tables.  You may select only one as a starting point.  

     File       Subjects covered 
  
     SP3EEO0_   List of geographic areas covered
     SP3EEO1_   Detailed Occupation by Sex, Hispanic Origin and Race
     SP3EEO2_   Educational Attainment by Sex
     SP3EEO3_   Educational Attainment by Age by Sex by Hispanic Origin and Race
     SP3EEO4_   Educational Attainment by Age by Sex by Race

     {With the cursor (highlighted bar) on SP3EEO1_, press} <enter>.







 4.  HELP SCREEN

     The help screen associated with this catalog of files appears.  It has two
     parts:

     a.  A general description of the file, including some notes on the sequence
         of the data.

     b.  A brief description of how three of the options on the main menu that
         follows apply to this particular file.

     You may pull up this help screen at any time from the main menu by pressing
     <F1>.

     {Press} <enter> {to continue.}

 5.  SELECT A FILE

     Now the system presents a list of all of the files within the SP3EEO1_
     catalog--which correspond to all of the states on this disc.  {Move the
     highlighted bar (press the down arrow) down to the state of your choice,
     and press} <enter>.

     At this point, the program takes a few moments to read in the file's data
     dictionary and get itself set up, and (after about 10 seconds) displays the
     main menu.

 5.  MAIN MENU

     The main menu is the control center from which you will operate the rest of
     the program.

     At the lower right is the name of the file you are working with.

     The main menu lists a series of options.  Most sessions involve selecting
     items and records as well as adding labels (options 1, 2 and 3), but let's
     jump ahead to Display the file to the screen, as is.

     {Type:} 6 <enter>

 6.  DISPLAY TO SCREEN

     You are now looking at a large "spreadsheet".  You can scroll down and to
     the right to see more data.

     Features:

         At the top are cryptic variable names.  In EEO files most of these
         variable names are not really self-explanatory.

         At the bottom is a more complete description of the one item that the
         cursor is highlighting.  

         You can scroll the cursor from side to side to see different
         descriptions.  {Press right arrow key repeatedly.}







     What we see here is far from self-explanatory.  One can, however, discern
     that there is a different OCC_CODE identifying each line.  The statistics
     we want start with ITEM1 and continue to the right.  In the following steps
     we will reduce the number of columns and add labels to identify the
     different occupations.

     {Press} <esc> {to return to the main menu.}

 7.  SELECT ITEMS

     {Type:} 1 <enter>

     In order to select items, the program lists variable names, the ones we saw
     as column headings in the first columnar display screen, along with the
     more complete descriptors we saw at the bottom as we cursored from side to
     side.

     We can select the variables we want -- by marking each with an X.

     For reasons that will be evident later, SUMLEV is very important.

     {Mark an} X {by SUMLEV, CNTY, OCC_CODE, ITEM1 (males), ITEM2 (females),
     ITEM16 (black males) and ITEM21 (black females)}

     {Press} <esc> {to return to the main menu.}

 8.  ADD LABELS

     {Type:} 3 <enter>

     This is a list of all of the identifiers on the file that have names
     associated with them.  Most of them are geographic, but we will pick
     occupation titles.

     {Cursor to OCC_CODE, and press:} <enter>

     The next screen prompts you to select the label TEXT.  (Some other label
     files have more than one type of label to choose from at this point.) 
     {Press:} <enter>

 9.  DISPLAY TO SCREEN

     {Type:}  6 <enter>

     Initially, we see only the wide column for the occupation titles just
     selected and two codes.  {Move the cursor right to see that the selected
     data items are in fact present.}















 10. ADJUSTING COLUMN WIDTH 
  
     In columnar mode, we can adjust the display width of each column.

     {Move the cursor to the first column, identified as OCC_CODE->TEXT.}

     {Type:} W {for <W>idth}

     {Enter} 29 {as the new width.  Press} <enter>

     Just because you have narrowed a text column doesn't mean you can't look at
     the hidden text when you need to.  

     {Move the cursor to highlight an occupation with a truncated title, and
     type} S {for <S>how.}  A box appears that displays the entire text,
     temporarily covering over the text of adjacent lines.  {To return to normal
     display, move the cursor off of this cell, in any direction.}

 11. SELECTING DATA FOR A COUNTY

     The data displayed here at the beginning of the file relate to the state
     total.  Let's look at data for a particular county.

     {Press} <esc> {to return to the main menu.}

     At the main menu, nothing comes right out and tells us how to select data
     for a particular county.  If we are not sure how to proceed, we can use the
     help function.

     {Press the <F1> key for help.}

     This is the same screen we saw after selecting a catalog.  The bottom of
     the help screen features comments about how three EXTRACT options relate to
     this particular type of file.  It tells us that while Select Items lets us
     specify particular data variables, Select Records allows us to specify
     particular geographic areas or occupations.

     In terms of the data display we used a moment ago, selecting items is a
     matter of selecting columns, and selecting records is a matter of selecting
     rows.  The main menu does not use columns and rows terminology, though,
     because there is another display mode that turns columns into rows.

     Since we want to select a particular geographic area, we want to Select
     Records.
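
     The columns-and-rows distinction can be sketched in a few lines of
     Python.  This is only an illustration, not a description of how EXTRACT
     works internally:  the records are made-up values, and select_records and
     select_items are hypothetical helper names.

```python
# Made-up records for illustration; field names follow the tutorial.
records = [
    {"SUMLEV": "040", "CNTY": "",    "OCC_CODE": "110", "ITEM1": 40, "ITEM2": 160},
    {"SUMLEV": "050", "CNTY": "001", "OCC_CODE": "110", "ITEM1": 3,  "ITEM2": 12},
    {"SUMLEV": "050", "CNTY": "003", "OCC_CODE": "110", "ITEM1": 5,  "ITEM2": 20},
]

def select_records(rows, **criteria):
    """Row selection: keep the rows whose fields match every criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]

def select_items(rows, names):
    """Column selection: keep only the named fields of each row."""
    return [{n: r[n] for n in names} for r in rows]

county_rows = select_records(records, SUMLEV="050", CNTY="003")
print(select_items(county_rows, ["CNTY", "OCC_CODE", "ITEM1", "ITEM2"]))
```

     Selecting records narrows the rows, selecting items narrows the columns,
     and the two choices are independent of each other.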

     {Press any key to return to the main menu.}

 12. SELECT RECORDS

     {Type:} 2 <enter>

     The first record selection screen looks a lot like the item selection
     screen we used before.  That is because we can use any variable in the file
     to govern the record selection process.  Some selection variables work
     better than others, though, and we most often try to use variables marked
     with an asterisk (*).







     We want to select records for a particular county.  {Cursor to CNTY and type:}  
     S.  Note that we use S to select variables at this screen rather than the X
     called for when we were selecting items and could mark lots of variables.

     {Press} <esc> {to continue.}

     Now a menu appears listing all counties, starting with the current state.  
     {Cursor down, or press <PgDn> until you can highlight the county you want. 
     Don't go too far, though, since the list includes counties for all states. 
     With the cursor on the desired county, type:}  X.  {Press} <esc> {to
     continue.}

 13. SPEED-UP SCREEN

     Here is something you may not have expected.  The system tells us that it
     can speed up retrieval if we pick one of the categories of SUMLEV.
   
     {Cursor to 050--State-County, and type:}  X.

         You may be puzzled as to why the system asked that extra question,
         particularly when the answer seems obvious.  In fact, county codes are
         attached also to county subdivision records (in those states where
         present) as well as to county records, so your answer does make a
         difference.

         EXTRACT makes use of existing "indexes" on the CD-ROM whenever it can
         to speed the process of finding specific data.  In this case it found
         that in order to use a particular index, it had to know both the county
         and the summary level, so it prompted us for the missing piece of
         information.

         If you happen to select a SUMLEV that is not valid in combination with
         the rest of your selection criteria, then the program will not find any
         of the data you want.  For example, if you specified a particular
         county code and a SUMLEV of 168 (State-Place--a summary level with no
         county codes), EXTRACT will give you a message that it can find no
         records meeting those conditions.

         EXTRACT uses the information supplied "to speed up retrieval" only for
         the purpose of finding the first eligible record.  Thus, the index is
         "turned off" as soon as the first eligible record is found.

     {Press} <esc>.

     The system now spends a little time searching for the first qualifying
     record, then returns you to the main menu.
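
     The retrieval behavior described in this step can be pictured with a
     small Python sketch.  It rests on assumptions:  the records are made up,
     the INDEX dictionary is a hypothetical stand-in for the indexes on the
     CD-ROM, and the sequential scan after the first match is our reading of
     the description above rather than documented EXTRACT internals.

```python
# Made-up records in file order: (SUMLEV, CNTY, OCC_CODE).
FILE = [
    ("040", "",    "001"),
    ("040", "",    "002"),
    ("050", "001", "001"),
    ("050", "001", "002"),
    ("050", "003", "001"),
    ("050", "003", "002"),
    ("060", "003", "001"),   # county subdivision that also carries county 003
]

# Hypothetical index: (SUMLEV, CNTY) -> position of the first matching record.
INDEX = {}
for pos, (sumlev, cnty, _occ) in enumerate(FILE):
    INDEX.setdefault((sumlev, cnty), pos)

def retrieve(sumlev, cnty):
    """Jump to the first eligible record, then test each later record in turn."""
    start = INDEX.get((sumlev, cnty))
    if start is None:
        return []            # no records meet those conditions
    # From here on the index is not consulted; records are filtered one by one.
    return [rec for rec in FILE[start:] if rec[0] == sumlev and rec[1] == cnty]

print(retrieve("050", "003"))
print(retrieve("168", "003"))
```

     The second call pairs a county code with a summary level that has no
     county codes, so nothing is found, which mirrors the message described
     above.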

 14. {Type} 6 <enter> {to display to screen.}

     Because the system has to look for just certain classes of records, it
     requires more time to fill the screen.  But in a few moments, we have
     statistics for occupations in the county we selected.

     {Press} <PgDn> {a number of times.}  These data continue through about 30
     screens, since there are 500 occupation categories in all.







     {After examining the data, press} <esc> {to return to the main menu.}

 15. SELECT RECORDS FOR AN OCCUPATION IN ALL COUNTIES                           
 
     Let's try out another way of selecting records.  Suppose we want to display
     just one occupation category--librarians--across all counties in the state.

     {Type:}  2 <enter> {to select records.}

     To select a particular occupation, we will obviously use the OCC_CODE
     variable.  To select all counties is not quite as obvious.  Using CNTY
     might be tempting, but we previously used that to select a particular
     county, not all counties in general.  We need to use the SUMLEV variable to
     identify a particular type of geographic area in general.

     {Mark an} S {next to SUMLEV and another} S {next to OCC_CODE, and press}
     <esc>.

     {Mark an} X {by SUMLEV code 050 for all counties, and press} <esc>

 16. MOVING THROUGH A LONG CODE LIST WITH <W>ORD SEARCH

     The system next brings up the 500-line long list of occupation codes.  We
     have already used <PgUp> and <PgDn> to move through the long list of data
     items or codes.  That may be fine for getting to know the file, but there
     are more efficient ways of getting through the list if we know what we are
     looking for.  One way we can find out about these shortcuts is to press the
     help key <F1> or <H>.

     {Press} <F1>.

         Help Screen

         EXTRACT's help system is context sensitive, that is, it gives you help
         specific to where you are in the program.

         This screen gives us three suggestions for how to move quickly around
         the list:

         J   will let you <J>ump to a specific code, if you already know it. 
             That is not particularly likely since these occupation codes are
             unique to this file.  The codes used are not the Standard
             Occupation Classification (SOC) codes that appear in parentheses at
             the end of each description.

         L   is for <L>ocate, and it finds a particular character string at the
             beginning of the item description.  

         W   is for <W>ord search, which looks anywhere in the description for
             the character string you enter.

         {Press} <esc> {to return to the select-records menu.}

     Since we don't know how the description for librarians is worded, let's use
     the <W>ord search option.  {Press} W {and type:}  libr<enter>.  You do not
     have to type the whole word, only a string of characters you think will be
     unique enough to get the descriptions you want.  The system narrows the
     list to--

         110 Librarians
         201 Library clerks
         398 Adjusters and calibrators

     <W>ord search has found "libr" in the middle of a word as well as where we
     expected it.  The fact that an irrelevant entry is displayed is of no
     consequence.  We can simply select the entry for librarians and ignore the
     rest.
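
     The difference between <L>ocate and <W>ord search comes down to where the
     character string may match.  A minimal Python sketch, using the three
     descriptions shown above, illustrates why "libr" also picks up
     calibrators.  The function names are hypothetical, and ignoring
     upper/lower case is an assumption about how EXTRACT compares strings.

```python
# The three entries are the ones listed in the tutorial.
OCCUPATIONS = {
    "110": "Librarians",
    "201": "Library clerks",
    "398": "Adjusters and calibrators",
}

def locate(text):
    """Match the string only at the beginning of the description."""
    t = text.lower()
    return [code for code, desc in OCCUPATIONS.items() if desc.lower().startswith(t)]

def word_search(text):
    """Match the string anywhere in the description."""
    t = text.lower()
    return [code for code, desc in OCCUPATIONS.items() if t in desc.lower()]

print(word_search("libr"))   # includes 398 because of "calibrators"
print(locate("libr"))
```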

     {Place an X next to 110--Librarians, and press} <esc>.

     {Type:} 6 <enter> {to display to screen.}

     At this point it is apparent that labels identifying occupations are no
     longer particularly useful.  Therefore, we should ask for a different set
     of labels.

     {Press} <esc> {to return to the main menu.}

 17. CHANGING LABELS

     {Type:} 3 <enter> {to add a different set of labels.}

     A flashing prompt tells us to press S to save the previous set of labels--
     occupation titles.  We do not need occupation titles on every line, so we
     will ignore the prompt.

     {Cursor to CNTY and press} <enter>, {and in a moment, press} <enter> {again
     to select the particular label.}

     {Type:} 6 <enter> {to display the data to the screen, then press} <esc> {to
     return to the main menu when ready.}

 18. USER-DEFINED ITEMS 
  
     One of the ways EXTRACT can augment your displays is by computing
     "User-Defined Items", such as totals or percents.  Simple computations are
     possible where the inputs to the calculation are all on the same line in a
     columnar display.  (For "column" totals across records, see tutorials 5B
     through 5D.)  This feature is accessed through the Select Items screen.

     {Type:}  1 <enter> {to select items.}

     {Type:}  U {for <U>ser-defined item.}
  
     You can create two types of user-defined items--Ratios and Freeform
     expressions. 
  
     Many of the totals users want from the EEO data base have not been included
     on the files, such as the total number of Black persons in an occupation. 
     We can add the number of Black males and Black females together as a
     "freeform expression".  You will need to enter variable names exactly as
     they appear in the database.  If you do not know the exact names, you may
     return to the regular Select Items mode {(Press <esc> twice) and cursor
     down until you see the items you want (ITEM16 and ITEM21), then Type:} U
     {again to return to the <U>ser-defined items mode, and type} 1 {to select
     the first item}.
  
     To compute total Blacks in the civilian labor force-- 

         {Press} <enter> {three times to get to the freeform expression line.}

         {Type:}  ITEM16+ITEM21 <enter> 

     Be sure to leave the numerator, denominator, and scaling factor fields
     blank if you are entering a freeform expression.

     Once past the basic specification, the system asks how to present the new
     item.

         {Type:}  7 <enter> {for the field length;}

         {Type:}  0 <enter> {for the number of decimals;}

         {Type:}  Black <enter> {for the field name; and} 

         {Type:}  Blacks in civilian labor force {for the title.}

     In this screen, the <enter> key moved you from one field to the next.  If
     you need to back up, use the <up arrow> key.  To skip ahead, use the
     <down arrow> key.  If <up arrow> doesn't let you get to the fields you
     want, for example, if you need to respecify the freeform expression, try
     <PgUp>.

     {Press} <enter> {after completing the title.}
  
 19. EXTRACT will let you enter up to 10 user-defined items, either now or
     later.  Let's say we also want the percent Black in the civilian labor
     force.  {Press} 2 {to define a second item.} 

         {Type:}  Black <enter> {for the numerator, since the system will allow
         us to make use of previous user-defined items;}

         {Type:}  ITEM1+ITEM2 <enter> {for the denominator;}

         {Type:}  100 <enter> {for the scaling factor, since you normally
         multiply a proportion by 100 to get a percent;}

         {Press} <enter> {to get past the Free-form Expression field;}

         {Type:}  5 <enter> {for the field length;}

         {Type:}  1 <enter> {for the number of decimals;}

         {Type:}  PCTBLACK <enter> {for the field name; and}

         {Type:}  Percent Black of total civilian labor force {for the title.}







     {Press} <enter> {after completing the title, and} <esc> {to return to the
     regular select items menu.}
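
     The arithmetic behind the two user-defined items just entered can be
     checked with a short Python sketch.  The counts are made up, the helper
     names are hypothetical, and returning no value for a zero denominator is
     an assumption, since the tutorial does not say what EXTRACT does in that
     case.

```python
def ratio(numerator, denominator, scale):
    """Ratio item: numerator divided by denominator, times the scaling factor."""
    if denominator == 0:
        return None          # assumption: undefined when the denominator is zero
    return numerator * scale / denominator

def show(value, length, decimals):
    """Right-justify a number in a field of the given length and decimals."""
    return f"{value:{length}.{decimals}f}"

# Made-up counts for one record.
rec = {"ITEM1": 200, "ITEM2": 100, "ITEM16": 12, "ITEM21": 18}

# Freeform expression ITEM16+ITEM21, named BLACK.
rec["BLACK"] = rec["ITEM16"] + rec["ITEM21"]

# Ratio BLACK / (ITEM1+ITEM2) with a scaling factor of 100, named PCTBLACK.
rec["PCTBLACK"] = ratio(rec["BLACK"], rec["ITEM1"] + rec["ITEM2"], 100)

# Field length 7 with 0 decimals, and field length 5 with 1 decimal.
print(show(rec["BLACK"], 7, 0) + "|" + show(rec["PCTBLACK"], 5, 1))
```

     Because the second item refers to BLACK by name, it depends on the first
     item having been defined, just as in the entry screens above.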
  
 20. PREVIEW MODE 

     At this point you may wonder whether all of these new items are going to
     fit on the screen.  We could go back to the main menu and display a full
     screen of data, but EXTRACT gives us a shortcut from within the Select
     Items menu.
  
     {Type:}  P {for <P>review mode.}

     In the preview mode you can cursor to the right to see more fields, just as
     in the full screen display mode.  

     You can even cursor down to see the next record(s) in the database.  

     Since we do not see our new computations on the screen, we may want to
     reduce the width of the county name and eliminate some of the data fields. 
     {With the cursor on the county name, press} W {and type: } 15 <enter>.  Now
     it takes only a moment to adjust the width of any or all columns with the W
     option.  The same <W>idth option exercised from the regular display screen
     may take considerably longer, as the system repaints the full 17-line data
     screen with every width change.

     That still doesn't show us our computations, so let us remove several
     items.  
  
     {Type:}  R {to <R>eselect data items.}

     Mark X by any additional items you want to include in the display, or press
     <space> to un-select items, for example, SUMLEV, ITEM16, and ITEM21.  

     {Press} P {to update the Preview screen.}

     {Press} <esc> {to return to the main menu.}


     {Type:} 6 <enter> {to display to screen.}

     {Press} <esc> {to return to the main menu, when you are ready.}

 21. FORMAT OPTIONS
 
     Our next step is to prepare these data for output.  The program gives us a
     few format options.

     {Type:} 5 <enter>

     One of the options is to specify your own heading.  The program defaults
     start us off with a heading taken from the data dictionary, but by the time
     we have gone through item and record selection, a much more specific
     heading may be appropriate.









     {Type:} 4 {(second level heading) then enter an appropriate heading, for
     example,}  
     Librarians in [state] by county, by sex by race...3/29/93 JDoe  <enter>

     It is frequently handy to add today's date or your initials to the heading,
     particularly if you are going to save your output for future use.

     {Press} <esc> {to return to the main menu.}

 22. EXTRACTING THE DATA TO A FILE

     At this point, we could print out our results with option 7.  Instead,
     let's save our work for further manipulation.  

     {Type:} 8 <enter>

     EXTRACT allows us to copy our extracted data set into any of four formats:

     1 - dbf We can create another dBASE file, just smaller than the original.

     2 - prn A comma-delimited file is the kind you want for importing to Lotus
             1-2-3.

     3 - sdf A fixed-format file looks more like a columnar report.

     4 - txt A print file, with the same formatting options as printed output.  
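
     To make the difference between the comma-delimited and fixed-format
     options concrete, here is a rough Python sketch that writes the same two
     made-up rows both ways.  The column widths and the quoting of text fields
     are assumptions chosen for illustration, not a specification of the files
     EXTRACT writes.

```python
import csv
import io

# Made-up rows: county label, BLACK, PCTBLACK.
rows = [("County A", 30, 10.0), ("County B", 125, 7.5)]

def comma_delimited(rows):
    """Comma-delimited text with quoted labels, a common form for spreadsheet import."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

def fixed_format(rows):
    """Fixed-width columns: 15 for the label, 7 for the count, 5 for the percent."""
    return "".join(f"{name:<15}{count:>7d}{pct:>5.1f}\n" for name, count, pct in rows)

print(comma_delimited(rows))
print(fixed_format(rows))
```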

     In addition, there is a Dry Run option that counts up all of the records
     selected and projects the size of the output file without actually doing
     the extraction.  

     {Type:}  5 <enter> {to execute a dry run.}  Doing a dry run doesn't save
     any time, unless you find out that the file you were about to create would
     have been too large for your hard disk or floppy, leading you to reduce the
     number of items or make the selection of records more narrow.  A dry run
     does give you a count of records, and in this case the system gives us the
     number of counties.
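
     The kind of projection a dry run makes can be approximated by hand.  The
     Python sketch below assumes the common dBASE III file layout (a 32-byte
     header, a 32-byte descriptor per field, and a one-byte deletion flag on
     each record); the field widths and record count are hypothetical, and
     EXTRACT's own estimate may be computed differently.

```python
def dbf_size_estimate(field_widths, record_count):
    """Rough size in bytes of a dBASE III style .DBF file."""
    header = 32 + 32 * len(field_widths) + 1   # header, field descriptors, terminator
    record_length = 1 + sum(field_widths)      # deletion flag plus the fields
    return header + record_count * record_length + 1   # plus end-of-file marker

# Hypothetical layout: label 15, ITEM1 7, ITEM2 7, BLACK 7, PCTBLACK 5.
# Hypothetical record count: 24 counties.
size = dbf_size_estimate([15, 7, 7, 7, 5], 24)
print(size, "bytes")
```

     Since the record count multiplies the record length, selecting fewer
     items or fewer records shrinks the output roughly in proportion.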

     Now, let's go ahead and extract the data for the records already selected
     into a file for further use in EXTRACT, which requires a dBASE format file.

     {Type:} 1 <enter>

     The system prompts us to specify a name for our output file and a drive and
     directory location on our hard disk or floppy.

     {Enter a file name, without any extension, e.g.,} LIBRARIA.  If a drive and
     directory have not been specified, this file is automatically saved to the
     work directory, and becomes C:\EXTRACT\WORK\LIBRARIA.DBF.  If you want to
     save this file to a floppy disk or other directory, enter the drive and
     directory explicitly, e.g., A:\LIBRARIA.

     The system prompts us for a description for our "My_Files" catalog.  You
     can accept the description listed, you can edit it by cursoring to
     something you want to change, or you can retype the line.







     {Accept the description as is, edit it, or replace it, then press} <enter>.

     The system will work for a while, first extracting the appropriate records,
     and then in a second pass adding the appropriate labels.

 23. USING THE NEWLY CREATED FILE

     When the system is finished, it will give you the opportunity to go
     directly to the newly created file.  {Press} 2.

     A few moments are required to read in the new file's data dictionary, then
     you return to the main menu.

     {Type:} 6 <enter> {to display to the screen.}

     At this point the display has the same data items we had selected before,
     except that the widths of wide columns go back to their defaults.  This
     time there was no waiting while EXTRACT built up the screen--the system is
     not having to slow down as it filters out records that do not qualify, and
     it is working off a fast hard disk rather than a slow CD-ROM.

     {Press} <esc>.

 24. {Type:} 1 <enter> {to bring up the Select Items screen.}

     The system has created a customized data dictionary for our new database
     file.  Here we can see that there are fewer items to choose from.

     {Press} <esc> {to leave the item list unchanged and return to the main
     menu.}

     But if we decided at this point that we wanted more data rather than less,
     such as more race categories or the FIPS county code, we would have to
     reselect the original file, and repeat most of our original steps.  

 25. {If you are finished, type:} Q {to quit.}

     When you wish to access the LIBRARIA.DBF file again, you will select it
     from the MY_FILES catalog, which appears at the bottom of the first file-
     selection screen.























 Tutorial 5B:  Creating Totals from the EEO Files

 This exercise is for the experienced EXTRACT user, and instructions are
 abbreviated for those steps, like selecting items, with which you should
 already be familiar.

 The EEO files contain data for detailed occupations, but include no totals
 across all occupations, much less totals for user-defined groups of
 occupations.

 This exercise creates a single total for each area in a particular file, i.e.,
 one state at a time.  That could be a total across all occupations, for the
 entire civilian labor force, or it could be a total for a group of occupations
 that you select.  (If you want totals for multiple occupation groups for each
 area, see Tutorials 5C and 5D.)

 This is not a fast process.  A test run for Maryland (3.8 Mb) took about an
 hour on an 8 MHz 286 machine.  It generated a file of totals only about 1/500
 of the size of the original file.  If you simply want totals for the civilian
 labor force and have STF 3A, or better yet, STF 3C, creating user-defined items
 from the STF 3 records is much faster.  (STF 3 tables P70, P71, and P72 can give
 you civilian labor force totals, i.e., civilian employed plus civilian
 unemployed, for race by sex and Hispanic origin by sex.  STF 3 does not have
 race for nonHispanics as provided in the EEO file.)  If you want to total a
 group of occupations, or do not have STF 3 CD-ROMs, this is your only option.

 Creating a total for each area on the file:

 1.  Load data for a particular state

 2.  Select ALL items except OCC_CODE and excluding any geographic codes you
     know you will not need, e.g. COUSUBFP in a state other than the 12 states
     for which that code is provided.  You must include the STATEFP code if you
     plan to add geographic labels later.

     Do not add labels or create user-defined items--they will not be carried in
     the totalling process.

 3.  Select records

     a.  If you want totals for all areas and across all occupations, do not
         select records at all.

     b.  If you want just state or place level totals, select on SUMLEV.  (If
         you want county totals, it is probably faster to sum everything and
         just ignore places on the output.)

     c.  If you want to select a certain set of occupations, select records on
         the OCC_CODE-- but then you must turn off the index that sequences
         output by occupation, as noted below under 4.a.

 4.  At the main menu, select Manipulate files

     a.  If you have selected only certain occupations, discussed in 3c above,
         press 1 for Select an Existing Index, then press <esc> to clear the
         current index.







     b.  Press 5 to Create Totals

         1)  Enter a totalling key--
             CNTY if you only want state and county totals
             CNTY+PLACEFP if you want state, county and place totals
             CNTY+COUSUBFP+PLACEFP to get state, county, county subdivision and
             place totals

         2)  Enter a filename, e.g., EEOTOTMD.  If you want the output somewhere
             other than your \EXTRACT\WORK directory, enter the drive and
             directory along with the filename.

         3)  Enter a description that will help you remember what you have done,
             for example
             Totals for civilian labor force (EEO) for Maryland 3/1/93

         4)  Be prepared to wait.

 5.  When done, select option 2 to load the newly created file of totals.

     a.  When you get to the main menu, first display the data to see what you
         have.

     b.  Now you may select whatever items you want, and add labels to identify
         counties.

     c.  Save this output with Extract Data to a File or Print.  If you create a
         .DBF file, that output can be merged in with detailed EEO data for
         computing each occupation as a percent of total civilian labor force.
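
 What the totalling key does can be sketched in Python:  every detail record
 is assigned to the combination of key fields it carries, and the data items
 are summed within each combination.  The records are made up and
 create_totals is a hypothetical name; this shows the idea, not EXTRACT's
 actual procedure.

```python
from collections import defaultdict

# Made-up detail records: one per occupation per area.
detail = [
    {"CNTY": "001", "PLACEFP": "",      "ITEM1": 3, "ITEM2": 12},
    {"CNTY": "001", "PLACEFP": "",      "ITEM1": 7, "ITEM2": 1},
    {"CNTY": "",    "PLACEFP": "02000", "ITEM1": 2, "ITEM2": 4},
    {"CNTY": "",    "PLACEFP": "02000", "ITEM1": 1, "ITEM2": 1},
]

def create_totals(rows, key_fields, item_fields):
    """Sum item_fields over all rows that share the same totalling key."""
    totals = defaultdict(lambda: dict.fromkeys(item_fields, 0))
    for row in rows:
        key = tuple(row[f] for f in key_fields)   # e.g. CNTY+PLACEFP
        for item in item_fields:
            totals[key][item] += row[item]
    return dict(totals)

print(create_totals(detail, ["CNTY", "PLACEFP"], ["ITEM1", "ITEM2"]))
```

 Each distinct key value yields one output record, which is why the file of
 totals is so much smaller than the detailed file.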
































 Tutorial 5C:  Create subtotals for major occupation groups.

 This exercise creates a set of subtotals for each area corresponding to the 13
 major occupation categories shown on STF 3.  The code GROUP is in the same file
 as the occupation titles, so it can be added as a label prior to totalling.
 EXTRACT cannot generate totals based on a label outside the current file, so
 you must first create an intermediate file that merges the GROUP code in with
 the other data you want.
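
 The two-stage approach can be sketched in Python:  first attach a GROUP code
 to every detailed record, then sum by area and group.  The GROUP values and
 counts below are invented for illustration, and attach_group and subtotal
 are hypothetical names rather than EXTRACT functions.

```python
from collections import defaultdict

# Hypothetical label table: OCC_CODE -> GROUP (codes invented for illustration).
GROUP_OF = {"110": "02", "201": "05", "202": "05"}

# Made-up detail records.
detail = [
    {"CNTY": "001", "OCC_CODE": "110", "ITEM1": 3, "ITEM2": 12},
    {"CNTY": "001", "OCC_CODE": "201", "ITEM1": 1, "ITEM2": 9},
    {"CNTY": "001", "OCC_CODE": "202", "ITEM1": 2, "ITEM2": 6},
]

def attach_group(rows):
    """Stage 1: build the intermediate file, with GROUP in place of OCC_CODE."""
    out = []
    for row in rows:
        merged = {k: v for k, v in row.items() if k != "OCC_CODE"}
        merged["GROUP"] = GROUP_OF[row["OCC_CODE"]]
        out.append(merged)
    return out

def subtotal(rows, key_fields, item_fields):
    """Stage 2: sum the items for each combination of the key fields."""
    sums = defaultdict(lambda: dict.fromkeys(item_fields, 0))
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        for item in item_fields:
            sums[key][item] += row[item]
    return dict(sums)

intermediate = attach_group(detail)
print(subtotal(intermediate, ["CNTY", "GROUP"], ["ITEM1", "ITEM2"]))
```

 The intermediate step matters because the totalling key can only use fields
 that are physically present in the file being summed.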

 1.  Create an intermediate extract data base that includes the GROUP code along
     with all the detailed records you want to sum.

     a.  Select only the items you need.  Do not include OCC_CODE, since
         totalling across occupations will make it irrelevant.  Do include the
         STATEFP code and any other geographic codes (e.g., CNTY) you may need
         to distinguish the output.  The fewer data items you select, the
         smaller the intermediate data base will be.  You may even want to
         create user-defined items to combine detail you do not need.  For
         example, if you need data only by Hispanic origin and race, and not
         separately for male and female, create a series of user-defined items
         that sum ITEM1 and ITEM2 (total), 3 and 7 (Hispanic), 4 and 8 (White,
         non-Hispanic), etc.

     b.  Add the GROUP code as a label

         1)  From the main menu, enter 3 to Add labels

         2)  Cursor to OCC_CODE and press <enter>

         3)  When only TEXT shows as an option, press A to show all fields

         4)  Highlight GROUP and press <enter>

     c.  Select records to include only those summary levels you need.

     d.  Display the data to the screen to see what you have--records for each
         detailed occupation with the GROUP code attached to each one.

     e.  Create a .DBF extract file of these data.

         1)  From the main menu, enter 8 to Extract a Data File

         2)  You may well want to do a dry run (option 5) to make sure that the
             file will fit in the hard disk space you have available.

         3)  Select option 1 to create a .DBF file

         4)  Enter a filename, and then a description that befits the temporary
             use of this file.

         5)  When complete, take option 2 to load the new data file into
             EXTRACT.

 2.  Create totals or subtotals

     a.  From the main menu, enter 4--Manipulate files

     b.  Enter 5--Create totals or subtotals

     c.  Enter the key expression PLACEFP+CNTY+GROUP.  (You want to generate a
         new summary for each combination of geography and group.  If you have
         selected records only for counties, the expression would be CNTY+GROUP;
         if you include both county subdivisions and places, you will need
         CNTY+COUSUBFP+PLACEFP+GROUP.)

     d.  Specify a filename

     e.  Enter a description for your MY_FILES catalog

     f.  When the totalling is complete, select option 2 to load this new data
         set of totals for viewing or further use.  In the future, you may
         access these data through your MY_FILES catalog.

     g.  Note that this process gives you 13 subtotals for each area, but not a
         grand total for the entire civilian labor force.  Creating grand
         totals and merging them into this data base requires several extra
         steps:  select all items except the GROUP code, total on CNTY or other
         geographic codes, then use two other Manipulate Files functions, first
         to merge the new records in "vertically," then to sort the combined
         data by geography plus occupation group code.  Give it a try!
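
 The logic of step 2, including the grand-total exercise suggested in step
 2.g., can be sketched in Python as follows.  This is an illustration under
 stated assumptions, not EXTRACT's own procedure:  the helper total_on is
 hypothetical, the data are made up, and giving grand-total records a blank
 GROUP code (so that they sort ahead of the group subtotals) is just one
 possible convention.

```python
from collections import defaultdict

ITEMS = ["ITEM1", "ITEM2"]

# Made-up intermediate records: GROUP already attached, OCC_CODE dropped.
rows = [
    {"CNTY": "001", "GROUP": "01", "ITEM1": 10, "ITEM2": 5},
    {"CNTY": "001", "GROUP": "01", "ITEM1": 20, "ITEM2": 15},
    {"CNTY": "001", "GROUP": "02", "ITEM1": 7, "ITEM2": 3},
    {"CNTY": "003", "GROUP": "01", "ITEM1": 4, "ITEM2": 6},
    {"CNTY": "003", "GROUP": "02", "ITEM1": 1, "ITEM2": 2},
]


def total_on(rows, key_fields, items):
    """Return one summed record per distinct combination of key fields."""
    sums = defaultdict(lambda: dict.fromkeys(items, 0))
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        for item in items:
            sums[key][item] += row[item]
    return [dict(zip(key_fields, key), **vals) for key, vals in sums.items()]


# Subtotals: one record per county and major occupation group (CNTY+GROUP).
subtotals = total_on(rows, ["CNTY", "GROUP"], ITEMS)

# Grand totals: total on geography alone, then give each record a blank GROUP.
grand = [dict(r, GROUP="") for r in total_on(rows, ["CNTY"], ITEMS)]

# "Vertical" merge (append the records), then sort by geography plus group.
combined = sorted(subtotals + grand, key=lambda r: (r["CNTY"], r["GROUP"]))
for r in combined:
    print(r["CNTY"], repr(r["GROUP"]), r["ITEM1"], r["ITEM2"])
```

 Each county's grand total lands directly above its group subtotals because a
 blank code sorts before any numeric code.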


 Tutorial 5D:  Creating totals for occupation groups of your own choosing.

 This exercise follows the same basic steps as Tutorial 5C, except that you will
 be totalling the data according to groupings of the detailed occupation
 categories that you define, rather than taking the 13 major groups defined for
 census purposes.

 You will actually be modifying an auxiliary file.  This should be safe if you
 confine your changes to the otherwise blank USER field.  Even if you make a
 major mistake, you can always reinstall the auxiliary files from EEOAUXIL.EXE,
 although you will then lose any codes you added to the occupation code list.

 1.  Develop your coding scheme

     a.  You may wish to first annotate the codes you want to assign on paper
         (in Appendix G of the paper documentation or on a printout of
         \DOCUMENT\APPEND_G.ASC), so that you are not facing coding decisions
         when you are at the computer.

     b.  You will be entering a code in a 3-character field.  You may use
         numeric or alphabetic characters, but the codes you assign will
         determine the sequence of data displayed.  If you use 1- or
         2-character codes, be sure to left- or right-justify them
         consistently.

     c.  You will assign the same code to every category you want grouped
         together.

 2.  Post your codes to the OCCUPATI.DBF code list.

     a.  From the main menu, select 10--Advanced Options

     b.  Enter 2 to Display a Secondary File.  This mode gives you the ability
         to actually change a file.  (Never try to change a file on a CD-ROM--
         the system will bomb.)

     c.  Enter filename:  C:\EXTRACT\1990AUX\OCCUPATI.DBF (or modify the path if
         you have stored auxiliary files elsewhere.)

     d.  Cursor to the USER column; this is the column in which you will be
         entering the codes you want to assign.

     e.  To enter a code, cursor to the appropriate location, type the code,
         then press <enter> to confirm.  Then you can cursor to the next
         location.

     f.  You may ignore the summary lines that have a blank OCC_CODE. There are
         no corresponding records in the data base.

     g.  If you leave the USER code blank for a line that has an OCC_CODE, that
         category will contribute to a total for all lines you leave blank.

     h.  After you have entered the last code and pressed <enter>, press <esc>
         twice to return to the main menu

 3.  Now use Add Labels to merge these entries with the data records.

     a.  From the main menu, select 3--Add labels

     b.  Cursor to OCC_CODE and press <enter>

     c.  When only one option, TEXT, is displayed, press A to show all fields. 
         Cursor to USER and press <enter>

     d.  At the main menu, display the data to the screen to confirm that the
         codes are linked the way you want them.

 4.  Select records, display, and create the intermediate data file, as outlined
     in Tutorial 5C, steps 1.c. to 1.e.

 5.  Re-sort the data base by geography and occupation group code.  This extra
     step is necessary if any of your occupational groupings were not completely
     in sequence with the original codes, or if you left any categories
     unassigned.

     a.  From the main menu, enter 4--Manipulate files

     b.  Enter 2--Create a New Index

     c.  Specify a key expression that includes all relevant geography as well
         as your new USER code, e.g., PLACEFP+CNTY+USER.  If you are summing
         only the county records, the key could be CNTY+USER.  (Do not include
         in your indexing key any variables you did not extract to the current
         file.)

     d.  Enter a filename for the index file.

     e.  Enter a description like "Sequenced by USER occupation code"

 6.  Create totals by following step 2 of Tutorial 5C.  The key for
     totalling should be the same as the key you used for indexing in step 5.c.
     above.
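
 The reasoning behind steps 1.b., 2.g., 5, and 6 can be sketched in Python as
 follows.  This is an illustration only.  It assumes that totalling starts a
 new total each time the key changes from one record to the next, which is
 what the need for re-sorting in step 5 suggests but which this tutorial does
 not state outright.  The helper normalize_code is hypothetical, and all
 codes and counts are made up.

```python
from itertools import groupby


def normalize_code(code, width=3):
    """Right-justify a user code in a fixed-width field (one consistent choice)."""
    return code.strip().rjust(width)


# Hypothetical USER assignments keyed by OCC_CODE, typed with inconsistent
# justification.  Occupation 007 was left blank.
user_codes = {"003": "1", "004": "1 ", "005": " 10", "006": "2", "007": ""}

# Made-up detail records; TOTAL is a hypothetical user-defined item.
detail = [
    {"CNTY": "001", "OCC_CODE": "003", "TOTAL": 10},
    {"CNTY": "001", "OCC_CODE": "004", "TOTAL": 20},
    {"CNTY": "001", "OCC_CODE": "005", "TOTAL": 5},
    {"CNTY": "001", "OCC_CODE": "006", "TOTAL": 8},
    {"CNTY": "001", "OCC_CODE": "007", "TOTAL": 2},
    {"CNTY": "003", "OCC_CODE": "003", "TOTAL": 4},
]


def sort_key(row):
    """Geography first, then the user-assigned code (like CNTY+USER)."""
    return (row["CNTY"], row["USER"])


# Attach the USER code to each record (step 3), then re-sort (step 5).
labelled = [
    dict(r, USER=normalize_code(user_codes.get(r["OCC_CODE"], "")))
    for r in detail
]
labelled.sort(key=sort_key)

# Total on the same key used for sorting (step 6).  groupby starts a new
# group whenever the key changes, so the input must already be in key order.
totals = [
    {"CNTY": cnty, "USER": user, "TOTAL": sum(r["TOTAL"] for r in group)}
    for (cnty, user), group in groupby(labelled, key=sort_key)
]
for t in totals:
    print(t["CNTY"], repr(t["USER"]), t["TOTAL"])
```

 Note that the blank code forms a group of its own, as described in step
 2.g., and that right-justified codes sort in the order 1, 2, 10.  The same
 codes left-justified would sort as 1, 10, 2, which is why the justification
 must be consistent.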

 ===========================================

 Comments on this series of tutorials are welcome.  Call Paul Zeisset or Bob
 Marske at 301/763-1792, or write the Economic Census Staff, Bureau of the
 Census, Washington, D.C.  20233.