Homework assignments for QDA I., summer 2015


Note: The assignments from no. 4 onward are valid for 2015 (they originally come from 2014).

Back to the main page    Quantitative Data Analysis I. (presentations, readings, etc.)
What we have done QDA II. 2015 - summer semester
Homework must be written in MS Office doc format (alternatively txt, rtf or pdf) and must contain your answers to the questions and a brief interpretation of the results (select only the relevant results from the SPSS output).
Send it by email to jiri.safrATseznam.cz; in the subject line, please write: QDA2, homework NO., YOUR NAME

0. (18/2/15) No homework (in fact, there was no class because of the bomb alarm), apart from refreshing your knowledge of survey design by reading the handouts and literature on metodykv.wz.cz and elsewhere.


1. homework (11/3/2015)
Get Data TV&Books FHS 2015 - NOT CLEANED VERSION, to be used only for the 1. homework (note that the dataset is updated continuously; the courses QDA1, AKD1-daily and AKD1-distance, i.e. all HiSo courses in summer 2015, are completed).
Examine all basic variables (from STUDIUM to ddmm), i.e. find out whether all their values are "reasonable" (they make sense to you) and, if not, define the implausible values as missing (via MISSING VALUES).
For the appropriate range of values, see the questionnaire (ppt) and use common sense (e.g. a day has only 24 hours).
Make frequency tables (FREQUENCIES) and paste them into an MS Word document (doc): first for the raw data (not yet cleaned) and, where you defined any missing value, again right after it for the cleaned data. Comment on the result, but only very briefly. Attach the relevant syntax commands (they can all go in an appendix). If you are not yet able to work with syntax commands and use the menu and mouse instead, never mind: describe your "data cleaning" in words, alongside the tables before and after defining missing values.
Don't forget to state in the subject of your email: QDA1, Homework 1, YOUR NAME

Instructions and some hints: The principle of the MISSING VALUES command, as well as how to make descriptive tables (FREQUENCIES), can be found in Syntax 11/3/2015 and Syntax 18/3/2015, or in the older Syntax 5/3/2014.
Everything is explained in the presentation 2. Missing Values: identification, assigning and their analysis (1.) (slides 2-14). See also 1. Introduction to SPSS/PSPP (1.) (program environment, data input, labelling variables, basic settings). You can also watch the video Intro to SPSS (the very basics) and read SPSS Statistics for Students: The Basics (a very helpful guide from SSCC, University of Wisconsin - Madison).
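The course itself works in SPSS/PSPP, but the logic of the cleaning step in the 1. homework can be illustrated outside it. The following is only a minimal Python sketch under stated assumptions: the answers in `tv_raw` are made up (they are not from the course dataset), and the plausible range 0-24 hours is the "common sense" rule mentioned above. It shows a frequency table before and after implausible values are treated as missing.

```python
from collections import Counter

# Made-up answers to "hours of TV per day" (NOT the course data).
# 30 and 99 cannot be real values: a day has only 24 hours.
tv_raw = [2, 1, 0, 3, 30, 2, 99, 1, 2, 4]


def is_missing(value, low=0, high=24):
    """Treat values outside the plausible range as user-missing."""
    return not (low <= value <= high)


def frequency_table(values):
    """Return {value: (count, percent of the listed cases)}, sorted by value."""
    counts = Counter(values)
    n = len(values)
    return {v: (c, round(100 * c / n, 1)) for v, c in sorted(counts.items())}


# Keep only the valid cases, as a frequency table does after MISSING VALUES.
tv_clean = [v for v in tv_raw if not is_missing(v)]

print("Before cleaning:", frequency_table(tv_raw))
print("After cleaning: ", frequency_table(tv_clean))
```

Comparing the two printed tables is the same "before/after" documentation the homework asks for; note how the percentages of the valid values change once the two impossible answers are excluded.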


2. homework (11 and 17/3/2015)
Data TV&Books, FHS 2015. This is the cleaned version, i.e. with missing values defined (updated 15/3/2015). Use it for all the remaining homework and classes.
Calculate descriptive statistics (mean, standard deviation, median, or percent and mode) for the variables: Gender [gender], Work [work], Num. books at home when child [books_chilhood], Professional/study books read last year [books_profes] and Friends [friends]. In other words, produce a summary description of the data, e.g. using FREQUENCIES in SPSS/PSPP: percentages and mode for categorical variables; mean, median and standard deviation for numeric variables.
Everything is explained in Syntax 18/03/2015 or in the older Syntax 19/03/2014, in the handout 3. Descriptive statistics: an exploratory univariate analysis, and in chap. 14, Quantitative data analysis (pp. 322-325), and chap. 15, Using SPSS for Windows (pp. 348-352), in Bryman, A. 2008. Social research methods. Oxford: Oxford University Press.
Beware of "inadequate" values (first, check whether you have set them as MISSING). Insert tables from the SPSS output into the MS Word document (here are instructions/tricks on SPSS-doc) and briefly comment on the results sociologically. Also add the syntax commands you used.
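To make the distinction between the two kinds of summary concrete, here is a small Python sketch (an illustration only, not the SPSS procedure used in class). The values in `gender` and `friends` are hypothetical and merely borrow the variable names from the assignment; the standard deviation is the sample version (dividing by n - 1).

```python
import statistics
from collections import Counter

# Hypothetical cleaned data (NOT the course dataset).
gender = ["female", "female", "male", "female", "male", "female"]
friends = [3, 5, 4, 10, 2, 6]


def describe_categorical(values):
    """Percentages and mode: the summary suited to a categorical variable."""
    counts = Counter(values)
    n = len(values)
    percents = {k: round(100 * c / n, 1) for k, c in counts.items()}
    mode = counts.most_common(1)[0][0]
    return percents, mode


def describe_numeric(values):
    """Mean, median and sample standard deviation for a numeric variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD, divisor n - 1
    }


print(describe_categorical(gender))
print(describe_numeric(friends))
```

In this made-up example the mean number of friends lies above the median because of one large value (10), which is the kind of detail worth a sentence in the interpretation.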


3. homework (18/4/2015)
Compute the mean and standard deviation in two groups of HiSo students of AKD1 2013, Daily (denní) and Distant (kombinované) study:
1. by hand (paper and pencil or MS Word, with a calculator), 2. in MS Excel or another spreadsheet program (by entering a function into a cell), and 3. in SPSS, after copying the data into it; in SPSS, also compute the median and draw histograms. (Note that the results for StDev should be similar but not necessarily the same.)
DATA (also in Descriptive statistics: an exploratory univariate analysis, slides ca. 21-26):
Daily study (age): 23 25 24 23 24 23 22 23 22
Distant study (age): 33 30 48 25 31 46 49 38 26 28 26 31
Put all three computations/outputs into the Word document (the computation by hand, the Excel table or at least a screenshot of the sheet, and the SPSS table and graph). Don't forget to add a short interpretation of the difference between the two classes (the distribution of values, using the mean, median and standard deviation).
Everything is explained in Syntax 18/03/2015 and in the handout Descriptive statistics: an exploratory univariate analysis.
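As a cross-check for the hand and spreadsheet computations, the same steps can be written out in a few lines of Python. This is only a sketch and not part of the required hand-in; it uses the two age lists given above. One common reason why standard deviations from different tools are "similar but not the same" is the divisor: the sample formula divides by n - 1, the population formula by n. Whether that explains a difference in your own results is something to verify in the tool you used.

```python
import math
import statistics

# Ages from the assignment.
daily = [23, 25, 24, 23, 24, 23, 22, 23, 22]
distant = [33, 30, 48, 25, 31, 46, 49, 38, 26, 28, 26, 31]


def mean_and_sd(values, sample=True):
    """Mean and standard deviation computed step by step, as done by hand."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the mean.
    ss = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return mean, math.sqrt(ss / divisor)


for name, ages in [("Daily", daily), ("Distant", distant)]:
    mean, sd_sample = mean_and_sd(ages, sample=True)
    _, sd_population = mean_and_sd(ages, sample=False)
    print(
        f"{name}: n={len(ages)} mean={mean:.2f} "
        f"median={statistics.median(ages)} "
        f"SD(n-1)={sd_sample:.2f} SD(n)={sd_population:.2f}"
    )
```

The step-by-step function mirrors the paper-and-pencil route (deviations, squares, sum, divide, square root), so intermediate values can be compared with the hand computation line by line.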

(so far updated for summer 2015)
For assignments 4 to 8, the help links point to the SPSS syntax files we worked through in 2014.


4. homework (13/4/2014) Data TV&Books, FHS 2015
Bivariate analysis: means of a ratio (numeric) variable within the categories of a categorical variable.
Compare the means of the following ratio (numeric) variables (e.g. using the MEANS command): watching TV [TV], number of books read in total [books_all], professional books [books_profes], entertainment books [books_fun] and number of friends [friends], in subgroups defined first by Gender [gender] and then by Class/semester [STUDIUM].
When interpreting the results for Class [STUDIUM], focus only on your own group, i.e. QDA I. in summer 2014 (= 15. category): compare it with the total mean (whole sample) and also with the group with the highest average values.
Before computing the means, examine the values of the dependent variables (check them with FREQUENCIES and, if necessary, set "deviant values" as user MISSING), and when interpreting the means make sure the subgroups contain a sufficient number of cases. Comment on the results: try to interpret the numbers substantively, and in doing so note the size of the standard deviation (StdDev) within the subgroups (by class and by gender). Add the syntax commands you used.
Everything is explained in the upgraded Syntax 9/04/2014 (from line 83). If you prefer clicking through the menu with the mouse, you can also watch the videos SPSS Tutorial 10 - Comparing Means (from approx. the 6th minute) and especially SPSS Tutorial 11 - Comparing Means - Interpretation of Results.
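What a comparison of means reports (n, mean and standard deviation per subgroup, next to the total) can be sketched in Python as follows. The records are hypothetical and only reuse the idea of TV hours by gender; this is an illustration of the logic, not a replacement for the MEANS command.

```python
import statistics
from collections import defaultdict

# Hypothetical records (NOT the course data): (gender, TV hours per day).
records = [
    ("female", 1.0), ("male", 2.5), ("female", 2.0),
    ("male", 3.5), ("female", 0.0), ("male", 3.0),
]


def means_by_group(rows):
    """Return {group: {"n", "mean", "sd"}} for (group, value) rows."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    table = {}
    for group, values in groups.items():
        # The sample SD needs at least two cases in the subgroup.
        sd = statistics.stdev(values) if len(values) > 1 else float("nan")
        table[group] = {"n": len(values), "mean": statistics.mean(values), "sd": sd}
    return table


table = means_by_group(records)
total_mean = statistics.mean(value for _, value in records)

for group, stats in table.items():
    print(group, stats)
print("total mean:", total_mean)
```

Keeping `n` next to each mean is deliberate: a subgroup mean based on very few cases should be interpreted with caution, which is the point made in the assignment.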


5. homework (16/4/2014) (added 21/4/14)
Data TV&Books, FHS 2015
Recoding a ratio (numeric) variable into categories and bivariate analysis of means in these new categories.
Recode (collapse) the ratio (numeric) variable watching television (in hours per day) [TV] into three categories based on terciles (i.e. three groups of approximately equal size). Name the new variable TV3t and give it labels. Find out how those who watch television "below average" (I. tercile), "average" (II. tercile) and "above average" (III. tercile), i.e. the new recoded categories, differ in the average number of books read [books_all] and number of friends [friends]. What conclusions can be reached if the median is used as the measure of central tendency instead of the arithmetic mean? Interpret the results substantively.
Recoding a ratio variable is explained in the upgraded Syntax from 16/04/2014 (collapsing into terciles starts at line 120), which also shows how to compare means in subgroups using the MEANS command (from line 186). Comparing means can also be found in Syntax from 9/04/2014 (from line 83), and it was the subject of the 4. homework.
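The recoding idea can be sketched in Python as below. The data are hypothetical, and the cut points come from `statistics.quantiles` with its default method, which is an assumption of this sketch: the cut points you read from an SPSS frequency table may differ slightly, especially when many respondents share the same value.

```python
import statistics

# Hypothetical paired data (NOT the course data): TV hours and books read.
tv = [2, 0, 1, 4, 0.5, 1, 3, 2, 1.5]
books = [5, 12, 8, 1, 10, 9, 2, 6, 7]

# Two cut points that split the TV distribution into three groups.
cut1, cut2 = statistics.quantiles(tv, n=3)


def to_tercile(value):
    """Recode a TV value into 1, 2 or 3 using the two cut points."""
    if value <= cut1:
        return 1
    if value <= cut2:
        return 2
    return 3


tv3t = [to_tercile(v) for v in tv]
labels = {1: "below average", 2: "average", 3: "above average"}

# n, mean and median of books read within each TV tercile.
summary = {}
for code in (1, 2, 3):
    group = [b for b, t in zip(books, tv3t) if t == code]
    summary[code] = (len(group), statistics.mean(group), statistics.median(group))

for code, (n, mean, median) in summary.items():
    print(f"{labels[code]}: n={n} mean={mean} median={median}")
```

Because of tied values the three groups are not exactly equal in size here, which also happens with real data; reporting the group sizes alongside the means and medians makes that visible.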


6. homework (23/4/2014) (added 26/4/14)
Data TV&Books, FHS 2015
Contingency table - crosstabulation (%). The task builds on the previous 5. assignment: it addresses the same question but using categorical data.
Make two contingency tables for the categorical dependent variable Total books read and the categorical explanatory variables: a) Watching TV, and b) Number of friends.
First, create the categorical variable [books_all3t] by recoding the ratio variable [books_all] into terciles, and in the same manner recode the number of close friends [friends] into terciles [friends3t] (you may already have them from the seminar); the variable TV3t (terciles of TV) you already created in the previous 5. task.
Is the number of books read related to watching television and to the number of friends? Interpret the results sociologically. And are the results the same as in task 5, where the dependent variables books_all and friends were numeric while the independent variable TV3t was categorized?
(Use the CROSSTABS command with COLUMN percentages. It is explained in the upgraded Syntax from 23/04/2014; see also the chapters Cross-tabulations (Chap. 1 in Treiman 2009) and Elementary Analyses (pp. 375-394 in Babbie 1995).)
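Column percentages answer the question "within each category of the explanatory variable, how are the cases distributed over the dependent variable?". A minimal Python sketch of that computation follows; the tercile codes are hypothetical, with the explanatory variable (TV) in the columns and the dependent variable (books) in the rows.

```python
from collections import Counter

# Hypothetical tercile codes 1-3 (NOT the course data).
tv3t = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
books_all3t = [3, 3, 2, 2, 2, 1, 1, 1, 2, 1]


def column_percent_table(rows, cols):
    """Return {(row, col): percent of the column total} for every cell."""
    cells = Counter(zip(rows, cols))
    col_totals = Counter(cols)
    return {
        (r, c): round(100 * cells[(r, c)] / col_totals[c], 1)
        for r in sorted(set(rows))
        for c in sorted(set(cols))
    }


table = column_percent_table(books_all3t, tv3t)

# One printed line per books tercile; columns are the TV terciles 1, 2, 3.
for row in sorted(set(books_all3t)):
    print(row, [table[(row, col)] for col in sorted(set(tv3t))])
```

Each column sums to 100 % (up to rounding), so the comparison is made across columns within one row, for example the share of heavy readers among light versus heavy TV viewers.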


7. homework (7/5/2014) (added 18/5/14)
Data TV&Books, FHS 2015
Contingency table (%) controlling for the effect of a 3rd variable. The task builds on the previous assignment 6 (and thus also 5).
Make a contingency table for the dependent variable Total number of books read (terciles [books_all3t]) and each of the two explanatory variables, a) Watching TV [TV3t] and b) Number of friends [friends3t], and then c) the same two contingency tables separately for men and women [gender].
You already know the relationship between the total number of books read and watching TV, and between books read and number of friends, from the previous homework; it is sufficient to copy the tables and the result (check that the number of cases is the same; for the current data size it is n = 128). Newly, answer the question: Are these relationships (Books-TV and Books-Friends) the same for men and women? When interpreting the 3rd level, also describe which group reads the least and which the most (combinations of subgroups: TV x Gender and Gender x Friends).
In sum, construct two contingency tables (just copy those from the previous homework) and two new tables extended with the third, controlling variable: gender. Interpret the results sociologically.
In SPSS, use the CROSSTABS command (again with column percentages) and add a third, controlling variable; see Syntax from 14/05/2014, the new presentations 3. Contingency tables and analysis of categorical data - introduction and 4. Contingency tables: multivariate analysis and elaboration - introduction to third level of data sorting (pdf) (both updated 5/6/14), and also the Map of bivariate analyses configuration.
For how to format tables and how to present and interpret relationships between variables in a contingency table, see the examples in Cross-tabulations (Chap. 1 in Treiman 2009) and Elementary Analyses (Chap. 15 in Babbie 1995); for elaboration (multivariate contingency tables with three variables), see More on tables (Chap. 2 in Treiman 2009) and Elaborating bivariate relationships (Chap. 12 in de Vaus 1985, pp. 161-170).
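Controlling for a third variable means producing the same column-percent table once for the whole sample and then once within each category of the control variable. The Python sketch below illustrates this with hypothetical cases (only two TV categories are used to keep it short); it is not the course data and not the SPSS output format.

```python
from collections import Counter, defaultdict

# Hypothetical cases (NOT the course data): (gender, TV tercile, books tercile).
cases = [
    ("female", 1, 3), ("female", 1, 3), ("female", 3, 1), ("female", 3, 2),
    ("male", 1, 2), ("male", 1, 1), ("male", 3, 1), ("male", 3, 1),
]


def column_percents(pairs):
    """pairs: (column category, row category). Returns {(row, col): percent}."""
    pairs = list(pairs)
    cells = Counter(pairs)
    col_totals = Counter(col for col, _ in pairs)
    return {
        (row, col): round(100 * n / col_totals[col], 1)
        for (col, row), n in cells.items()
    }


# Zero-order table: books by TV for everyone.
zero_order = column_percents((tv, books) for _, tv, books in cases)

# Partial tables: the same table within each gender.
layers = defaultdict(list)
for gender, tv, books in cases:
    layers[gender].append((tv, books))
partial_tables = {g: column_percents(p) for g, p in layers.items()}

print("all:", zero_order)
for gender, tbl in partial_tables.items():
    print(gender, tbl)
```

Only non-empty cells appear in the dictionaries. If the percentage differences between TV categories look alike in the partial tables, the relationship is the same for men and women; if they differ, gender modifies it.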


8. homework (14/5/2014) (added 24(21)/5/14) This is the last homework.
Download the new dataset from a representative survey of the adult population in the Czech Republic: Data ISSP 2007 - Leisure Time and Sports, the Czech Republic (HiSo version, English) (last revision 24/5/2014). See also the Czech questionnaire (it applies to the original Czech variable names), the Basic questionnaire in English, and more information on ISSP.org.
Contingency table - crosstabulation (%) - Introduction to elaboration (controlling for the effect of a 3rd variable).
First, recode the original ordinal variable [q1_d] "How often the respondent reads books" into the dichotomous variable [Read2] with the values 1 = reads at least once a week and 0 = less often/not at all.
Make a contingency table for the dependent variable [Read2] with a) the explanatory variable age categories [Age4], and b) the same with the addition of the control variable education [edu3]. Answer the following questions:
What is the relationship between the frequency of reading books and age? Is this relationship modified in some way by education, or is it the same at all three educational levels? When interpreting the results, use the differences in percentage points between categories and also the coefficients of association/ordinal correlation. Further, describe which group reads the least and which the most (combination: Age4 x edu3). Interpret the results sociologically.
In sum, you will construct two tables, bivariate and trivariate (with education as the controlling variable, or factor), together with the appropriate coefficients of association/correlation: first for the general zero-order association and then for the partial associations.
In SPSS, use the CROSSTABS command (again with column percentages and the ordinal correlation Gamma) and add a third, controlling variable; see Syntax from 21/05/2014 and Syntax from 14/05/2014, the new presentations 3. Contingency tables and analysis of categorical data - introduction and 4. Contingency tables: multivariate analysis and elaboration - introduction to third level of data sorting (pdf) (both updated 5/6/14), and also the Map of bivariate analyses configuration.
For how to format tables and how to present and interpret relationships between variables in a contingency table, see the examples in Cross-tabulations (Chap. 1 in Treiman 2009) and Elementary Analyses (Chap. 15 in Babbie 1995); for elaboration (multivariate contingency tables with three variables), see More on tables (Chap. 2 in Treiman 2009) and Elaborating bivariate relationships (Chap. 12 in de Vaus 1985, pp. 161-170).
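The difference between a zero-order and a partial association can be illustrated with Goodman and Kruskal's gamma, computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs of cases. The Python sketch below is an illustration only: the cases are hypothetical, they do not use the real coding of the ISSP variables, and only two education levels are shown instead of three.

```python
from collections import defaultdict
from itertools import combinations


def gamma(pairs):
    """Goodman and Kruskal's gamma for a list of (x, y) ordinal pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Pairs tied on x or y are ignored by gamma.
    if concordant + discordant == 0:
        return float("nan")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical cases (NOT the ISSP data): (age category, reads weekly 0/1, education).
cases = [
    (1, 0, 1), (2, 0, 1), (3, 1, 1), (4, 1, 1),
    (1, 1, 2), (2, 0, 2), (3, 1, 2), (4, 0, 2),
]

# Zero-order association: age by reading, ignoring education.
zero_order = gamma([(age, read) for age, read, _ in cases])

# Partial associations: the same coefficient within each education level.
layers = defaultdict(list)
for age, read, edu in cases:
    layers[edu].append((age, read))
partials = {edu: gamma(pairs) for edu, pairs in layers.items()}

print("zero-order gamma:", round(zero_order, 3))
print("partial gammas:", partials)
```

In this made-up example the partial coefficients have opposite signs in the two education levels while the zero-order coefficient is weakly positive, which is exactly the kind of pattern elaboration is meant to uncover: the overall figure can hide different relationships within the subgroups.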

And don't forget to state in the subject of your email: QDA1, Homework NUMBER, YOUR NAME


References:
Babbie, E. 1995. The Practice of Social Research. 7th ed. Wadsworth Publishing.
Bryman, A. 2008. Social research methods. Oxford: Oxford University Press.
de Vaus, D. A. 1985. Surveys in Social Research. 1st ed. Hemel Hempstead/Winchester, Mass.: Allen & Unwin.
Treiman, D. J. 2009. Quantitative data analysis: doing social research to test ideas. San Francisco: Jossey-Bass.


