Sunday, January 13, 2013

Criterion and Norm Referenced Evaluations

Criterion-referenced tests (CRTs) and norm-referenced tests (NRTs) are two forms of evaluation used primarily for student assessment. A CRT examines an individual's proficiency with respect to a particular course of training or curriculum, which in turn reveals the effectiveness of the instruction; the individual is evaluated on what he or she has learned or retained rather than in comparison to other individuals. An NRT, by contrast, evaluates individuals in comparison to others: large numbers of students are tested in an effort to distinguish levels of performance from high to low. NRTs are often employed for student placement or advancement; examples include the GRE, SAT, and ACT. Although both CRTs and NRTs measure individual performance and are used in standardized testing, it is critical to recognize that a CRT measures performance against a fixed standard, whereas an NRT measures individual performance relative to other test-takers (Bond, 1996).
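
To make the distinction concrete, the following sketch scores the same raw results both ways. All names, scores, and the mastery cutoff are hypothetical, chosen only for illustration.

```python
# Hypothetical illustration: the same raw scores interpreted two ways.
scores = {"Ana": 62, "Ben": 75, "Cara": 88, "Dev": 75, "Eli": 95}

# Criterion-referenced: each student is judged against a fixed standard
# (here, an assumed mastery cutoff of 80), regardless of peers.
CUTOFF = 80
crt_results = {name: ("mastery" if s >= CUTOFF else "non-mastery")
               for name, s in scores.items()}

# Norm-referenced: each student is judged against the group, e.g. as the
# percentage of test-takers scoring strictly below them (a percentile rank).
def percentile_rank(score, all_scores):
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)

nrt_results = {name: percentile_rank(s, list(scores.values()))
               for name, s in scores.items()}

print(crt_results)  # mastery decisions depend only on the cutoff
print(nrt_results)  # percentile ranks depend only on the other scores
```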

References

Bond, L. A. (1996). Norm- and criterion-referenced testing. Practical Assessment, Research & Evaluation, 5(2). Retrieved January 3, 2013 from


Review: Validity of Work-Life Balance Assessment

            Research provides various constructs for evaluating work-life balance (WLB). With such assessments, it is imperative to consider the validity of the testing. One such case, reported in the article "Satisfaction with work-family balance among German office workers" (Beham & Drobnic, 2010), examined the concept of WLB. The survey instrument was quantitative and multifaceted, measuring satisfaction with work-family balance, organizational time expectations, psychological job demands, job insecurity, negative work-to-home interference, job control, social support at work, and a set of control variables. The researchers used Likert scales to evaluate each factor. The assessment of each component appears appropriate and comprehensive as it relates to the overall construct; hence, the accuracy of the process, techniques, and tools used to measure the construct in question supports the validity of the study (Russ-Eft & Preskill, 2009).

            The first factor, satisfaction with work-family balance, was measured using three items: (1) "How satisfied or dissatisfied are you with the way you divide your time between work and personal life?", (2) "How satisfied or dissatisfied are you with your ability to meet the needs of your job with those of your personal or family life?", and (3) "How satisfied or dissatisfied are you with the opportunity you have to perform your job well and yet be able to perform home-related duties adequately?" (Beham & Drobnic, 2010, p. 677). Responses ranged from 1 (very dissatisfied) to 5 (very satisfied). The next element, organizational time expectations, evaluated workers' views of the company's time requirements and was measured from 1 (strongly disagree) to 5 (strongly agree). Third, psychological job demands were examined using five items from the Swedish Demand-Control-Support Questionnaire (DCSQ), rated from 1 (never) to 4 (always). Fourth, job insecurity was tested with four items measured from 1 (strongly disagree) to 5 (strongly agree). The fifth component, negative work-to-home interference, employed three items from the Survey Work-Home Interaction Nijmegen (SWING), with responses from 1 (never) to 4 (always). Sixth, job control was assessed with two items from the DCSQ, also rated from 1 (never) to 4 (always). The seventh factor, social support at work, drew on five DCSQ items rated from 1 (strongly disagree) to 5 (strongly agree). The last component, the controls, comprised the variables associated with the study: sex, age, organizational tenure, number of children, supervisor status, and type of organization. It should also be noted that 716 online questionnaires were collected, distributed via email through the HR departments of two German organizations, a financial services firm and an information technology company (Beham & Drobnic, 2010).
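
As a minimal sketch of how Likert items of this kind are commonly combined, the code below averages three invented 1-5 responses into a composite scale score. This reflects a generic scoring convention, not the authors' documented procedure, and the item labels and response values are hypothetical.

```python
# Hypothetical sketch of composite scoring for a Likert-type scale.
# The three items mirror the 1-5 satisfaction format described in the
# article; the responses themselves are invented.
responses = {
    "time_division": 4,        # item 1: 1 = very dissatisfied ... 5 = very satisfied
    "meeting_needs": 3,        # item 2
    "perform_both_roles": 5,   # item 3
}

# A common convention is to average the items into one scale score,
# so the composite stays on the original 1-5 metric.
composite = sum(responses.values()) / len(responses)
print(f"Satisfaction with work-family balance: {composite:.2f} / 5")
```
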
            Although the method of testing was deemed suitable, the researchers also listed several limitations of the study. For one, the cross-sectional, non-experimental design does not support causal conclusions, and results from such designs are often considered ambiguous; the authors therefore recommend experimental designs and longitudinal studies for future research. Additionally, the fields of work sampled, financial services and information technology, limit how far the findings generalize, so future studies should draw participants from a wider range of occupational backgrounds. A further limitation affecting the validity of the research is its reliance on participant self-report. Individuals may not answer items accurately; participants must respond openly and honestly about their work and leisure experiences, including the opinions, beliefs, and emotions associated with them.


References

Beham, B., & Drobnic, S. (2010). Satisfaction with work-family balance among German office workers. Journal of Managerial Psychology, 25(6), 669-689.
Russ-Eft, D., & Preskill, H. (2009). Evaluation in organizations: A systematic approach to enhancing learning, performance, and change (2nd ed.). New York: Basic Books.

Reliability

            When selecting an IQ test to administer, one must consider the reliability of the testing. Reliability is defined as the consistency of the instrument and of the information collected with it as data are gathered over time; it is the repeatability of the measurement that determines the degree to which the test is consistently effective (Russ-Eft & Preskill, 2009). Several components should be considered when estimating reliability: the difference between observed scores and true scores, how standard errors of measurement and reliability coefficients serve as indicators of reliability, how reliability is estimated, and the various factors that can affect the reliability of a test. The following discussion expounds on these topics.
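
As an illustration of repeatability, the following sketch estimates test-retest reliability as the Pearson correlation between two administrations of the same test; the ten pairs of scores are invented for the example.

```python
# Minimal sketch of test-retest reliability (the data are invented).
# The same ten examinees take the same test twice; the Pearson
# correlation between the two administrations estimates reliability.
import statistics

time1 = [98, 105, 110, 92, 121, 100, 115, 89, 107, 113]
time2 = [101, 103, 112, 95, 118, 99, 117, 91, 105, 115]

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(f"test-retest reliability = {pearson_r(time1, time2):.3f}")
```
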
            First, as part of the IQ test selection process, observed scores and true scores ought to be defined. Observed scores are the measurements actually taken on each administration of a test, whereas the true score is the underlying constant being measured; each observed score consists of the true score plus some degree of error (Thorndike & Thorndike-Christ, 2009). A typing assessment offers an example. Typically, an individual's words per minute (wpm) are recorded, and an applicant for administrative assistant positions may be required to repeatedly complete a typing test for speed and accuracy. Each test result is an observed score, while the applicant's constant wpm represents his or her true score.
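
Classical test theory summarizes this relationship as observed score = true score + error. The simulation below of the typing example is purely illustrative: the true score and the error distribution are assumed values.

```python
# Hypothetical simulation of the typing-test example: each observed
# words-per-minute score is the constant true score plus random error
# (classical test theory: observed = true + error).
import random

random.seed(1)
TRUE_WPM = 70  # the applicant's assumed constant true score

# Five repeated typing tests; gauss(0, 3) plays the role of
# measurement error with a standard deviation of about 3 wpm.
observed = [TRUE_WPM + random.gauss(0, 3) for _ in range(5)]

for i, x in enumerate(observed, 1):
    print(f"test {i}: observed = {x:.1f} wpm (true = {TRUE_WPM}, "
          f"error = {x - TRUE_WPM:+.1f})")
```
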
            Second, standard errors of measurement and reliability coefficients are imperative to reliability. The standard error of measurement describes the typical discrepancy among scores: how much, across repeated measures, a score deviates from the true or average score (Thorndike & Thorndike-Christ, 2009). When the range of observed scores is consistent, this supports the reliability of the test. Concerning reliability coefficients, a constant relationship between varying elements across testing demonstrates reliability; the rank order of examinees is maintained from test to test regardless of the observed scores. Both concepts, the standard error of measurement and the reliability coefficient, are interrelated in estimating reliability. Given time limitations, researchers often rely on a small number of measurements to approximate the average scores and deviations as well as the ordering of the set, and examining the scatter of the scores sustains the estimate of reliability. In addition, it is important to estimate the precision of a score in an effort to verify whether the observed variability is due to errors of measurement rather than inconsistency in true scores. This process "represents a more exacting definition of the test's ability to reproduce the same score" (Thorndike & Thorndike-Christ, 2009, p. 137).
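
A standard classical-test-theory result ties these two indicators together: the standard error of measurement equals the standard deviation of observed scores times the square root of one minus the reliability coefficient. The sketch below applies that formula with invented values.

```python
# Sketch of the standard error of measurement (SEM) from classical
# test theory: SEM = SD * sqrt(1 - reliability). Values are invented.
sd = 15.0            # standard deviation of observed IQ scores
reliability = 0.90   # assumed reliability coefficient of the test

sem = sd * (1 - reliability) ** 0.5
print(f"SEM = {sem:.2f}")   # about 4.74 IQ points

# The SEM supports a rough confidence band around an observed score:
observed_score = 112
lo, hi = observed_score - 1.96 * sem, observed_score + 1.96 * sem
print(f"~95% band for the true score: {lo:.1f} to {hi:.1f}")
```
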
            The final points of discussion are the factors affecting reliability and the relationship between reliability and the confidence one can place in an individual's score. In general, four factors affect reliability: the variability of the group, the level of the group on the trait, the length of the test, and the operations used for estimating reliability. First, variability of the group refers to the consistency of ordering: when rank order is maintained, the reliability coefficient is more precise, whereas if the order varies from test to test, reliability is adversely affected. Second, the level of the group on the trait concerns the experience or proficiency of participants with respect to the characteristic being measured; test accuracy may differ, for example, between the field-goal percentages of a group of high school shooting guards and those of a group of NBA shooting guards. Third is the length of the test: in general, the longer the test, the more accurate the scores, because behaviors and performance are sampled repeatedly. Last, the operations used for estimating reliability are critical, since different estimation methods yield different levels of reliability (Thorndike & Thorndike-Christ, 2009). Finally, reliability and confidence are directly related: the more reliable the test, the more confidence one can place in an individual's score, while an unreliable test yields less confidence in the accuracy of the outcome.
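
The effect of test length can be quantified with the Spearman-Brown prophecy formula, r_new = k*r / (1 + (k - 1)*r), where k is the factor by which the test is lengthened. The sketch below shows how an assumed starting reliability of .60 rises as the test grows.

```python
# Spearman-Brown prophecy formula: predicted reliability of a test
# lengthened by a factor of k, given its current reliability r.
def spearman_brown(r, k):
    return k * r / (1 + (k - 1) * r)

r = 0.60  # assumed reliability of the original test
for k in (1, 2, 3, 4):
    print(f"{k}x length -> reliability = {spearman_brown(r, k):.3f}")
# 2x length already lifts .60 to .75, illustrating why longer tests
# tend to be more reliable, all else being equal.
```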

References
Russ-Eft, D., & Preskill, H. (2009). Evaluation in organizations: A systematic approach to enhancing learning, performance, and change (2nd ed.). New York: Basic Books.
Thorndike, R. M., & Thorndike-Christ, T. M. (2009). Measurement and evaluation in psychology and education (8th ed.). Upper Saddle River, NJ: Prentice Hall.