PhD in Second Language Studies Dissertation Defense
A validation argument for cloze test item function in second language assessment
Chair: James Dean Brown
Monday, March 14, 11:00 a.m.–1:00 p.m.
Moore Hall, Room 155A
This study presents a validation argument for cloze item function with the goal of systematically identifying what cloze items are able to measure and whether this differs between native users of English and second language test takers. While cloze tests have been said to measure reading ability in L1 contexts, most L2 research has linked their function to measures of general language proficiency. In addition, studies have shown that cloze tests measure comprehension at a higher level than just sentence-level knowledge, though doubts remain as to what kinds of interpretations can be made about cloze test performance for L2 learners. This study explores this issue more closely by examining evidence to construct a warranted interpretation/use argument (Kane, 2013) for cloze item function.
To accomplish this, performance on fifteen 30-item cloze passages was examined for both native users of English (n = 675) and second language learners from Japan (n = 698) and Russia (n = 1548). Passages were examined in relation to their composition, syntactic structure, and cohesive features based on a number of coded and automated text analysis methods. Item-level features for each of the 450 cloze items were also classified according to their composition and their interaction with surrounding context. For the latter, native user data were collected (n = 2475) through a series of sentence completion tasks to gather information about the levels and directionality of contextual information used by the individual items. These tasks, as well as L1 cloze test data, were gathered using Amazon Mechanical Turk, an online crowdsourcing platform for data gathering. Variables for passage composition, syntactic structure, cohesion, and item characteristics were explored in relation to Rasch logit measures of item difficulty for both the L1 and L2 cloze datasets, and used to create a structural equation model explaining the influence of each on item function.
The results indicated that cloze items performed very similarly for both native users and second language learners of English, with the only major difference being that the items were more difficult for the L2 test takers. Items were found to access context on multiple levels, and different classes of items functioned well for both groups of examinees. Other similarities were also observed through test analyses, item analyses, and correlations between item difficulty and passage-level variables. In addition, a single structural model was able to explain item function for both groups. Differences were found between L1 and L2 examinees in how factors for passage composition, structure, cohesion, and items related to item function; however, even these results seemed more in line with differences in the ability of the examinees than with differences in the construct being measured for each. Overall, the evidence pointed to cloze items measuring the same construct for L1 and L2 test takers, and furthermore to this construct being closely related to reading ability.