Czech self-efficacy scale for physical education majors towards children with disabilities

Introduction
Including students with disabilities in general physical education (GPE) is currently common practice in most European countries (Klavina, 2008; Kudláček, Ješina, & Flannagan, 2010; Morley, Bailey, Tan, & Cooke, 2005). While the inclusion process has been mandated by international legislation, many barriers have been encountered, such as inadequate teacher training and professional development, lack of competence (Rybová & Kudláček, 2013), limited support (Rybová & Kudláček, 2013), large class sizes, and time and administrative demands (Kodish, Hodges Kulinna, Martin, Pangrazi, & Darst, 2006). The term adapted physical education (APE) describes the educational services delivered to students with disabilities in physical education (PE). Advocacy for inclusive practice, in which students with and without disabilities are educated together, began in the 1980s. This philosophical shift from separate to integrated programs was intended to ensure equitable opportunities, the use of appropriate supports, and the empowerment of students with disabilities (Hodge, Lieberman, & Murata, 2012). The greatest barriers to the inclusion process in the Czech Republic are architectural and attitudinal (Kudláček, Ješina, & Štěrbová, 2008). Teachers worry about the lack of support and their own competencies, as well as the safety and participation of students (Rybová & Kudláček, 2013). If such barriers are not resolved, teachers may not be empowered to believe that inclusion could be successful.
Attitudes of mainstream teachers towards the inclusion of children with special needs in ordinary schools were surveyed by Avramidis and Norwich (2002), who reviewed the literature on teachers' attitudes towards inclusion. Kudláček, Válková, Sherrill, Myers, and French (2002) developed the Czech version of the Attitude toward Teaching Individuals with Physical Disabilities in Physical Education (ATIPDPE) instrument, which was based on the Theory of Planned Behavior (TPB; Ajzen, 1991, 2005). This instrument was used in several studies (e.g., Doulkeridou et al., 2011), which helped to describe the attitudes of PE students toward individuals with physical disabilities. According to the TPB (Ajzen, 1991, 2005) as well as the Social Cognitive Theory (SCT; Bandura, 1989, 2001), there is a close relationship between attitudes and motivation to perform an activity (Říčan, 2009).

Self-efficacy
Self-efficacy is a concept based on a large theoretical framework known as SCT (detailed in Bandura, 1989, 2001). Bandura (1997, p. 3) describes self-efficacy as "beliefs in one's capabilities to organize and execute the courses of action required to produce given attainments". Unless people believe they can produce a desired effect by their action, they have little incentive to act. Efficacy belief, therefore, is a major basis of action. People guide their lives by their beliefs of personal efficacy (Bandura, 1997). According to Bandura (1994), psychosocial skills contribute more heavily to career success than do occupational technical skills. Bandura (1994) notes that the task of creating learning environments conducive to the development of cognitive skills rests heavily on the talents and self-efficacy of teachers. Professionals with a high sense of efficacy about their teaching capabilities can motivate their students and enhance their cognitive development. Teachers who have a low sense of instructional efficacy favor a custodial orientation that relies heavily on negative sanctions to influence students to study.
Self-efficacy beliefs reflect one's capabilities to exercise control over events and estimations of competence to execute given tasks. Efficacy beliefs affect performance, influence the selection of tasks, and are a key factor in self-regulation of motivation (Humphries, Hebert, Daigle, & Martin, 2012). In PE research, teacher self-efficacy has been linked to professional development (Martin, McCaughtry, Hodges Kulinna, Cothran, & Faust, 2008) and to teachers' and students' behavior (Martin & Hodges Kulinna, 2004; Martin & Hodges Kulinna, 2005). Martin and Hodges Kulinna (2003) and Stephanou and Tsapakidou (2007) conducted research with a self-efficacy scale on samples of PE teachers. They were followed by Humphries et al. (2012), who developed a multidimensional teaching efficacy instrument specific to physical educators. One self-efficacy subscale of this instrument measures efficacy for teaching students with special needs in regular PE classes. The importance of self-efficacy for PE teachers was emphasized by Pan's (2014) findings, which showed a positive causal relationship between teachers' self-efficacy and students' learning motivation, learning atmosphere, and learning satisfaction in senior high school physical education. Hutzler, Zach, and Gafni (2005) applied a self-efficacy instrument to students of PE teacher education (PETE) with regard to the inclusion of students with disabilities in general physical education. In relation to inclusive PE lessons, Hutzler et al. (2005, p. 312) note: "Teachers who possess low self-efficacy expect failure in an inclusion setting, they apparently prefer avoiding the problem rather than seeking resources to confront it." Hutzler et al. (2005) further claim: If teachers possess high self-efficacy, they do not perceive themselves as unable to cope with the expected norm and may perceive the situation as challenging. 
If, however, a teacher possesses low self-efficacy, the incongruence between the norm and the perceived self-efficacy would typically reduce the attitude towards participation of this child in the regular class. (p. 312)
Block, Hutzler, Barak, and Klavina (2013) developed the SE-PETE-D instrument. It "can be used to identify the discriminative power of demographic variables hypothesized to be meaningful predictors of PETE majors' SE, including teaching experience, academic course work, practicum and nonteaching specific exposure to students with disability" (Block et al., 2013, p. 200). The SE-PETE-D measures PETE majors' self-efficacy toward including students with intellectual disabilities (ID), physical disabilities (PD) and visual impairment (VI) in PE frameworks, including teaching skills, playing sport games and performing fitness activities (Block et al., 2013). Tekidou, Evaggelinou, Papaioannou, and Block (2015) determined physical educators' self-efficacy beliefs toward including students with disabilities (physical, intellectual and visual) in PE classes in Greece using a self-efficacy scale. Eden and Hutzler (2015) examined the self-efficacy of PE teachers in elementary schools regarding teaching students with Pervasive Developmental Disorders (PDD), Attention Deficit and Hyperactivity Disorders (ADHD) or Hard of Hearing (HH) in regular classes. Jovanović, Kudláček, Block, and Djordjević (2014) used the Serbian version of the SE-PETE-D to determine the level of self-efficacy among students of Sport and Physical Education at three universities in Serbia. Taliaferro, Hammond, and Wyant (2015) examined whether participation in APE courses with an associated practicum improved preservice physical educators' self-efficacy beliefs toward inclusion of individuals with specific disabilities over time and whether changes in these beliefs differed depending on the course. This study (Taliaferro et al., 2015) sought to investigate how these experiences affected participants' development of self-efficacy beliefs. Results showed that a combination of APE coursework and practicum experience is an effective means to influence preservice physical educators' beliefs toward inclusion (Taliaferro et al., 2015).
The main purpose of this study was to determine the validity and reliability of the Czech version of SE-PETE-D in a Czech setting.

Methods
The SE-PETE-D instrument contains four parts: three self-efficacy scales referring to ID, PD and VI, and a fourth part called "demographic questions", which contains questions about, e.g., age, study program, field of study or experience with adapted physical activity. After a detailed description of the purpose of the survey and how to complete it, the three self-efficacy scales (ID, PD, VI) follow, each preceded by a vignette describing a student with an ID, PD or VI who would be attending a GPE class. Below is an example of the vignette for a student with PD:
Ashton is a high school student with a spinal cord injury. He cannot walk, so instead he pushes himself in his wheelchair to get around. Ashton likes playing the same sports as his classmates, but he does not do very well when playing the actual game. Even though he can push his wheelchair, he is slower than others and tires after pushing his chair for only 1-2 minutes. He can pass and serve a volleyball, but not far enough to get it over the net. He can catch balls tossed straight to him. However, he does not have the upper body strength to shoot a basketball high enough to make a regulation basket. Because he cannot use his legs, he cannot kick a soccer ball, but he can push the ball forward with his chair (Block et al., 2013, SE-PETE-D).
Following the vignette, three sets of questions with varying numbers of items were presented, focusing on how confident the respondent felt in the specific contexts of conducting fitness testing, teaching sport skills, and organizing the actual playing of sport. An example of a question targeting organizing the actual sport with the class was: "How confident are you in your ability to make the environment safe for Ashton during the game?" (Block et al., 2013, SE-PETE-D).
The total number of self-efficacy scale items in the SE-PETE-D is 25 (across the three scales ID, PD and VI). There are 6 items in the ID scale, 10 in the PD scale, and 9 in the VI scale. The demographic questions part of the original English version of the SE-PETE-D contains 8 items (Block et al., 2013). For the Czech version of the SE-PETE-D, 3 items were added to the demographic questions part: gender, university attended, and study program and field of study.
Participants rated their degree of confidence to complete situation-specific GPE activities for each of the targeted disabilities on a 1-5 scale. Bandura (2006) recommends a 0-10 response scale, but the SE-PETE-D was created in accordance with Myers, Wolfe, and Feltz (2005), who suggest that a 1-5 scale is as effective for measuring self-efficacy as a 0-10 scale. The 1-5 scale was used in the SE-PETE-D with the following criteria: 1 = no confidence, 2 = low confidence, 3 = moderate confidence, 4 = high confidence, 5 = complete confidence (Block et al., 2013). Each disability scale was interpreted separately in the data analysis.
Data analysis for the original version of the SE-PETE-D was carried out in several steps. Expert review was carried out by ten professors from the U.S. and Europe with expertise in self-efficacy theory and five graduate students with expertise in APE. Construct validity was assessed using confirmatory and exploratory factor analyses. The data were processed separately for each of the three scales. Item coherence was examined using Cronbach's analysis, with values ranging from .73 to .89. Internal consistency was .86 for the ID scale, .90 for the PD scale, and .92 for the VI scale. The total sample for the second phase of validation of the original English version of the SE-PETE-D was 486 participants (170 females, 316 males). The participants' ages ranged from 19 to 46 years (Block et al., 2013). Block et al. (2013) reported the following factor structure: The ID subscale included two factors, F1 peers' instruction (items no. 1, 4, 6) and F2 staying on task (items no. 2, 3, 5). The PD subscale included three factors, F1 specific adaptations (items no. 1, 2, 5, 7), F2 peers' instruction (items no. 10, 8, 3) and F3 safety (items no. 4, 6, 9). The VI subscale included two factors, F1 specific adaptations (items no. 7, 6, 9, 1, 4) and F2 peers' instruction (items no. 2, 8, 5, 3).
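As an illustration of how responses to the three scales might be scored, the following Python sketch encodes the 1-5 rating labels and the factor structure reported by Block et al. (2013). The helper name `score_scale` and the use of mean scores per factor are assumptions made for this example; the source does not state how factor or scale scores were aggregated.

```python
# Rating labels of the SE-PETE-D response scale (Block et al., 2013).
RATING_LABELS = {
    1: "no confidence",
    2: "low confidence",
    3: "moderate confidence",
    4: "high confidence",
    5: "complete confidence",
}

# Factor structure reported by Block et al. (2013).
# Item numbers are 1-based and restart within each scale.
FACTORS = {
    "ID": {"peers' instruction": [1, 4, 6], "staying on task": [2, 3, 5]},
    "PD": {
        "specific adaptations": [1, 2, 5, 7],
        "peers' instruction": [3, 8, 10],
        "safety": [4, 6, 9],
    },
    "VI": {
        "specific adaptations": [1, 4, 6, 7, 9],
        "peers' instruction": [2, 3, 5, 8],
    },
}


def score_scale(scale, ratings):
    """Hypothetical helper: mean rating per factor and for the whole scale.

    Mean scoring is an assumption of this sketch, not a rule from the source.
    """
    factors = FACTORS[scale]
    n_items = sum(len(items) for items in factors.values())
    if len(ratings) != n_items:
        raise ValueError(f"{scale} scale expects {n_items} ratings")
    if any(r not in RATING_LABELS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    scores = {
        name: sum(ratings[i - 1] for i in items) / len(items)
        for name, items in factors.items()
    }
    scores["scale mean"] = sum(ratings) / n_items
    return scores


if __name__ == "__main__":
    # Made-up ratings for the six ID items, for illustration only.
    print(score_scale("ID", [5, 3, 3, 4, 3, 3]))
```

Keeping the three scales in separate dictionary entries mirrors the statement that each disability scale was interpreted separately.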

Translation
Given the nature, education and cultural background (USA, Israel and Latvia) of the authors of the SE-PETE-D questionnaire and the number of language versions into which the survey has already been translated, we consider this tool suitable for use across various cultural backgrounds. For translation we followed the back-translation procedure described in Banville, Desrosiers, and Genet-Volet (2000), a technique based on the work of Brislin (1970, 1986). This technique is recommended for its capacity to reflect potential cultural differences between the culture of the original language and new cultural settings. The main advantage of back-translation is that it gives translators some control over the instrument development stage, since they can examine the original and back-translated versions and make inferences about the quality of the translation (Brislin, 1986). In this study, the four translators (A, B, C and D) were bilingual researchers. First, two translators (A and B) translated the original English version into Czech. Having two persons translate in parallel avoids the bias that a single person might have. When translators A and B had finished, they compared their versions. Where differences were found, the translators discussed them to arrive at a consensus. The Czech version resulting from this collaborative work was then given to two other bilingual persons in the field of adapted physical activity (C and D), who translated the instrument back into English. Neither C nor D knew the original version. The next step consisted of evaluating the versions by C and D and comparing each statement with the original version to determine similarities. The evaluation was done by a three-person committee, whose purpose was to prevent possible biases of a single researcher. If the meaning of the back-translated statement was the same as the original, the translated statement was kept.
All 25 statements were closely analyzed by the committee. In our study, no statement from C or D was identical to the original. Back-translation first revealed a bias in the first part of all 25 statements, which differed from the original version. In the back-translated version, the first part of every statement was formulated as "Are you confident that you are able to…?" (a question answered yes or no), whereas in the original version it was "How confident are you in your ability to…?" (a question answered with one number on the 1-5 scale). The committee modified the first part of all 25 statements in the Czech version accordingly. After these changes, the statements in the back-translated version were comparable with the original version ("How confident are you in your ability to…?").
The committee then determined that 17 statements (68%) had the same meaning. The 8 remaining statements were closely analyzed by the committee, and further modifications were made to the Czech statements so that the meaning of the Czech and original statements would be the same while respecting Czech syntax. Multiple versions of the problematic statements were provided to the committee. When the committee was satisfied with all statements, a final experimental version was developed. Relevant terms were adapted from US culture to Czech culture; e.g., the original version referred to 9th grade at high school and we changed it to 9th grade at secondary school, because in the Czech school system the 9th grade is at secondary school.

Experts' reviews
The experts' reviews were the next step after the back-translation procedure. Two Czech experts in kinanthropology and adapted physical activity performed this step. They were asked about the clarity, conciseness and terminological precision of the SE-PETE-D. The experts followed a content validity protocol, evaluating the suitability of every statement for assessing self-efficacy in relation to inclusive physical education. They were asked to evaluate each statement and also to provide feedback on the completed questionnaire.

Verification of item clarity
Fourteen master's students (11 females, 3 males) of APE (Faculty of Physical Culture, Palacký University Olomouc) verified the clarity of the items and of the whole text. The time required to complete the questionnaire was also monitored. The results of this verification were presented by Baloun, Kudláček, and Ješina (2013).

Participants
We determined the minimal sample size in accordance with the recommendation of Anderson and Bourke (in Gavora, 2010): 10 participants per item are required to assess the validity and reliability of a questionnaire. This equals 250 participants (25 items in the SE-PETE-D). Data for testing reliability and validity were collected from 304 participants. The responses of 52 participants were eliminated from the analysis due to incomplete data. The total sample for validation of the Czech version was therefore 252 participants (101 females, 151 males). The male participants' ages ranged from 19 to 49 years, with a mean of 24.28 ± 4.26 years. The female participants' ages ranged from 19 to 43 years, with a mean of 23.57 ± 4.31 years. Of all the participants, 180 were regular students and 72 were distance students.
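The sample-size reasoning above is simple arithmetic and can be checked with a few lines of Python. The function and variable names are hypothetical; the figures are the ones reported in the text.

```python
# Rule of thumb attributed to Anderson and Bourke (in Gavora, 2010):
# 10 participants per questionnaire item.
PARTICIPANTS_PER_ITEM = 10


def minimal_sample_size(n_items, per_item=PARTICIPANTS_PER_ITEM):
    """Hypothetical helper returning the minimal sample size for n_items."""
    return n_items * per_item


n_items = 25             # self-efficacy items in the SE-PETE-D
collected = 304          # participants who returned the questionnaire
excluded_incomplete = 52  # responses removed for incomplete data

required = minimal_sample_size(n_items)
retained = collected - excluded_incomplete

if __name__ == "__main__":
    print(f"required: {required}, retained: {retained}")
    print("rule satisfied:", retained >= required)
```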
The participants were PE students from Czech public universities: 84 students from the Faculty of Physical Culture (Palacký University Olomouc), 63 students from the Faculty of Sport Studies (Masaryk University in Brno), 39 students from the Faculty of Physical Education and Sport (Charles University in Prague), 46 students from the Faculty of Education (University of West Bohemia in Pilsen) and 20 students from the Faculty of Education (Jan Evangelista Purkyně University in Ústí nad Labem). The data collection for this research was conducted in 2013. Participants signed informed consent. This study was approved by the Ethics Committee of the Faculty of Physical Culture, Palacký University Olomouc (no. 3/2013).
The distribution of participants across years of study was 33 in the first year (Bachelor's), 61 in the second year (Bachelor's), 34 in the third year (Bachelor's), 61 in the fourth year (1st year of Master's) and 63 in the fifth year (2nd year of Master's) or above. One hundred and fifty-seven participants reported having attended an APE course and 95 reported not having attended any APE course. The sample comprised 77 students of the PE teacher education program, 111 students of PE and sport, 40 students of adapted physical education and 24 students of adapted physical activity. For the Czech version, 3 items were added to the demographic questions part: sex, current university, and study program and field of study.

Data analysis
We assessed internal consistency using Cronbach's alpha coefficient, which was evaluated within each subscale.
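For readers unfamiliar with the coefficient, the sketch below shows one common way to compute Cronbach's alpha for a single subscale from raw item ratings, using the standard formula based on item variances and the variance of the total score. The function name is hypothetical, and this is not the code used in the study, whose analyses were run in SPSS and AMOS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: list of respondents, each a list of item ratings
    (same number of items for every respondent, no missing values).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the summed score across respondents.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up ratings from four respondents on three items.
    toy = [[1, 2, 1], [2, 3, 3], [4, 4, 5], [5, 5, 4]]
    print(round(cronbach_alpha(toy), 3))
```

Applying the function once per scale (ID, PD, VI) corresponds to evaluating the coefficient within each subscale rather than across all 25 items.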
A test-retest procedure was used to examine the stability of each subscale. The test-retest sample consisted of 17 master's students of APA or APE (all female, mean age 23.82 years). The interval between test and retest was 14 days. The test-retest reliability of each subscale was assessed using Spearman's rank correlation.
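Spearman's rank correlation can be sketched as a Pearson correlation computed on ranks, with tied scores receiving their average rank. The example below is a plain-Python illustration with hypothetical function names and made-up scores; it does not reproduce the study's data or its SPSS output.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(test, retest):
    """Spearman's rho between paired test and retest scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    rx, ry = average_ranks(test), average_ranks(retest)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up subscale scores of five students at test and retest.
    print(round(spearman_rho([18, 22, 25, 20, 27], [19, 21, 27, 22, 26]), 2))
```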
The factor structure and construct validity were assessed by confirmatory factor analysis (CFA), based on the factor structure reported by Block et al. (2013). CFA can be used to test hypotheses and provides rather more in the way of inferential statistics. Typically, a CFA is used to compare alternative factor structures using appropriate measures and samples (Giles, 2000). The CFA procedures were performed using structural equation modeling (SEM). SEM is a set of techniques for testing a theory by examining correlations, covariances, and even mean differences between sets of dependent and independent variables of all shapes and sizes. SEM can be illustrated using an elaborate and sophisticated form of box-and-arrow model known as a path diagram. Unlike ANOVA, SEM is not a fixed method (Giles, 2000).
The goodness of fit of each model was assessed using the chi-square test, normed fit index (NFI), comparative fit index (CFI) and root mean square error of approximation (RMSEA), in accordance with the procedure used by Block et al. (2013). A non-significant chi-square result at the .05 threshold is considered to indicate acceptable model fit (Hooper, Coughlan, & Mullen, 2008). For the NFI and CFI, values greater than .90 or .95 are considered to indicate acceptable model fit (Block et al., 2013; Giles, 2000; Hooper et al., 2008). RMSEA values below .05 are considered to reflect good fit, values of .05-.10 moderate fit, and values greater than .10 bad fit (Hooper et al., 2008). A confidence interval is generally reported in conjunction with the RMSEA; in a well-fitting model its lower limit is close to 0 while its upper limit should be less than .08. We used SPSS software (Version 20.0; IBM Corp., Armonk, NY, USA) and AMOS (Version 23; IBM Corp., Armonk, NY, USA) to assess validity and reliability and to compute descriptive statistics.

Results
The findings of the study are reported in terms of the descriptive analysis, the reliability analysis and CFA with structural modeling and goodness of fit indices.

Reliability evidence
The descriptive statistics (mean and standard deviation) of the instrument's items and the scales' Cronbach's reliability coefficients are presented in Table 1. Cronbach's alpha reliability for all items in each of the scales was good in two cases (α = .87 for the PD scale, α = .90 for the VI scale) and acceptable for the ID scale (α = .76). The descriptive statistics (mean and standard deviation) of the ID, PD and VI scales according to year of study are presented in Table 2. The Spearman correlation coefficient for assessing test-retest reliability ranged from .53 to .78 (r = .78 for the ID scale, r = .53 for the PD scale and r = .69 for the VI scale; Table 3).

Construct validity (CFA)
CFA was performed for the models of Block et al. (2013), which were found by EFA and subsequently tested by CFA. The results of the CFA and the path diagrams can be seen in Figures 1, 2 and 3. To improve the model-data fit, we consulted modification indices, which suggested that several error terms be allowed to correlate. These correlated variables share some content, for example instruction in items 5 and 6 (ID scale). All items had factor loadings lower than .50 and we did not delete any items. The CFA of the ID subscale was based on a 6-item, two-factor model. The chi-square statistic, χ² = 6.89 (df = 7), was not significant (p = .44) and all paths were significant in the two-factor model. The other fit indices indicated that the two-factor model had acceptable fit: CFI = 1.00, NFI = .98 and RMSEA < .001. The CFA of the PD subscale was based on a 10-item, three-factor model. The chi-square statistic, χ² = 78.61 (df = 30), was highly significant (p < .001) and all paths were significant in the three-factor model. The other fit indices indicated that the three-factor model had acceptable fit: CFI = .96, NFI = .94 and RMSEA = .08. The CFA of the VI subscale was based on a 9-item, two-factor model. The chi-square statistic, χ² = 53.88 (df = 20), was highly significant (p < .001) and all paths were significant in the two-factor model. The other fit indices indicated that the two-factor model had acceptable fit: CFI = .97, NFI = .96 and RMSEA = .08.
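The reported RMSEA values can be cross-checked from the chi-square statistics, their degrees of freedom and the sample size (N = 252). The sketch below uses a common point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))); some programs divide by N instead of N − 1, so the exact formula applied by AMOS in this study is an assumption here. The function names are hypothetical, and the verbal labels follow the cutoffs cited from Hooper et al. (2008).

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """RMSEA point estimate; assumes the (N - 1) variant of the formula."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def rmsea_label(value):
    """Verbal label using the cutoffs cited from Hooper et al. (2008)."""
    if value < 0.05:
        return "good"
    if value <= 0.10:
        return "moderate"
    return "bad"


N = 252  # validation sample of the Czech version
# (chi-square, df) as reported for each subscale model.
REPORTED = {"ID": (6.89, 7), "PD": (78.61, 30), "VI": (53.88, 20)}

if __name__ == "__main__":
    for scale, (chi2, df) in REPORTED.items():
        value = rmsea(chi2, df, N)
        print(f"{scale}: RMSEA = {value:.3f} ({rmsea_label(value)} fit)")
```

Under this formula the ID model yields an RMSEA of zero because its chi-square is smaller than its degrees of freedom, and the PD and VI models both round to .08, in line with the values reported above.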

Discussion
The purpose of this study was to determine the validity and reliability of the disability-specific self-efficacy instrument SE-PETE-D in a Czech setting. As reported in the Methods section, we computed Cronbach's alpha coefficient and test-retest correlations to assess reliability and conducted CFA to assess validity. Cronbach's alpha reached good values (α = .87 for the PD scale, α = .90 for the VI scale, α = .76 for the ID scale), given that Nunnally (in Panayides, 2013) recommends a reliability coefficient of .70 or higher. According to a stricter criterion (Coolican, 1999), high values of Cronbach's alpha range from around .75 up to 1, and our results met this criterion.
The test-retest results showed that two subscales (ID and VI) obtained satisfactory coefficients (r > .68). The correlation coefficient for the PD scale (r = .53) is lower than the recommended .60 (Vallerand in Banville et al., 2000). However, test-retest reliability requires additional research, because the sample consisted of only 17 female master's students, whereas a minimum of 20 participants is recommended for more precise data (Banville et al., 2000). Findings from the CFA of the SE-PETE-D in the Czech sample indicated that Czech students exhibit the same factor structure as the larger American sample tested by Block et al. (2013).
As in Block et al. (2013), the chi-square results were significant in all models except for ID (p = .44), while the other goodness-of-fit measures demonstrated acceptable model fit. Across the three evaluated models, only the NFI for the PD scale (.94) did not exceed the strictest cutoff criterion of .95 referred to in Hooper et al. (2008). Taliaferro et al. (2015) noted that the factor structure of the self-efficacy scale in their study differed from the previous findings of Block et al. (2013), who proposed two or three factors in each of the ID, PD, and VI subscales. Taliaferro et al. (2015) found a unidimensional factor structure in the ID, PD, and VI subscales. According to Osborne and Fitzpatrick (2012), the replication of factor structure is problematic, and when the same model is applied to a new sample, the model is rarely a good fit. Taliaferro et al. (2015) hypothesized that the factor structure of the SE-PETE-D may not be stable or generalizable.
We did not delete any items, because the factor loadings had appropriate values. When factor loadings are lower than .30, the researcher should eliminate the given items from the test. When working with a multifactor test, the researcher accepts only items that load highly on one factor and have loadings close to zero on the other factors (Urbánek, Denglerová, & Širůček, 2011). Results from the study of Jovanović et al. (2014) showed significant differences among students from three universities in their level of self-efficacy toward teaching students with ID and PD in GPE classes. Differences between genders were tested by a t-test for independent samples. No significant differences were found between genders (Jovanović et al., 2014). The Czech version of the SE-PETE-D (SE-PETE-D-CZ) will be used to study the impact of APA training programs (e.g. workshops, courses and training) on the self-efficacy of general physical educators or students of GPE to include students with disabilities in the GPE setting. Previous research has found that one-day workshops grounded in social cognitive theory did not have a significant positive impact on physical educators' self-efficacy beliefs toward inclusion (Taliaferro & Harris, 2014).
These preliminary results suggest that the SE-PETE-D-CZ is an appropriate instrument for measuring self-efficacy toward including students with intellectual disabilities, physical disabilities and visual impairment in PE frameworks. However, instrument development and the establishment of validity and reliability are an ongoing process. The first limitation of our study was the use of the 5-point Likert scale. Preston and Colman (2000) argue that scales with 7, 9 or 10 points are more appropriate for determining reliability, stability and internal consistency. The second limitation of the current study is that scale stability (test-retest reliability) was assessed on a small sample of participants (N = 17). The third limitation of the study is the relatively low number of master's students in the sample of participants.

Conclusion
Cronbach's reliability analyses, performed for each subscale, confirmed their internal consistency. Test-retest reliability showed satisfactory Spearman correlation coefficients for the ID and VI scales. CFA, also performed for each subscale, confirmed two factors in the ID scale (F1 peers' instruction, F2 staying on task), three factors in the PD scale (F1 specific adaptations, F2 peers' instruction, F3 safety) and two factors in the VI scale (F1 specific adaptations, F2 peers' instruction). Moreover, the CFA demonstrated acceptable measurement model fit, and we suggest that there is sufficient evidence to sustain the construct validity of the scale. The results therefore provide justification for the use of the SE-PETE-D in its Czech version. The individual subscales, with their items confirmed, show sufficient evidence of construct validity to be used to explore the impact of different programs on the self-efficacy of PE teacher education students, but the validation of the SE-PETE-D-CZ in the Czech setting must continue. In conclusion, the SE-PETE-D is a promising measure for assessing PETE majors' self-efficacy toward including students with intellectual disabilities, physical disabilities and visual impairment in PE frameworks, including teaching skills, playing sport games and performing fitness activities.