1. Background

The modern educational reform movement has continually promoted high teaching quality since the 1980s. In the education system of Taiwan, a reliable evaluation system for probationary teachers has been developed. First, since the approval of the Teachers Cultivation Law in Taiwan by legislators in 1994, both colleges and general universities have offered education programs for teaching mathematics. Second, during the teachers’ probationary period, both an experienced math teacher and a university professor play the role of supervisors and mentors. However, not all experienced school teachers have mentor training experience. Furthermore, the opinions about the criteria for assessing probationary teachers may differ between school teachers and university professors. Third, after 2002, the probationary period for new teachers in secondary schools in Taiwan was changed from 1 year (two semesters) to half a year (one semester). Therefore, identifying ways to make the half year of apprenticeship as effective as a 1-year apprenticeship is crucial. Fourth, the vision of the 12-year compulsory education is “Making All Students More Successful,” which emphasizes that students are active learners. Core competencies refer to the knowledge, ability, and attitude that an individual must possess to adapt to daily life situations and face future challenges (Ministry of Education, 2014).

Based on the aforementioned concerns, establishing an appropriate quality assessment system that not only ensures that probationary teachers’ grades are objective and consistent but also maintains the quality of all future mathematics teachers is essential. This is the rationale for developing a set of indicators to serve as a fair standard for evaluating each probationary math teacher in secondary schools. Accordingly, this study applied a modified Delphi method to establish a set of feasible and practical indicators for the evaluation of the teaching competence of probationary mathematics teachers in Taiwan. Moreover, the proposed modified Delphi method was tested and verified.

2. Review of Literature on the Delphi Method

The Delphi technique is a process of collecting and refining the opinions of experts in order to reach a consensus on a particular topic of present or future action, especially topics for which there is little knowledge or certainty (Dalkey & Helmer, 1963; Fischer, 1978; Hardy et al., 2004; Powell, 2003). The Delphi method has been widely applied in education, business, industry, health care, and many other fields worldwide.

Although the Delphi method is notable for its democratic, structured approach and participant anonymity, little is known about the minimum level of agreement required to achieve a consensus, thus leaving the technique open to criticism (Goodman, 1987; Keeney et al., 2001; Osborne et al., 2003; Powell, 2003; Reid, 1988; Rowe et al., 1991; Williams & Webb, 1994). Concerns regarding this technique include how to achieve a consensus for an indicator and what is the minimum level of agreement to reach a consensus in a given situation.

Let us take a traditional Delphi consensus criterion as an example: a mean higher than 4 with a standard deviation less than 1. Table 1 presents the results for three indicators evaluated by 25 experts. The indicators were rated on a 5-point Likert scale (ranging from 1 = strongly disagree to 5 = strongly agree). After the feedback from these experts was analyzed, all three indicators were found to meet the traditional criterion, but the degree of consensus was not consistent across indicators. For instance, 10 experts rated an indicator as “5,” 5 experts rated it as “4,” and 10 experts rated it as “3.” Scores of 5 and 3 each accounted for 40% of the ratings in the expert group, so the panel was split rather than in agreement. Therefore, using the criterion “a mean higher than 4 with a standard deviation less than 1” to identify the degree of consensus does not seem suitable in this case.
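The split in this example can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and it assumes the sample standard deviation (n - 1 denominator), which the text does not specify.

```python
from statistics import mean, stdev

# Ratings from the example in the text: 10 experts chose 5, 5 chose 4, 10 chose 3.
ratings = [5] * 10 + [4] * 5 + [3] * 10

avg = mean(ratings)                          # arithmetic mean of the 25 ratings
sd = stdev(ratings)                          # sample SD (n - 1 denominator; an assumption)
share_5 = ratings.count(5) / len(ratings)    # proportion of experts choosing 5
share_3 = ratings.count(3) / len(ratings)    # proportion of experts choosing 3

print(f"mean = {avg:.2f}, SD = {sd:.2f}")
print(f"share of 5s = {share_5:.0%}, share of 3s = {share_3:.0%}")
```

Under these assumptions the mean is 4.00, right at the traditional threshold, and the standard deviation is about 0.91, below 1, while 40% of the panel chose 5 and 40% chose 3. Summary statistics alone therefore do not reveal the polarization.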

3. Methods

Step 1: Set the target values

Because the expert panel in this study comprised 25 experts and the topic was drawn from the education field, we adopted a stricter criterion for reaching a consensus. After discussion with other researchers, we decided to use a 5-point Likert scale and set the following target values: (1) for the percentage, more than 65% of respondents should choose “5,” with no one choosing “1” or “2”; (2) for the mean, the target was set at 4.36, the value obtained when 17 experts choose “5” and 8 experts choose “3”; (3) for the standard deviation, the target was set at 0.77, the value obtained when 17 experts choose “5,” 4 experts choose “4,” and the other 4 experts choose “3.”
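The target values can be reproduced from the rating patterns described above. This is a minimal sketch with illustrative names; it assumes the sample standard deviation (n - 1 denominator), which is the convention that yields 0.77 for the stated pattern.

```python
from statistics import mean, stdev

# Reference rating patterns described in the text (25 experts each).
mean_pattern = [5] * 17 + [3] * 8            # pattern that defines the target mean
sd_pattern = [5] * 17 + [4] * 4 + [3] * 4    # pattern that defines the target SD

target_mean = round(mean(mean_pattern), 2)   # 109 / 25
target_sd = round(stdev(sd_pattern), 2)      # sample SD (n - 1), an assumption

def meets_percentage_target(ratings, min_share=0.65):
    """Target (1): more than 65% choose 5 and nobody chooses 1 or 2."""
    share_5 = ratings.count(5) / len(ratings)
    return share_5 > min_share and min(ratings) >= 3

print(target_mean, target_sd)
print(meets_percentage_target(sd_pattern))   # 17 of 25 experts (68%) choose 5
```

With 25 experts, the 65% threshold means that at least 17 experts must choose “5,” which is consistent with the 17-expert patterns used for the other two targets.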

Step 2: Compare the observation results with the target values

After determining the target values for the percentage, mean, and standard deviation, we apply statistical hypothesis testing to judge whether a consensus has been reached, as follows.

(1) Testing of the Standard Deviation

First, we compare the sample standard deviation for each indicator with the target value through statistical hypothesis testing. If the null hypothesis is not rejected, it indicates that the expert panel’s opinions are not consistent, and that the indicators must be modified and experts must be consulted again for further analysis until a consensus is reached. Otherwise, if the null hypothesis is rejected, it indicates that the experts have reached a consensus, and that further input from the expert panel is not needed. We then proceed with testing the mean of the indicator.
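The text does not name the test statistic. One common choice for comparing a sample standard deviation with a target is the chi-square test for a single variance, sketched below under that assumption. The hypotheses (H0: sigma >= 0.77 against H1: sigma < 0.77), the 5% significance level, and all function names are illustrative assumptions rather than the study's documented procedure.

```python
from math import exp, lgamma, log
from statistics import variance

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:      # stop once further terms are negligible
        term *= half_x / (a + k)
        total += term
        k += 1
    return total * exp(a * log(half_x) - half_x - lgamma(a))

def sd_test(ratings, target_sd=0.77, alpha=0.05):
    """Left-tailed test of H0: sigma >= target_sd against H1: sigma < target_sd.

    Returns (statistic, p_value, reject_h0); rejecting H0 is read as consensus.
    """
    n = len(ratings)
    statistic = (n - 1) * variance(ratings) / target_sd ** 2
    p_value = chi2_cdf(statistic, n - 1)
    return statistic, p_value, p_value < alpha

agreeing = [5] * 22 + [4] * 3                  # hypothetical tightly clustered panel
polarized = [5] * 10 + [4] * 5 + [3] * 10      # split panel from the earlier example
print(sd_test(agreeing)[2], sd_test(polarized)[2])
```

In this sketch the tightly clustered panel rejects the null hypothesis and so counts as a consensus, whereas the split panel does not and would go back to the experts for another round.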

(2) Testing of the Mean

After the standard deviation is tested, the mean of the indicator is assessed to validate its importance.

If the null hypothesis is rejected, it indicates that the expert panel has reached a consensus on the importance of the indicator, and that further input from the expert panel is not needed. Otherwise, if the null hypothesis is not rejected, the indicator must be modified and the experts must be consulted again for further analysis until a consensus is reached. If the experts are consulted again, the standard deviation testing must also be performed again.
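The mean test can be sketched in the same spirit. The text does not specify the test, so the example below assumes a one-sample, right-tailed t-test of H0: mu <= 4.36 against H1: mu > 4.36 at the 5% level. The critical value 1.711 is the standard table value for 24 degrees of freedom and applies only to a 25-member panel; all names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# One-sided 5% critical value of Student's t with 24 degrees of freedom
# (standard table value; valid only for a panel of 25 experts).
T_CRIT_DF24 = 1.711

def mean_test(ratings, target_mean=4.36, t_crit=T_CRIT_DF24):
    """Right-tailed test of H0: mu <= target_mean against H1: mu > target_mean.

    Returns (t_statistic, reject_h0); rejecting H0 is read as the panel
    agreeing that the indicator is important.
    """
    n = len(ratings)
    avg, sd = mean(ratings), stdev(ratings)
    if sd == 0:                      # unanimous panel: the t statistic is undefined
        above = avg > target_mean
        return (float("inf") if above else float("-inf")), above
    t_stat = (avg - target_mean) / (sd / sqrt(n))
    return t_stat, t_stat > t_crit

agreeing = [5] * 22 + [4] * 3        # hypothetical panel with a high mean
borderline = [5] * 17 + [3] * 8      # pattern whose mean equals the target
print(mean_test(agreeing))
print(mean_test(borderline))
```

Here the high-mean panel rejects the null hypothesis, while the panel whose mean only equals the target does not, so that indicator would be revised and sent back for both tests.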

To sum up, this modified method is more stringent than the traditional Delphi method and identifies the final indicators more efficiently, because indicators that meet the criteria can be retained after only one or two rounds.

4. Results

This study identified a set of 4 aspects, 7 vectors, and 34 indicators as essential components of the teaching competence of middle school student teachers.

5. Conclusion

(1) A set of indicators was established in this study by considering aspects such as mathematics teaching, class management, evaluation, and attitude. These indicators emphasized “perceptible,” “hands-on learning,” “motivation,” and “technology integration,” which corresponded to the core competencies of the 12-year compulsory education.

(2) The proposed modified Delphi method can be used to effectively and accurately judge the expert panel’s consensus. For example, the expert panel reached a consensus on 16 indicators in round 2; therefore, the experts only had to judge the remaining indicators in round 3.