The previous chapter dealt with the effects of SR and the other variables on mean token durations. This chapter will address the first research question of this thesis; specifically, which variables had an effect on whether or not a vowel was devoiced.
Recall from Ch. 2 that it was argued, following the analysis of Tsuchida (1997), that devoicing is actually two processes: a phonological one involving relinking of a preceding obstruent's [+spread glottis] to a following high vowel, with or without a temporal shift in the alignment of the feature; and a phonetic process that operates in a gradient fashion when phonological devoicing does not apply. Tsuchida's analysis was based on examination of glottal spreading and muscle activity traces; additional evidence for such a shift, drawn from productions in the current data set, supported that analysis, with the modification that not all speakers display a temporal shift in the alignment of the glottal spreading gesture--one participant in this study lengthened the spreading gesture instead.
The outline of this chapter is as follows. §6.2 presents discussion of the statistical distribution of data from the current data set that lends further support for positing both phonological and phonetic loss of voicing of high vowels. §6.3 reviews the design of the statistical models used to test the effects of selected variables on the frequency of devoicing, while §6.4 presents the results of those tests. Finally, §6.5 summarizes this chapter.
This section provides further evidence to support the positing of two vowel devoicing processes in Japanese in the form of statistical examination of the data.
As noted in §2.2.3, one of the reasons gestural overlap was applied to HVD in Jun & Beckman (1993) was to account for the wide range of voicing durations seen at a given SR. This wide range of values, or gradiency of values (Beckman 1994), can also be seen in the current data set in a plot of token durations against vowel voicing durations, presented below.
SR was judged in the current study as a function of the token duration (see §4.6.3.1). Within a given interval of token durations, a wide range of voicing duration values can be seen. As an example, the voicing durations and token durations of each participant's productions of the token chichi 'father' containing a voiced 1st mora vowel are plotted against each other in a scattergram. Plotting one variable against the other in this fashion allows changes in one variable to be seen in the context of changes in the other, and the relationship between them can be both visually inspected and quantitatively measured.
The first plot gives the data points for all voiced 1st mora vowels from the token chichi, while the 2nd gives only those data points where the 2nd mora vowel was not devoiced; i.e. both vowels were voiced, disallowing large changes in overall token durations due to devoicing of the 2nd vowel. Participant SM is not included in these plots since this participant did not produce any voiced 1st mora vowels for this token.

Figure 6.1 Scattergram of token duration and voicing duration for voiced 1st mora vowels of token chichi 'father'; 9 participants; all productions.

Figure 6.2 Scattergram of token duration and voicing duration for voiced 1st mora vowels of token chichi 'father'; 9 participants; no devoiced 2nd mora vowels.
As can be seen, for token durations between 267 ms and 572 ms, the duration of the f0 activity (voicing duration) for the first vowel of chichi varied from 20 ms to 157 ms.
Note also that the variation in voicing duration is not dependent on the variation in token duration. Although there are many longer voicing duration values at the higher values of token duration, and the minimum voicing duration values increase as token duration increases, there is still a great deal of variation in the voicing duration values (e.g. for the token duration range of 450 ms to 500 ms, voicing duration varies from appr. 35 to 115 ms). This is true even in the second plot, where tokens containing a devoiced 2nd vowel which would have affected overall token durations have been excluded. The wide range of observed values indicates that adjustments in voicing duration are being made separately from global adjustments in token duration. Gestural overlap of the devoicing gesture for the obstruent preceding the vowels within a given range of freedom provides an explanation for both the variation of voicing durations and its independence of token duration. However, further research is needed to determine the link between vowel durations (as evidenced by both voicing and formant structure activity) and duration of voicing activity.
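The spread of voicing durations within a token duration interval, as described above, can be checked by selecting all productions in that interval and reporting the smallest and largest voicing duration. The following is a minimal Python sketch under stated assumptions: the (token duration, voicing duration) pairs are invented for illustration and are not values from the current data set, and the function name voicing_range is hypothetical.

```python
def voicing_range(pairs, lo_ms, hi_ms):
    """Return (min, max) voicing duration for tokens whose duration
    falls in the closed interval [lo_ms, hi_ms]; None if no tokens do."""
    selected = [v for t, v in pairs if lo_ms <= t <= hi_ms]
    if not selected:
        return None
    return min(selected), max(selected)

# Hypothetical (token duration, voicing duration) pairs in ms; not thesis data.
pairs = [(455, 38), (470, 112), (480, 61), (495, 90), (520, 75)]
spread = voicing_range(pairs, 450, 500)
```

A wide gap between the two returned values within a narrow token duration interval is the pattern the scattergrams are taken to show.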
Statistical examination of the voicing durations found in the current data set that lend support to two devoicing processes will now be presented.
Further support for the characterization of devoicing of high vowels as phonological can be found in both visual and statistical examination of the current data set. This examination also provides evidence for gestural overlap of voicing instructions of high vowels in all devoicing environments, contrary to the restriction of overlap to high vowels between fricatives posited in Tsuchida (1997). (Recall that in that work it was found that only between two voiceless fricatives were two distinct muscle activations and associated glottal spreadings observed; in all other environments, one muscle activation accompanied by one temporally shifted glottal spreading was found.)
In order to examine the relationship between voicing durations and SR, the two variables voicing duration and token duration are presented together. Scattergrams of voicing duration vs. token duration for this data set are given below. In each of the following scattergrams, token duration as an indicator of SR is along the x-axis (token duration), while vowel voicing duration for each vowel is along the y-axis (voicing duration) (note 6-1). The 1st plot gives the scattergram for all vowels, while the next 3 give the scattergrams for the fast, normal and slow SR vowels, respectively.

Figure 6.3 Scattergram plot of vowel voicing duration vs. token duration, all SRs (total n = 3598; devoiced n = 1696; voiced n = 1902).

Figure 6.4 Scattergram plot of vowel voicing duration vs. token duration, fast SR only (total n = 1200; devoiced n = 702; voiced n = 498).

Figure 6.5 Scattergram plot of vowel voicing duration vs. token duration, normal SR only (total n = 1200; devoiced n = 562; voiced n = 638).

Figure 6.6 Scattergram plot of vowel voicing duration vs. token duration, slow SR only (total n = 1198; devoiced n = 432; voiced n = 766).
It can be seen that the distribution of the data in each of the 3 SR data sets is roughly the same: there is a main concentration of data points where voicing duration > 0 and another concentration of data points along the voicing duration = 0 axis. The points on the voicing duration = 0 axis can be seen to extend across the entire range of token duration values for which there is a non-zero value for voicing duration for all 3 SRs.
The strength of the relationship between any two variables shown in this type of plot can be determined by examining three things: 1) the distribution of the data points; 2) the placement of what is known as the regression line; and 3) the size of the Pearson correlation coefficient (r), a measure of how interdependent two variables are. Examination of all 3 of the current data sets supports the separation of the voicing duration = 0 data from the voicing duration > 0 data.
One indication of the relationship between the two variables in this type of plot can be seen in the way the data points are distributed about the plot. In a case such as this, where token and voicing durations are being compared, the relationship between the two variables is expected to be linear. That is, a given increase in token duration is expected to produce a given increase in voicing duration.

Figure 6.7 Representation of the linear relationship between voicing duration and token duration.
In general, the stronger a linear relationship is between two variables, the more tightly the data points will be clustered about some centrally-located line. A maximally strong linear relationship (e.g. y = 3x) would result in all data points falling along a single line; less strong linear relationships would result in the data points ranging over a greater area; no relationship would result in data points spread out throughout the plot. This is represented below, where the diagonal line and shaded areas represent the potential spread of data points about the field defined by the two variables (Hatch & Lazaraton 1991: 430).

Figure 6.8 Idealized data distribution in a scattergram due to linear relationships of various strengths between two variables.
In a case such as this study, where there is a positive correlation between the variables (i.e. voicing duration increases as token duration does), then the band of data points should rise across the plot as in Figure 6.8 b above. For the current data set, this can be seen for the main cluster of data points in Figures 6.3 to 6.6; although somewhat loosely grouped, the main cluster of data points does rise diagonally across all the plots. This confirms a basically linear relationship between the two variables.
It can also be seen that data points corresponding to devoiced vowels (i.e. those data points on the line voicing duration = 0) are separate from the main cluster of data points; they stretch out along almost the entire token duration axis, and, as noted earlier, across the entire range of token duration values where voicing duration is non-zero for each of the 3 plots for each SR.
From a statistical viewpoint, this immediately indicates that these two groups of data points, the zero values and the non-zero values, should be analyzed separately (Abacus Concepts 1994: 318). This is because the zero values appear to be largely independent of token duration (i.e. voicing duration = 0 values occur from token duration appr. 140 ms to 680 ms) while the non-zero values appear to be generally dependent on SR (i.e. on average voicing duration increases along with token duration). Phonologically, the many data points lying along the length of the token duration axis indicate that vowels devoice independent of how fast or slow the SR is as judged by the duration of the token. This supports the contention that devoicing is no longer a strictly fast-speech phenomenon (Vance 1987; Beckman 1994; Varden 1997; Kondo 1997; Tsuchida 1997).
Simple observation of the distribution of the data points from the current data set, then, supports the position that two processes are responsible for the distribution of the data--one process responsible for the devoicing of vowels corresponding to the voicing duration = 0 data points, and another responsible for the main cluster of voicing duration > 0 data points rising diagonally across the plot.
Further confirmation of the relationship between the two variables comes from how well a regression line can be fitted to the data points.
A regression line can be thought of as the 'average' of all of the data points; the regression line is calculated so that the sum total of all of the distances from each data point to the regression line is minimized. It is an idealized central distribution line describing the linear relationship between the variable on the x-axis and the variable on the y-axis. Since it is calculated from the data points in a given plot, it is said to be fitted to the data of the plot; it is also known as a fitted regression line.
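As an illustration of how such a line is fitted, the following minimal Python sketch computes a slope and intercept. It assumes the ordinary least-squares criterion (minimizing squared vertical distances), which is the usual default in statistical packages; the text does not name the exact criterion used. The function name fit_line and the data values are hypothetical and are not taken from the current data set.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical token and voicing durations in ms; not thesis data.
token_ms = [300.0, 400.0, 500.0, 600.0]
voicing_ms = [40.0, 60.0, 80.0, 100.0]
slope, intercept = fit_line(token_ms, voicing_ms)
# Expected average change in voicing duration per 100 ms of token duration.
increase_per_100_ms = slope * 100
```

The slope is the quantity of interest here: it gives the average increase in voicing duration expected for a given increase in token duration.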
Again, in the current data, for any increase in token duration that we see, we expect a corresponding average increase in voicing duration as determined by the slope of the regression line.

Figure 6.9 Relationship between token duration increases and voicing duration increases as defined by the slope of the regression line.
A scattergram plot with a regression line fitted to the data is known as a regression plot. The regression plot for the current data set is given below.

Figure 6.10 Regression plot of vowel voicing duration vs. token duration, voiced and devoiced vowels (total n = 3598; devoiced n = 1696; voiced n = 1902).
The location of the fitted regression line with respect to a cluster of data points indicates how well the overall linear relationship described by the line accounts for distribution of data points in that cluster. If the fitted regression line is centrally located with respect to a cluster of data points, then it provides a good indication of the relationship between the two variables.
As can be seen above, the fitted regression line falls well below the main cluster of data points. This is due to the many data points along the voicing duration = 0 axis; the heavy concentration of data points falling on this axis draws the fitted regression line away from the center of the non-zero voicing value data points. Fitting the regression line to both groups of data points results in a line that describes neither group well. The regression line in this plot does not give a good indication of the expected linear relationship between the two variables.
However, the location of the regression line fitted to only the voicing duration > 0 data points does give a good description of the expected linear relationship between the two variables. This can be seen below, where the fitted regression line falls in the middle of the main cluster of data points.

Figure 6.11 Regression plot of voicing duration vs. token duration, voiced vowels only (n = 1902).
This same pattern can be seen in plots of each SR by itself, indicating that the non-zero productions at all three SRs were distributed according to the same mechanism.
This distribution of data points again indicates that the dependency of voicing duration on token duration is best described by taking into account only the voiced vowels (i.e. voicing duration > 0) and not including the devoiced vowels (i.e. voicing duration = 0). The extension of this is that the devoiced vowel durations are due to a different process than the voiced vowel durations.
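The pull that the zero values exert on the fitted line can be reproduced on a small invented data set. The sketch below is an illustration only: it assumes an ordinary least-squares fit, the function name fit_line is hypothetical, and the durations are not values from the current study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data in ms: four voiced vowels and four devoiced vowels
# (voicing duration = 0) spread over the same token durations.
token_ms = [300.0, 400.0, 500.0, 600.0, 300.0, 400.0, 500.0, 600.0]
voicing_ms = [40.0, 60.0, 80.0, 100.0, 0.0, 0.0, 0.0, 0.0]

# Fit to all vowels, then to the voiced vowels only.
pooled_slope, pooled_intercept = fit_line(token_ms, voicing_ms)
voiced = [(t, v) for t, v in zip(token_ms, voicing_ms) if v > 0]
voiced_slope, voiced_intercept = fit_line(
    [t for t, _ in voiced], [v for _, v in voiced]
)

# Count the voiced points that lie above the pooled line.
above_pooled = sum(
    1 for t, v in voiced if v > pooled_intercept + pooled_slope * t
)
```

In this toy case every voiced point lies above the pooled line, while the line fitted to the voiced vowels alone passes through their center, mirroring the contrast between Figures 6.10 and 6.11.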
In addition to simply observing the location of the fitted regression line with respect to the apparent middle of the data cluster(s), the spread of the data points about the common center line defined by the fitted regression line can be given a numerical value known as the Pearson correlation coefficient.
The Pearson correlation coefficient (r) is the ratio of the degree to which two variables change together to the degree to which they change separately (i.e. the covariance of X and Y divided by the product of the standard deviation of X and the standard deviation of Y). This ratio gives a measure of how closely correlated two variables are. The higher the correlation between the two variables, the tighter the data points in a regression plot like the one in Figure 6.11 above. A maximally strong linear relationship where all data points fall on a single line would have an r of 1 (Figure 6.8a above); if there is absolutely no correlation between the two variables, r will be 0 (Figure 6.8c above).
The sign of r indicates the direction of the correlation. If r is positive, there is a positive correlation between the two variables--the dependent variable will increase as the independent variable increases. If r is negative the correlation is inverse--i.e. the dependent variable will decrease in value as the independent variable increases.
The coefficient of determination for the plot in Figure 6.10 above of all data points is r = +.744 (note 6-2). This can be interpreted to mean that about three-fourths of the variation seen in voicing durations is caused by whatever is causing the variation in token durations, taken to be the participants' changes in SR. This value is well above the minimum significant value given in most statistical tables due to the large size of this data set (n = 3598); e.g. the minimum significant value in the table in Gravetter & Wallnau 1996 for the type of test run here (a two-tailed test at the .05 level of significance) for a sample of 100 data points is only |.195|.
However, a higher correlation between the voicing durations and token durations can be achieved by removing all of the devoiced vowels from the data. The Pearson correlation coefficient for the plot of only voiced vowels (Figure 6.11 above) is r = .960. This can be interpreted to mean that about 96% of the variation seen in the voicing durations of the voiced vowels can be attributed to whatever caused the variation in the token durations--again, taken to be the change in participants' SR.
This exceptionally high value for r confirms that the relationship between vowel voicing durations and token durations is best analyzed with all devoiced vowels removed from consideration. That is, taking the mechanism that causes full devoicing to be different from the mechanism that causes vowel voicing durations to vary with token durations is the analysis that best accounts for the distribution of the data.
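The effect of removing the zero values on r can be illustrated with a small invented data set. The following Python sketch computes r as the covariance divided by the product of the standard deviations; the function name pearson_r and all durations are hypothetical and are not values from the current study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of the standard
    deviations (the common 1/n factors cancel)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical durations in ms; the last four vowels are devoiced.
token_ms = [300, 400, 500, 600, 300, 400, 500, 600]
voicing_ms = [45, 55, 85, 95, 0, 0, 0, 0]

# Correlation over all vowels, then over the voiced vowels only.
r_all = pearson_r(token_ms, voicing_ms)
voiced = [(t, v) for t, v in zip(token_ms, voicing_ms) if v > 0]
r_voiced = pearson_r([t for t, _ in voiced], [v for _, v in voiced])
```

As in the thesis data, the correlation for the voiced vowels alone is higher than the correlation for all vowels, because the zero values add variation in voicing duration that does not track token duration.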
That voicing durations are dependent on token durations in a linear fashion that is more flexible at slower SRs for voiced vowels only is further supported by examining the data for each subset of data when the entire data set is divided by each SR, each mora, each participant, and each token separately. In each case, the same increase in r can be achieved by including only those vowels that were voiced--that is to say, regardless of which subset of the data one looks at, a better description of the relationship between vowel voicing and token durations is achieved by considering only the voiced vowels.
The following table lists the relevant regression values for voicing duration for all vowels, and for voiced vowels only, when the data is split by SR, mora, participant, and token. (As before, the one outlier at token duration = 867 ms has been excluded.)
Table 6.1 Regression summary for voicing duration vs. token duration; full data set and segregated by variables; all vowels and voiced vowels only.

In all cases a better description of the interdependence of token duration and voicing duration (i.e. how much vowel voicing durations are changing with changes in token durations) is provided by restricting the data to only the voiced vowels. Again, this strongly confirms that the voicing duration = 0 data should not be included in the analysis of the rest of the data where voicing duration > 0; these two data subsets are due to different processes.
This is taken to indicate that the voicing duration = 0 data (i.e. the devoiced vowels) are due to a phonological process of devoicing that operates at all SRs. The relationship between token duration and voicing duration of the remaining data where voicing duration > 0 is taken to be due to a phonetic process (i.e. variation of the strength and/or duration of the glottal spread associated with the preceding consonant). In addition to the global shortening of voicing durations as SR increases, the range of voicing duration values found at any one value of token duration on the scattergrams above indicates that the amount of overlapping of glottal instructions is somewhat variable. It remains for further research to determine whether this variability is due to lengthening of the glottal spreading gesture, shift in the temporal alignment of the gesture, or some combination of the two.
The following four considerations were discussed in §2.2.5 and §6.2.1 above:
1) devoicing of high vowels between plosives occurs even at slow SRs where there is ample time to initiate any existing articulatory instructions;
2) there is only one glottal muscle activation/spreading gesture when high vowels are devoiced, but two glottal muscle activations/spreadings when non-high vowels and high vowels between fricatives are produced (Tsuchida 1997);
3) observed voicing values are gradient for any given SR (i.e. token duration) for both high vowels (current data) and non-high vowels (Maekawa 1990); and
4) disregarding devoiced vowels in examining the relationship between token durations and voicing durations is justified by the distribution of the data.
Based on these considerations, it is concluded that both processes affecting voicing are required to account for the data observed in this and other studies. In summary, the data from the current study support two loss of vowel voicing processes:
1) a phonological process is responsible for the devoicing of high vowels between voiceless plosives (i.e. a spread of [+sprd glottis] from a preceding stop or fricative) which may occur at all SRs; and
2) a phonetic overlapping of voicing instructions by the glottal spread associated with the preceding consonant which increases with SR is responsible for a loss of voicing duration in all vowels; it is this phonetic mechanism that causes devoicing of non-high vowels (Maekawa 1990) and high vowels between fricatives at fast enough SRs (Tsuchida 1997).
Including a phonetic reduction of voicing durations in the analysis of the data for high vowels as well as non-high vowels leads to the prediction that there will be high vowels that will lose all their voicing due to overlapping of glottal gestures at high SRs. Indeed, this may account for the statistically significant increase in the number of devoiced vowels seen in the previous chapter in the discussion of the effect of SR on whether or not a vowel is devoiced.
In the last chapter ANCOVA models were used to determine the relative effects of the different variables on mean token durations. In this chapter, ANOVA (Analysis of Variance) models will be used to determine the relative effects of the variables on whether or not a vowel was devoiced. That is, it will be asked how much influence each variable had on whether or not a vowel would be completely devoiced and fall into the group of data points spread out on the y = 0 axis of Figures 6.3 to 6.6 and 6.10.
As in §5.3, the basic ANOVA model presented in this chapter was constructed by the statistical consultants who reviewed this project and constitutes the first stage of what was called a logistic regression model. The first stage checks for effects of the variables on whether or not a vowel was devoiced, a binary process; the second stage checks to see the effects of the variables on the voicing durations of only those vowels that were voiced. The second stage of the analysis will be presented in the next chapter.
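The division of the data into the two stages described above can be sketched as follows. This is a minimal illustration, assuming that a voicing duration of 0 marks a devoiced vowel; the function name split_two_stage and the durations are hypothetical and are not values from the current study.

```python
def split_two_stage(voicing_ms):
    """Stage 1: binary indicator (1 = voiced, 0 = devoiced) for every vowel.
    Stage 2: the voicing durations of the voiced vowels only."""
    voiced_flags = [1 if v > 0 else 0 for v in voicing_ms]
    voiced_durations = [v for v in voicing_ms if v > 0]
    return voiced_flags, voiced_durations

# Hypothetical voicing durations in ms (0 = fully devoiced); not thesis data.
voicing_ms = [0, 42, 0, 87, 65, 0]
flags, durations = split_two_stage(voicing_ms)
percent_devoiced = 100 * flags.count(0) / len(flags)
```

The binary indicator is the dependent variable of the first-stage models in this chapter; the list of non-zero durations is what the second stage would analyze.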
The basic ANOVA model constructed by the consultants, quite similar to the ANCOVA utilized in Chapter 5, is as follows:
voiced = SR * token * mora + participant * block + participant * repetition (block)
and
voiced = SR * token * mora + gender * block + gender * repetition (block)
where SR * token * mora is the combination of these 3 variables and all possible interactions among them (e.g. SR * token, SR * mora, etc.); participant * block is the interaction between these two variables as well as the variables by themselves; and participant * repetition (block) is the interaction between these two variables and the variables themselves, where repetition is nested within block. The major difference between these two models and those used in Chapter 5 is that these models do not contain a regressor, a continuous covariate included to check for its effects on the main dependent variable of interest. These models are therefore ANOVA rather than ANCOVA as in Chapter 5.
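To make explicit what "all possible interactions" covers for the three fixed variables, the following Python sketch expands a full-factorial term into its main effects and interactions. It is an illustration only: the function name expand_factorial is hypothetical, the nested term repetition (block) is not handled, and the sketch does not reproduce the statistical program used in this study.

```python
from itertools import combinations

def expand_factorial(factors):
    """Expand a full-factorial term such as A * B * C into its main
    effects and every interaction among them, lowest order first."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(" * ".join(combo))
    return terms

fixed_terms = expand_factorial(["SR", "token", "mora"])
```

For three variables this yields seven terms: three main effects, three 2-way interactions, and one 3-way interaction.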
As in the model in Chapter 5, SR, token and mora were grouped together since these are what are known as fixed effects, effects that are controlled in or defined by the experiment--an investigation is being made into the effects of these particular 3 SRs, these particular 10 tokens, and these particular two mora of two-mora words. On the other hand, participant and gender are combined with repetition and block because these are what are known as random effects, effects that are not specifically controlled by the experiment--an investigation is not being made into the effects of these particular 10 participants, but we wish to extend the results to the larger population of young Japanese speakers of the Tokyo dialect. Similarly, we are not interested in the effects of the 3 repetitions in each of these particular 2 repetition blocks, but we wish to extend the results to any number of such repetition blocks.
The results of the two ANOVAs are given below, again after non-significant variables and interactions were removed one at a time. Again, F-ratios are reported to two significant digits, and p-values are given as generated by the statistical program.
Table 6.2 Results of ANOVA including the variable gender.
Table 6.3 Results of ANOVA including the variable participant.

As can be seen, the results of the two models were virtually the same except for the size of the effect of gender and participant, evidenced in the different sizes of F-ratios.
The results are substantially different from those of the models in Chapter 5, however, in that the largest single effect by far was the effect of mora. While the effect of SR is also large enough to merit notice, the effect of mora was quite a bit larger. The variable mora also interacted with both SR and token; that is to say, the effect of mora on whether or not a vowel was voiced depended on which SR is being looked at and which token the vowel was in. Also, both gender and participant interacted with block; that is, the effect of either gender or participant on whether or not a vowel was voiced differed depending on which repetition block is being looked at.
As in the last chapter, results will be presented starting with the significant interactions. Although the above ANOVA were used to check whether or not a vowel was voiced, since it is the first stage of a logistic regression analysis that will be continued for the voiced vowels only in the next chapter, the results will be discussed in reference to the percentage of vowels devoiced by the participants. (Note that identical values are obtained by running ANOVA on values of a variable devoiced which is the exact inverse of the variable voiced used above.)
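The parenthetical note above, that the same values result whether the dependent variable is coded as voiced or as devoiced, follows from the fact that recoding y as 1 - y reverses the sign of every deviation from a mean and so leaves all squared deviations unchanged. The following Python sketch checks this on a one-way ANOVA, a simplification of the multi-factor models used here; the function name one_way_f and the data are hypothetical.

```python
def one_way_f(groups):
    """F-ratio of a one-way ANOVA for a list of groups of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((y - group_mean) ** 2 for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical codes (1 = voiced, 0 = devoiced) for two groups; not thesis data.
voiced = [[0, 0, 1, 0, 0, 1], [1, 1, 0, 1, 1, 1]]
# The inverse coding: 1 = devoiced, 0 = voiced.
devoiced = [[1 - y for y in g] for g in voiced]
f_voiced = one_way_f(voiced)
f_devoiced = one_way_f(devoiced)
```

The two F-ratios agree up to floating-point rounding, which is why the results can be discussed in terms of the percentage of vowels devoiced.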
6.4.1 Effects on devoicing involving mora
6.4.1.1 Effects of the 2-way interaction mora * token
The significant effect of the 2-way interaction between mora and token indicates that although the mora a vowel is located in had a large influence on whether or not that vowel would devoice, the influence was not consistent for each token used in the study--the mora a vowel is located in has a larger effect on whether or not the vowel is devoiced for some tokens than for others. This interaction can be seen in the line plot for these two variables in Figure 6.12 below, where the tokens used in the study are on the x-axis and the percentage of vowels devoiced is on the y-axis.

Figure 6.12 Interaction line plot for the effects of token * mora on percentage of vowels devoiced.
The effect of mora on the percentage of vowels devoiced can be seen in the vertical separation between the two segmented lines. The percentage of vowels devoiced in the 1st mora can be seen in the segmented line marked with circles (generally, the higher of the two segmented lines). The percentage of vowels devoiced in the 2nd mora can be seen in the segmented line marked with squares (generally, the lower of the two segmented lines). This shows the higher percentage of devoiced vowels in the 1st mora of most tokens indicated by the large value for mora in the ANOVA results given above.
The effect of the different tokens on the percentage of devoiced vowels can be seen in the varying slope of the segments of each line--if different tokens had no effect, the slopes of the segmented lines would be the same (i.e. each segmented line would be one horizontal line across the plot). This shows the influence that different tokens had on whether or not a vowel was devoiced that is indicated by the significant value for token in the ANOVA results above.
The effect of the interaction between which mora contains the devoiced vowel and the different tokens can be seen in the difference in each set of two data points for each token in Figure 6.12--the two segmented lines do not vary in tandem as one moves from one token to the next across the graph. This is the source of the significant value for the interaction of the two variables in the ANOVA results. Figure 6.12 above shows that a 1st mora vowel was much more likely to devoice for some tokens than for others. In particular, the 1st mora vowels in the tokens chichi, chiki, kuki, tsuchi and tsuki were much more likely to be devoiced. In contrast, the 1st mora vowels in the tokens kichi, kiki, kishi and kuchi were somewhat more likely to devoice, while the 1st mora vowel in kushi was slightly less likely to devoice than the 2nd mora vowel.
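The values plotted in an interaction line plot of this kind are cell percentages, one for each combination of token and mora. The following Python sketch shows how such percentages, and the gap between the two mora lines for each token, could be computed. The records are invented for illustration and do not reproduce the percentages in Figure 6.12; the function name percent_devoiced is hypothetical.

```python
from collections import defaultdict

def percent_devoiced(records):
    """Map (token, mora) to the percentage of vowels devoiced."""
    counts = defaultdict(lambda: [0, 0])  # [devoiced count, total count]
    for token, mora, devoiced in records:
        counts[(token, mora)][0] += devoiced
        counts[(token, mora)][1] += 1
    return {key: 100 * d / n for key, (d, n) in counts.items()}

# Hypothetical records (token, mora, 1 = devoiced); not thesis data.
records = (
    [("tsuki", 1, 1)] * 3 + [("tsuki", 1, 0)] * 1
    + [("tsuki", 2, 1)] * 1 + [("tsuki", 2, 0)] * 3
    + [("kushi", 1, 1)] * 2 + [("kushi", 1, 0)] * 2
    + [("kushi", 2, 1)] * 2 + [("kushi", 2, 0)] * 2
)
cells = percent_devoiced(records)
# Gap between the two mora lines for each token.
gaps = {tok: cells[(tok, 1)] - cells[(tok, 2)] for tok in ("tsuki", "kushi")}
```

If the gaps were equal for every token, the two segmented lines would run in parallel and there would be no interaction; unequal gaps, as in this toy example, are what a significant mora * token interaction reflects.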
As mentioned in Chapter 5, the interplay of all of the effects at work in this data (different segmental material in each mora and clitic and different pitch locations) cannot be conclusively described due to lack of sufficient stimuli control. However, the fact remains that there does not seem to be any set pattern to the fluctuation in the current data set seen in the following plots. In Figure 6.13, the tokens are segregated by the material in the first mora of each token; in Figure 6.14 the tokens are segregated by the following clitic (to or ka); in Figure 6.15, by the material in the second mora of each token; and finally in Figure 6.16 by the standard pitch accent placement (unaccented, 1st mora, 2nd mora).

Figure 6.13 Interaction line plot for the effects of token * mora on percentage of devoiced vowels, tokens segregated by the segmental material of the 1st mora.
It can be seen from Figure 6.13 that having [t͡su] in the first mora of a token significantly increases the chance of the 1st mora vowel being devoiced and the 2nd mora vowel remaining voiced. This same basic pattern can also be seen with the tokens beginning with [t͡shi] (i.e. chichi and chiki), although the influence is not as great. It remains for further research to see if the affricate allophones [t͡sh] and [t͡s] result in the observed increased devoicing rate due to a longer duration oral constriction. However, the [ku] of the token kuki displays a similar low percentage of devoiced vowels in the 2nd mora; possibly this is due to the combined constriction of the velar closure for [k] and the backed position of the tongue for the [u].
Another tentative pattern that emerges from segregating the tokens by 1st mora segmental content is the possible effect of having [ki] in the 1st mora; the 6 data points representing the percentages of vowels devoiced for the 3 tokens kichi, kiki, and kishi tentatively suggest that having [ki] in the 1st mora minimizes the difference between the likelihood that a 1st mora vowel devoices and the likelihood that a 2nd mora vowel devoices.
Figure 6.14 below shows the same information, but the tokens are segregated by the following syntactic particle.

Figure 6.14 Interaction line plot for the effects of token * mora on percentage of vowels devoiced, tokens segregated by following clitic.
As can be seen, segregating the tokens by the following syntactic particle does not reveal any regular effects on the percentage of vowels devoiced; the effect of mora varies widely for both groups of tokens. The only possible correlation that seems worth further investigation is the similarity in percentage of devoiced vowels for the tokens kichi and kishi; it may be that the segmental sequence of [ki] followed by Ci, where C is an affricate or fricative, followed by another [k] (the [k] of the clitic [ka]) helps to minimize the variation in percentage of devoiced vowels.
Figure 6.15 shows the tokens segregated by the segmental material of the 2nd mora.

Figure 6.15 Interaction line plot for the effects of token * mora on percentage of devoiced vowels, tokens segregated by the segmental material of the 2nd mora.
The strongest pattern that emerges when the tokens are segregated by the segmental material of the 2nd mora is that having [shi] in the 2nd mora seems to minimize the tendency for a 1st mora vowel to be more likely to devoice; with [shi] in the 2nd mora, the likelihood of either a 1st or a 2nd mora vowel being devoiced is close to 50%, the level of chance. Further study involving more tokens with this 2nd mora will be needed to confirm this effect.
Figure 6.16 gives the tokens segregated by the standard pitch accent placement.

Figure 6.16 Interaction line plot for the effects of token * mora on percentage of devoiced vowels, tokens segregated by standard pitch placement.
Segregating the tokens by the location of the prescribed pitch accent does not reveal any regular effect of standard accent placement on the percentage of devoiced vowels for this group. Although the current study utilized only 2-mora tokens, the lack of regular effect supports recent studies (e.g. Mineta 1988, cited by Maekawa 1989; Kondo 1997; Tsuchida 1997; Kitahara 1998) claiming that the placement of lexical pitch accent no longer has a strong effect on whether or not a vowel is devoiced. (However, see §3.3 of this work for discussion on how pitch accent placement and vowel devoicing do still interact.)
As with the variation seen in the token duration data, in the current data set there is no clear-cut source for the variation in percentage of devoiced vowels due to the effect of the interaction between mora and token. The exact source of the effect of this interaction will be left for future, more controlled research.
The significant effect of the 2-way interaction between mora and SR means that the influence of the SR (slow, normal, fast) on whether or not the vowel devoiced was not the same in the 1st mora as in the 2nd. This inconsistency can be seen in Figure 6.17 below in the interaction line plot for these two variables.

Figure 6.17 Interaction line plot for the effects of SR * mora on percentage of devoiced vowels.
In this plot, the effect of mora on the percentage of devoiced vowels can be seen in the vertical separation of the two segmented lines. As before, when all tokens are combined as a group, a much higher percentage of vowels was devoiced in the 1st mora (marked by the circles) than in the 2nd mora (marked by the squares) at all SRs. The effect of SR can be seen in the slope of the segmented lines; the percentage of devoiced vowels can be seen to increase as the SR increased from slow to fast.
Finally, the source of the significant effect on the percentage of vowels devoiced due to the interaction between the two variables can be seen in the fact that the two segmented lines are not parallel. The percentage of vowels devoiced can be seen to increase much more quickly for the 1st mora as SR increased than for the 2nd mora. This is consistent with Kuriyagawa & Sawashima (1989), which also showed a higher incidence of devoiced vowels in the 1st mora of the tokens in that study. As noted there, this is thought to be due to the greater glottal spreading associated with word-initial obstruents than with word-medial stops; it seems plausible that this extends to all word-medial obstruents.
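The reading of a non-parallel interaction plot given above can be made concrete with a small calculation. The following is a minimal sketch in Python, not the analysis used in this study: the counts are invented placeholders (they are not values from the current data set), and the names `percent_devoiced` and `interaction_contrast` are hypothetical. It computes the percentage of devoiced vowels for each SR * mora cell and then compares the slow-to-fast increase in the 1st mora with that in the 2nd mora. A contrast of zero would correspond to parallel lines, i.e. no interaction.

```python
# Hypothetical (devoiced, total) counts per SR * mora cell.
# Placeholder numbers for illustration only; NOT data from this study.
counts = {
    ("slow", 1): (30, 100), ("normal", 1): (50, 100), ("fast", 1): (80, 100),
    ("slow", 2): (10, 100), ("normal", 2): (15, 100), ("fast", 2): (25, 100),
}


def percent_devoiced(devoiced, total):
    """Return the percentage of vowels devoiced in one cell."""
    return 100.0 * devoiced / total


def interaction_contrast(cells, low="slow", high="fast"):
    """Return (1st-mora rise from low to high SR) minus (2nd-mora rise).

    Zero means the two lines of the interaction plot are parallel
    between the two SRs; a non-zero value means they are not.
    """
    pct = {key: percent_devoiced(*value) for key, value in cells.items()}
    rise_mora1 = pct[(high, 1)] - pct[(low, 1)]
    rise_mora2 = pct[(high, 2)] - pct[(low, 2)]
    return rise_mora1 - rise_mora2


contrast = interaction_contrast(counts)
print(contrast)
```

This contrast is purely descriptive; whether a given departure from parallel lines is statistically significant is a separate question that the ANOVA tests address.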
In addition, as discussed above in §2.2.4, the close oral constriction created by the allophonic variation of the token-initial consonant is also thought to play a part. The increased turbulence associated with the close oral closure inhibits voicing for a longer period of time; when the vowel is sustained for a shorter duration at high SRs, it is more likely that the turbulence will cause the complete blocking of the realization of the voicing gesture.
The significant 2-way interaction between gender and block means that not only did the percentage of devoiced vowels differ by gender, but that the difference was not consistent for both repetition blocks. This can be seen below in the line plot for this interaction.

Figure 6.18 Interaction line plot for the effects of gender * block on percentage of devoiced vowels.
The effect of having participants of both genders in the study can be seen in the vertical separation of the line segments in Figure 6.18. The females are represented by the lower line marked with circles and the males by the upper line marked with squares. It can be seen that the males devoiced a higher percentage of vowels in both repetition blocks, consistent with the results of Yuen & Hubbard (1997).
The effect of having repetitions segregated into two repetition blocks can be seen in the slope of the line segments, with the percentage of devoiced vowels in the 1st block of repetitions on the left and the percentage of devoiced vowels in the 2nd block of repetitions on the right.
The effect of the interaction between the two variables can be seen in the different slopes of the line segments in Figure 6.18; slopes of opposite sign (i.e. one rising and one falling) are always indicative of an interaction between two variables. The percentage of devoiced vowels for the males in the study can be seen to rise slightly from the 1st repetition block to the 2nd, while for the females it dropped slightly. That is, the males produced more devoiced vowels in the 2nd repetition block while the females produced more in the 1st repetition block.
The significant effect of the 2-way interaction between participant and block means that, just as in the interaction between gender and block, the differences seen in the percentage of devoiced vowels due to individual participant differences were not consistent for both the 1st block and 2nd block repetitions. These differences in the percentage of devoiced vowels for the different participants in each block can be seen below in Figure 6.19.

Figure 6.19 Interaction line plot for the effects of participant * block on percentage of devoiced vowels.
The effects of participant can be seen in the 10 pairs of data points along the x-axis; the initials for each participant are listed across the bottom of the graph in alphabetical order. The wide range of variation in the percentage of devoiced vowels for the group can be seen.
The effects of having two repetition blocks can be seen in the vertical separation of the two segmented lines. The percentage of devoiced vowels in the 1st repetition block is marked with a circle, generally the higher of the two segmented lines on the graph. The percentage of devoiced vowels in the 2nd repetition block is marked with a square, generally the lower of the two segmented lines on the graph. The effect of the interaction between the two variables can be seen in the differences in location of the pairs of points on the graph, in particular where the lines connecting the pairs cross. Again, this shows that the differences in the percentage of devoiced vowels due to the individual variation of the participants were not consistent between the two repetition blocks.
Participants ANa, CS, HK, KT, and YU devoiced a slightly higher percentage of vowels in the 1st block than in the 2nd. Participants ANi, SM, TO, and YS devoiced a slightly higher percentage of vowels in the 2nd block, while participant YK devoiced almost the same percentage--a very high percentage--of vowels in both blocks.
Although gender did have a significant effect on the percentage of devoiced vowels, segregating the data in Figure 6.19 by gender shows that gender is not a sure indication of whether a higher percentage of vowels will be devoiced in one repetition block than in the other.

Figure 6.20 Interaction line plot for the effects of participant * block on percentage of devoiced vowels, segregated by gender.
It can be seen here again that the males of the group did in general devoice a higher percentage of vowels than the females did. It can also be seen that 3 of the 4 males devoiced a higher percentage of vowels in the 2nd repetition block. The females in general produced a lower percentage of devoiced vowels, and 5 of 6 devoiced more vowels in the 1st repetition block. However, note that participant TO devoiced more vowels in the 2nd repetition block than in the 1st, as the males did. In general, however, these findings correlate well with the slower SRs (as evidenced by longer token durations) observed for the females in the 2nd repetition block, which were discussed in §5.4.2.1.
Figures 6.19 and 6.20 show the great individual variation in the percentage of vowels devoiced within the group of participants. These differences surfaced even though an attempt was made to select participants from within the Tokyo area who were raised in Tokyo, had parents who were either from or had been in the Tokyo area for a long time, and had minimal contact with other languages. This is consistent with the individual variation in devoicing rate reported in other studies (e.g. Kondo 1990, Maekawa 1993). The plots also show that in general the females in the study devoiced more vowels in the 1st repetition block than in the 2nd, and that the males devoiced more in the 2nd. However, Figure 6.20 shows that while gender differences do exist overall, the performance of any given participant cannot be predicted on the basis of gender.
In this chapter, the analysis of Tsuchida (1997) positing both phonological (i.e. feature-changing) devoicing and phonetic overlap of the voicing gesture was recalled from the discussion in Ch. 2.
Phonetic overlap was held to be responsible for the devoicing of high vowels between voiceless fricatives, as evidenced by the two distinct activations of the glottal spreading muscles and two distinct glottal spreading gestures for the two fricatives surrounding the vowel. At a fast enough SR, this overlap may result in a devoiced vowel. This same overlap process was held to be responsible for loss of voicing of non-high vowels in all devoicing environments.
Phonological spreading of the feature [+spread glottis], on the other hand, was held to be responsible for the devoicing of high vowels in environments other than between fricatives. The sharing of the feature results in a shift of the temporal alignment of the feature to the midpoint between the voiceless obstruent preceding the vowel and the vowel itself, resulting in a glottal spreading gesture that extends throughout the vowel site (see also the discussion of Iverson & Salmons 1995 in §2.2.4). It was noted in Ch. 2, however, that at least one participant in this study did not temporally shift the spreading gesture, but instead lengthened the spreading gesture to achieve overlap of the vowel’s voicing. In both cases, only the preceding consonant’s spreading gesture was involved in the overlap.
This dual-process analysis was supported by observation of productions from the current study. Spectrograms were presented which showed the shift of frication associated with the temporally-realigned [+spread glottis] feature, as well as spectrograms which showed lengthening of the spreading gesture.
In addition to the spectrograms supporting Tsuchida’s (1997) analysis, examination of the distribution of data in this study confirmed that two distinct subsets of data were generated. For one subset of the data, the duration of the voicing activity was correlated with token duration (i.e. voicing duration variation caused by gestural overlap). For the other subset of data, the duration of voicing activity was independent of token duration (i.e. vowel devoicing caused by phonological spreading and realignment of the feature [+spread glottis]).
The ANOVA models used to test for effects of the variables targeted in the study were then discussed, followed by the results of those tests. The results of the ANOVA tests discussed above are summarized below.
In contrast to the results reported in Chapter 5, where SR had the largest effect on token durations, the mora in which a vowel was located had the largest effect on whether or not the vowel was devoiced. A significantly higher percentage of vowels in the 1st mora of the tokens was devoiced, consistent with the findings of Kuriyagawa & Sawashima (1989). However, the effect of mora varied in this study with both SR and token.
Examination of the data showed that while the 1st mora vowels did in general devoice much more often, the percentage of devoiced 1st mora vowels varied greatly with token. For tokens beginning with tsu, the 1st mora vowel devoiced the vast majority of the time, with a concomitant voicing of the 2nd mora vowel. All other variation seen in the token differences needs to be investigated in further, controlled study.
In addition, the percentage of devoiced 1st mora vowels increased much more rapidly as SR increased than did the percentage of devoiced 2nd mora vowels. This increase in 1st mora devoiced vowels is thought to be due to the larger maximal glottal openings for word-initial obstruents than for word-medial ones (Kuriyagawa & Sawashima 1989), and to the increased resistance to air flow created by the allophonic distribution of many word-initial consonants of the tokens in this study (§6.3).
The percentage of vowels devoiced also varied by repetition block for both gender and participant.
It was seen that the males did devoice a significantly larger number of vowels than the females, supporting the results of Yuen & Hubbard (1997). Further, while the males devoiced more vowels in the 2nd repetition block, the females devoiced more vowels in the first repetition block. The source of this variation between the two genders is left for further research, although it will be noted that 3 of the 4 males in the study produced longer duration tokens in the 2nd repetition block; the percentage of devoiced vowels increased even though longer duration tokens were produced.
The higher percentage of vowels devoiced by the females in the 1st repetition block and by the males in the 2nd repetition block was observed to hold across most of the participants. However, one female participant (TO) did devoice a higher percentage of vowels in the 2nd repetition block, and one male participant (YK) produced an almost identical number of devoiced vowels in both repetition blocks.
1. For the current discussion of the regression statistics, the one repetition of the token tsuchi ‘dirt; earth’ by participant YU (2 data points) at token duration = 867ms was removed due to a clear pronunciation with a geminate medial consonant (i.e. [tsutchi] as opposed to [tsuchi]).
2. This and all following r values are generated with no intercept in the regression model; that is, the regression line and the regression coefficient are generated under the assumption that the values of token duration and voicing duration meet at the data point (0, 0). If this condition is not imposed, the regression line and regression coefficient are generated with token duration and voicing duration meeting at a point where token duration is negative. This is equivalent to saying there will be vowel voicing present even though no vowel was pronounced.
3. A one-tailed test checks to see if one variable is bigger than the other; i.e. X > Y. In contrast, a two-tailed test checks to see if the two variables are not equal; i.e. either X > Y or Y > X.
4. There are 5 degrees of freedom (df) not reported in Tables 6.2 and 6.3. These correspond to the degrees of freedom for the variable repetition (block) which, although not directly included in the ANOVA calculations, was included in the model to serve as the error term for the variable block.
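The no-intercept regression described in note 2 can be illustrated with a short calculation. The following is a minimal sketch in Python under stated assumptions, not the procedure or software used in this study: the durations are invented placeholders (not measurements from the current data set), and the function names `slope_through_origin` and `fit_with_intercept` are hypothetical. It covers only the fitted lines, not the r values. It contrasts a least-squares line forced through (0, 0) with an ordinary fit whose intercept is free; in the placeholder data the free fit predicts voicing at a token duration of zero, which is the situation note 2 sets out to avoid.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def fit_with_intercept(x, y):
    """Ordinary least-squares fit for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical durations in ms; placeholders, NOT data from this study.
token_ms = [300.0, 400.0, 500.0, 600.0]
voicing_ms = [50.0, 60.0, 70.0, 80.0]

b_origin = slope_through_origin(token_ms, voicing_ms)
a_free, b_free = fit_with_intercept(token_ms, voicing_ms)

# With a free intercept, the line reaches zero voicing at a negative
# token duration whenever a_free > 0 and b_free > 0.
zero_voicing_at = -a_free / b_free

print(b_origin, a_free, b_free, zero_voicing_at)
```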