HCRC Disfluency Coding Manual: 6

Feature labelling

Each word in the RM and RR is given a label to signify its role: lower case "r", "s", "i" and "d" are used.

Diacritics are appended to the end of feature labels:
-      word fragment ({ab|wor} in original transcriptions), e.g. r-, s-, d-, i-
~      misarticulated word, repeated correctly in repair.
^      contractions (it's - that's : <S1 s^r IP s^r S1>) (count as one word)

Where word-boundary labelling does not permit a contraction to be split up, the symbol "x" is used to label a word that is irrelevant for the disfluency, e.g. I I've done that == <R1 r IP r^x R1>

c      outer part of complex disfluency (see below)
o      overlap -- speaker stops/is interrupted and other speaker takes over or speaker interrupts other
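
As an illustration of how the label forms listed above might be handled by a script, here is a minimal Python sketch. It is not part of the coding scheme: the function name parse_label and the returned dictionary layout are hypothetical, and the sketch covers only the forms shown in the examples on this page (a role letter r, s, i, d or x, an optional "-" or "~", and "^" joining the two halves of a contraction). The "c" and "o" diacritics are left out because their exact placement is not shown here.

```python
import re

def parse_label(label):
    """Split a feature label such as 'r-', 's~' or 's^r' into its parts.

    Hypothetical helper: handles only role letters r/s/i/d/x, the
    diacritics '-' (fragment) and '~' (misarticulation), and '^'
    joining the halves of a contraction.
    """
    parts = []
    for piece in label.split("^"):
        match = re.fullmatch(r"([rsidx])([-~]?)", piece)
        if match is None:
            raise ValueError(f"unrecognised label piece: {piece!r}")
        role, mark = match.groups()
        parts.append({
            "role": role,
            "fragment": mark == "-",
            "misarticulated": mark == "~",
        })
    # A contraction carries two role labels but counts as one word.
    return {"parts": parts, "contraction": len(parts) > 1}
```

For instance, parse_label("r^x") yields two parts with roles "r" and "x" and marks the label as a contraction, while parse_label("s-") yields a single fragment part.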
 
 
Repeated single fragment: repair includes whole word.
but   n- not too far north
<R0.5   r- IP   r                    R0.5>
-
Single word repetition.
right to   my my left
<R1  r IP r                        R1>
-
Two word repetition with Filled Pause.
go to your left    until you're - uh   until you're  level ...
<R2      r       r IP  FP     r       r       R2>
-
Substitution -- two words plus fragment (fragment is substituted).
like  to the r- -  to the left of the burnt forest
<S2.5  r     r   s- IP r    r     s          S2.5>
-
Deletion -- 4 words plus fragment. No repair.
well go down until you're in a ver- - right see the windows...
<D4.5    d    d      d  d  d- IP D4.5>
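
In the five examples above, the number in the bracket tag matches the number of labelled words before the IP, with a fragment counted as half a word (R0.5, R1, R2, S2.5, D4.5). The Python sketch below checks that reading against an annotation line. It is an assumption drawn from these examples rather than a stated rule, and the function names parse_annotation and reparandum_length are hypothetical. It also assumes simple tags of the form letter plus number, with exactly one IP per line.

```python
def parse_annotation(line):
    """Parse a label line such as '<S2.5 r r s- IP r r s S2.5>'.

    Returns (type letter, declared length, reparandum labels, repair labels).
    Assumes a simple disfluency: matching open/close tags and one IP.
    """
    tokens = line.strip().strip("<>").split()
    tag, labels, closing = tokens[0], tokens[1:-1], tokens[-1]
    if tag != closing:
        raise ValueError(f"mismatched tags: {tag!r} vs {closing!r}")
    ip = labels.index("IP")  # raises ValueError if there is no IP
    reparandum = labels[:ip]
    # Filled pauses (FP) are not feature-labelled words, so skip them.
    repair = [t for t in labels[ip + 1:] if t != "FP"]
    return tag[0], float(tag[1:]), reparandum, repair

def reparandum_length(reparandum):
    """Count words before the IP; a fragment (label ending '-') counts 0.5."""
    return sum(0.5 if label.endswith("-") else 1.0 for label in reparandum)

kind, declared, rm, rr = parse_annotation(
    "<D4.5    d    d      d  d  d- IP D4.5>"
)
assert reparandum_length(rm) == declared  # 4 words plus a fragment
```

A contraction label such as s^r is a single token here, so it counts as one word, in line with the note on contractions.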

Coindexing: in simple cases like these there's no need for coindexing of feature labels, as the same serial ordering is found in RM and RR.

Substitutions can be one-to-many or many-to-one. 
