Scarlet & Grey
Ohio State University
School of Music


Converging Evidence in Music Research

David Huron





Introduction

Informally, we feel that the way to establish greater confidence regarding some claim is to increase the number of supporting examples or observations. If a music scholar observes that an Italian sixth chord occurs in measure 58 in a work by Antonio Vivaldi, it seems unreasonable to presume that this single observation is adequate evidence supporting the general claim that Italian sixths occur more frequently in music by Italian composers. We might feel more confident if the scholar showed that Italian sixth chords occur more frequently than French or German sixths in (say) 100 works by Vivaldi. We might feel more confident yet if the scholar showed that Italian sixths predominated over other augmented sixth chords in 1,000 scores by (say) thirty Italian composers. And we might feel even more confident if the scholar also showed that a similar preponderance of Italian sixths did not occur in a large selection of works by composers of French, German, and other nationalities.

Unfortunately, this hypothetical research project notwithstanding, merely increasing the number of pertinent observations does not always justify increased confidence that some claim is likely to be true. The problem was identified by the eighteenth-century philosopher David Hume. Hume was deeply impressed by the science of his day. He was impressed that disciplined observation could lead to fundamental insights about the organization of the natural world. But Hume was also troubled by the philosophical problem of how knowledge might be acquired through observation. He pointed out that, unlike the deductive methods used in logic and mathematics, inductive observation-based methods are inherently uncertain. Hume recognized that no amount of observation could ever prove a particular general claim.

Since the eighteenth century, Hume's worst fears about observation-based knowledge have been amply manifested by science itself. Consider, for example, the case of Newtonian physics. Throughout the nineteenth century, hundreds of physicists carried out thousands of experiments and recorded hundreds of thousands of observations that were consistent with Isaac Newton's formulation that applying a constant force will result in a constant acceleration. In the twentieth century, Albert Einstein famously proposed a theory that contradicted Newton's simpler view. Einstein proposed that, near the speed of light, acceleration is no longer directly proportional to force. Subsequent experiments produced results that were consistent with Einstein's theory. However, the principal philosophical lesson arising from these experiments was not that Einstein was right; the more astonishing finding was that Newton was wrong. Note that a nineteenth-century physicist, aware of the wealth of observations consistent with Newton's theory, would not have been justified in believing that Newton's generalization had been "proved" or was "true."

Given this history, one might be tempted to entertain two (fallacious) conclusions. One conclusion might be that increasing the number of pertinent observations has no bearing on establishing truth. A more sweeping conclusion might be that observation itself is irrelevant to establishing truth. If the volume of evidence is irrelevant, then what do we say to our hypothetical Vivaldi scholar? Would we wish to concur with the scholar who presumes that the presence of an Italian sixth chord in measure 58 of a work by Vivaldi establishes the general claim that Italian sixths occur more frequently in music by Italian composers? Moreover, would we wish to claim that no amount of evidence could ever resolve whether Italian composers favour Italian sixths? Our intuition that many observations are better than few observations is warranted, but not necessarily in the way we think.

As we have seen in the case of Newton, increasing the number of observations can, by itself, lead to a false sense of security about a hypothesis. A more important consideration is that the examples or observations come from a variety of different sources. The essence of converging evidence is that independent observations made under a number of circumstances, using contrasting measurement methods, by several different observers, are all consistent with one particular interpretation. In this chapter, we consider more formally some of the conditions that give researchers greater confidence in a result.

Converging Evidence in Music

Introspective Data

There are innumerable sources of potential evidence in music-related research. Perhaps the first source is the scholar's own subjective or phenomenal experience. In our everyday activities, we form informal intuitions and conjectures about the musical phenomena we observe. Our intuitions might relate to listening, performing, conducting, composing, imagining music, dancing, watching a film, or other activities. Of course, introspection is a fallible source of information since we are not always privy to the inner workings of our own minds. Self-deception is common. Nevertheless, a scholar may feel some greater assurance if experimental or other sources of evidence reflect her or his subjective intuitions about what may be happening.

Since people are not all alike, a researcher might seek wider phenomenological support by soliciting introspective reports from others. The value of these reports will be enhanced if a formal survey is conducted and efforts are made to collect introspective accounts from a wide variety of individuals. In particular, it can be valuable to seek introspective reports from people who are not our friends: our friends are more likely to think as we do (which may be one of the reasons they are our friends). Since world-views are related to background, the prudent scholar will solicit accounts from people of disparate and contrasting backgrounds.

Of course, more formal empirical evidence can be gathered through an experimental approach, an approach that allows greater rigor and often greater reliability. Since there are many ways to carry out an experiment, contrasting sources of evidence can arise from the innumerable variations in the type of stimuli used, the choice of tasks performed, the kinds of response data gathered, and the variety of people who participate.

Types of Stimuli

Consider the listening experiment. In the first instance, the experimenter has considerable latitude in the choice of sound stimuli. The stimuli might be highly simplified and rigorously controlled, such as the sounds commonly used in psychoacoustic experiments. Alternatively, the sounds might be highly controlled, yet more musical in character -- such as sounds produced by a computer or MIDI synthesizer with known properties. A third alternative is to use commercial recordings of actual performances; sound recordings have a higher degree of "ecological validity," yet still provide the opportunity for repeated use of identical stimuli. A fourth approach might abandon the use of identical stimuli by relying on several musical sources -- possibly including live performance.

Converging evidence arises when experiments employing each of these types of stimuli lead to the same conclusion. For example, increasing loudness appears to evoke a heightened physiological arousal whether the stimulus is a simple sine tone with increasing amplitude, a computer-performed melody, or the crescendo in a recorded performance of a Beethoven piano sonata.

Types of Responses

Apart from the type of stimuli used, converging evidence can also be sought by varying the type of responses observed in a listening situation. A researcher might measure systemic (whole body) physiological responses such as heart-rate, breathing, perspiration, etc. Alternatively, physiological measures might focus on changes in the central nervous system. The researcher might use brain imaging techniques such as PET or fMRI, or use electroencephalographic methods. Alternatively, the researcher might record behavioral responses, such as verbal responses to questions. For example, a listener might be asked to predict the next note in a sequence. The listener's response might be active (such as singing the next note) or passive (such as assigning a numerical rating identifying how well a presented note fits the current context). A listener might be asked to make judgements, such as the "appropriateness" or "pleasantness" of particular events or passages.

Performance Tasks

Compared with listening tasks, the behaviors of performers provide an especially contrasting source of converging evidence in music research. Performance data may be gathered by formal or informal interviewing of performers, conductors, or teachers. Such individuals might be asked to introspect, to imagine performing, to provide a retrospective report following an actual performance, or to provide a running commentary while engaged in a task such as rehearsing. The researcher might interrupt a performance to pose specific questions or solicit information at particular moments. Apart from collecting verbal responses, the research might entail gathering data through monitoring instruments directly attached to the performer -- or less directly through high-speed film or through 3-D kinematic data acquisition systems. Another approach is to collect data by monitoring the musical instrument rather than the performer, such as in the case of recording MIDI data. Alternatively, a music scholar might measure performance behaviors from sound recordings -- although this approach makes it nearly impossible to infer the performer's physical movements. Sound recordings are especially useful when it is difficult to recruit professional performers as research collaborators or participants. In all of the above cases, the performance data may be gathered either in a laboratory setting or in a concert or performance setting.

The tasks assigned to a performer may be varied, from recording a well-rehearsed performance, to recording the rehearsal process itself. The musician(s) may be asked to perform under certain constraints, such as playing at a specified tempo or playing without visual or auditory interactions with other musicians. Mechanical or electronic constraints might be introduced, for example, to eliminate the performer's control over loudness. Other tasks include examining the performer's reactions to experimenter-generated perturbations. For example, the experimenter might abruptly introduce an unexpected chord change to see how a jazz improviser adapts. Performers might be vocalists or instrumentalists, children or adults, male or female, novice or professional, classical or popular, Western or non-Western, etc.

Compositional Tasks

A further source of possible converging evidence can be sought in studies of compositional activities. One approach entails simple observation of a composer engaged in writing a piece. The sequence of compositional actions might be recorded and subjected to protocol analyses; a database of compositional actions might be used to test hypotheses about the nature of composing. Alternatively, the composer might be assigned specific tasks, such as writing a passage that conforms to a predefined set of restrictions.

Notation as Data

As music theorists have long understood, one of the best sources of musical information is to be found in notated scores. Musical scores provide significant opportunities for testing hypotheses about musical organization, including tests of psychophysical phenomena. Measurements can be as primitive as simple interval counts, or as complex as Schenkerian analyses. Different styles, periods, instrumentation, etc. provide potentially contrasting repertories for seeking converging evidence for some general claim. Of course, not all musically pertinent information is available in musical scores, but this criticism applies to all other forms of musical information as well. When the analytic method involves considerable interpretation (such as in Schenkerian analysis), the reliability of this information can be determined by engaging many theorists to analyse the same works independently.
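
As a rough illustration of the simplest kind of score-based measurement mentioned above, the following Python sketch tallies melodic interval sizes. The function names, the encoding of pitches as MIDI note numbers, and the short melody are all assumptions made for illustration; they are not drawn from any particular study.

```python
from collections import Counter

def interval_counts(pitches):
    """Tally absolute melodic intervals (in semitones) between successive notes."""
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return Counter(intervals)

def small_interval_fraction(counts, max_semitones=2):
    """Fraction of intervals no larger than max_semitones (an arbitrary cutoff)."""
    total = sum(counts.values())
    small = sum(n for size, n in counts.items() if size <= max_semitones)
    return small / total

# Illustrative melody encoded as MIDI note numbers (60 = middle C).
melody = [60, 62, 64, 60, 60, 62, 64, 60, 64, 65, 67]

counts = interval_counts(melody)
for size in sorted(counts):
    print(f"{size} semitone(s): {counts[size]}")
print("fraction of small intervals:", small_interval_fraction(counts))
```

For this melody, seven of the ten intervals span two semitones or fewer. A real study would of course draw on a large and varied sample of scores rather than a single line.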

Text as Data

An often overlooked source of empirical data about music is historical and theoretical writings about music. Of course, musical writings have been the most important focus for philosophically- and historically-oriented music scholarship. But these same sources can be used for empirical studies. Although this approach has been uncommon in music scholarship, it has been used by sociologists, anthropologists, and empirical historians. Historical texts, and the ideas they contain, can be the subject of statistical approaches. For example, music reviews published in newspapers might be tabulated to estimate the point where a musician's popularity begins to decline. Similarly, text-based measures might include counting the number of times idea X is espoused compared to idea Y in musical treatises. (Such measures might provide a rough index of the relative importance or salience of an idea or concept in different historical periods, for example.)
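
The kind of text-based count described above might be sketched as follows. This is a minimal Python example under stated assumptions: the two "treatises" are invented placeholder passages rather than real quotations, and the terms "consonance" and "dissonance" merely stand in for idea X and idea Y.

```python
import re
from collections import Counter

def term_counts(text, terms):
    """Count whole-word, case-insensitive occurrences of each term in a text."""
    lowered = text.lower()
    return Counter({
        term: len(re.findall(r"\b" + re.escape(term.lower()) + r"\b", lowered))
        for term in terms
    })

# Invented placeholder passages standing in for two treatises (not real quotations).
treatises = {
    "treatise A": "Consonance governs the cadence; consonance is the aim of counterpoint.",
    "treatise B": "Dissonance gives motion. Consonance resolves dissonance, and dissonance returns.",
}

tallies = {
    name: term_counts(text, ["consonance", "dissonance"])
    for name, text in treatises.items()
}
for name, tally in tallies.items():
    print(name, dict(tally))
```

Whole-word matching is a deliberate simplification: it misses inflected forms, synonyms, and passages that discuss an idea without naming it, so an actual study would need a more careful coding scheme.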

Modeling and Simulation

Another source of potential converging evidence is modeling and simulation. Computer models and simulations have been created for a wide variety of musical purposes; not all models are suitable as sources of converging evidence. In some instances, computer models have been built to emulate aspects of music, from pitch perception to instrument/performer interactions. In the more rigorous studies, scholars have validated their models by carrying out experiments that test predictions arising from the models. For example, a model of melodic expectation might predict that a novel pitch sequence will tend to evoke a specific expectation. This expectation can then be tested through listening experiments. If a model has an established record for producing accurate predictions, then a model-derived prediction might also be considered evidence that converges with other research, even if the specific model prediction has not been tested.

Cross-Disciplinary Convergence

A particularly powerful source of converging evidence is to be found in cross-disciplinary convergence. Many phenomena (including music) can be studied from a variety of disciplinary viewpoints -- such as from biological, psychological, archeological, sociological, technological, economic, historical, religious, business, entertainment, and other approaches. From time to time, scholars working in different disciplines will independently arrive at the same conclusion, often by using unrelated and even contrasting methods.

In the field of music, cross-disciplinary convergence has sometimes occurred between music history and art history, between musicology and sociology, and between music theory and psychology. Cross-disciplinary convergence is especially compelling when the work is carried out independently; that is, when the work of a researcher in one field proceeds without knowledge of the results in another field. Contrary to normal intuition, the research is often strengthened when scholars in different disciplines pursue their work in ignorance of the parallel work going on in another field.

An especially good (though non-musical) example of cross-disciplinary convergence is evident in studies of the origin of native peoples in the Americas. Geologists have used standard geological techniques to estimate periods when lowered sea-levels facilitated the movement of people across the Bering Strait between Asia and America. Working independently, linguists have established three language groups among native American peoples, and have suggested times when the ancestral peoples diverged from a single hypothetical Asian source. Finally, population geneticists, working independently, have also identified three genetic stocks among native Americans, and have estimated when each of these groups became isolated from their Asian ancestors. The geological, linguistic, and genetic research converge on three periods of Asian emigration to the American continents coinciding with three periods of lowered sea-levels (see Cavalli-Sforza & Cavalli-Sforza, 1995, for a useful review). Evidence from one discipline alone leaves considerable uncertainty about the validity of the results. However, the convergence of all three disciplines greatly increases the credibility of their joint estimates.

Cross-Cultural Convergence

Another powerful source of converging evidence for an idea is how that idea fares when compared cross-culturally. If a scholar claims that a particular phenomenon is "universal," then we ought to see evidence of such a phenomenon in a wide sample of contrasting cultures. Conversely, if a scholar claims that a particular feature is distinctive of a given culture (or composer, or style, etc.), then that feature should not be evident in other cultures (other composers, other styles, etc.). In other words, cross-cultural comparisons provide particularly challenging tests for both the generality and specificity of musical hypotheses.

All of the possible sources of evidence discussed above (types of responses, performance tasks, modeling and simulation, etc.) can be reexamined by comparing and contrasting parallel results from the plethora of cultures and subcultures that exist in the world. For each culture, one may address anew the listening issues, the performance issues, the types of musical materials studied, as well as the indigenous accounts of music-related activities and experiences.

Summary

Table 1 provides a summary overview of the types of sources of converging evidence we have described above. The table is intended to be suggestive rather than exhaustive.

Table 1

Cross-disciplinary

Approach                | Task                           | Materials                  | Responses                            | Participants
Phenomenological        | introspection while listening  | various                    | verbal/written account               | researcher/others
                        | introspection while performing | various                    | verbal account                       | researcher/others
                        | introspection while composing  | various                    | verbal account                       | researcher/others
                        | formal survey                  | various                    | verbal/written                       | expert/novice
Experimental            | listening                      | simple controlled stimuli  | physiological/systemic (whole body)  | expert/novice
                        |                                |                            | physiological/central nervous system | expert/novice
                        |                                |                            | behavioral/active (e.g. singing)     |
                        |                                |                            | behavioral/passive (e.g. judging)    |
                        | listening                      | simple music-like stimuli  |                                      | expert/novice
                        | listening                      | recorded music             |                                      | expert/novice
                        | listening                      | live music                 |                                      | expert/novice
                        | rehearsed performance          | various styles, etc.       | kinematic (EMG, film, MIDI data,     | expert/novice
                        | in concert/laboratory          |                            | sound recording)                     |
                        | practising                     | novice or expert materials | "                                    | "
                        | improvising                    | novice or expert materials | "                                    | "
                        | adaptive performing            | novice or expert materials | "                                    | "
                        | composing                      | open-ended                 | protocol records                     | expert/novice
                        | composing                      | constrained writing        |                                      |
Sound Analysis          | recorded performance           | sound recordings           | timing, dynamics, etc.               | professional/experts
Notational Analysis     | statistical counts             | different styles, periods, | not applicable                       | composers/arrangers
                        |                                | instrumentation, etc.      |                                      |
                        | analytic procedures            | scores                     | written analyses                     | theorists
Text Analysis           | statistical counts             | concepts, ideas            |                                      | writers on music
                        | history of ideas               | concepts, ideas            |                                      | writers on music
Modeling and Simulation | all of above                   | all of above               | all of above                         | not applicable
Cross-cultural          | all of above                   | all of above               | all of above                         | all cultures: novices & experts

A Case Example: Pitch Proximity

There are relatively few musical phenomena that have been subjected to extensive research making use of converging evidence. In no case has there been an exhaustive study that includes all of the variations identified above. Nevertheless, it can be instructive to consider a phenomenon that has received considerable research attention. A useful illustrative example can be found in the case of pitch proximity -- the tendency for musical "lines" (such as melodies) to be constructed predominantly of small pitch intervals.

The observation that musical melodies tend to be constructed using relatively small intervals can be found throughout historical writings on music (e.g. Zarlino, 1573; Berardi, 1687; Rameau, 1722). In more modern times, formal surveys by several scholars have confirmed the preponderance of small intervals in melodic construction, including Ortmann (1926), Merriam, Whinery and Fred (1956), and Dowling (1967). My own work (Huron, 1992) has affirmed the preponderance of small intervals in music from a number of cultures, including American, Chinese, English, German, Hasidic, Japanese, and sub-Saharan African (Pondo, Venda, Xhosa, and Zulu).

In 1950, Miller and Heise observed that alternating pitches (such as trills) produce two different perceptual effects depending on the pitch distance separating the tones and their speed of alternation (Miller & Heise, 1950). When the tones are close with respect to pitch, quick alternations evoke a sort of "undulating effect," like a single wavering line. However, when the pitch separation is larger, the perceptual effect becomes one of two "beeping" tones of static pitch. Musicians recognize this phenomenon as that of pseudo-polyphony or compound melodic line, where a single sequence of pitches nevertheless evokes a sort of "yodeling" effect.

Miller and Heise's observations were replicated and extended by a number of researchers who carried out similar listening experiments. This includes experiments by Petter (1957), Bozzi and Vicario (1960) and Vicario (1960), Schouten (1962), Norman (1967), Dowling (1967), van Noorden (1971a, 1971b), and Bregman and Campbell (1971). Several of these researchers worked entirely independently, without knowledge of previously existing work.

In 1975, van Noorden mapped the effects of tempo and pitch separation on perceptual integration and segregation. When the tempo is slow and/or the pitches have close proximity, the resulting sequence is always perceived as a single musical line or stream. Conversely, when the pitch distances are large and/or the tempo is fast, two lines or streams are always perceived. Van Noorden also identified an intervening grey region, where listeners may hear either one or two streams depending on the context and the listener's disposition.

The importance of pitch proximity in stream organization is supported by a number of additional experiments. Schouten (1962) observed that rhythmic or temporal relationships are more accurately perceived within lines than between lines. This finding was replicated and extended by Norman (1967), Bregman and Campbell (1971), and Fitzgibbons, Pollatsek and Thomas (1974).

In addition, Bregman and his colleagues have demonstrated that pitch proximity is more important than the Gestalt principle of "good continuation" when listening to pitch trajectories. For example, when some pitches in a sequence are completely masked by noise, listeners are more likely to "hear" continuations that involve simple pitch interpolations rather than extrapolating a coherent trajectory or pattern (Steiger & Bregman, 1981; Tougas & Bregman, 1985; Ciocca & Bregman, 1987; summarized by Bregman, 1990; pp. 417-442). Similar effects have been observed in speech perception (Darwin & Gardner, 1986; Pattison, Gardner & Darwin, 1986).

Further experimental work has demonstrated the perceptual difficulty of tracking lines that cross with respect to pitch. Using pairs of well-known melodies, Dowling (1973) carried out experiments in which successive notes of the two melodies were interleaved. The first note of melody `A' was followed by the first note of melody `B,' followed by the second note of melody `A' followed by the second note of melody `B' etc. Dowling found that the ability of listeners to identify the melodies was highly sensitive to the pitch overlap of the two melodies. Dowling found that as the melodies were transposed so that their average pitches diverged, recognition scores increased. The greatest increase in recognition scores occurred when the transpositions removed all pitch overlap between the concurrent melodies.
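
The interleaving design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two short melodies are invented, pitches are encoded as MIDI note numbers, and "pitch overlap" is measured here simply as the number of chromatic pitches shared by the two melodic ranges, which is not necessarily the measure used in the original experiments.

```python
def interleave(melody_a, melody_b):
    """Alternate notes: A1, B1, A2, B2, ... (stops at the shorter melody)."""
    return [pitch for pair in zip(melody_a, melody_b) for pitch in pair]

def pitch_overlap(melody_a, melody_b):
    """Number of chromatic pitches lying inside both melodic ranges."""
    low = max(min(melody_a), min(melody_b))
    high = min(max(melody_a), max(melody_b))
    return high - low + 1 if high >= low else 0

def transpose(melody, semitones):
    """Shift every pitch by the same number of semitones."""
    return [pitch + semitones for pitch in melody]

# Invented melodies as MIDI note numbers (60 = middle C).
melody_a = [60, 62, 64, 60]
melody_b = [62, 64, 65, 62]

for shift in (0, 12):
    shifted_b = transpose(melody_b, shift)
    print(shift, interleave(melody_a, shifted_b), pitch_overlap(melody_a, shifted_b))
```

With these invented melodies, the untransposed pair shares three chromatic pitches, whereas transposing the second melody up an octave removes the overlap entirely.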

Deutsch (1975) and van Noorden (1975) found that concurrent ascending and descending tone sequences are perceived to switch direction at the point where their trajectories cross. That is, listeners tend to hear two converging glissandos as "bouncing" off each other rather than passing through each other (i.e. crossing). Although the crossed trajectories represent a simpler Gestalt figure, the bounced perception is much more common.

In summary, at least four empirical phenomena point to the importance of pitch proximity in helping to determine the perceptual sense of a line of sound: (1) the breaking-apart of monophonic pitch sequences into pseudo-polyphonic percepts described by Miller and Heise and others, (2) the discovery of information-processing degradations in cross-stream temporal tasks -- as found by Schouten, Norman, Bregman and Campbell, and Fitzgibbons, Pollatsek, and Thomas, (3) the perceptual difficulty of tracking auditory lines that cross with respect to pitch described by Dowling, Deutsch, and van Noorden, and (4) the pre-eminence of pitch proximity over pitch trajectory in the continuation of lines demonstrated by Bregman et al. The segregation or integration of pitch sequences thus appears to depend upon the proximity of successive pitches.

Apart from the positive evidence for pitch proximity, there is also negative supporting evidence. Musicians recognize that the pitch interval between successive notes also has a significant impact in creating compound melodic lines, such as in yodeling. Dowling (1967) carried out a study of a number of Baroque solo works in order to determine the degree to which pseudo-polyphony is correlated with the use of large intervals. In measurements of interval sizes in passages deemed pseudo-polyphonic by two independent listeners, Dowling found the passages to contain intervals markedly larger than the "trill threshold" measured by Miller and Heise. In a sample of pseudo-polyphonic passages by Telemann, Dowling noted that Telemann never uses intervals less than the Miller and Heise trill threshold.

As further evidence of the importance of small intervals in non pseudo-polyphonic lines, Dowling examined the interval preferences of listeners. Through a series of randomly-generated stimuli, Dowling found that listeners prefer melodies employing the smaller interval sizes. A study by Carlsen (1981), moreover, demonstrated that musicians show a marked perceptual expectancy for small pitch intervals. Although Carlsen found significant differences between German, Hungarian and American musicians in their melodic expectancies, in all cases neighbouring pitch continuations were strongly favoured.

Further evidence has shown that musical practices are consistent with the psychological research pertaining to pitch proximity. This is particularly evident in the case of part-crossing. The crossing of parts with respect to pitch always violates the pitch proximity principle. No matter how the pitches are arranged, the aggregate pitch distance is always lowest when the upper voice consists of the upper pitches, and the lower voice consists of the lower pitches. In a study of part-crossing in polyphonic music, Huron (1991) showed that J.S. Bach avoids part-crossing, and that he becomes most vigilant to avoid part-crossing when the number of concurrent parts is three or more.
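
The claim about aggregate pitch distance can be checked by brute force for any particular passage. The Python sketch below is a minimal illustration, assuming two voices, pitches encoded as MIDI note numbers, and an invented sequence of two-note sonorities; it tries every way of distributing the notes between the voices and compares the smallest total with the uncrossed arrangement.

```python
from itertools import product

def aggregate_distance(voice_1, voice_2):
    """Sum of absolute pitch movements (in semitones) within each of two voices."""
    return sum(
        abs(b - a)
        for voice in (voice_1, voice_2)
        for a, b in zip(voice, voice[1:])
    )

def voice_assignments(dyads):
    """Yield every way of dealing each two-note sonority out to two voices."""
    for swaps in product([False, True], repeat=len(dyads)):
        voice_1 = [min(d) if swapped else max(d) for d, swapped in zip(dyads, swaps)]
        voice_2 = [max(d) if swapped else min(d) for d, swapped in zip(dyads, swaps)]
        yield voice_1, voice_2

# Invented sequence of two-note sonorities (MIDI note numbers).
dyads = [(67, 60), (65, 62), (72, 59), (64, 64), (69, 57)]

# Uncrossed arrangement: the upper voice always takes the higher pitch.
uncrossed = aggregate_distance([max(d) for d in dyads], [min(d) for d in dyads])
lowest = min(aggregate_distance(v1, v2) for v1, v2 in voice_assignments(dyads))
print(uncrossed, lowest)
```

The two printed values agree for this passage. Other arrangements can tie with the uncrossed one (for example, when the two voices share a pitch), but none yields a smaller total.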

Modeling and simulation of monophonic/pseudo-polyphonic perceptions was carried out by Huron (1989). In this work, a computer model was devised based on the extant perceptual research. The model was then used to predict the relative degree of pseudo-polyphony for a wide sample of contrived passages. The model was tested against data collected from five music theorists and was shown to be highly consistent with the theorists' judgements. In fact, the model behaved in a manner closer to the "average" theorist than any of the individual participating theorists.

In summary, six types of empirical evidence can be identified that seem to provide converging evidence concerning the influence of pitch proximity on melodic organization. One type of evidence is the preponderance of small intervals in most of the world's melodies observed by Ortmann, Merriam, Whinery, Fred, Dowling, and Huron. A second type of evidence is the reciprocal prevalence of large pitch intervals in pseudo-polyphonic passages found by Dowling. A third type of evidence is the auditory preference for small intervals in melodies found by Dowling. A fourth type of evidence is the perceptual expectation for small intervals in continuations of melodic contours found by Carlsen. A fifth type of evidence is the avoidance of part-crossing in polyphonic music measured by Huron. And a sixth type of evidence is the successful computer modeling of expert judgements of melodic organization.

The strengths of this research include the following highlights:

1. Several specific experiments (such as the Miller and Heise experiment) were directly replicated.
2. In addition, several specific hypotheses were corroborated by different researchers. Of the nearly 30 researchers involved, many were unaware of the work of other scholars, suggesting a high degree of independent observation.
3. Music scholars (such as Ortmann and Merriam) carried out independent lines of research from the experimental psychologists (such as van Noorden and Bregman). The music scholars were motivated by an interest in melodic organization and voice-leading, whereas the psychologists were interested in auditory organization and streaming. The comparative independence of this work provides a degree of cross-disciplinary convergence.
4. The research involved both the use of controlled experiments and descriptive musical surveys.
5. The listening experiments entailed a variety of types of stimuli, including pure tones, complex tones, simple tone sequences, musical passages, and speech.
6. The listeners were asked to perform a number of different tasks, including reporting how "expected" the next pitch is, singing the next pitch in a sequence, judging "how good" a melody sounds, identifying the "best pitch" to fill a melodic gap, and other tasks.
7. Similar results were found in both music and speech.
8. The researchers studied both listeners and compositional artifacts (notation).
9. The musical repertories studied included vocal and instrumental music from many different cultures. The repertories included classical, folk, and popular idioms.
10. Computer modeling was carried out and the model was validated by comparing model behavior with that of professional music theorists. The operation of the model was based on the established perceptual research.
11. Finally, the empirical research was directly linked with explicit claims made in classic music theory tracts. The resulting principles are consistent with traditional advice given in classic music theory texts.

Of course there is room for improvement. A better case might consider the following:

1. Although the experimental research was carried out in several countries (Britain, Canada, Germany, Hungary, Italy, USA), all of the listeners were Western.
2. No performance-oriented research has been undertaken. For example, an experiment might be carried out to show that improvised melodies tend to use small pitch intervals.

It bears emphasizing that no volume of evidence is sufficient to establish the truth of some claim. As we saw in the case of Newtonian mechanics, even ideas that have a long history of converging evidence can fall apart. There remains the possibility that a superior interpretation for pitch proximity may arise.

The Importance of Converging Evidence

Confirmation Bias

It is clear that assembling converging evidence in support of an idea can entail a great deal of work. Is all this work worthwhile? Why do we need converging evidence?

The simple answer to these questions is that scholars make mistakes. In medicine, it is relatively rare that a single observation or analysis is assumed to establish a point. Yet, in humanities scholarship -- including music research -- it is not uncommon for scholars to assume that a single observation or analysis has established something. Failing to further test an idea can have onerous repercussions. Once a concept is accepted, it is often difficult for scholars to abandon it. In a famous series of studies, Peter Wason and Philip Johnson-Laird showed that people are tenacious in holding on to a particular view, even when the contradicting evidence is obvious (Wason, 1960; Wason & Johnson-Laird, 1972). Observations that coincide with our theories are taken as supporting proof and are given great mental weight. On the other hand, observations that contradict our theories are taken as exceptions and have little mental impact. This psychological phenomenon is known as confirmation bias.

Unfortunately, the problem does not end with the fact that untenable ideas fail to be abandoned. Since scholars tend to build on the work of previous scholars, this can easily lead to the construction of elaborate intellectual sand castles. Why do we need converging evidence in music research? Because many important ideas in music scholarship are not well established.

Conclusion

There are two dangers in inductive research. As we've seen, one danger is to assume that something can be "proven" through observation. Over the past three decades, a number of critiques in the philosophy of science have highlighted the problem of induction (Agassi, 1975; Feyerabend, 1975; Gellner, 1974; Kuhn, 1962/1970; Lakatos, 1970; Quine, 1953; and others). However, a second danger is to assume that inductive knowledge is impossible. The best-known critiques of inductive reasoning (such as Paul Feyerabend's Against Method) all use historical examples to make their case. That is, specific historical observations are used to support a general claim about the fallibility of induction. In short, these authors rely on inductive reasoning itself to make the case that inductive reasoning is problematic. If inductive reasoning is problematic, then it follows that we ought to mistrust their conclusions.

Many music scholars have mistakenly regarded the problem of induction as a fundamental problem in science. However, the problem of induction -- drawing general conclusions from specific observations -- is not restricted to empirical approaches to music. It is similarly found in post-modernist music scholarship. Consider, by way of example, the following argument from Burr (1995). Burr begins by discussing how attitudes towards children have changed historically:

Whether one understands the world in terms of men and women, pop music and classical music, urban life and rural life, past and future, etc., depends upon where and when in the world one lives. For example, the notion of childhood has undergone tremendous change over the centuries. What it has been thought 'natural' for children to do has changed, as well as what parents were expected to do for their children (e.g. Aries, 1962). It is only in relatively recent historical times that children have ceased to be simply small adults (in all but their legal rights). And we only have to look as far back as the writings of Dickens to remind ourselves that the idea of children as innocents in need of adult protection is a very recent one indeed. We can see changes even within the timespan of the last fifty years or so, with radical consequences for how parents are advised to bring up their children. This means that all ways of understanding are historically and culturally relative. [pp. 3-4]

Burr's argument is inductive. The specific example of changes in attitudes towards children is cited as supporting Burr's general conclusion that "all ways of understanding are historically and culturally relative." As Hume pointed out, such forms of reasoning are useful, but unreliable. Notice, moreover, that the argument that I have just made is also inductive. That is, I have claimed, from a single example, that inductive reasoning is ubiquitous. There is no getting away from it: inductive reasoning is commonplace, useful and problematic.

At this juncture, the more serious problem is to assume that inductive knowledge is impossible -- that we cannot learn from observation. We may be damned if we learn from experience, but we are certainly damned if we don't learn from experience. Inductive reasoning is a necessary evil, but prudent measures like converging evidence can help to keep the beast at bay.

As we have argued, the most reliable knowledge arises when several independent research methods point to the same interpretation. We can be most confident of our knowledge when, no matter how we look at the phenomenon, the same answer is supported.

In music research, converging evidence can come from a number of different sources. In laboratory environments, listening experiments may be done using highly controlled stimuli. Alternatively, experiments may be carried out in more natural settings using authentic musical materials. In other cases, physiological evidence (such as cardiograms or electroencephalograms) may be consistent with a particular hypothesis or interpretation. Evidence may also be sought in samples of musical notations or in measurements of performance nuances or sound recordings.

Introspective phenomenological accounts might also tend to support one hypothesis over another. Similarly, studies of groups of people in certain social situations may be informative. Historical treatises and other theoretical writings might provide evidence in support of a given account. Finally, cross-cultural evidence may be examined -- through experiments, studies of musical materials, performances, improvisations, or indigenous accounts that are either similar or contrasting from culture to culture.

Unfortunately, relatively few musical phenomena have been investigated from so many points of view. For most phenomena, we must remain skeptical of the preliminary results until further corroborating studies are carried out.

References

Agassi, Joseph. (1975). Science in Flux. Dordrecht, Holland: D. Reidel Publishing Co.
Berardi, A. (1687/1970). Documenti armonici. Bologna: Giacomo Monti, 1687; reprint edition, Bologna: Forni, 1970.
Bozzi, P., & Vicario, G. (1960). Due fattori di unificazione fra note musicali: la vicinanza temporale e la vicinanza tonale. Rivista di psicologia, 54 (4), 253-258.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: M.I.T. Press.
Bregman, A. S., & Campbell, J. (1971). Primary auditory stream segregation and perception of order in rapid sequences of tones. Journal of Experimental Psychology, 89 (2), 244-249.
Burr, Vivien. (1995). An Introduction to Social Constructionism. London: Routledge.
Carlsen, J. C. (1981). Some factors which influence melodic expectancy. Psychomusicology, 1, 12-29.
Cavalli-Sforza, L., & Cavalli-Sforza, F. (1995). The Great Human Diasporas: The History of Diversity and Evolution. Reading, MA: Addison-Wesley.
Ciocca, V., & Bregman, A. S. (1987). Perceived continuity of gliding and steady-state tones through interrupting noise. Perception & Psychophysics, 42, 476-484.
Darwin, C. J., & Gardner, R. B. (1986). Mistuning a harmonic of a vowel: Grouping and phase effects on vowel quality. Journal of the Acoustical Society of America, 79, 838-845.
Deutsch, D. (1975). Two-channel listening to musical scales. Journal of the Acoustical Society of America, 57, 1156-1160.
Dowling, W. J. (1967). Rhythmic fission and the perceptual organization of tone sequences. Unpublished doctoral dissertation, Harvard University, Cambridge, MA.
Dowling, W. J. (1973). The perception of interleaved melodies. Cognitive Psychology, 5, 322-337.
Feyerabend, Paul. (1975). Against Method: Outline of an Anarchistic Theory of Knowledge. London: Verso Edition.
Fitzgibbons, P. J., Pollatsek, A., & Thomas, I. B. (1974). Detection of temporal gaps within and between perceptual tonal groups. Perception & Psychophysics, 16 (3), 522-528.
Gellner, Ernest. (1974). Legitimation of Belief. Cambridge: Cambridge University Press.
Huron, D. (1989). Voice segregation in selected polyphonic keyboard works by Johann Sebastian Bach. Unpublished doctoral dissertation, University of Nottingham, UK.
Huron, D. (1991). The avoidance of part-crossing in polyphonic music: Perceptual evidence and musical practice. Music Perception, 9 (2), 135-154.
Kuhn, Thomas S. (1962/1970). The Structure of Scientific Revolutions (2nd edition, 1970). Chicago: University of Chicago Press.
Lakatos, Imre & Musgrave, Alan (eds.) (1970). Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.
Merriam, A. P., Whinery, S., & Fred, B. G. (1956). Songs of a Rada community in Trinidad. Anthropos, 51, 157-174.
Miller, G. A., & Heise, G. A. (1950). The trill threshold. Journal of the Acoustical Society of America, 22 (5), 637-638.
van Noorden, L. P. A. S. (1971). Discrimination of time intervals bounded by tones of different frequencies. IPO Annual Progress Report, 6, 12-15.
van Noorden, L. P. A. S. (1975). Temporal coherence in the perception of tone sequences. Doctoral dissertation, Technische Hogeschool Eindhoven; published Eindhoven: Druk van Voorschoten.
Norman, D. (1967). Temporal confusions and limited capacity processors. Acta Psychologica, 27, 293-297.
Ortmann, O. R. (1926). On the melodic relativity of tones. Princeton, NJ: Psychological Review Company. (Vol. 35, No. 1 of Psychological Monographs.)
Pattison, H., Gardner, R. B., & Darwin, C. J. (1986). Effects of acoustical context on perceived vowel quality. Journal of the Acoustical Society of America, 80, Supplement 1.
Petter, G. (1957). Osservazioni sperimentali sulla natura dell'effetto tunnel. Rivista di psicologia, 51 (3), 1-15.
Quine, Willard. (1953). From a Logical Point of View. Cambridge, MA: Harvard University Press.
Rameau, J.-P. (1722/1971). Traité de l'harmonie. Paris: Ballard, 1722; trans. by P. Gossett as Treatise on Harmony. New York: Dover, 1971.
Schouten, J. F. (1962). On the perception of sound and speech. Proceedings of the 4th International Congress on Acoustics (Copenhagen), 2, 201-203.
Steiger, H., & Bregman, A. S. (1981). Capturing frequency components of glided tones: Frequency separation, orientation and alignment. Perception & Psychophysics, 30, 425-435.
Tougas, Y., & Bregman, A. S. (1985). Crossing of auditory streams. Journal of Experimental Psychology: Human Perception and Performance, 11 (6), 788-798.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129-140.
Wason, P. C., & Johnson-Laird, P. N. (1972). Psychology of Reasoning: Structure and Content. London: Batsford.
Zarlino, G. (1573/1965). Le istitutioni harmoniche III. Venice: Francesco Senese, 1573; reprint edition, New York: Broude Brothers, 1965.

This document is available at http://dactyl.som.ohio-state.edu/Music829C/converge.html