"Perhaps because of the noisy and chaotic nature of the production process, these tools are located in the south side of the encircling ring architecture." (Bakersmith Report, 2015, p.37)

Audio: A Multimedia Toolbench Chapter

Concepts

"The intuitive mind is a sacred gift and the rational mind is a faithful servant.
We have created a society that honors the servant and has forgotten the gift."
Albert Einstein


Selected Audio Software & Hardware for Input, Manipulation & Output
[Table placeholder: columns are Title, Access, Online Tutorials, Reviews, Examples, Publications, Company, Costs. See the Audio Toolbench page for entries.]

 

The Encyclopedia Britannica states that "sound is a mechanical disturbance from a state of equilibrium that propagates through an elastic material medium". Through air at sea level, sound travels at 741.1 miles per hour, and it travels faster through denser, more elastic media: in seawater, sound moves more than four times faster than in air. But knowing such detail about sound provides virtually no guidance to the creator of a wide variety of compositions in the information age. Consequently, this chapter is less concerned with the physics of sound and more concerned with its human perception. This chapter's emphasis is on the synthesis of audio into the larger mix of communication and composition elements that are important to our culture and to teachers and learners. Synthesis in this chapter means a unified composition of audio (sound or music) and any combination of other communication elements such as text, still images, animation, video, virtual reality, and electronic sensors and remote control.

Prior to computer technology, a form of "parallel play" was possible, for example by turning on music while reading a book, or listening to the radio broadcast of a game while watching the game on TV. This chapter will provide ideas that can be used in the parallel-play environments of the legacy systems still common in our homes, schools and businesses. But this chapter's goal is to look beyond that era. The electronic age allows new compositions in which the audio and other elements are considered parts of a unified composition, in the same way that music tracks are integrated so intimately into movie composition. Such compositional design allows one individual to take responsibility for all elements of the synthesis, or it can engage multiple composers in the unification of a single work. Such synthesis across the major forms of communication and composition is only possible using computer technology.

The concept of audio represents two overlapping elements: sound and music. Sound has alerted us to danger and to opportunity. Speech, the creation of particular sounds for particular objects and ideas, was a watershed experience at some unknown point in human history. Music is a specialized form of sound, but a form of such importance to our species as to have had universal and significant appeal across all time in all cultures of which we have knowledge. While sound effects and speech have been used to emphasize the rational relationships of cause and effect, music has been used to accent the emotional, the non-rational aspect of human communication. Major elements of culture and communication throughout human history have used music to magnify and enhance their power and efficacy. Music has the capacity to reflect and influence human emotion. Though current culture accepts music's intrinsic value as an art unto itself, it also has a significant history of extrinsic relevance to seemingly everything; it is used at times like salt on nearly every food for thought. Music has never been more readily available nor more widely used.

At this stage in United States educational history, the role of both sound and music has developed certain curious characteristics. The apparent exception to the universality of music is the educational system of, at least, the United States. In the last 50 years particular curriculum philosophies have created a tension between the "arts" and the "basics" of education, meaning reading, writing and mathematics. The arts, including music, have been increasingly specialized and increasingly marginalized. This was done primarily to provide greater time on task for what are referred to as the curriculum basics, but also to provide the music educator with focused time on their own agenda. The debate about extending the school day and the school year to provide even more time for instruction in the basics implies a sense of increasing failure to achieve national curriculum goals in the basics for many students. One must ask whether this separatist movement is self-defeating. Given the perceived relevance of art in general and music in particular to human culture, perhaps learners find math and reading increasingly less relevant as the arts become increasingly less relevant to core educational values. An in-depth look at the role of the arts and music in human history suggests that a reversal and rethinking of this particular curriculum trend would have greater long-term strategic value.

Perhaps the primary use of sound in current classrooms is in its enforced absence. Sound is a significant means of signaling obedience and extending control. Here, sound has more negative connotations than positive ones. Schools teach that the making of sound that is not specifically authorized is an act of misbehavior if not outright defiance. Perhaps this explains some of today's popular musical trends. Today's classrooms are the polar opposite of the "blab" schools of Abraham Lincoln's day, when everyone was to read and think out loud and the ear was trained to ignore the sounds of others and to concentrate on one's own mental activity. In today's classroom, sound of any kind other than one individual's voice at a time is generally perceived as a disruption to the concentration of others. Further, music's role in the overall curriculum is one of "auditory cheesecake", nice but not essential.

This is in some contrast to the home environment, whose central rooms generally have some radio or television producing sound, and children's rooms, which frequently have some kind of music playing throughout waking hours. It is also in contrast to most business environments, which allow music to be playing while work is underway or which encourage music to be played in general sales areas. Music and sounds play an important role in today's psychotherapy, geriatrics, and advertising. Even dairy barns and animal feed lots have found that music improves the effectiveness of the environment. School use of sound is certainly in contrast to the newest cultural forms of communication, computers and their networks. In just the last few years, sound capacity (including CD and DVD players, speakers and headphones) has become a standard computer feature. Network transmissions include sound in audio and video conferencing, in Internet phone calls, in the transferring of musical data files, and in the automatic transfer of data that is recognized by media plug-ins and players which manage sound and video across the Internet.

What explanation can there be for this interesting contrast between schools and general culture? One explanation might be that forms of economic and technical determinism are at work. Educators may be managed covertly by the capabilities of a given technology. The dominant form of communication in most classrooms is cellulose technology. Perhaps there is an inherent operational advantage to aligning closely with the characteristics of the technology that dominates, in this case a silent communication of the visual, of text and still images. Paper is also an extremely cost-efficient form of mass communication. One could conclude that educational systems have optimized their environments for lowest-cost mass communication. Based on this line of thought, one could predict that as classrooms change to optimize for the emerging and highly interactive global culture, a culture which emphasizes more direct individualized and small-team communication across vast distances, the economics and technologies of the classroom will change. If so, the economics of paper versus computer networks and the technical features of textbooks versus media-savvy communication machines will dramatically alter the nature of common classroom behaviors and consequently their use of sound.

There are other, deeper reasons for the segregation of basic human communication and composition skills, reasons that reach to the core of our curriculum theories about why and how we teach and learn. There is a perception that to best teach a skill we must isolate it as a scientist isolates a variable for experimentation, reducing the number of interactions to the minimum possible. Divorce music instruction from reading, this thinking goes, and reading instruction will improve because the other goals of music instruction are no longer included; the number of goals being worked on has been simplified.

Where does this line of thinking come from? It comes from a rationalistic framework that hungers for simplified cause and effect in order to maximize control and production. Our culture hungers for it because we have seen and been told of its power. Such analytical thinking has had incredible power in the physics of building better tools and machines. It seems straightforward to apply this logic to living systems, including people. If we spend more time directing a child to work on word attack skills, then this thinking implies that their word attack skills will get better. If we include a feedback loop to spot and correct errors in these word attack skills, they will reach perfection in them more quickly. Music education would use the same logic: spend more time learning notes in the treble clef, and we will read music better; include a feedback loop to spot and correct errors, and perfection comes faster. The more important the process, the greater the time that should be spent on it. And since more people work outside the music industry than in it, drop music and add time on text in books.

In fact, it is a philosophical leap of faith to conclude that a process or method that succeeds wonderfully in improving simple inanimate objects also works on improving complex living organisms or the cultural systems by which these living entities organize. If human beings were machines that did not require special motivation to maintain effort over ever longer periods of focus, and if a human being's power to self-control what their mind attends to were not an issue, then this rationalistic philosophy would have greater educational value. However, such philosophy has severe limitations in maintaining quality output whether working with adults or children. Increasingly, even factories that produce inanimate objects have had to re-examine and deconstruct this model for their social organization to maintain the quality of their goods.

If it is important to build on a scientific frame of reference, then the study of ecology and the logic that descends from it has a value that needs to be examined. In the science of ecological systems, it is understood that complexity and diversity are essential to health. Divorce and segregation of the elements is generally unnatural. The removal of species is seen as weakening the overall system. Homogeneity is a sign of impending death. Increasing the number of interactions between elements of the system is a means to improve its well-being. The nature of interaction between even two or three major elements of an ecological system is seen as so complex that long-term prediction and control is not possible. Applying this philosophy to education, the music teacher and the reading teacher would be encouraged to increasingly merge their roles and look for new ways to interact between their goals. Simple but effective ideas will emerge. For example, the reading teacher struggles to motivate the child to increase fluency with new words by reading the same passage over and over. The music teacher knows that with the right song, a combination of new words will be hummed and practiced for days without any external pressure.

Sound and music are an integral part of human culture. Sound provides its own intrinsic values. For example, there is no substitute for hearing the actual vocal pacing and inflection of a voice to derive the full meaning of words. Using identical words, voice mail provides more information than email and this additional information is not only important to the listener but sometimes more important than the literal meaning of the words.

In spite of the common use of sound in general and music in particular, it remains hard to rationalize audio as required or not required in any given situation in the design and composition of communication. We must work with many generalities. These generalities also seem to have some application across species. Any aural projection is a clarification of territory (e.g., a bird at the edge of its territory in the spring): here I am, and this is how I feel about myself and about the presence of others. The social nature of our species is such that we can often empathize and resonate with this projection of energy (e.g., the howl of the wolf and the reply of the pack). To talk, sing or play can also be an invitation to others to add their resonance in unison or harmony, whether the receiver actually returns a response or only thinks it. There is a "language of the emotions" but not a formalized one, a nonrational communication so imprecise that no one has yet articulated it, but so important to particular situations that our vocal and musical intonation varies with every social setting. If it does not, we are considered insensitive and perhaps impolite. Further, entire languages, such as Chinese, are built on tonal systems, in which the rise or fall of a syllable's pitch makes it a different word. In other words, people are sensitive to the nuances of sound and can become much more sensitive to its communication potential. The importance and value of sound will make it an integral part of "web processing" (e.g., web-based communication) in the years ahead.

There are a few specific cognitive facts that are also available. Approximately one percent of human beings are tone deaf, a condition also known as amusia; they cannot sing in tune or tell two tunes apart. An even smaller percentage of the population is autistic, a condition that could be described as varying degrees of emotional deafness. Yet most autistic people are very musically capable and some are musical savants, demonstrating an ability for what appear to be emotionally deaf people to work with what many in the population see as a language for emotion. Studies have shown that intense musical practice with an instrument leads to growth in the cerebral cortex, the area most strongly associated with higher brain function. Music also can affect the "levels of various hormones, including cortisol (involved in arousal and stress), testosterone (aggression and arousal) and oxytocin (nurturing behavior) as well as trigger release of the natural opiates known as endorphins. Using PET scanners, Zatorre has shown that the parts of the brain involved in processing emotion seem to light up with activity when a subject hears music" (Lemonick, 2000, p.74). Even so, such findings provide no more than general guidelines for composition. Why our species distinguishes itself with its verbal and musical abilities remains one of the unknowns of science.

The value and use of aural composition (including speech and music) in a classroom depends on the instructor's perception and expectation of the availability of the technology for both its output and input. The technology of paper (e.g., cellulose technology) inherently supports only part of human communication capacity, and at that, only part of visual communication. That is, text easily integrates with imagery that includes such still elements as graphs, photographs and tables. Paper is not capable of dealing with stored or live communication that extends through some measure of time, including sound, music, animation, video and virtual reality. In the new age of communication that is emerging, paper's dominance is doomed. This is not to say that paper will or should disappear. It will always have its conveniences, and we are habituated to its use. But its fundamental inability to handle the full range of stored human communication, compared with the capacity of computer technology to do so, suggests that its time has come and is going. Its share of the market of communication falls with advances in electronics. The length of time that paper technology remains in the twilight of its golden years is at the mercy of the cost of production of computer chips and network technologies, all of which drop in price and double in capacity every 18 to 24 months. Further, the form factor of electronic display of information is so malleable that electronic structures that mimic the characteristics of sheets of paper are emerging. The switchover point is relentlessly approaching at which the cost of computer delivery of information will make computer delivery and sharing of electronic information as universal as paper. Relevant to the use of sound and music, in the rest of the culture the tide has already turned. The regular use of radios, stereos and speakers is commonplace everywhere but in the classroom. Further, sound technology is a basic part of the personal computer package. However, without the perception and expectation of the use of audio, even the computer, through the use of email, can contribute to the devocalization of human culture.

As we increasingly near the dominance switchover point between paper and electronic technologies, the role and value of time driven elements such as sound, animation and video will grow. The expectation that educational systems will teach with and about them will grow. With this expectation will come a need to compose, edit and publish with these fundamental communication elements. Sound is the easiest of these elements to address now with current computer technology.

In considering sound, we examine a harbinger of other changes to come. Yet we must also contend with a rather tawdry history of the technology for sound recording and playback. Though many of the technologies for the storage of sound, such as wax, plastics and wires, will last for decades or centuries, our inventiveness has created a number of pockets of "dead media" in which the playback devices are no longer supported and maintained. It is the rare university library that has any device for replaying Edison's scratchy wax cylinders of the 1800s or the high-quality sound of 8-track tapes of the 1970s. The phonograph player may be next in line for extinction. Fortunately, computer technology provides a wide range of tools for converting digital media stored in one electronic sound format to another, and the long-term retention of such conversion tools is inexpensive.
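Such conversion is now nearly a one-line operation in many tools. As one illustration, here is a minimal sketch assuming the third-party pydub package (which relies on an ffmpeg installation); the file names are hypothetical:

    # A minimal format-conversion sketch, assuming the third-party pydub
    # package (pip install pydub) and ffmpeg installed on the system.
    # 'lecture.wav' and the output name are hypothetical examples.
    from pydub import AudioSegment

    sound = AudioSegment.from_file('lecture.wav')  # decode the source format
    sound.export('lecture.mp3', format='mp3')      # re-encode as MP3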

In order for computers to facilitate the use of sound, the sound must be in digital format: either the computer must digitize what comes from a microphone, or it must provide software that can compose with already digitized sounds created by electronic devices (MIDI). In part because of the data storage requirements of digitizing real audio, the MIDI system was developed in the early 1980s. MIDI stands for Musical Instrument Digital Interface and is a standard method for digital technologies to pass music and sound information to each other (Voyetra, 2003). Multimedia composition benefits greatly from the use of MIDI sound. One minute of CD-quality stereo music or sound can require 10 megabytes or more. Compressing this sound reduces its quality. With compression at 11 kbits (11 thousand bits of data per second), one minute of speaking or music that is still understandable and useful might take up about 80 kilobytes, though it might not be very enjoyable to hear. Tighter compressions are also possible, with varying degrees of quality. In contrast, one minute of MIDI music might require only about 20 kilobytes of data, and the sound would be of relatively high quality.
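To make this arithmetic concrete, here is a small Python sketch (illustrative values only, not tied to any particular recording or product):

    # Storage arithmetic for one minute of audio at various qualities.

    def pcm_bytes(seconds, sample_rate=44100, bits=16, channels=2):
        """Uncompressed PCM size in bytes (CD quality by default)."""
        return seconds * sample_rate * (bits // 8) * channels

    def compressed_bytes(seconds, kbits_per_sec=11):
        """Size at a fixed compressed bit rate, in bytes."""
        return seconds * kbits_per_sec * 1000 // 8

    minute = 60
    print(pcm_bytes(minute) / 1_000_000)    # ~10.6 MB of CD-quality stereo
    print(compressed_bytes(minute) / 1000)  # ~82.5 KB at 11 kbit/s
    # A one-minute MIDI file stores only note instructions -- commonly ~20 KB.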

The MIDI system requires that the electronic sounds to be used have been stored on the computer in advance. When the operating system is installed in a computer, the MIDI sounds are included. There are 128 standard instrument sounds, and an extension to this standard allows other sounds to be created and added. The 20 kilobytes of data in one minute of MIDI sound includes just the data on how to play the sounds that are already present. That is, the computer file instructs the computer to pick out the various sounds already stored on the computer and play them as a sequence of notes, a sequence which includes their volume, length, and pitch. Such files transmit across computer networks very quickly, and enable sound to play as soon as a web page is opened. In complex multimedia compositions, the computer's CPU can be told to simultaneously display video, rotate three-dimensional objects, scroll text, receive further information from the Internet or hard drive, and more. Simultaneously playing high-quality digitized sound could bog down the composition so that none of the elements display or work properly. If audio is important, playing a small MIDI file may be essential to allowing the composition to function as intended. MIDI can also be used to pass along timing information that is used to control lights, fog machines, CD players and other devices hooked up to electrical power.
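A short sketch makes the "instructions, not sound" idea tangible. This assumes the third-party mido library (pip install mido); the instrument and note choices are arbitrary examples:

    # A minimal MIDI "instruction" file: which notes to play, when, how
    # loud, and with which of the 128 standard instrument sounds. The
    # sounds themselves live on the playback device, not in this file.
    from mido import Message, MidiFile, MidiTrack

    mid = MidiFile()
    track = MidiTrack()
    mid.tracks.append(track)

    track.append(Message('program_change', program=40, time=0))  # 40 = violin
    for note in (60, 64, 67):  # C4, E4, G4: a C major arpeggio
        track.append(Message('note_on', note=note, velocity=64, time=0))
        track.append(Message('note_off', note=note, velocity=64, time=480))

    mid.save('arpeggio.mid')  # the resulting file is only a few hundred bytes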

MIDI sound quality is not uniform; different computers use different qualities of MIDI sounds. Some MIDI sounds are synthesized by electronic components, an approach called FM synthesis, and these sounds can vary in how much data is stored for each sound. Other MIDI sounds are digital samples of real instruments, an approach called wavetable synthesis. Though these sounds can also vary in how much data is stored to define each sound, this approach enables a much higher quality of sound, a quality that approaches CD recording capability.
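FM synthesis itself is simple enough to demonstrate in a few lines: one sine wave (the modulator) bends the frequency of another (the carrier). This standard-library sketch writes a two-second FM tone to a WAV file; the frequencies and file name are arbitrary choices:

    # Bare-bones FM synthesis, standard library only.
    import math, struct, wave

    rate = 22050                       # samples per second
    carrier, modulator = 440.0, 220.0  # Hz; simple ratios give clearer timbres
    index = 3.0                        # modulation depth; higher means brighter

    frames = bytearray()
    for n in range(rate * 2):          # two seconds of sound
        t = n / rate
        sample = math.sin(2 * math.pi * carrier * t
                          + index * math.sin(2 * math.pi * modulator * t))
        frames += struct.pack('<h', int(sample * 32000))  # 16-bit sample

    with wave.open('fm_tone.wav', 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(bytes(frames))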

In contrast to MIDI, another format, the MOD (module) file, packages the instrument sounds together with the note data for a composition. Because the same samples travel with the piece, MODs provide a much more uniform listening experience from machine to machine.

To better understand audio's place at the electronic table, now and in the future, one needs deeper levels of experience with audio composition. Three stages of audio composition are considered: input, manipulation, and output. These cannot be considered totally distinct stages, however, for input and manipulation are controlled by the purposes of output, by knowledge of audience. The software tools frequently cross these boundaries, integrating input, composition and output. This makes the categorization of software into the three divisions difficult; their placement is more a matter of emphasis than pure match. After this exploration, the chapter returns to the issue of integrating audio with the previously considered elements of communication: text, still images and video.
 

Input



Input Concepts

Input to the computer in this chapter is about the origination, creation and capture of sound or music. The input might be natural (the  human voice, environmental sounds, a musical instrument) or electronic (a MIDI keyboard or other MIDI instrument, or other man-made sounds using various electronic components).

Natural Sounds and The Recording Studio

Specialized centers for recording audio information are called recording studios. Some have a significant capacity to capture the highest quality of sound and to protect against interfering sounds; such studios can cost upwards of hundreds of dollars per hour to use. Western Carolina University's recording studio, for example, opened in November 2003. Recording can also occur at any location where some attention is given to protecting the sound quality of a microphone connected to a recording device, whether tape recorder or digitizing computer. This might be as simple as asking a classroom of students to be very quiet for a minute while a tape recording is being made at the teacher's desk or at a center in the classroom. What kind of microphone and other equipment is best for which setting? Most of the answers to this and related questions will be found in resources referenced in this chapter, such as equipment vendors and textbooks. The selection and use of microphones is an enormous industry around which audio engineering careers are built. There is an enormous body of literature to consult on audio engineering and a wide range of audio personnel to contact who work in the music, radio or television industries. However, there are some general recording suggestions that have wide application.

If the sound is produced outside the computer, there are minimum and maximum levels within which the sound to be recorded must stay. Too soft, and the microphone will record nothing. Too loud, and the pickup will clip the signal into loud, indistinguishable noise. The type of microphone selected can have a major impact on finding the optimum range. Good quality sound generally means spending $100 or more on a microphone.
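After a take, it is worth checking that the recording actually stayed inside that range. A small standard-library sketch can report the peak level; it assumes a 16-bit mono WAV file, and the file name is hypothetical:

    # Report the peak level of a 16-bit mono WAV recording.
    import struct, wave

    with wave.open('take1.wav', 'rb') as f:
        raw = f.readframes(f.getnframes())

    samples = struct.unpack('<%dh' % (len(raw) // 2), raw)
    percent = 100 * max(abs(s) for s in samples) / 32767

    if percent > 99:
        print('Likely clipped: re-record at lower gain or move the mic back.')
    elif percent < 10:
        print('Very quiet: move the mic closer or raise the gain.')
    else:
        print('Peak at %.0f%% of full scale.' % percent)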

The sound quality of the audio is very important to the receiving person's understanding. The closeness of the microphone to the sound source is an important variable in recording quality. If you cannot get the microphone close, there are other options. One is the use of wireless microphones, which should be a part of every school media center; children's voices are often soft and shy. A second option is a "shotgun" microphone, a long, tube-like attachment to the microphone (mic) that is pointed at the sound source. Such devices not only reduce extra environmental sounds, but can grab sounds from many yards away. Windy days can also produce the sound of wind currents over the mic. Various coverings are available that block the wind with minor loss of sound pickup; these can be as simple as a small cloth hand towel.

In many cases, the sound pickup can best be enhanced not by better microphones, but by enhancing the quality of the sound production. This can mean working with the person or performer on sufficient volume. It also means developing a clarity to the sound so that variations are distinct. For a person, this has as much to do with relaxing and warming up the voice before speaking as it does with the careful articulation of the words or music. It also means finding a pitch that is within the normal range for a set of vocal cords.

Providing a separate room which does not echo and provides some isolation from external sounds is the next step up in improving recording quality. Simple, inexpensive steps can be taken to reduce the bounce or echo effect in an existing room, such as hanging cloth or carpeting on some of the walls. Acoustic ceiling tiles are also easy to install. Isolating an existing room from external sounds generally requires more expensive remodeling. The best time to think about the design of a recording studio is when a school building is being built and architects and sound engineers can be consulted.

Though tape recorders and VCRs have traditionally been used to record sound, this chapter focuses on digital recording, which generally means including a computer in the recording process. Various software programs can be used to digitize the incoming sound in a computer and to manipulate or change the quality of the sound in various ways. Such audio recording programs include Pro Tools, Sound Edit, Cool Edit, Audacity and others. Recording studios play a dual role in both audio and video production. Video editors such as iMovie and Windows Movie Maker, which ship free with the Mac and Windows operating systems, can also record live sound as well as sound and video. Karaoke software and systems are also sometimes designed to record the singer's voice along with the background music, whether from a CD-quality source or a MIDI Karaoke player. Whether recording school news for the building's cable TV system, radio plays, Karaoke performances, or dramatic readings, the studio should be seen as playing an important role in highlighting real and personal values for the role of reading and writing in the lives of students.
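As a taste of how simple digitizing has become, here is a minimal recording sketch assuming the third-party sounddevice and scipy packages (pip install sounddevice scipy); the duration and file name are arbitrary examples:

    # Record ten seconds from the default microphone and save it as WAV.
    import sounddevice as sd
    from scipy.io import wavfile

    rate = 44100       # CD-quality sample rate
    seconds = 10

    print('Recording...')
    take = sd.rec(int(seconds * rate), samplerate=rate, channels=1,
                  dtype='int16')
    sd.wait()          # block until the recording finishes
    wavfile.write('narration.wav', rate, take)
    print('Saved narration.wav')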

Electronic Sounds

Two examples of highly advanced software programs that enable musical composition are Finale and Sibelius. Originally they would have fallen in the category of output programs: they were first designed as a kind of electronic engraver that speeds the process of setting onto sheet music notes that had already been written out by hand. More recently they have become major tools for the very invention and input of musical ideas. They and other programs have become a kind of musical word processor. MIDI instruments, including keyboard and guitar, input notes into the computer as they are played; the computer keyboard and mouse can also be used to enter notes. Once the notes are placed, the computer can not only print out the sheet music for various musicians to try, but can use MIDI instrument sounds to play back the work in progress. Numerous composers have testified to the transforming power that this feature has brought to their musical composition, saving them significant time and cost (Sibelius, 2003).

There are other programs, such as Band-in-a-Box, that use style guidelines and random number generators to create original musical works, or original variations on existing compositions; a toy sketch of the approach follows.
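This constrained-random principle is easy to demonstrate. The sketch below (again assuming the third-party mido library; the note choices and file name are arbitrary) walks randomly over a C major scale and saves the result as a MIDI file. Commercial products use far richer style rules, but the core idea is the same:

    # A toy rule-plus-randomness composer: a random walk over C major.
    import random
    from mido import Message, MidiFile, MidiTrack

    scale = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, C4 to C5

    mid = MidiFile()
    track = MidiTrack()
    mid.tracks.append(track)

    i = 0
    for _ in range(16):
        # Step up or down the scale by one or two degrees, staying in range.
        i = max(0, min(len(scale) - 1, i + random.choice([-2, -1, 1, 2])))
        track.append(Message('note_on', note=scale[i], velocity=70, time=0))
        track.append(Message('note_off', note=scale[i], velocity=70, time=240))

    mid.save('random_tune.mid')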

"Clip Sound"

The simplest way to gather sound or music for use in multimedia compositions is to find and copy free or non-copyright protected sounds and insert them in compositions.

In addition to library resources (see bibliography in the Audio Toolbench page) web sites are providing instruction on sound input. New sites arrive regularly. Try these different search terms: teaching sound; teaching audio; audio tutorials.

  • Input and Play: Musical compositions in this section cannot be used in multimedia compositions on the Internet without specific written permission. However, parts of them can be used by educators locally under the Fair Use provisions of U.S. copyright law.
  • Input to Edit: Audio composition can be divided into many steps and stages. The web sites above explore many facets of audio composition.

     

Manipulation

There is considerable overlap in the software tools for input and manipulation of sound. See the Audio Toolbench page for a wide variety of programs for editing sound and music. One class of tools edits the input of natural sounds; this would include programs like Pro Tools and Sound Forge. Another set of tools is used to input and then play back and edit musical compositions, such as Sibelius.

Output

Output systems might send the sound to a more traditional format such as CD, DVD, or audio tape. Output systems might also produce files to be uploaded for later use from a web page. The Sibelius software, for example, saves compositions as sheet music and provides a MIDI file that plays the composition, along with other output controls over the pitch and speed of the composition. If enough audio files are uploaded, sequenced and streamed, mixing live with previously recorded works, the end result is an online radio station. Thousands of such stations exist today, making online radio one of the most significant tools for helping students explore other cultures. Though the language may not be understood, the music of the spoken language and of the songs can be appreciated by all.

Synthesis

Linking: Multimedia; Compound Document; Comprehensive Composition; Information Integration; Web-processing



Published 3.1.2000. Updated 12.1.2003.
Page author: Houghton@email.wcu.edu