At New York's Manhattan School of Music, Linda McKnight is preparing to teach a Friday morning masterclass in double bass. It won't be much different from any other masterclass--there will be performances, demonstrations of technique by both instructor and students, and a healthy question-and-answer session--except for the fact that while McKnight gets ready in her classroom, her students are some 470 miles away at the Cleveland Institute of Music. They will participate via the internet.
The Manhattan School of Music pioneered this kind of interactive distance learning program in 1996, using it for private lessons, masterclasses, educational outreach, composer colloquiums, professional development sessions, and educational exchanges among schools both nationally and internationally. Multiple cameras provide different views and different perspectives crucial to teaching musical technique--such as close-ups of a hand on a keyboard or string.
The idea itself is not new; colleges and universities have experimented with interactive distance learning for years. What has changed in the nine years since the program began is the quality of the sound--full stereo reproductions that impress even the most discerning audiophile. "In terms of audio as it relates to distance learning, this is innovative," says Christianne Orto, director of Recording and Distance Learning at MSM. "What we've done is taken the technology and adapted it for a live music performance application."
Videoconferencing was not designed for music. In fact, for years, it featured little more than a picture with often tinny, telephone-quality sound to facilitate corporate meetings or distance learning classes.
But now, advances in audio "codecs," echo cancellation, and transmission methods have enabled MSM to connect students and musical artists around the country and around the world. A codec (short for "coder-decoder") is a device that converts analog audio signals into a digital format for transmission, and then converts them back into analog format on the receiving end to play through speakers. Echo cancellation is a process that removes audio echoes that distort sounds.
"The codecs and echo cancellation units have all dramatically improved," Orto says. "It used to be that echo cancelers wouldn't really understand the complexities of musical sound and would create weird howling sounds and strange artifacts of sounds. The echo cancelers are getting smarter and can handle the harmonics and frequencies of musical sound."
"High-quality sound is of paramount concern in musical video conference," says Orto. "It enables us to bridge the gap of distance to have the teacher here and the student in a remote location. It is really wonderful because it's live, interactive videoconferencing in real time."
Although audio recording and reproduction predate video recording by only about 50 years, the two technologies have advanced at very different rates. "The microphones have long been sophisticated," says Orto, "but now we can use those sophisticated microphones more effectively, because the video conferencing technology has caught up with us to a degree."
"I think we have done a lot to get the other elements of multimedia together--the picture and data transmission and the integration of other multimedia instructional tools," says Russ Colbert, global education market manager for Polycom, a firm that makes rich-media collaborative applications for the web. "But we've done disgustingly little to get the audio right for the room, and that seems crazy, doesn't it? If you don't have quality audio, what good is all that other stuff?"
Combining sound and video to the degree that Manhattan School of Music needs for lifelike music reproduction--or to provide the ultimate real-time distance learning class--still has hurdles to overcome.
For example, says Orto, even though MSM uses a variety of high-quality condenser and ribbon microphones to capture the sound, "we don't really know on the remote side what it sounds like, so we have to rely on the ears of those on the remote location to give us feedback," she says. "The next step of innovation would be for us to be able to know what the other side sounds like through some sort of monitoring device."
And, ultimately, technology has to answer to the laws of physics.
"At this point it would be a lie to say that two people could play a duet from remote locations," Orto admits. "We still have the speed of light to contend with."
The MSM setup features live audio-video communication on both ends. There is a coding-decoding delay and a transmission delay that together add up to between a half-second and a second. "The latency is there, and we are still working very hard to see whether the latency can come down through improved codecs," Orto says.
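A back-of-the-envelope calculation puts those figures in perspective. The sketch below assumes a straight-line path of 470 miles between New York and Cleveland and a signal speed in optical fiber of roughly two-thirds the speed of light in a vacuum. Both are simplifying assumptions, since real network routes are longer and pass through switching equipment, and the function and constant names are illustrative only.

```python
MILES_TO_KM = 1.609344
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 2 / 3       # assumption: typical for optical fiber


def one_way_delay_ms(distance_miles, velocity_factor=1.0):
    """Return the one-way propagation delay in milliseconds."""
    distance_km = distance_miles * MILES_TO_KM
    speed_km_s = SPEED_OF_LIGHT_KM_S * velocity_factor
    return distance_km / speed_km_s * 1000


vacuum_ms = one_way_delay_ms(470)
fiber_ms = one_way_delay_ms(470, FIBER_VELOCITY_FACTOR)

# Compare with the low end of the delay reported in the article (0.5 second).
reported_delay_ms = 500
share_of_reported = fiber_ms / reported_delay_ms

print(f"Vacuum: {vacuum_ms:.2f} ms, fiber: {fiber_ms:.2f} ms")
print(f"Share of a half-second delay: {share_of_reported:.1%}")
```

Under these assumptions the one-way propagation time comes to roughly 2.5 milliseconds in a vacuum and just under 4 milliseconds in fiber, a small fraction of a half-second. That suggests most of the delay Orto describes comes from coding, decoding, and network handling rather than from distance itself.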
With advances in code processing--and greater bandwidth availability through Internet2--the goal seems to be getting closer.
"I would like to see in the next three to five years that we could look forward to a real-time collaboration--something like a 10-millisecond delay, not detectable to the listener," Orto says.
Maybe you've experienced a problem like this: You are watching a video of, say, a lecture during which there is a question-and-answer session. The presenter gestures to an audience member, off-camera, who has raised a hand. There are a few moments of silence--a question is being asked, but you can't hear it. The speaker then answers the question, but you have no idea what was asked. Not exactly what you need as a viewer.
The solution is an intelligent audio-visual setup that can mimic the classroom experience for remote viewers, directing the camera to the source of the sound.
Such a system is in use at Pacific Lutheran University in Tacoma, Wash., says Bob Holden, associate director for Multimedia Services. "We have a new building set to go online in January, with about 20 smart classrooms. Each one is set up with a full sound and video system to record lectures," he says.
Because of the physical design of a traditional classroom--a large area with the instructor at one end and the students some distance away--recording audio properly requires a combination of solutions. Holden says professors can use a podium microphone or can wander around the room with a wireless head-worn microphone. Students can access a close-proximity touch-to-talk wired microphone--maybe one mic for every three or four students.
"The audio signal goes directly into the system and records to tape or disk to capture the audio cleanly," he explains. For the intelligent video portion of the presentation, however, Holden brings in student help.
"It's much cheaper to hire a student who is getting paid from work-study funds to go and operate the camera. They can be sensitive to whatever the programming is," he says. "If the instructor wants a class videotaped, we have students who are trained to go out and do it as a full production."
For many institutions, this type of setup is sufficient. It offers complete control of content and is relatively inexpensive. But what happens when camera operators are unavailable? Is there another way to capture the classroom without having to worry about whether a student camera operator is going to show up on time? The answer is sound-controlled cameras.
Polycom's Colbert explains the process this way: "The camera starts at a 'home' position on the instructor at the front of the room. When a student asks a question, her voice triggers a detection of the voltage created by the voice, so that detection sends a signal to the video controller that says, in effect, 'Someone is speaking from microphone number three; trigger automatic camera switching in the direction of microphone three.' Likewise, when the signal from microphone three stops, and there is no voltage present from the speaking student, the camera switches back to the home preset."
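The switching logic Colbert describes can be sketched in a few lines. This is a simplified illustration, not Polycom's implementation: the threshold value, the normalized level scale, and the names `HOME_PRESET` and `select_preset` are all assumptions made for the example.

```python
HOME_PRESET = 0          # hypothetical preset: camera on the instructor
VOICE_THRESHOLD = 0.1    # hypothetical normalized signal level (0.0 to 1.0)


def select_preset(mic_levels, threshold=VOICE_THRESHOLD):
    """Pick a camera preset from per-microphone signal levels.

    mic_levels maps a microphone number to its current signal level.
    If no microphone exceeds the threshold, the camera returns home;
    otherwise it switches to the preset of the strongest microphone.
    """
    active = {mic: level for mic, level in mic_levels.items()
              if level >= threshold}
    if not active:
        return HOME_PRESET
    return max(active, key=active.get)


# A student speaks into microphone three.
speaking = select_preset({1: 0.0, 2: 0.02, 3: 0.6})

# The student stops; no microphone carries a signal above the threshold.
silent = select_preset({1: 0.0, 2: 0.01, 3: 0.03})
```

In the first call the function returns 3, sending the camera toward microphone three; in the second it returns the home preset. A real controller would also have to decide what to do when two students speak at once, which this sketch resolves simply by choosing the stronger signal.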
While that system is effective and commonly used, it requires a lot of equipment and a lot of wires running around the classroom. An alternative is the system in use at Radford University (Va.).
"We have a faculty that likes to keep up with technology, and a challenge to my group was to build an electronic classroom for every one of our 150 classrooms on campus," says Randy McCallister, telecommunications engineer at the school. "In some of those rooms we use the Komatsu AirProjector Wireless Presentation System so people can walk around the room with their Gateway tablets and be projected on the screen from anywhere."
Besides operating its own cable television network, Radford also provides educational content to high schools in Virginia, and the institution shares classes with Virginia Tech as well as distance students in Roanoke, Abingdon, and other locations throughout the state.
"Most of our professors have been trained in these rooms, and we make them all look the same so they feel comfortable as they walk in and they know how they work," says McCallister. The institution sought a system that allowed distance learners to see and hear the classes as if they were actually present in the room. At the same time, classroom designers wanted a system that was self-contained and simple to operate.
"Audio itself was no longer a concern," he explains. "My bigger concern was making all the microphones balance. I have several classrooms that use touch-to-talk microphones--typically 20 to 22 microphones per room--and a table of students will share a microphone."
However, the number of microphones only added to the complexity. Besides the wiring that runs through the room, the setup requires an audio mixer to adjust mic levels. "Some people speak softly and some are boisterous. And there's always the threat of feedback," McCallister says. "Our goal was to make the class presentations easy to broadcast and strip this thing down to absolute simplicity."
The answer at Radford was a setup with a single, mounted sound-powered camera and a "smart" ceiling-mounted microphone array (CMA) from Polycom. The Polycom VSX 8000 system features three cardioid directional microphones arranged in a triangle. It also has a transparent Plexiglas mounting plate that serves a dual purpose: Near the microphones, the plate acts like a satellite dish, collecting and focusing audio from the room below, but the plate also blocks undesired sounds, such as air-conditioning units or plumbing, that are coming from within the ceiling. One CMA covers an area of about 30 by 30 feet, and several can be combined for larger rooms.
Microphone arrays have been around for a number of years, but they have traditionally picked up all sound in a room and couldn't discern which sounds were important and which were not.
But thanks to advances in this technology, the microphone array can focus on the most important sound, and even detect the movement of the speaker across the room, and across the stereo separation, mimicking the action of the human ear. As the speaker moves from the left to the right side of a room, the microphone adjusts its input to reflect that movement. It senses the loudest voice in the room and aims at it. The instructor doesn't need to wear a microphone, nor do students have to be near a microphone to be heard.
"We've been really happy with that arrangement," says McCallister. "We mount the CMA just a little bit closer to the professor's position, and it picks up sound around the room just fine."
The system doesn't require costly sound mixers or controllers. The sound-powered camera turns in the direction of that loudest sound. And those pesky question-and-answer sessions? "There's a few-second delay in the camera-turning motion," says McCallister, "so as long as the student is asking a question for more than three seconds, the camera will turn to the direction from which the sound is coming."
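The delay McCallister mentions can be thought of as a hold timer: the camera turns only after the loudest sound has come from the same new direction for a set time, so a cough or a dropped book does not swing it around. The sketch below illustrates that idea under stated assumptions; the class name, the direction labels, and the three-second default are hypothetical and do not describe how the Polycom system is built.

```python
class DelayedSteering:
    """Turn the camera only after a new sound direction persists."""

    def __init__(self, hold_seconds=3.0, start_direction="front"):
        self.hold_seconds = hold_seconds
        self.current = start_direction   # where the camera points now
        self._candidate = None           # new direction being timed
        self._candidate_since = None     # when that direction was first heard

    def update(self, loudest_direction, now):
        """Report the loudest direction at time `now` (in seconds)."""
        if loudest_direction == self.current:
            # Sound is back where the camera already points: cancel the timer.
            self._candidate = None
            self._candidate_since = None
        elif loudest_direction != self._candidate:
            # A different direction has become loudest: start timing it.
            self._candidate = loudest_direction
            self._candidate_since = now
        elif now - self._candidate_since >= self.hold_seconds:
            # The same new direction has persisted long enough: turn.
            self.current = loudest_direction
            self._candidate = None
            self._candidate_since = None
        return self.current


camera = DelayedSteering()
camera.update("left", 0.0)            # student on the left starts a question
camera.update("left", 2.0)            # still under three seconds
position = camera.update("left", 3.5) # question has lasted long enough
```

After the third call, `position` is "left". Had the student stopped after two seconds, the camera would have stayed on the front of the room, which matches the behavior McCallister describes for short questions.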
That's not to say the system is without problems. "Some of the classrooms don't have enough sound absorbency, and sounds can sometimes bounce around and confuse the camera," McCallister says, "but we're working on dampening those spots so it works better."
The simplicity enables Radford to create distance learning classes where and when they are needed. "Because of our multimedia classroom design, we can simply walk into a room with the Polycom VSX 8000, plug it in to our system, mount a single camera and a single microphone array and we're done," says McCallister. "This flexibility lets us quickly change the classrooms we use for distance ed if we need to, such as when an instructor has a special need for a larger space. It takes just a couple hours to set up."
Eventually, McCallister predicts, the ceiling array system will replace all the desktop mics. "We want to eliminate user interference and make this as simple as possible."