The New Internet Multimedia

Market Mud Wrestling in Technicolor, Coming Your Way Soon

by Francis Vale (1998)

After hundreds of hours of hard work and gallons of stomach churning bad coffee, you and your design team have just finished that new MTV-style multimedia opus. So you call up the Intranet webmaster, tell her all about the spiffy streaming video and CD quality audio stuff you've put in there to get the new product message out to all the corporate troops, and then, uh-oh, you are greeted with deafening silence at the other end. A short and frosty conversation later, you find out that all your hard work has just gone up in expensive smoke.

It seems all that cool audio/video stuff was too hot to handle, as your Intranet demigod gives you the lowdown on the Intranet Multimedia facts of life: IF all the corporate desktops are running at least 200 MHz Pentium MMX CPUs; IF the available bandwidth is there when the users log on to download that new multimedia message; IF there is minimal packet loss; IF remote access users are all running 56K modems (and preferably ISDN); IF, IF, IF... THEN maybe your far-flung viewers might get about 3 to 12 small frames of herky-jerky video per second, and just telephone quality sound, with about 5% of the audio suffering from snap, crackle, and pop -- and likely coming from out-of-sync lips. And that's IF everything is running super that day. And if it isn't, which is typical of most day-to-day corporate Intranet operations, then maybe it's time to buy a couple of zillion empty orange juice cans, and a few thousand miles of string. The voice quality may actually be better. And about all that video? Maybe you should send out Polaroid snapshots instead.

Ouch! What's a struggling corporate Cecil B. DeMille to do?

The first thing to do is understand your mission, and your audience. If you are trying to use your organization's Intranet as an on-the-cheap TV broadcast studio to reach all and sundry, forgettabout it. At least for the moment. Rather, if you have a company with offices all across the country or worldwide, with thousands of workers and/or customers who need to see and hear late breaking, corporate critical news, then why not rent out some movie theaters? Today, business TV, typically done via satellite, is a booming business, and renting out idle movie screens for temporary use as two-way video conferencing centers is one of its hottest segments. For example, United Artists Theater Circuit (UATC) has taken 35 out of their 2,338 screens scattered across the planet, and equipped them with digital satellite equipment from Digital Express. UATC also has several dozen mobile Digital Express units they can move around between their 386 movie house locations worldwide. UATC has given its theater-as-instant-videoconference center the moniker, "Satellite Theater Network," or STN.

Since the STN's inception in 1993, Microsoft has availed itself of the network at least five times. (Knowing Microsoft, it also probably insisted on owning the popcorn concession.) Cadillac has also used the STN to interactively train its mechanics on the inner workings of some of its cars, like its new Catera. Interactivity is achieved through STN's "Electronic Audience Survey System," a handheld keypad that allows individual audience members to respond to various questions posed during the presentation. And Autodesk smashed all STN attendance records when it simultaneously broadcast a new product rollout event to over 45,000 designers comfortably ensconced in 120 theaters. If you or your CEO ever harbored secret aspirations to be up on the big silver screen, here is your chance. Moreover, there is nothing to prevent you from taking that multimedia extravaganza of yours and feeding it off a (non-web host) machine, and downloading it to the globally dispersed audience. And if you are clever about it, you can use the Survey System's keypad to plug audience responses right into the corporate database. You can even flash web pages up there on that big movie screen.

So, OK, STN is cool for mass audience A/V gigs, but suppose you want to just "show face" in an Intranet videoconference with only a few people? There must be some reasonably viable compression schemes out there that will reduce the net bandwidth. Well, there are. (See article sidebar.) But software-only compression and decompression of video still doesn't come cycle-cheap, as it usually leaves room for doing little else on your PC. And videoconferencing over dial-up analog modems, even at 56K, is just too painful. And these video compression schemes don't support interactive multimedia applications. And you can't slip 2-D or 3-D graphics into that compressed video stream. And on and on. But wait! There is a glimmer of hope on the multimedia Intranet horizon, and it's called MPEG-4.

The original MPEG-1 ISO spec provides for a 352 x 240-pixel image with transmission rates up to 1.5 Mbits/sec, enabling image quality roughly equivalent to VHS standards. MPEG-1 is thus ideal for CD-ROM and video-CD storage. MPEG-2 increases image quality but requires higher bit rates. Images are 720 x 486 pixels and are transmitted at 3 to 10 Mbits/sec. Additionally, MPEG-2 provides for variable bit rates, which are required for digital videodisks. H.263, on the other hand, is targeted at transmission rates ranging from 28.8 Kbits/sec to 500 Kbits/sec. The lower limit was provided to accommodate remote videoconference users logging on via 28.8 Kbits/sec modems.
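To get a feel for how hard these codecs have to squeeze, here is a back-of-the-envelope sketch in Python. It takes MPEG-1's 352 x 240 frame and 1.5 Mbits/sec ceiling, and assumes 24 bits per pixel and 30 frames per second; those two figures are assumptions for illustration, not numbers from any spec quoted here.

```python
# Rough arithmetic: how much compression does MPEG-1 need?
# Assumptions (not from the article): 24 bits per pixel, 30 frames per second.

def raw_bit_rate(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate, in bits per second."""
    return width * height * bits_per_pixel * fps


def required_ratio(raw_bps, channel_bps):
    """Compression ratio needed to fit the raw video into the channel."""
    return raw_bps / channel_bps


raw = raw_bit_rate(352, 240, fps=30, bits_per_pixel=24)
ratio = required_ratio(raw, 1_500_000)
print(f"raw: {raw / 1e6:.1f} Mbit/s, ratio needed: {ratio:.1f}:1")
```

Under those assumptions the raw stream runs to roughly 61 Mbits/sec, so MPEG-1 has to deliver compression on the order of 40:1 just to fit its own ceiling.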

MPEG-4, also being developed by ISO, attempts to cross the bandwidth divide between prior MPEG data rate requirements and that of H.263 video conferencing. (H.263 was also made part of the MPEG-4 draft spec.) MPEG-4 video can be optimized for low (5 Kbps-64 Kbps), intermediate (64 Kbps - 384 Kbps), and high (384 Kbps - 4 Mbps) bit rates. Sometimes, although frowned upon by ISO, these three bandwidth levels are called "usage profiles," and are loosely termed, from lowest bit rate to highest, the Mobile profile, Internet profile, and Broadcast profile. The lowest bit rate is also meant for use over the Public Switched Telephone Network (PSTN). Each profile has its own application-specific set of components. For any particular profile, MPEG-4 will uniformly support its associated components across all types of devices. If this is actually accomplished, then users will reap the benefits of a true open systems standard. Finally, MPEG-4 will support both constant bit rate (CBR) and variable bit rate (VBR) data transmission. MPEG-4 supported platforms can therefore span the bandwidth gamut, including Intranet-attached PCs connected via ISDN or 28.8 analog modems, Digital TV and interactive TV cable set-top boxes, mobile devices, or even things connected via the PSTN.
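The three bit-rate bands are easy to picture as a lookup. The sketch below is only an illustration: the function name is hypothetical, and because the quoted ranges share their endpoints (64 Kbps and 384 Kbps), it assumes each boundary belongs to the higher band.

```python
def mpeg4_rate_class(kbps):
    """Map a link speed in Kbps to one of the three bands quoted above.

    Assumption: shared endpoints (64 and 384 Kbps) go to the higher band.
    Returns None for speeds outside the 5 Kbps - 4 Mbps range.
    """
    if 5 <= kbps < 64:
        return "low"           # loosely, the "Mobile" profile
    if 64 <= kbps < 384:
        return "intermediate"  # loosely, the "Internet" profile
    if 384 <= kbps <= 4000:
        return "high"          # loosely, the "Broadcast" profile
    return None


for name, speed in [("28.8 modem", 28.8), ("ISDN (2B)", 128), ("MPEG-1 rate", 1500)]:
    print(name, "->", mpeg4_rate_class(speed))
```

So a 28.8 modem user lands in the low band, a two-channel ISDN line in the intermediate band, and anything at MPEG-1 rates or above in the high band.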

MPEG-2 was an extension of MPEG-1. But MPEG-4 is a completely different beast from either of these two. MPEG-4 is going to be a comprehensive standard embracing digital video, audio, speech, and 2-D/3-D graphics. But most critically, where MPEG-1 & 2 only offered a passive data stream, MPEG-4 is intended to support fully interactive, totally integrated, multimedia sessions. ISO envisions MPEG-4 as supporting the following types of applications:

  • Internet/Intranet Multimedia
  • Interactive Video Games
  • Interpersonal Communications (Videoconferencing, Videophone, etc.)
  • Interactive Storage Media (optical disks, etc.)
  • Multimedia Mailing
  • Networked Database Services (via ATM, etc.)
  • Remote Emergency Systems
  • Remote Video Surveillance
  • Wireless Multimedia
  • Broadcasting Applications

In sum, MPEG-4 will do the whole multimedia/communications enchilada.

All the MPEG formats, and their sibling standard, JPEG, use the discrete cosine transform (DCT) as the basis for image data compression. Underlying the overall scheme is the fact that the human eye is much more sensitive to changes of brightness levels than to changes of color. So, MPEG renders an image in such a way that the brightness of every point in the image is accurately transmitted, but the color values are averaged out over a large block of pixels.
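The brightness-versus-color trick can be sketched in a few lines of Python. This is a simplified illustration, not any codec's actual routine: it assumes the picture is already split into one brightness plane and two color planes, and it assumes 2 x 2 averaging of the color planes. The function names are hypothetical.

```python
def average_blocks(plane, size=2):
    """Average a 2-D list of samples over non-overlapping size x size blocks."""
    rows, cols = len(plane), len(plane[0])
    out = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            block = [plane[i][j]
                     for i in range(r, min(r + size, rows))
                     for j in range(c, min(c + size, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def sample_count(*planes):
    """Total number of samples across all the given planes."""
    return sum(len(row) for plane in planes for row in plane)


# A 4 x 4 picture: brightness is kept as-is, the two color planes are averaged.
luma = [[16 * (r * 4 + c) for c in range(4)] for r in range(4)]
color_a = [[10, 20, 30, 40], [10, 20, 30, 40], [50, 60, 70, 80], [50, 60, 70, 80]]
color_b = [[128] * 4 for _ in range(4)]

small_a = average_blocks(color_a)
small_b = average_blocks(color_b)
print(sample_count(luma, color_a, color_b), "->", sample_count(luma, small_a, small_b))
```

With 2 x 2 averaging the sample count drops by half before the DCT has done any work at all, and every brightness sample is still there.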

The images and video are separated into blocks of 8 x 8 pixels (64 pixels each), and then those blocks are compressed. The DCT algorithm re-organizes the pixel information within each block into a more compact form by generating a series of mathematical coefficients that represent a combination of pixel values. It uses quantisation to scale and round coefficients in order to produce good approximations that can be sent in fewer bits. Finally, it uses two-dimensional and run-length encoding to reduce the lengthy strings of zeros produced by quantisation.
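The transform, quantise, and run-length steps can be sketched as follows. This is a minimal, unoptimized illustration and not the MPEG or JPEG reference algorithm. It assumes the 8 x 8-pixel block size those standards use, a single uniform quantisation step of 16 (real codecs use per-coefficient tables), a zigzag scan to line the coefficients up, and a simple (zero-run, value) pair format ending in an "EOB" marker. All function names are hypothetical.

```python
import math

N = 8  # block edge in pixels (assumed: the 8 x 8 size used by JPEG and MPEG)


def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block (direct form, no fast algorithm)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2 / N) * c(u) * c(v) * total
    return out


def quantise(coeffs, step=16):
    """Scale and round every coefficient (uniform step is an assumption)."""
    return [[round(value / step) for value in row] for row in coeffs]


# Zigzag order: walk the anti-diagonals, alternating direction, so that
# low-frequency coefficients come first and the zeros bunch up at the end.
ZIGZAG = sorted(((r, col) for r in range(N) for col in range(N)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_length(values):
    """Encode as (zeros_skipped, value) pairs; trailing zeros become 'EOB'."""
    pairs, run = [], 0
    for value in values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


def encode_block(block, step=16):
    """DCT, quantise, zigzag-scan, and run-length encode one block."""
    q = quantise(dct_2d(block), step)
    return run_length([q[r][col] for r, col in ZIGZAG])


flat = [[128] * N for _ in range(N)]  # a uniformly gray block
print(encode_block(flat))
```

A uniformly gray block is the best case: all 64 pixels collapse into a single nonzero coefficient followed by the end marker. Busier blocks leave more nonzero coefficients, which is why detailed or noisy video costs more bits.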

Via DCT, only the differences between consecutive video frames are transmitted, as opposed to sending all the data every time a new frame is sent. Thus, MPEG data is stored and transmitted as a stream of updated-as-needed images, rather than a succession of individual images that must be completely recreated "from scratch" each time they are displayed. Using these techniques, DCT can achieve video compression ratios of between 30:1 and 100:1. But in reality, the compression ratios usually top out at 60:1. If they get any greater, the ever-tinier blocks start to degrade the image quality.
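The "send only what changed" idea looks roughly like this in Python. It is a deliberately simplified per-pixel sketch: real MPEG encoders predict whole blocks from neighboring frames and then transform the leftover error, whereas this version just lists the pixels that differ. Frames are assumed to be flat lists of pixel values, and the function names are hypothetical.

```python
def frame_delta(previous, current):
    """List (index, new_value) for every pixel that changed between frames."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current))
            if old != new]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


frame_1 = [10, 10, 10, 10, 200, 200, 10, 10]
frame_2 = [10, 10, 10, 10, 10, 200, 200, 10]  # the bright patch moved right

delta = frame_delta(frame_1, frame_2)
print(delta)
assert apply_delta(frame_1, delta) == frame_2
```

Here only two of the eight pixels need to cross the wire. When nearly everything changes at once, as in a scene cut, the savings evaporate and a complete frame has to be sent.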

In MPEG-1 & 2, and H.263, the respective DCT algorithms process all of the various transmitted pixels as part of a single, non-interactive image. But the DCT algorithms used in MPEG-4 are capable of treating multiple aspects of an image as distinctly different components, or objects, that can be individually manipulated. MPEG-4 can therefore encode objects in the data stream separately, permitting individual component manipulation and interaction. "Touch" a car behind the image of a person, for example, and the car door may open, and the headlights turn on. Moreover, a separate audio compression scheme may kick in, and the horn might start honking. And when the person in the image starts talking to you about this nifty new vehicle and why you should buy it, his or her voice, as well as the sound of the honking horn, will be dealt with together in the same audio compression stream, unlike earlier MPEG versions.

All this looks and sounds great, and is obviously much needed, which is also why MPEG-4 is very ripe for some raucous industry politics. For example, QuickTime was chosen for inclusion in the standard. The QuickTime file format will be used to store digital video, audio, and other types of content displayed using MPEG-4. According to Apple, the creator of the QuickTime technology, MPEG-4 will also contain things like MIDI, animation and 3D worlds. QuickTime will be used to store all of these things. The fact that QuickTime is being used for these kinds of things today is one of the reasons it was such a compelling choice for the MPEG-4 efforts.

No doubt, this QuickTime victory will ruffle some feathers at Microsoft, as its adoption by MPEG was actively "supported" by several of Microsoft's most notable arch rivals -- Sun, Oracle, and Netscape (SGI and IBM also pushed for its adoption). But apart from smarting over the obvious NIH factor, if Microsoft wants to be MPEG-4 compliant, it must now incorporate the QuickTime format into its multimedia applications, like NetShow, NetMeeting, ActiveMovie, and Interactive Music. ActiveX Controls may also have to make some sort of accommodation with QuickTime. How likely and how soon Microsoft will deliver all these modifications after MPEG-4 debuts is anybody's guess.

Next, bear in mind that DCT processing is one of the things that DSPs (digital signal processors, AKA "media processors") do extremely well, and they typically do it much faster and cheaper than a regular computer CPU. Thus, you can expect all kinds of MPEG-4 processor turf fights to be breaking out on the desktop, with the WCW fight card listing Intel vs. DSP-maker Motorola, Philips (the maker of the DSP-like TriMedia chip), and SGS-Thomson (which is developing Chameleon, a 64-bit microprocessor, specifically to support MPEG-4 multimedia applications).

The Intel vs. DSP marketing mud match has been a long running event, and it's about to get even more Technicolor gory. Expect to see Intel positioning its upcoming Merced 64-bit chip as the MPEG-4 software-only decoder champion. Meanwhile, the rival processor corner will claim that no matter how good Merced is, it still can't beat a dedicated DSP-style chip for doing a DCT, not to mention processing digital audio/speech, performing digital-to-analog conversion, etc.

And where there is Intel in a fight for control of the multimedia desktop, there is also Microsoft. You may remember the bloody nose that Intel got when it pitted its software-driven "NSP" (Native Signal Processing) APIs against Microsoft. But Microsoft has also taken its lumps in this graphically bruising desktop battle; e.g., its ill-fated "Talisman" multimedia hardware effort. (Although Talisman lives on in market-confusing spirit, its various implementations are nowhere close to the original architectural vision of Microsoft.) Thus, the fight over what goes into the MPEG-4 spec will be opening up lots of old war wounds.

Actually, MPEG-4 has more potential for becoming a huge brouhaha than any of these prior business bouts. For whom do we also find in the MPEG-4 ring? None other than Sun Micro, which is butting heads once again with Microsoft. Sun is pushing hard to get Java incorporated into this new ISO standard. Given how important and pervasive MPEG-4 will likely be on the PC desktop, this newest confrontation between Mountain View and Redmond should come as no surprise.

How Sun managed to get Java into the big MPEG-4 picture was pretty slick. But to understand how it made its marketing jiu-jitsu move, you need to know something about the MPEG-4 coding scheme, which is built on an object-based model. As we saw, each MPEG-4 object component can be individually manipulated, made interactive, encoded, and transmitted as a separate stream. (MPEG-4 is unique in this overall object regard.) Obviously, there must be some sort of object model and system description language to do all this work. And not surprisingly, in MPEG-4-ese, this scheme is called the "System Description Language," or SDL. The C++ looking SDL does a variety of things, such as controlling dynamic linkages between the various MPEG-4 components, manipulating the various objects' interactions, exchanging data between the objects, and modifying the transmitted information on the fly.

One critical element in this overall scheme is allowing the MPEG-4 client-receiving end (the decoder) to negotiate with the upstream transmitting end (the encoder). The purpose of these negotiations is to let the encoder know what the decoder client's underlying hardware/software can accommodate in terms of bandwidth, video frame rate, audio, speech, and 2-D/3-D graphics capabilities. What good is getting a ton of clever MPEG-4 data that can't be processed on your PC? This so-called "Meta information" about the delivery mechanism and the client's capabilities is intended to allow seamless scalability of MPEG-4 content.
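In spirit, the negotiation boils down to the encoder trimming its offer to fit what the client reports. The Python sketch below is purely illustrative and does not reflect SDL or any actual MPEG-4 syntax; the `Capabilities` record, its fields, and the `negotiate` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability record; not part of any MPEG-4 draft."""
    bandwidth_kbps: float
    max_fps: int
    graphics_3d: bool


def negotiate(encoder, decoder):
    """Pick stream settings that neither end will choke on."""
    return Capabilities(
        bandwidth_kbps=min(encoder.bandwidth_kbps, decoder.bandwidth_kbps),
        max_fps=min(encoder.max_fps, decoder.max_fps),
        graphics_3d=encoder.graphics_3d and decoder.graphics_3d,
    )


head_end = Capabilities(bandwidth_kbps=4000, max_fps=30, graphics_3d=True)
modem_pc = Capabilities(bandwidth_kbps=28.8, max_fps=12, graphics_3d=False)

print(negotiate(head_end, modem_pc))
```

The same head end could then serve a modem-bound PC and a cable set-top box from one piece of content, scaling each stream to whatever the client says it can handle.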

Now enter Sun, which essentially said, look, why use SDL to negotiate this exchange between the MPEG-4 decoder client and the head-end encoder? What you need here is a platform independent, C++ like language that can perform these high level negotiations, because, as we all know, MPEG-4 will be found in lots of things, not just PCs. Well, what do you know. We've just described Java!

In Sun's MPEG-4 envisioned universe, when the MPEG-4 encoder process starts transmitting, it will put a Java applet in the data stream header, which will then set up shop in the target client device. Thus, Sun's hope is that Java will emerge as MPEG-4's Chief Meta Negotiator. But as Sun and Microsoft well know, the use of Java doesn't have to end at being just an MPEG-4 meta-arbiter. Sun's market-busting opening lies in the fact that MPEG-4 permits dynamic mixing of content already located on the client end, e.g., intermixing a CD-ROM's data with the incoming MPEG-4 stream. So, why not also use Java on that CD-ROM or DVD-ROM? And in the PC MPEG-4 multimedia player? As well as with everything else on the PC that might ever become entwined in the MPEG-4 data stream? If Sun's Java ever became the ISO standard to set up, control, and define all these high level MPEG-4 interactions on your PC or Windows CE device, then Microsoft would have a huge problem.

To make sure that its place at the MPEG-4 banquet table is assured, SunSoft has also added some new Java API libraries that facilitate the synchronization and multiplexing of decoded natural audio and video. To counter this particular Java API threat, Microsoft is pushing hard for its own spec. In addition, counterproposals are on the table from other vendors, some of which build on VRML, the Virtual Reality Modeling Language. VRML is viewed by ISO as a subset of MPEG-4, and MPEG-4 also extends VRML by adding support for real-time and remote audiovisual objects. But whoever's sync/mux spec wins the day, it will be responsible for doing a critical piece of the MPEG-4 application work.

Hoping to add irresistible momentum to its game of one-up specmanship, the Java sync/mux API has also just sprouted a new extension. It's called the Java Media Framework (JMF). This extension is a collection of classes for doing a wide variety of display, synchronization, and capture of time-based data within Java applications and applets. Sun, Silicon Graphics, and, rather interestingly, Intel jointly developed this Framework. The JMF Player, the first release in a series of three planned JMF APIs (Player, Capture, and Conference), handles all the chores required to receive and play media from any source, so long as the platform is 100% pure Java enabled.

Not surprisingly, all this intramural maneuvering by so many industry heavyweights with wildly different agendas has produced a slowdown in the rollout of the final MPEG-4 spec. So, ISO has done a Solomon-splitting of the standard. The result will be that MPEG-4 will be rolled out in two forwards/backwards compatible installments. Final standardization of the MPEG-4 standard had been expected in February 1999. But now, we can expect MPEG-4 Version 1 to come out in November 1998, with the final version (hopefully) arriving twelve months later. The main stumbling blocks centered on support for 3-D graphics and 3-D sound. MPEG-4 Version 2 will have provisions for 3-D audio, to add spatial localization cues to the 3-D images. In addition, Version 2 will provide for semi-transparency of images and also scalable transmission of objects with arbitrary shapes. Thus, Version 1 will only support objects that are either invisible or opaque, and they all have to be regularly shaped.

Also contributing to the timetable slip-up was the desire by some companies for support of interlaced images in MPEG-4 for commercial broadcast. This interlace issue is fraught with industry politics, as a huge war continues to be waged between Microsoft and the rest of the TV broadcast industry over the new Digital TV formats. For almost two years, Microsoft absolutely did not want to support interlaced images, wherein the video image is painted in two successive passes (fields). Microsoft only wanted progressive, paint-it-all-at-once, DTV images appearing on people's screens. This DTV food fight is still in topsy-turvy progress. For example, Intel has defected from the original Progressive-only troika (consisting of Microsoft, Compaq and the giant chipmaker), and acceded to support interlaced DTV.

But as a counterbalance to Intel's progressive defection, Microsoft recently gained the support of John Malone's TCI cable company, which agreed to transmit its DTV signals down the wire in HD-0 format, the low resolution, progressive-only picture. As part of that deal, Malone's TCI said it would be using MS CE in its new DTV set top boxes (which will also be running Sun Java). But then, incredibly, in April 1998, Microsoft totally reversed its position, and announced plans to support 1080I (the high definition interlaced DTV format). In exchange for Sony buying into MS CE, Microsoft has announced that its operating system will also support 1080I. Formerly, CE only supported progressive low definition DTV.

After all the bitter infighting, this decision by Microsoft to support 1080I is tantamount to Protestants suddenly being eligible to run for Pope. It will be interesting to see which end of the stick Sony ends up getting in this deal. Historically, doing business with Microsoft has been like bobbing for apples in a bucket full of piranhas, so you decide. And remember, despite its 1080I Sony support, Microsoft still clings tightly to the notion that interlaced is bad, progressive is good. Finally, Microsoft has also pledged support for 720P, the next step above standard definition progressive DTV. In short, Microsoft has essentially reversed its position on HD-0 being the one and only way to get started in DTV.

But no matter how you look at it, if anything were to raise Microsoft's highly visible ire, it would be Sun succeeding in its various MPEG-4 Java gambits, which now looks likely. According to ISO, "There is a group of people working on the specification and developing a (Java) implementation. This will be donated to ISO and anybody will be allowed to use it in products conforming to the standard." If the use of Java in MPEG-4 becomes widespread, then Sun and its 100% pure Java partners will have effectively wrested application control of the next generation multimedia A/V client away from Microsoft. That Sun and friends are on the verge of succeeding in their MPEG-4 gambit is a specter that must be keeping the midnight oil burning up in Redmond. We have already witnessed the recent attempt by Microsoft to wrest complete control over its Java destiny away from Sun, via its new release of Visual J++. Consequently, as ISO will be adopting 100% pure Java for MPEG-4 as an available option, expect Microsoft to lash back in full market fury. Microsoft will no doubt reposition and refashion its J++ in the market as the one and only way to handle MPEG-4 applications on the PC, or on Windows CE devices.

Already, we can see the portents of this impending Java MPEG-4 war. For example, Hewlett Packard has just announced "clean room" embedded Java that it developed to compete with Sun's own embedded system technology. Ominously, Hewlett Packard is licensing its rival Java system to Microsoft for use in Windows CE. (This HP license was also a godsend for Microsoft. Due to all of its legal imbroglios, Microsoft's license with Sun for Java has been suspended, cutting Redmond off from Java's Mountain View wellspring.) The implications of Hewlett Packard's business deal with Microsoft, especially in the context of MPEG-4, are obvious: Microsoft gains another strategic weapon to try to kill off Sun's Java. Redmond will most likely alter the HP system in such a way that Sun Java will have a tough time running. And a recent political decision has brought the prospects of a Java world war that much closer. The U.S. Dept. of Justice is opening another investigation into Microsoft's business practices; this one with respect to the company's blatant attempts to "embrace and extend" Sun's Java in order to destroy its market effectiveness. The DOJ, if it moves against Microsoft, will only produce another long, business-bloody fight over the fate of Java.

In the not too distant future, MPEG-4 providers, application developers, and unfortunately, users may well be forced to choose Java sides. If that happens, then any hope for a truly open MPEG-4 standard will have been smashed. MPEG-4 is therefore no small stakes poker game. Whoever wins this MPEG-4 pot will have won market dominance over the next generation of multimedia applications, and most especially, scored a huge hit in the next big evolution of Internet/Intranet multimedia.

So, can this new MPEG-4 movie possibly have a happy ending? Unfortunately, as of this writing, it appears that may only happen if you make your own via STN.


Francis Vale, Copyright 1998, All Rights reserved

21st, The VXM Network,