CDF: Mathematica’s PDF
Wolfram is pushing a new document format called Computable Document Format (CDF). It looks like PDF + embedded Java apps.
On the one hand, there’s the xkcd viewpoint. This is basically just another step in the evolution of Mathematica’s native file format. And right now, Mathematica 8 is the only way to author a CDF file. Alternatively, one can simply author a webpage with embedded Java apps, or maybe even HTML5. Then everyone can use it, anyone can modify it, and most platforms will play it. On the other hand, web pages don’t print out as nicely as PDFs (CDFs should print out as nicely), and it can be a bit messy to download and view a webpage with embedded apps offline.
So maybe there’s a future for CDF. Although the same basic results can be obtained using HTML5 or Java, Mathematica makes it very easy to create some types of interactive infographics. Arguably, it’s exactly what Elsevier’s Executable Paper Challenge is looking for. However, it’s a closed format. Although Wolfram says the specification is public, the restrictions are perhaps enough to prevent wider adoption. People who don’t own Mathematica get only a player (~500 MB download).
It’s hard to get people to install another player on their computer. Flash had success because of streaming video. Shockwave got people to install it for games. I think it will be a challenge to get readers of the NY Times to en masse download another player for their browser just so that they can see an infographic.
See that 23% bar for the RealOne player? And that’s for streaming media, a very broad market. What does Wolfram expect for such a narrow application? No matter how optimistic they are, why would they want to do this? They have to support this software for a bunch of different platforms, including mobile devices. This includes direct customer support and keeping up with changes to the platforms, security, etc. All for free. If Mathematica offered an “Export to HTML5” option, they’d sell more software. Because then authors would know that everyone can see what they produce, without having to download another player.
So perhaps CDF is a bit like Wolfram Alpha: utterly useless outside of a very narrow field of applications, in which it performs utterly beautifully.
But that’s where the problem lies. I downloaded the player and tried out several of the CDFs. They were ugly and inefficient.
Note the lack of antialiasing. The animation wasn’t very smooth either. I’ve seen much better results with Java and HTML5. I think Processing is probably a better path towards this sort of thing. And that is free and outputs Java.
You can hide source code inline, and that’s nice. But it’s not a particularly innovative feature. Basically there’s a place to click on the right margin that expands a block of source code.
To cap it all off, the typesetting doesn’t seem to be as rich as that of PDF. It really looks like a webpage. The equations look good, as we would expect, but they don’t always select and highlight in intuitive ways. So if you want to cut and paste, it can be frustrating.
Overall, it’s hard to get too excited about CDF. It’s a way to get people to see a Mathematica document when they don’t have the software. But it only works when the intended audience is more likely to download the player than the author is likely to write it up in more standard web technology.
A clarification: the FAQ states the CDF spec is public. On the other hand, the user agreement states that to monetize the content of a CDF, you must pay a license fee. So it’s not closed, as in undocumented, but it’s not open either.
While I like the idea of an executable research article, containing the source code for the presented analyses, as well as the raw data, the restrictions on distribution are completely stupid. Add to that the fact that the plugin is 500MB, and it’s pretty much DOA.
What allowed the Flash player to thrive and become almost universally installed is a combination of factors. The size of the download was kept under 1 MB for a very long time; even these days, despite the addition of H.264 video, it is still very small compared to every other plugin. Backwards compatibility was an important factor; the vast majority of the content developed for Flash 5 was playable in Flash 4. The plugin offered killer apps that rapidly increased adoption: interactive cartoons in the early 2000s with Flash 4, then embedded video with Flash 6, which gave users access to video sharing sites, esp. YouTube. Since the days of Flash 5, the scripting language for Flash has been based on the ECMAScript standard (aka JavaScript), which meant a large crowd of web developers could easily create apps in the SWF format after a minimal period of training. Finally, while the player was closed, the SWF format was open, and many tools developed around it: various toolkits for server-side generation of Flash movies (pie charts and such) before the advent of ActionScript 2, the MTASC ActionScript bytecode compiler, and several efforts at open versions of a SWF player.
On the other hand, Shockwave (offered by the same company, Macromedia, which was later swallowed by Adobe) never got widespread distribution. Because of Director’s legacy as an interactive CD-ROM creation tool, the Shockwave player was always much bulkier than Flash; in the days of the 56k modem, 20MB was huge! Plus you had to upgrade the plugin all the time to use the latest apps. The format never caught on with the open source crowd, which meant creators were locked in to the Director software, which used an esoteric scripting language (Lingo) that few knew. While it had functionality unavailable anywhere else (3d, advanced interactive games), the barrier-to-entry was high both for developers and users, so it never caught on outside of a very restricted game portal (shockwave.com).
The CDF player looks a lot more like Shockwave than Flash, IMHO.
I’ve edited the post to correct that piece of information. Thanks, Patrick.
That said, I still haven’t found the actual spec. I followed the links in the FAQ and found nothing but instructions on how to save CDF files from Mathematica. (“Save as…”)
I’m not alone, apparently.
http://hackerne.ws/item?id=2789218
I am a heavy Mathematica user, and I should be biased towards CDF, but I too have mixed feelings. The CDF “format” is essentially a castrated version of the .nb format. All markup inside is Mathematica language “boxes”, which are relatively well documented in publicly available sources. I wouldn’t want to write CDF by hand though, much like PDF. As for CDF lacking the typesetting of PDF, perhaps to a degree, but you can do a LOT more with it in this respect than most people will know; it’s just not trivial. The ONLY editing tool in existence wasn’t made for rich typesetting and thus is not well adapted for it. It can be programmed to look spectacular, though. Printing, on the other hand, is another story: VERY weak compared to page-centric PDF. CDF is screen/interactivity-centric, and it doesn’t play well with laying itself out on a static page. There’s little to no control over print-specific presentation, nothing like the CSS @media print rule for HTML. One cannot hide controls from a printout unless you first go and hide all of them on the screen. Splitting large content into multiple pages is awkward at best, if not utterly dysfunctional.
I chuckled when I read your comment on selecting in Mathematica editor. There are options in preferences which make selection work better, but it’s been my frustration with Mathematica since the day I started using it. I still mess up occasionally during selections after countless hours of using the tool.
The size of the player will, I think, be less and less of a problem (until we start discussing mobile platforms), but the fact that the format neither intrinsically prompts for a player nor packages one with it means that many people won’t know they’re not seeing the rich content – they just won’t see it and will wonder what’s so special about the page. Aside from that, closing imports and saves was perhaps smart for Wolfram to make some money, but dumb if they want the format to proliferate. The main point behind CDF is… MATH. Math in the real world relies on data. I cannot import data, so I’m limited to putting it into the CDF. This bloats the CDF in a huge way, as all the data is stored inside as PLAIN TEXT, and I have no options to obscure the data, so if I deliver interactive intelligence, I immediately expose the underlying data, which is highly undesirable. In a clinical environment it means that to show patient trends and sample group statistics, I have to include data on individual patients – that won’t fly very well with HIPAA. And it bloats the size in terrible ways. To show 100 explorable data points, I had to include 5000 rows of source data in text format with no compression or obfuscation, which made my CDF 1 MB! The .nb file that created it, which instead imported the data from a local 100 KB Excel file, was 30 KB. How does 30k + 100k = 1000k?
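For a rough feel of that size arithmetic, here is a minimal Python sketch. Everything in it is an assumption for illustration: the rows are synthetic random numbers (not the clinical data described above), the 20-column width is made up, the nested-brace text is only an approximation of how a list might be written out inside a notebook-style file, and zlib stands in for whatever compression a format could apply. It compares the plain-text size of 5000 rows with a compressed copy and with a 100-row summary.

```python
import random
import zlib


def make_rows(n_rows, n_cols, seed=0):
    """Synthetic stand-in data: n_rows x n_cols of made-up measurements."""
    rng = random.Random(seed)
    return [[round(rng.uniform(0, 200), 3) for _ in range(n_cols)]
            for _ in range(n_rows)]


def as_plain_text(rows):
    """Write rows as a nested brace list, e.g. {{1.5, 2.25}, {3.0, 4.125}}."""
    return "{" + ", ".join(
        "{" + ", ".join(repr(v) for v in row) + "}" for row in rows
    ) + "}"


def summarize(rows, n_bins):
    """Collapse rows into n_bins groups, keeping only per-column means."""
    size = len(rows) // n_bins
    summary = []
    for b in range(n_bins):
        chunk = rows[b * size:(b + 1) * size]
        summary.append([round(sum(col) / len(chunk), 3) for col in zip(*chunk)])
    return summary


rows = make_rows(5000, 20)                      # hypothetical 5000-row source
text = as_plain_text(rows).encode("utf-8")      # uncompressed plain text
packed = zlib.compress(text, 9)                 # same text, compressed
summary = summarize(rows, 100)                  # 100 explorable points
summary_text = as_plain_text(summary).encode("utf-8")

print("plain text bytes:  ", len(text))
print("compressed bytes:  ", len(packed))
print("summary-only bytes:", len(summary_text))
```

The exact numbers depend entirely on the made-up data, so none are quoted here. The point is only that text-encoded numbers take far more room than either a compressed copy or the 100 aggregated points, and that embedding aggregates alone would avoid shipping the per-row records, assuming the interactive view needs nothing more than the aggregates.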
I’m sure I’ll think of something else later.
A final note FOR the player itself. The earthquake presentation you referenced specifically reduces rendering quality so it runs well on all systems. It’s a public demo, after all. I can show you some gorgeous 3D anti-aliased stuff on my box, which will undoubtedly run well on your machine as well, but will slow a Pentium without a dedicated graphics card to a crawl. You made a good point that CDF is likely to find its best adoption in specific environments, and hardware is not always up to date in those environments. In fact, many such users have no dedicated machines at all, working on Citrix thin clients. I wonder how, or if, CDF would play over Citrix?