Posts tagged with dissemination

The subject of prolificacy came up in lab the other day. A study from the 80s (pdf) plotted the number of papers from a lab versus the number of people in the lab. This was repeated for several large research institutions. Across all of the data, the average was 1 paper/person/year.

A graph from that paper is shown above. They included brief reports and unrefereed contributions to books, but did not include abstracts. Note that the spread is quite large. Among the labs with 20-30 people, output ranges from 10 to 60+ papers/year. Similarly, for labs with 10 or fewer people, output ranges from 0 to 28 papers/year. Perhaps part of the variability can be accounted for by variations from one discipline to another. Laboratories in the National Cancer Institute can include biochemistry, physiology, and cell biology.

How about # of publications vs. lab funding?
According to analysis by Jeremy Berg of NIGMS, it basically plateaus, or there is no relationship, depending on how you measure. (link, follow up)

Or death rate versus NIH funding?
Given a 10-year lag, actually pretty correlated. (source)

(hat tip to AM)

This is a long post. If you’re in a rush, then just read these first two paragraphs.

One of the early posts on this blog was about structured illumination. Specifically, I spoke about Mats Gustafsson’s version, which yields superresolution imaging, in the wide-field mode. Just recently, JM commented on that post and asked if there was any kind of guide on how to get this set up and running. Besides the usual sources (methods sections, co-authors, etc.), I’m not aware of any such guide. However, I have corresponded a few times with Mats over the years and he was always overwhelmingly helpful.

He passed away earlier this year and there have been a few articles written about his landmark work, his thoroughness, and his kindness (Nature Methods, HHMI). In this post, I want to share some excerpts from his emails to me. They’re not personal (we were just acquaintances); they’re technical. Besides being useful to people who may be putting together their own patterned illumination rig, I think they also give a small insight into how kind a person Mats was. He took the time to write these detailed responses to just some postdoc that he met at a small conference.

A paper on data organization just came out in Nature Methods (Millard et al. 2011, commentary by Swedlow et al.). They believe, as do I, that using XML schema to organize data is a good way to simplify automated analysis. For the primary container, they use HDF5. Fun fact for all the MATLAB users in the audience (from Tim at Imperial College):

The .mat file format is simply an HDF5 file with a pointless header prepended.

They call the XML metadata+HDF5 data combos “SDCubes”, for Semantically typed Data hyperCubes. Why cubes? That suggests that they are the same size along all axes, which they probably aren’t. If you don’t like hyperrectangle, you can use orthotope. One point that is lost in the figure I put above is the idea that the axes do not have to be continuous. There can be gaps and jumps. There can also be piles of data that all share one point on an axis, if that suits the data.
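
To make the point about non-continuous axes concrete, here is a minimal MATLAB sketch of the HDF5 half of the idea. It is not the SDCube format from the paper, and the XML metadata side is left out entirely. The file name, dataset paths, and attribute name are all hypothetical, chosen only for illustration. The sketch stores an image stack next to an explicit time axis with uneven spacing, then reads back a single frame.

```matlab
% Hypothetical file name in the temp folder; remove any stale copy first.
fname = fullfile(tempdir, 'sdcube_sketch.h5');
if exist(fname, 'file'), delete(fname); end

% A 3-D block of data: 64 x 64 pixels at 5 time points.
stack = rand(64, 64, 5);
h5create(fname, '/images/stack', size(stack));
h5write(fname, '/images/stack', stack);

% Store the time axis explicitly (seconds), so gaps and jumps are allowed.
t = [0 1 2 10 60];
h5create(fname, '/images/time_s', size(t));
h5write(fname, '/images/time_s', t);

% Attach a small piece of descriptive metadata to the dataset.
h5writeatt(fname, '/images/stack', 'axis_order', 'y,x,time');

% Read back the time axis and only the 4th frame (start index, count).
t_back = h5read(fname, '/images/time_s');
frame  = h5read(fname, '/images/stack', [1 1 4], [64 64 1]);
order  = h5readatt(fname, '/images/stack', 'axis_order');
```

Because the axis values live in their own dataset, nothing forces them to be evenly spaced, and an analysis script can pull out one frame without loading the whole file.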

I like this approach because it is very general and simple. It consists of two file formats that are already being used by many researchers. In a way, the authors didn’t “create” anything. Hopefully this paper will give the strategy some added credibility and help to standardize it. Then people can concentrate on developing tools for working with data in this system, rather than developing new formats all the time.

OpenOptogenetics

When Karl Deisseroth started publishing his work on Channelrhodopsin-2, he set up a website to share the resources, including plasmid information, protocols for expression systems, and hardware details. His site, optogenetics.org, is an excellent source. However, it is focused on Deisseroth lab information.

For a more broadly focused resource, Josh Siegle (Matt Wilson lab, MIT) and others have consolidated a great deal of information in wiki format at OpenOptogenetics.org. The wiki format is ideal for this sort of information since it is changing all the time, and the relevant personnel changes over time as well.

There’s already a good amount of information on the site, but there are several opportunities to contribute and fill in the gaps as well. I encourage you to pitch in.

This is just a quick post to point people over to Matt Might’s excellent post full of tips on how to give a good scientific talk. I second his endorsements of Keynote and the book “Even a Geek Can Speak”.
(link)

Pubmed limbo

In my experience, PubMed works beautifully the vast majority of the time. It does an excellent job parsing search terms and they’re always adding new features (e.g., you can search using full names now, not just first initials). PubMed works so well that I’m actually surprised when it fails to find the paper I’m looking for. But it does happen.

LSTOTT has a great post about articles lost in PubMed limbo. It’s a real phenomenon. They also identify an article that is in the database but is not returned by standard searches that should match it, which happens more often, in my opinion.

Database jocks call this latter problem an indication that recall < 1, where recall is the proportion of relevant documents that are actually returned. (The former problem is just a mistake in QA, that is, someone forgot to include the article.) LSTOTT thinks PubMed’s recall may be declining (they call it “leaky”). What do you think?
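
As a small worked example of that definition, the MATLAB lines below compute recall for a single made-up query. The article IDs are hypothetical and only there to show the arithmetic.

```matlab
% Hypothetical article IDs, made up for illustration.
relevant = [101 102 103 104 105];   % articles that should match the query
returned = [102 104 105 200 201];   % articles the search actually returned

% Relevant articles that actually came back from the search.
found = intersect(relevant, returned);

% Recall: fraction of the relevant articles that were returned (3 of 5 here).
recall = numel(found) / numel(relevant);
fprintf('recall = %.2f\n', recall);   % prints "recall = 0.60"
```

A recall of 1 would mean every relevant article was returned. Articles 101 and 103 are the "leaks" in this toy case.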

Google Scholar does an excellent job of finding well-cited articles, including the ones PubMed misses. This is because there is no one point of failure that can prevent indexing: if the article is cited, then Google will index it. But this strength is also its weakness: relevancy is often sacrificed in favor of citation counts. Furthermore, the output is ordered by citation counts, which is not typically a useful parameter when I’m searching for a paper. PubMed’s reverse chronological ordering is better. But really, both systems should make it easier to re-sort the results.

Future Publishing

We used to just tell stories. Then we had monks copy manuscripts by hand. Later, movable type made printing easy. That brings us up to about 15 years ago. Now, online dissemination is changing the way academic publishing works. Scientists, librarians, and publishers are all trying to figure out how to adapt.

Much of the backroom chatter of librarians and other interested parties is actually online and makes for very interesting reading (academic samizdat, RIAA-style takedown notices, and protests). Here are a few links to get you started:

Book of Trogool – From a librarian’s perspective.
Open and Shut? – From freelance journalist Richard Poynder.
In The Dark – From an academic in Cardiff.
Embargo Watch (intro) – Do we need press embargoes? Are they relevant in the age of blogging? If we want to keep them, how should they be designed?
Springer’s 2011 Price List – To get an idea of the prices.

BTW, as discussed previously, you can typically release your own paper to an open access resource, even if you published it in a major, closed journal.

DIY open access

Even if you publish in a “closed” journal, you are typically allowed to post the PDF of your article on your personal website.

For example, this is from Nature’s description of their policy (src):

In 2002, NPG was one of the first publishers to allow authors to post their contributions on their personal websites, by requesting an exclusive licence to publish, rather than requiring authors to transfer copyright.

This is a large loophole that everyone should take advantage of.

    1. It increases your readership: people who are trying to read it off campus without a proxy, people at poorer institutions, and lay readers.

    2. You get fewer reprint requests, and you can reply to the few that slip through with a simple link, rather than emailing the PDF.

    3. If you have access to the logs on your website, you can get some idea of how many people are downloading your article and where they are.
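
For the third point, here is a rough MATLAB sketch of counting downloads from a web server log. It assumes the log is in the common Apache-style access log layout. The sample lines, IP addresses, and PDF path are invented for illustration, and a real log would be read from a file rather than typed in. Counting only status 200 is a simplification, since partial or cached responses use other codes.

```matlab
% Invented sample lines in an Apache-style access log layout.
logLines = {
    '192.0.2.10 - - [12/Mar/2011:10:01:22 -0500] "GET /papers/smith2010.pdf HTTP/1.1" 200 48213'
    '198.51.100.7 - - [12/Mar/2011:11:15:02 -0500] "GET /papers/smith2010.pdf HTTP/1.1" 200 48213'
    '192.0.2.10 - - [13/Mar/2011:09:30:45 -0500] "GET /papers/smith2010.pdf HTTP/1.1" 200 48213'
    '203.0.113.5 - - [13/Mar/2011:09:31:10 -0500] "GET /index.html HTTP/1.1" 200 1024'
    '203.0.113.5 - - [13/Mar/2011:09:32:00 -0500] "GET /papers/smith2010.pdf HTTP/1.1" 404 512'
    };
target = '/papers/smith2010.pdf';   % hypothetical path of the posted PDF

% Capture the client address, the requested path, and the status code.
pat = '^(\S+) .*"GET (\S+) [^"]*" (\d{3})';
tok = regexp(logLines, pat, 'tokens', 'once');

hits = 0;      % successful downloads of the target file
ips  = {};     % client address for each of those downloads
for k = 1:numel(tok)
    t = tok{k};
    if ~isempty(t) && strcmp(t{2}, target) && strcmp(t{3}, '200')
        hits = hits + 1;
        ips{end+1} = t{1}; %#ok<SAGROW>
    end
end
uniqueVisitors = numel(unique(ips));
fprintf('%d downloads from %d distinct addresses\n', hits, uniqueVisitors);
```

The client addresses give only a coarse idea of where readers are. Turning them into locations would need a separate lookup that is not shown here.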

Google Scholar is pretty good at finding posted PDFs online and indexing them. Now, if we could just get Google Scholar to adopt the well-tuned algorithms of PubMed’s search parsing.

Update

In the comments, Alex gave a link to SHERPA RoMEO, which keeps track of the license permissions for over 700 publishers. Thanks, Alex!

The usual color scheme for showing co-localization is to overlay a red image and a green image and have the yellow portions show the sites of co-localization. This is problematic since red-green colorblind people cannot tell what is going on. Following up on the recent post on Daltonization, here’s a colorblind-proof color scheme for showing co-localization. It uses a standard 3-plane RGB scheme. One image only has information in the green channel, the other image has identical information in the red and blue channels. Overlapping portions are white. By using this color scheme, you can ensure that the figure’s information is intact for all three major colorblindness types, as illustrated above.

Here’s the MATLAB code for producing the image:


clear all

% initialize variables
a=zeros(128);               % monochrome image
b=a;                        % monochrome image
a_img=zeros([128 128 3]);   % RGB color planes
b_img=a_img;                % RGB color planes

% draw square gradients, monochrome from 0-1
a(16:48,48:80)=meshgrid(0.5:0.5/32:1);     % gradient
a(90:102,32:96)=1;
a(16:48,85:90)=1;
b(80:112,48:80)=meshgrid(0.5:0.5/32:1);    % gradient
b(26:38,32:96)=1;
b(80:112,85:90)=1;

% assign RGB color planes
a_img(:,:,2)=a;             % just the green color plane is assigned

b_img(:,:,1)=b;             % both the red and
b_img(:,:,3)=b;             % blue channels are assigned the same data

% display the data
imagesc(a_img+b_img)
caxis([0 1])
axis image off

In academia, we have some nice benefits when it comes to intellectual property. Just about everything we do qualifies as scholarship, and so we take ample advantage of that portion of fair use when dealing with copyrighted works. And we can pretty much ignore patents when building rigs for our experiments, since we aren’t creating commercial products. Of course, we aren’t completely immune. Strictly protected intellectual property results in expensive, closed systems (spectrophotometers, PCR machines, scopes, etc.) that end up limiting what we can do with our experiments.

There’s already a lot of homebrew, custom hardware in science. But there are also a lot of duplicated efforts. The hobbyist community is already putting together ad hoc mechanisms to support and guide collaborative engineering efforts (e.g., Arduino and Adafruit). Labrigger wants to foster the same ideas among the scientific community (e.g., OpenEEG and OpenPCR).

The Open Hardware Summit is opening a dialog on this issue, and seeks to nail down some of the ideas floating around. Their immediate goal is to create a GPL-like license for hardware. They want to encourage derivative works, while still offering some sort of optional protection.

Here’s the current working definition: “Open Source Hardware (OSHW) is a term for tangible artifacts — machines, devices, or other physical things — whose design has been released to the public in such a way that anyone can make, modify, distribute, and use those things.”