To see faces where there are none

This week in “neither university press offices nor prestigious journals know what they’re doing”: a professor emeritus at Ohio University who claimed he had evidence of life on Mars, and whose institution’s media office crafted a press release without thinking twice to publicise his ‘findings’, and the paper that Nature Medicine published in 2002, cited 900+ times since, that has been found to contain multiple instances of image manipulation.

I’d thought the professor’s case would remain obscure because it’s evidently crackpot but this morning, articles, including one from Universe Today, showed up on my Twitter setting the record straight: that the insects the OU entomologist had found in pictures of Mars taken by the Curiosity rover were just artefacts of his (insectile) pareidolia. Some people have called this science journalism in action but I’d say it’s somewhat offensive to check if science journalism still works by gauging its ability, and initiative, to counter conspiracy theories, the lowest of low-hanging fruit.

The press release, which has since been taken down. Credit: EurekAlert and Wayback Machine

The juicier item on our plate is the Nature Medicine paper, the problems in which research integrity super-sleuth Elisabeth Bik publicised on November 21, and which has a science journalism connection as well.

Remember the anti-preprints article Nature News published in July 2018? Its author, Tom Sheldon, a senior press manager at the Science Media Centre, London, argued that preprints “promoted confusion” and that journalists who couldn’t bank on peer-reviewed work ended up “misleading millions”. In other words, it would be better if we got rid of preprints and journalists deferred only to the authority of peer-reviewed papers curated and published by journals, like Nature. Yet here we are today, with a peer-reviewed manuscript published in Nature Medicine whose checking process couldn’t pick up on repetitive imagery. Is this just another form of pareidolia, to see a sensational result – knowing prestigious journals’ fondness for such results – where there was actually none?

(And before you say this is just one paper, read this analysis: “… data from several lines of evidence suggest that the methodological quality of scientific experiments does not increase with increasing rank of the journal. On the contrary, an accumulating body of evidence suggests the inverse: methodological quality and, consequently, reliability of published research works in several fields may be decreasing with increasing journal rank.” Or this extended critique of peer-review on Vox.)

This isn’t an argument against peer-review, which remains both useful and necessary. It’s an argument against ludicrous claims that peer-review is infallible, advanced in support of the even more ludicrous argument that preprints should be eliminated to enable good journalism.

Preference for OA research by income group

Two researchers from Rwanda performed a “systematic computational analysis of the biomedical literature” and concluded in their paper that:

… papers with authors based in sub-Saharan Africa, papers with authors based in low income countries, and papers resulting from international collaboration are all much more likely to be made openly accessible than papers that don’t have these properties.

They analysed 547,404 papers indexed in PubMed, which is:

… a free resource developed and maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM). PubMed provides free access to MEDLINE, NLM’s database of citations and abstracts in the fields of medicine, nursing, dentistry, veterinary medicine, health care systems, and preclinical sciences.


The researchers also found that after scientists from low-income countries, those in high-income countries exhibited the next highest preference for publishing in open-access (OA) journals and that scientists from lower and upper middle-income countries – such as India – came last. It is important to acknowledge here that while there exists a marked (inverse) correlation between GDP per capita and number of publications in OA journals, causation might be harder to pin down because GDP figures are influenced by a large array of factors.

At the same time, given the strength of the correlation, their conclusion – about scientists from middle-income countries being associated with the fewest OA papers in their sample – seems curious. The article processing charge (APC) levied by some journals to make a paper openly accessible immediately after publishing is only marginally more affordable in middle-income countries than it is in low-income countries. However, the effects of technology and initiative seem to allay some of this confusion.

There are two popular ways, or routes, to publish OA papers. In the ‘gold’ route, the authors of a paper pay the APC to the journal, which in turn makes the paper openly accessible once it is published. A common example is PLOS One, whose APC is at the lower end, $1,595 (Rs 1.13 lakh). On the other hand, Nature Communications charges a stunning EUR 4,290 (Rs 3.4 lakh) per paper for submissions from India. In the ‘green’ route, the authors or publishers upload the paper to a publicly accessible repository apart from formally publishing it; a common example is the arXiv preprints server, which is moderated by volunteers.

There is also ‘hybrid’ OA, whereby a part of the journal’s contents are openly available and the rest is behind a paywall. In one review published in February 2018, researchers also pointed out a ‘bronze’ route: “articles made free-to-read on the publisher website” but “without an explicit [OA] license”.

The authors of the current paper reason that researchers from high-income countries might be ranking higher in their preference for OA papers because the “‘green’ route of OA has been encouraged by an enormous growth in the number of OA repositories, particularly in Europe and North America”; they also note that Africa was home to only 4% of such repositories in 2018. In the same vein, they continue, “the vast majority of funding organizations with OA policies as of 2018 were based in Europe and North America, with less than 3% of total OA policies originating from organizations based in Africa”.

Additionally, many journals frequently waive APCs for submissions from authors in low-income countries, whereas those from lower- and upper-middle income countries – again, including India – do not qualify as frequently to have their papers published without a fee. A very conservative, back-of-the-envelope estimate suggests India spends at least Rs 600 crore every year as APCs.

It was to reduce this burden that K. VijayRaghavan, the principal scientific adviser to the Government of India, announced earlier this year that India was joining the Plan S coalition of research-funders, which aims to have all research funded by them openly accessible to the public by 2021. As a result, researchers funded by Plan S members will have to submit to journals that offer gold/green routes and/or journals will have to make exceptions for publishing research funded by Plan S members.

This is going to take a bit of hammering out because the Plan S concept has many problems. Perhaps the most frustrating among them is its Eurocentric set of priorities. Other commentators have acknowledged that this limits Plan S’s ability to serve meaningfully the interests of researchers from South/Southeast Asia, Africa and Latin America. In July, two Argentinian researchers lambasted just this aspect and accused Plan S of ignoring “the reality of Latin America”. They wrote that Plan S views “scientific publishing and scholarly publications … as a commodity prone to commercialization” whereas in Latin America, they “are conceived as the community sharing of public goods”.

The latter is more in line with the interests of the developing world as well as with the spirit of knowledge-sharing more generally. At present, a little over 50% of research articles are not openly accessible, although this is changing thanks to the increasing recognition of OA’s merits, including the debatable citation advantage. Research-funders devised Plan S to “accelerate this transition”, as Jon Tennant wrote, but its implementation guidelines need tweaking.

Another problem with Plan S is that it keeps the focus on the ‘gold’ OA route and does little to address many researchers’ bias against less prestigious, but no less credible, journals. For example, while Plan S specifies that it will have gold-OA journals cap their APCs, scientists have said that this would be unenforceable. So, as I wrote in February:

… if Plan S has to work, researcher-funders also have to help reform scientists’ and administrators’ attitude towards notions like prestige. A top-down mandate to publish only in certain journals won’t work if the institutions aren’t equipped, for example, to evaluate research based on factors other than ‘prestige’.

To this end, the study by the researchers in Rwanda offers a useful suggestion: that the presence or absence of policies might not be the real problem.

There was no clear relationship between the number of open access policies in a region and the percentage of open access publications in that region. … The finding that open access publication rates are highest in sub-Saharan Africa and low income countries suggests that factors other than open access policy strongly influence authors’ decisions to make their work openly accessible.

The DNA-based computer that can calculate π

I’m not fond of biology. Of late, however, it’s been harder to avoid encountering it because the frontiers of many fields of research are becoming increasingly multidisciplinary. Biological processes are meshing with physics and statistics, and undergoing the kind of epistemic reimagination that geometry experienced in the 19th and 20th centuries. Now, scientists are able to manipulate biology to do wondrous things.

Consider the work of a team from the Dhirubhai Ambani Institute of Information and Communication Technology, Gujarat, India, which has figured out a way to compute the value of π using self-assembling strands of DNA. Their work derives from previous successful attempts to perform simple mathematical calculations by nudging these molecules to bind to each other in specific ways, a technique called tile assembly.

The idea traces back to a tiling problem posed in 1961 by the Chinese-American philosopher and logician Hao Wang. Wang wanted to know if a set of square tiles could cover a plane in a periodic pattern if each tile had four colored edges and only edges of the same color could abut each other. The answer turned out to be that some sets of tiles can cover a plane, but only in an aperiodic pattern.

In a DNA tile assembly model (TAM), each tile represents a section of the DNA molecule, called a monomer. When adjacent tiles’ abutting sides line up with the same color, then the two monomers attach themselves across the abutting sides according to a strength corresponding to that color. This way, given a tile to start with – called the seed tile – and a sequence of tiles coming up next, the DNA monomers can link up to form diverse patterns.
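
The attachment rule described above can be sketched in a few lines of Python. This is only an illustration, not the model used in the paper: the `Tile` class, the `GLUE_STRENGTH` table and the `temperature` threshold are hypothetical names and values, and the threshold rule (a tile sticks only if the combined strength of its matching sides reaches the threshold) is borrowed from the abstract tile assembly model as an assumption.

```python
from dataclasses import dataclass

# Hypothetical glue colors and binding strengths (assumed values).
GLUE_STRENGTH = {"red": 2, "blue": 1, "green": 1}


@dataclass(frozen=True)
class Tile:
    """A square tile with one glue color per side."""
    north: str
    east: str
    south: str
    west: str


# (dx, dy, side of the new tile, facing side of the neighbor)
NEIGHBORS = [
    (0, 1, "north", "south"),
    (1, 0, "east", "west"),
    (0, -1, "south", "north"),
    (-1, 0, "west", "east"),
]


def binding_strength(grid, pos, tile):
    """Sum the strengths of all sides whose color matches the neighbor's."""
    x, y = pos
    total = 0
    for dx, dy, mine, theirs in NEIGHBORS:
        neighbor = grid.get((x + dx, y + dy))
        if neighbor is None:
            continue
        color = getattr(tile, mine)
        if color == getattr(neighbor, theirs):
            total += GLUE_STRENGTH.get(color, 0)
    return total


def can_attach(grid, pos, tile, temperature=2):
    """Assumed rule: attach only to an empty cell, and only if bound strongly enough."""
    return pos not in grid and binding_strength(grid, pos, tile) >= temperature


# Start from a single seed tile at the origin.
seed = Tile(north="blue", east="red", south="green", west="green")
grid = {(0, 0): seed}
candidate = Tile(north="blue", east="green", south="green", west="red")

print(can_attach(grid, (1, 0), candidate))  # west side meets the seed's east side
print(can_attach(grid, (0, 1), candidate))  # south side meets the seed's north side
```

To the right of the seed, the candidate's red west side meets the seed's red east side, so it binds with strength 2. Above the seed, its green south side faces a blue north side and nothing holds it in place.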

By controlling the sequence of colors and their strengths, scientists can thus use TAM to control the values of variables moving through the resultant grid. Connections between the monomers of adjacent tiles can be made stronger or weaker, and to different extents, in ways that mimic how the voltages between different electronic components in a computer’s circuit allow it to perform mathematical calculations.

So, Shalin Shah, Parth Dave and Manish Gupta from the Institute used four new variations of TAM that they’d developed to calculate the value of π. Each of these variations performs a specific function, much like the logic gates inside an information processor.

  1. The compare tile system decides which number is greater between two numbers, or if they’re equal
  2. The shift tile system shifts the bits of a number by one bit to the right, and adds a 0 to the leftmost bit. For example, 11001 becomes 01100.
  3. The subtract and shift tile system subtracts one binary number from the other, then shifts the bits of the result by one bit to the right, and finally adds a padding 0 to the leftmost bit
  4. The insert bit tile system inserts a bit in a number
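
In ordinary software, the four operations above might look like the sketch below. These are plain Python analogues working on binary strings, not tile systems, and the function names, the fixed-width convention and the `index` argument of `insert_bit` are assumptions made for illustration.

```python
def compare(a: str, b: str) -> int:
    """Return 1 if a > b, -1 if a < b and 0 if they are equal."""
    x, y = int(a, 2), int(b, 2)
    return (x > y) - (x < y)


def shift_right(bits: str) -> str:
    """Drop the rightmost bit and pad a 0 on the left, keeping the width."""
    return "0" + bits[:-1]


def subtract_and_shift(a: str, b: str) -> str:
    """Compute a - b (assuming a >= b), then right-shift at a's width."""
    diff = int(a, 2) - int(b, 2)
    if diff < 0:
        raise ValueError("this sketch assumes a >= b")
    return shift_right(format(diff, f"0{len(a)}b"))


def insert_bit(bits: str, index: int, bit: str) -> str:
    """Insert a single bit before position `index` (counted from the left)."""
    return bits[:index] + bit + bits[index:]


print(compare("101", "011"))                 # 5 versus 3
print(shift_right("11001"))                  # the example from the list above
print(subtract_and_shift("11001", "00101"))  # (25 - 5) shifted right
print(insert_bit("1001", 2, "1"))
```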

Using a combination of these systems – all with the TAM at their hearts – the trio has been able to compute the value of π like below:

The gray tiles are input tiles, green are addition/subtraction tiles, yellow are copy/duplicate tiles, orange tiles are shift tiles, and blue tiles indicate the remainders of the corresponding division process. The calculation is growing upward and toward the right. Image: Computing Real Numbers using DNA Self-Assembly, Shah et al, Laboratory of Natural Information Processing, DAIICT.

You can see that the calculation is an ongoing infinite series – specifically, the Leibniz series, which estimates π from an infinitely alternating sequence of additions and subtractions of smaller and smaller fractions. Two things follow. First, because the series is infinite, how precisely the trio’s calculator can pin down the value of π depends only on how many tiles are available. Second, because the calculator can compute infinite series, any number or problem that can be reduced to the solution of an infinite series is, in principle, solvable using this calculator.
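
For reference, the Leibniz series is π/4 = 1 − 1/3 + 1/5 − 1/7 + …, and the way its precision depends on the number of terms can be seen in a short Python sketch. This uses ordinary floating-point arithmetic rather than tiles, and the function name `leibniz_pi` is only illustrative.

```python
import math


def leibniz_pi(terms: int) -> float:
    """Approximate pi with the first `terms` terms of the Leibniz series."""
    total = 0.0
    for k in range(terms):
        sign = 1 if k % 2 == 0 else -1
        total += sign / (2 * k + 1)
    return 4 * total


for n in (1, 10, 100, 1000):
    estimate = leibniz_pi(n)
    print(n, estimate, abs(estimate - math.pi))
```

The error shrinks roughly in proportion to 1/n, so each extra correct digit needs about ten times as many terms. That slow convergence is why the number of available tiles is the limiting resource.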

This would merely be a curious yet tedious way to calculate π if not for its potential to exploit the biological properties of DNA to enhance the calculator’s abilities. Although this hasn’t been elaborately outlined in the trio’s pre-print paper on arXiv, it is plausible that such calculators could be used to guide the development of complex and ever more intricate DNA structures with minimal human intervention, or to fashion molecular logic circuits driving microscopic robots that deliver drugs within our bloodstreams. Studies in the past have already shown that DNA self-assembly is Turing-universal, which means it can perform any calculation that is known to be calculable.

The DNA molecule is itself a wondrous device, existing in nature to store genetic data over tens of thousands of years only for a future inheritor to slowly retrieve information essential for its survival. Scientists have found the molecule can hold 5.5 petabits of data per cubic millimeter, without letting any of it become corrupted for 1 million years if stored at -18 degrees Celsius.