
Hypothify: first outline

I’ve been pondering a hypothesis-based academic social media site for a few weeks now, and have talked with a couple of people about it. Ideas are only just beginning to coalesce, but now seems the right time to try to outline what I see Hypothify doing, and how it might work. I’m conscious that it needs to start very simple, and remain as simple as possible. It’s easy to come up with a massive feature list, but identifying the most important stuff and planning how it might work is key. [edit after writing – there’s still A WAY to go on this!!]

What?

A place to propose, discuss and aggregate evidence for/against hypotheses. Consensus building through democratic voting up/down of evidence and discussion items. What I thought Quora was going to be when I first heard about it, but less question/answer and more evidence-based.

In traditional(ish) publishing terms it would represent a platform for ‘living’ review articles on each hypothesis. However, it would integrate social media aspects (voting and sharing) and wider evidence than just academic papers.

Not peer-reviewed, or peer evaluated, but peer assembled.

Why?

Hypotheses are fundamental to academic work. They represent the ideas and concepts which propagate through the academic sphere and out into the wider world as our understanding of the natural world, the universe, the human condition, etc. They are often dissociated from the piece-by-piece evidence in the traditional academic record. Currently academics are supposed to read everything and make up their own minds on a particular matter. For each individual this is only possible for a limited number of concepts/hypotheses because of the massive time cost of i) finding all the literature, ii) reading it all and iii) keeping up to date. In reality we all take ‘received wisdom’ on many matters on trust from other academics, or tend to disbelieve everything we’re told and argue it out ourselves from first principles! Hypothify would solve the pain of i) and negate ii) and iii) by providing community-maintained consensus instead of ‘received wisdom’ on each given hypothesis.

How?

The platform would allow the proposal of hypotheses by any user. Evidence items (papers [via the Mendeley API if possible], unpublished data [figshare, slideshare, blogs or notebook entries], snippets of discussion or reasoned argument [from discussion of this hypothesis or elsewhere via e.g. Disqus]) can be presented by any member of the community as being ‘for’ or ‘against’ the hypothesis. Key to the usefulness of evidence items will be a tweet-length summary of what each item contributes to the assessment of the hypothesis. One will have to be added by the person introducing the evidence; other ‘competing’ summaries may be added later. Where necessary, discussion of the evidence can be conducted, and this discussion can itself be cited as evidence. It is conceivable that a single piece of evidence may be argued to support either side of a hypothesis. Maybe it’s necessary to recognise that evidence can be ‘agnostic’?
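To make the data model concrete, here is a minimal sketch in Python. Everything below is hypothetical – the class and field names are my own invention, the 140-character check simply stands in for ‘tweet-length’, and a real implementation would live in a database rather than in dataclasses:

```python
from dataclasses import dataclass, field
from enum import Enum

class Stance(Enum):
    FOR = "for"
    AGAINST = "against"
    AGNOSTIC = "agnostic"  # informs the debate without clearly supporting either side

@dataclass
class Summary:
    author_id: str
    text: str  # the tweet-length summary of what the evidence contributes
    votes: int = 0

    def __post_init__(self):
        if len(self.text) > 140:
            raise ValueError("summaries must be tweet-length")

@dataclass
class EvidenceItem:
    url: str  # link out to the paper / figshare deposit / blog post; NOT stored on-site
    stance: Stance
    summaries: list[Summary] = field(default_factory=list)  # competing summaries
    votes: int = 0

@dataclass
class Hypothesis:
    statement: str
    proposer_id: str  # the proposer on the site, not necessarily the idea's originator
    tags: list[str] = field(default_factory=list)
    evidence: list[EvidenceItem] = field(default_factory=list)
```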

Key to the success of the platform will be the voting up/down of content (à la Stack Exchange). Hypotheses themselves, I think, should not be voteable on – i.e. there will be no opportunity for individuals to vote subjectively/dogmatically for/against a hypothesis, only to vote up or down evidence supporting or contradicting it. Users can also vote up or down particular summaries of evidence items, so the best summaries float to the top for each bit of evidence. The ‘hypothesis view’ page will therefore show the hypothesis at the top and evidence items for and against (highest voted first), with the best summary pertaining to that hypothesis for each one, plus a link to the evidence item (i.e. NOT stored on-site). I think this is really neat because a user can find a hypothesis they’re interested in, find what the community thinks is the best evidence for and against, read those bits, and make an informed decision based on comprehensive community review of the field. It may or may not be useful to have a ‘swingometer’ for each hypothesis which represents the net votes for evidence for and against the hypothesis, giving a ‘community assessment’ of the hypothesis.
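Again as a sketch only, reusing the hypothetical classes above: the hypothesis view ranks evidence by votes and picks the best-voted summary for each item, and the ‘swingometer’ here is just one possible scoring rule (a signed sum of evidence votes):

```python
from typing import Optional

def best_summary(item: EvidenceItem) -> Optional[Summary]:
    # The highest-voted of the competing summaries floats to the top.
    return max(item.summaries, key=lambda s: s.votes, default=None)

def hypothesis_view(h: Hypothesis) -> dict:
    # Evidence items sorted highest-voted first, split into FOR and AGAINST columns.
    ranked = sorted(h.evidence, key=lambda e: e.votes, reverse=True)
    return {
        "hypothesis": h.statement,
        "for": [(e.url, best_summary(e)) for e in ranked if e.stance is Stance.FOR],
        "against": [(e.url, best_summary(e)) for e in ranked if e.stance is Stance.AGAINST],
    }

def swingometer(h: Hypothesis) -> int:
    # Net community assessment: positive leans FOR, negative leans AGAINST.
    sign = {Stance.FOR: 1, Stance.AGAINST: -1, Stance.AGNOSTIC: 0}
    return sum(sign[e.stance] * e.votes for e in h.evidence)
```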

Attracting users?

What’s in it for users? Firstly, being seen to propose and contribute to hypothesis assessment will bring kudos to users. A ‘reputation’ system (also à la Stack Exchange) could be implemented to measure the value of contributions… Even badges etc. would probably work for academics, but I think there’s a more instantly attractive ‘doughnut’ (as my good friend Charlie calls them) – promotion of your research output. If you add a good summary of your paper which informs debate on a particular hypothesis, it will float towards the top for that hypothesis. You will be able to engage with other interested parties and discuss your research. Google will love it.

Engage ‘the enemy’. Let’s say I propose a hypothesis, which just happens to be something I’ve proposed in papers in the literature in the past. Great. I put up the hypothesis and provide evidence items. As I’m the only contributor, the hypothesis is only ‘proposed’. To move it to the ‘community debated’ stage I need to get other people involved. So I share it on Twitter, but also invite people I know will be interested to join the debate. Furthermore, other established Hypothify users will be automatically invited to join based on their interests, the other hypotheses they’re active on, and the tags which have been associated with the hypothesis in question.
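One simple way the automatic invitations might work – a sketch, under the assumption that each user record carries a set of tags harvested from the hypotheses they are active on:

```python
def invite_candidates(new_hypothesis: Hypothesis, users: list[dict], min_overlap: int = 2) -> list[str]:
    # Suggest established users whose interest tags overlap the new hypothesis's tags,
    # e.g. users = [{"id": "u42", "tags": {"nitrogen", "north sea", "eutrophication"}}, ...]
    wanted = set(new_hypothesis.tags)
    return [u["id"] for u in users if len(wanted & u["tags"]) >= min_overlap]
```

A real system would weight recent activity and reputation rather than counting raw tag overlaps, but the matching idea is the same.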

As evidence items are added, the system will attempt to ‘scrobble’ originator details (emails, figshare user details, Mendeley author details) and contact the originators to inform them that their work is being used to inform debate on a particular hypothesis. They will be invited to join the debate. I’m guessing if their work is being ‘correctly’ cited they will be flattered enough to go and have a look, and if it’s being ‘incorrectly’ cited (in their opinion) they will be incensed enough to wade in and put those upstarts right. Thus the experts will hopefully filter in.
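The ‘scrobbling’ step might look something like the sketch below. The lookup functions are pure placeholders standing in for real Mendeley/figshare/email integrations – none of these calls corresponds to an actual API:

```python
from typing import Callable

def lookup_mendeley_authors(url: str) -> list[str]:
    return []  # placeholder: would query the Mendeley API for author contact details

def lookup_figshare_owner(url: str) -> list[str]:
    return []  # placeholder: would query the figshare API for the depositor's email

def send_invitation(to: str, subject: str, body: str) -> None:
    print(f"To: {to}\nSubject: {subject}\n\n{body}")  # placeholder for an email service

def notify_originators(item: EvidenceItem, h: Hypothesis) -> None:
    # Try each source-specific lookup in turn until one yields contact details.
    lookups: list[Callable[[str], list[str]]] = [lookup_mendeley_authors, lookup_figshare_owner]
    contacts = next((found for lookup in lookups if (found := lookup(item.url))), [])
    for email in contacts:
        send_invitation(
            to=email,
            subject="Your work is informing a debate on Hypothify",
            body=(f"Your work at {item.url} has been presented as evidence "
                  f"'{item.stance.value}' the hypothesis: {h.statement!r}. "
                  "You are invited to join the debate."),
        )
```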

Furthermore, as evidence and discussion accumulate and more people vote evidence, evidence summaries and the hypothesis summary up and down, the ‘top contributors’ will be identified. Those top contributors, plus the hypothesis proposer (i.e. the proposer on Hypothify, not necessarily the originator of the hypothesis in the outside world [who should be cited and acknowledged]) will be identified at the head of the hypothesis view as the ‘authors’ of the synthesis. Thus each hypothesis becomes a peer-assembled citeable document (hopefully with a DOI assigned). A publication! And as we all know, in academia publications are all. And what’s really nice is that it doesn’t matter which ‘side’ you’re on. If you’re presenting valuable evidence and discussion in either direction, you’ll be listed. So all those old vested interests evaporate away like the eyewash they are.
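A sketch of how the ‘author list’ might be derived, again using the hypothetical classes above. Net summary votes stand in for contribution value (a real system would fold in discussion and evidence votes too), and the on-site proposer is always listed first:

```python
from collections import Counter

def author_list(h: Hypothesis, top_n: int = 5) -> list[str]:
    # Rank contributors by the net votes their summaries have earned --
    # on either side of the argument, it makes no difference.
    score: Counter[str] = Counter()
    for item in h.evidence:
        for summary in item.summaries:
            score[summary.author_id] += summary.votes
    top = [uid for uid, _ in score.most_common() if uid != h.proposer_id][:top_n]
    return [h.proposer_id] + top
```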

Synthify?

Not all problems present well as hypotheses. For instance, in my field – marine biogeochemistry – much science is exploratory, and/or based on assessing magnitudes of things: “What is the globally representative concentration of ammonium in the surface ocean?”; “What is the annual global carbon uptake by primary producers in the ocean?”. Of course, these can be presented as hypotheses (“The globally representative concentration of ammonium in the surface ocean is 150 nM”), but this is rather pointless. However, the accumulation of evidence leading to an assessment is much the same process as outlined above for hypotheses, only without the FOR and AGAINST argument. And these syntheses could then feed into wider hypotheses as evidence. Synthify.com has unfortunately been taken, but I think it’s reasonable to conduct such data synthesis work under the Hypothify banner. For the field I work in at least, I think the ‘synthify’ function will be as useful as the ‘hypothify’ one.
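For the ‘synthify’ case, the community assessment might reduce to something as simple as a vote-weighted average of the values reported by individual evidence items – a sketch with made-up numbers, not a proposal for the actual aggregation rule:

```python
def community_estimate(reports: list[tuple[float, int]]) -> float:
    # Each report pairs a value from an evidence item with its net community votes;
    # items voted to zero or below are ignored.
    kept = [(value, votes) for value, votes in reports if votes > 0]
    if not kept:
        raise ValueError("no positively voted reports to synthesise")
    total_votes = sum(votes for _, votes in kept)
    return sum(value * votes for value, votes in kept) / total_votes

# e.g. three reported surface-ocean ammonium concentrations (nM) and their votes:
print(community_estimate([(120.0, 5), (180.0, 3), (90.0, 1)]))  # -> ~136.7
```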

Anything else?

Moderation will be important and will rely strongly on the community. Controversial topics could get very sticky very quickly. Need to think about policing. Free speech is important, but balanced debate more so. Anthropogenic global warming deniers and intelligent designers are going to be a challenge to the stability and value of the system.

Integration with the rest of the web is obviously very important. All items will be fully shareable, but a proper API would ideally allow full functional push and pull to and from other sites – Mendeley, Peer Evaluation, Wikipedia, Quora, Disqus, etc.

If all this sounds irresistibly interesting, please hit the pre-sign-up at http://hypothify.kickofflabs.com

Transactions in research on the web: hypothesis and synthesis

In a recent post, written in response to a suggestion that there should be a ‘GitHub for science’, Cameron Neylon discusses the need for core technology which will allow irreducible ‘grains’ of research to be distributed. He makes the argument that these packets of information need context and sufficient information to become the ‘building blocks’ of scientific information on the web – with these in place, the higher-level online transactions that we anticipate will revolutionise and accelerate research will precipitate out with the minimum of effort, as they have done for software, social interactions, etc. on the wider web.

Neylon’s post links (as an example of a step in the right direction) to this work on Research Objects, “sharable, reusable digital objects that enable research to be recorded and reused”. This is great stuff and, if standardisable, might start to fulfil Neylon’s vision for a transfer protocol for research information. However, Research Objects in particular are likened to academic papers, which I think is the wrong scale at which to be looking at the problem. Using the code analogy, we need snippets that can be rehashed into other uses, not complete programs, whether open source or not.

In laboratory chemistry, for example, an experiment itself might be made up of many research objects, such as a buffer solution of a particular composition and concentration (which is in turn made up of water of a particular purity and constituent chemicals of a particular level of purity from a particular manufacturer and batch). All this data should be encoded. One can imagine a globally unique identifier for research objects at this very granular level. Other examples might be the workflow for the physical experiment and subsequent data processing, and the scripts used to process the data and do the statistics. Granulating and classifying all this really appeals to my geeky side, and I’ve tried to do this kind of thing in my lab-based and other research in my open lab notebook, for instance defining individual reagent or standard solutions then using them in analyses, and documenting bits of e.g. experimental design or data analysis with associated code snippets to allow reproduction.
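A sketch of what a globally identifiable, composable research object at this granularity could look like – the field names, identifier scheme and example reagents are all invented for illustration:

```python
import uuid
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ResearchGrain:
    # A fine-grained research object: a reagent batch, a buffer recipe,
    # a workflow step, or a processing script.
    kind: str
    description: str
    made_from: tuple[str, ...] = ()  # IDs of the grains this object is composed of
    grain_id: str = field(default_factory=lambda: str(uuid.uuid4()))

# A buffer is itself built from more granular, individually identified objects:
water = ResearchGrain("reagent", "18.2 MOhm ultrapure water")
tris = ResearchGrain("reagent", "Tris base, >=99.9%, manufacturer X, batch Y")
buffer_50mM = ResearchGrain("buffer", "50 mM Tris-HCl, pH 8.0",
                            made_from=(water.grain_id, tris.grain_id))
```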

This approach could conceivably work very well for experimental information and data, and even the numerical analysis of data, but it doesn’t necessarily capture another important transacted currency in research – ideas; or the joining material between the ideas and the research object – the (numerical or qualitative) assessment of a body of evidence provided by discrete pieces of research in support of or against a particular idea. You could call these quantities hypothesis and synthesis. I think these fundamental concepts in research are often lost in the written record at the moment for a number of reasons, most importantly because much of the work of proposing hypotheses and conducting synthesis tends to fall through the cracks of ‘lost knowledge’ in the publication process. It’s difficult to get hypotheses and synthesis work published in the literature on a stand-alone basis.

Furthermore, the effort of proposing hypotheses, testing and assessing them is something which is better done at the community rather than the individual level. As well as sharing the effort and avoiding repetition, community-level synthesis and hypothesis testing should result in better research. In my area of science, where we look at the complex interactions of physics, chemistry and biology in natural systems, I find there is much ‘received wisdom’ – concepts and ideas which propagate through the field with little easily accessible documentation to back them up. The documentation might be out there, buried in little packets distributed across many papers in the literature, but often it isn’t assessed openly by the community.

For example, the received wisdom (simplified here for argument’s sake) is currently that the North Sea is nitrogen limited (i.e. there is an unspoken hypothesis to this effect). A decade or two ago most people thought it was phosphorus limited. Nobody has written a paper about it or studied it specifically (at least not in the literature); people just look at one aspect or another of this when doing their own study on something else and make statements or inferences in their papers, which tend to influence the field. Other people may present evidence against the hypothesis in their paper, but as they aren’t considering the subject in their analysis they pass no comment on it. The measurement people don’t ask ‘what do the models say?’. The modellers don’t think about things in the same way, so don’t ask the question, or look for the answer. There’s no crosstalk, or open reasoned discussion which is inclusive of the whole community. I’m not saying that I disbelieve the hypothesis; I just think most people who use the argument in discussions probably don’t have a good grip on the whole body of knowledge we have on the subject. By restating the hypothesis they strengthen the community belief in it. I’m not expert enough or well read enough in that particular subject to know whether the idea that the North Sea is N limited is a well-evidenced hypothesis or a meme. People I trust and respect have told me it’s true, but that is no substitute for a structured and argued body of evidence. I would like a centralised source of evidence for and against such a hypothesis and an open community-driven assessment of its validity – it would be really useful for the proposal I’m currently involved in writing. I could spend weeks reading the literature and make my own assessment, but I haven’t the time.

Similarly, in biochemistry there is currently debate over the significance that structural dynamics have for the reactivity of enzymes. There are papers arguing for and against. As the author of this blog post points out, discussion in the literature can be biased by friendly or hostile reviewers, who take a strong view for or against the hypothesis in their reviews of experimental or perspective pieces. This is a problem for reasoned, trustworthy debate; and the forum for debate and response in the peer-reviewed literature is slow and difficult. By the time a response is published the field has moved on and potentially adopted the ideas presented in the original paper, which may or may not have been biased to one side of the argument. Furthermore, with papers and responses scattered throughout the literature, there is no central point from which to access the body of published knowledge (unless someone writes a review article and manages to capture all the relevant evidence – again, something that is more likely to be successful if a wide community is involved rather than just a small group). Future papers cannot be caught by this ‘net’. If I want to read up, I have to a) find all the papers and b) read all the papers. If all I want to do is cite the ‘state of the art’ in the field in the introduction to a paper I’m writing on something related but not completely dependent on the hypothesis, then I’m more likely to cite a single article which takes one view or the other, or cite one of each, thus reinforcing either one side of the argument or propagating the idea that ‘we don’t know’ – which may or may not be true, and is impossible to assess without a detailed synthesis and assessment of the available information. Back to needing a community effort… If a group of experts state that they all believe a hypothesis, and do so at the front of a big body of community-compiled and analysed evidence and argument, then I’d be much more happy to ‘receive their wisdom’.

Maybe this is something we can tackle with existing web technology, without the need for a new underlying research-specific standard of information transfer. There are plenty of reasons why building an “X for research” isn’t a particularly good idea (which boil down to “why not just use X”), but there is space for online tools which take research-specific data models and build web services around them – figshare, for instance, or Mendeley. These tools are not restricted to research, however; anybody can use them. I’ve been considering a similar web service for hypotheses and syntheses recently. Let’s call it Hypothify for argument’s sake (domain already registered :-)). It would be a space where hypotheses can be proposed, evidence compiled and synthesised, and reasoned discussion conducted. Majority consensus could be built. Or not. Depending on the state of our knowledge. Hypotheses could range from the highly specific (“In our experiment X we expect to find that Y happens”) to very broad conceptual hypotheses (“It is statistically unlikely that Earth is the only planet in the universe which supports intelligent life”). Key papers could be identified in support of/against the hypothesis and short summaries written. Corresponding authors of those papers would be notified and invited to contribute. Contributions would be rated by the community. The major contributors of evidence for or against would be listed. Thus each hypothesis would be a ‘living document’ with an ‘author list’. Not peer reviewed but peer assembled. Citeable with a DOI?

In some ways hypotheses will tend to be hierarchical and interdependent or, importantly, mutually exclusive, and this could be represented where appropriate (see the sketch below). Hypotheses needn’t be limited to science: “Edwin Drood was murdered by his uncle”. Academics and members of the public would be equally able to contribute. Some moderation would inevitably be necessary on controversial topics – climate change, for instance. But Hypothify would be a space for engagement with the wider community, both in terms of content and in terms of the process of academic research. This is a positive thing. We can take useful bits of the wider web to use for our work (GitHub, Twitter, Slideshare); why not send something back the other way?
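How hierarchy and mutual exclusivity might be represented – a minimal sketch of a relation graph over hypothesis IDs, where the relation names and example IDs are invented:

```python
from enum import Enum

class Relation(Enum):
    PARENT_OF = "parent_of"    # a broader hypothesis subsuming a narrower one
    DEPENDS_ON = "depends_on"  # only tenable if the other hypothesis holds
    EXCLUDES = "excludes"      # the two hypotheses cannot both be true

relations: list[tuple[str, Relation, str]] = [
    ("h-nutrient-limited", Relation.PARENT_OF, "h-nitrogen-limited"),
    ("h-nitrogen-limited", Relation.EXCLUDES, "h-phosphorus-limited"),
]

def mutually_exclusive(a: str, b: str) -> bool:
    return any(r is Relation.EXCLUDES and {x, y} == {a, b} for x, r, y in relations)
```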

In my next post I’ll outline the (rather sketchy) details of how I think Hypothify might work. Would love to hear what you think! If you’re already convinced, please register your interest here.