
What's the big idea?

17 September 2019


In previous blog posts (see here), I’ve talked about the importance of having a hypothesis, and building that hypothesis in a logical framework within the introduction (see here). The introduction serves to inform the reader about why this particular hypothesis was chosen, introducing both the response and determinate variables, as well as the presumed mechanism by which the hypothesis can be falsified (or upheld).

In this post, I take the lead from my recent talk for the Herpetological Association of Africa (see blog post here), in which I talked about the need for herpetologists to respond to bigger theories in biological sciences.

This message was the result of work done in the MeaseyLab (but not yet completed!) on invasion hypotheses, where we (Nitya, James, Sarah, Natasha and I) checked 850+ papers on alien herps to see which of 33 common invasion hypotheses they had tested. The answer was disappointing, with <1% having used an invasion hypothesis.

In my talk, I suggested that this might not be true only of papers on herpetological invasions, but also of herpetology in general (although I concede that some areas, such as herp physiology are actually quite good). Further, I contend that using these wider hypotheses or theories would actually be good for the authors concerned, as it would likely garner them a wider audience. Moreover, a greater number of biologists might come to realise how valuable reptiles and amphibians are as models in biology.

So where would we find all of these big ideas?

There are quite a few papers that synthesise hypotheses in various areas of biology. Here I provide two, but I will endeavour to add more as I come across them… so watch this space (although not too keenly).

The first is by Mark Vellend on theories in community ecology.

The next is by Jane Catford on hypotheses in invasion biology, but I encourage you to look for more up-to-date versions (the newest is by Enders et al 2018, but this’ll change in time).

Each of these papers will give you a list of big ideas, together with the citations for seminal papers that have built them. You will note that many of these theories are very old with many dating back to Darwin.

Of course, there are many ways to approach and test these theories, but if you don’t know about them, then your work may actually make a considerable contribution to upholding or refuting them, but go totally unrecognised. When the significance of your work isn’t realised, it’s unlikely that it’ll be widely read and used.

Let’s face it, if all the effort that we put into papers is just going to get buried, then is it really worth it? The work that we do is also really expensive, so making it relevant to as wide an audience as possible is something that we should be concerned about.

So, I encourage you to stand on the shoulders of giants by using big ideas in your introduction. Make sure that the data that you collect can actually be used to respond to some of these big ideas. Then make sure that you cite them, giving them the importance that they deserve (yes, even as key words) so that others can find your work, and you might even find that one day, your work has shoulders that are broad enough for others to stand on!

The take home message:

1. As herpetologists we are not engaging with theories from ‘the literature’

2. Herps are great models [even snakes]

3. We have a lot to contribute to many areas of biology, but we need to engage

Reading the literature can really expand your mind and horizons. When undertaking a literature review [or when reviewing a paper], take the time to think about not only what has been tested, but what could have been.

Further Reading

Catford, J.A., Jansson, R. and Nilsson, C., 2009. Reducing redundancy in invasion ecology by integrating hypotheses into a single theoretical framework. Diversity and Distributions 15(1), 22-40.

Enders, M., Hütt, M.T. and Jeschke, J.M., 2018. Drawing a map of invasion biology based on a network of hypotheses. Ecosphere 9(3), e02146.

Vellend, M., 2010. Conceptual synthesis in community ecology. The Quarterly Review of Biology 85(2), 183-206.

  Lab  Writing

Google Scholar, Web of Science or Scopus?

24 August 2019

GS, WoS or Scopus - what's the difference?

Have you ever wondered why Google Scholar (GS) scores are so inflated compared to other citation databases like Web of Science (WoS) or Scopus? I've always noticed that Scopus has better coverage than WoS, and that GS is bigger than both (and a lot messier with lots of weird duplicates and poorly entered stuff), but is there anything more to it than that?

Well it seems that there are some people who have already thought about this, and come up with a good idea of exactly what's different. Martín-Martín et al (2018) have done a great job of analysing all this stuff from some 2.5 million citations. What they found inspired me to write this blog post, in which I've chopped out the Life-Sciences stuff to show you. But I encourage you to go read the article for yourself (there's a link at the bottom, and here).

I have been known to take the odd peek at my Google Scholar profile over the years, and see how it's coming along. I rarely check on WoS or Scopus, 'cos it's a bit of a faff getting signed in and doing the search. Plus it looks so much smaller when one is habituated to seeing those double digits in GS! However, I've always been a bit uneasy about citing my GS citation rate, H-index or i10 (among others that they give) as I've never really known what all that extra represents. Something grey and unseemly? Well, it turns out that it's all good stuff, and perhaps GS is the better one to cite as it's a more inclusive index: more inclusive of different document types and different languages.

  • Top left:  the entire dataset of ~2.5 million citations shows that nearly half are in all 3 databases, but that more than a third are in GS only.
  • Top right: shows life sciences alone (~0.5 million citations) and over half (~57%) shared by all 3, and less than a third in GS only. 
  • Middle: shows the kinds of items that you are getting in GS vs all 3 databases. GS gives you lots of theses, book chapters, conference papers, and other unpublished stuff like preprints.
  • Bottom: shows the different linguistic contributions. Almost all English in the overlapping 3 databases, while GS encompasses a lot of Chinese, Spanish, German, French, Portuguese, etc. (sorry not to list them all, but you can see what they are above).

This is actually really interesting, and allows you to interpret your GS results as a more inclusive citation index. While WoS and Scopus aren't exclusively English or journal publications, they are mostly. But that extra third that GS gives you allows you to show the extra scope that your work is getting outside that English journal mainstream. Is your GS score more than a third higher than your WoS or Scopus score? If yes, then your work is having a greater impact elsewhere in the world, and there's nothing wrong with that.

The excerpts from the two tables above show how well GS correlates with both WoS and Scopus in our area (Biological Sciences). It also tells you by how much the GS score is likely to be inflated - 1.90 for GS/WoS and 1.45 for GS/Scopus. Again, if you deviate from this with a higher score, you can give yourself a pat on the back for having work that's reaching more people in more parts of the world. 

So, just for this blog, I've looked at all three databases for my citations today to see how my score compares: 1.72 for GS/WoS & 1.62 for GS/Scopus. Hmm... I wonder what it means when you get one higher and one lower? Any ideas anyone?
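For anyone wanting to run the same comparison on their own profile, here is a minimal sketch of the arithmetic. The field ratios (1.90 and 1.45) are the ones quoted above; the citation counts are hypothetical numbers chosen only so that they reproduce ratios of 1.72 and 1.62, and the function names are my own.

```python
# Field-level ratios quoted in the post for Biological Sciences.
EXPECTED_GS_WOS = 1.90
EXPECTED_GS_SCOPUS = 1.45


def citation_ratios(gs, wos, scopus):
    """Return (GS/WoS, GS/Scopus) for one author's citation counts."""
    return gs / wos, gs / scopus


def compare(ratio, expected):
    """Say whether an author's ratio is higher or lower than the field ratio."""
    if ratio > expected:
        return "higher"
    if ratio < expected:
        return "lower"
    return "equal"


# Hypothetical counts (not real data): 8600/5000 = 1.72 and 8600/5300 = 1.62.
gs, wos, scopus = 8600, 5000, 5300
gs_wos, gs_scopus = citation_ratios(gs, wos, scopus)

print(f"GS/WoS    = {gs_wos:.2f} ({compare(gs_wos, EXPECTED_GS_WOS)} than field)")
print(f"GS/Scopus = {gs_scopus:.2f} ({compare(gs_scopus, EXPECTED_GS_SCOPUS)} than field)")

# Dividing one ratio by the other cancels GS and leaves Scopus/WoS.
print(f"Scopus/WoS: author {gs_wos / gs_scopus:.2f}, "
      f"field {EXPECTED_GS_WOS / EXPECTED_GS_SCOPUS:.2f}")
```

Dividing the two ratios is simple algebra rather than anything from the paper, but it does show that "one higher and one lower" is at least partly a statement about how Scopus and WoS compare with each other for a given author.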

Martín-Martín, A., Orduna-Malea, E., Thelwall, M. and López-Cózar, E.D., 2018. Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories. Journal of Informetrics 12(4), 1160-1177.


"Invasive Alien Species" is not a thing

26 February 2019

Why “Invasive Alien Species” is not a thing

Many invasion biologists are fond of the term “Invasive Alien Species” (often abbreviated to IAS), but for me it’s logically inconsistent and encompasses redundancy. Perhaps, the original reason for placing the three words together was in recognition that not all alien species are invasive, therefore we’d need to add the term invasive to underline the point that we are only referring to the subset of alien species, those that are invasive. However, the implication in this phrase is that it’s possible to be invasive and not alien; i.e. that “invasive native species” is another category. But it’s not. Most would agree that to be invasive you would first need to be alien. To this end, perhaps I should have titled this blog: Why "invasive native species" is not a thing... but there is a school of thought that suggests that invaders can be native (see Valéry et al 2008).

The Blackburn et al (2011) scheme (pictured below) formalised this in a way that makes this easy to understand.

To make it even easier, I’ve adapted the scheme into sets so that you can appreciate that each group of species is a subset of the other (this scheme is not to scale, as we’d expect to see much smaller sets inside each set – maybe even following the tens rule?). Note that if “invasive native species” were a thing, we could draw another set inside “All species” but separate from all the other sets. Does this seem logical?

If I said that this was an “invasive species”, would you then have to ask: “is it alien or native”?

Invasive species are a subset of alien species (i.e. the ones that spread), but we shouldn’t be adding words to each growing subset; otherwise we’d have the term “invasive established alien species” to distinguish between those that are merely “established alien species” or just “alien species”.
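The nested-sets argument can be written out as a small sketch. The species labels below are hypothetical placeholders, not real taxa, and the set names are simply my shorthand for the categories discussed above.

```python
# Hypothetical placeholder species; the labels carry no biological meaning.
all_species = {"sp_a", "sp_b", "sp_c", "sp_d", "sp_e"}
alien = {"sp_b", "sp_c", "sp_d"}   # moved beyond their native range
established = {"sp_c", "sp_d"}     # alien species with self-sustaining populations
invasive = {"sp_d"}                # established alien species that spread

# Each category is a subset of the one before it.
nested = invasive <= established <= alien <= all_species
print("Sets are nested:", nested)

# Native species are everything that is not alien.
native = all_species - alien

# Under this scheme no species can be both invasive and native.
invasive_native = invasive & native
print("Invasive native species:", invasive_native)
```

Because "invasive" already sits inside "alien", asking whether an invasive species is alien or native returns nothing new, which is the redundancy in "Invasive Alien Species".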

My appeal is to think about these terms instead of blindly following those who have gone before.

Yes, this is another rant on the blog, and I’d like to point the finger at John Wilson for infecting me with this particular pernicious titbit. Now I can only hope that it’ll spread and you’ll be able to point to this to help in your own war on IAS. We don't want to join those silly mechanistic definition folk with their non-biogeographic ideas of invasions. Otherwise we might end up going down an alley labelled 'invasion syndromes'.


Pseudoreplication

24 February 2019

What is pseudoreplication?

It is very rare that we can measure every animal in a population, or every measure available in the environment. Instead of such exhaustive sampling, we try to take a representative sample. This sample is something that we can achieve over the period of our study, and which we can use to represent the population or environment of interest. Each data point within the sample should be a replicate, the same measure taken on an equivalent animal. For example, 20 replicate measures of the right hind leg of a frog should involve 20 individuals.

A pseudoreplicate is a problem with the experimental design. Using our example above, if we measured the same leg on the same animal 20 times, we could not claim to have taken a sample of all frogs in a population. Similarly, if 20 measures of 20 individuals all came from the same pond, these animals would probably represent the pond well, but not necessarily the entire population (which is presumably made up of more than one pond). Thus, pseudoreplication occurs when the measurements taken have a degree of dependence on each other, and therefore aren’t independent.

In this image, I’ve used the mean of the length of the frogs’ rear legs to be represented by the intensity of the shading. Taking 20 samples from the blue population would need to involve sampling several of the ponds, similarly for the green population. But in the yellow and red populations, the animals move so frequently between the ponds that all the means are the same. Thus, if you only needed to compare red to yellow populations (for your question), then you only need to sample 20 animals from one of their ponds.

However, this is where the subtlety of pseudoreplication sets in. We may have good reason to believe that the frogs in the pond we’re sampling actually do represent the entire population. We may know that animals in all the ponds in that population regularly move around, and hence measuring 20 animals from any of the ponds is equivalent to measuring animals from all of the ponds. If the opposite were true, and we believed that the frogs in each pond represent a discrete unit, then we’d have a bigger problem. We’d have to sample evenly across all the ponds in the population to make up our sample, or the alternative would be to take lots of samples from each pond and use the pond as a factor in our analysis. By now you can see that the task is getting more onerous, mostly because the question is becoming more complex. This is a really important point: your experimental design is going to depend entirely on your hypothesis, and (as I’ve stated before – see here) it is really important to know what this is from the start.

If our hypothesis was that the legs of animals in one population were longer than those of another (perhaps because of selective sorting), then we might presume that animals within one pond are closely related (especially at the range edge), and so the ponds would become our smallest repeatable unit. We should then measure only a few animals from each pond, and repeat this for lots of ponds for each population. You can build the ponds into your model when you test your hypothesis, provided that you have sufficient statistical power (see here for more on this).
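To see why the choice of replicate matters, here is a minimal sketch with made-up leg lengths (in mm) for three ponds in one population. It compares the standard error you get by treating every animal as independent with the one you get by treating the pond as the replicate. Collapsing to pond means is a deliberately simple stand-in for building ponds into a model, and all of the numbers and names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical leg lengths (mm): animals within a pond are similar,
# but ponds differ from each other.
legs_by_pond = {
    "pond_1": [30.0, 31.0, 32.0],
    "pond_2": [40.0, 41.0, 42.0],
    "pond_3": [50.0, 51.0, 52.0],
}


def standard_error(values):
    """Standard error of the mean, assuming the values are independent."""
    return stdev(values) / sqrt(len(values))


# Pseudoreplicated view: all 9 animals treated as independent replicates.
all_animals = [leg for pond in legs_by_pond.values() for leg in pond]
naive_se = standard_error(all_animals)

# Pond as the replicate: one mean per pond, so n = 3.
pond_means = [mean(pond) for pond in legs_by_pond.values()]
pond_se = standard_error(pond_means)

print(f"Population mean: {mean(all_animals):.1f} mm")
print(f"SE treating animals as replicates: {naive_se:.2f}")
print(f"SE treating ponds as replicates:   {pond_se:.2f}")
```

With these numbers the animal-level standard error comes out at about half the pond-level one, so the pseudoreplicated analysis looks far more precise than the data justify.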

Pseudoreplication in experiments

When it comes to conducting experiments, there tend to be a greater number of points at which you might be pseudoreplicating. A good example is the use of incubators to raise 10 sets of tadpoles from 10 pairs of parents at different temperatures. When each incubator is set at a different temperature (i.e. a different treatment) then this is fine, but if two incubators are used to house 5 of the tadpole sets each, the largest unit becomes the incubator instead of the parental set of tadpoles. This is because the incubators are unlikely to be able to keep exactly the same conditions (incubators are fickle things). Likewise, this could be a room or some other unit in which you are treating the samples. Imagine that you wanted to extract the gut microbiome of these tadpoles and that you used one kit to extract nearly all of them, but suddenly this became unobtainable and you had to buy another brand to finish off the remaining samples. The kits would become your largest unit, and you’d be falling into the realms of pseudoreplication.

As I’ve emphasised above, pseudoreplication is a problem of experimental design. This is because if you’d designed your experiment properly, you’d have ordered the right number of extraction kits, or seen that not all your animals were going to fit into a single incubator. When you know about these problems in advance, you’ll be able to make allowances for them by including them as a term in your analysis (essentially testing to make sure that the different kit or extra incubator isn’t an issue - you wouldn’t expect it to be, otherwise it wouldn’t be worth going ahead with the experiment). However, you can’t simply go adding extra terms into your analysis. At some point you’ll run out of statistical power; that is, you’ll stand an unacceptably high chance of failing to reject the null hypothesis when it is false (Type II error). You must know that you’re going to have enough power before you start.
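As a rough illustration of checking power before you start, the sketch below uses a normal approximation for a simple two-group comparison. This is my simplification rather than a recipe for any particular design: it assumes a two-sided z-test, equal group sizes, truly independent replicates (ponds or incubators, not animals within them) and a standardised effect size that you have to supply yourself. The function names and the effect size of 0.5 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in means divided by the common
    standard deviation. Power is 1 minus the Type II error rate.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def smallest_n(effect_size, target_power=0.8, alpha=0.05, max_n=10_000):
    """Smallest number of independent replicates per group reaching target_power."""
    for n in range(2, max_n + 1):
        if power_two_sample_z(n, effect_size, alpha) >= target_power:
            return n
    return None  # target not reachable within max_n


# Hypothetical example: a medium standardised effect of 0.5.
for n in (10, 20, 40, 80):
    print(f"n = {n:>2} per group: power = {power_two_sample_z(n, 0.5):.2f}")
print("Replicates per group for 80% power:", smallest_n(0.5))
```

The useful part is less the exact number than the habit: decide what your independent unit is, then check whether you can afford enough of them before collecting anything.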

Summary

In summary, pseudoreplication is something that you need to beware of before you start your sampling. A good way of checking if you have a problem with pseudoreplication is to present your study design to a group (like in a lab meeting), with enough detail that they’ll be able to spot it. If you are aware of potential problems in your study, conduct a power analysis to decide how many samples you need to take to account for the problem.

Sometimes it might be impossible to avoid pseudoreplication in your study design. If you think it's going to be important, then you'll have to redesign your experiment. If you think it's not important, you'll need to be able to reason intelligently, and be honest about the possibility of pseudoreplication in your write-up (see an example of this here). 


Writing the discussion

10 January 2019

Writing the discussion can be a daunting prospect

Writing your discussion

The discussion should be the last part of your manuscript or chapter that you are going to write. This is because for the discussion you must already know all of the rest of the manuscript, starting with the hypothesis, which dictates what will go into all of the other sections. While writing the other sections, I often make notes under the heading ‘discussion’ so that it acts as an aide-mémoire to ideas that I’ve had during the study.

Before you start writing your discussion, make a plan and then discuss this plan with your advisor. I’d make this suggestion for all of the sections of your chapter or paper. It doesn’t take that long to do, and it provides an opportunity for you and your advisor to talk about the results of your work and discuss them together. Such discussions should be eye opening for both of you, and they provide a great opportunity for you and your advisor to get excited about the work you’ve done, your results and what they mean. I get a lot of enjoyment during these discussions, especially when sharing the excitement of the results. Sharing thoughts before you start writing is important because by talking about it, you and your advisor are more likely to come to a consensus about what the results mean. Conversely, presenting them with the discussion finished might not be the best way of convincing them, and you might have a lot more work to do in the long run.

In general, the first and last paragraphs of the discussion are key to the reader, but the discussion must also consider caveats and limitations in the experimental design and interpretation of your results, as well as providing a concise discussion of the results in the context of existing literature. This is also your opportunity to suggest new hypotheses and how they could be tested.

Let's remember that if you are struggling to write, there is the potential to follow a formula, such as the one I outlined previously (see blog post here). 

First paragraph of discussion

Your discussion begins with you responding to your hypothesis, clearly stating whether or not it has been accepted, and putting this into the wider context of the study (i.e. paragraph one or two of the introduction with relevant literature). You can then follow these statements by emphasising what you consider to be the most important finding, and explain how it adds to existing literature. However, don’t be tempted to over-interpret your results, or claim that they mean more than they do (see section on speculation below).

This first paragraph of the discussion doesn’t have to be very long (three to four sentences), but you should make sure that you end by providing a link to the following paragraph or explaining how you will move the discussion on in sections.

To sub-section or not to sub-section the discussion?

My preference is to plan the discussion before you write it, just as you did for the introduction. This will provide you with logical sub-section headers for the discussion. When your chapter has a simple aim that is easily communicated, I’d suggest deleting these sub-section headings before you finish. However, many studies are more complex and contain multiple experiments or evidential approaches. It is then much easier to leave sub-sections in your discussion so that your reader can more easily follow the text. Where possible, these should be the same sub-sections that you have broken your methods and results into, especially where these relate to specific hypotheses or aims. Or it may be more appropriate to discuss the different approaches separately, specifically when the literature that you refer to falls into different groups.

If you are stuck and can’t decide which way is most appropriate for your work, spend more time on fleshing out the outline, specifically to include the literature that you want to cite. Try it one way, and then the other, and you should quickly be able to tell which makes more sense. Of course, you should also ask your advisor for their opinion – that’s what they are there for, after all.

When considering what sub-section to write first, go back to the order that you’ve presented the questions or approaches in the rest of your chapter or manuscript. Keeping the order consistent throughout is a really good way of helping your reader follow what you want to communicate. Shuffling the order in each section is almost guaranteed to get them lost and wishing that they hadn’t started reading.

Then you need to discuss!

The discussion is about explaining the meaning of your results to the reader. I often find that people write a lot of inappropriate information in the discussion. Remember that this section is not going to provide background information, and is unlikely to bring up new topics that need introducing. It may be that your results prompt you to introduce a new area of research that wasn’t covered in your introduction, and this is fine. But for the main part, you should discuss your results in the context of existing literature. You can expect that the literature that you use in your discussion will only partially overlap with your introduction, with plenty of new citations. Similarly, it can be that discussing your results will mean that you end up with paragraphs that have no citations.

When providing different sides of an argument, try to use your results to conclude that one side is supported more than the other. If your results don't help with this particular point, then it could be that you are trying to discuss something that isn't directly related to the work. This is a very common problem in discussions, and a good test is asking yourself how your results add to the point you are trying to discuss. If they don't, leave it out and move on.

Caveats and limitations

An important aspect of the discussion is to consider how the interpretation of your results may be incorrect. For example, if you have done an experiment, how well controlled was it and how well could it be considered to scale up to real world interactions? Could you have measured other variables? Almost every study will have caveats and limitations, and it is very important that you report them in a considered approach.

My preference is not to provide all the caveats and limitations as a separate paragraph. Instead, mention them when you are discussing relevant aspects.

Should you speculate in the discussion?

Reviewers will often be unhappy with speculation in the discussion section. Speculation isn’t that hard to spot, as it occurs when you make claims for which your results have no foundation. I think that it is healthy to have one or two statements that are speculative, but clearly labelled as such. After all, after writing this paper, you are going to be one of the world experts in the topic, and thus your deeper understanding is often worth relating to the reader. However, I suggest that you speculate in combination with suggesting what could be done in future. If you really feel that the point has to be made, you must clearly label it as speculative.

Perhaps an easier trap to fall into is over-interpretation. This is when you suggest that your results mean more than they do. It’s an easy trap to fall into, especially after setting up the study in relation to key topics in the discipline (presented in paragraph one or two of the introduction). You will probably find it hard to see where you have over-interpreted, and this is something that having your work read by your advisor, or another colleague, will really help. You may then be asked to ‘tone down’ your claim, or to place it into the direct findings of your results.

Again, my preference is not to place all speculations or future hypotheses in the same paragraph of the discussion. These aspects should appear as the topics they relate to are discussed.

Don’t beat up on others

Your results may show that other researchers were wrong with their interpretation or findings. Whatever you may think of them, never use your discussion to be disrespectful to other researchers or their work. This has been referred to as the “bully pulpit” (see here). As with all aspects of professional interactions, consider how you would like to be treated, and act accordingly. This is not to say that you shouldn’t point out mistakes that were made before, but be sure not to get emotive or insulting.

Generally, such comments won’t get through the peer review process, and remember that you might be insulting the examiner of your thesis (or the reviewer of your paper) – which is not likely to go down well!

Where next?

The ‘where next’ aspect of your discussion is important as it may provide the reader with ideas for their own work. Of course, these are questions that you may wish to pursue in your own career, or they may be corroborating evidence from other disciplines that you will never undertake. Either way, making pointers for continuing aspects of the research is an important component of the discussion. Providing new lines of research may also allow you to speculate about what you consider to be the most important angle of this topic now that you have presented your results.

Last paragraph

The last paragraph of the discussion is your take home message. It’s a summary paragraph that sets out what you aimed to achieve, and what the new state of understanding of the topic is now that your results are out. This should include the key literature that can now be reconsidered.

Never repeat text

Please remember that while this might sound similar to your first paragraph, it is not the same. This final paragraph should not replicate any text that appears elsewhere in your chapter or manuscript (not even the abstract). Never repeat or copy text generally, even within your own chapter (or between chapters). For the reader, it’s very easy to spot and it gives the impression that you have nothing to say and are simply filling space. This is not the kind of impression that you want to give your reader, especially if they are examining your work!

As always, there are a number of other places to look for more advice on writing your discussion, and I’d encourage you to read as widely as possible. Here are just a few that I’ve looked at after writing the first draft of this blog post:

U Sydney Biochemistry

Hess, D.R., 2004. How to write an effective discussion. Respiratory Care 49, 1238-1241.

Şanlı, Ö. et al, 2013. How to write a discussion section? Turkish Journal of Urology 39(Suppl 1), 20-24.

Creative Commons Licence
The MeaseyLab Blog is licensed under a Creative Commons Attribution 3.0 Unported License.