Taming the Information Overflow – Part 2

February 10, 2015

Part 1: News articles, blog posts and social media links
Part 3: Work notes, documents, slides, web clippings and textbook scans

This post is the second in a series of three posts on systems I’ve implemented for coping with the increasing influx of information. The first post described my information system for everything not related to work, for example news articles, blog posts and social media links. In the present and an upcoming post, I describe similar information systems for work.

The information I use in my work – my Ph.D. project – can be split into two parts: Scientific publications and everything else. The latter covers many different types of information – for instance, copies from textbooks, web clippings and other people’s notes – but none of these is individually as large or as important as scientific publications.

I therefore dedicate the present post to my information system for scientific publications, focusing on everything related to finding, collecting, reading and processing them; the next post covers the information system for everything else.

Scientific publications cover journal articles, conference papers, Ph.D. theses and book chapters, but journal articles represent by far the largest (in number of documents) and most important (in impact on my work) part. Consequently, I simply refer to scientific publications as “articles” in the rest of this post.

An overview of my system for articles can be seen in the above picture, and the points I’ll cover in this post are the following:

The philosophy behind my information system
Sources of articles
Collecting articles: Dropbox and JabRef
Reading information: GoodReader, Skim and Foxit Reader
Processing information: Evernote and JabRef

The philosophy behind my information system

My information system for articles has evolved since the beginning of my Ph.D. project (Fall 2012), and I’ve been inspired by other people’s information workflows. I have picked ideas from many different systems, but the concepts of Zettelkasten and the Knowledge Cycle have been particularly inspirational.

With my information system for finding, collecting, reading and processing articles, I aim for a few basic things: ease of collecting, but without indulging in The Collector’s Fallacy; access to articles everywhere and on all devices; a structured and flexible system for getting an overview of and searching within my collection of articles; and an external place to store and search the acquired information.

In the following sections, I go more into details with these parts.

Sources of Articles: Many…

As a scholar, it is important to stay up to date on new articles that represent the state of the art in research, both to generate new ideas and to stay aware of developments in the field.

However, it is – especially for young researchers – equally important and useful to go back to original articles to learn and understand the basics of the topic at hand. And for this purpose reference lists in other articles are indispensable.

When I first started to learn the computational techniques that I use in my Ph.D. project, I kept looking backwards in the reference lists – and finally ended up with the original articles from 1966!

Similarly, well-written literature reviews in articles – or Ph.D. theses – can contain plenty of interesting articles and can furthermore be a useful starting point for your own research synthesis.

Reference lists therefore represent a large and important source of new articles – so important that skimming the reference list is often the first thing I do after opening an article, even before reading the abstract. Especially if I recognize one or more papers in the reference list, I’m compelled to continue, either by reading the abstract or by skimming the figures, before I decide whether to collect the article.

RSS feeds are another source of articles; in Feedly, I have set up a few RSS feeds from journals that much of my reference material stems from. When I first set them up, I received all new articles from several OSA, APS and Nature journals, but the signal-to-noise ratio was low; there were too many articles irrelevant to my work, and I would spend a lot of time sifting through them with only a few useful articles to show for it.

To change this, I now have customized RSS feeds with OSA (free user profile) based on important keywords, a single-topic cross-journal RSS feed with APS and a single Nature journal RSS feed. This may mean that I miss some interesting articles in my RSS feeds, but at least the signal-to-noise ratio is a lot higher; most of the articles that end up in Feedly are relevant to me.
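The idea behind these keyword-customized feeds can be sketched in a few lines of Python. This is only an illustration of keyword filtering over article titles – the keywords and titles below are made up and are not my actual feed setup:

```python
# Minimal sketch: keep only feed items whose title matches a keyword.
# Keywords and example titles are hypothetical, for illustration only.
KEYWORDS = {"photonic", "cavity", "waveguide"}

def is_relevant(title):
    """Return True if any keyword occurs in the title (case-insensitive)."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in KEYWORDS)

titles = [
    "Photonic-crystal cavities with high Q factors",
    "Advances in organic solar cells",
]
relevant = [t for t in titles if is_relevant(t)]
```

A real feed service applies this kind of filter server-side, of course; the point is simply that a handful of well-chosen keywords trades a little recall for a much higher signal-to-noise ratio.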

Customization is at the heart of another important source of articles: Google Scholar. I regularly search for articles on specific topics and have a personal profile on Google Scholar. This means two things: I receive notifications when my articles are cited, and I get notified of new articles within the topics of my own articles and of the articles I search for. The former notifications arrive via e-mail, the latter as a little red alarm bell on Google Scholar (see the top right corner in the picture below).

Compared to notifications from publishers (OSA, APS and Nature, for example), Google Scholar also includes citations from less official sources, of which arXiv (at least in my branch of the sciences) is an important example. Using both Google Scholar and arXiv is therefore a great idea.

A final and fairly frequent source of articles is discussions with colleagues as well as attending seminars and conferences.

Collecting Articles: Dropbox and JabRef

Whenever I find an article that I want to save for later reading or reference, I have an unambiguous workflow for collecting and storing it:

I save two copies of the PDF in my Dropbox, one in the “Edited” folder for reading and annotation and one in the “Original” folder that remains unannotated – which is convenient when I forward articles to others.

I name both files according to the last name of the first author and the year of publication (deLasson2013 for my own article from 2013, for example). This system works well for me as I remember first authors and publication years quite well, and in addition file names usually remain relatively short – which is convenient when I refer to articles in my research journal.

I download the BibTeX file for the article (which most journals provide on their homepages) and import it into my JabRef database.

In JabRef, imported references are automatically assigned a BibTeX key, which is the same as the file name. So whenever I write documents and need to cite an article, the citation key is the same as the file name. Furthermore, I associate the JabRef database entry with the PDF in Dropbox – so that the PDF in the “Edited” folder can be opened directly from JabRef – and tag the article (more on setting these things up in JabRef in this post).
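The naming and two-folder scheme above can be sketched in a few lines of Python. The Dropbox path and function names are hypothetical – this is an illustration of the scheme, not a tool I actually use:

```python
import shutil
from pathlib import Path

def bibtex_key(first_author_last_name, year):
    """Build the shared key/file name, e.g. ('deLasson', 2013) -> 'deLasson2013'."""
    return f"{first_author_last_name}{year}"

def collect_article(pdf_path, first_author_last_name, year,
                    dropbox="~/Dropbox/Articles"):
    """Store two copies of the PDF: one for annotation ('Edited'),
    one that stays pristine ('Original'), both named by the key."""
    key = bibtex_key(first_author_last_name, year)
    root = Path(dropbox).expanduser()
    for folder in ("Edited", "Original"):
        target = root / folder / f"{key}.pdf"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(pdf_path, target)
    return key
```

The payoff of the shared key is consistency: the file name, the BibTeX key and the citation in a manuscript are all the same short string.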

Doing the above – that is, storing files by author name rather than in “Topics” folders, and tagging them in my JabRef database – is a system that works well; each reference has an unambiguous file name and location, but can in turn be tagged with arbitrarily many tags. By using JabRef’s advanced search, I can then combine various tags in a search to filter articles. I’ve found that writing articles, especially introductions where the scientific literature is reviewed, is easier with a well-crafted tagging system that provides an easy overview of the different parts of a topic or field of research.
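The combination of one unambiguous key per article and arbitrarily many tags can be illustrated with a toy example – the entries and tags below are made up, and this only mimics the idea behind a combined tag search, not JabRef’s actual search syntax:

```python
# Hypothetical mini-database: BibTeX key -> set of tags.
articles = {
    "deLasson2013": {"cavities", "scattering-matrix"},
    "Smith2010": {"cavities", "experiment"},
}

def filter_by_tags(db, required_tags):
    """Return the keys of all entries carrying every required tag."""
    return sorted(key for key, tags in db.items() if required_tags <= tags)
```

Asking for `{"cavities"}` returns both entries, while `{"cavities", "experiment"}` narrows the result to one – which is exactly how combining tags helps when reviewing one corner of a field.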

Many researchers use online database services like Mendeley, and several times I’ve considered switching to one of these. A big advantage of these is the ease of importing articles; in Mendeley, you input the PDF, and the software automatically imports the reference, including the title, author names, journal name and similar information (with varying degrees of success in getting it all correct).

Admittedly, it takes a little longer for me to import references, but frankly I don’t mind; adding as many references as possible is not a goal in itself (cf. The Collector’s Fallacy), and taking the few moments to do it semi-manually ensures the quality of the imported information – and lets me add tags that make the articles more useful and relevant in my work.

Reading Information: GoodReader, Skim and Foxit Reader

Reading articles comes in two forms: skimming an article or reading selected paragraphs to find specific pieces of information needed in my work right now, or actually reading an article (most often from start to end).

The first type of reading typically happens while I’m in the middle of working on something specific and need input, and this type of reading – or skimming – is thus always done on the computer. On Windows, I use Foxit Reader, which supports tabbed PDFs; this is extremely useful when I have several articles open at the same time. On Mac, I use Skim, which unfortunately lacks tabbed PDFs, making it harder to shuffle between many open articles.

Both Foxit Reader and Skim have a large selection of annotation tools that I occasionally use. But most of the time, as mentioned above, I merely skim or read selected paragraphs while working on the computer, and I usually put the collected information elsewhere – more on this in the next section.

The second type of reading is almost exclusively done on the iPad, where I use GoodReader. GoodReader is synced with my article folders in Dropbox (details on setting this up in this post), so new articles added to Dropbox and articles read and annotated in GoodReader are synced between the two.

GoodReader, too, has a lot of annotation tools, and I use some of them – mostly text highlighting and typewriter notes in the margin (see the picture below).

Previously, I would open the article from the “Edited” Dropbox folder to read and annotate – and the copy from the “Original” folder in a different tab to view the references while reading. But a recent version of GoodReader makes it possible to open the same PDF in several tabs: simply click the tab of an open PDF and choose “Duplicate tab”. This is especially useful for longer documents, where one might want to flip back and forth between different sections.

Once the skimming on the computer or reading on the iPad is done, I process the acquired information, as explained further in the following section.

Processing Information: Evernote and JabRef

Processing information, in many ways, is the most important part of reading to acquire new knowledge – and yet I have a feeling that many people don’t spend a lot of time, if any at all, on it.

I’m sure that my own system for processing information can be improved, but at least I do something active while and after reading, which makes remembering the information at a later point easier.

As discussed in the previous section, I mostly read articles in two ways: By skimming selected parts to find specific information or by reading entire articles, to get an overview and understanding of the overall results being reported.

In the former case, I collect the information elsewhere, namely in my research journal, where it becomes part of my ongoing work. I might also make an annotation directly in the article, but the important step is to actively pull the information out and put it into a context that is useful to me.

As an example, I recently reviewed the literature for the choice of a set of computational parameters that I need in my own computations. I went over a bunch of articles and noted their choices of parameters in my research journal in Evernote – see the picture below.

In the latter case – reading an entire article – I do a few things after reading: I mark the article as “Read” in JabRef, I potentially add it to an existing literature review (or start a new one), and I export all the annotations from GoodReader to Evernote.

Marking the article as “Read” in JabRef means I can always filter “Read” and “Unread” articles and use that to find articles I read previously – or unread ones that I’d like to read.

Adding an article to a literature review means that I prepare for writing, either my own article manuscripts or my Ph.D. thesis, where I’ll review previous results and findings in my field.

Exporting the annotations from GoodReader into Evernote makes them searchable together with my other work notes. GoodReader offers annotation export via “E-mail summary” (see the picture below), and export into Evernote is simple using the private Evernote e-mail address. On top of the file name and reading date, I add “@Literature #w_annotations” in the e-mail subject field; that way, the annotations are sent directly to the “Literature” notebook and tagged with “w_annotations” (w for work).
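The subject-line routing can be sketched as follows. The exact ordering of file name and date is my own illustration; only the “@notebook #tag” part mirrors the scheme described above:

```python
import datetime

def export_subject(file_name, read_date,
                   notebook="Literature", tag="w_annotations"):
    """Build the e-mail subject that routes annotations into Evernote:
    '@<notebook>' selects the notebook, '#<tag>' adds the tag."""
    return f"{file_name} ({read_date:%Y-%m-%d}) @{notebook} #{tag}"

subject = export_subject("deLasson2013", datetime.date(2015, 2, 10))
# subject: "deLasson2013 (2015-02-10) @Literature #w_annotations"
```

Keeping the notebook and tag in the subject means every annotation export lands in the right place without any manual sorting afterwards.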
