Applying the 80-20 Rule to Computer Science Research

A key principle touted in the productivity world is the Pareto principle, which says that 80% of your results come from only 20% of the work that you do. What this means is that you can divide your work tasks into high value tasks that lead directly to results and low value tasks that don’t. Productivity gurus love this principle because, if it’s true, you can boost your productivity seemingly overnight by replacing low value tasks with high value ones. There are some nuances (for example, someone may still need to do a low value task even if you don’t do it yourself), but the general idea is to dump low value tasks as much as possible.

That being said, academic research is different from the business world, which tends to be the focus of most productivity books, blogs, podcasts, etc. We researchers rarely know in advance which of our hunches will lead to exciting breakthroughs and scientific discoveries. As a result, we sort of have to “try everything” to verify whether we can or cannot answer a particular research question with available methods. On top of this, we never know whether the next coffee break, group meeting, paper discussion, or other interaction may lead to a new discovery. For these reasons, one may wonder whether the Pareto principle really applies to academic research and other forms of open-ended work. Is there really such a thing as high value and low value research tasks? For example, isn’t it always good to read lots of papers? Haven’t studies shown that group meetings are critical to producing innovative and impactful research [cite]? I argue in this blog post that yes, we can apply the Pareto principle, but doing so requires a shift in how we think about the practice of academic research.

What Makes for Successful Computer Science Research?

As a researcher in computer science, I study how data analysts and data scientists make their own discoveries and innovations. In my experience, there are three key concepts that we need in order to distinguish between high and low value tasks in computer science research. The first key concept is that all academic research is a form of knowledge work: work that happens mostly in our heads rather than in the physical world (compared to, say, construction or house cleaning). Second, the key measure of success for innovative knowledge work is the dissemination of new ideas. For example, the initial successes of Facebook, Microsoft, and Apple were all based on bringing a single new idea to the tech market. The third key concept is that there seems to be a recipe of sorts for generating new ideas, and this recipe can be applied by anyone in virtually any discipline, including academic research in computer science.

Where do psychology and cognitive science fit in? Well, to generate new ideas, we need to exercise our creativity, and as it turns out, creativity is fundamental and universal across people. According to the academic literature, creativity has two parts: originality, or the ability to connect completely different ideas together to generate new ones, and relevance, or the ability to generate ideas that actually matter to other people [Runco 2012]. If our research tasks cover both of these components, then we know we are producing high value knowledge work.

To summarize what we’ve covered so far: computer science research is a form of knowledge work; successful knowledge work leads to the dissemination of new ideas; research tasks are valuable if they help us generate new ideas; to generate new ideas, we need to be creative; and creativity has two components, originality and relevance. Putting everything together, high value research tasks should help us generate new ideas that are both original and relevant. Now, how can we structure academic research to generate original and relevant ideas?

Generating Original Ideas

Although we are focused on computer science research here, generating original ideas is a universal skill that can apply to any discipline, suggesting that we should look to not only computer scientists but also psychologists and cognitive scientists for answers. Across psychology and cognitive science there is a pretty consistent definition for an original idea. DeHaan provides an excellent summary: “A creative insight, then, is a sudden, unexpected recognition of concepts or facts in a new relation not previously seen.” [DeHaan 2011] Here, DeHaan’s “creative insight” is what I call a new idea. So to come up with an original idea, we just need to take two concepts that researchers wouldn’t normally put together, and connect them in an interesting way.

To shed more light on how this is generally done, I’ll share a slightly different perspective from data science research. In data science, specifically visualization, we study how data analysts discover original ideas as they explore and manipulate a dataset. Existing research posits that a person discovers an original idea by combining ambient information, or information extracted from their environment, with personal experience, or the data already in their brain, in a new way [Chang 2009, Gotz 2006, North 2006, Smuc 2009]. In data science, ambient information generally takes the form of tables, spreadsheets, visualizations such as bar charts and scatterplots, or even the output of machine learning models. However, ambient information is not limited to these data science-specific examples. It can be any information from your immediate environment, such as a quote from a paper, an idea posed in a podcast episode, a scene from a movie, and so on. That said, as I’m sure you’ve already experienced, most information you perceive in a day is not going to relate to your research. So finding interesting ambient information is akin to searching for a “needle in a haystack”: in general, it is a rare event.

Data science research reveals three important points on generating original ideas. First, we need a robust and structured knowledge base in our brains that we can use to connect to interesting bits of ambient information. Otherwise, ambient information will be useless because we will have nothing useful to connect it with. Second, we want to expose ourselves to large streams of ambient information in an efficient way. If we don’t, then we limit our opportunities for finding ambient information that actually connects to what we currently know. Third, we want structured mechanisms for plucking interesting bits of ambient information from these streams and connecting them to our existing knowledge base. If we do not have an efficient way of recording and organizing new connections, then we will forget them, lose them, or become overwhelmed by them.

Psychology and cognitive science add nuance to these three points by emphasizing the need for connecting disparate concepts. For example, I never would have fully understood the science behind creativity if I only ever read computer science papers or data science papers. I had to start reading psychology papers and cognitive science papers to gain the full context. Taking this example a step further, I only stumbled upon DeHaan’s definition of creativity while reading a book completely unrelated to my research: The Autistic Brain by Temple Grandin and Richard Panek [Grandin 2013]. What is even more interesting is that I probably never would have gotten the idea to study creativity in the first place if I had not read the book Where Good Ideas Come From by Steven Johnson [Johnson 2011].

Generating Relevant Ideas

As it turns out, in academia we have a built-in measure of relevance for our new ideas: published research papers. You know that phrase “pics or it didn’t happen”? Well, in academia you could say something similar: “published paper or it didn’t happen.” This is basically just another form of the famous saying “publish or perish” in academia. I admit that the academic review process has its limitations, but the principal benefit of a published research paper is that you have proof that your new idea has been vetted and deemed relevant by your own academic community. Even for graduate students in data science, a typical doctoral dissertation encompasses multiple papers’ worth of content. Thus, a tangible deliverable for most researchers is a published research paper describing an original idea. Of course, one can quibble with the details, but in computer science, a published conference paper is the standard unit of success for generating a new idea.

That being said, we may not want to wait until we’ve already written an entire paper to find out that our original idea isn’t relevant, or worse isn’t even a good idea! To avoid these worst case scenarios, I recommend getting regular feedback on your ideas at every stage of the research process, from the initial pitch for a new idea down to the full paper draft you submit to a conference. There are lots of different kinds of feedback you can receive, such as feedback on clarity (can someone understand my new idea as written?), viability (does this idea seem achievable and will it produce meaningful results?), audience (would this idea be considered valuable in my research community?), and narrative (is this idea presented in a compelling way?). For example, friends and family can assess clarity. Junior peers can assess clarity and audience. If your university has a dedicated writing center, the writing center can assess clarity and narrative. Research mentors, advisors, and senior peers can assess all of the above. I highly recommend getting into the habit of sharing written documents for feedback rather than just presenting ideas verbally, even in the early stages of a project. This one habit alone will dramatically speed up your paper writing and lead to more published research, as long as you stick to it.

Identifying High Value Research Tasks

Applying both components of creativity (originality and relevance) to research, we can say that high value research tasks increase our ability to publish research papers on original ideas. Taking things a step further, I break this idea down into four specific ways that high value tasks can improve our research performance:

  • The task helps us generate more ideas in less time
  • The task helps us prune bad ideas quickly
  • The task helps us turn ideas into papers faster
  • The task helps us publish papers faster

Now, how do we actually apply this definition of high value tasks to our daily research work? Start by asking yourself whether the task will help you achieve the four points above. To help with this, I reframe the points above as yes or no questions:

  • Will this task help me generate more ideas?
  • Will this task help me prune bad ideas?
  • Will this task help me turn my ideas into papers?
  • Will this task help me publish my papers?

In my experience, the value of a research task is proportional to the number of yeses. I consider a task high value when I answer yes to at least two of these questions. A task with four yeses is one of the most valuable tasks you can work on, and I would put it at the top of your to-do list. Keep in mind that to generate new ideas, we need to expose ourselves to steady streams of ambient information as well as build a robust knowledge base in our brains. To publish new ideas, we need to turn them into written documents that can be reviewed by family, friends, peers, mentors, advisors, and eventually conference paper reviewers.
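
For readers who like to see a rule of thumb written down, here is a minimal Python sketch of the yes-counting heuristic above. It is only an illustration, not a tool I actually use: the `Task` class, its field names, the `rank_tasks` helper, and the example tasks with their yes/no answers are all hypothetical, and the threshold of two yeses simply mirrors the rule stated in the previous paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A research task plus hypothetical answers to the four yes/no questions."""
    name: str
    helps_generate: bool = False  # Will this task help me generate more ideas?
    helps_prune: bool = False     # Will this task help me prune bad ideas?
    helps_write: bool = False     # Will this task help me turn my ideas into papers?
    helps_publish: bool = False   # Will this task help me publish my papers?

    def yes_count(self) -> int:
        # Each True counts as 1, so the sum is the number of yeses.
        return sum((self.helps_generate, self.helps_prune,
                    self.helps_write, self.helps_publish))

    def is_high_value(self, threshold: int = 2) -> bool:
        # Assumed rule from the text: high value at two or more yeses.
        return self.yes_count() >= threshold


def rank_tasks(tasks):
    """Return tasks ordered from most to fewest yeses (ties keep input order)."""
    return sorted(tasks, key=lambda t: t.yes_count(), reverse=True)


# Illustrative answers only; your own answers may differ.
tasks = [
    Task("Answer email"),
    Task("Read a non-fiction book and take notes", helps_generate=True),
    Task("Share a written draft for feedback",
         helps_prune=True, helps_write=True, helps_publish=True),
]

for task in rank_tasks(tasks):
    label = "high value" if task.is_high_value() else "low value"
    print(f"{task.yes_count()} yes(es), {label}: {task.name}")
```

The point of the sketch is just that the heuristic is cheap to apply: answer four questions, count the yeses, and sort your to-do list accordingly.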

Common High Value Tasks to Build Into Your Research Routine

Whether you are a professor helping a research mentee, or a student researcher looking for tips, here are some common tasks that I highly recommend everyone incorporate into their research routines. However, even the most seasoned researchers did not do all of these things starting out. If these are new to you, try to incorporate just one task at a time into your research routine. Keep the ones that give you good results, and drop the ones that don’t.

  • Share a written document with a peer, mentor, and/or advisor to get feedback on a new idea.
  • Brainstorm with peers or your research group to generate new ideas.
  • Synthesize a template for how to write a certain type of paper (or section of a paper) based on relevant papers from the top conferences in your research area.
    • Example: Create a template for how to write a great visual perception study paper, based on existing visual perception papers published at IEEE VIS, ACM CHI, and EuroVis.
    • Example: Create a template for how to write a great database benchmark paper, based on existing benchmarking papers published at SIGMOD and VLDB.
    • Example: Create a template for how to write a great introduction to a CHI paper, based on papers that have won the CHI best paper award or honorable mention award in the last five years.
  • Write up your current progress on your research project
    • Example: Write the introduction section
    • Example: As you read related papers, write up the relevance, significance, and limitations in a tentative “Related Work” section 
    • Example: If conducting experiments such as user studies or benchmarks, write the experiment protocol as a tentative “Experiment” or “Evaluation” section
    • Example: If building a system, write up the planned or implemented architecture for the system and add a snapshot of the current prototype.
    • Example: If analyzing data, create an RStudio, Jupyter, or Observable notebook to organize and track your analyses. Use markdown to record your reasoning and approach to your analysis methods as you go along. (I adopt a similar structure with code files and GitHub.)
  • Give a 10-15 minute presentation with slides on the motivation, approach, and results for your research project thus far.
  • Give feedback to a peer on their new ideas and/or writing
  • Review your notes for a project or topic that you haven’t been working on for a month or longer
  • Pick a major conference to submit your project to, and complete as many of the submission logistics as you can, as early as possible:
    • Example: Migrate your current paper draft into the official conference paper template.
    • Example: Write a tentative paper title and abstract
    • Example: Create a submission on the submission website, including the paper title, authors, abstract, and keywords
    • Example: Create an OSF project with an anonymized link for sharing supplemental materials
    • Example: Anonymize the current set of supplemental materials and upload them to OSF
  • Consume content from outside of your research area/discipline and brainstorm ways to incorporate it into your knowledge base
    • Example: Read a non-fiction book that interests you and write down any quotes that resonate with you.
    • Example: Attend a networking event and write down contacts for organizations, such as inspiring non-profits, that you may want to work with in the future
    • Example: Attend a distinguished lecture in another department, and brainstorm ways you can incorporate the speaker’s technical ideas and/or research approach into your own work.
  • Attend a weekly accountability group with peers where you share your professional goals and group members hold you accountable for making progress towards these goals every week.

Notice how revising and editing your own work is not listed as a high value task in itself. Revising is a necessary task, but it only becomes high value when done in preparation for more feedback. In other words, people have to see your work in order for it to get published, so revising only matters if someone else sees the changes. For this reason, always give yourself a revision deadline, even if you are just showing your work to a friend.

Common Low Value Tasks to Avoid

Some of these tasks are inevitable, but I try not to start my day with them, and I spend as little time on them as possible.

  • Reading and writing emails/social media posts.
  • Scheduling or attending meetings with no clear agenda and no clear goal
  • Scheduling or attending meetings that consistently fail to produce meaningful ideas, action items, and/or tangible writing deliverables.
  • Developing infrastructure and running experiments without writing down the objectives, rationale, and/or algorithms behind them in your paper draft first.
    • Example: writing code before writing the objectives, rationale, and algorithm/pseudocode in your paper draft or research notes beforehand.
    • Example: conducting an experiment without writing out the objective, hypotheses, protocol, and analysis methods beforehand (basically without pre-registering the experiment somehow).
  • Reading a research paper, attending a talk, or generally participating in some work event without taking any notes on how to incorporate what you learned into your knowledge base, such as:
    • How the techniques presented could be applied to your research project or research discipline in a new and interesting way
    • If it is directly relevant to your research, how your project provides something new over this work
    • What you like about how the material is presented, and specific steps to apply the same writing/presentation techniques to your current project
  • Passively consuming content without actually incorporating it into your knowledge base
    • Example: Listening to lots of podcasts or reading lots of books without taking any notes on what you’ve learned or find valuable. How will you keep track of what you’ve learned in the past so you can connect it with new ambient information?

Note that if you find yourself thinking “there’s nothing that I want to write down” during a task, then either you need to try harder to find something interesting to add to your knowledge base, or the task is low value and you probably shouldn’t be doing it.

General Habits to Boost Your Research Productivity

These habits have enabled me to build a strong foundation for my research success. Stay tuned for a future post on how productive personal habits can enhance professional performance!

  • Write everything down, preferably in a centralized note-taking application with an effective search feature.
  • Give yourself plausible but ambitious deadlines and commit to them, even if you are just showing your work to a friend.
  • Participate in hobbies and creative pursuits outside of work to further exercise your creativity during your free time.
  • Meditate. You can start with just 5 minutes a day.
  • Take frequent breaks so your mind can refresh and recharge for the work ahead. Just like how we need daily, weekly, quarterly, yearly, and lifetime goals, we also need daily, weekly, quarterly, yearly, and career (e.g., sabbatical) breaks.

Summary

Great research is innovative research. Creativity is the cornerstone of innovation, and creative ideas have two interlinked components: originality and relevance. Original ideas come from combining ambient information, or information from the environment, with your personal knowledge base, or the network of information in your brain, in an interesting way. Original research ideas are deemed relevant when they have been successfully vetted by the research community, initially by your peers and mentors but ideally by conference paper reviewers. Applying the components of creativity to computer science research, we can say that high value tasks enhance our ability to: (1) generate original research ideas, (2) prune the bad ones, (3) turn the good ones into research papers, and (4) get regular feedback on our written ideas prior to publication. High value tasks should cover at least one but ideally two or more of these points, and the more points that are covered, the higher the value of the task. I have also created infographics with concrete examples of high and low value tasks. Check out the infographics below for easy-to-remember highlights from this post, and please feel free to share them.

References

  • [Chang 2009] Chang, R., Ziemkiewicz, C., Green, T.M. and Ribarsky, W., 2009. Defining insight for visual analytics. IEEE Computer Graphics and Applications, 29(2), pp.14-17.
  • [DeHaan 2011] DeHaan, R.L., 2011. Teaching creative science thinking. Science, 334(6062), pp.1499-1500.
  • [Gotz 2006] Gotz, D., Zhou, M.X. and Aggarwal, V., 2006, October. Interactive visual synthesis of analytic knowledge. In 2006 IEEE Symposium On Visual Analytics Science And Technology (pp. 51-58). IEEE.
  • [Grandin 2013] Grandin, T. and Panek, R., 2013. The autistic brain: Thinking across the spectrum. Houghton Mifflin Harcourt.
  • [Johnson 2011] Johnson, S., 2011. Where good ideas come from: The natural history of innovation. Penguin.
  • [Newport 2016] Newport, C., 2016. Deep work: Rules for focused success in a distracted world. Hachette UK.
  • [North 2006] North, C., 2006. Toward measuring visualization insight. IEEE Computer Graphics and Applications, 26(3), pp.6-9.
  • [Runco 2012] Runco, M.A. and Jaeger, G.J., 2012. The standard definition of creativity. Creativity Research Journal, 24(1), pp.92-96.
  • [Smuc 2009] Smuc, M., Mayr, E., Lammarsch, T., Aigner, W., Miksch, S. and Gärtner, J., 2009. To score or not to score? Tripling insights for participatory design. IEEE Computer Graphics and Applications, 29(3), pp.29-38.