
Credit: Martin Grandjean

Milestone 1: Group formation and initial questions

Gregory L. Nelson (adapted from materials made with Benji Xie and Andy Ko)

The first step of any software project is defining why the project is happening at all. What problem is it solving? Why have the specific people on a team come together to solve it? What makes them the right people to solve it? Do they have the skills to solve it? In this homework, you're going to answer these questions, designing your organization intentionally.

Step 1: Creating a team

Most teams come together around trust first and then choose a problem. This is because trusting relationships are necessary for collaboration: they provide the psychological safety (Edmondson 1999) necessary for risk taking, feedback, and open communication. Because of this, we're going to form teams first, and then identify problems.

Your team must be 3 to 6 people. (We want the teams large enough that you encounter communication and coordination complexities that reflect real teams in practice.) Because of the size of this class, you're not going to be able to form the exact team you might want. You're also going to have to interview each other to assess the potential for trusting relationships.

Here's the process to follow

Next, choose a spirit animal for your team. You will name your team after your spirit animal.

Step 2: Create group infrastructure

Your team needs infrastructure to effectively communicate and collaborate.

Step 3: Identify 3 potential data science domains (and decisions)

When doing data science in practice, you will co-evolve a framing (as a question or as a full Decision) with domain experts and other stakeholders. To do that, you need a basic familiarity with a domain, in order to talk with an expert in that domain. Gaining domain expertise is an open-ended task, so how do you do that? To help provide some structure, I will give you a set of questions to guide you in forming domain understanding. It is in the format of a spreadsheet, with sub-goals (why you are answering those questions), and questions underneath those sub-goals.

The goal of this part of the assignment is to 1) gain practice with domain understanding and framing skills (using our Decision concept / Decision Theory), 2) find domains and decisions that interest and motivate you to do your class project, and 3) practice getting feedback and iterating. To accomplish 1) above, I discourage you from simply dividing and conquering by assigning 1-2 people to each domain; each of you should at least give feedback on the domains you did not research yourself.

We fully expect this step to take more time than we have in class. Discuss, debate, and deliberate outside of class to arrive at a problem you're all excited about solving this quarter. Try to ideate decisions you can advise meaningfully within the scope of this quarter (and I will give you feedback). Be careful to avoid jumping to / focusing on a single solution from the beginning.

Because we'll have limited time in this class to do the project, here are a few constraints on the decisions you choose:

Create a page on your GitHub organization's wiki, and title the wiki page "Potential Domains and Decisions". On this page, first write three short summary paragraphs, one for each of your 3 potential domains. Each should describe:

  1. Goals:
    1. What are you trying to use data science to do?
    2. Who are the people / stakeholders in your domain?
    3. What do people/stakeholders in the domain want to achieve?
    4. Why is it important?
  2. Decisions: What specific decisions might you inform?
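If it helps to start from a consistent layout, here is a minimal sketch that prints a Markdown skeleton for the wiki page (GitHub wikis accept Markdown). The function name `build_skeleton`, the placeholder domain names, and the `TODO` markers are all assumptions for illustration, not part of the assignment. The prompts appear as bullets only as reminders; the final page should still read as one short summary paragraph per domain.

```python
# Hypothetical helper: prints a Markdown skeleton for the wiki page.
# Domain names passed in are placeholders, not suggestions from the assignment.
PAGE_TITLE = "Potential Domains and Decisions"

# The four "Goals" prompts, copied from the assignment text.
GOAL_QUESTIONS = [
    "What are you trying to use data science to do?",
    "Who are the people / stakeholders in your domain?",
    "What do people/stakeholders in the domain want to achieve?",
    "Why is it important?",
]


def build_skeleton(domains):
    """Return Markdown text with one section per potential domain."""
    if len(domains) != 3:
        raise ValueError("the assignment asks for exactly 3 potential domains")
    lines = [f"# {PAGE_TITLE}", ""]
    for name in domains:
        lines += [f"## {name}", "", "**Goals**", ""]
        lines += [f"- {question} TODO" for question in GOAL_QUESTIONS]
        lines += [
            "",
            "**Decisions**",
            "",
            "- What specific decisions might you inform? TODO",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Replace the placeholders with your team's three domains.
    print(build_skeleton(["Domain A", "Domain B", "Domain C"]))
```

Paste the printed text into a new wiki page and replace each `TODO` as your team's understanding of the domain develops.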

On the page, include a link to a longer Google document with 3 sections, one on each decision; each section should start with the corresponding summary paragraph from the GitHub wiki page. You will fill in this spreadsheet template, to help organize your thinking. This will involve researching the domains and summarizing them using a concept diagram, a written description, and any other ways of communicating the domain you think are useful. It will also involve a causal diagram.

Grading Criteria

For homework credit, you will have updated your team GitHub repository's wiki as stated in Step 2.

Your shared GitHub space will be graded on the following scale:

Further reading

Ko, A. (2017). What makes a good research question?

Edmondson, A. (1999). Psychological safety and learning behavior in work teams. Administrative Science Quarterly, 44(2), 350-383.