You work for an NFL (or NCAA football) team as a data scientist and receive the following email:

Subject: Win Probability

Welcome to the analytics team! It’s time to get started. We want to use a new framework for evaluating the effectiveness of play calling this season. We aren’t sure what the best approach is and believe you can help.

For the first part of your report you will:

1. Create a win probability model that will help us determine the impact of each play we run. For starters, include down, distance, yard line, time left in the game, and point differential.
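
One way to start on this item is a logistic regression on the five listed inputs. The sketch below is only an illustration under stated assumptions: it trains on synthetic plays generated in the code (not real play-by-play data), the feature order and the helper names (make_synthetic_plays, fit_logistic, win_probability) are hypothetical, and the gradient-descent fit is written out by hand so that no external libraries are needed. A real model would be trained on historical plays and would likely need interactions between time left and point differential.

```python
import math
import random

# Feature order (an assumption for this sketch):
# [down, distance, yard line (yards from opponent end zone),
#  seconds left in game, point differential]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_synthetic_plays(n, seed=0):
    """Generate made-up plays; the 'true' win rule here is invented."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        play = [rng.randint(1, 4), rng.randint(1, 15), rng.randint(1, 99),
                rng.randint(0, 3600), rng.randint(-21, 21)]
        p_win = sigmoid(0.15 * play[4] - 0.01 * (play[2] - 50))
        rows.append(play)
        labels.append(1 if rng.random() < p_win else 0)
    return rows, labels

def column_stats(rows):
    """Per-feature mean and standard deviation, used for standardizing."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(k)]
    return means, stds

def fit_logistic(rows, labels, means, stds, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    k, n = len(means), len(rows)
    X = [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(X, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(k):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b

def win_probability(play, w, b, means, stds):
    """Predicted win probability for one play in the feature order above."""
    z = b + sum(wj * (v - m) / s for wj, v, m, s in zip(w, play, means, stds))
    return sigmoid(z)

rows, labels = make_synthetic_plays(1500, seed=1)
means, stds = column_stats(rows)
weights, bias = fit_logistic(rows, labels, means, stds)

# 1st and 10 at midfield with 30 minutes left, up 14 versus down 14.
ahead = win_probability([1, 10, 50, 1800, 14], weights, bias, means, stds)
behind = win_probability([1, 10, 50, 1800, -14], weights, bias, means, stds)
print(round(ahead, 3), round(behind, 3))
```

Standardizing the inputs keeps the hand-written gradient descent stable even though seconds left and down are on very different scales.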

Next, we are concerned that the model from item 1 may not accurately reflect the impact of each play.

2. Use the model you created to describe one game from the previous season. Include a plot with time left in the game on the x-axis and win probability on the y-axis. Describe and explain large shifts in win probability during the game, focusing on the specific plays and/or drives that best explain these shifts.
3. In addition, discuss which aspects of the model may be limiting and how you would advance this model in the future. Draw from course readings and cite sources where appropriate. What other applications are there for this model? Could we determine when to punt or go for it based on this model?
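
The two items above can be sketched in a few lines once a game has per-play win probabilities. The code below is a hedged illustration, not a prescribed method: the function names (win_probability_added, go_for_it_edge) are hypothetical, every number in the example is made up, and all win probabilities are assumed to be from the same team's perspective. The first function ranks plays by how much win probability changed after them; the second compares the expected win probability of going for it on fourth down against punting.

```python
def win_probability_added(plays):
    """Rank plays by the size of the win probability change they produced.

    plays: list of (seconds_left, wp_before_play, description) in game order.
    The change credited to a play is the next play's starting win probability
    minus its own, so the final play in the list gets no entry.
    """
    shifts = []
    for current, following in zip(plays, plays[1:]):
        seconds_left, wp_before, description = current
        wp_after = following[1]
        shifts.append((wp_after - wp_before, seconds_left, description))
    return sorted(shifts, key=lambda s: abs(s[0]), reverse=True)

def go_for_it_edge(p_convert, wp_if_convert, wp_if_fail, wp_if_punt):
    """Expected win probability of going for it minus that of punting.

    A positive value favors going for it under these assumed inputs.
    """
    wp_go = p_convert * wp_if_convert + (1.0 - p_convert) * wp_if_fail
    return wp_go - wp_if_punt

# Made-up example values, for illustration only.
example_plays = [
    (900, 0.50, "1st & 10, run for 3"),
    (860, 0.48, "2nd & 7, long touchdown pass"),
    (850, 0.70, "kickoff"),
    (845, 0.69, "1st & 10, incomplete"),
]
ranked = win_probability_added(example_plays)
biggest_shift = ranked[0]

edge = go_for_it_edge(p_convert=0.5, wp_if_convert=0.60,
                      wp_if_fail=0.40, wp_if_punt=0.48)
print(biggest_shift[2], edge > 0)
```

The (seconds_left, wp_before_play) pairs are also what the requested plot needs on its x-axis and y-axis.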

Finally, we want to determine other applications for analytics going forward.

4. Pick a chapter (18–27) from Mathletics and provide a brief description of the topic discussed. Explain how this will help our organization in the future and include analysis and data from last year (or several years).

Deliverables and File Formats

The following files should be included in an archive folder/directory that is uploaded to Canvas as a single zip-compressed file. Note: Use ZIP.

1. Provide a double-spaced paper with a 10-page maximum. The paper should include responses to all four questions. Visuals, references, and appendices do not count toward the page limit.
2. Input data in the same format used by the analysis program, for example a .csv file read by the read.csv() function in R.
3. Complete program code used to access and analyze the data. Note: You can use R or Python for this assignment.

Data Collection

Data can be acquired through the nflfastR package in R. Documentation can be found on the package website, "An R package to quickly obtain clean and tidy NFL play by play data • nflfastR".