Prof. Henry Kautz henry.kautz@virginia.edu
People’s online behavior contains signals about their physical and mental health. This course will explore research on using data from users’ interactions with Twitter/X, Google Search, YouTube and other online platforms for tasks ranging from identifying people suffering from anxiety disorder to tracking down restaurants that are sources of food poisoning. We will also read papers on both sides of the ongoing debate about whether social media should be restricted because of potential harm to children or adults.
CS 2100 or permission of the instructor.
Students will be required to read 2 to 4 research papers each week and write a one-page summary of each. In addition, for each class session, two students will present their summaries and lead a discussion of the work. The written and verbal summaries should include:
The research hypothesis of the work
The methods employed
A critical discussion of the results, including strengths and weaknesses
Ethical concerns
Open questions raised but left unaddressed by the work
These summaries should be written manually by the students without using any AI writing or summarization tools.
The course includes programming projects in which students will download and analyze part of their own online footprint. The easiest data for students to access is their Google Takeout data, which includes their browsing history, location history, search history, and YouTube viewing history.
Students who already use Google services should consider turning on the “Timeline” feature of Google Maps immediately upon registering for the course (if it is not already on) in order to ensure that they have a rich set of location data to analyze. Students who do not use Google services should contact the professor to talk about what other kinds of online data about themselves they could access.
Please note that students’ data will not be shared with the professor or other students; in their project reports, students will only be expected to include summaries and visualizations of the data that they create themselves.
Both CS 4501 and CS 6501 students will complete the first two projects:
Use your Google Takeout data to infer your sleep patterns for at least a month, under the assumption that these can be determined by your lack of use of Google applications. Create visualizations of the data. Write a report of at least 1500 words discussing your methods, results, and insights gained.
Use your Google Takeout Map Timeline data to infer the significant locations in your life, that is, places you repeatedly visit over the course of a month, including your home, your classrooms, favorite restaurants, and similar places. Create visualizations of the data. Write a report of at least 1500 words discussing your methods, results, and insights gained.
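As a rough starting point for the two projects above, the sketch below shows one possible way to approach each inference in Python. It is not a required or recommended method. It assumes you have already parsed your Takeout export into plain Python values (a list of activity timestamps, and a list of latitude, longitude, and time triples); the parsing step is omitted because the export format is not described here. The function names, the 4-hour minimum gap, the 0.001-degree grid cell, and the 3-day visit threshold are all illustrative assumptions.

```python
from datetime import datetime, timedelta


def longest_nightly_gaps(timestamps: list[datetime], min_gap_hours: float = 4.0):
    """Map each date to the longest inactivity gap that ends on that date.

    Assumption: a long stretch with no recorded activity approximates sleep.
    Long daytime gaps (for example, a day spent offline) will be misread as sleep.
    """
    ordered = sorted(timestamps)
    best = {}
    for start, end in zip(ordered, ordered[1:]):
        gap = end - start
        if gap < timedelta(hours=min_gap_hours):
            continue  # too short to count as a sleep candidate
        day = end.date()  # attribute the gap to the day activity resumes
        if day not in best or gap > best[day][1] - best[day][0]:
            best[day] = (start, end)
    return best


def significant_places(points, cell_degrees: float = 0.001, min_days: int = 3):
    """Return grid cells visited on at least `min_days` distinct dates.

    `points` is an iterable of (latitude, longitude, datetime) triples.
    Snapping coordinates to a fixed grid is a crude stand-in for clustering.
    """
    days_per_cell = {}
    for lat, lon, when in points:
        cell = (round(lat / cell_degrees), round(lon / cell_degrees))
        days_per_cell.setdefault(cell, set()).add(when.date())
    return {
        (round(cell[0] * cell_degrees, 6), round(cell[1] * cell_degrees, 6)): len(days)
        for cell, days in days_per_cell.items()
        if len(days) >= min_days
    }
```

Both functions return plain dictionaries, so the results can be passed to whatever plotting library you prefer. Part of each report should discuss where simple assumptions like these break down for your own data.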
Only CS 6501 students will complete a third project:
Create a research hypothesis about the relationship between at least two different kinds of Google Takeout data. Design and carry out an experiment to test your hypothesis. Write a report of at least 2000 words discussing your methods, results, and insights gained.
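One simple way to test a hypothesis of this kind is a permutation test on the correlation between two daily measurements, for example hours of inferred sleep and number of searches per day. The sketch below is only an illustration under that assumption, not the expected design. The function names are hypothetical, and the test treats days as exchangeable, which real time-series data may violate.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Estimate how often shuffled data shows a correlation at least as strong."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)
```

A small p-value suggests the observed relationship is unlikely to arise from chance pairing alone; it does not by itself establish a causal link.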
Students are free – and even encouraged – to use AI tools such as GPT-4 or Copilot to help write code for their projects. It is also okay to make use of code found in public Git repositories. All help with coding must be acknowledged. Students may also use AI tools to help write their reports as long as such use is also acknowledged, but should be aware of the potential for AI writers to make mistakes; students will be held responsible for errors generated by an AI helper!
Using AI tools to create any paper summaries will be taken as academic dishonesty. Passing off someone else’s work as your own without acknowledgement will be taken as academic dishonesty. All cases of suspected academic dishonesty will be referred to the UVA Honor Office.
Grades will be based on:
25% written paper summaries
25% paper presentations and discussion leadership
50% programming projects and reports