Practice: More pandas
Jupyter Notebooks
Reminder: the Jupyter Notebooks on this site are read-only, so you can’t interact with them. Click the button above to launch an interactive version of this notebook.
With Binder, you get a temporary Jupyter Notebook website that opens with this notebook. Any code you write will be lost when you close the tab. Make sure to download the notebook so you can save it for later!
With Colab, the notebook opens in Google Colaboratory. You can save the notebook there to your Google Drive. If you don’t save to your Drive, any code you write will be lost when you close the tab. You can find the data files for this notebook below:
You will need to run all the cells of the notebook to see the output. You can do this by pressing Shift-Enter on each cell or by clicking the “Run All” button above.
For these problems, we will be using a dataset of UFO sightings, loaded from ufos.csv in the cell below.
import pandas as pd
df = pd.read_csv('ufos.csv')
df.head() # Method to only display the first few rows
Problem 0#
Compute the average duration ('duration (seconds)') for each UFO shape ('shape').
For testing purposes, store the result in a variable called ans0.
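As a warm-up, the group-then-aggregate pattern this problem calls for can be sketched on a small made-up table. The DataFrame `toy` and its columns `color` and `weight` are hypothetical and are not part of the UFO dataset; only the shape of the computation carries over.

```python
import pandas as pd

# Hypothetical toy data (NOT the UFO dataset), used only to show the pattern
toy = pd.DataFrame({
    'color': ['red', 'blue', 'red', 'blue'],
    'weight': [1.0, 2.0, 3.0, 6.0],
})

# Group rows by 'color', select the 'weight' column, then average within each group
mean_weight = toy.groupby('color')['weight'].mean()
print(mean_weight)
```

The result is a Series indexed by the group labels, with one average per group.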
# Write your code here!
Problem 1#
For this problem, we will provide you with some buggy starter code. Your job is to fix it so it meets the specification below!
Write code to compute the longest duration UFO sighting ('duration (seconds)') for each city ('city') that is in a “positive location”. A “positive location” is one where at least one of its latitude ('latitude') or longitude ('longitude') is greater than 0.
For testing purposes, store the result in a variable called ans1.
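One thing that may help when reading the specification above: combining two conditions on a DataFrame has a precedence pitfall. In Python, `|` binds more tightly than `>`, so each comparison needs its own parentheses. The sketch below uses a hypothetical table `toy` with made-up columns `x` and `y`, not the UFO data.

```python
import pandas as pd

# Hypothetical toy data (NOT the UFO dataset)
toy = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'x': [-1, 2, -3, 4],
    'y': [5, -6, -7, 8],
})

# Parenthesize each comparison: without the parentheses,
# Python would evaluate `0 | toy['y']` first
mask = (toy['x'] > 0) | (toy['y'] > 0)

# The mask has one entry per row of `toy`, so it filters rows of `toy` itself
kept = toy[mask]
print(kept['name'].tolist())
```

Note that a row mask like this lines up with the rows of the table it was built from, which is worth keeping in mind when deciding whether to filter before or after grouping.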
# Fix this code!
ans1 = df.groupby('city')['duration (seconds)'].max()[df['latitude'] > 0 | df['longitude'] > 0]
ans1
Problem 2#
Find the name of the city ('city') that has the longest total duration of UFO sightings. Use the column 'duration (seconds)' to compute the total duration of all UFO sightings in each city.
For testing purposes, store the result in a variable called ans2.
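Finding the label of the largest group total usually takes two steps: aggregate per group, then ask which index label holds the maximum. A minimal sketch on a hypothetical table (`toy`, `team`, and `points` are made-up names, not from the UFO dataset):

```python
import pandas as pd

# Hypothetical toy data (NOT the UFO dataset)
toy = pd.DataFrame({
    'team': ['x', 'y', 'x', 'z'],
    'points': [3, 10, 4, 5],
})

# Step 1: total points per team (a Series indexed by team)
totals = toy.groupby('team')['points'].sum()

# Step 2: idxmax returns the index label of the largest value, not the value itself
best_team = totals.idxmax()
print(best_team)
```

Here `max()` would return the largest total, while `idxmax()` returns the name attached to it.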
# Write your code here!
Problem 3#
Compute how many words are in each comment ('comments'). Your result should be a Series of the same length as df that has the number of words in each comment as its values. As in previous word-counting problems, we are just looking for the number of sequences of characters separated by whitespace.
For testing purposes, store the result in a variable called ans3.
Hint: How would you do this for a single string 'I love dogs!'?
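Following the hint, here is one possible sketch: count the words in a single string first, then apply the same idea element-wise through the `.str` accessor. The Series `phrases` is a hypothetical stand-in, not the 'comments' column.

```python
import pandas as pd

# Single string: split on whitespace, then count the pieces
sentence = 'I love dogs!'
num_words = len(sentence.split())
print(num_words)

# Hypothetical Series of strings (NOT the 'comments' column)
phrases = pd.Series(['I love dogs!', 'hello', 'one  two   three four'])

# .str.split() with no arguments splits each string on runs of whitespace;
# .str.len() then gives the length of each resulting list
word_counts = phrases.str.split().str.len()
print(word_counts.tolist())
```

If a column contains missing values, these `.str` methods propagate NaN for those rows rather than raising an error.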
# Write your code here!