
Help with data analysis in python

Hello, I am struggling with an assignment that uses Python for data analysis. We are allowed to use pandas, but I had a problem displaying the data with pandas, so I continued the code without it. I am having trouble extracting several fields from the data (I have to get the year, northern latitude, and name of a hurricane), as well as accessing certain strings in the data.

Here is the text file with the data: https://www.nhc.noaa...2017-050118.txt

Here is the .ipynb file, which contains code to check your answers and the code for reading the file, and shows how to access its records.
http://www.cis.umass...8fa/a1/a1.ipynb

This is my code so far for problems 1-3, but I have no idea how to access the records for problems 4 and 5.

Problem 1: Unique Hurricanes
    names = set()
    for record in records:
        # Take the name field from the header line and strip the surrounding whitespace.
        # (str.strip() returns a new string, so the result must be assigned.)
        name = record[0].split(',')[1].strip()
        # Skip unnamed storms; the set keeps only unique names.
        if name != 'UNNAMED':
            names.add(name)
    # The answer is the number of unique hurricane names.
    answer = len(names)
    print(answer)
Problem 2 (most common hurricane name):
    from collections import Counter

    names = []
    for record in records:
        # Take the name field from the header line and strip the surrounding whitespace.
        name = record[0].split(',')[1].strip()
        # Skip unnamed storms; keep every other name, duplicates included.
        if name != 'UNNAMED':
            names.append(name)
    # most_common(1) returns [(name, count)]; compute it once, after the loop.
    answer = Counter(names).most_common(1)[0][0]
    print(answer)
Problem 3 (year with the most hurricanes):
    from collections import Counter

    years = []
    for record in records:
        # The first header field is the storm ID; its last four characters are the year.
        storm_id = record[0].split(',')[0]
        years.append(storm_id[-4:])
    # most_common(1) returns [(year, count)] for the year with the most hurricanes.
    answer = Counter(years).most_common(1)[0][0]
    print(answer)
I need help with these:

4. Most Northerly Hurricane (10 pts)
Write code that computes the hurricane that went furthest north as measured by the greatest latitude. You need to find the name and the year of the hurricane.
Hints:
Check the documentation to find where the latitude is recorded.
You will need to go through the tracking points to check all of the latitude points recorded.
You need to keep track of three things: the maximum latitude seen so far, plus the name and year of the corresponding hurricane.
The latitude has an N character appended to indicate the northern hemisphere. This needs to be removed to do numeric comparisons.
You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.
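A minimal sketch of the approach these hints describe, not the assignment's reference solution. It assumes each record is a list of strings whose first element is the header line (as in the code for problems 1-3) and whose remaining elements are tracking lines with the latitude in the fifth comma-separated field; both assumptions should be checked against the notebook and the format documentation. The sample_records values and the most_northerly helper are made up for illustration.

```python
# Made-up sample data for illustration only; the real `records` comes from the notebook.
# Assumed layout: record[0] is the header line, record[1:] are tracking lines.
sample_records = [
    ["AL992000,              ALPHA,      2,",
     "20000101, 0000,  , TS, 10.0N,  50.0W,  35, 1005,",
     "20000101, 0600,  , TS, 12.5N,  51.0W,  45, 1000,"],
    ["AL992001,               BETA,      3,",
     "20010101, 0000,  , HU, 30.0N,  60.0W,  70,  985,",
     "20010101, 0600,  , HU, 45.5N,  58.0W,  65,  990,",
     "20010101, 1200,  , TS, 41.0N,  55.0W,  50,  995,"],
]

LATITUDE_FIELD = 4  # assumption: latitude is the fifth comma-separated field


def most_northerly(records):
    """Return (name, year, latitude) for the tracking point with the greatest latitude."""
    max_lat = float("-inf")
    best_name = None
    best_year = None
    for record in records:
        header = record[0].split(",")
        year = header[0].strip()[-4:]   # last four characters of the storm ID
        name = header[1].strip()
        for line in record[1:]:
            lat_text = line.split(",")[LATITUDE_FIELD].strip()
            lat = float(lat_text[:-1])  # drop the trailing hemisphere letter
            if lat_text.endswith("S"):  # treat southern latitudes as negative
                lat = -lat
            if lat > max_lat:
                max_lat, best_name, best_year = lat, name, year
    return best_name, best_year, max_lat


print(most_northerly(sample_records))
```

With the real data, the same function would be called on `records` instead of the sample list.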
5. Hurricane with Maximum Sustained Wind (10 pts)
Write code that determines the hurricane with the highest sustained windspeed. You need to find the name, year, and wind speed for this hurricane.
Hints:
Check the documentation to find where the wind speed is recorded.
You will need to go through the tracking points to check all of the wind speeds recorded.
You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.
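The same pattern could be sketched for the wind speed. This again assumes each record is a list of strings with the header line first, and that the maximum sustained wind is the seventh comma-separated field of each tracking line; verify the field position against the format documentation. The sample_records values and the strongest_wind helper are hypothetical.

```python
# Made-up sample data for illustration only; the real `records` comes from the notebook.
sample_records = [
    ["AL992000,              ALPHA,      2,",
     "20000101, 0000,  , TS, 10.0N,  50.0W,  35, 1005,",
     "20000101, 0600,  , HU, 12.5N,  51.0W, 120,  940,"],
    ["AL992001,               BETA,      3,",
     "20010101, 0000,  , TS, 30.0N,  60.0W,  60,  990,",
     "20010101, 0600,  , HU, 45.5N,  58.0W,  90,  970,",
     "20010101, 1200,  , TS, 41.0N,  55.0W,  40,  995,"],
]

WIND_FIELD = 6  # assumption: wind speed is the seventh comma-separated field


def strongest_wind(records):
    """Return (name, year, wind) for the tracking point with the highest wind speed."""
    max_wind = float("-inf")
    best_name = None
    best_year = None
    for record in records:
        header = record[0].split(",")
        year = header[0].strip()[-4:]   # last four characters of the storm ID
        name = header[1].strip()
        for line in record[1:]:
            # int() tolerates the padding spaces around the number
            wind = int(line.split(",")[WIND_FIELD])
            if wind > max_wind:
                max_wind, best_name, best_year = wind, name, year
    return best_name, best_year, max_wind


print(strongest_wind(sample_records))
```

If several tracking points tie for the maximum, this sketch keeps the first one it encounters.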

I tried using pandas, but when I read the data my own way and try to create a DataFrame, I get this error:
"list indices must be integers or slices, not str"

Here is the code for my failed implementation:
    import pandas as pd

    hurricane_storm_dfs = []
    for storm_dict in hurricane_storms_r:
        # NOTE: indexing with 'header' and 'data' only works if storm_dict is a dict;
        # if it is a list, this line raises the TypeError quoted above.
        storm_id, storm_name, storm_entries_n = (
            part.strip() for part in storm_dict['header'].split(",")[:3])
        # remove the trailing newline ('\n'), then split and strip the fields
        data = [[entry.strip() for entry in datum[:-1].split(",")]
                for datum in storm_dict['data']]
        frame = pd.DataFrame(data)
        frame['id'] = storm_id
        frame['name'] = storm_name
        hurricane_storm_dfs.append(frame)
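One possible cause of the error, sketched under the assumption that each storm is a plain list (header line first, tracking lines after) and not a dict: indexing a list with a string key such as 'header' raises exactly this TypeError. The storm values and the storm_rows helper below are made up for illustration.

```python
# Made-up storm in the assumed list layout: header line first, then tracking lines.
storm = [
    "AL992000,              ALPHA,      2,\n",
    "20000101, 0000,  , TS, 10.0N,  50.0W,  35, 1005,\n",
    "20000101, 0600,  , TS, 12.5N,  51.0W,  45, 1000,\n",
]

# Indexing a list with a string key reproduces the error message.
key = "header"
try:
    storm[key]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)


def storm_rows(storm):
    """Return one list of cleaned fields per tracking line, prefixed by id and name."""
    # Use integer positions, not string keys, because storm is a list.
    storm_id, storm_name = (part.strip() for part in storm[0].split(",")[:2])
    rows = []
    for line in storm[1:]:
        # Drop the newline and the trailing comma before splitting.
        fields = [f.strip() for f in line.strip().rstrip(",").split(",")]
        rows.append([storm_id, storm_name] + fields)
    return rows


rows = storm_rows(storm)
print(rows[0])
```

A list of rows like this can then be passed to pd.DataFrame(rows) if pandas is preferred for the remaining analysis.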
Sep 28 '18 #1