Use the data set from the Data Expo 2009
http://stat-computing.org/dataexpo/2009/
to answer the following questions.
A description of the data is given here:
http://stat-computing.org/dataexpo/2009/the-data.html
All of the data is already downloaded for you on the llc server here:
/data/public/dataexpo2009/
Please save your solutions in a file named after your group number. For instance, group 3 would use:
/proj/gpproj16/p01g3/p01g3.txt
1. How many distinct airport codes appear:
1a. In the origin column?
1b. In the destination column?
1c. Altogether?
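Hint: one possible approach to question 1, shown here on a tiny made-up file rather than the real data. This is only a sketch. It assumes that Origin is field 17 and Dest is field 18 of each comma-separated line, as listed in the data description; the helper make_row and the sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one 29-field row given carrier, tailnum, origin, dest.
make_row() {
  printf '1995,11,3,5,NA,NA,NA,NA,%s,1,%s,NA,NA,NA,NA,NA,%s,%s,NA,NA,NA,0,NA,0,NA,NA,NA,NA,NA\n' "$1" "$2" "$3" "$4"
}

sample=$(mktemp)
{
  echo "Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay"
  make_row AA N1 IND ORD
  make_row AA N1 ORD IND
  make_row DL N2 ATL IND
  make_row DL N2 IND LAX
} > "$sample"

# tail -n +2 drops the header line so "Origin" and "Dest" are not counted as codes.
origins=$(tail -n +2 "$sample" | cut -d, -f17 | sort -u | wc -l)
dests=$(tail -n +2 "$sample" | cut -d, -f18 | sort -u | wc -l)
# Put both columns into one stream, one code per line, before deduplicating.
both=$(tail -n +2 "$sample" | cut -d, -f17,18 | tr ',' '\n' | sort -u | wc -l)

echo "origins: $origins  destinations: $dests  altogether: $both"
rm -f "$sample"
```

When several yearly files are combined, each file contributes its own header line, so the header text would need to be filtered out in some other way than tail -n +2.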
2a. Use cut (it is always OK to use other commands in tandem, if needed) to find how many flights arrive at or depart from IND. Hint: You might need more than 1 line of code for this problem. It might be helpful to try your code on individual years before you try it on the entire data set.
2b. For a much faster solution, use grep to see how many flights arrive at or depart from IND.
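Hint: the two approaches in question 2 might look roughly like the sketch below, again on a made-up file. It assumes Origin and Dest are fields 17 and 18, and that the text ,IND, does not occur in any other column; make_row and the sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one 29-field row given carrier, tailnum, origin, dest.
make_row() {
  printf '1995,11,3,5,NA,NA,NA,NA,%s,1,%s,NA,NA,NA,NA,NA,%s,%s,NA,NA,NA,0,NA,0,NA,NA,NA,NA,NA\n' "$1" "$2" "$3" "$4"
}

sample=$(mktemp)
{
  make_row AA N1 IND ORD
  make_row AA N1 ORD IND
  make_row DL N2 ATL IND
  make_row DL N2 IND LAX
  make_row DL N3 ATL LAX
} > "$sample"

# Approach with cut: keep only the Origin and Dest columns, then count
# the lines in which IND appears as a whole word.
with_cut=$(cut -d, -f17,18 "$sample" | grep -c -w 'IND')

# Approach with grep alone: search the raw lines for IND between commas.
with_grep=$(grep -c ',IND,' "$sample")

echo "cut: $with_cut  grep: $with_grep"
rm -f "$sample"
```

Note that grep -c reports one count per file when given several files, so for the whole data set the lines would have to be combined first (for example with cat) or the per-file counts added up.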
3. How many tailnums have the following kinds of errors:
3a. Equal to NA?
3b. Equal to 0?
3c. Equal to 000000?
3d. Containing the phrase NKNO as part of the tailnum?
3e. The tailnum is blank, i.e., consisting only of zero or more whitespace characters but no other content?
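Hint: the difference between "equal to" and "part of" in question 3 can be expressed with grep's -x option, which requires the whole line to match. The sketch below uses a made-up file and assumes TailNum is field 11; make_row and the sample tailnums (including UNKNOW) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one 29-field row given carrier, tailnum, origin, dest.
make_row() {
  printf '1995,11,3,5,NA,NA,NA,NA,%s,1,%s,NA,NA,NA,NA,NA,%s,%s,NA,NA,NA,0,NA,0,NA,NA,NA,NA,NA\n' "$1" "$2" "$3" "$4"
}

sample=$(mktemp)
tails=$(mktemp)
{
  make_row AA NA IND ORD
  make_row AA NA ORD IND
  make_row AA 0 ORD IND
  make_row DL 000000 ATL IND
  make_row DL UNKNOW IND LAX
  make_row DL "" ATL LAX
  make_row DL " " ATL LAX
  make_row DL N528 ATL LAX
} > "$sample"

# Extract the tailnum column once and reuse it for every check.
cut -d, -f11 "$sample" > "$tails"

n_na=$(grep -c -x 'NA' "$tails")             # exactly NA
n_zero=$(grep -c -x '0' "$tails")            # exactly 0 (does not match 000000)
n_zeros=$(grep -c -x '000000' "$tails")      # exactly 000000
n_nkno=$(grep -c 'NKNO' "$tails")            # NKNO anywhere in the tailnum
n_blank=$(grep -c -x '[[:space:]]*' "$tails") # empty or whitespace only

echo "NA: $n_na  0: $n_zero  000000: $n_zeros  NKNO: $n_nkno  blank: $n_blank"
rm -f "$sample" "$tails"
```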
4. Which 10 planes took the most flights overall?
5a. How many flights did the airplane with tailnum N528 make altogether?
5b. What was the largest number of flights that this airplane ever made during a single day?
5c. How many flights did this airplane make on November 3, 1995?
6. How many flights has each airline had departing from each (Origin) airport? That is, give a list of all pairs of (Origin) airports and airlines, with the associated counts. Please sort from the highest count to the lowest count. This question should give you some insight about which airports are hubs for which airlines.
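Hint: counting pairs, as in question 6, is often done with the sort | uniq -c | sort -nr pattern. The sketch below shows it on a made-up file. It assumes UniqueCarrier is field 9 and Origin is field 17; make_row and the sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one 29-field row given carrier, tailnum, origin, dest.
make_row() {
  printf '1995,11,3,5,NA,NA,NA,NA,%s,1,%s,NA,NA,NA,NA,NA,%s,%s,NA,NA,NA,0,NA,0,NA,NA,NA,NA,NA\n' "$1" "$2" "$3" "$4"
}

sample=$(mktemp)
{
  make_row AA N1 IND ORD
  make_row AA N1 IND LAX
  make_row AA N1 ORD IND
  make_row DL N2 ATL IND
  make_row DL N2 ATL LAX
  make_row DL N3 ATL ORD
} > "$sample"

# cut keeps the carrier and origin columns; sort groups identical pairs together,
# uniq -c counts each group, and sort -nr orders the counts from highest to lowest.
pairs=$(cut -d, -f9,17 "$sample" | sort | uniq -c | sort -nr)
echo "$pairs"

# The most frequent (carrier, origin) pair, with its count.
top=$(echo "$pairs" | head -n 1 | awk '{print $1, $2}')
rm -f "$sample"
```

On the real files the header line of each file would also show up as a pair, so it should be removed before counting.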
7. How many airplane flights occur per year?