{"cells":[{"cell_type":"markdown","metadata":{},"source":["# Machine Learning Code\n","
\n","\n","```{jupyter-info}\n","{rel-data-download}`homes.csv`\n","```"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Before we begin, a word of caution about how to approach learning these libraries:\n","> Trying to memorize all of these function calls and patterns is a ridiculous task. We will throw a lot of new functions at you very quickly, and the intent is not for you to memorize them all. The more important thing is to understand how to use them as examples and **adapt** those examples to the problem you are trying to solve. \n",">\n","> The most important thing is to understand the big ideas we highlight about the code we are showing!\n","\n","This means we won't always be able to explain every bit of code. The purpose is to give you some examples that you can run for your own projects or homework, even if you don't have the hundreds of pages of documentation memorized (because no one actually does that!).\n","\n","\n","## Machine Learning\n","So far, everything has been awfully abstract and high level. In this notebook, we hope to make things more concrete by tying the terms introduced in the last slide to the specific pieces of code that actually train a model. In the next slide, we will introduce the specifics of what this model is learning and how it learns!\n","\n","For this notebook, we will use a dataset about homes in either San Francisco or New York[1]. This dataset has one row per house, with attributes such as which city it's in, its elevation, and the year it was built. \n","\n","> *[1] This data was provided by [R2D3](http://www.r2d3.us/visual-intro-to-machine-learning-part-1/) and has been slightly modified to fit our course. Please see [the original source](https://github.com/jadeyee/r2d3-part-1-data/blob/master/part_1_data.csv) for licensing information.*\n","\n","Now, let's load in this dataset and train a machine learning model to predict the city from the features! 
"]},{"cell_type":"code","execution_count":2,"metadata":{},"outputs":[{"data":{"text/html":["\n","
| | beds | bath | price | year_built | sqft | price_per_sqft | elevation | city |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 1.0 | 999000 | 1960 | 1000 | 999 | 10 | NY |
| 1 | 2.0 | 2.0 | 2750000 | 2006 | 1418 | 1939 | 0 | NY |
| 2 | 2.0 | 2.0 | 1350000 | 1900 | 2150 | 628 | 9 | NY |
| 3 | 1.0 | 1.0 | 629000 | 1903 | 500 | 1258 | 9 | NY |
| 4 | 0.0 | 1.0 | 439000 | 1930 | 500 | 878 | 10 | NY |
\n","
| | beds | bath | price | year_built | sqft | price_per_sqft | elevation |
|---|---|---|---|---|---|---|---|
| 0 | 2.0 | 1.0 | 999000 | 1960 | 1000 | 999 | 10 |
| 1 | 2.0 | 2.0 | 2750000 | 2006 | 1418 | 1939 | 0 |
| 2 | 2.0 | 2.0 | 1350000 | 1900 | 2150 | 628 | 9 |
| 3 | 1.0 | 1.0 | 629000 | 1903 | 500 | 1258 | 9 |
| 4 | 0.0 | 1.0 | 439000 | 1930 | 500 | 878 | 10 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 487 | 5.0 | 2.5 | 1800000 | 1890 | 3073 | 586 | 76 |
| 488 | 2.0 | 1.0 | 695000 | 1923 | 1045 | 665 | 106 |
| 489 | 3.0 | 2.0 | 1650000 | 1922 | 1483 | 1113 | 106 |
| 490 | 1.0 | 1.0 | 649000 | 1983 | 850 | 764 | 163 |
| 491 | 3.0 | 2.0 | 995000 | 1956 | 1305 | 762 | 216 |

492 rows × 7 columns
\n","