Commit e890690

committed Nov 30, 2022
adding sprint challenge
1 parent 47a4038 commit e890690

1 file changed

+1351
-0
lines changed

@@ -0,0 +1,1351 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {
6+
"deletable": false,
7+
"editable": false,
8+
"id": "j4EFYnn4MixR",
9+
"nbgrader": {
10+
"cell_type": "markdown",
11+
"checksum": "2dee446e1f8055f6ba536709ccbcb62c",
12+
"grade": false,
13+
"grade_id": "cell-2e05cbe003d95447",
14+
"locked": true,
15+
"schema_version": 3,
16+
"solution": false,
17+
"task": false
18+
}
19+
},
20+
"source": [
21+
"## Sprint Challenge: Data Wrangling and Storytelling\n",
22+
"### Notebook points total: 14\n"
23+
]
24+
},
25+
{
26+
"cell_type": "markdown",
27+
"metadata": {
28+
"deletable": false,
29+
"editable": false,
30+
"nbgrader": {
31+
"cell_type": "markdown",
32+
"checksum": "09ff81547a7ea5678875118ce6caa95d",
33+
"grade": false,
34+
"grade_id": "cell-5821095ecfb57ce4",
35+
"locked": true,
36+
"schema_version": 3,
37+
"solution": false,
38+
"task": false
39+
}
40+
},
41+
"source": [
42+
"## Python Fundamentals"
43+
]
44+
},
45+
{
46+
"cell_type": "markdown",
47+
"metadata": {
48+
"deletable": false,
49+
"editable": false,
50+
"nbgrader": {
51+
"cell_type": "markdown",
52+
"checksum": "dc8877d44d9b265b9d66a579582b88cd",
53+
"grade": false,
54+
"grade_id": "cell-b5f9b60ba324a5b0",
55+
"locked": true,
56+
"schema_version": 3,
57+
"solution": false,
58+
"task": false
59+
}
60+
},
61+
"source": [
62+
"**Task 1** - Python Objects\n",
63+
"* Create a list object called `list_practice` using the following three strings: `bloom`, `data`, `python` "
64+
]
65+
},
66+
{
67+
"cell_type": "code",
68+
"execution_count": null,
69+
"metadata": {
70+
"deletable": false,
71+
"nbgrader": {
72+
"cell_type": "code",
73+
"checksum": "2c2a3c1d8e5f4ea0a783bfa9d00218e7",
74+
"grade": false,
75+
"grade_id": "cell-6ee0685279505899",
76+
"locked": false,
77+
"schema_version": 3,
78+
"solution": true,
79+
"task": false
80+
}
81+
},
82+
"outputs": [],
83+
"source": [
84+
"# YOUR CODE HERE\n",
85+
"raise NotImplementedError()"
86+
]
87+
},
88+
{
89+
"cell_type": "markdown",
90+
"metadata": {
91+
"deletable": false,
92+
"editable": false,
93+
"nbgrader": {
94+
"cell_type": "markdown",
95+
"checksum": "6e15fd8ad40d40895af5017592188d86",
96+
"grade": false,
97+
"grade_id": "cell-254c30333f63dc0e",
98+
"locked": true,
99+
"schema_version": 3,
100+
"solution": false,
101+
"task": false
102+
}
103+
},
104+
"source": [
105+
"**Task 1 Test**"
106+
]
107+
},
108+
{
109+
"cell_type": "code",
110+
"execution_count": null,
111+
"metadata": {
112+
"deletable": false,
113+
"editable": false,
114+
"nbgrader": {
115+
"cell_type": "code",
116+
"checksum": "248c1e6a40bea014a9f130cee72c1a46",
117+
"grade": true,
118+
"grade_id": "cell-7f5ed8ccc5b15f71",
119+
"locked": true,
120+
"points": 1,
121+
"schema_version": 3,
122+
"solution": false,
123+
"task": false
124+
}
125+
},
126+
"outputs": [],
127+
"source": [
128+
"#Task 1 - Test\n",
129+
"assert isinstance(list_practice, list), \"Make sure you created a list object\"\n"
130+
]
131+
},
132+
{
133+
"cell_type": "markdown",
134+
"metadata": {
135+
"deletable": false,
136+
"editable": false,
137+
"nbgrader": {
138+
"cell_type": "markdown",
139+
"checksum": "30d91f27e533bf95f3afd44e8d9cda1b",
140+
"grade": false,
141+
"grade_id": "cell-bd07e140b06f24b6",
142+
"locked": true,
143+
"schema_version": 3,
144+
"solution": false,
145+
"task": false
146+
}
147+
},
148+
"source": [
149+
"\n",
150+
"**Task 2** - Dictionaries\n",
151+
"* Create a dictionary object called `diction_practice`. \n",
152+
"* Assign the following values to their respective keys listed above: `tech`, `science`, `language`\n",
153+
"\n",
154+
"*Hint:* There are multiple ways you can accomplish this task. You can either write out the key:values pairs manually, or iteratively combine two lists using the [`zip` function](https://docs.python.org/3/library/functions.html)."
155+
]
156+
},
157+
{
158+
"cell_type": "code",
159+
"execution_count": null,
160+
"metadata": {
161+
"deletable": false,
162+
"nbgrader": {
163+
"cell_type": "code",
164+
"checksum": "f9948f7da71e452ccc2cb3fd5550ac09",
165+
"grade": false,
166+
"grade_id": "cell-d2aac83e74b8fb89",
167+
"locked": false,
168+
"schema_version": 3,
169+
"solution": true,
170+
"task": false
171+
}
172+
},
173+
"outputs": [],
174+
"source": [
175+
"# YOUR CODE HERE\n",
176+
"raise NotImplementedError()"
177+
]
178+
},
179+
{
180+
"cell_type": "markdown",
181+
"metadata": {
182+
"deletable": false,
183+
"editable": false,
184+
"nbgrader": {
185+
"cell_type": "markdown",
186+
"checksum": "89adce679550e000e75a3cd0351721b1",
187+
"grade": false,
188+
"grade_id": "cell-95f85a283f98ef20",
189+
"locked": true,
190+
"schema_version": 3,
191+
"solution": false,
192+
"task": false
193+
}
194+
},
195+
"source": [
196+
"**Task 2 - Test**"
197+
]
198+
},
199+
{
200+
"cell_type": "code",
201+
"execution_count": null,
202+
"metadata": {
203+
"deletable": false,
204+
"editable": false,
205+
"nbgrader": {
206+
"cell_type": "code",
207+
"checksum": "eb3d302695cce47d57ee41774cb9aa05",
208+
"grade": true,
209+
"grade_id": "cell-6f3011fa357d270f",
210+
"locked": true,
211+
"points": 1,
212+
"schema_version": 3,
213+
"solution": false,
214+
"task": false
215+
}
216+
},
217+
"outputs": [],
218+
"source": [
219+
"#Task 2 - Test\n",
220+
"\n",
221+
"assert isinstance(diction_practice, dict), \"Did you use the correct syntax?\"\n"
222+
]
223+
},
224+
{
225+
"cell_type": "markdown",
226+
"metadata": {
227+
"deletable": false,
228+
"editable": false,
229+
"nbgrader": {
230+
"cell_type": "markdown",
231+
"checksum": "500b06dc305643e357cca230f16dce04",
232+
"grade": false,
233+
"grade_id": "cell-4a31ee084e18453a",
234+
"locked": true,
235+
"schema_version": 3,
236+
"solution": false,
237+
"task": false
238+
}
239+
},
240+
"source": [
241+
"\n",
242+
"**Task 3** - Dictionaries\n",
243+
"* Reassign the value of the `python` key in your dictionary to `'programming_language'`"
244+
]
245+
},
246+
{
247+
"cell_type": "code",
248+
"execution_count": null,
249+
"metadata": {
250+
"deletable": false,
251+
"nbgrader": {
252+
"cell_type": "code",
253+
"checksum": "968e8893d06b1276dac72f7a0db78d95",
254+
"grade": false,
255+
"grade_id": "cell-59c6653a1a1faae6",
256+
"locked": false,
257+
"schema_version": 3,
258+
"solution": true,
259+
"task": false
260+
}
261+
},
262+
"outputs": [],
263+
"source": [
264+
"# YOUR CODE HERE\n",
265+
"raise NotImplementedError()"
266+
]
267+
},
268+
{
269+
"cell_type": "markdown",
270+
"metadata": {
271+
"deletable": false,
272+
"editable": false,
273+
"nbgrader": {
274+
"cell_type": "markdown",
275+
"checksum": "b3c824a5462497e6fc4d903f762303c9",
276+
"grade": false,
277+
"grade_id": "cell-b0a297189d9a7dd3",
278+
"locked": true,
279+
"schema_version": 3,
280+
"solution": false,
281+
"task": false
282+
}
283+
},
284+
"source": [
285+
"**Task 3 Tests**"
286+
]
287+
},
288+
{
289+
"cell_type": "code",
290+
"execution_count": null,
291+
"metadata": {
292+
"deletable": false,
293+
"editable": false,
294+
"nbgrader": {
295+
"cell_type": "code",
296+
"checksum": "8fc04dbb280a746dbac3ef75ee9de31b",
297+
"grade": true,
298+
"grade_id": "cell-94bd4331d6cb29c9",
299+
"locked": true,
300+
"points": 1,
301+
"schema_version": 3,
302+
"solution": false,
303+
"task": false
304+
}
305+
},
306+
"outputs": [],
307+
"source": [
308+
"#Task 3 Tests\n",
309+
"\n",
310+
"assert diction_practice['data'] == 'science', \"Make sure your values are assigned to their correct keys\"\n"
311+
]
312+
},
313+
{
314+
"cell_type": "markdown",
315+
"metadata": {
316+
"deletable": false,
317+
"editable": false,
318+
"id": "qSvL3CeTFk9F",
319+
"nbgrader": {
320+
"cell_type": "markdown",
321+
"checksum": "71b4d138b7b263261d663d1b41d6add4",
322+
"grade": false,
323+
"grade_id": "cell-40090434cb0736b0",
324+
"locked": true,
325+
"schema_version": 3,
326+
"solution": false,
327+
"task": false
328+
}
329+
},
330+
"source": [
331+
"## Use the following information to complete Tasks \n",
332+
"\n",
333+
"\n",
334+
"\n",
335+
"In this Sprint Challenge you will first \"wrangle\" some data from [Gapminder](https://www.gapminder.org/about-gapminder/), a Swedish non-profit co-founded by Hans Rosling. \"Gapminder produces free teaching resources making the world understandable based on reliable statistics.\"\n",
336+
"- [Cell phones (total), by country and year](https://raw.githubusercontent.com/open-numbers/ddf--gapminder--systema_globalis/master/countries-etc-datapoints/ddf--datapoints--cell_phones_total--by--geo--time.csv)\n",
337+
"- [Population (total), by country and year](https://raw.githubusercontent.com/open-numbers/ddf--gapminder--systema_globalis/master/countries-etc-datapoints/ddf--datapoints--population_total--by--geo--time.csv)\n",
338+
"- [Geo country codes](https://github.com/open-numbers/ddf--gapminder--systema_globalis/blob/master/ddf--entities--geo--country.csv)\n",
339+
"\n",
340+
"These two links have everything you need to successfully complete the first part of this sprint challenge.\n",
341+
"- [Pandas documentation: Working with Text Data](https://pandas.pydata.org/pandas-docs/stable/text.html) (one question)\n",
342+
"- [Pandas Cheat Sheet](https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf) (everything else)\n"
343+
]
344+
},
345+
{
346+
"cell_type": "markdown",
347+
"metadata": {
348+
"deletable": false,
349+
"editable": false,
350+
"id": "0ZklksziMixS",
351+
"nbgrader": {
352+
"cell_type": "markdown",
353+
"checksum": "235860cb511ea847389f2096adf6cc6e",
354+
"grade": false,
355+
"grade_id": "cell-ae2312e461921817",
356+
"locked": true,
357+
"schema_version": 3,
358+
"solution": false,
359+
"task": false
360+
}
361+
},
362+
"source": [
363+
"**Task 4** - Load and print the cell phone data. Pandas and numpy import statements have been included for you.\n",
364+
"\n",
365+
"* Load your CSV file found at `cell_phones_url` into a DataFrame object named `cell_phones`\n"
366+
]
367+
},
368+
{
369+
"cell_type": "code",
370+
"execution_count": null,
371+
"metadata": {
372+
"deletable": false,
373+
"id": "FFO8QNJ7MixS",
374+
"nbgrader": {
375+
"cell_type": "code",
376+
"checksum": "82e3c74df1cb9321caff9035dd2e9409",
377+
"grade": false,
378+
"grade_id": "cell-6f16afb9d271f949",
379+
"locked": false,
380+
"schema_version": 3,
381+
"solution": true,
382+
"task": false
383+
}
384+
},
385+
"outputs": [],
386+
"source": [
387+
"# Task 4\n",
388+
"\n",
389+
"# Imports \n",
390+
"import pandas as pd\n",
391+
"import numpy as np\n",
392+
"\n",
393+
"cell_phones_url = 'https://raw.githubusercontent.com/bloominstituteoftechnology/data-science-practice-datasets/main/unit_1/Cell__Phones/cell_phones.csv'\n",
394+
"\n",
395+
"# Load the dataframe and print the top 5 rows\n",
396+
"\n",
397+
"# YOUR CODE HERE\n",
398+
"raise NotImplementedError()\n"
399+
]
400+
},
401+
{
402+
"cell_type": "markdown",
403+
"metadata": {
404+
"id": "ymCLkZMJMixT"
405+
},
406+
"source": [
407+
"**Task 4 Test**"
408+
]
409+
},
410+
{
411+
"cell_type": "code",
412+
"execution_count": null,
413+
"metadata": {
414+
"deletable": false,
415+
"editable": false,
416+
"id": "btcEJXxCMixT",
417+
"nbgrader": {
418+
"cell_type": "code",
419+
"checksum": "597bda1cfbeb2d3def97e7c47f6971c7",
420+
"grade": true,
421+
"grade_id": "cell-226f7bf8e9ea24f9",
422+
"locked": true,
423+
"points": 1,
424+
"schema_version": 3,
425+
"solution": false,
426+
"task": false
427+
}
428+
},
429+
"outputs": [],
430+
"source": [
431+
"# Task 4 - Test\n",
432+
"\n",
433+
"assert isinstance(cell_phones, pd.DataFrame), 'Have you created a DataFrame named `cell_phones`?'\n",
434+
"assert len(cell_phones) == 9574\n"
435+
]
436+
},
437+
{
438+
"cell_type": "markdown",
439+
"metadata": {
440+
"deletable": false,
441+
"editable": false,
442+
"id": "9YPW16tmT2J_",
443+
"nbgrader": {
444+
"cell_type": "markdown",
445+
"checksum": "26b84eb6f8694fc80894fec62cb92e2f",
446+
"grade": false,
447+
"grade_id": "cell-905dd5d05e5cebb7",
448+
"locked": true,
449+
"schema_version": 3,
450+
"solution": false,
451+
"task": false
452+
}
453+
},
454+
"source": [
455+
"**Task 5** - Load and print the population data. \n",
456+
"\n",
457+
"* Load the CSV file found at `population_url` into a DataFrame named `population`\n",
458+
"\n"
459+
]
460+
},
461+
{
462+
"cell_type": "code",
463+
"execution_count": null,
464+
"metadata": {
465+
"deletable": false,
466+
"id": "SNWpDAvyUYa2",
467+
"nbgrader": {
468+
"cell_type": "code",
469+
"checksum": "9874647278a76399d4f5547222d9dc02",
470+
"grade": false,
471+
"grade_id": "cell-561c2d59728188a9",
472+
"locked": false,
473+
"schema_version": 3,
474+
"solution": true,
475+
"task": false
476+
}
477+
},
478+
"outputs": [],
479+
"source": [
480+
"# Task 5\n",
481+
"\n",
482+
"population_url = 'https://raw.githubusercontent.com/bloominstituteoftechnology/data-science-practice-datasets/main/unit_1/Population/population.csv'\n",
483+
"\n",
484+
"# Load the dataframe and print the first 5 records\n",
485+
"\n",
486+
"# YOUR CODE HERE\n",
487+
"raise NotImplementedError()"
488+
]
489+
},
490+
{
491+
"cell_type": "markdown",
492+
"metadata": {
493+
"id": "RDOcC0FdVjIz"
494+
},
495+
"source": [
496+
"**Task 5 Test**"
497+
]
498+
},
499+
{
500+
"cell_type": "code",
501+
"execution_count": null,
502+
"metadata": {
503+
"deletable": false,
504+
"editable": false,
505+
"id": "jcaZ5W5cVjI_",
506+
"nbgrader": {
507+
"cell_type": "code",
508+
"checksum": "01d084b75b701322c49f0aecda45802a",
509+
"grade": true,
510+
"grade_id": "cell-59d01cd695becd74",
511+
"locked": true,
512+
"points": 1,
513+
"schema_version": 3,
514+
"solution": false,
515+
"task": false
516+
}
517+
},
518+
"outputs": [],
519+
"source": [
520+
"# Task 5 - Test\n",
521+
"\n",
522+
"assert isinstance(population, pd.DataFrame), 'Have you created a DataFrame named `population`?'\n",
523+
"assert len(population) == 59297\n"
524+
]
525+
},
526+
{
527+
"cell_type": "markdown",
528+
"metadata": {
529+
"id": "9acXXTiEV5uJ"
530+
},
531+
"source": [
532+
"**Task 6** - Load and print the geo country codes data. \n",
533+
"\n",
534+
"* Load the CSV file found at `geo_codes_url` into a DataFrame named `geo_codes`\n"
535+
]
536+
},
537+
{
538+
"cell_type": "code",
539+
"execution_count": null,
540+
"metadata": {
541+
"deletable": false,
542+
"id": "Obm4p8WXV5uJ",
543+
"nbgrader": {
544+
"cell_type": "code",
545+
"checksum": "8247001411ff10e160febf6472e84ce8",
546+
"grade": false,
547+
"grade_id": "cell-eb4d290384535503",
548+
"locked": false,
549+
"schema_version": 3,
550+
"solution": true,
551+
"task": false
552+
}
553+
},
554+
"outputs": [],
555+
"source": [
556+
"# Task 6\n",
557+
"\n",
558+
"geo_codes_url = 'https://raw.githubusercontent.com/bloominstituteoftechnology/data-science-practice-datasets/main/unit_1/GEO_codes/geo_country_codes.csv'\n",
559+
"\n",
560+
"# Load the dataframe and print out the first 5 records\n",
561+
"\n",
562+
"# YOUR CODE HERE\n",
563+
"raise NotImplementedError()"
564+
]
565+
},
566+
{
567+
"cell_type": "markdown",
568+
"metadata": {
569+
"deletable": false,
570+
"editable": false,
571+
"id": "_WR-4MbmV5uK",
572+
"nbgrader": {
573+
"cell_type": "markdown",
574+
"checksum": "a88bbf49e8714b89d0e0fa46c06f2217",
575+
"grade": false,
576+
"grade_id": "cell-4a0f7ebd4c9931a7",
577+
"locked": true,
578+
"schema_version": 3,
579+
"solution": false,
580+
"task": false
581+
}
582+
},
583+
"source": [
584+
"**Task 6 Test**"
585+
]
586+
},
587+
{
588+
"cell_type": "code",
589+
"execution_count": null,
590+
"metadata": {
591+
"deletable": false,
592+
"editable": false,
593+
"id": "Z3Tza5NWV5uK",
594+
"nbgrader": {
595+
"cell_type": "code",
596+
"checksum": "6bf7335d2dd565fbef6619579e0ddf59",
597+
"grade": true,
598+
"grade_id": "cell-39240405659c0c19",
599+
"locked": true,
600+
"points": 1,
601+
"schema_version": 3,
602+
"solution": false,
603+
"task": false
604+
}
605+
},
606+
"outputs": [],
607+
"source": [
608+
"# Task 6 - Test\n",
609+
"\n",
610+
"assert geo_codes is not None, 'Have you created a DataFrame named `geo_codes`?'\n",
611+
"assert len(geo_codes) == 273\n"
612+
]
613+
},
614+
{
615+
"cell_type": "markdown",
616+
"metadata": {
617+
"deletable": false,
618+
"editable": false,
619+
"id": "5DbACESjYxpV",
620+
"nbgrader": {
621+
"cell_type": "markdown",
622+
"checksum": "fbbb18f1bcf72e28cfd967ddb3d36732",
623+
"grade": false,
624+
"grade_id": "cell-817781e1dc5827c6",
625+
"locked": true,
626+
"schema_version": 3,
627+
"solution": false,
628+
"task": false
629+
}
630+
},
631+
"source": [
632+
"**Task 7** - Check for missing values\n",
633+
"\n",
634+
"Let's check for missing values in each of these DataFrames: `cell_phones`, `population` and `geo_codes`\n",
635+
"\n",
636+
"* Check for missing values in the following DataFrames:\n",
637+
" * Assign the total number of missing values in `cell_phones` to the variable `cell_phones_missing`\n",
638+
" * Assign the total number of missing values in `population` to the variable `population_missing`\n",
639+
" * Assign the total number of missing values in `geo_codes` to the variable `geo_codes_missing` \n",
640+
" * Hint: you will need to do a sum of a sum for this last task."
641+
]
642+
},
643+
{
644+
"cell_type": "code",
645+
"execution_count": null,
646+
"metadata": {
647+
"deletable": false,
648+
"id": "SwmSvUySJjXc",
649+
"nbgrader": {
650+
"cell_type": "code",
651+
"checksum": "462f481fe420ad2c37fe5a262dec6816",
652+
"grade": false,
653+
"grade_id": "cell-9426cd5765574e07",
654+
"locked": false,
655+
"schema_version": 3,
656+
"solution": true,
657+
"task": false
658+
}
659+
},
660+
"outputs": [],
661+
"source": [
662+
"# Task 7\n",
663+
"\n",
664+
"# Check for missing data in each of the DataFrames\n",
665+
"\n",
666+
"# YOUR CODE HERE\n",
667+
"raise NotImplementedError()"
668+
]
669+
},
670+
{
671+
"cell_type": "markdown",
672+
"metadata": {
673+
"deletable": false,
674+
"editable": false,
675+
"id": "cREZV7g0aLGC",
676+
"nbgrader": {
677+
"cell_type": "markdown",
678+
"checksum": "24b8d3d5ff21f6c4cfa3e634f00113aa",
679+
"grade": false,
680+
"grade_id": "cell-47ee1692d471f9ea",
681+
"locked": true,
682+
"schema_version": 3,
683+
"solution": false,
684+
"task": false
685+
}
686+
},
687+
"source": [
688+
"**Task 7 Test**"
689+
]
690+
},
691+
{
692+
"cell_type": "code",
693+
"execution_count": null,
694+
"metadata": {
695+
"deletable": false,
696+
"editable": false,
697+
"id": "eaQwM15IaLGD",
698+
"nbgrader": {
699+
"cell_type": "code",
700+
"checksum": "da0a8c532d46ac776b2609e8f439f601",
701+
"grade": true,
702+
"grade_id": "cell-cf6ab3b4b1e8afc1",
703+
"locked": true,
704+
"points": 1,
705+
"schema_version": 3,
706+
"solution": false,
707+
"task": false
708+
}
709+
},
710+
"outputs": [],
711+
"source": [
712+
"# Task 7 - Test\n",
713+
"\n",
714+
"if geo_codes_missing == 21: print('ERROR: Make sure to use a sum of a sum for the missing geo codes!') \n",
715+
"\n",
716+
"# Hidden tests - you will see the results when you submit to Canvas"
717+
]
718+
},
719+
{
720+
"cell_type": "markdown",
721+
"metadata": {
722+
"deletable": false,
723+
"editable": false,
724+
"id": "P54itLGveF5p",
725+
"nbgrader": {
726+
"cell_type": "markdown",
727+
"checksum": "c6054c87e1ff95c1767b855e2149568c",
728+
"grade": false,
729+
"grade_id": "cell-aad431149f1868a7",
730+
"locked": true,
731+
"schema_version": 3,
732+
"solution": false,
733+
"task": false
734+
}
735+
},
736+
"source": [
737+
"**Task 8** - Merge the `cell_phones` and `population` DataFrames.\n",
738+
"\n",
739+
"* Merge the `cell_phones` and `population` dataframes with an **inner** merge on both the `geo` and `time` columns.\n",
740+
"* Call the resulting dataframe `cell_phone_population`"
741+
]
742+
},
743+
{
744+
"cell_type": "code",
745+
"execution_count": null,
746+
"metadata": {
747+
"deletable": false,
748+
"id": "KL_NCL7heF51",
749+
"nbgrader": {
750+
"cell_type": "code",
751+
"checksum": "1ee15cfe45ac7986f9cb323dd1d52fbb",
752+
"grade": false,
753+
"grade_id": "cell-decaebaa844aa3a5",
754+
"locked": false,
755+
"schema_version": 3,
756+
"solution": true,
757+
"task": false
758+
}
759+
},
760+
"outputs": [],
761+
"source": [
762+
"# Task 8\n",
763+
"\n",
764+
"# Merge the cell_phones and population dataframes\n",
765+
"\n",
766+
"# YOUR CODE HERE\n",
767+
"raise NotImplementedError()"
768+
]
769+
},
770+
{
771+
"cell_type": "markdown",
772+
"metadata": {
773+
"deletable": false,
774+
"editable": false,
775+
"id": "9vFSumbkfqr_",
776+
"nbgrader": {
777+
"cell_type": "markdown",
778+
"checksum": "9daf332b99212b89d3e248fef54079ac",
779+
"grade": false,
780+
"grade_id": "cell-00202b83d4d54973",
781+
"locked": true,
782+
"schema_version": 3,
783+
"solution": false,
784+
"task": false
785+
}
786+
},
787+
"source": [
788+
"**Task 8 Test**"
789+
]
790+
},
791+
{
792+
"cell_type": "code",
793+
"execution_count": null,
794+
"metadata": {
795+
"deletable": false,
796+
"editable": false,
797+
"id": "85-p_0UGfkZJ",
798+
"nbgrader": {
799+
"cell_type": "code",
800+
"checksum": "9bd0a79e9049acd8ced9d23b68d54d9a",
801+
"grade": true,
802+
"grade_id": "cell-dd2473ea91f15f30",
803+
"locked": true,
804+
"points": 1,
805+
"schema_version": 3,
806+
"solution": false,
807+
"task": false
808+
}
809+
},
810+
"outputs": [],
811+
"source": [
812+
"# Task 8 - Test\n",
813+
"\n",
814+
"assert cell_phone_population is not None, 'Have you merged created a DataFrame named cell_phone_population?'\n",
815+
"assert len(cell_phone_population) == 8930\n"
816+
]
817+
},
818+
{
819+
"cell_type": "markdown",
820+
"metadata": {
821+
"deletable": false,
822+
"editable": false,
823+
"id": "oByYSkC7hB05",
824+
"nbgrader": {
825+
"cell_type": "markdown",
826+
"checksum": "49df153f02f79995801f03a60b34c838",
827+
"grade": false,
828+
"grade_id": "cell-01ad09608fc02d0c",
829+
"locked": true,
830+
"schema_version": 3,
831+
"solution": false,
832+
"task": false
833+
}
834+
},
835+
"source": [
836+
"**Task 9** - Merge the `cell_phone_population` and `geo_codes` DataFrames\n",
837+
"\n",
838+
"* Merge the `cell_phone_population` and `geo_codes` DataFrames with an inner merge using the `geo` column.\n",
839+
"* **Only merge the `country` and `geo` columns from the `geo_codes` dataframe.** \n",
840+
"* Call the resulting DataFrame `geo_cell_phone_population`\n"
841+
]
842+
},
843+
{
844+
"cell_type": "code",
845+
"execution_count": null,
846+
"metadata": {
847+
"deletable": false,
848+
"id": "NcO8-JpQhB1F",
849+
"nbgrader": {
850+
"cell_type": "code",
851+
"checksum": "3450df7dea31e4dee7ceb126a2d07f6a",
852+
"grade": false,
853+
"grade_id": "cell-1ce5a2360ee6fd20",
854+
"locked": false,
855+
"schema_version": 3,
856+
"solution": true,
857+
"task": false
858+
}
859+
},
860+
"outputs": [],
861+
"source": [
862+
"# Task 9\n",
863+
"\n",
864+
"# Merge the cell_phone_population and geo_codes dataframes\n",
865+
"# Only include the country and geo columns from geo_codes\n",
866+
"\n",
867+
"# YOUR CODE HERE\n",
868+
"raise NotImplementedError()"
869+
]
870+
},
871+
{
872+
"cell_type": "markdown",
873+
"metadata": {
874+
"deletable": false,
875+
"editable": false,
876+
"id": "zAKDLSV-hB1G",
877+
"nbgrader": {
878+
"cell_type": "markdown",
879+
"checksum": "a5634357f45ca81f6651395a42cc6341",
880+
"grade": false,
881+
"grade_id": "cell-935fc7dc053d368e",
882+
"locked": true,
883+
"schema_version": 3,
884+
"solution": false,
885+
"task": false
886+
}
887+
},
888+
"source": [
889+
"**Task 9 Test**"
890+
]
891+
},
892+
{
893+
"cell_type": "code",
894+
"execution_count": null,
895+
"metadata": {
896+
"deletable": false,
897+
"editable": false,
898+
"id": "eQgHSsLihB1G",
899+
"nbgrader": {
900+
"cell_type": "code",
901+
"checksum": "7005259d49844adab8beb565aaf5eed5",
902+
"grade": true,
903+
"grade_id": "cell-764d5b72ae382339",
904+
"locked": true,
905+
"points": 1,
906+
"schema_version": 3,
907+
"solution": false,
908+
"task": false
909+
}
910+
},
911+
"outputs": [],
912+
"source": [
913+
"# Task 9 - Test\n",
914+
"assert len(geo_cell_phone_population) == 8930\n",
915+
"assert type(geo_cell_phone_population) == pd.DataFrame\n"
916+
]
917+
},
918+
{
919+
"cell_type": "markdown",
920+
"metadata": {
921+
"deletable": false,
922+
"editable": false,
923+
"id": "-CZF39BWivc2",
924+
"nbgrader": {
925+
"cell_type": "markdown",
926+
"checksum": "2373ace4287e3e77f2e1c4b86d9be106",
927+
"grade": false,
928+
"grade_id": "cell-bbacb1043d1c0990",
929+
"locked": true,
930+
"schema_version": 3,
931+
"solution": false,
932+
"task": false
933+
}
934+
},
935+
"source": [
936+
"**Task 10** - Calculate the number of cell phones per person.\n",
937+
"\n",
938+
"* Use the `cell_phones_total` and `population_total` columns to calculate the number of cell phones per person for every year. (In other words, for every row). \n",
939+
"* Create a new column: Call this new feature (column) `phones_per_person` and add it to the `geo_cell_phone_population` DataFrame (you'll be adding the column to the DataFrame).\n",
940+
"\n",
941+
"*Hint: You can find a refresher on how to create a new column in Module 2 of this sprint.*"
942+
]
943+
},
944+
{
945+
"cell_type": "code",
946+
"execution_count": null,
947+
"metadata": {
948+
"deletable": false,
949+
"id": "vBeXa3LTivdC",
950+
"nbgrader": {
951+
"cell_type": "code",
952+
"checksum": "ca0c58c434dc1be273c34f79f9cdc4db",
953+
"grade": false,
954+
"grade_id": "cell-bd0bce2920604643",
955+
"locked": false,
956+
"schema_version": 3,
957+
"solution": true,
958+
"task": false
959+
}
960+
},
961+
"outputs": [],
962+
"source": [
963+
"# Task 10\n",
964+
"\n",
965+
"# YOUR CODE HERE\n",
966+
"raise NotImplementedError()"
967+
]
968+
},
969+
{
970+
"cell_type": "markdown",
971+
"metadata": {
972+
"id": "4IiDB6f6ivdD"
973+
},
974+
"source": [
975+
"**Task 10 Test**"
976+
]
977+
},
978+
{
979+
"cell_type": "code",
980+
"execution_count": null,
981+
"metadata": {
982+
"deletable": false,
983+
"editable": false,
984+
"id": "L3UgjCfFivdD",
985+
"nbgrader": {
986+
"cell_type": "code",
987+
"checksum": "2c7a7142a39c84e1d18daa6c160cae60",
988+
"grade": true,
989+
"grade_id": "cell-45c955c43e471400",
990+
"locked": true,
991+
"points": 1,
992+
"schema_version": 3,
993+
"solution": false,
994+
"task": false
995+
}
996+
},
997+
"outputs": [],
998+
"source": [
999+
"# Task 10 - Test\n",
1000+
"\n",
1001+
"# Hidden tests - you will see the results when you submit to Canvas"
1002+
]
1003+
},
1004+
{
1005+
"cell_type": "markdown",
1006+
"metadata": {
1007+
"id": "b_CZNPZAlw71"
1008+
},
1009+
"source": [
1010+
"**Task 11** - Identify the number of cell phones per person in the US in 2017\n",
1011+
"\n",
1012+
"* Create a one-row subset of `geo_cell_phone_population` with data on cell phone ownership in the United States for the year 2017.\n",
1013+
"* Call this subset DataFrame `US_2017`.\n",
1014+
"* Print `US_2017`."
1015+
]
1016+
},
1017+
{
1018+
"cell_type": "code",
1019+
"execution_count": null,
1020+
"metadata": {
1021+
"deletable": false,
1022+
"id": "Y0hRRvc1lw8B",
1023+
"nbgrader": {
1024+
"cell_type": "code",
1025+
"checksum": "a0c775ef5f419d97070af229fe40c354",
1026+
"grade": false,
1027+
"grade_id": "cell-665e83d11e594d90",
1028+
"locked": false,
1029+
"schema_version": 3,
1030+
"solution": true,
1031+
"task": false
1032+
}
1033+
},
1034+
"outputs": [],
1035+
"source": [
1036+
"# Task 11\n",
1037+
"\n",
1038+
"# Determine the number of cell phones per person in the US in 2017\n",
1039+
"\n",
1040+
"# YOUR CODE HERE\n",
1041+
"raise NotImplementedError()\n",
1042+
"\n",
1043+
"# View the DataFrame\n",
1044+
"US_2017"
1045+
]
1046+
},
1047+
{
1048+
"cell_type": "markdown",
1049+
"metadata": {
1050+
"id": "mIDryQfKlw8C"
1051+
},
1052+
"source": [
1053+
"**Task 11 Test**"
1054+
]
1055+
},
1056+
{
1057+
"cell_type": "code",
1058+
"execution_count": null,
1059+
"metadata": {
1060+
"deletable": false,
1061+
"editable": false,
1062+
"id": "wL0MypzFlw8C",
1063+
"nbgrader": {
1064+
"cell_type": "code",
1065+
"checksum": "912e932a02c4192868d095b526a831e4",
1066+
"grade": true,
1067+
"grade_id": "cell-ea08fdda80cf9731",
1068+
"locked": true,
1069+
"points": 1,
1070+
"schema_version": 3,
1071+
"solution": false,
1072+
"task": false
1073+
}
1074+
},
1075+
"outputs": [],
1076+
"source": [
1077+
"# Task 11 - Test\n",
1078+
"\n",
1079+
"# Hidden tests - you will see the results when you submit to Canvas"
1080+
]
1081+
},
1082+
{
1083+
"cell_type": "markdown",
1084+
"metadata": {
1085+
"deletable": false,
1086+
"editable": false,
1087+
"id": "HTl_zamAtfJa",
1088+
"nbgrader": {
1089+
"cell_type": "markdown",
1090+
"checksum": "815cb8dd3b19d79fd553f23d9a773b03",
1091+
"grade": false,
1092+
"grade_id": "cell-4acad6efd0ae146c",
1093+
"locked": true,
1094+
"schema_version": 3,
1095+
"solution": false,
1096+
"task": false
1097+
}
1098+
},
1099+
"source": [
1100+
"**Task 12** - Describe the numeric variables in `geo_cell_phone_population`\n",
1101+
"\n",
1102+
"* Calculate the summary statistics for the quantitative variables in `geo_cell_phone_population` using `.describe()`.\n",
1103+
"* Find the mean value for `phones_per_person` and assign it to the variable `mean_phones`. Define your value out to two decimal points.\n"
1104+
]
1105+
},
1106+
{
1107+
"cell_type": "code",
1108+
"execution_count": null,
1109+
"metadata": {
1110+
"deletable": false,
1111+
"id": "HGKKIqAktfJn",
1112+
"nbgrader": {
1113+
"cell_type": "code",
1114+
"checksum": "8773b60013ce17644af742cd4c6a356d",
1115+
"grade": false,
1116+
"grade_id": "cell-181c9805c52dfda8",
1117+
"locked": false,
1118+
"schema_version": 3,
1119+
"solution": true,
1120+
"task": false
1121+
}
1122+
},
1123+
"outputs": [],
1124+
"source": [
1125+
"# Task 12\n",
1126+
"\n",
1127+
"# YOUR CODE HERE\n",
1128+
"raise NotImplementedError()"
1129+
]
1130+
},
1131+
{
1132+
"cell_type": "markdown",
1133+
"metadata": {
1134+
"id": "4Nh0qP1ptfJn"
1135+
},
1136+
"source": [
1137+
"**Task 12 Test**"
1138+
]
1139+
},
1140+
{
1141+
"cell_type": "code",
1142+
"execution_count": null,
1143+
"metadata": {
1144+
"deletable": false,
1145+
"editable": false,
1146+
"id": "cBZg6p1VtfJo",
1147+
"nbgrader": {
1148+
"cell_type": "code",
1149+
"checksum": "012a6f5c818c64f8ed1570906eea15b3",
1150+
"grade": true,
1151+
"grade_id": "cell-0a0cfb1ac7acc279",
1152+
"locked": true,
1153+
"points": 1,
1154+
"schema_version": 3,
1155+
"solution": false,
1156+
"task": false
1157+
}
1158+
},
1159+
"outputs": [],
1160+
"source": [
1161+
"# Task 12 - Test\n",
1162+
"\n",
1163+
"# Hidden tests - you will see the results when you submit to Canvas"
1164+
]
1165+
},
1166+
{
1167+
"cell_type": "markdown",
1168+
"metadata": {
1169+
"id": "wBQCMGrkw69w"
1170+
},
1171+
"source": [
1172+
"**Task 13** - Subset the DataFrame for 2017\n",
1173+
"\n",
1174+
"* Create a new dataframe called `df2017` that includes **only** records from `geo_cell_phone_population` that ocurred in 2017."
1175+
]
1176+
},
1177+
{
1178+
"cell_type": "code",
1179+
"execution_count": null,
1180+
"metadata": {
1181+
"deletable": false,
1182+
"id": "_EKPYqW-w698",
1183+
"nbgrader": {
1184+
"cell_type": "code",
1185+
"checksum": "50dd62f311ea30c217ec74206b8b42b9",
1186+
"grade": false,
1187+
"grade_id": "cell-f3fa17e15b5174ac",
1188+
"locked": false,
1189+
"schema_version": 3,
1190+
"solution": true,
1191+
"task": false
1192+
}
1193+
},
1194+
"outputs": [],
1195+
"source": [
1196+
"# Task 13\n",
1197+
"\n",
1198+
"# Create a new dataframe called df2017 that includes only records from geo_cell_phone_population that ocurred in 2017.\n",
1199+
"\n",
1200+
"# YOUR CODE HERE\n",
1201+
"raise NotImplementedError()"
1202+
]
1203+
},
1204+
{
1205+
"cell_type": "markdown",
1206+
"metadata": {
1207+
"id": "QuLy4qwrw698"
1208+
},
1209+
"source": [
1210+
"**Task 13 Test**"
1211+
]
1212+
},
1213+
{
1214+
"cell_type": "code",
1215+
"execution_count": null,
1216+
"metadata": {
1217+
"deletable": false,
1218+
"editable": false,
1219+
"id": "S0rU2pGWw698",
1220+
"nbgrader": {
1221+
"cell_type": "code",
1222+
"checksum": "3ec1b21a811f55002c03f4f9f54fcf3e",
1223+
"grade": true,
1224+
"grade_id": "cell-b19a6956b5dddadb",
1225+
"locked": true,
1226+
"points": 1,
1227+
"schema_version": 3,
1228+
"solution": false,
1229+
"task": false
1230+
}
1231+
},
1232+
"outputs": [],
1233+
"source": [
1234+
"# Task 13 - Test\n",
1235+
"\n",
1236+
"# Hidden tests - you will see the results when you submit to Canvas"
1237+
]
1238+
},
1239+
{
1240+
"cell_type": "markdown",
1241+
"metadata": {
1242+
"id": "Rww_2EpHyd4K"
1243+
},
1244+
"source": [
1245+
"**Task 14** - Identify the five countries with the most cell phones per person in 2017\n",
1246+
"\n",
1247+
"* Sort the `df2017` DataFrame by `phones_per_person` in descending order and assign the result to `df2017_top`. Your new DataFrame should only have **five** rows (Hint: use `.head()` to return only five rows).\n",
1248+
"* Print the first 5 records of `df2017_top`."
1249+
]
1250+
},
1251+
{
1252+
"cell_type": "code",
1253+
"execution_count": null,
1254+
"metadata": {
1255+
"deletable": false,
1256+
"id": "h6ym9Dwhyd4V",
1257+
"nbgrader": {
1258+
"cell_type": "code",
1259+
"checksum": "7a116d8d8ab46d5494cec232cfc25c12",
1260+
"grade": false,
1261+
"grade_id": "cell-c9b701bfe897edf5",
1262+
"locked": false,
1263+
"schema_version": 3,
1264+
"solution": true,
1265+
"task": false
1266+
}
1267+
},
1268+
"outputs": [],
1269+
"source": [
1270+
"# Task 14\n",
1271+
"\n",
1272+
"# Sort the df2017 dataframe by phones_per_person in descending order\n",
1273+
"# Return only five (5) rows\n",
1274+
"\n",
1275+
"# YOUR CODE HERE\n",
1276+
"raise NotImplementedError()\n",
1277+
"\n",
1278+
"# View the df2017_top DataFrame\n",
1279+
"df2017_top"
1280+
]
1281+
},
1282+
{
1283+
"cell_type": "markdown",
1284+
"metadata": {
1285+
"id": "7WBox_Axyd4W"
1286+
},
1287+
"source": [
1288+
"**Task 14 Test**"
1289+
]
1290+
},
1291+
{
1292+
"cell_type": "code",
1293+
"execution_count": null,
1294+
"metadata": {
1295+
"deletable": false,
1296+
"editable": false,
1297+
"id": "ePj-a6rLyd4W",
1298+
"nbgrader": {
1299+
"cell_type": "code",
1300+
"checksum": "7baa090a5b1136f9576ed19a1424a180",
1301+
"grade": true,
1302+
"grade_id": "cell-5f6fffc9db1b9492",
1303+
"locked": true,
1304+
"points": 1,
1305+
"schema_version": 3,
1306+
"solution": false,
1307+
"task": false
1308+
}
1309+
},
1310+
"outputs": [],
1311+
"source": [
1312+
"# Task 14 - Test\n",
1313+
"\n",
1314+
"assert df2017_top.shape == (5,6), 'Make sure you return only five rows'\n"
1315+
]
1316+
},
1317+
{
1318+
"cell_type": "code",
1319+
"execution_count": null,
1320+
"metadata": {},
1321+
"outputs": [],
1322+
"source": []
1323+
}
1324+
],
1325+
"metadata": {
1326+
"colab": {
1327+
"collapsed_sections": [],
1328+
"name": "LS_DS_Sprint1_AG.ipynb",
1329+
"provenance": []
1330+
},
1331+
"kernelspec": {
1332+
"display_name": "Python 3",
1333+
"language": "python",
1334+
"name": "python3"
1335+
},
1336+
"language_info": {
1337+
"codemirror_mode": {
1338+
"name": "ipython",
1339+
"version": 3
1340+
},
1341+
"file_extension": ".py",
1342+
"mimetype": "text/x-python",
1343+
"name": "python",
1344+
"nbconvert_exporter": "python",
1345+
"pygments_lexer": "ipython3",
1346+
"version": "3.8.8"
1347+
}
1348+
},
1349+
"nbformat": 4,
1350+
"nbformat_minor": 1
1351+
}
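
The wrangling steps the notebook asks for (counting missing values, inner merges on `geo` and `time`, a derived `phones_per_person` column, and a sorted 2017 subset) can be sketched on toy data. This is a minimal illustration, not the reference solution: the three small DataFrames and their values are invented stand-ins for the real CSV files, and the `landlocked` column is a hypothetical placeholder used only to produce missing values.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the three CSV files (all values invented for illustration).
cell_phones = pd.DataFrame({
    "geo": ["usa", "usa", "chn"],
    "time": [2016, 2017, 2017],
    "cell_phones_total": [300.0, 400.0, 900.0],
})
population = pd.DataFrame({
    "geo": ["usa", "usa", "chn", "chn"],
    "time": [2016, 2017, 2016, 2017],
    "population_total": [320, 325, 1400, 1410],
})
geo_codes = pd.DataFrame({
    "geo": ["usa", "chn"],
    "country": ["United States", "China"],
    "landlocked": [np.nan, np.nan],  # hypothetical column with missing values
})

# "Sum of a sum": isnull() flags each cell, the first sum() counts per column,
# and the second sum() adds the per-column counts into one total.
geo_codes_missing = geo_codes.isnull().sum().sum()

# Inner merge on both key columns keeps only (geo, time) pairs present in both frames.
cell_phone_population = pd.merge(
    cell_phones, population, how="inner", on=["geo", "time"]
)

# Bring in only the geo and country columns from geo_codes.
geo_cell_phone_population = pd.merge(
    cell_phone_population, geo_codes[["geo", "country"]], how="inner", on="geo"
)

# Row-wise ratio stored as a new column.
geo_cell_phone_population["phones_per_person"] = (
    geo_cell_phone_population["cell_phones_total"]
    / geo_cell_phone_population["population_total"]
)

# Subset to 2017, then sort descending and keep at most five rows.
df2017 = geo_cell_phone_population[geo_cell_phone_population["time"] == 2017]
df2017_top = df2017.sort_values("phones_per_person", ascending=False).head(5)
print(df2017_top)
```

With the real files, the same calls would start from `pd.read_csv(cell_phones_url)` and the other two URLs defined in the notebook; row counts and results will differ from this toy data.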
