WEBVTT
1
00:00:01.440 --> 00:00:05.063
This lecture's gonna provide an overview
of the cycle of data analysis.
2
00:00:10.191 --> 00:00:14.060
Data analysis is a complex process
that can involve many pieces and
3
00:00:14.060 --> 00:00:15.790
many different tools.
4
00:00:15.790 --> 00:00:18.480
But fundamentally,
there are only three parts to it.
5
00:00:19.540 --> 00:00:22.180
The first part is setting expectations.
6
00:00:22.180 --> 00:00:24.460
The second part involves
collecting information and
7
00:00:24.460 --> 00:00:26.990
comparing your expectations to data, and
8
00:00:26.990 --> 00:00:32.220
the last part involves reacting to data,
and revising your expectations.
9
00:00:32.220 --> 00:00:33.750
So that's basically it.
10
00:00:33.750 --> 00:00:37.990
Those are the three parts of data analysis
that you often will cycle through,
11
00:00:37.990 --> 00:00:41.290
many times, in the course of
analyzing any given data set.
12
00:00:41.290 --> 00:00:42.820
So I'm gonna break down each one of these
13
00:00:43.870 --> 00:00:47.190
pieces to give you a little bit of
a description of what they are, and I'll
14
00:00:47.190 --> 00:00:51.690
give you a little example of kinda how
they might be applied in the real world.
15
00:00:51.690 --> 00:00:55.913
So setting expectations involves
deliberately thinking about
16
00:00:55.913 --> 00:00:58.599
what you're gonna do before you do it.
17
00:00:58.599 --> 00:01:02.585
Now, the idea in any part of data analysis
is that everything you do is gonna
18
00:01:02.585 --> 00:01:06.511
have some sort of consequence,
whether it's collecting data, whether
19
00:01:06.511 --> 00:01:11.035
it's fitting a model, whether it's asking
a question or making some sort of plot.
20
00:01:11.035 --> 00:01:15.494
Everything you do there will be some sort
of action, and the point is you wanna
21
00:01:15.494 --> 00:01:19.490
think about what that consequence
is gonna be, before you do it.
22
00:01:19.490 --> 00:01:22.850
And that way you set the expectations for
yourself, and
23
00:01:22.850 --> 00:01:27.980
you can determine whether the reality
kind of meets that expectation.
24
00:01:29.635 --> 00:01:33.165
So that's the first part
of the data analysis cycle.
25
00:01:33.165 --> 00:01:37.685
Once you've set your expectations the next
thing you wanna do is collect some data or
26
00:01:37.685 --> 00:01:42.380
collect some information that will
allow you to compare those expectations
27
00:01:42.380 --> 00:01:43.775
to reality.
28
00:01:43.775 --> 00:01:48.170
And so,
collecting that information is key,
29
00:01:48.170 --> 00:01:51.030
because it will tell you whether or
not your expectations were right.
30
00:01:51.030 --> 00:01:51.930
Whether they were wrong.
31
00:01:51.930 --> 00:01:53.713
Whether they were too high, too low,
32
00:01:53.713 --> 00:01:56.568
whatever it is depending on
the problem you're working on.
33
00:01:56.568 --> 00:02:01.009
And then, once you've collected that
information, and compared it to your
34
00:02:01.009 --> 00:02:05.541
expectations you can react to it, and
maybe change your behavior in some way.
35
00:02:05.541 --> 00:02:09.702
So the last part of the data analysis
cycle is to think about what have
36
00:02:09.702 --> 00:02:14.309
we learned from the data, from our
expectations, and their comparison.
37
00:02:14.309 --> 00:02:16.350
What would we do differently next time?
38
00:02:17.400 --> 00:02:20.870
Did we match our expectations,
did they not match, why or why not?
39
00:02:20.870 --> 00:02:22.443
So that's the third part.
40
00:02:22.443 --> 00:02:27.341
And then, once you've completed the third
part and you've revised your expectations,
41
00:02:27.341 --> 00:02:31.401
you may go back, with these revised
expectations and collect more data and
42
00:02:31.401 --> 00:02:35.267
try to match them again, and
then this iteration continues, often for
43
00:02:35.267 --> 00:02:37.990
many different times in
any given data analysis.
44
00:02:40.470 --> 00:02:44.830
So I just wanna give you a quick
example of how you can use these
45
00:02:44.830 --> 00:02:50.190
three components in a kind of generic or
kind of commonplace setting.
46
00:02:50.190 --> 00:02:53.360
So the basic example I'm gonna present
here is going out to dinner with
47
00:02:53.360 --> 00:02:54.510
your friends.
48
00:02:54.510 --> 00:02:56.490
So suppose you're going out to dinner and
49
00:02:56.490 --> 00:02:59.140
the restaurant you're going
to is a cash only place.
50
00:02:59.140 --> 00:03:02.410
So the question you have to ask yourself
is how much money should you bring.
51
00:03:04.005 --> 00:03:09.235
And the basic activity you're gonna
engage in is eating a meal, and you're
52
00:03:09.235 --> 00:03:13.520
gonna check for the bill, and you're gonna
have to pay money for the meal.
53
00:03:13.520 --> 00:03:17.869
But before you do that, you gotta figure
out how much money to bring, and so
54
00:03:17.869 --> 00:03:22.585
you have to figure out well, what's your
expectation for the cost of this meal.
55
00:03:22.585 --> 00:03:25.299
Maybe you've dined at this
restaurant all the time, so
56
00:03:25.299 --> 00:03:27.448
you know exactly how much it's gonna cost.
57
00:03:27.448 --> 00:03:32.487
Maybe you know, well in this city, the
typical meal costs this many dollars, and
58
00:03:32.487 --> 00:03:37.403
so I'll just bring that much money,
cuz this is an average kind of restaurant.
59
00:03:37.403 --> 00:03:38.260
Maybe you know,
60
00:03:38.260 --> 00:03:42.065
well the most expensive restaurant in
this city costs this many dollars.
61
00:03:42.065 --> 00:03:44.936
So I know it's not gonna
cost more than that, so
62
00:03:44.936 --> 00:03:49.180
I'll just bring that to kind of serve as
an upper bound on how much money I might
63
00:03:49.180 --> 00:03:51.223
end up spending at this restaurant.
64
00:03:51.223 --> 00:03:53.365
You might ask your friends,
if they've been there before,
65
00:03:53.365 --> 00:03:54.650
how much does this place cost.
66
00:03:54.650 --> 00:03:56.060
Or you might Google the restaurant and
67
00:03:56.060 --> 00:03:59.430
maybe look up the menu to see what
the meal typically costs there.
68
00:03:59.430 --> 00:04:04.388
At any rate, before you've gone to the
restaurant and eaten the meal, you can use
69
00:04:04.388 --> 00:04:08.310
any sort of a priori information
to set up your expectations for
70
00:04:08.310 --> 00:04:10.684
what the cost is ultimately gonna be.
71
00:04:10.684 --> 00:04:13.220
Before you observe the real thing.
72
00:04:14.430 --> 00:04:19.176
So once you've set your expectations, you
can figure out how much money to bring.
73
00:04:19.176 --> 00:04:22.339
The actual collecting of the data
involves going to the restaurant and
74
00:04:22.339 --> 00:04:23.255
getting the check.
75
00:04:23.255 --> 00:04:25.839
So once you've gotten the check,
76
00:04:25.839 --> 00:04:29.500
you observed the reality
of what the meal costs.
77
00:04:30.640 --> 00:04:32.560
And there's two possibilities.
78
00:04:32.560 --> 00:04:35.250
One is that,
that cost meets your expectation.
79
00:04:35.250 --> 00:04:38.510
So suppose you thought
it was gonna be $30, and
80
00:04:38.510 --> 00:04:40.710
it ended up being $30, then that's great.
81
00:04:40.710 --> 00:04:41.630
You know exactly,
82
00:04:41.630 --> 00:04:45.810
you brought the right amount of money,
and then you can pay for the meal.
83
00:04:45.810 --> 00:04:49.696
The other possibility is that
the expectations don't match what
84
00:04:49.696 --> 00:04:50.657
the reality is.
85
00:04:50.657 --> 00:04:55.386
So you thought it was $30 and
it ended up being $40.
86
00:04:55.386 --> 00:04:59.120
And so, you have to ask yourself
then why do you have that mismatch.
87
00:04:59.120 --> 00:05:04.239
Why is it that you thought it was $30 and
the meal turned out to be $40.
88
00:05:04.239 --> 00:05:05.630
So there's two possibilities.
89
00:05:05.630 --> 00:05:08.000
One is that your expectations were wrong.
90
00:05:08.000 --> 00:05:11.820
So you thought that the restaurant
was cheaper than it actually was.
91
00:05:11.820 --> 00:05:15.393
Another possibility is that there's
something wrong with the data, for
92
00:05:15.393 --> 00:05:15.920
example.
93
00:05:15.920 --> 00:05:19.192
It's possible that they added up the check
wrong, maybe they charged you for
94
00:05:19.192 --> 00:05:21.169
something
that you didn't actually eat.
95
00:05:21.169 --> 00:05:24.835
So you can look at the check to see if
there is a problem with the data that you
96
00:05:24.835 --> 00:05:25.500
collected.
97
00:05:29.040 --> 00:05:34.040
One thing to note about
this example is that it was
98
00:05:34.040 --> 00:05:39.450
easy to know whether your expectations
were matched with the data or not.
99
00:05:39.450 --> 00:05:43.892
So for example, if your expectation was
the meal would cost $30, and then it
100
00:05:43.892 --> 00:05:48.691
actually cost $40, you know immediately
that your expectations were not right.
101
00:05:48.691 --> 00:05:52.200
The meal was $10 more than you
actually thought it was gonna be.
102
00:05:52.200 --> 00:05:55.249
And so,
you can make that conclusion very quickly.
103
00:05:55.249 --> 00:05:57.298
Another possibility, for example,
104
00:05:57.298 --> 00:06:01.405
is that you could've said well
the meal being between 0 and $1,000.
105
00:06:01.405 --> 00:06:05.885
And so, when the data actually comes
in and you see the check is $40 then it
106
00:06:05.885 --> 00:06:10.930
actually matches your expectation which
is that it's between 0 and $1,000.
107
00:06:10.930 --> 00:06:15.830
But because your original
expectation was so diffuse, and
108
00:06:15.830 --> 00:06:20.535
so kind of general,
you don't really learn that much from
109
00:06:20.535 --> 00:06:25.650
collecting the data given your
very diffuse expectation.
110
00:06:25.650 --> 00:06:30.160
So this brings us to an important point
which is that it's important to have
111
00:06:30.160 --> 00:06:32.370
a very sharp expectation or
112
00:06:32.370 --> 00:06:36.180
a sharp hypothesis about what
you're trying to investigate.
113
00:06:36.180 --> 00:06:38.940
When I said that I expected
the meal to be $30,
114
00:06:38.940 --> 00:06:44.210
it was very easy to know when
my expectations were not met.
115
00:06:44.210 --> 00:06:48.575
But if my expectation was very diffuse
and not sharp at all, like between 0 and
116
00:06:48.575 --> 00:06:52.880
1,000, then, collecting the data
doesn't really help you.
117
00:06:52.880 --> 00:06:57.430
Or it doesn't help you learn the process
you're trying to study or in this case,
118
00:06:57.430 --> 00:06:59.360
the cost of the meal at this place.
119
00:06:59.360 --> 00:07:02.600
So ultimately,
what we're leading toward with
120
00:07:02.600 --> 00:07:06.820
setting your expectations and collecting
data is a change in behavior or
121
00:07:06.820 --> 00:07:09.620
an understanding of the mechanism
you're trying to study.
122
00:07:09.620 --> 00:07:13.370
What did we learn, and
what would you do differently next time?
123
00:07:13.370 --> 00:07:16.465
So in this scenario where you
thought it was gonna be $30 and
124
00:07:16.465 --> 00:07:20.589
it ended up being $40, well then the next
time you might bring an extra $10.
125
00:07:20.589 --> 00:07:23.211
If you originally thought it
was gonna be between 0 and
126
00:07:23.211 --> 00:07:27.429
$1,000 then the cost ended up being $40,
it's not clear that you would change
127
00:07:27.429 --> 00:07:29.992
anything about your behavior
based on this data.
128
00:07:29.992 --> 00:07:35.490
And so, if there is no change
in what you might think or
129
00:07:35.490 --> 00:07:38.950
what you might do based on
the collection of the data and
130
00:07:38.950 --> 00:07:43.140
matching it with your expectations,
then that's often a sign that
131
00:07:43.140 --> 00:07:46.740
either the evidence from your experiment
is not very strong or the data analysis
132
00:07:46.740 --> 00:07:51.420
was not able to generate enough evidence,
or there may be some other problem
133
00:07:51.420 --> 00:07:55.890
with your study or
your data analysis process.
134
00:07:55.890 --> 00:08:00.130
So setting the right expectations and
making them as sharp as possible
135
00:08:00.130 --> 00:08:03.730
is a really key element to this
whole data analysis cycle.
Thursday, June 11, 2020
Sunday, March 22, 2020
Google Classroom (GC) - Basics
Once you have finished creating a class, it is displayed on a new page along with the tabs available on the GC page, as follows:
Stream
Stream now focuses on announcements and posts (if you allow students to post and comment in Classroom).
You will also see a notification in Stream when something has been added to classwork, such as when you create a new assignment.
Classwork
Most of your time will now be spent in the Classwork tab. This is where teachers can create assignments, add questions, create topics, and reuse posts that have already been made.
The Classwork section is organized by topic to make assignments easier to find. For example, you could create three topics: Daily Work, Reference Resources, and Unit 1: Theme. You can set up topics for your class in whatever way you like.
The "Create" button has moved from the bottom to the top left of the page. Note that this is where you add your class materials.
In addition, the Classwork page has expandable rows that let you view statuses such as Done / Not Done, now renamed "Turned In" and "Assigned".
People
Seeing this section named People can be very confusing for first-time users. You will not find your student roster in a Students tab (it has been renamed People), where you can now manage all the "people" who may be in your class, including co-teachers.